Negative SEO vs. MattCutts.com

Negative SEO has been a hot topic for a number of years now, and we all know the typical routes to demoting your competition:

1) buy loads of crappy links (e.g. xrumer/fiverr blasts)
2) duplicate the target site across hundreds of other domains (squidoo lenses, for instance)
3) breach GWMT security and de-index
4) manipulate mass spam annotations (email/DMCAs/spamhaus etc.)
5) link takedowns (email people linking to a competitor, and ask/threaten them to remove the links)

Any of those might work, but none are particularly reliable.
Most actually stand a good chance of positively influencing their results instead.
Crucially, it’s not hard to notice this stuff going on, as long as you know where/how to look.

A savvy SEO would probably take another route.*

Evil Panda
(*by recruiting Pandas & Penguins, and directing
them towards problems you can create).

If I were going to look for SEO vulnerabilities, I would be more inclined to just carry out a full SEO audit on a competitor’s website and look for weak spots, then try to optimise their weaknesses.

Example: MattCutts.com

Let’s just say I have a site that I want to rank for the term “iPhone User Agent”.

While doing some competitive research, I stumble on the domain mattcutts.com which ranks 4th:

Matt Cutts negative SEO

So after checking out his site, we can see that it’s powered by WordPress, and therefore *could* have all sorts of vulnerabilities, but let’s assume that the blog owner keeps everything up to date, and doesn’t have any rogue plugins installed.

Manufacturing Duplicate Content

Let’s be clear, if someone copies your content, there is very little you can do.
Let’s also be clear, Google is really bad at identifying the source of any piece of content.

BUT – it is YOUR RESPONSIBILITY to make sure you haven’t allowed a technical problem that might result in duplicate content issues on site.

There are two main ways of manufacturing on-site duplicate content, and both have a major dependency
(if this is not in place, negative SEO would not work):

** a page with a DIFFERENT URL containing the same content MUST render an http 200 header **

You could just add a query string, for instance www.mattcutts.com/blog/iphone-user-agent/?variable=dupecontent, which renders the page again with the same content, thus creating duplicate content.

UNLESS the target site has rel=canonical correctly implemented (which it does).
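To make that check concrete, here is a minimal sketch in Python (standard library only) of how you might test whether a duplicate URL variant is covered by rel=canonical. The function names and the example.com URLs are my own placeholders, not anything taken from the target site, and it assumes you have already fetched the HTML of the variant.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens, so split before testing.
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def duplicate_is_exposed(variant_url, html):
    """True when a duplicate URL variant has no canonical, or names itself."""
    canonical = find_canonical(html)
    if canonical is None:
        return True
    return canonical == variant_url


# Hypothetical page source served for a query-string variant.
page = '<html><head><link rel="canonical" href="http://www.example.com/post/" /></head></html>'
print(duplicate_is_exposed("http://www.example.com/post/?variable=dupecontent", page))
```

A variant that reports itself as canonical (or has no tag at all) is the case worth worrying about; a variant pointing back at the real URL is protected.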

But that’s not all we can do when screwing around with URLs…

Wildcard Subdomains

Basically, a really bad idea, 99% of the time. You can spot a site that allows wildcard subdomains by inserting a random character string in front of the domain name. Let’s take a look at what happens when you go to:

blahblah.martinmacdonald.net

wildcard subdomain

That is behaving EXACTLY as it’s meant to: it returns a straight 404 not found response code. But let’s look at a similar page on MattCutts.com:

Matt Cutts Negative SEO
Using ANY subdomain apart from www.* results in a full duplicate copy of the Matt Cutts blog, each page returning a 200 http response.  Pretty handy if you wanted to, for instance, de-index it for a specific keyword term. 😉
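If you want to check your own domains for this, a rough Python sketch of the "random string in front of the domain" test could look like the following. It only checks whether made-up hostnames resolve in DNS, which is a hint rather than proof: you would still need to confirm that the server answers them with a 200 and the same content. The function names and the number of probes are my own assumptions.

```python
import secrets
import socket


def random_probe_host(domain):
    """Build a hostname that almost certainly was never configured on purpose."""
    return f"{secrets.token_hex(8)}.{domain}"


def resolves(host):
    """Return True if the hostname resolves in DNS."""
    try:
        socket.gethostbyname(host)
        return True
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when lookup fails.
        return False


def looks_like_wildcard_dns(domain, resolver=resolves, probes=3):
    """Guess that a wildcard record exists if every random probe resolves."""
    return all(resolver(random_probe_host(domain)) for _ in range(probes))
```

Calling `looks_like_wildcard_dns("example.com")` performs real DNS lookups; the `resolver` argument is there so the logic can be exercised without touching the network. Bear in mind that some ISP resolvers answer for non-existent names, which would give a false positive.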

So let’s pick on that iPhone post again. For Google to see it, I need to link to it. Something like this would work: iPhone User Agent. Now Google is going to see the page, and index it. Unless that rel=canonical is set up correctly again, so let’s check:

Matt Cutts rel canonical

As you can see in the screenshot, that page is correctly canonicalised to the right page, on the REAL subdomain.

DAMNIT. The blog is protected.  Isn’t it?

Well, not quite. Knowing that the server is accepting wildcard subdomain requests, we know there is an SEO vulnerability, and this motivates us like nothing else to find a route through the protection, so let’s fire up Screaming Frog!

First thing is to configure the spider to search for the inclusion (or omission) of a rel=canonical tag (handy guide available here from SEER) and find pages that DO NOT HAVE REL CANONICAL in the source code.

Surprisingly, this turned up a few URLs; crucially, the homepage and the /blog/ homepage are included in that list. So now we know we can screw with the rankings for those pages by creating on-domain duplicate content, and using searchmetrics we can pretty quickly work out what those pages rank for:
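For anyone without Screaming Frog to hand, the same "which pages lack rel=canonical" filter can be sketched in a few lines of Python. This is not how Screaming Frog works internally, just an illustration of the check; it assumes you already have the crawled pages as a dict of URL to HTML, and the names and example.com URLs are placeholders.

```python
from html.parser import HTMLParser


class CanonicalProbe(HTMLParser):
    """Sets has_canonical when a <link rel="canonical" href="..."> is seen."""

    def __init__(self):
        super().__init__()
        self.has_canonical = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and attr.get("href"):
            self.has_canonical = True


def pages_missing_canonical(crawl):
    """Return the URLs in crawl (a dict of URL -> HTML) with no canonical tag."""
    missing = []
    for url, html in crawl.items():
        probe = CanonicalProbe()
        probe.feed(html)
        if not probe.has_canonical:
            missing.append(url)
    return missing


# Hypothetical crawl results: one page without the tag, one with it.
crawl = {
    "http://www.example.com/": "<html><head><title>Home</title></head></html>",
    "http://www.example.com/blog/post/": (
        '<html><head><link rel="canonical" '
        'href="http://www.example.com/blog/post/"></head></html>'
    ),
}
print(pages_missing_canonical(crawl))
```

Any URL this returns is a page where a duplicate copy on another hostname would carry no pointer back to the original.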

Matt Cutts Rankings

I particularly like the rankings for SEO Blog & webmaster blog, so let’s start with those. Simply putting the links in this line of text should be enough to use the Panda algorithm update to confuse Google into not ranking the site, by creating domain-level duplicate content.

By linking to the duplicate page with exact match anchor text, we are theoretically sending Google a signal that those pages are significant for the query used in the anchor text, and because MattCutts.com serves the same page as it would on the ‘www.’ subdomain, it’s creating duplicate content.

Sweet!

So we’ve found a vulnerability, and taken advantage of it to hurt rankings on the target site, but what about that iphone user agent post?
Well: unfortunately, the above would really only work on pages where you can create duplicates without a canonical tag, and that post has one, so this will not directly impact that ranking. It will however negatively impact the site as a whole, and if it were replicated hundreds of thousands or millions of times, it’s likely to cause significant crawl equity and subsequent ranking problems for the main site.

Hey, I’d love to help test this:

If you would like to help out with this negative SEO test, please just link to the following pages with these anchor texts:

http://seo-blog.mattcutts.com/blog/  Anchor text: “SEO Blog”
http://webmaster-blog.mattcutts.com/blog/  Anchor text: “Webmaster Blog”

From any sites that you have direct access to.  As with all negative SEO tests, I strongly recommend only doing this on domains that you can immediately remove the links from in future should the need arise.

Hey, I’m Matt Cutts and I’d like to prevent this:

Two ways: either correct the server config to disallow wildcard subdomains,
OR
make totally sure that every page on your site has the correct fully qualified URL within the rel canonical tag.
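A third option, raised in the comments below, is to 301 anything that arrives on the wrong hostname. In practice you would do this in the DNS or web server config, but the decision logic is simple enough to sketch in Python. The function name, the default canonical host and the return shape are all my own assumptions for illustration.

```python
def canonical_host_redirect(host, path, canonical_host="www.example.com", scheme="http"):
    """Return (status, headers) for a request, redirecting stray hosts with a 301."""
    # Strip an optional port and normalise case before comparing hostnames.
    hostname = host.split(":")[0].lower()
    if hostname == canonical_host:
        return "200 OK", []
    # Any other subdomain is sent to the same path on the canonical host.
    location = f"{scheme}://{canonical_host}{path}"
    return "301 Moved Permanently", [("Location", location)]


print(canonical_host_redirect("seo-blog.example.com", "/blog/"))
```

Because every stray subdomain answers with a redirect instead of a 200, there is no duplicate page left to index, whether or not the canonical tag is present.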

Footnote:

A quick way of finding which pages have been linked to externally, but do not carry a rel canonical (and hence are indexed), is by doing this:

cutts

which reveals some other interesting stuff:

1) there is a duplicate installation of WordPress on mattcutts.com under the /blog1/ folder. My guess is that if you wanted to brute force a WP installation on that domain, choosing this one is probably a good idea.

2) the second orange arrow points out the subdomain seomofo.mattcutts.com so I assume that Darren Slatten has also noticed this vulnerability.

 

(angry)

UPDATE 


As if to prove my point, this very post (with several hundred tweets/plus ones etc.) is ALREADY being outranked by a near identical copy, which even links to my page:  http://www. blackhatworld.com/ blackhat-seo/black-hat-seo /565146-negative-search-engine-optimization-vs-mattcutts-com.html (you will have to remove spaces to get the link to work).

ffs

As you can see, the BHW thread already outranks me for my OWN CONTENT.

Let me just remind you that “my own content” has shitloads of social citations, probably quite a few links, not to mention a LINK FROM THE COPIED ARTICLE AS WELL. It really shouldn’t be this hard, Google.

 

(Happy)

Update

So less than 20 hours after posting the above, Matt appears to have taken the steps noted above and prevented wildcard subdomains from rendering on his personal domain:

omgwtfmattcutts

 

 

Footnote:
If Matt Cutts considered it important enough to fix, you probably should as well!

82 thoughts on “Negative SEO vs. MattCutts.com”

  1. Good sleuthing, although I can see another solution for Matt:

    He could serve a 301 redirect, or 404/410 response on all but the www sub-domain, except where he’s sure it is Martin Macdonald looking at the site.

    The classic cloaking double bluff 🙂

  2. There is a way to NSEO sites with the canonical links that are set up correctly. It’s barely an issue.

    You need to put the site in a state where Google’s spiders can no longer crawl, thus they think you shut the site down.

    DDOS is king here. Would I recommend it though? no… The only thing the governments are arresting these anonymous hackers for is just this, but if you can pay someone to do it or get away with it then it’s worth doing if you hate the person enough or badly need the site removed from Google (reputation management).

  3. Yeah – so kind of the point here is that you can theoretically get negative SEO under the radar, not so much by doing anything “dark” but by optimising the other website’s problems.

  4. It certainly is a worthy test, well Martin consider it done!

    Created a subdomain and pointed a series of links. Let’s see how soon Matt installs the Joost plugin 😉 (PS may I say a must for WP)

    cheers for the great article.

  5. Probably worth telling him to type site:mattcutts.com/ -site:www.mattcutts.com into Google to find some of the duplicate content that he already has indexed 😀

  6. Hi Joost,

    Love all the SEO Plugins for wordpress. Do you (or anyone else) do anything similar for weebly sites?

    Long shot but thought I’d ask. Thanks Daniel

  7. Interesting post, to say the least. I’d agree with Yoast, since I’m using his plugin on all my sites. It’s weird to see that even Matt’s site has vulnerabilities that are in fact easy to fix with a solid SEO plugin like Yoast’s.

  8. Nice job of digging, Martin. I’m thinkin’ Matt may plug this hole sometime today. 😉

    Joost, if you hadn’t piped in about your plugin, I’d have done it for you. WP has a number of inherent issues, and WordPress SEO handles most of them very nicely!

  9. That’s cute and all but anyone running a bot blocker will stop Screaming Frog or any other tool from looking for those vulnerabilities.

    Additionally, you can use good old fashioned htaccess or a script and 301 redirects for anything inbound with extra parameters and what not, to filter those out and redirect to the actual page, something that works in all search engines, no rel=canonical needed but it’s a good backup.

    I know this works as I’ve had some SEOs try screwing with one of my sites and deflecting their nonsense was trivial.

    However, if you really want to mess with someone, use a filtering proxy server and remove those rel=canonicals and a few other assorted items on the page and try a good old fashioned 302 hijacking as most sites still don’t properly validate Googlebot and a subset of the old 302 problem still exists. I’ve seen it recently used by a couple of sites duplicating forums in foreign countries.

  10. A similar problem can also arise when Google allows proxy URLs to be indexed — returning dozens of duplicates with your canonical URL systematically replaced.

    It’s a problem I’m familiar with, but have been unable to remedy.

  11. Google is keeping the wrong date from the forum thread and showing it in the SERP. I wonder if this counts for defining the original.
    Anyway, it’s not news that Google chooses a more authoritative domain in SERPs and filters the original one. One’s more likely to read the same news on BBC rather than on Checkmynews even if the second site is the source.

  12. A good heads-up for the importance of canonical.

    Yet, years later, people still fail to understand what kind of “duplicate content” Panda was after…
    It’s not about exact copies that exist by technicality.
    It’s about the duplication of topics with low quality pages.

    This action, besides being obvious and not triggering anything on Google’s side, will just contribute a few more domain links with good anchors to mattcutts.com

  13. It’s weird that you didn’t notice the seomofo subdomain, as the guy tested a similar thing a while ago.
    Anyway, it’s weirder that in all this time Cutts hasn’t logged in to his DNS panel to remove that bloody asterisk.

  14. I’m experiencing the same thing with proxy hijackers. It’s easy to turn it right back on the competitors though.

  15. Well, yes, Yoast is the best plugin indeed. It seems to me that Matt Cutts is still in the old age of doing SEO work by coding in the blog header 😀
    Thanks for writing this post.

    Saif

  16. Wow. This is the most fun and learning I’ve had in a while. Screaming Frog is a great tool! Thanks for the post!

  17. Nice post, but it’s disheartening to see that this kind of lack of awareness/education is still even an issue.

    Protect your site(s), shore up WordPress, set permissions appropriately at every level, and use canonical tags properly. Failure to do any of these results in vulnerabilities and/or performance hits. Not rocket science, and certainly not new news.
    Just disappointing that most sites are *still* sub-standard, and even Matt Cutts can’t be bothered to run a properly configured website. Slack.

  18. Martin, you obviously haven’t got a clue how search engines work, or how Negative SEO works !!!!

    If a >forum< post can outrank this page, then obviously your site has issues, so ironic, isn't it !!!!

  19. Good read. An interesting way to educate people on how to “smarten up” with their websites. 🙂

    FYI – Similar things have been tried before. The original SEOMofo article can be found here:
    http://www.seomofo.com/experiments/spam-search-results.html

    The update at the end is the best part for me. It really shouldn’t be this hard to look at signals to help determine who is (most likely) the original author of a specific article!

  20. but if you are doing NSEO you give 0 fucks and a shit about if you are doing it the ‘good’ way through your method or the evil one like mine.

  21. Love this… a very good example of SEO’s place in the online marketing mix. Demonstrates that a technical understanding of how to optimise and protect your site is just as important as all the other more visible elements of digital marketing.

  22. I’m up for a bit of spamming, err, experimenting… How spammy would you like the sources of those links to be? Will 100,000 suffice?

  23. My site lost its rankings for its “about us” page! What a f%&king joke that was.
    Google ranking position #1 is dominated by one-page sites that were only created 4 weeks ago on many high value keywords. I have spent 3 weeks trying to tell Matt Cutts and John Mueller, but both have ignored every message I have sent.

    This shows how Matt Cutts would rather protect his own site than help anyone else.

    Is there anyone from Google here that would care to even listen and fix this damn problem??? Its embarrassing guys!

    The webspam team never reacts in a timely manner so they encourage spam because it works for churn and burn sites.

  24. Interesting post, I’ll try to do some testing on this. I’ve seen some websites using sub domains and I’m not sure if this would be the same with a sub sub domain?

  25. The problem with Matt’s blog is that it is using Thesis theme which makes WordPress SEO plugin incompatible. Is there any way to make the theme compatible with the plugin?

  26. Actually, Joost, if you view the HTML on Matt’s site, you’ll notice it contains several instances of the string, wp-content/themes/thesis_. As I’m sure you already know, that string is a WordPress convention used by theme developers to indicate that their theme is likely to explode when mixed with SEO plugins. Therefore, a better solution for Matt would be to install a real WordPress theme first…then install your plugin. 😉

  27. Oops!
    Thanks for a great post, I had quite a few domains running the wildcard option, now fixed!

    /Jonas

  28. Don’t worry. We all know you were the first ;).

    Time to update your article again! It surprises me that it has lasted for that long.

  29. I remember reading this on BHW first, then I saw it again today when researching your blog after you did a bit of engagement with me on Twitter (it does work btw!)

  30. I found you on twitter Martin.
    Very interesting blog post, and it’s a wake up call for me to look after my sites.

    Cheers

  31. PANDA 4 is a JOKE and GOOGLE is a bigger joker. Try Googling Web design or web design company.

    A company called stectech.com ranks #1 organically by shady linking on some of the richest adwords keywords ( they keep changing their names and domain) These guys offer free download of a plugin and then embed link from their plugin back to their website loaded with keywords. You can use link:Stectech.com to see how many websites link to them, but these links are generated by illegal hidden backlinks. How come Google is letting them get away with murder.

  32. Loved this post, Martin.

    I’m starting to get a picture that Negative SEO is a much bigger problem than Google or anyone else wants to admit. I think that every time they update their algorithm, it only creates more possibilities for negative SEO?

    I just read this blog post the other day: http://yokebreak.com/ultimate-guide-to-negative-seo/

    This guy is describing so many different kinds of penalties…it’s scary really. I wonder how long before “negative seo” becomes a job title?
