I am bothered by comment spam as much as the next person, but in the pursuit of science (honest) I also like playing about with the latest spamming tools.

They all work roughly the same way, some better than others, but it’s all pretty standard…

spam1

The recipe is simple: you spin some comments and submit them to thousands (even millions) of sites in the hope that a few of them stick.

The problem, though, is that 99.9% of them end up sat in spam queues on a thousand million websites, like this:

spam3

Spam is BAD for the environment!

Think about it!

  • Spamming in the first place uses electricity,
  • spidering and submitting spam relies on copper wiring or fibre optics that have manufacturing costs,
  • it sits in MySQL databases on servers that use precious metals,
  • filling hard discs up uses yet more electricity that more often than not is produced by burning fossil fuels!

If we look at global greenhouse gas emissions since 1990, you can see that greenhouse gases have a strong correlation to the amount of spam sent across the internet:

spam5
Now, I’m not saying it’s absolutely certain that it’s just spam that’s caused this explosion in greenhouse gases, but it’s a risk I’m not willing to take for this guy:

spam4

In SEO we’ve often paid attention to correlation studies that are just as tenuous, so this shouldn’t be too huge a leap of logic 😉

 

Let’s Save the Polar Bears!

The best way that I can think to do this is by increasing the efficiency of comment spam.

Let’s call it “my little way of giving back to the planet”.

The problem, as I’ve already covered, is the hundreds of millions of spam comments left on blogs that will never publish them. Let’s start by looking at the nearly 1,300 comments currently in my moderation queue.

First) Log in to your WordPress blog’s phpMyAdmin and grab the wp_comments table:

spam6

and choose to download that as a CSV.

Second) Import that into Excel and then filter the list to remove any comments you’ve manually approved. In a default export the approval status (comment_approved) should be column K, although that may differ, so check first.
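If you would rather not do the filtering by hand, the same step can be sketched in Python. This is only a rough sketch: it assumes the CSV was exported with a header row using WordPress’s default wp_comments column names, and the sample rows, the SAMPLE_EXPORT name and the unapproved_comments helper are made up for illustration.

```python
import csv
import io

# Stand-in for the phpMyAdmin export. Assumption: the export includes a header
# row with the default wp_comments column names. The rows are invented.
SAMPLE_EXPORT = """comment_ID,comment_author,comment_author_url,comment_approved
1,Alice,http://example.org/blog,1
2,Cheap Pills,http://spam.example.com/pills,0
3,Replica Bags,http://bags.example.net/buy,spam
"""


def unapproved_comments(csv_file):
    """Return the rows whose comment_approved value is anything other than '1'."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if row["comment_approved"].strip() != "1"]


# Swap io.StringIO(...) for open("wp_comments.csv", newline="") on a real export.
queue = unapproved_comments(io.StringIO(SAMPLE_EXPORT))
for row in queue:
    print(row["comment_author_url"])
```

WordPress stores '1' for approved comments, '0' for pending ones and strings such as 'spam' or 'trash' for the rest, so keeping everything that isn’t '1' leaves you with the moderation and spam queues.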

If all goes well you should now be looking at a list of lovely spam comments like this:

spam7

What we’re really interested in here is column E, which contains the landing page URL of the comment spam.

This is where those indiscriminate spammers are trying to send links, so let’s grab that list and drop it into another sheet. From here on we’ll be using the rather excellent SEO Tools for Excel by Niels Bosma, alongside API access from Majestic SEO.

Third) Let’s filter out the domains that are likely to be spammers, and set aside the others for a moment (we’ll come back to those in a later post).

Take column E from above, and just use the rather nifty function =UrlProperty(A1,"domain"), with A1 being the cell reference to the top result in your list of spammy destination URLs:
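For anyone working outside Excel, a rough Python equivalent of pulling the domain out of each URL might look like the sketch below. It uses only the standard library; url_domain is a hypothetical helper name, it returns the full host name (which may not match exactly what UrlProperty gives you), and it assumes that URLs submitted without a scheme should be treated as http://.

```python
from urllib.parse import urlsplit


def url_domain(url):
    """Return the lower-cased host name of a URL, or '' if none can be parsed."""
    url = url.strip()
    # Assumption: comment forms often receive URLs without a scheme,
    # so treat those as plain http:// addresses.
    if "://" not in url:
        url = "http://" + url
    try:
        return (urlsplit(url).hostname or "").lower()
    except ValueError:
        # urlsplit raises ValueError on some malformed hosts.
        return ""


for spammed in ["http://Spam.Example.com/page?x=1", "bags.example.net/buy"]:
    print(spammed, "->", url_domain(spammed))
```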

spam8

Now you’ll have a nice list of both the specific URL that’s being spammed and the main domain name itself. The next thing that we need to do is grab the Majestic Citation Flow for the Fully Qualified Domain, and the total number of links pointing at the page in question.

The Citation Flow function is =MajesticSEOIndexItemInfo(A1,"CitationFlow","fresh",TRUE). Where A1 appears, put the first cell reference of your list of FQDs, and run it for the entire list.

The External Backlinks per URL function is =MajesticSEOIndexItemInfo(A1,"ExtBackLinks","fresh",TRUE). Where A1 appears, put the first cell reference of your list of URLs, and run it for the entire list.

Now you should end up with something like this:

spam9

Fourth) Now let’s filter the list in place, to only domains with a Citation Flow of less than 80 (cF FQD), and sort by total number of backlinks to the URL.

All being well you’re going to see a nice ordered list of the best (well, spammiest) domains that have left comment spam on your blog!
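The filter-and-sort step can also be sketched in Python. The rows below are invented numbers rather than real Majestic data, rank_spammy and CF_LIMIT are hypothetical names, and the sketch assumes “sort by backlinks” means highest first.

```python
# Hypothetical rows: (URL, Citation Flow of the domain, external backlinks to the URL).
rows = [
    ("http://spam.example.com/pills", 12, 54000),
    ("http://big-brand.example.org/", 85, 900000),
    ("http://bags.example.net/buy", 7, 120000),
    ("http://blog.example.info/post", 30, 40),
]

CF_LIMIT = 80  # the Citation Flow cut-off used in the post


def rank_spammy(rows, cf_limit=CF_LIMIT):
    """Keep rows below the Citation Flow limit, most-linked URL first."""
    kept = [row for row in rows if row[1] < cf_limit]
    # Assumption: "sort by backlinks" means descending order.
    return sorted(kept, key=lambda row: row[2], reverse=True)


ranked = rank_spammy(rows)
for url, citation_flow, backlinks in ranked:
    print(backlinks, citation_flow, url)
```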

spam13

Select the top 10 or 20 of those URLs, copy them into a new sheet, and filter in place to unique records only.

Now copy the unique records list, and transpose them into another sheet. You should now be looking at a list of unique, really spammy URLs with one per column.
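Outside Excel, the “top 10 or 20, unique records only” selection could be sketched like this. top_unique is a hypothetical helper, and the input is assumed to be a list of URLs already sorted as in the previous step.

```python
def top_unique(urls, limit=20):
    """Take the first `limit` URLs, then drop repeats while keeping their order."""
    # dict.fromkeys keeps the first occurrence of each key in insertion order.
    return list(dict.fromkeys(urls[:limit]))


# Invented, already-sorted example list.
sorted_urls = [
    "http://bags.example.net/buy",
    "http://spam.example.com/pills",
    "http://bags.example.net/buy",
    "http://blog.example.info/post",
]
print(top_unique(sorted_urls, limit=3))
```

Because the cut is made before de-duplicating, as in the post, you can end up with fewer than `limit` URLs.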

Fifth) In each column we now want to grab the top 1000 spammy links pointing at each, so we need to use a MajesticSEO dump formula: =Dump(MajesticSEOBackLinkData(A1,"SourceURL",1000,"fresh",FALSE))

Place the above function BELOW the first URL in your transposed list, and drag it right. You should now be looking at something like this:

spam10

Arrow 1) This is your row of spam-heavy URLs from the previous step, transposed.
Arrow 2) This is your column of the top 1000 backlinks for each of the above URLs.

Let’s save the planet!

Now that you’ve got a list of thousands and thousands of pages that have already accepted spam comments, it’s relatively easy to increase your comment-dropping success rate from around 0.01% well up into the 80-90% area.

You can build spam links with far more efficiency: you’re saving bandwidth, saving hard disc space, and saving the planet!!

spam12

This guy thanks you for saving the world, and I thank you for not submitting comment spam on my blog!

I’m not a blackhat, why do I care?

Well, now that you conceptually know how to mine your comments for URLs, you could just look at comments you’ve already accepted, and check the Citation Flow on those.

That’s a super quick way of finding domains where either the owner or a member of staff has already approached you (by leaving a legitimate comment), and you could reach out to them to see if you can get a link back on their site 😉
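As a rough sketch of that whitehat version, the Python below pulls the unique domains out of comments you have already approved. It assumes the same wp_comments CSV export with a header row as before; the sample rows and the approved_domains helper are invented, and checking Citation Flow for each domain is left to whichever tool you use.

```python
import csv
import io
from urllib.parse import urlsplit

# Invented sample; assumes a header row with default wp_comments column names.
SAMPLE_EXPORT = """comment_author,comment_author_url,comment_approved
Alice,http://example.org/blog,1
Bob,https://Example.org/about,1
Cheap Pills,http://spam.example.com/pills,spam
Carol,,1
"""


def approved_domains(csv_file):
    """Return unique host names from approved comments, in first-seen order."""
    domains = []
    for row in csv.DictReader(csv_file):
        if row["comment_approved"].strip() != "1":
            continue  # skip pending, spam and trashed comments
        host = urlsplit(row["comment_author_url"].strip()).hostname
        if host and host not in domains:
            domains.append(host)
    return domains


outreach = approved_domains(io.StringIO(SAMPLE_EXPORT))
print(outreach)
```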

 

MartinMacdonald

Founder of WebMarketingSchool.com and a career professional in SEO and web marketing. Experienced in travel, gambling & entertainment niches. Former head of SEO for Omnicom UK, Inbound Marketing Director at Expedia & current Senior Director for SEO at Orbitz Worldwide.


Categories: Blackhat, Case Studies


13 Responses

  • Gordon Campbell

    Genius! I’m going to use this. You have also given me an idea to extract previous emails into an excel file, remove common email services such as hotmail, gmail and yahoo and use the technique to do the same with previous contacts. Worth a try I think!

    September 3, 2013 at 10:22 am
  • Spamtastic

    I would love to see a peer reviewed article investigating whether comment spam or celebrity gossip traffic is responsible for more penguin deaths worldwide

    September 3, 2013 at 10:29 am
  • Alan Charncok

    To an uneducated person, you seem to be very angry about tinned meat.

    September 3, 2013 at 10:30 am
  • Charles Floate

    As you’re in a sharing mood Martin, I thought I’d join in with a couple additional helpful tactics/tricks that I actively use myself to Spam (Yes, even I do the occasional comment blast!)

    1) When spinning comments, make them niche relevant, take a look at my tutorial on Jacob Kings site:
    http://www.jacobking.com/ultimate-guide-to-scrapebox#Chapter_9_Niche_Relevant_Comments

    2) Use XRumer, GSA & ScrapeBox to pull footprints, see SEOsUnite Tutorial –
    http://www.youtube.com/watch?v=zi0YG1Lf14o

    3) Use different private proxies, if I’m doing specifically niche relevant and not spamming style stuff I’ll go with really expensive proxies from the likes of Elite Private Proxies and using Squid/Buy proxies for a couple 100 cheap proxies to spam with..

    Hope these tips help peoples become better black hats! #BlackHatFoLyfe

    September 3, 2013 at 11:21 am
    • Martin Macdonald

      I haven’t looked, but don’t both of those strategies kill Polar Bears?

      I’m trying to save the world here, not pollute it with yet more comment spam :p

      September 3, 2013 at 11:24 am
      • Charles Floate

        Nope! Niche comments mean higher approval rate due to them being relevant to the content, proxies mean more speed for less resources as they are strong and efficient proxies. The GSA Method means you can remove the need for searching in GSA which is pretty resource intensive.

        September 3, 2013 at 12:19 pm
  • Chris Gedge

    Stop spamming comments and spam Twitter instead! 😉

    September 3, 2013 at 11:29 am
  • Jan-Willem Bobbink

    Doesn’t this blog evoke even more comment spam? 🙂

    September 3, 2013 at 11:32 am
    • Martin Macdonald

      @Jan – I don’t think so, as if you’re going to actually comment spam, you need to use a few other apps and techniques that I haven’t gone into here. What the post is about allows you to be far more targeted in your site list before you start spamming, and hopefully prevents me from having to delete SO MUCH DAMN COMMENT SPAM :p

      😉

      September 3, 2013 at 11:36 am
  • Stoyan Irchev

    Great article! I can certainly see your point, making an economy of both effort and time will certainly help. I don’t think that polar bears would notice 😀 , but steps taken towards more efficiency are steps taken forward.

    September 3, 2013 at 1:04 pm
  • Rajesh

    Nowadays I’m getting too many spams bored with it, that for sharing this trick. Will surely try this, lets start targeted spamming, LOL. I just hate spamming …

    September 4, 2013 at 8:56 pm