Some of our sites that feature free classified ads (amadorable.com, goatseeker.com, and bunnytrade.com) have been hit by a few recurring spammers trying to plant ads for various off-topic products such as cell phones. I guess this is a good sign: our sites are visible, and spammers feel it is worth their time and trouble to post an ad. (And no, as far as I can tell, these are not bots - this is human-generated spam.)
In reviewing the referrer logs, I've noticed that in nearly every case, spammers use search engines and known keywords to find sites that have been spammed previously - or just sites that offer free classified ads or open posting capability. For example, I find these Google searches in my logs, just prior to the spammer creating an account and attempting to deposit the spam content.
Spam Bait. Nice Try.
This page has become spam bait of sorts - it shows up in all of the above searches, naturally. A spammer attempted to post an anonymous comment to this page, containing the typical phone spam. My spam filters flagged it immediately, so no big deal. I do wonder whether these losers actually read the pages they try to spam. Sheesh.
The interesting thing is that none of the keywords are relevant to the site topic (rather, they are generic "post classified ad" searches). This tells me that the spammers are looking for any site that allows classified ads, and, in some cases, they are looking for sites that have already allowed similar ads to creep into the SERPs.
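The check described above (a search-engine referrer whose query has nothing to do with the site topic) could be sketched roughly like this in Python. This is only an illustration, not code running on our sites: the host fragments, the query parameter names, the keyword list, and the function names `search_query` and `is_off_topic_search` are all assumptions chosen for the example.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed values for illustration only; tune these per site.
SEARCH_HOSTS = ("google.", "yahoo.", "msn.")
QUERY_PARAMS = ("q", "p")  # assumption: the search phrase is in "q" or "p"
SITE_KEYWORDS = {"goat", "goats", "rabbit", "rabbits", "bunny", "amador"}


def search_query(referrer):
    """Return the lowercased search phrase from a search-engine referrer, or None."""
    parsed = urlparse(referrer)
    host = parsed.hostname or ""
    if not any(name in host for name in SEARCH_HOSTS):
        return None  # not a referrer from one of the listed search engines
    params = parse_qs(parsed.query)
    for key in QUERY_PARAMS:
        if key in params:
            return params[key][0].lower()
    return None


def is_off_topic_search(referrer, keywords=SITE_KEYWORDS):
    """True when the visitor arrived via a search containing no site keyword."""
    query = search_query(referrer)
    if query is None:
        return False
    words = set(re.findall(r"[a-z0-9]+", query))
    return words.isdisjoint(keywords)
```

With these assumed keywords, a referrer carrying a generic "post free classified ads" query would be treated as off-topic, while a query mentioning goats or rabbits would not. Exact word matching is crude, so a real filter would probably want stemming or a longer keyword list.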
And You Shall Know Them By Their E-Mail Addresses
Some of the spammer email addresses:
- email@example.com (2006.12.01)
- firstname.lastname@example.org (2006.12.02, amadorable.com, phone spam)
- email@example.com (2006.12.03, bunnytrade.com, spam blocked - no postings allowed)
- firstname.lastname@example.org (2006.12.06, amadorable.com, spam blocked - no postings allowed)
- email@example.com (2006.12.15, exodusdev.com, spam blocked - no postings allowed)
- firstname.lastname@example.org, email@example.com (2006.12.21, lame spammer posted comments to this page.)
(Searching Google or Yahoo for these addresses reveals widespread ad spam.)
It's spam if the content has nothing to do with the site topic - rabbits, goats, or things related to Amador County. So if someone with a business operating in, say, New Jersey posts one or more ads listing dozens of Nokia phone models for sale, then as far as I'm concerned, that's spam. If a commercial enterprise wants to advertise its business on one of our sites, it really should ask us for a rate sheet. The sites' features are offered as a free service conditioned upon adherence to an acceptable use policy (AUP), and spamming is not an acceptable use under our sites' AUP.
The other problem is that I want to preserve the quality of my sites' content and avoid having search engines penalize the sites for being associated with spam. Since these spam advertisers post repetitive content that appears on dozens or hundreds of other web pages, Google, Yahoo, and others are likely to invoke a penalty.
Tools to predict spam postings
It might be useful to examine the referrer logs to determine how a visitor first comes to a site, and whether that visitor then signs up for a new account and/or starts posting content immediately. One might be able to look for certain patterns as a hint that a site is about to suffer a spam attack.
It would be interesting to watch for this kind of activity and monitor:
- Referrer log - watch for regexp patterns, or look for a search referrer whose query does not contain a site-specific keyword
- Record IP address of visitor matching above pattern
- Watch subsequent traffic from that IP address, including account creation and/or new postings
- New: Bonus points: query Google/MSN/Yahoo for the user's email address or for keywords from the new account or content, and see if there are any matching results - the more matches, the higher the probability of spam!
- Possibly trigger spam filters and moderation queue for all postings coming from that user or IP address
Obviously, a spammer could get around this kind of detection pretty easily, but this might be another helpful tool in our spam detection and prevention arsenal.
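To make the monitoring steps above a bit more concrete, here is a minimal Python sketch of the "record the IP, then watch what it does next" part. Everything in it is hypothetical: the `ReferrerWatch` class, its method names, and the one-hour window are assumptions for illustration, and the caller supplies whatever referrer test they like (a regexp match, a keyword check, and so on).

```python
import time


class ReferrerWatch:
    """Remember IPs that arrived via a suspicious referrer; flag their later activity."""

    def __init__(self, is_suspicious, window_seconds=3600):
        # is_suspicious: caller-supplied function taking a referrer string
        # and returning True or False (hypothetical hook, not a real API).
        self.is_suspicious = is_suspicious
        self.window_seconds = window_seconds  # assumed default: one hour
        self.flagged = {}  # ip address -> time of the suspicious visit

    def record_visit(self, ip, referrer, now=None):
        """Call on each page view; stores the IP if the referrer looks suspicious."""
        now = time.time() if now is None else now
        if referrer and self.is_suspicious(referrer):
            self.flagged[ip] = now

    def needs_moderation(self, ip, now=None):
        """Call on account creation or a new posting from this IP."""
        now = time.time() if now is None else now
        seen = self.flagged.get(ip)
        return seen is not None and now - seen <= self.window_seconds
```

A site would call `record_visit` for every request and `needs_moderation` when an account is created or content is posted, sending flagged postings to the moderation queue rather than rejecting them outright. Keeping the state in an in-memory dictionary is only for brevity; a real module would need to persist it and expire old entries.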
Does anyone know of a Drupal (or PHP) add-in module that does this kind of monitoring? How about "Bad Behavior"?