What I’m Up Against
Posted on October 17, 2006 Posted by John Scalzi 37 Comments
I made a mention a few days ago about the amount of comment spam I get, and I thought you might enjoy some numbers: In the last week, the Whatever received 4,760 pieces of comment spam, as of about 6:20am this morning. I know this because I deleted them all from the junk folder today, because all that crap in the junk folder was slowing down the site refreshes.
For those of you not wishing to do the math, that’s 680 spam messages daily. As a comparison, in the last 24 hours I’ve logged 116 legit comments (i.e., made by real humans), which is pretty much an average day around here. So the spam outpaces the humans by about 6:1.
The vast majority of the spam gets trapped in my moderating and junk queues thanks to my filters, which are updated via a shared blacklist, although from time to time a new purveyor gets through. I woke up this morning and found over 100 spam messages on the site; each had the same keyword. Dropped the keyword into the local junk filter with an instruction to catch all its variants; since I did that 20 more spam comments went right into the junk filter.
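For the technically curious, here is a minimal sketch of what a "catch the keyword and all its variants" rule might look like. It is an illustration in Python, not the actual filter the blog software uses: the look-alike character table and the names `normalize` and `is_junk` are assumptions made up for this example.

```python
import re
from typing import Iterable

# Hypothetical table of look-alike characters spammers swap in to dodge filters.
LOOKALIKES = str.maketrans(
    {"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(text: str) -> str:
    """Lowercase, undo common look-alike swaps, then keep only the letters a-z."""
    lowered = text.lower().translate(LOOKALIKES)
    return re.sub(r"[^a-z]", "", lowered)


def is_junk(comment: str, blocked_keywords: Iterable[str]) -> bool:
    """Return True if any blocked keyword appears in the normalized comment."""
    normalized = normalize(comment)
    return any(normalize(keyword) in normalized for keyword in blocked_keywords)


if __name__ == "__main__":
    print(is_junk("Buy Tr@m-a-d0l now", ["tramadol"]))
    print(is_junk("Nice post about writing", ["tramadol"]))
```

Because this sketch strips spaces and punctuation before matching, a keyword can also match across word boundaries, so a real filter would probably send such matches to moderation rather than delete them outright.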
I could probably reduce spam comments by entirely closing off comment threads older than a couple of weeks, but some threads are still active by way of real people (the “Writing Tips for Teens” thread is an example of this), and I’m loath to just seal them off unilaterally. Active spam management is a reasonable compromise at the moment, although if it gets much worse I may have to revisit this idea. I do seal off individual comment threads if I see that the only traffic they get is spam. It’s a continual battle between the forces of good and evil, it is.
The fact I get 4700 spam messages a week is the primary reason why I suspect, should direct brain-computer interfaces ever become available, I won’t be getting one. Does anyone doubt that within weeks, the spammers would have found a way in, and your entire visual field would be riddled with spam advertisements for Tramadol, in Cyrillic lettering? And they would never go away. Yeah, I’ll be keeping my brain unwired, I suspect.
I won’t be getting one. Does anyone doubt that within weeks, the spammers would have found a way in, and your entire visual field would be riddled with spam advertisements for Tramadol, in Cyrillic lettering?
I feel like this now.
I do wonder how much brain power we exert ignoring all the advertising we’re constantly assaulted with. It must take some energy to resist these constant conscious and subconscious calls to “BUY THIS,” right? Add that all up, I bet it’s significant.
*cough* last paragraph “per day”? Wasn’t that “per week”? *cough*
Honestly, I would highly recommend you switch over to using WordPress for your blogging platform and use the Akismet plugin to filter your spam. I can count on one hand the number of spam comments – per month – that make it around the filter. And usually I only have to remove one or two every three months or so. WordPress + Akismet = just that good.
A tangent: The fact that spam occasionally sneaks through the robust filters at my office is a major reason I keep the preview/reading pane turned off in Outlook. An officemate of mine used to send me what he thought of as choice spams – “Hey, look at how dumb this one is . . .” etc. He saw them because his preview pane was on. I had to inform him that I *never* saw spam, and I *never* wanted to look at even one – not even for laughs. Life is too short.
So, John, I’m with you – no direct interfaces through which spammer/sociopaths can inflict their nonsense on me – not even for a second.
I have a coworker who keeps all of the spam he’s ever gotten and tracks the trends of the new technologies they’re using to get past the filters and the things they are selling.
I find it to be wonderful.
Just curious — have you considered using something like Captcha for your comment threads? Here’s the wiki: http://en.wikipedia.org/wiki/Captcha .
Not really. I dislike having to fill them out, so I don’t know why I’d make anyone else do that.
There’s a current theory that spam will deal the deathblow to Microsoft’s Zune MP3 player. The Zune’s big selling point is the wireless sharing feature, which is a big old pile of red meat to the spam sharks. This suggests the chilling idea of auditory spam…you pick a song at random and hit the play button, only to first hear an advertisement for POWERFUL MEDS to INCREASE YOUR MANHOOD to PLEASE HER BRING A SMILE TO HER FACE shouted at MAXIMUM VOLUME.
[b]Jim[/b]
I know nothing about blogging software, but it seems like spammers would try to target higher traffic blogs to get more readers. I’m assuming Whatever gets more traffic than Open-Dialogue. So could the difference be related to the number of attackers rather than the quality of the defenses?
Oops, sorry. Got those tags wrong. So I’ll taglessly mention that CAPTCHAs are annoying, and aren’t bulletproof either. The ones that are really secure against OCR are annoyingly difficult to read. And for the vision-impaired, they can be a serious roadblock.
Plus there are ways around even the best CAPTCHAs. For instance, there are free porn sites that require viewers to fill out a CAPTCHA before viewing each picture. Spam software copies the CAPTCHA image to the spammer’s porn page, HNG enters the text for the CAPTCHA, spam software enters the text. Spammer gets to spam, HNG gets his porn. It’s a slick workaround, in a creepily enraging kinda way.
Thanks for not going the CAPTCHA route John, even though it makes more work for you. Much appreciated.
“Won’t get a direct brain interface?” !!!
So says the man who invented “BrainPal(tm)”
Although since you didn’t put anti-spam or antivirus on the BrainPal network, I can understand.
(Just busting your chops, sir.)
I use a simple text captcha on the comment entry screen. It says “type Purdue below” and if you do the comment goes through. I get irritated when I fail a captcha test because I can’t tell what the damn letters are. However, typing a simple word rendered in plain text seems to be a reasonable compromise. It’s a plug-in for WordPress, and nobody so far has bothered with the effort to create a bot to beat it.
I’ve noticed that I posted my email here a few times and since then, I’ve been lambasted with spam, usually of the masculine type. I’d think the name Cassie was feminine enough to clue in the spaminators that I don’t have the dangly bits.
I like Captcha. I’d much rather my favorite blogs kept on having open comment threads, and so the less spam the owner has to kill, the better. There are already 3 boxes to fill out to leave a comment, what’s a fourth?
For spam, I figure it’s not that you have to be impregnable: you just have to be more impregnable than the next guy.
I’ve noticed that I posted my email here a few times and since then, I’ve been lambasted with spam, usually of the masculine type. I’d think the name Cassie was feminine enough to clue in the spaminators that I don’t have the dangly bits.
I don’t think spammers make any effort to filter their list by gender or anything else (except perhaps if it is a valid email address). It goes against the underlying concept, that other people’s time and resources are worthless, and only the spammer’s greed matters. Why should he waste his valuable time filtering out women, in order to save your worthless (to him) time?
For a less annoying alternative to the text-based captchas, you should check out KittenAuth, which is both amusing and potentially far more effective than more traditional captchas:
http://www.kittenauth.com/
If you hate kittens, I believe it’s possible to use whatever seed images you’d like (sci-fi novel covers perhaps?). The new version isn’t quite ready for prime time though, but some of the new features should be worth the wait.
I highly recommend the CCode plugin. Its main drawback is that it requires Javascript to be enabled for commenting, but it uses a variation on the captcha thing in that it inserts a random variable and doublechecks it later, so that automatic spam gets caught because it doesn’t have it. Works like a champ once you get it set up:
CCode
I’m a fan of simple-question, text-based CAPTCHAs (no images at all). I mean, stuff like “How many eggs are in a dozen?” or “What’s 2 plus 2?”.
I’m not sure if one has been developed for MT yet, though it really isn’t hard to do. It involved a randomly generated id for each question, so even if spam bots try to index the questions with their answers, the same question will never have the same id again. I’ve had no spam, and unless spam bots are developed that can actually parse the text of the question and answer it, probably never will.
Open source code available for the tech-inclined: http://forum.nucleuscms.org/viewtopic.php?t=11644
(be sure to look at page 4 of that thread, there’s an important update listed there)
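The question-with-a-random-id idea described above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the code from the linked Nucleus thread: the `QuestionCaptcha` class, its method names, and the in-memory storage are all hypothetical.

```python
import secrets

# Example questions taken from the comment above; accepted answers are assumptions.
QUESTIONS = [
    ("How many eggs are in a dozen?", {"12", "twelve"}),
    ("What's 2 plus 2?", {"4", "four"}),
]


class QuestionCaptcha:
    """Issues text questions under fresh random ids and checks each id once."""

    def __init__(self):
        self._pending = {}  # challenge id -> set of accepted answers

    def issue(self):
        """Pick a question and register it under a new, unguessable id."""
        question, answers = secrets.choice(QUESTIONS)
        challenge_id = secrets.token_hex(16)
        self._pending[challenge_id] = answers
        return challenge_id, question

    def check(self, challenge_id, answer):
        """Validate an answer; the id is discarded so it cannot be replayed."""
        answers = self._pending.pop(challenge_id, None)
        if answers is None:
            return False
        return answer.strip().lower() in answers


if __name__ == "__main__":
    captcha = QuestionCaptcha()
    cid, question = captcha.issue()
    print(question)
    print(captcha.check(cid, "12") or "wrong answer for this question")
```

Since every id is used only once, a bot that records an id together with its answer gains nothing on the next page load. A bot could still index answers by the question text itself, which is the limitation the comment already points out.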
I second Jim’s recommendation of WordPress, though I’m more a fan of Spam Karma than Akismet.
MT has several text-based captcha plugins; the one I keep eyeing is “TinyTuring”, http://www.staggernation.com/mtplugins/TinyTuring/ , which asks people to input a single letter. (I haven’t tried it, though.)
You mean to tell me that if BrainPal WERE invented, you’d never get one?
That WOULD suck though … brain spam. I could imagine it now.
Walking along on a beach, you see a woman in a bikini, and then suddenly you hear a voice …
“ORDER [INSERT NAME OF V-NAMED DRUG THAT IS PROBABLY IN YOUR FILTER]! TO ORDER NOW FROM CANADA FOR A LOW LOW PRICE OF $0.04 PER PILL, SIMPLY THINK ‘I WANT SOME, DAMNIT!’ AND, WHILE YOU’RE AT IT, ASK US ABOUT OUR LOW, LOW MORTGAGE RATES. AND WEALTHY PEOPLE FROM NIGERIA.”
So I can’t bitch about the daily (one) spam I get on my blog? Then again, your blog has a PR of 7, and mine has a PR of 5. And how I get that high is beyond me.
Jon, Akismet works no matter how much traffic a site gets. It essentially filters out known spam addresses. The more people use it, the better it gets. It’s become so popular, in fact, that it’s been modded to fit numerous blogging platforms and websites.
Adam, from what I understand, Akismet and Spam Karma actually work very well together. I know a number of bloggers, including several ‘A’ bloggers, who use them in tandem and have zero problems with comment spam.
Yes, I’m a WordPress fanboy. ;-)
I have Captcha, but that’s only to catch the one or two a month that YAASP doesn’t catch.
A quick description of YAASP:
As most spam comments are posted automatically by a script, YAASP (Yet Another Anti-Spam Plugin) requires that the value of a hidden field within the comments page also be submitted with each comment. As the automated scripts don’t actually load the comments page and submit comments via the form, their comments won’t be accepted. As a backup measure, spam comments that are submitted directly via the form are also blocked. Comments made within a set number of seconds (I have mine set to 5) of the comments page being loaded are automatically refused, meaning quick succession spamming will be prevented.
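To make the two checks in that description concrete, here is a rough sketch in Python. It is not YAASP's actual code: the signed-timestamp approach, the `SECRET_KEY` value, and the function names are assumptions chosen for illustration. Only the 5-second minimum comes from the description.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret
MIN_SECONDS = 5            # minimum delay between page load and submission


def _sign(issued: str) -> str:
    """Sign the page-load timestamp so a script cannot forge the hidden value."""
    return hmac.new(SECRET_KEY, issued.encode(), hashlib.sha256).hexdigest()


def make_hidden_value(now=None) -> str:
    """Value to embed in a hidden form field when the comments page is rendered."""
    issued = str(int(time.time() if now is None else now))
    return f"{issued}:{_sign(issued)}"


def accept_comment(hidden_value, now=None) -> bool:
    """Reject comments with a missing or forged field, or submitted too quickly."""
    if not hidden_value or ":" not in hidden_value:
        return False  # the script never loaded the comments page
    issued, signature = hidden_value.split(":", 1)
    if not hmac.compare_digest(signature.encode(), _sign(issued).encode()):
        return False  # field present but not issued by this server
    current = time.time() if now is None else now
    return current - int(issued) >= MIN_SECONDS


if __name__ == "__main__":
    value = make_hidden_value(now=1000)
    print(accept_comment(value, now=1002))
    print(accept_comment(value, now=1010))
```

A production version would also want to expire old values and refuse reuse, which this sketch leaves out.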
I don’t know if a version of this is made for the blog software you use; I use CMS Nucleus. Blacklist stopped working for me when the main blacklist itself stopped getting updated by the fellow who wrote the code; sounds like you’re suffering from the same problem.
I suggest either finding a version of YAASP for your blog, or converting to CMS Nucleus if you can.
You’re welcome. (I say this in advance because YOU WILL THANK ME. Oh yes you will.)
“…got infected with a meme virus that ran adverts for roach motels in Hindi in the lower right quarter of his visual field, twenty-four hours a day, until he whacked himself…”
Neal Stephenson, “The Diamond Age”
I’m one of those masochists who have “rolled their own” blogging code. I use both Captcha and the hidden field trick. So far the combination seems to be working. I have the comment moderation option in reserve, just in case. I wasn’t keen on adding Captcha, but for me it struck the balance between allowing an open conversation and moderated comments.
John, I have no idea if it’s compatible with MT, but I installed Bad Behaviour on WordPress last Friday. It’s caught 1200+ spam for me in the last week, meaning that Akismet only had to filter out 3, and absolutely zero got through.
So, for the first time in a month, spam was below real comments. By one whole integer. I really need to actually post more on that site.
I can heartily recommend Bad Behaviour if it’s usable. They’re porting it to various things, including BBs and wikis, but of course given that MT is closed source it may not be possible. Harass 6A a bit to get them to integrate it?
Jon; spam is nothing to do with people seeing it, it’s virtually all to do with Google (and other, inferior SEs) seeing it. All that matters is they get a large number of links out there with their keywords within the anchors. That’s all Google needs.
Hence my low-traffic but mid-range PR site gets a fair bit of spam, while higher-traffic but lower-PR sites may get less. It all depends on whether you get caught: 90% of my spam hits one of 5 threads out of nearly 500, as that’s what the spambots are currently set for.
I keep thinking to myself that the whole enterprise of Spam seems a bit like the plot of Blazing Saddles. Where is the black sheriff who will make Hedley Lamarr rue the day he set his goons on that town?
Or maybe it is like the plot to The Incredibles…where an evil no-good rich man sics a multi-armed undefeatable monster on innocent people in order to become their savior apparent…
Any ideas?