reCAPTCHA, I Think I Get It

I received a very helpful (and FAST) response from the reCAPTCHA team earlier today:

Basically, when you get a reCAPTCHA incorrect on WordPress, the comment is saved but it is marked as spam. When a legitimate user gets the CAPTCHA incorrect, they are redirected to a page that saves their comment and allows them to correct the CAPTCHA. Once we recover the comment, it’s deleted from the database. However, if a spam bot enters the comment, it will not follow the redirect, so sometimes the comment stays in the DB.

Basically, when using reCAPTCHA, any comment you see in the moderation queue was caught by reCAPTCHA — it doesn’t mean that the CAPTCHA is broken.

The only thing I’ll add to this is my observation that WordPress does not always make it clear which “comments” come from the comment form versus trackbacks and pingbacks. So “any comment you see in the moderation queue” MIGHT have come via reCAPTCHA, but it might also have originated via trackbacks. Read on…

Following Their Advice

Today I turned off email notification for the moderation queue, as the reCAPTCHA FAQ suggests. But this has a side-effect…I also had this option selected in WordPress: “Comment author must have a previously approved comment”.

Damn. This means I’ll no longer see email notifications for those legitimate human comments. So…I went ahead and turned off that, as well.

Trackback Vulnerability

After changing these settings, I very quickly saw a spam comment make it onto my blog, because (I think) the “Comment author must have a previously approved comment” checkbox is now unselected. I missed my opportunity to moderate the comment (which is actually a trackback), so it went on through.

AFTER the spam makes it through the system, WordPress sends an email notification. Part of this email contains this phrase:

You can see all trackbacks on this post here:

Aha! Now I know this came from a trackback, not from a normal comment.

I now believe most of the comment spam I’ve been seeing in recent weeks has originated from trackbacks. WordPress just does not always make it easy to tell whether a comment came from a trackback or from the comment form. reCAPTCHA does a very good job with the comment form, but does not address trackback or pingback spam in any way.

Closing Thoughts

I really, really like reCAPTCHA. It is very effective and the people on their team are always helpful and promptly answer questions. Having reCAPTCHA in place saves me significant time. It’s also damn cool that they are harnessing people power to digitize books. Watching Luis von Ahn’s Google Tech Talk was the reason I tried reCAPTCHA in the first place. It is an incredibly clever idea that any geek can appreciate.

I used Akismet in the past, and wading through thousands of spams in the Akismet queue grew very burdensome. Furthermore, it occasionally marked valid comments as spam — definitely more often than once every 6 months — so I never felt comfortable ignoring the spams.

Maybe WordPress needs to improve its moderation queue? Some ideas:

  • Offer separate moderation queues for comments, trackbacks, and pingbacks.
  • Let me configure different moderation policies for these different queues.
  • At all stages, make it very obvious if a comment originated from the comment form, a trackback, or a pingback.
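
To make the idea concrete, here is a rough sketch of what separate queues with separate policies might look like. It is not WordPress code. It assumes each pending item carries a `comment_type` field set to `trackback` or `pingback`, and either empty or `comment` for something typed into the comment form, which is my understanding of how WordPress labels comments. The `POLICIES` table and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical per-queue policies; the names and actions are illustrative only.
POLICIES = {
    "comment": "hold",      # a person typed this into the comment form
    "trackback": "reject",
    "pingback": "reject",
}

def comment_origin(item):
    """Return 'comment', 'trackback', or 'pingback' for one pending item."""
    # An empty or missing comment_type is treated as an ordinary form comment.
    return item.get("comment_type") or "comment"

def split_queue(pending):
    """Group pending items into one list per origin."""
    queues = defaultdict(list)
    for item in pending:
        queues[comment_origin(item)].append(item)
    return dict(queues)

# Made-up sample data standing in for the moderation queue.
pending = [
    {"author": "Alice", "comment_type": ""},
    {"author": "spam-blog.example", "comment_type": "trackback"},
    {"author": "Bob", "comment_type": "comment"},
    {"author": "other-blog.example", "comment_type": "pingback"},
]

queues = split_queue(pending)
for origin, items in queues.items():
    print(f"{origin}: {len(items)} pending, policy = {POLICIES.get(origin, 'hold')}")
```

Labeling every item with its origin up front is what makes the other two ideas possible: once the queues are separate, each one can get its own policy.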

Trackbacks and pingbacks are now off. Comments — from people — are open and encouraged. Spammers, you suck.


8 Responses to “reCAPTCHA, I Think I Get It”

Eric Burke Says:

Damn…disabling pings is not so easy. When you GLOBALLY disable pings in WordPress, it only affects future posts. All legacy posts still allow pings.

I had to manually execute some SQL to set ping_status=’closed’ on all entries in the posts table. I hope I didn’t just screw up my database.
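
For anyone trying the same thing, here is a minimal sketch of that kind of update. It is not the exact statement I ran. It assumes the default table name `wp_posts` (yours will differ if you use a custom table prefix) and uses an in-memory SQLite table as a stand-in for the real MySQL database, so the schema below is a cut-down illustration. Back up your database before running anything like this for real.

```python
import sqlite3

# SQLite stands in for WordPress's MySQL database here (assumption);
# this table is a simplified, illustrative version of wp_posts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_title TEXT, ping_status TEXT)"
)
conn.executemany(
    "INSERT INTO wp_posts (post_title, ping_status) VALUES (?, ?)",
    [("Old post", "open"), ("Older post", "open"), ("New post", "closed")],
)

def close_pings(connection):
    """Close pings on every post that still accepts them; return rows changed."""
    cur = connection.execute(
        "UPDATE wp_posts SET ping_status = 'closed' WHERE ping_status <> 'closed'"
    )
    connection.commit()
    return cur.rowcount

changed = close_pings(conn)
still_open = conn.execute(
    "SELECT COUNT(*) FROM wp_posts WHERE ping_status = 'open'"
).fetchone()[0]
print(f"closed pings on {changed} posts, {still_open} still open")
```

The `WHERE` clause is optional, but it keeps the reported row count limited to the legacy posts that actually changed.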

Ben Maurer Says:

Hi,

Yeah, I agree there could really be a better way of doing moderation. The way we do the plugin in WordPress actually leaves a lot to be desired. For example, getting it so that the user’s comment is saved if they make a typo on the CAPTCHA requires quite a bit of hackery. With a patch to WordPress, this could be made much easier.

In terms of moderation itself, I think you hit the key point: at all times it needs to be transparent what is allowed and why specific comments are getting through. For example, at the end of every comment, it’d be good to see:

This comment came from 128.2.1.1 and was marked as spam by the reCAPTCHA plugin

One other thing that’d be good is if you could get moderation emails for only non-spam comments. This is useful if you want to moderate the comments but don’t want to watch the spam comments from Akismet/WordPress.

Sam.Halliday Says:

I also tried ReCAPTCHA on my blog, but I have one major complaint… if the user types in the wrong code then the comment is lost in the refresh. It doesn’t seem to happen on your setup though, so perhaps I should look again.

CReview Says:

I’m also trying to use reCAPTCHA. For me it seems to automatically bypass the moderation queue, so all posts are posted without me having the chance to moderate them, if you see what I mean.
Mike
http://www.conspiracyreview.com

shoano Says:

hi hi recaptcha

Slots Says:

I also tried ReCAPTCHA on my blog, but I have one major complaint… if the user types in the wrong code then the comment is lost in the refresh. It doesn’t seem to happen on your setup though, so perhaps I should look again.

Louis Says:

I’ve been investigating reCAPTCHA for a blog I’m building, but I can go to almost any site that uses it, type whatever I want in the reCAPTCHA box…and my comments are posted right before my eyes. I don’t fully understand what’s going on, but I think the possibilities are: 1) improper implementation, 2) it doesn’t actually work, 3) a combination of the two, 4) I’m a complete idiot and I don’t understand anything. The likely answer is #4. Thanks for any comments. If you get this post, it was after I typed gibberish in the reCAPTCHA box.

Louis Says:

Just so you know…I only had to type the first word correctly. Then I typed any ol’ thing I wanted and my comment is present. Any thoughts on this?
