WP Plugin » SpamKit Plugin 0.0 – Time-Based-Tokens to Fight Spam

This is the first release, a prototype, of SpamKit for WordPress.

SpamKit was written by Gerard Calderhead. It’s a PHP library that uses secure time-based-tokens to help validate form posts, and it can be used on guestbooks, blogs, form-mailers, etc.

It does this by generating a checksummed and encrypted ‘token’ containing the UNIX timestamp of the moment it was generated. This token is written into the form as a hidden field. When the form is posted back to the server, the token’s value is validated. If the token is invalid or has been tampered with, validation fails; if the token has ‘expired’, validation also fails.
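To make the idea concrete, here is a minimal sketch of a time-based token in Python. It is not SpamKit's actual code or token format (SpamKit is a PHP library and its internals are not shown in this post). The sketch signs the timestamp with an HMAC instead of encrypting it, and the names `SECRET_KEY`, `MAX_AGE_SECONDS`, `make_token` and `check_token` are made up for illustration.

```python
import base64
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical server-side secret
MAX_AGE_SECONDS = 3600     # hypothetical expiry window


def _sign(issued):
    """Hex HMAC-SHA256 of the timestamp bytes, as ASCII bytes."""
    return hmac.new(SECRET_KEY, issued, hashlib.sha256).hexdigest().encode("ascii")


def make_token(now=None):
    """Return a token carrying the issue time plus a signature over it."""
    issued = str(int(time.time() if now is None else now)).encode("ascii")
    return base64.urlsafe_b64encode(issued + b"." + _sign(issued)).decode("ascii")


def check_token(token, now=None):
    """Return True only for an untampered token that has not expired."""
    if not token:
        return False  # missing token
    try:
        raw = base64.urlsafe_b64decode(token.encode("ascii"))
        issued, mac = raw.split(b".", 1)
        issued_at = int(issued)
    except ValueError:
        return False  # corrupt token
    if not hmac.compare_digest(mac, _sign(issued)):
        return False  # tampered token
    age = (time.time() if now is None else now) - issued_at
    return 0 <= age <= MAX_AGE_SECONDS  # reject expired or future-dated tokens
```

The value returned by `make_token()` would go into the hidden form field, and `check_token()` would run on the value posted back.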

I took SpamKit and plugged it into Wordpress to do the following:

- When a comment form is drawn, a time-based-token is generated and inserted as a hidden field in the form.
- Where the comment would normally be approved, SpamKit is used to validate the token; if the token is corrupt, missing or expired, the comment is flagged as ‘spam’, which prevents any email notification of the comment being posted.
- After the comment has been saved (as ‘spam’) by WordPress, the plugin changes the comment’s status to ‘Awaiting Moderation’ so that the moderator can delete it at a later date.
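The three steps above boil down to a small decision, sketched here in Python purely as an illustration. This is not the plugin's code and it does not call any WordPress API; `process_comment`, `token_ok` and `normal_status` are hypothetical names, and the status strings simply mirror the wording used above.

```python
def process_comment(token_ok, normal_status="approved"):
    """Return (final_status, email_allowed) for one submitted comment.

    token_ok      -- result of validating the hidden time-based token
    normal_status -- assumption: the status the comment would get anyway
    """
    # Step 2: at approval time, a failed token check flags the comment as spam.
    status_at_save = normal_status if token_ok else "spam"

    # Comments saved as spam do not trigger a notification email.
    email_allowed = status_at_save != "spam"

    # Step 3: after saving, flagged comments move to the moderation queue.
    final_status = status_at_save if token_ok else "awaiting moderation"
    return final_status, email_allowed
```

A comment with a bad token therefore ends up as `("awaiting moderation", False)`: visible to the moderator, but with no email sent.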

The end result is that comment spam sits in the ‘Awaiting Moderation’ list without generating any email notification.

The third step may not be what everyone wants from the plugin but, as this is a prototype, there are no option pages to control it yet.

The SpamKit plugin has been tested on Wordpress 1.5 only and found to operate as expected on even the most liberal configurations.

Installation is simple: there are no configuration options that require changing. Simply copy it into the plugins directory and activate it from the administration screen.

Download: spamkit-plugin.zip

Comments, Questions and Feedback welcomed!

Updated [3rd January 2006] – Download link points to wp-plugins.org

‘NASA Search 1.0’ ??? Something Google should worry about ???

Having written my own WordPress logging / statistics plug-in over the weekend (it is still a prototype, so consider it ‘coming soon’), I have started to notice more and more peculiar User-Agents visiting my blog.

I quite like to keep an eye on which spiders / bots visit my sites and how often they return, and to try to infer something about how they were designed by watching them visit.

I was surprised recently to see that the big three (Yahoo!, MSN & Google) actually pull RSS feeds as well as HTML pages. Of course, this makes sense from an efficiency and bandwidth point of view: the RSS feed is the interesting content with the page layout already stripped out.

Today’s one is a real winner though, coming from the following net block and advertising itself as “NASA Search 1.0”.

CODE:
  Comcast Cable Communications, Inc. NJ-SOUTH-4 (NET-68-46-128-0-1)
                                    68.46.128.0 - 68.46.191.255
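If you want to check your own logs for visitors from this range, Python's standard `ipaddress` module can turn the start and end addresses into a network and test individual IPs against it. This is only a quick sketch; the helper name `in_block` and the sample addresses in the usage note are my own.

```python
import ipaddress

# Start and end of the range reported for the net block above.
first = ipaddress.ip_address("68.46.128.0")
last = ipaddress.ip_address("68.46.191.255")

# Collapse the range into CIDR networks; this range is a single /18.
networks = list(ipaddress.summarize_address_range(first, last))


def in_block(addr):
    """Return True if the dotted-quad address falls inside the range."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)
```

For example, `in_block("68.46.150.7")` is true while `in_block("68.46.192.1")` is false, since the block ends at 68.46.191.255.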

The bot / spider crawled my entire site within a few minutes, starting from my ‘changes-in-wordpress-152’ post and was completely oblivious to my robots.txt (it didn’t even request it).

Also, it appeared to be quite a primitive HTTP client: it provided no referrer information and none of the usual headers such as “Connection: close” or “Accept: */*”, even though it was sending an “HTTP/1.1” request. Surprisingly, though, it did persist a session cookie for the duration of its visit.
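A rough way to spot clients like this in a log is to count which of the usual headers are missing. The sketch below is only a heuristic of my own, not part of any plug-in: the checklist `EXPECTED_HEADERS`, the function names and the threshold of two missing headers are all assumptions.

```python
# Hypothetical checklist of headers a typical client usually sends.
EXPECTED_HEADERS = ("accept", "connection", "referer")


def missing_headers(headers):
    """Return the expected header names absent from a request's headers."""
    present = {name.lower() for name in headers}  # header names are case-insensitive
    return [name for name in EXPECTED_HEADERS if name not in present]


def looks_primitive(protocol, headers, threshold=2):
    """Flag an HTTP/1.1 request that omits several of the usual headers."""
    return protocol == "HTTP/1.1" and len(missing_headers(headers)) >= threshold


# A request shaped like the one described above: a cookie but little else.
bot_request = {"Host": "example.org", "User-Agent": "NASA Search 1.0", "Cookie": "session=abc"}
suspicious = looks_primitive("HTTP/1.1", bot_request)
```

A missing Referer on its own proves nothing, since any first visit or bookmarked page arrives without one, which is why the sketch only flags requests missing at least two of the headers.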

I Google’d for the phrase “NASA Search 1.0” and only seemed to find results where auto-generated-stats pages list visiting User-Agents.

It would be quite interesting (and maybe even fun – in a very geeky way) to write a Wordpress plug-in that watches for these peculiar bots and pings their details to a centralised stats database – forming a sort of spider-spider.

Anyway, I will be keeping a keen eye out for the return of “NASA Search 1.0” … Could it be the next great NASA-funded project? Or is it just some smart a** who has figured out how to change the User-Agent string in his favourite spider/bot?

Stay tuned!