<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Weblog of Michael Cutler &#187; Internet</title>
	<atom:link href="http://blog.lobstertechnology.com/category/internet/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.lobstertechnology.com</link>
	<description>"I felt a great disturbance in the Force, as if millions of peers suddenly cried out in terror and were suddenly silenced."</description>
	<lastBuildDate>Tue, 17 Oct 2006 14:40:43 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Patch to mod_evasive to enhance reporting</title>
		<link>http://blog.lobstertechnology.com/2006/03/29/patch-to-mod_evasive-to-enhance-reporting/</link>
		<comments>http://blog.lobstertechnology.com/2006/03/29/patch-to-mod_evasive-to-enhance-reporting/#comments</comments>
		<pubDate>Wed, 29 Mar 2006 09:27:58 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Apache]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2006/03/29/patch-to-mod_evasive-to-enhance-reporting/</guid>
		<description><![CDATA[This morning I took the opportunity to install mod_evasive on my Apache Web Server after being hammered by zombies last night. It appears to work well; I tested it by loading it up with small-scale DoS attacks. It blocked the offending addresses as expected, produced the relevant syslog entries &#038; triggered my external reporting script. I was, however, a little disappointed with its script execution functionality: it basically did a "system" call, allowing you to pass only one argument - the offending IP address.]]></description>
			<content:encoded><![CDATA[<p>This morning I took the opportunity to install mod_evasive on my Apache Web Server after being hammered by zombies last night. Quote from [<a href="http://www.nuclearelephant.com/projects/mod_evasive/">www.nuclearelephant.com</a>]:</p>
<blockquote><p>mod_evasive is an evasive maneuvers module for Apache to provide evasive action in the event of an HTTP DoS or DDoS attack or brute force attack. It is also designed to be a detection and network management tool, and can be easily configured to talk to ipchains, firewalls, routers, and etcetera. mod_evasive presently reports abuses via email and syslog facilities.</p></blockquote>
<p>It appears to work well; I tested it by loading it up with small-scale DoS attacks. It blocked the offending addresses as expected, produced the relevant syslog entries &#038; triggered my external reporting script. I was, however, a little disappointed with its script execution functionality: it basically did a "system" call, allowing you to pass only one argument - the offending IP address.</p>
<p>I already have <a href="http://www.modsecurity.org/">mod_security</a> installed, which also executes an external reporting script. However, mod_security has a neat little feature which I took for granted: it passes all the 'environment' variables from the request to the script, allowing you to see the request itself &#038; any headers passed.</p>
<p>For example, a typical mod_security email alert for me would contain:</p>
<p><code>DOCUMENT_ROOT=/usr/local/apache/vhosts/www.domain.com<br />
GATEWAY_INTERFACE=CGI/1.1<br />
HTTP_ACCEPT=*/*<br />
HTTP_ACCEPT_ENCODING=gzip, x-gzip<br />
HTTP_CONNECTION=close<br />
HTTP_HOST=www.domain.com<br />
HTTP_MOD_SECURITY_ACTION=500<br />
HTTP_MOD_SECURITY_EXECUTED=/usr/local/scripts/modsec_alert.pl<br />
HTTP_MOD_SECURITY_MESSAGE=Access denied with code 500. Error normalizing REQUEST_URI: Invalid URL encoding detected: not enough characters<br />
HTTP_USER_AGENT=Mozilla/4.0<br />
PATH=/bin:/sbin...<br />
PATH_INFO=/search.cgi<br />
PATH_TRANSLATED=/usr/local/scripts/modsec_alert.pl<br />
QUERY_STRING=q='object+levels%<br />
REDIRECT_STATUS=302<br />
REMOTE_ADDR=XXX.XXX.XXX.XXX<br />
REMOTE_PORT=45852<br />
REQUEST_METHOD=GET<br />
REQUEST_URI=/cgi-bin/search.cgi?q='object+levels%<br />
SCRIPT_FILENAME=/usr/local/apache/vhosts/www.domain.com/cgi-bin<br />
SCRIPT_NAME=/cgi-bin<br />
SERVER_ADDR=XXX.XXX.XXX.XXX<br />
SERVER_ADMIN=foo@bar<br />
SERVER_NAME=www.domain.com<br />
SERVER_PORT=80<br />
SERVER_PROTOCOL=HTTP/1.1<br />
SERVER_SIGNATURE=<br />
SERVER_SOFTWARE=Apache</code></p>
<p>This shows me detailed information about the request which was declined and why. I wanted to get similar functionality out of mod_evasive and I achieved this with the following additional code (butchered from mod_security).</p>
<div class="syntax_hilite"><span class="langName">C++:</span>
<div id="cpp-1">
<div class="cpp">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #0000ff;">if</span> <span style="color: #000000;">&#40;</span>sys_command != <span style="color: #0000ff;">NULL</span><span style="color: #000000;">&#41;</span> <span style="color: #000000;">&#123;</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; <span style="color: #0000ff;">char</span> **env = <span style="color: #0000ff;">NULL</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; <span style="color: #0000ff;">const</span> <span style="color: #0000ff;">char</span> *args<span style="color: #000000;">&#91;</span><span style="color: #0000dd;color:#800000;">5</span><span style="color: #000000;">&#93;</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; ap_add_cgi_vars<span style="color: #000000;">&#40;</span>r<span style="color: #000000;">&#41;</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; ap_add_common_vars<span style="color: #000000;">&#40;</span>r<span style="color: #000000;">&#41;</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; env = <span style="color: #000000;">&#40;</span><span style="color: #0000ff;">char</span> **<span style="color: #000000;">&#41;</span>ap_create_environment<span style="color: #000000;">&#40;</span>r-&gt;pool, r-&gt;subprocess_env<span style="color: #000000;">&#41;</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; ap_cleanup_for_exec<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; args<span style="color: #000000;">&#91;</span><span style="color: #0000dd;color:#800000;">0</span><span style="color: #000000;">&#93;</span> = filename;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; args<span style="color: #000000;">&#91;</span><span style="color: #0000dd;color:#800000;">1</span><span style="color: #000000;">&#93;</span> = text_add;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; args<span style="color: #000000;">&#91;</span><span style="color: #0000dd;color:#800000;">2</span><span style="color: #000000;">&#93;</span> = <span style="color: #0000ff;">NULL</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; execve<span style="color: #000000;">&#40;</span>sys_command, <span style="color: #000000;">&#40;</span><span style="color: #0000ff;">char</span> ** <span style="color: #0000ff;">const</span><span style="color: #000000;">&#41;</span>&amp;args, env<span style="color: #000000;">&#41;</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #000000;">&#125;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p>The original mod_evasive code expects a sprintf format string as the parameter 'sys_command', allowing you to define a position with '%s' where the IP address should be inserted. My code above does not do this; it expects 'sys_command' to be the path to an executable which takes a single argument, the IP address.</p>
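<p>To make the difference between the two calling conventions concrete, here is a minimal Python sketch. It is not the patch itself, only an illustration of the behaviour described above: the function names and the <code>/usr/local/scripts/alert.sh</code> path are hypothetical, and the environment is modelled as the list of NAME=value strings that execve() receives.</p>

```python
def old_style_command(sys_command_format: str, ip: str) -> str:
    """Original behaviour as described: a sprintf-style format string
    containing one %s, expanded and then handed to system()."""
    return sys_command_format % ip


def new_style_exec(sys_command: str, ip: str, request_vars: dict) -> tuple:
    """Patched behaviour as described: sys_command is a plain path, the IP
    is the single argument, and the request variables become the child's
    environment. Returns the three pieces execve() takes."""
    argv = [sys_command, ip]
    envp = [f"{name}={value}" for name, value in sorted(request_vars.items())]
    return sys_command, argv, envp


if __name__ == "__main__":
    # Hypothetical script paths and a documentation-range IP address.
    print(old_style_command("/usr/local/scripts/alert.sh %s", "192.0.2.1"))
    path, argv, envp = new_style_exec(
        "/usr/local/scripts/mod_evasive_alert.pl",
        "192.0.2.1",
        {"REQUEST_METHOD": "GET", "HTTP_HOST": "www.domain.com"},
    )
    print(path, argv, envp)
```

<p>The practical consequence is that an existing DOSSystemCommand value containing '%s' would need to be changed to a bare script path after patching.</p>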
<p>This change can be applied automagically - to the Apache 1.3.x version of mod_evasive.c only - with the following patch: <a href="http://svn.lobstertechnology.com/mod_evasive/mod_evasive_execve.patch">mod_evasive_execve.patch</a></p>
<p>Assuming mod_evasive_1.10.1.tar.gz &#038; mod_evasive_execve.patch have already been downloaded to the same directory:</p>
<p><code>[foo@bar ~]$ <strong>tar zxf mod_evasive_1.10.1.tar.gz</strong><br />
[foo@bar ~]$ <strong>cd mod_evasive</strong><br />
[foo@bar mod_evasive]$ <strong>patch -p1 &lt; ../mod_evasive_execve.patch</strong><br />
patching file mod_evasive.c<br />
[foo@bar mod_evasive]$ <strong>$APACHE_ROOT/bin/apxs -iac mod_evasive.c</strong><br />
gcc -DLINUX=22 -DEAPI -I/usr/include/gdbm -DUSE_HSREGEX -fpic  -DEAPI -DSHARED_MODULE -I/usr/local/apache/include  -c mod_evasive.c<br />
gcc -shared -o mod_evasive.so mod_evasive.o<br />
[activating module `evasive' in /usr/local/apache/conf/httpd.conf]<br />
cp mod_evasive.so /usr/local/apache/libexec/mod_evasive.so<br />
chmod 755 /usr/local/apache/libexec/mod_evasive.so<br />
cp /usr/local/apache/conf/httpd.conf /usr/local/apache/conf/httpd.conf.bak<br />
cp /usr/local/apache/conf/httpd.conf.new /usr/local/apache/conf/httpd.conf<br />
rm /usr/local/apache/conf/httpd.conf.new<br />
[foo@bar mod_evasive]$ </code></p>
<p>Now create a simple shell/perl/something script to use this info. My example emails both me and the address listed as the SERVER_ADMIN. Because each VirtualHost on my server has a 'ServerAdmin' entry with the owner's email address, my customers get a copy of the email too.</p>
<div class="syntax_hilite"><span class="langName">PERL:</span>
<div id="perl-2">
<div class="perl">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;">#!/usr/bin/perl</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #808080; font-style: italic;"># /usr/local/scripts/mod_evasive_alert.pl</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #0000ff;">$IP</span>=<span style="color: #0000ff;">$ARGV</span><span style="color: #66cc66;">&#91;</span><span style="color: #cc66cc;color:#800000;">0</span><span style="color: #66cc66;">&#93;</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #0000ff;">$MSG</span>=<span style="color: #ff0000;">"mod_evasive has blacklisted the IP $IP.<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>"</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #b1b100;">foreach</span> <span style="color: #0000ff;">$key</span> <span style="color: #66cc66;">&#40;</span> <a href="http://www.perldoc.com/perl5.6/pod/func/sort.html"><span style="color: #000066;">sort</span></a> <a href="http://www.perldoc.com/perl5.6/pod/func/keys.html"><span style="color: #000066;">keys</span></a> <span style="color: #0000ff;">%ENV</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp;<span style="color: #0000ff;">$MSG</span> .= <span style="color: #0000ff;">$key</span> . <span style="color: #ff0000;">"="</span> . <span style="color: #0000ff;">$ENV</span><span style="color: #66cc66;">&#123;</span><span style="color: #0000ff;">$key</span><span style="color: #66cc66;">&#125;</span> . <span style="color: #ff0000;">"<span style="color: #000099; font-weight: bold;">\n</span>"</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #66cc66;">&#125;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.perldoc.com/perl5.6/pod/func/open.html"><span style="color: #000066;">open</span></a><span style="color: #66cc66;">&#40;</span>SENDMAIL, <span style="color: #ff0000;">"|/usr/sbin/sendmail -t"</span><span style="color: #66cc66;">&#41;</span> <span style="color: #b1b100;">or</span> <a href="http://www.perldoc.com/perl5.6/pod/func/die.html"><span style="color: #000066;">die</span></a> <span style="color: #ff0000;">"Cannot open sendmail: $!"</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.perldoc.com/perl5.6/pod/func/print.html"><span style="color: #000066;">print</span></a> SENDMAIL <span style="color: #ff0000;">"Reply-To: foo<span style="color: #000099; font-weight: bold;">\@</span>bar<span style="color: #000099; font-weight: bold;">\n</span>"</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.perldoc.com/perl5.6/pod/func/print.html"><span style="color: #000066;">print</span></a> SENDMAIL <span style="color: #ff0000;">"Subject: [lobstertechnology.com] mod_evasive alert $IP<span style="color: #000099; font-weight: bold;">\n</span>"</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.perldoc.com/perl5.6/pod/func/print.html"><span style="color: #000066;">print</span></a> SENDMAIL <span style="color: #ff0000;">"To: "</span> . <span style="color: #0000ff;">$ENV</span><span style="color: #66cc66;">&#123;</span><span style="color: #ff0000;">'SERVER_ADMIN'</span><span style="color: #66cc66;">&#125;</span> . <span style="color: #ff0000;">"<span style="color: #000099; font-weight: bold;">\n</span>"</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.perldoc.com/perl5.6/pod/func/print.html"><span style="color: #000066;">print</span></a> SENDMAIL <span style="color: #ff0000;">"Cc: foo<span style="color: #000099; font-weight: bold;">\@</span>bar<span style="color: #000099; font-weight: bold;">\n</span>"</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.perldoc.com/perl5.6/pod/func/print.html"><span style="color: #000066;">print</span></a> SENDMAIL <span style="color: #ff0000;">"Content-type: text/plain<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>"</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.perldoc.com/perl5.6/pod/func/print.html"><span style="color: #000066;">print</span></a> SENDMAIL <span style="color: #0000ff;">$MSG</span>;</div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><a href="http://www.perldoc.com/perl5.6/pod/func/close.html"><span style="color: #000066;">close</span></a><span style="color: #66cc66;">&#40;</span>SENDMAIL<span style="color: #66cc66;">&#41;</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p>Now configure mod_evasive to execute your script when it is triggered, add the following to your $APACHE_ROOT/conf/httpd.conf:</p>
<div class="syntax_hilite"><span class="langName">CODE:</span>
<div id="code-3">
<div class="code">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&lt;IfModule mod_evasive.c&gt;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; DOSSystemCommand&nbsp; &nbsp; <span style="color:#CC0000;">"/usr/local/scripts/mod_evasive_alert.pl"</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&lt;/IfModule&gt; </div>
</li>
</ol>
</div>
</div>
</div>
<p>Now restart Apache:</p>
<p><code>[foo@bar mod_evasive]$ <strong>$APACHE_ROOT/bin/apachectl restart</strong><br />
/usr/local/apache/bin/apachectl restart: httpd restarted</code></p>
<p>Tada! You're done. Use the 'test.pl' script provided by mod_evasive to trigger a blocking of your IP and see the email generated.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2006/03/29/patch-to-mod_evasive-to-enhance-reporting/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Analysis of Spamming Zombie Botnets</title>
		<link>http://blog.lobstertechnology.com/2006/03/02/analysis-of-spamming-zombie-botnets/</link>
		<comments>http://blog.lobstertechnology.com/2006/03/02/analysis-of-spamming-zombie-botnets/#comments</comments>
		<pubDate>Thu, 02 Mar 2006 01:13:47 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Spam]]></category>
		<category><![CDATA[SpamKit]]></category>
		<category><![CDATA[Wordpress]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2006/03/02/analysis-of-spamming-zombie-botnets/</guid>
		<description><![CDATA[Since writing my SpamKit Plugin I have been keeping a keen eye on the comment/trackback spam subject and have guinea pig'd my ideas on my own blog. Recently I noticed a distinct change in the sophistication of comment-spammers.
The early comment-spammers were using very basic HTTP clients, mostly without thinking about what's going on 'under the [...]]]></description>
			<content:encoded><![CDATA[<p>Since writing my SpamKit Plugin I have been keeping a keen eye on the comment/trackback spam subject and have guinea pig'd my ideas on my own blog. Recently I noticed a distinct change in the sophistication of comment-spammers.</p>
<p>The early comment-spammers were using very basic HTTP clients, mostly without thinking about what's going on 'under the hood'. As such their spam-messages would come through with easily filtered HTTP "User-Agent" headers like "<code>PEAR HTTP_Request class ( http://pear.php.net/ )</code>" and "<code>libwww-perl/5.803</code>". Over a period of a few months these – what I call 1st generation – bots began to dwindle in numbers, replaced by slightly more sophisticated clients which loosely emulated real browsers.</p>
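<p>A filter for these 1st generation bots can be very small. The following Python sketch is only an illustration, not SpamKit's code: the marker list contains just the two examples quoted above, and treating a missing or empty header as suspicious is an extra assumption.</p>

```python
# Substrings taken from the two examples quoted in the post;
# a real list would be longer and maintained over time.
SUSPICIOUS_AGENT_MARKERS = ("PEAR HTTP_Request", "libwww-perl")


def is_suspicious_user_agent(user_agent):
    """Return True for a missing/empty User-Agent (an assumption) or one
    that names a known HTTP client library."""
    if user_agent is None or not user_agent.strip():
        return True
    lowered = user_agent.lower()
    return any(marker.lower() in lowered for marker in SUSPICIOUS_AGENT_MARKERS)


if __name__ == "__main__":
    for agent in ("libwww-perl/5.803", "Mozilla/4.0 (compatible; MSIE 6.0)", ""):
        print(repr(agent), is_suspicious_user_agent(agent))
```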
<p>These 2nd generation bots were still very primitive. Apart from changing the "User-Agent" and adding a few other headers, they would repeatedly attempt to post comments on a number of posts within a few seconds. This activity is also easily filtered, since not even a superhuman Blog-fiend could comment on your top ten posts in less than 10 seconds.</p>
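<p>The "too many posts, too quickly" rule can be sketched as a sliding window per source IP. This is a hedged Python illustration: the class name, the limit of three distinct posts and the ten-second window are assumptions chosen to match the description, not values taken from any real plugin.</p>

```python
from collections import defaultdict, deque


class CommentBurstDetector:
    """Flag an IP that comments on too many distinct posts in a short window."""

    def __init__(self, max_posts=3, window_seconds=10.0):
        self.max_posts = max_posts
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # ip -> deque of (timestamp, post_id)

    def is_burst(self, ip, post_id, timestamp):
        """Record one comment attempt; return True once the IP exceeds the limit."""
        recent = self._recent[ip]
        recent.append((timestamp, post_id))
        # Drop attempts that have fallen out of the sliding window.
        while recent and timestamp - recent[0][0] > self.window_seconds:
            recent.popleft()
        distinct_posts = {pid for _, pid in recent}
        return len(distinct_posts) > self.max_posts


if __name__ == "__main__":
    detector = CommentBurstDetector()
    # Four different posts within three seconds from one address.
    for post_id, when in [(1, 0.0), (2, 1.0), (3, 2.0), (4, 3.0)]:
        print(post_id, detector.is_burst("198.51.100.7", post_id, when))
```

<p>Counting distinct posts rather than raw requests keeps a real visitor who double-submits one comment from being flagged.</p>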
<p>All the attempts so far have been very basic, beginners in Perl / PHP could probably pull it off easily, and they are just as easily filtered out.</p>
<p>Over the Christmas period I observed some very unusual activity, a 'spam attack' coming from dozens of source IP addresses, coordinated within a few minutes. I initially spotted it because the "User-Agent" header was completely empty – stands out a bit. After some investigation and further attacks I became pretty confident this wasn't a fluke or coincidence of independent spammers.</p>
<p>I knocked up a quick Wordpress plug-in to capture as much info about these suspicious requests as possible. Here is one of the first attacks.</p>
<p><small><code>03/02/2006 20:37:44 212.0.XXX.XXX GET /<br />
03/02/2006 20:38:14 201.242.XXX.XXX GET /category/wordpress/plugins/<br />
03/02/2006 20:39:54 210.183.XXX.XXX GET /2006/02/02/search-term-highlighter-plugin-0-0/<br />
03/02/2006 20:40:25 200.122.XXX.XXX GET /category/java/jakarta-velocity/<br />
03/02/2006 20:40:37 62.23.XXX.XXX GET /2006/02/02/sitecom-cn-502-usb-bluetooth-dongle-works-on-linux/<br />
03/02/2006 20:40:55 68.96.XXX.XXX GET /2006/02/02/search-term-highlighter-plugin-0-0/<br />
03/02/2006 20:41:18 70.88.XXX.XXX POST /wp-comments-post.php<br />
03/02/2006 20:41:20 70.88.XXX.XXX GET /category/thoughts/<br />
03/02/2006 20:41:44 200.21.XXX.XXX POST /wp-comments-post.php<br />
03/02/2006 20:41:48 200.21.XXX.XXX GET /2006/01/25/ti-7x21-flashmedia-sd-host-controller-104c-8033/<br />
03/02/2006 20:42:16 61.145.XXX.XXX GET /category/wordpress/plugins/search-term-highlighter/<br />
03/02/2006 20:42:24 217.113.XXX.XXX GET /category/flash/<br />
03/02/2006 20:42:48 212.251.XXX.XXX GET /category/internet/<br />
03/02/2006 20:43:04 205.180.XXX.XXX POST /wp-comments-post.php<br />
03/02/2006 20:43:22 82.76.XXX.XXX GET /keywords/<br />
03/02/2006 20:43:56 218.248.XXX.XXX GET /2006/02/02/search-term-highlighter-plugin-0-0/#postcomment<br />
03/02/2006 20:44:13 206.191.XXX.XXX GET /2006/02/02/search-term-highlighter-plugin-0-0/%23postcomment<br />
03/02/2006 20:44:14 206.191.XXX.XXX GET /category/tools/<br />
03/02/2006 20:44:15 206.191.XXX.XXX GET /category/wordpress/plugins/search-term-highlighter/<br />
03/02/2006 20:44:38 62.23.XXX.XXX GET /category/wordpress/plugins/search-term-highlighter/<br />
03/02/2006 20:45:33 82.76.XXX.XXX POST /wp-comments-post.php<br />
03/02/2006 20:45:34 82.76.XXX.XXX GET /category/tools/<br />
03/02/2006 20:45:35 82.76.XXX.XXX POST /wp-comments-post.php<br />
03/02/2006 20:45:48 203.162.XXX.XXX POST /wp-comments-post.php</code></small></p>
<p>In this particular instance, the attack took place over roughly ten minutes. The first request was an HTTP GET on the root of my Blog "/", almost certainly used to feed the other bots with URLs. Next, other clients in the Botnet continued to spider my Blog in parallel, building a list of URLs to try later. Lastly came the first attempts to post a comment.</p>
<p>If you examine the sequence of requests, the bots are posting a comment, then coming back to check if it was successful. Analysis of later attacks even found other bots in the group checking if the comment posted by a peer bot was successful. The participating hosts are located all over the world but the majority are in North America and Asia.</p>
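<p>The "post, then come back to check" pattern can be pulled out of a log like the one above mechanically. The Python sketch below is an illustration under stated assumptions: the line layout is inferred from the sample, the dates are read day-first, the five-second gap is arbitrary, and the masked addresses are compared as plain strings (so two different hosts sharing a masked prefix would be merged).</p>

```python
from datetime import datetime

# Five lines copied from the capture above.
SAMPLE_LINES = [
    "03/02/2006 20:41:18 70.88.XXX.XXX POST /wp-comments-post.php",
    "03/02/2006 20:41:20 70.88.XXX.XXX GET /category/thoughts/",
    "03/02/2006 20:41:44 200.21.XXX.XXX POST /wp-comments-post.php",
    "03/02/2006 20:41:48 200.21.XXX.XXX GET /2006/01/25/ti-7x21-flashmedia-sd-host-controller-104c-8033/",
    "03/02/2006 20:42:16 61.145.XXX.XXX GET /category/wordpress/plugins/search-term-highlighter/",
]


def parse_line(line):
    """Split 'DD/MM/YYYY HH:MM:SS ip METHOD path' into its parts."""
    date_part, time_part, ip, method, path = line.split(None, 4)
    # Day-first date order is an assumption about the log format.
    when = datetime.strptime(f"{date_part} {time_part}", "%d/%m/%Y %H:%M:%S")
    return when, ip, method, path


def post_then_check(lines, max_gap_seconds=5):
    """Return (ip, post_time, get_path) for each POST that is followed
    by a GET from the same IP within max_gap_seconds."""
    last_post = {}  # ip -> time of its most recent POST
    hits = []
    for line in lines:
        when, ip, method, path = parse_line(line)
        if method == "POST":
            last_post[ip] = when
        elif method == "GET" and ip in last_post:
            gap = (when - last_post[ip]).total_seconds()
            if 0 <= gap <= max_gap_seconds:
                hits.append((ip, last_post[ip], path))
            del last_post[ip]  # each POST is matched at most once
    return hits


if __name__ == "__main__":
    for ip, posted_at, path in post_then_check(SAMPLE_LINES):
        print(ip, posted_at.time(), path)
```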
<p>This demonstrates a very high level of sophistication. Initially I presumed that there was a single client application running requests in parallel over a group of HTTP proxies. After tracking down the locations &#038; owners of each of the participants in the attacks, I concluded it was implausible that they all happened to have open proxies being abused in this way. A large proportion of the machines being used are actually web servers which have probably been exploited and are running IRC-controlled Trojan software.</p>
<p>Backing this up is the pace at which these attacks are evolving. The first few were very primitive, without even an HTTP "User-Agent" header; however, this was very quickly amended. The most recent attack I observed (1st March 2006) showed even more improvements: each client was almost indistinguishable from a normal visitor, providing full 'Internet Explorer'-like headers of accepted MIME types, charsets and languages, and even including valid HTTP "Referer" headers and cookies.<br />
Thankfully, all their time seems to be invested in improving the client software; the actual content of the comments was practically identical.</p>
<p>My SpamKit Plugin has so far easily handled each of these situations. It uses <a href='http://www.webofshite.com/?p=8'>Gerry</a>'s "Time Based Tokens", which are auto-generated and written into a hidden form field. Any incoming comment without a token, or with an invalid token, is held for moderation, with zero impact on real visitors writing comments. Unlike other solutions, it does not require the user to type in a random key from an image (the 'captcha' technique), nor does it rely on JavaScript support in the browser. Until these spam bots reach a level of sophistication where they parse out HTML forms, including hidden values, and post them back, the current version of SpamKit will remain an effective solution.</p>
<p>However, there is one major drawback with SpamKit: pingbacks/trackbacks are machine-generated, so they will not have a "Time Based Token" and will be held for moderation as if they were spam. The problem with this is that spammers are also increasingly using the pingback/trackback mechanism to get their comments through the net. A lot of thought and discussion on this subject with <a href='http://www.webofshite.com/'>Gerry</a> led to one potential solution: scoring &#038; validation of the URL the pingback/trackback is supposedly from.</p>
<p>In early examples of trackback spam, the URL given pointed straight to some advertising-based web page. Something like this lends itself to easy detection and filtering, as the content, when examined, would score highly for spam keywords like 'Viagra'. However, these attacks have also evolved: the most recent point to real web pages or Blogs that contain obfuscated JavaScript redirection code, redirecting real visitors' browsers but avoiding any page content detection techniques. In some cases the code has been inserted into bulletin boards or guestbooks which allow unfiltered HTML.</p>
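<p>One possible way to build such a token is to sign the issue time with a server-side secret. The Python sketch below is an assumption about how a time based token could work, not a description of SpamKit's actual implementation: the secret, the function names and the minimum/maximum age limits are all hypothetical.</p>

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical per-site secret, never sent to the client


def make_token(now=None):
    """Return 'timestamp:signature' for embedding in a hidden form field."""
    issued = int(time.time() if now is None else now)
    sig = hmac.new(SECRET_KEY, str(issued).encode(), hashlib.sha256).hexdigest()
    return f"{issued}:{sig}"


def token_is_valid(token, now=None, min_age=5, max_age=3600):
    """Accept a token only if its signature matches and its age is plausible.

    min_age rejects forms submitted faster than a person could type;
    max_age rejects stale or replayed tokens. Both limits are assumptions.
    """
    current = int(time.time() if now is None else now)
    try:
        issued_text, sig = token.split(":", 1)
        issued = int(issued_text)
    except (AttributeError, ValueError):
        return False  # missing or malformed token
    expected = hmac.new(SECRET_KEY, str(issued).encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    return min_age <= current - issued <= max_age


if __name__ == "__main__":
    token = make_token()
    print(token, token_is_valid(token, now=time.time() + 30))
```

<p>A comment that fails the check would be held for moderation rather than discarded, so a real visitor with an expired form loses nothing.</p>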
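<p>Keyword scoring of the linked page, and the reason a JavaScript redirect slips past it, can both be shown in a few lines. This Python sketch is illustrative only: the keyword weights are invented, and it scores only the visible text of HTML that has already been fetched.</p>

```python
import re
from html.parser import HTMLParser

# Illustrative keywords and weights; these numbers are assumptions.
SPAM_KEYWORDS = {"viagra": 5, "casino": 3, "poker": 3}


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def spam_score(html):
    """Sum the weights of spam keywords found in the page's visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower())
    return sum(SPAM_KEYWORDS.get(word, 0) for word in words)


if __name__ == "__main__":
    print(spam_score("<p>Cheap Viagra and online casino</p>"))
    # An innocent-looking page whose only payload is a script redirect.
    print(spam_score("<p>My holiday photos</p>"
                     "<script>location.href='http://example.invalid/'</script>"))
```

<p>The second page scores zero because nothing spammy is visible in its text, which is exactly the gap the redirect technique exploits.</p>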
<p>An example page with obfuscated JavaScript redirection (warning, this will redirect you to <code>mp3search.ru</code>)</p>
<p><code>http://zigfrid.blog.kataweb.it/il_mio_weblog/</code></p>
<p>So, what measures can be taken to stop spam?</p>
<p>Personally I don't think you will ever get rid of spam. You have a pretty good chance of stopping all but the most sophisticated spammers, but you'll never stop 100% of spam. The best approach is to evolve your defences at the same rate as, or faster than, the opposition. For starters, <a href='http://www.webofshite.com/'>Gerry</a> &#038; I are constantly dreaming up new ways we can enhance SpamKit. Recent updates include encoding the original source IP address in the "Time Based Token", which becomes invalid if submitted from a different address. Other works in progress include hardcore validation of the email address submitted (does the domain exist? does it have a mail exchanger (MX) record?), content validation, keyword searching and estimating the probability of the content being spam. Progress will be reported here and on <a href='http://www.webofshite.com/'>Gerry</a>'s site.</p>
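<p>The "does the domain exist?" part of the email check can be sketched with the Python standard library alone. This is an assumption-laden illustration, not SpamKit code: the function names are hypothetical, the address parsing is deliberately simple, and an MX lookup is left out because it needs a DNS library beyond the standard one.</p>

```python
import socket


def email_domain(address):
    """Return the lower-cased domain of a simple 'local@domain' address,
    or None if the address does not have that shape."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or "." not in domain:
        return None
    return domain.lower()


def domain_resolves(domain, resolver=socket.getaddrinfo):
    """True if the resolver finds any address for the domain.

    The resolver is injectable so the check can be exercised without
    network access; by default it performs a real DNS lookup.
    """
    try:
        return bool(resolver(domain, None))
    except (socket.gaierror, UnicodeError):
        return False


if __name__ == "__main__":
    domain = email_domain("Someone@Example.COM")
    print(domain, domain_resolves(domain) if domain else False)
```

<p>A failed lookup is better treated as one signal among several than as proof of spam, since DNS can fail temporarily for legitimate domains.</p>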
<p>In the long term spammers are going to have clients that pretty much replicate real users down to the delays &#038; randomness between requests. Countermeasures are going to have to be just as sophisticated, evaluating content and even executing JavaScript as if they were also real clients.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2006/03/02/analysis-of-spamming-zombie-botnets/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>GoogleBot Experiment Success!</title>
		<link>http://blog.lobstertechnology.com/2006/02/24/googlebot-experiment-success/</link>
		<comments>http://blog.lobstertechnology.com/2006/02/24/googlebot-experiment-success/#comments</comments>
		<pubDate>Fri, 24 Feb 2006 02:58:28 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[Spiders & Bots]]></category>
		<category><![CDATA[Wordpress]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2006/02/24/googlebot-experiment-success/</guid>
		<description><![CDATA[A month has passed since I made a change to my Wordpress templates to experiment with GoogleBot (see previous post) and I can proudly report that it works like a charm.
My original problem was that Google was returning search results pointing to index-style pages on my Blog instead of the posts themselves. These index [...]]]></description>
			<content:encoded><![CDATA[<p>A month has passed since I made a change to my Wordpress templates to experiment with GoogleBot (<a href='http://blog.lobstertechnology.com/2006/01/23/experimenting-with-googlebot/'>see previous post</a>) and I can proudly report that it works like a charm.</p>
<p>My original problem was that Google was returning search results pointing to index-style pages on my Blog instead of the posts themselves. These index pages, like Categories &#038; Archives, update quickly, so the majority of visitors coming from Google search results were having a poor experience: the post that drew them in wasn't obviously visible.</p>
<p>I knew I could use robots meta directives to control the INDEX-ing and FOLLOW-ing of my site, but I was hesitant about applying experimental rules to all search engine robots. Thankfully GoogleBot looks for a meta tag addressed specifically to it, which made it easy to apply custom rules to Google only.</p>
<p>Using my Wordpress templates I added the following header on all index-style pages except the home page:</p>
<p><code>&lt;meta name="GOOGLEBOT" content="NOINDEX,FOLLOW"/&gt;</code></p>
<p>Basically this is instructing GoogleBot to follow links on this page, but not to index the page itself. The end result is that search results pointing to my blog are using the 'permalink' URL, not the index page it is listed on.</p>
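<p>The template rule amounts to a small decision per page type. The Python sketch below only illustrates that decision; the real change was made in the Wordpress templates, and the page-type names used here are hypothetical.</p>

```python
from typing import Optional

# Hypothetical names for the index-style pages that should not be indexed.
INDEX_STYLE_PAGES = {"category", "archive"}


def googlebot_directive(page_type: str) -> Optional[str]:
    """Return the content value for the GOOGLEBOT meta tag, or None if no tag is wanted."""
    if page_type in INDEX_STYLE_PAGES:
        # Follow the links on the page, but keep the page itself out of the index.
        return "NOINDEX,FOLLOW"
    # The home page and individual posts get no GOOGLEBOT tag at all.
    return None
```

<p>Only pages for which the function returns a value would carry the meta tag shown above.</p>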
<p> <img src='http://blog.lobstertechnology.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2006/02/24/googlebot-experiment-success/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using sshdfilter to secure an SSH server</title>
		<link>http://blog.lobstertechnology.com/2006/02/13/using-sshdfilter-to-secure-an-ssh-server/</link>
		<comments>http://blog.lobstertechnology.com/2006/02/13/using-sshdfilter-to-secure-an-ssh-server/#comments</comments>
		<pubDate>Mon, 13 Feb 2006 23:26:56 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2006/02/13/using-sshdfilter-to-secure-an-ssh-server/</guid>
		<description><![CDATA[Since moving my OpenSSH server down to its standard port number I have been hit daily by service scanning software and brute force password attacks. Gerry pointed out that sshdfilter can help.
sshdfilter blocks the frequent brute force attacks on ssh daemons, it does this by directly reading the sshd logging output and generating iptables rules, [...]]]></description>
			<content:encoded><![CDATA[<p>Since moving my OpenSSH server down to its standard port number I have been hit daily by service scanning software and brute force password attacks. <a href="http://webofshite.com/">Gerry</a> pointed out that <a href="http://www.csc.liv.ac.uk/~greg/sshdfilter/">sshdfilter</a> can help.</p>
<blockquote><p>sshdfilter blocks the frequent brute force attacks on ssh daemons, it does this by directly reading the sshd logging output and generating iptables rules, the process can be quick enough to block an attack before they get a chance to enter any password at all.</p></blockquote>
<p>It's quick and simple to set up. I enabled email alerts to see what it gets up to and can report it is all working fine on my servers (Red Hat 9 customised).</p>
<p>It will block when triggered by:</p>
<ul>
<li style='list-style:square'>An attempt to log in as a user which doesn't exist</li>
<li style='list-style:square'>N failed attempts to log in to an existing user account</li>
<li style='list-style:square'>An incoming connection that fails to provide an SSH version banner, which is part of the SSH protocol; it's most likely a port scanner or dumb client</li>
</ul>
<p>The length of time the block remains in place is configurable.</p>
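<p>The three triggers and the configurable block length can be sketched in Python as below. This is not sshdfilter's own code: the event names, the failure threshold and the block length are assumptions, and the sketch only tracks state where the real tool generates iptables rules.</p>

```python
from dataclasses import dataclass, field

MAX_FAILURES = 3            # hypothetical value for "N"
BLOCK_SECONDS = 3 * 86400   # hypothetical block length


@dataclass
class BlockTracker:
    failures: dict = field(default_factory=dict)       # ip -> failed login count
    blocked_until: dict = field(default_factory=dict)  # ip -> time the block ends

    def record(self, ip: str, event: str, now: float) -> bool:
        """Record one sshd event; return True if the address should now be blocked."""
        if event in ("invalid_user", "no_banner"):
            # Unknown account or missing SSH version banner: block at once.
            self.blocked_until[ip] = now + BLOCK_SECONDS
            return True
        if event == "failed_password":
            count = self.failures.get(ip, 0) + 1
            self.failures[ip] = count
            if count >= MAX_FAILURES:
                self.blocked_until[ip] = now + BLOCK_SECONDS
                return True
        return False

    def is_blocked(self, ip: str, now: float) -> bool:
        """Report whether a block for this address is still in force."""
        return self.blocked_until.get(ip, 0) > now
```

<p>Changing <code>BLOCK_SECONDS</code> stands in for the configurable block length mentioned above.</p>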
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2006/02/13/using-sshdfilter-to-secure-an-ssh-server/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brute force password attacks on Linux over SSH</title>
		<link>http://blog.lobstertechnology.com/2006/02/08/brute-force-passwords-attacks-on-linux-over-ssh/</link>
		<comments>http://blog.lobstertechnology.com/2006/02/08/brute-force-passwords-attacks-on-linux-over-ssh/#comments</comments>
		<pubDate>Wed, 08 Feb 2006 10:24:01 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2006/02/08/brute-force-passwords-attacks-on-linux-over-ssh/</guid>
		<description><![CDATA[This is one of the main reasons I hate running SSH on the standard port number: every day I get log alerts like these. As per usual I notify the originating ISP; at least I have an email template for it.
Failed logins from these:
  invalid user abdul (password) from 203.98.XXX.XXX: 2 Time(s)
  invalid user [...]]]></description>
			<content:encoded><![CDATA[<p>This is one of the main reasons I hate running SSH on the standard port number: every day I get log alerts like these. As per usual I notify the originating ISP; at least I have an email template for it.</p>
<p><code>Failed logins from these:<br />
  invalid user abdul (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user abort (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user abs (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user adam (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user admin (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user admin (password) from 203.98.XXX.XXX: 14 Time(s)<br />
  invalid user advertise (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user alan (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user alcatel (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user alex (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user alex (password) from 203.98.XXX.XXX: 6 Time(s)<br />
  invalid user allan (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user aloha (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user alpha (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user alter (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user ameno (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user amman (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user andy (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user angel (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user antidot (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user apache (password) from 125.240.XXX.XXX: 30 Time(s)<br />
  invalid user apache (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user ariane (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user aron (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user art (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user artificial (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user asahi (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user aspect (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user aspidistra (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user atempt (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user atilla (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user atom (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user aurel (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user avsadmin (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user azazel (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user backup (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user base (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user bash (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user beast (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user berg (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user beta (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user binary (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user black (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user bobo (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user bogdan (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user book (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user bourn (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user brett (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user brian (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user buche (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user cable (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user cache (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user cain (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user cambera (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user camelia (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user cesna (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user chat (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user chris (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user church (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user clark (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user client (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user coffee (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user common (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user costel (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user costi (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user crack (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user cristina (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user cyclon (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user dalton (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user danny (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user darling (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user dasilva (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user data (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user dave (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user david (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user david (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user davod (password) from 125.240.XXX.XXX: 2 Time(s)<br />
  invalid user deserve (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user desire (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user dns (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user domain (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user donna (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user dool (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user down (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user dragon (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user dudu (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user earth (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user elixir (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user elvis (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user epsilon (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user eric (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user example (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user fadeh (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user fatih (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user fax (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user felix (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user fiat (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user filter (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user finale (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user fire (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user foon (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user ford (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user found (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user frank (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user freddy (password) from 125.240.XXX.XXX: 14 Time(s)<br />
  invalid user ftpuser (password) from 125.240.XXX.XXX: 30 Time(s)<br />
  invalid user gamma (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user ganja (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user gaspar (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user george (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user gerhard (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user ghost (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user gone (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user grand (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user granicus (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user gregory (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user grims (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user guest (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user guest (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user gushi (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user hang (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user hassan (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user health (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user helen (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user hell (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user helmut (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user heracle (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user honour (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user host (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user http (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user httpd (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user iarin (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user ident (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user include (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user info (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user info (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user iolanda (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user ion (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user ionut (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user irina (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user jack (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user jamal (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user james (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user jasmina (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user jason (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user java (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user jeffrey (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user jelem (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user jenny (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user jerry (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user jessica (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user jessie (password) from 125.240.XXX.XXX: 14 Time(s)<br />
  invalid user jhony (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user jiang (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user jihad (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user jim (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user joe (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user john (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user john (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user jupiter (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user just (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user justice (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user justin (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user kadir (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user kain (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user kaleb (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user kelly (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user kevin (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user kevin (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user kline (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user koln (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user kondor (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user lampard (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user larry (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user laura (password) from 125.240.XXX.XXX: 16 Time(s)<br />
  invalid user laura (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user law (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user lawyer (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user leroi (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user leslie (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user lex (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user library (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user library (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user light (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user lincoln (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user linda (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user linux (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user lisa (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user locco (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user lost (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user louis (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user louise (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user lucky (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user mailtest (password) from 125.240.XXX.XXX: 24 Time(s)<br />
  invalid user malaga (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mano (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user maria (password) from 203.98.XXX.XXX: 6 Time(s)<br />
  invalid user mariana (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mark (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user mark (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user marte (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user marty (password) from 125.240.XXX.XXX: 14 Time(s)<br />
  invalid user mary (password) from 125.240.XXX.XXX: 14 Time(s)<br />
  invalid user master (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user matt (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user media (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mercur (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mercury (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user michael (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mike (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user mike (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user mind (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user minerva (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mister (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mistero (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mobifon (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user mohamed (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mona (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user monaco (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user monica (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mooka (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user moon (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mount (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mrdev (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mumu (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user munis (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user mysql (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user mysql (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user nancy (password) from 125.240.XXX.XXX: 14 Time(s)<br />
  invalid user neptun (password) from 203.98.XXX.XXX: 3 Time(s)<br />
  invalid user nino (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user noise (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user nokia (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user office (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user okubo (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user omega (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user oracle (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user oracle (password) from 203.98.XXX.XXX: 10 Time(s)<br />
  invalid user oracle1 (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user osama (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user osiris (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user osman (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user palm (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user panama (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user pascal (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user patricia (password) from 125.240.XXX.XXX: 12 Time(s)<br />
  invalid user patrick (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user paul (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user paul (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user peter (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user pgsql (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user port (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user portal (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user postfix (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user postgres (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user public (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user quarter (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user rajev (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user read (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user rehash (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user relay (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user remove (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user rename (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user repection (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user request (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user resin (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user restore (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user richard (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user richard (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user road (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user robert (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user robin (password) from 125.240.XXX.XXX: 14 Time(s)<br />
  invalid user roger (password) from 125.240.XXX.XXX: 18 Time(s)<br />
  invalid user sales (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user sales (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user sam (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user samba (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user same (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sandy (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user sarah (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user saturn (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user scott (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user script (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user search (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user send (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user serafim (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user server (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user service (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user shadow (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user shake (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sharon (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user sharon (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user shell (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user shoot (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user shop (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user shrike (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sigmund (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user siliciu (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user silla (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user silva (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user silvia (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sirg (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user smash (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user smell (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user smuf (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user snake (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sole (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sombrero (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sorina (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sound (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user space (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sparc (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user spool (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sport (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user squad (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user staff (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user stanley (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user start (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user stealth (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user steel (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user stepfen (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user stephen (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user steve (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user steven (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user stick (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user storm (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user stream (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user student (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user student (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user sun (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user support (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user support (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user susan (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user susan (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user system (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user target (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user tay (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user temp (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user temp (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user tener (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user test (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user test (password) from 203.98.XXX.XXX: 14 Time(s)<br />
  invalid user testuser (password) from 125.240.XXX.XXX: 26 Time(s)<br />
  invalid user tetra (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user thanatos (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user thoor (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user tony (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user tony (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user torpe (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user track (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user travel (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user tristan (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user truth (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user unix (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user user (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user user (password) from 203.98.XXX.XXX: 16 Time(s)<br />
  invalid user username (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user venus (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user verset (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user video (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user vincent (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user virtual (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user vision (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user visual (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user warez (password) from 203.98.XXX.XXX: 1 Time(s)<br />
  invalid user web (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user webadmin (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user webadmin (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user webmaster (password) from 125.240.XXX.XXX: 28 Time(s)<br />
  invalid user webmaster (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user while (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user white (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user william (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user willy (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user wish (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user write (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user www (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user www-data (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user wwwrun (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user yarrow (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  invalid user zed (password) from 203.98.XXX.XXX: 4 Time(s)<br />
  invalid user zoom (password) from 203.98.XXX.XXX: 2 Time(s)<br />
  root/password from 125.240.XXX.XXX: 30 Time(s)<br />
  root/password from 203.98.XXX.XXX: 36 Time(s)</code></p>
<p><code>Locked account login attempts:<br />
  apache : 32 Time(s)<br />
  mysql : 30 Time(s)<br />
  postfix : 26 Time(s)</code></p>
<p>However, my favourite ones are still the bots that try talking HTTP to my SMTP server:</p>
<p><code>unknown[61.128.XXX.XXX] sent non-SMTP command: POST / HTTP/1.1 : 1 Time(s)</code></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2006/02/08/brute-force-passwords-attacks-on-linux-over-ssh/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Experimenting with Googlebot</title>
		<link>http://blog.lobstertechnology.com/2006/01/23/experimenting-with-googlebot/</link>
		<comments>http://blog.lobstertechnology.com/2006/01/23/experimenting-with-googlebot/#comments</comments>
		<pubDate>Mon, 23 Jan 2006 15:27:17 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Hacks]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[Spiders & Bots]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2006/01/23/experimenting-with-googlebot/</guid>
		<description><![CDATA[In my previous post 'Blogs are fundamentally flawed…' I noted an observation that more often than not search results would direct a user to an index-style page containing the post instead of directly to the 'permalink' location of the post. This leads to a poor user-experience from the visitor’s point of view, on busy blogs [...]]]></description>
			<content:encoded><![CDATA[<p>In my previous post '<a href="http://blog.lobstertechnology.com/2006/01/03/blogs-are-fundamentally-flawed-for-the-typical-grandma-user/">Blogs are fundamentally flawed…</a>' I observed that, more often than not, search results direct a user to an index-style page containing the post instead of directly to the post's 'permalink'. This is a poor experience from the visitor’s point of view: on a busy blog the post has almost certainly moved off that page since it was spidered. Google in particular appeared to be the worst for it.</p>
<p>Discussions on the subject with <a href='http://webofshite.com/'>Gerry</a> determined that this is most likely down to Google's PageRank technology, where index-style pages have a higher value than the post pages themselves. To get around this he suggested using robots meta-tag directives within the index-style pages.</p>
<p>On Google's "Information for Webmasters" help page I found that Googlebot honours 'robots.txt' directives and meta tags addressed specifically to it. This meant I could single out Googlebot for these directives without affecting other search engines (which don’t exhibit the problem so much).</p>
<p>I basically want Google to 'FOLLOW' links on all pages, but not to 'INDEX' the index-style pages like categories &#038; archives by date. The desired effect being that Google can find all posts as before but simply ignore the index-style pages themselves. Implementing this is quite simple; I modified my theme's "header.php" file inserting the following code in the "head" section:</p>
<div class="syntax_hilite"><span class="langName">PHP:</span>
<div id="php-6">
<div class="php">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#000000; font-weight:bold;">&lt;?php</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; <span style="color:#616100;">if</span> <span style="color:#006600; font-weight:bold;">&#40;</span> !is_single<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span> &amp;&amp; !is_page<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span> &amp;&amp; !is_home<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; <a href="http://www.php.net/echo"><span style="color:#000066;">echo</span></a> <span style="color:#FF0000;">"&nbsp; &lt;meta name=<span style="color:#000099; font-weight:bold;">\"</span>GOOGLEBOT<span style="color:#000099; font-weight:bold;">\"</span> content=<span style="color:#000099; font-weight:bold;">\"</span>NOINDEX,FOLLOW<span style="color:#000099; font-weight:bold;">\"</span> /&gt;<span style="color:#000099; font-weight:bold;">\n</span>"</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#000000; font-weight:bold;">?&gt;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>This reads almost literally: if this is not a single post view, a page view or the home page, add the "meta..." tag. Although the home page is an index-style page, I am reluctant to add 'NOINDEX' to it because I don't want it disappearing from search results. <img src='http://blog.lobstertechnology.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>Now the long wait for the changes to reflect in Google's results.</p>
<p>Updated 24th January 2006 - <a href='http://webofshite.com/'>Gerry</a> pointed out this can be optimised using <a href="http://en.wikipedia.org/wiki/De_Morgan's_Law">De Morgan's Law</a> <img src='http://blog.lobstertechnology.com/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
<div class="syntax_hilite"><span class="langName">PHP:</span>
<div id="php-7">
<div class="php">
<ol>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#000000; font-weight:bold;">&lt;?php</span></div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; <span style="color:#616100;">if</span> <span style="color:#006600; font-weight:bold;">&#40;</span> ! <span style="color:#006600; font-weight:bold;">&#40;</span>is_single<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span> || is_page<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span> || is_home<span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#006600; font-weight:bold;">&#41;</span><span style="color:#006600; font-weight:bold;">&#41;</span> <span style="color:#006600; font-weight:bold;">&#41;</span></div>
</li>
<li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;color:#3A6A8B;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">&nbsp; &nbsp; &nbsp; &nbsp; <a href="http://www.php.net/echo"><span style="color:#000066;">echo</span></a> <span style="color:#FF0000;">"&nbsp; &lt;meta name=<span style="color:#000099; font-weight:bold;">\"</span>GOOGLEBOT<span style="color:#000099; font-weight:bold;">\"</span> content=<span style="color:#000099; font-weight:bold;">\"</span>NOINDEX,FOLLOW<span style="color:#000099; font-weight:bold;">\"</span> /&gt;<span style="color:#000099; font-weight:bold;">\n</span>"</span>;</div>
</li>
<li style="font-weight: bold;color:#26536A;">
<div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color:#000000; font-weight:bold;">?&gt;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
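<p>As a quick sanity check that the two conditions really are equivalent, here is a minimal sketch (in Python rather than PHP, purely for illustration) that tries every combination of the three flags. The function names are hypothetical stand-ins for the results of is_single(), is_page() and is_home().</p>

```python
from itertools import product


def original_condition(is_single, is_page, is_home):
    # First version: !is_single() && !is_page() && !is_home()
    return (not is_single) and (not is_page) and (not is_home)


def de_morgan_condition(is_single, is_page, is_home):
    # Updated version: !(is_single() || is_page() || is_home())
    return not (is_single or is_page or is_home)


def conditions_agree():
    # Compare both forms over all 8 combinations of the three flags.
    return all(
        original_condition(*flags) == de_morgan_condition(*flags)
        for flags in product([False, True], repeat=3)
    )


print(conditions_agree())  # prints True
```

<p>Both forms emit the meta tag only when all three flags are false, i.e. on category and date-archive style pages.</p>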
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2006/01/23/experimenting-with-googlebot/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Blogs are fundamentally flawed for the typical Grandma-User</title>
		<link>http://blog.lobstertechnology.com/2006/01/03/blogs-are-fundamentally-flawed-for-the-typical-grandma-user/</link>
		<comments>http://blog.lobstertechnology.com/2006/01/03/blogs-are-fundamentally-flawed-for-the-typical-grandma-user/#comments</comments>
		<pubDate>Tue, 03 Jan 2006 08:29:32 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Wordpress]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2006/01/03/blogs-are-fundamentally-flawed-for-the-typical-grandma-user/</guid>
		<description><![CDATA[It was reading my access_logs that made me realise that Blogs are actually a really bad format for the Grandma user...

Picture this: your Grandma is Googling and happens to get a result that points to your Blog. She sees a teaser in the search results that shows you've written something about what she is looking for. Grandma clicks your link and is presented with your last 10 posts about God knows what and no sign of the post she saw the snippet of. Grandma goes back to Google thoroughly disappointed, never to return...]]></description>
			<content:encoded><![CDATA[<p>It may seem a little sad but I can honestly say that reading my access_logs is far more interesting than any soap opera on TV; they are filled with exotic foreigners, futuristic robots, drama, intrigue and personal tragedy. The best thing about it is that it’s all real; these are (mostly) real people who stumble across your humble Blog in the hope of finding the solution to their problems.</p>
<p>Over the Christmas period I have observed more people visiting an ancient post of mine than in the past six months. The post is about my experiences with an external hard drive enclosure; more accurately, the chip / controller a great many hard drive enclosures use. Based on this I would guess that a considerable number of people got hard drive enclosures for Christmas and ran into the same problems I had. Anyway, I am wandering a little.</p>
<p>It was reading my access_logs that made me realise that Blogs are actually a really bad format for the Grandma user...</p>
<p>Picture this: your Grandma is Googling and happens to get a result that points to your Blog. She sees a teaser in the search results that shows you've written something about what she is looking for. Grandma clicks your link and is presented with your last 10 posts about God knows what and no sign of the post she saw the snippet of. Grandma goes back to Google thoroughly disappointed, never to return...</p>
<p>I encounter this phenomenon frequently, but because I am familiar with the Blog format I think nothing of drilling down to the relevant category to find the post I wanted; or if I am lazy click Google’s cached copy of the page. However for the average internet surfer it presents a fundamental flaw in the usability of the Blog format.</p>
<p>The problem is quite simple; search engines can never be up to date with your content all the time. The more frequently you post, the more often the problem will occur and the harder the post will be to find. The way I see it there are two possible solutions. </p>
<p><strong>Smarter search engines</strong></p>
<p><em>Enhancing search engines so they can distinguish between an index-page of posts and individual posts. This could be done by identifying sections of text within a page as an extract from another URL using something like RDF [<a href="http://www.w3.org/RDF/">http://www.w3.org/RDF/</a>] which can already be embedded within XHTML [<a href="http://internetalchemy.org/2005/10/introducing-embedded-rdf">http://internetalchemy.org/2005/10/introducing-embedded-rdf</a>]. Enclosing the section of text between the ‘&lt;rdf:RDF&gt;’ tags would do the trick.</p>
<p>In the Blog format the index pages and category pages would all contain embedded RDF indicating that the enclosed section of text is actually from another URL – its permalink. However this idea is not just limited to the Blog format, it has huge potential for most modern website formats.</p>
<p>This wouldn’t be a trivial change for search engines to make, it would be time-consuming and therefore costly but I believe it would be worthwhile for the future of internet content.</em></p>
<p><strong>Smarter websites</strong></p>
<p><em>A more short-term solution I am looking at is improving my website [i.e. Wordpress] to detect that the visitor has come from a search engine, try to determine the query they used from the ‘Referer’ HTTP header, then find and present the best matches to that query before any other posts are displayed.</p>
<p>Obviously this method has quite a few shortcomings:</p>
<p>* The ‘Referer’ header may not be there (some people disable it within the browser or through third-party software)<br />
* Although handling the query formats of the main players is quite easy, not all search engines can be catered for<br />
* It requires an intensive search of all the site’s posts, the standard Wordpress search won’t cut it</p>
<p>I contemplated getting the site to pull a copy of the URL given in the ‘Referer’ header, scan for the result that led the visitor to your site then locate the correct post given the snippet text… Then I decided that was a reeeeeeaaally bad idea.</em></p>
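<p>The query-extraction step of the 'smarter websites' idea could be sketched roughly as follows. This is only an illustration in Python, not the Wordpress code itself: the table of search engines and the function name are assumptions made up for the example, and the host matching is deliberately crude.</p>

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical table: fragment of the host name -> name of the query-string
# parameter that carries the search terms. Only a few engines are listed.
QUERY_PARAMS = {
    "google.": "q",
    "search.yahoo.": "p",
    "search.msn.": "q",
}


def search_terms_from_referer(referer):
    """Return the visitor's search terms, or None if they cannot be determined."""
    if not referer:
        return None  # header missing or stripped by the browser
    parsed = urlparse(referer)
    host = parsed.netloc.lower()
    for fragment, param in QUERY_PARAMS.items():
        if fragment in host:
            values = parse_qs(parsed.query).get(param)
            if values and values[0].strip():
                return values[0].strip()
            return None  # known engine, but no usable query
    return None  # search engine we do not cater for


print(search_terms_from_referer("http://www.google.com/search?q=hard+drive+enclosure"))
```

<p>The two early returns of None correspond to the first two shortcomings listed above: a missing header and an unrecognised search engine. Finding the best-matching posts for the extracted terms is the expensive part and is not covered here.</p>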
<p>In the long-run I believe the content and therefore the search engines that index it have to improve to cater for the format of internet content today and I think embedded RDF might be the key; unfortunately this cannot happen overnight.</p>
<p>In the meantime making smarter websites will help the situation until the content and the search engines catch up.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2006/01/03/blogs-are-fundamentally-flawed-for-the-typical-grandma-user/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Upgrading to Wordpress 2.0</title>
		<link>http://blog.lobstertechnology.com/2005/12/20/upgrading-to-wordpress-2-0/</link>
		<comments>http://blog.lobstertechnology.com/2005/12/20/upgrading-to-wordpress-2-0/#comments</comments>
		<pubDate>Tue, 20 Dec 2005 01:22:43 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[Wordpress]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2005/12/20/upgrading-to-wordpress-2-0/</guid>
		<description><![CDATA[Keen to try out my own PlugIns in Wordpress 2.0 and swayed by <a href='http://dougal.gunters.org/blog/2005/12/19/wordpress-20-release-imminent'>this post</a> I took the plunge and installed the nightly Wordpress 2.0 [20051219] build.]]></description>
			<content:encoded><![CDATA[<p>Keen to try out my own PlugIns in Wordpress 2.0 and swayed by <a href='http://dougal.gunters.org/blog/2005/12/19/wordpress-20-release-imminent'>this post</a> I took the plunge and installed the nightly Wordpress 2.0 [20051219] build.</p>
<p>I manage the entire installation of my Wordpress blog in CVS; although I have several customisations to the core Wordpress code, they merged into the 2.0 code without problems. During the upgrade I did get some warnings that the 'wp_usermeta' table didn't exist, but only because I hadn't yet run /wp-admin/upgrade.php, so that was no big deal. Apart from that the whole process was very straightforward!</p>
<p>All the plugins I use appear to work perfectly: <a href='http://greengabbro.net/'>Weighted Words</a>, <a href='http://blog.igeek.info/wp-plugins/igsyntax-hiliter/'>iG:Syntax Hiliter</a>, <a href='http://www.coldforged.org/spelling-checker-plugin-for-wordpress/'>Spelling Checker</a> &#038; <a href='http://blog.lobstertechnology.com/category/wordpress/plugins/'>my own</a></p>
<p>My first opinions...</p>
<p>Although the new fancy rich-text editor looks great, I am still going to use the original code-version as I am so used to it and prefer writing my own XHTML. I am not sure I like the new admin colour scheme yet either; I have grown quite attached to the previous grey one.</p>
<p>I am extremely impressed that Wordpress has changed hugely 'under-the-hood' seemingly without breaking any of the existing interfaces and therefore plugins.</p>
<p>Version 2.0 boasts a lot of new features and the <a href='http://trac.wordpress.org/timeline'>Trac Timeline</a> is a hive of activity. I will probably keep running on the latest nightly build until the final release.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2005/12/20/upgrading-to-wordpress-2-0/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More thoughts on SpamKit&#8230;</title>
		<link>http://blog.lobstertechnology.com/2005/12/16/more-thoughts-on-spamkit/</link>
		<comments>http://blog.lobstertechnology.com/2005/12/16/more-thoughts-on-spamkit/#comments</comments>
		<pubDate>Fri, 16 Dec 2005 00:31:22 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Spam]]></category>
		<category><![CDATA[SpamKit]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2005/12/16/more-thoughts-on-spamkit/</guid>
		<description><![CDATA[After my recent release of SpamKit I have been contemplating the whole spam problem in much greater depth. Gerry highlighted one major problem with my SpamKit plugin itself – trackbacks are considered spam because they don't include the time-based token. I started to look into amending my plugin to support them when I realised that this would be a serious loophole.]]></description>
			<content:encoded><![CDATA[<p>After my recent release of <a href='http://blog.lobstertechnology.com/2005/12/06/spamkit-plugin-for-wordpress/'>SpamKit Plugin</a> I have been contemplating the whole spam problem in much greater depth. <a href='http://webofshite.com/'>Gerry</a> highlighted one major problem with my SpamKit plugin itself – trackbacks are considered spam because they don't include the time-based token. I started to look into amending my plugin to support them when I realised that this would be a serious loophole.</p>
<p>As far as the SpamKit plugin is concerned it would see trackbacks &#038; pingbacks as new comments, without any time-based token. Thankfully Wordpress passes a 'comment_type' flag of 'trackback' or 'pingback' in these cases, and there is also a separate action of 'trackback_post' which the plugin can hook into. But what is stopping a particularly clever spammer from doing trackback / pingback spam? Nothing...</p>
<p>So, slightly disheartened by this particular problem, I went back to the mental drawing-board. The SpamKit plugin is great for a very specific problem, but it's certainly not the grand unified solution I am looking for. All it would take is for a smarter spammer to pull the comment form with my time-based token and then send it along with the spam message for instant approval. It's worryingly easy to do; I did it in about 10 lines of PHP code – no joke.</p>
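<p>To make the weakness concrete, here is a minimal sketch of what a time-based token might look like, written in Python for illustration. It is not the SpamKit code: the HMAC scheme, the secret, the one-hour lifetime and the function names are all assumptions. The point is that any client able to fetch the comment form gets a perfectly valid token along with it.</p>

```python
import hashlib
import hmac
import time

SECRET = b"change-me"    # hypothetical server-side secret
MAX_AGE_SECONDS = 3600   # hypothetical lifetime of a token


def make_token(now=None):
    """Token embedded in the comment form: issue time plus an HMAC over it."""
    issued = str(int(time.time() if now is None else now))
    signature = hmac.new(SECRET, issued.encode(), hashlib.sha256).hexdigest()
    return issued + ":" + signature


def token_is_valid(token, now=None):
    """Accept the token if the signature matches and it has not expired."""
    issued, sep, signature = token.partition(":")
    if not sep:
        return False
    try:
        issued_at = int(issued)
    except ValueError:
        return False
    expected = hmac.new(SECRET, issued.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    current = time.time() if now is None else now
    age = current - issued_at
    return age >= 0 and MAX_AGE_SECONDS >= age


# A bot that requests the form first simply reuses the token it was handed.
scraped_token = make_token()
print(token_is_valid(scraped_token))  # prints True
```

<p>The token proves only that the form was requested recently, not that a human filled it in, which is exactly the loophole described above.</p>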
<p>The spam problem will always present itself as long as we try to 'filter' out automatic-spam-clients from humans; it is simply the wrong approach. Eventually spam-clients will evolve to become so sophisticated we can't tell the difference anyway, then all the filters and tricks in the book won't help you. I have already started to see a distinct difference between the spam clients hitting my blog daily, a trend showing that more sophisticated clients are being developed all the time.</p>
<p>The only way you will truly eradicate spam is to validate the content of the message or the content of the web page being linked, everything else is just a bonus.</p>
<p>The way I see it, there are two solutions that will blow Spam out of the water forever:</p>
<p><strong>Smarter Application</strong></p>
<p><em>An application that verifies everything based on in-built rules, configuration and past experiences.</p>
<p>* validating that the comment is relevant to the post<br />
* validating any info associated [URL, email] is valid<br />
* validating that the trackback/pingback came from somewhere that exists<br />
* validating links in the comments for their content, is it relevant? are they selling Viagra?</p>
<p>All this is great but it goes far beyond what a humble PHP application should be doing. I dread to think what server load this would produce on even a small-scale blog – not to mention a multi-blog-site or a shared web server.<br />
</em></p>
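<p>Only the cheapest of the checks listed above are easy to sketch; judging relevance or fetching linked pages is where the server load would come from. The following Python illustration covers just the "is the URL / email well-formed" item. The field names and the loose email pattern are assumptions for the example, and it does no network lookups at all.</p>

```python
import re
from urllib.parse import urlparse

# Deliberately loose shape check: something@something.something
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def url_looks_valid(url):
    """True if the URL has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def email_looks_valid(email):
    """True if the address has a plausible shape; says nothing about delivery."""
    return EMAIL_SHAPE.match(email) is not None


def cheap_checks(comment):
    """Return a list of reasons to distrust the comment (empty if none found)."""
    problems = []
    if comment.get("url") and not url_looks_valid(comment["url"]):
        problems.append("bad url")
    if comment.get("email") and not email_looks_valid(comment["email"]):
        problems.append("bad email")
    return problems


print(cheap_checks({"url": "javascript:alert(1)", "email": "nobody"}))
```

<p>Checks like these catch only the sloppiest spam clients; a well-formed URL can still point at a page selling Viagra.</p>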
<p><strong>Dumb Application made smarter through On-line Collaboration</strong></p>
<p><em>This basically describes Akismet – where a central system has all the complex logic and the client-application receives a simple yes/no decision.</p>
<p>Personally, I would do this asynchronously because the client-application [Wordpress for example] doesn't really need an immediate decision. Comments can easily wait in a queue for approval which a later XMLRPC callback provides. This would greatly reduce the load on the system and allow far more complex algorithms and lookups to take place improving the overall accuracy of the decisions dispensed.<br />
</em></p>
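<p>The asynchronous flow I have in mind could be sketched like this, again in Python purely as an illustration. It is not Akismet's API; the class and method names are made up, and the transport (an XMLRPC callback in my description) is left out so that only the queueing logic is shown.</p>

```python
from dataclasses import dataclass, field


@dataclass
class ModerationQueue:
    """Holds comments until a verdict arrives from the central service."""
    pending: dict = field(default_factory=dict)
    approved: list = field(default_factory=list)
    rejected: list = field(default_factory=list)
    next_id: int = 1

    def submit(self, text):
        """Store the comment as pending and return the ID sent to the service."""
        comment_id = self.next_id
        self.next_id += 1
        self.pending[comment_id] = text
        return comment_id

    def on_verdict(self, comment_id, is_spam):
        """Callback invoked later by the service with its yes/no decision."""
        text = self.pending.pop(comment_id, None)
        if text is None:
            return False  # unknown or already-decided comment
        (self.rejected if is_spam else self.approved).append(text)
        return True


queue = ModerationQueue()
first = queue.submit("Great post, thanks!")
second = queue.submit("Cheap pills here")
queue.on_verdict(second, is_spam=True)
queue.on_verdict(first, is_spam=False)
print(queue.approved, queue.rejected)
```

<p>Because submit() returns immediately, the blog never waits on the central system, and verdicts can arrive in any order.</p>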
<p>Other widely used anti-spam solutions also look promising, for example the 'roadmap' for Spam Karma SK2.2 has some really good ideas, my favourites being the honeypot and the anti-PageRank idea.</p>
<p>If you have read my <a href='http://blog.lobstertechnology.com/2005/11/22/more-thoughts-on-comment-spam/'>previous posts</a> you'll have seen how wide the spam problem already is. Although I am very proud of my little SpamKit plugin because it does exactly what I wanted, I am quite frustrated with its limitations.</p>
<p>The whole spam problem is very interesting to me, it's not going to go away any time soon. It's existed since <a href='http://www.templetons.com/brad/spam/spam25.html' rel='external nofollow'>1978</a> and been evolving ever since...</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2005/12/16/more-thoughts-on-spamkit/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Horde 3.0.8 appears to be broken</title>
		<link>http://blog.lobstertechnology.com/2005/12/15/horde-3-0-8-appears-to-be-broken/</link>
		<comments>http://blog.lobstertechnology.com/2005/12/15/horde-3-0-8-appears-to-be-broken/#comments</comments>
		<pubDate>Thu, 15 Dec 2005 21:27:38 +0000</pubDate>
		<dc:creator>Michael Cutler</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[Thoughts]]></category>

		<guid isPermaLink="false">http://blog.lobstertechnology.com/2005/12/15/horde-3-0-8-appears-to-be-broken/</guid>
		<description><![CDATA[Horde is the application framework behind IMP, the web-based email client I use to read my email. ... Today I downloaded and attempted to install Horde 3.0.8 (released on Sunday 11th December 2005), but something appears to be wrong as I didn't get very far. I followed all the given instructions, my server is configured correctly, all the dependencies are installed and working. I got as far as the web-based setup / configuration screen but it didn't allow me to save any settings or complete the setup process.]]></description>
			<content:encoded><![CDATA[<p>Horde is the application framework behind IMP, the web-based email client I use to read my email. From the Horde site [<a href='http://www.horde.org/'>www.horde.org</a>]:</p>
<blockquote><p>
The Horde Project is about creating high quality Open Source applications, based on PHP and the Horde Framework.</p>
<p>The guiding principles of the Horde Project are to create solid standards-based applications using intelligent object oriented design that, wherever possible, are designed to run on a wide range of platforms and backends.<br />
There is great emphasis on making Horde as friendly to non-English speakers as possible. The Horde Framework currently supports many localization features such as Unicode and right-to-left text and generous users have contributed many translations for the framework and applications.
</p></blockquote>
<p>Today I downloaded and attempted to install Horde 3.0.8 (released on Sunday 11th December 2005), but something appears to be wrong as I didn't get very far. I followed all the given instructions, my server is configured correctly, all the dependencies are installed and working. I got as far as the web-based setup / configuration screen but it didn't allow me to save any settings or complete the setup process.</p>
<p>Following the instructions to the letter, I went to the 'Authentication' tab and selected 'IMAP Authentication'; the page reloaded but didn't reflect my choice from the 'authentication backend' drop-down list. Instead it wouldn't display anything other than 'Let a Horde application handle authentication', and without the additional drop-down to select the application to use.</p>
<p>I initially suspected some Javascript incompatibility as I normally use Firefox and sooo many applications are written against Internet Explorer. But after several attempts from different browsers &#038; platforms I gave up on the authentication tab, opting to try at least the 'Database' tab and configure MySQL. I could easily fill out all the details but when I tried to 'Generate Horde Configuration' it threw me back to the 'General' tab, highlighting that I had not completed required fields to do with error reporting &#038; URL generation – both were set to valid values.</p>
<p>I re-read the documentation and re-did the whole installation... just in case I missed something or was too eager to lock down permissions. Again, exactly the same problem. Next, I relaxed ALL the permissions possible, I basically chmod'd the whole thing to 777 - in case the setup wasn't able to write to the config directory but this didn't help either.</p>
<p>The FAQ didn't provide much help so I went to the IRC channel #horde @ irc.freenode.net and found others with exactly the same problem *holds back the tears of frustration* ... But unfortunately no-one seemed to have any immediate answers.</p>
<p>On a hunch I grabbed Horde 3.0.7 from the FTP site and went through the whole setup process again. However this time it worked as expected and was running within ten minutes!!</p>
<p>Argh... Next step is to diff the code and see where it went wrong... (stay tuned)</p>
<p><strong>Update</strong> - This issue was fixed in version 3.0.9 which is now available from <a href='http://www.horde.org/'>www.horde.org</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.lobstertechnology.com/2005/12/15/horde-3-0-8-appears-to-be-broken/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
