Fedora Core 4 x86_64 Linux on Compaq R4000 Laptop

The majority of the hardware worked out of the box, although the WXGA (1280 x 800) screen needs to be manually frigged into the X configuration. Only the wireless adapter and the memory-card reader are unsupported by the base install.

I got the wireless adapter (Broadcom BCM4318, PCI id 14E4:4318) working using ndiswrapper 1.8 from ndiswrapper.sourceforge.net; it built and installed cleanly into my 64-bit kernel. However, this means you must use 64-bit Windows device drivers. Thankfully the List mentions a similar attempt on an HP (HP/Compaq, same thing) AMD64 laptop, and it works!

The memory-card reader seems to be harder to get working, with mixed reports of success and failure. It appears to be a Texas Instruments PCIxx21, PCI device ids 104C:8033 & 104C:8034.

And for my next trick…

Experimenting with Googlebot

In my previous post 'Blogs are fundamentally flawed…' I observed that, more often than not, search results direct a user to an index-style page containing the post instead of directly to the 'permalink' location of the post. This leads to a poor user experience from the visitor’s point of view: on busy blogs the post has almost certainly moved off that page since it was spidered. Google in particular appeared to be the worst for it.

Discussions on the subject with Gerry determined that this is most likely down to Google's PageRank technology, where index-style pages have a higher value than the post pages themselves. To get around this he suggested manipulating 'robots.txt'-style directives within the index-style pages.

On Google's "Information for Webmasters" help page I found that, when spidering, Google looks for special 'robots.txt' directives and meta tags in documents that are specific to Googlebot only. This meant I could single out Googlebot for these directives and not affect other search engines (which don’t exhibit the problem so much).

I basically want Google to 'FOLLOW' links on all pages, but not to 'INDEX' the index-style pages like categories and archives by date. The desired effect is that Google can find all posts as before but simply ignores the index-style pages themselves. Implementing this is quite simple: I modified my theme's "header.php" file, inserting the following code in the "head" section:

PHP:
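  <?php
      // Anything that is not a single post, a page or the home page is treated
      // as an index-style view: Googlebot may follow its links but not index it.
      if ( !is_single() && !is_page() && !is_home() )
          echo "  <meta name=\"GOOGLEBOT\" content=\"NOINDEX,FOLLOW\" />\n";
  ?>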

This reads almost literally: if this is not a single post view, not a page view and not the home page, add the following "meta..." tag. Although the home page is an index-style page, I am reluctant to add 'NOINDEX' because I don't want it disappearing from search results. ;)

Now the long wait for the changes to be reflected in Google's results.

Updated 24th January 2006 - Gerry pointed out this can be optimised using De Morgan's Law :P

PHP:
  <?php
      if ( ! (is_single() || is_page() || is_home()) )
          echo "  <meta name=\"GOOGLEBOT\" content=\"NOINDEX,FOLLOW\" />\n";
  ?>
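For anyone who wants to convince themselves the two conditions really are equivalent, here is a minimal sketch that walks through every combination. The functions original_check() and de_morgan_check() are hypothetical stand-ins invented for this check; their three boolean arguments represent whatever is_single(), is_page() and is_home() would return, so it runs outside WordPress.

```php
<?php
// Hypothetical stand-ins: each boolean represents the result of
// is_single(), is_page() and is_home() respectively.
function original_check(bool $single, bool $page, bool $home): bool
{
    // The condition from the first version of the snippet.
    return !$single && !$page && !$home;
}

function de_morgan_check(bool $single, bool $page, bool $home): bool
{
    // The condition after applying De Morgan's law.
    return !($single || $page || $home);
}

// Walk all eight combinations and count any disagreement.
$mismatches = 0;
foreach ([false, true] as $single) {
    foreach ([false, true] as $page) {
        foreach ([false, true] as $home) {
            if (original_check($single, $page, $home) !== de_morgan_check($single, $page, $home)) {
                $mismatches++;
            }
        }
    }
}

echo "Mismatching combinations: $mismatches\n";
```

Both forms only evaluate to true when all three conditionals are false, so the tag is emitted on exactly the same views either way; the second form is simply easier to read.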

WP Plugin » SpamKit Plugin 0.1 – Time-Based-Tokens to Fight Spam

This is a minor release of SpamKit Plugin to address an easy-to-fix problem where Trackbacks from the same blog or server get treated as spam because they don't include the time-based token. This is checked into Subversion over at WP-Plugins.org and you can download the new version here: spamkit-plugin.php.

Changelog:

Added a check in spam_action_pre_comment_approved to compare the REMOTE_ADDR with the SERVER_ADDR; if they match, the comment is allowed to bypass time-based token checking. However, this could be abused if another web application on the server is exploited, allowing an attacker to post comments that appear to come from this server. Then again, if someone goes to all the effort of exploiting a web application, comment spam is probably low on their priorities.
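The idea behind the check can be sketched roughly as follows. This is not the plugin's actual code: spamkit_is_local_request() and spamkit_comment_passes() are hypothetical names made up for illustration, and $has_valid_token is assumed to be the result of the plugin's own time-based token check.

```php
<?php
// Hypothetical helper, not taken from the plugin source: returns true when
// the request appears to originate from the web server itself.
function spamkit_is_local_request(array $server): bool
{
    // Either value can be missing (for example when run from the command
    // line), so a missing address is never treated as local.
    if (empty($server['REMOTE_ADDR']) || empty($server['SERVER_ADDR'])) {
        return false;
    }
    return $server['REMOTE_ADDR'] === $server['SERVER_ADDR'];
}

// Hypothetical decision function: $has_valid_token is assumed to come from
// the plugin's existing time-based token check.
function spamkit_comment_passes(array $server, bool $has_valid_token): bool
{
    // Requests from the server itself (such as Trackbacks between posts on
    // the same blog) skip the token requirement; everything else needs it.
    return spamkit_is_local_request($server) || $has_valid_token;
}

// Example with documentation-range addresses (192.0.2.0/24).
$same_server = ['REMOTE_ADDR' => '192.0.2.10', 'SERVER_ADDR' => '192.0.2.10'];
$remote_host = ['REMOTE_ADDR' => '192.0.2.99', 'SERVER_ADDR' => '192.0.2.10'];

var_dump(spamkit_comment_passes($same_server, false)); // bool(true)
var_dump(spamkit_comment_passes($remote_host, false)); // bool(false)
```

In a real request the first argument would be $_SERVER. The trade-off is the one described above: anything that can make requests from the server's own address gets a free pass.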