Getting Unnoticed

While I was working on the conversion, I set up a subdomain at testbed.jclark.org. Unfortunately, I forgot to disable pings, and my testbed site ended up pinging something (pingomatic, I think). Google indexed the test site, and Technorati climbed all over it. Technorati found a special hacked version of my old blosxom site that contained links to the real site (jclark.org), so now my Technorati inbound links page is full of posts from the test site that linked to me. Since this is now a dead site, I don’t want it indexed by any search engine.

As soon as I figured out what happened, I killed testbed.jclark.org with a .htaccess mod_rewrite rule that makes every URI on the site return 410 Gone. After a few days, nothing had changed. A little research revealed how to get removed from Google. In short, I added a robots.txt file disallowing all user agents (I could have disallowed only Googlebot). I also had to modify my mod_rewrite rules to serve robots.txt, since I was sending 410 for every request. To hasten the process, I used Google’s URL removal system to request an expedited check of my robots.txt file. Within a day, the test site was gone from Google.
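A minimal sketch of rules along these lines, assuming mod_rewrite is enabled and the file sits in the subdomain's document root. This is an illustration, not the exact rules used on testbed.jclark.org:

```apache
# .htaccess (illustrative sketch)
RewriteEngine On

# Let robots.txt through so crawlers can read it.
RewriteCond %{REQUEST_URI} !^/robots\.txt$

# Every other request gets 410 Gone (the G flag).
RewriteRule ^ - [G]

# The accompanying robots.txt would contain just two lines:
#   User-agent: *
#   Disallow: /
```

Replacing `*` with `Googlebot` in robots.txt would block only Google's crawler.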

Getting delisted from Technorati has proven more difficult. I’ve been unable to find any instructions on the Technorati website. I tried using the ping form to have some posts re-spidered, knowing they would return a 410, but no change so far. I sent an email to support about a week ago, to which I’ve received no reply. A day later, all of the posts had disappeared, only to return hours later. If anyone knows how to get delisted, please leave me a comment.

Every Old Problem is New Again

My recent redesign and switch to WordPress was started about 6 months ago, and was allowed to languish until a couple of weeks ago, when I found the time to finish the project. As a result, I keep running into little bugs and issues that I should have caught before I launched: stuff I thought about 6 months ago, and forgot about until now.

One example is tweaking for search engines. A couple of years ago, I made some tweaks to my Blosxom templates to help search engines better index my site. I added post titles to permalink (single post) pages, to make sure Google et al. show more than just “jclark.org” above every hit on my site. I also added <meta name='robots' ... /> entries to ensure that category and date archive pages weren’t indexed, only permalinks (and the main index page). Of course, I forgot to carry these changes over to my WordPress templates, and Google has recently reindexed me, so I’m back to square one.

I updated my header template (header.php) last night to get the post title back onto permalink pages. This was surprisingly easy, as WP offers a function aimed squarely at this purpose:

<title>jclark.org<?php single_post_title(' - ');?></title>

The function single_post_title() only outputs a value if the current page is a permalink page, optionally inserting a prefix (I use ' - '). This allowed me to update the header template used all across the site, without building my own switching logic. Because the entire HTML <head> section is in my header template, adding the robots meta directives will require a bit of switching logic; maybe I’ll check to see how single_post_title() is implemented and steal, er, be inspired by that code.
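The switching logic for the robots meta tag might look something like the sketch below. The helper name robots_directive() is hypothetical (it is not a WordPress function), and the 'noindex,follow' value for archive pages is an assumption about the desired behavior, not something taken from the old Blosxom templates.

```php
<?php
// Hypothetical helper (not part of WordPress): choose the robots directive
// for the current page from two boolean flags.
function robots_directive(bool $is_single, bool $is_home): string
{
    // Permalink pages and the main index are indexed; archive pages
    // (category, date) are not, but their links may still be followed.
    return ($is_single || $is_home) ? 'index,follow' : 'noindex,follow';
}

// Standalone demonstration with hard-coded flags (an archive page):
echo '<meta name="robots" content="' . robots_directive(false, false) . '" />' . "\n";
```

In header.php, the flags would presumably come from WordPress's conditional tags, for example robots_directive(is_single(), is_home()) inside the content attribute of the meta tag.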