Archive for October, 2004

Moved

The website hosting move has been a success. I’ve had a total of 8 hits on the old host today, so I guess the address propagation is nearly complete. Everything seems to be working fine at the new host. Spent a little time this morning setting up AWStats on the new host. That’s the package my old host offered, and I’m used to it. Eventually, I’d like to put together something of my own to do more specific analysis, but that’s a ways down my list.

My new hosting provider is Dreamhost. My old host was fine, but I really wanted shell access, which they didn’t offer. The plan I’m on now lets me host up to 15 domains, of which I’m using two. By paying for 2 years in advance, I’m paying the same as I was previously for two separate single-domain packages. If I ever need another domain (or 13), I’m all set. Sure am loving the shell access.

I’ll make a final check of my POP3 account at the old host tomorrow evening, and Monday I’ll call and close the old hosting account. Prepaid a year in August, so I’ve even got a refund coming.

Moving Day

Today I’m initiating the DNS change to point the domain jclark.org at my new hosting provider. Apologies in advance if anything goes awry.

If this is the last sentence you see, you’re viewing the old host.

And if you’re viewing this sentence, you’re viewing the new host. Woohoo!

Update: The move has been remarkably painless. Mail seems to be working, site’s working fine. Bloglines sees the new site, as do my home and work broadband connections.

Also, I’ve re-enabled comments. I haven’t got the blacklist in place yet; I want to try to build in a performance enhancement first, and I need to set up an extra Perl module to do that. More on that later.

Update 2: Big thanks to Dugh for the heads up that my comments weren’t working. All fixed now.

Magic with wget

As mentioned previously, I’m in the process of switching hosts. Last night I uploaded my entire site at the new host, and set everything up. Along the way, I decided to refine my internal directory structures a bit. Everything is working fine on the new host, and once I get the e-mail accounts squared away I’ll be ready to make the DNS switch.

In the meantime, I have a synchronization problem. When I uploaded my site to the new host, I used a site backup (tarball) from the old host. This made it easy to preserve directory structures as well as timestamps. Because of the way Blosxom works, the file datestamps on my posts are very important… Blosxom uses them for the post date/time. After the upload (and after I moved a few things around), both old and new sites were in sync, with the same posts in the same categories, and having the same timestamps. From that point on, new posts to the old site are out of sync with the new site. Posting to both locations is no good; the timestamps will be off and it breaks the programmer’s first virtue (laziness). I can’t use scp because my old host doesn’t offer shell access.

I went looking for a way to transfer a file from the old host to the new host, preserving the timestamp, and preferably making it easy to keep things in the right directory. I’ll have to run this after each post until the old site goes away. I looked at cURL first, but it didn’t quite do all I needed, so I turned to wget. Magic ensued.

The setup: from the shell account on my new host, grab the file from my old host. The file can only be retrieved from the old host via FTP. As an example, I wanted to sync my prior post on the new iPods. It’s in the category Apple and the “stub” title is newpods. On the old host, the file is in /blosxom/Apple/newpods.txt. On the new host, it needs to go into ~/jclark.org/blosxom/content/Apple/newpods.txt. I didn’t want to specify the directory in both the source and destination. The solution:

cd ~/jclark.org/blosxom/content
wget -N -x -nH --cut-dirs=1 ftp://jclark.org/blosxom/Apple/newpods.txt

The first command just puts me in the base directory for my posts on the new server. The magic is in the wget command. -N turns on timestamping, preserving my timestamps. -x forces wget to create directories locally to match the remote (this is the default for recursive fetches). Normally, the dirs created would start with the host name (e.g., jclark.org), but -nH removes the host name. Finally, --cut-dirs removes directories from the front of the path, so the file blosxom/Apple/newpods.txt on the remote end becomes Apple/newpods.txt locally. This combined with the initial cd lets me handle the changes I made to my directory structure. After I publish this post (on the old host), I’ll run the same command from the new host, plugging in the new file/path.
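To make the path mapping concrete, here is a minimal sketch that mimics what -nH and --cut-dirs do to a URL. The function name local_path_for is a hypothetical helper for illustration only; it is not part of wget and does no network access.

```shell
# Sketch: compute the local path wget -x -nH --cut-dirs=N would use for a URL.
# local_path_for is a hypothetical helper, not a wget feature.
local_path_for() {
    url=$1
    cut=$2
    # Drop the scheme and the host name (the effect of -nH).
    path=${url#*://}
    path=${path#*/}
    # Strip leading directory components one at a time (the effect of --cut-dirs).
    while [ "$cut" -gt 0 ]; do
        path=${path#*/}
        cut=$((cut - 1))
    done
    printf '%s\n' "$path"
}

local_path_for ftp://jclark.org/blosxom/Apple/newpods.txt 1
```

With the example URL and a cut of 1, this prints Apple/newpods.txt, matching the layout described above.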

One detail of note: the above wget command will try to login anonymously, and give up if it fails. You can specify user and password on the command line, but bad idea on a shared host (think ps -aux, although my host protects against this). If you specify the user without password, you don’t get prompted for a password. The way around this is an old UNIX standby, the .netrc file.
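For reference, a minimal sketch of setting up such a file. The helper name write_netrc and the credentials myuser/mypassword are placeholders invented for this example; the machine/login/password layout is the standard .netrc format, which wget reads from the home directory.

```shell
# Sketch: append an FTP login entry to a .netrc-style file.
# write_netrc is a hypothetical helper; the credentials below are placeholders.
write_netrc() {
    file=$1 host=$2 user=$3 pass=$4
    # Create the file with owner-only permissions before any secret is written.
    ( umask 077 && : >> "$file" )
    chmod 600 "$file"
    printf 'machine %s\nlogin %s\npassword %s\n' "$host" "$user" "$pass" >> "$file"
}

# Example call (commented out so nothing is written by accident):
# write_netrc "$HOME/.netrc" jclark.org myuser mypassword
```

Keeping the file readable only by its owner matters on a shared host, for the same reason the password should stay off the command line.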

New iPods

Apple today released two new iPods: the iPod Photo and the iPod Special Edition: U2. Both have been rumored for a while.

This interests me because I’ve recently decided to get an iPod. I’ve wanted one for quite a while, but hadn’t really been able to justify the expense. I considered asking Sherri to get me one for Christmas last year, but requested the Firewire HD and DVD Burner instead. With the growing size of my music collection and my new interest in Podcasting (I’m tired of burning CDs every day for my commute), I’ve made up my mind to get the iPod. I’ve been trying to decide if I should put it on my Christmas list, or wait until my birthday (in February) since Apple usually announces new iPod models at Macworld in January. Looks like it’s safe to shoot for Christmas… now, which one?

The field before today had three choices: the 20gig model ($299), the 40gig model ($399), and the Mini ($249). The Mini is out; I want storage. Of the standard models, the 40gig makes a much better choice for me. Besides double the capacity for only $100 more, the 20gig model doesn’t come with a dock or remote. I want the dock, and it’s $39 separately. Might as well get the 40gig.

Now we come to the new choices. First, the $349 U2 iPod. It’s a 20gig iPod for $50 more than a standard 20gig iPod. For your $50, you get a black iPod with a red scrollwheel, signatures from the band laser-etched on the back, and a $50 credit towards a 400+ song “digital boxed set” of all of U2’s music from the iTunes Music Store. I find it interesting that there’s no indication of the price of this “digital boxed set”, which won’t be available until the end of November. I’m not a huge U2 fan, so the boxed set is irrelevant to me. The black iPod would be sharp without the red scroll wheel, but with it, it’s quite atrocious (at least in photos… I’ll report again once I see it at the Apple store). This one’s out of the running.

This brings us to the product everyone will be talking about, the iPod Photo. Available in 40gig and 60gig flavors, it looks like a standard iPod but with a color screen. It can store photos which you can view on the 2″ screen, and it comes with an A/V cable you can use to show your pictures on a TV. The battery life appears to be a bit longer than the b&w iPod according to the spec sheet. The 40gig model is $499 and the 60gig model is $599.

The new device is intriguing. We have taken all of our pictures with a digital camera for a couple of years now. Having a Series 2 TiVo on my home network lets me view photos on my TV, and it turns out to be great. These days, we just dock the camera, push the dock’s sync button, and head to the family room to see our pictures on the big screen. The only downside to the TiVo setup is portability; an iPod Photo would make an excellent way to show photos to everyone when we visit family. Aside from pictures, the extra capacity of the 60gig model sounds nice.

Being already prepared to spend $399 for a 40gig (standard) iPod, is it worth $100 more for the photo feature? If so, is it worth another $100 for the extra storage? If the photo option isn’t worth $100, is the storage worth $200? I’m sure for some, the answers will be yes. For me, I don’t think the benefit of the photo feature is worth the cost. I would use it some, but not enough. And $200 is too much for the extra 20gigs storage. If they offered a 60gig standard iPod for $499, I might consider it. Then again, Macworld is only a couple of months away….

Mail Issues

I’m in the process of switching hosting providers (more on that in a later post). If your mailhost is at Dreamhost (my new provider), you may not be able to send me email until I get everything sorted. In the interim, if you send me mail and it bounces, you can contact me at jason.clark {at} comcast.net.

Paypal Scam

Got a Paypal scam e-mail tonight. I have to admit, it was a nice effort. It included several warnings to check the URL in your browser’s address bar, and looked very authentic and believable, except for this bit (emphasis mine):

You will be guided through a series of steps which will require you to enter personal information, such as credit card number and/or bank details.

Of course, I’m always extremely skeptical and cautious about such things, and even without the very fishy line above, I was suspicious. By using Mail.app’s View Raw Headers option I was able to look at the HTML source. All of the images were linked from the real Paypal site, and all of the links (privacy policy, Paypal security center, update mail prefs) were valid Paypal links… except for the payload (“You MUST click the link below…”) URL, which used the %00 password-in-url hack that affects IE users who aren’t patched up to date.

At any rate, I poked around the Paypal site for a few minutes, and found an address where you can forward such emails to help Paypal research them. The address is spoof@paypal.com. If you get any of this crap, take a moment to forward it along, especially if it slips past your spam filter (as it did mine). I was impressed when five minutes later, I received an email from Paypal (probably auto-generated, but still) that included a quoted copy of what I’d sent them, along with thanks and an assurance that it was a spoof. My only suggestion is that they make this information more prominent, such as a homepage “Need to know if a Paypal email is authentic?” link.

Moreentries Plugin Update

Steve Schwartz has created an updated version of my moreentries plugin that adds a series of links for each additional page of posts, like at the bottom of the page on Google. It supports both text and image links. I think this is just fantastic… this is a feature I’ve had requested, and just never got around to implementing. Go check it out!

First Bot Ban

One nice side effect of the recent spam attack I suffered is that it got me poking around in my logs and stats. My hosting provider iPowerWeb offers stats via awstats, which isn’t the greatest but it works. While looking at my stats, I noticed My Most Frequent Visitor had over 3800 page requests this month, while the #2 visitor had fewer than 500. My Most Frequent Visitor had also sucked down over 90 Meg, while #2 had only around 15 Meg. I became quite interested in My Most Frequent Visitor.

MMFV was identified only by an IP address - 38.144.36.16. Wonder who that is?

% host 38.144.36.16
16.36.144.38.in-addr.arpa domain name pointer news.allresearch.com
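The odd-looking name in that output is just the address with its octets reversed and in-addr.arpa appended, which is how reverse DNS lookups are keyed. A small sketch of the transformation (ptr_name is a hypothetical helper name, and the code assumes a POSIX-style shell such as sh or bash):

```shell
# Sketch: build the reverse-DNS (PTR) query name for an IPv4 address.
# ptr_name is a hypothetical helper; it does no DNS lookup itself.
ptr_name() {
    # Split the dotted quad on "." into four positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    # Emit the octets in reverse order under the in-addr.arpa zone.
    printf '%s.%s.%s.%s.in-addr.arpa\n' "$4" "$3" "$2" "$1"
}

ptr_name 38.144.36.16
```

For 38.144.36.16 this prints 16.36.144.38.in-addr.arpa, the same name host reports above.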

Plugging news.allresearch.com into the browser yielded a refused connection, so I tried www.allresearch.com. Bingo. From the home page:

AllResearch, Inc. was founded in 1998 to provide research, media analysis, and strategic intelligence services for a variety of different markets.

We offer a broad range of products and services to assist various entities with gathering relevant intelligence from the online world. Utilizing cutting-edge proprietary technology, we are able to view and understand the online world in ways never before possible.

Huh. It seems that slogging through my bandwidth at 7 times the rate of any other visitor is a proprietary and cutting-edge technology. Who knew? While the marketroid-speak above isn’t perfectly clear, the menu of services certainly brings things into focus, with such items as Webclipping, TrademarkTracker, Online Peer Group Analysis, and Law Enforcement. I’m being stalked by The Man! (and I’m not the only one.)

But why is The Man (aka My Most Frequent Visitor) visiting so much more frequently than everyone else? A grep or two through my access logs reveals all. It seems that once an hour, The Man pulls my RSS feed. Okay, no problem. But then, The Man pulls every one of the posts in my feed. On the one hand, this is stupid because my feed is full content. On the other hand, this is really stupid, wasteful, and hateful because The Man requests the full content of all 10 posts in the feed every hour! Even when the feed hasn’t changed, The Man is re-reading all 10 posts. The Man must have The Bot, even though The Man’s user agent string is "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)". While I bet The Man probably does use IE, I doubt he’s using it once an hour to pull all my posts by hand. Bad, Sneaky The Man!
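This kind of log digging can be sketched in a few lines of shell. It is an illustration, not the exact commands used for this post: it assumes an Apache common/combined format access log (client IP in field 1, request path in field 7), and the function names top_visitors and requests_from are made up for the example.

```shell
# Sketch: two quick access-log summaries, assuming Apache common/combined format.
# top_visitors and requests_from are hypothetical helper names.

# top_visitors LOGFILE: request count per client IP, busiest first.
top_visitors() {
    awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' "$1" |
        sort -rn
}

# requests_from LOGFILE IP: paths requested by one IP, most requested first.
requests_from() {
    awk -v ip="$2" '$1 == ip { print $7 }' "$1" | sort | uniq -c | sort -rn
}

# Example calls (access.log is a placeholder file name):
# top_visitors access.log | head
# requests_from access.log 38.144.36.16 | head
```

The first summary surfaces an outlier like MMFV; the second shows what that outlier keeps asking for, which is where an hourly feed-plus-every-post pattern would stand out.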

Far be it from me to criticize The Man, so here ends my tale. On a completely unrelated note, check out the newest addition to my .htaccess file:

RewriteCond %{REMOTE_ADDR} "^38\.144\.36\.16$"
RewriteRule .* - [F,L]

Interestingly, I seem to be seeing a “403 Forbidden” in my logs now, once an hour, every hour, like clockwork.

Beat

I am beat. Even though I’ve been promising myself I’d write a real post one night this week, tonight just isn’t going to be the night. I’m only posting this because I challenged dugh to a month of solid blogging after he created the week-long October Blogging Challenge. Which makes this a lame cheat. The first rule of the Blogging Challenge is don’t blog about the Blogging Challenge, and all that rot.

I could sit here and write the rant I’d planned about the sad state of Windows “freeware” (yes, those are air-quotes, please make exaggerated hand motions when you read them), but that will have to wait until tomorrow. Beat, I tell you.

And yes, the comments are still down (see prior post, I’m too lazy to link it tonight). I’ll try to install the blacklist plugin this weekend. If you’re really feeling sorry for me and my abused comment system, email me (link on the right somewhere).

For now, I’m going to fire up the TiVo and watch Smallville, even if they did write out the best character they’ve ever had last week.

SpamWars: The Spampire Strikes Back

So here I was, idly checking my Bloglines feeds and lamenting the fact that I had nothing to blog about tonight. Silly rabbit, be careful what you wish for. Poing! New Mail. No, wait… 6 new e-mails in the 5 minutes since the last automatic check. That never happens. Must be comment spam on the ol’ blog.

Indeed. Not only that, but all of the spam comments showed up in my inbox as new comments, not spam attempts. This means my anti-spam measures have failed. Several months ago, I suffered a severe spam onslaught, which led to my disabling comments for three weeks. When my comment system returned, I had implemented several changes to help stop the spam. I even kept the details to myself to slow the spammers from catching on. Looks like they’ve caught on.

My countermeasures included rejecting all items without a referrer, and changing the default value in a hidden comment form field used by the Blosxom writeback plugin. Nice try. Tonight’s spammer is much more sophisticated. Each post came from a separate IP address. Referrer is present and correct, and the User Agent string looks innocuous, although I’d bet it’s a bot. The posts came in groups of three, and for each group of three I can see a single IP address GETing the original post plus other pages (archive links, etc); however the two “sniffer” IPs are different.
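The logic of those two countermeasures is simple enough to sketch. This is an illustration in shell only: the real checks live in the Perl writeback plugin, and the function name accept_comment and the value in EXPECTED_FIELD are hypothetical stand-ins, not the actual configuration.

```shell
# Sketch of the two checks described above; not the actual plugin code.
# EXPECTED_FIELD is a hypothetical replacement for the plugin's stock value.
EXPECTED_FIELD="not-the-default"

# accept_comment REFERER FIELD_VALUE: succeed only if both checks pass.
accept_comment() {
    referer=$1
    field=$2
    # Reject submissions that arrive without any referrer.
    [ -n "$referer" ] || return 1
    # Reject submissions that do not carry the customized hidden-field value.
    [ "$field" = "$EXPECTED_FIELD" ] || return 1
    return 0
}
```

A bot that first fetches the real form and then posts with a correct referrer passes both checks, which is consistent with what the logs showed tonight.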

These little weasels deserve all nine circles of Dante’s Inferno and a couple of new ones I just thought up. For now, I’ve shut off comments. Looks like I’ll be setting up the blosxom port of the MT blacklist very soon. Sorry for the inconvenience, feel free to email me in the interim. Unless you are a spammer… you may feel free to (Extremely violent and anatomically questionable recommendation censored).