Too many fracking bots
The Internet is infested with way too many bots. Bots are currently downloading twice as many pages on this site as actual humans. It’s nearly absurd. Do I really need to be indexed twice as often as people actually read my stuff? Just looking at the Apache logs, I’ve been visited thousands of times this month by each of MSN Search’s Bot, Google bot, and Yahoo’s bot. Throw in Google’s feedfetcher bot and AdSense bot and Yahoo’s feed seeker bot for good measure. And don’t forget to add a pinch of Internet Archive bot and Technorati bot.
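For anyone who wants to run the same kind of tally on their own logs, here is a minimal sketch. It assumes Apache's combined log format, where the User-Agent is the last quoted field, and it assumes a handful of User-Agent substrings for the bots named above. The marker list, the function names, and the sample lines are all illustrative rather than taken from this site's actual logs, so check them against what your own logs contain.

```python
import re
from collections import Counter

# Assumed User-Agent substrings for the bots mentioned in the post.
# These are illustrative; verify the exact strings in your own logs.
BOT_MARKERS = {
    "Googlebot": "Google bot",
    "Feedfetcher-Google": "Google feedfetcher",
    "Mediapartners-Google": "Google AdSense",
    "msnbot": "MSN Search",
    "Yahoo! Slurp": "Yahoo",
    "YahooFeedSeeker": "Yahoo feed seeker",
    "ia_archiver": "Internet Archive",
    "Technorati": "Technorati",
}

HUMAN = "human (probably)"

# In the combined log format, the User-Agent is the last double-quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def classify(line):
    """Return a bot label, HUMAN, or 'unparsed' for one log line."""
    match = UA_PATTERN.search(line)
    if match is None:
        return "unparsed"
    agent = match.group(1).lower()
    for marker, label in BOT_MARKERS.items():
        if marker.lower() in agent:
            return label
    return HUMAN


def tally(lines):
    """Count requests per label across an iterable of log lines."""
    return Counter(classify(line) for line in lines)


def bot_share(counts):
    """Fraction of parsed requests that came from a recognized bot."""
    known = sum(n for label, n in counts.items() if label != "unparsed")
    bots = sum(n for label, n in counts.items()
               if label not in ("unparsed", HUMAN))
    return bots / known if known else 0.0


# Made-up sample lines, only here to show the expected input shape.
SAMPLE_LINES = [
    '192.0.2.1 - - [22/Feb/2007:06:05:00 -0500] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.2 - - [22/Feb/2007:06:06:00 -0500] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20070219 Firefox/2.0.0.2"',
    '192.0.2.3 - - [22/Feb/2007:06:07:00 -0500] "GET /feed HTTP/1.1" 200 900 '
    '"-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
]

if __name__ == "__main__":
    counts = tally(SAMPLE_LINES)
    for label, n in counts.most_common():
        print(label, n)
    print("bot share:", round(bot_share(counts), 2))
```

To use it on a real log, pass an open file to `tally` instead of `SAMPLE_LINES`. Substring matching on the User-Agent is crude: it misses bots that aren't on the list and trusts whatever the client claims to be, so treat the result as a rough estimate.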
It’s crazy! I’m up to my neck in bots over here! And there’s not a damn thing I can do about it. Well, theoretically I could just make my site unindexable using a robots.txt file, but that’d be like curing a roach infestation through the use of nuclear weapons. Given a choice of too many bots versus nobody new being able to find my site, I think I’ll choose the bots, thank you very much. Google and Yahoo’s RSS feed bots don’t even respect robots.txt anyway.
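To make the nuclear option concrete, here is a small sketch that uses Python's standard `urllib.robotparser` to show what a blanket robots.txt tells a well-behaved crawler. The two rule sets, the `allowed` helper, and the example.com URL are assumptions for illustration, not this site's actual configuration. The second rule set shows a less drastic variant that singles out one crawler by name, though it only helps with bots that honor robots.txt in the first place.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: tell every crawler to stay away from everything.
BLANKET_BAN = """\
User-agent: *
Disallow: /
"""

# Hypothetical rule set: turn away one named crawler, allow the rest.
SELECTIVE_BAN = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/some-post"  # placeholder URL
    for name, rules in (("blanket", BLANKET_BAN), ("selective", SELECTIVE_BAN)):
        for agent in ("Googlebot", "ia_archiver"):
            print(name, agent, allowed(rules, agent, url))
```

Either way, robots.txt is only a request. A crawler that ignores the file is not affected by anything written in it.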
I suspect that a graph of bot activity percentage versus traffic numbers looks sort of like an inverted V-curve. Below a certain threshold bots don’t even care about a site (it is doomed to obscurity), so what few visits you get are mostly human. And at the high end, when you’re getting hundreds of human visitors an hour, the bot visit numbers are dwarfed. But somewhere in the middle, corresponding to a small-to-medium-sized site, is the domain where bots rule. That’s where I am right now.
To be fair, I don’t really hate bots. I realize that they are a necessary evil, and that the 1,100 incoming hits from search results so far this month couldn’t possibly have come without them. It’s just annoying that nearly half of my bandwidth is being eaten up by non-sentient entities who get the same level of robotic satisfaction from consuming my site’s bits as the common splog’s.
February 22nd, 2007 at 06:05
Thanks for nice info….
February 22nd, 2007 at 09:00
Well, that’s fun: a spam comment that actually gets through my filtering by virtue of not including any spam URLs in the comment itself, but rather just in the author URL (which I’ve since destroyed). Nice try, “John”. You could work on the very generic comment, though. That raises red flags. Maybe that’ll fly on other sites, but not here.
To everyone else: in case you’re wondering what was being spammed, the site’s sub-domain at blogspot was dancingmoney.
March 1st, 2007 at 04:12
Good Catch Cyde
Anyway, its really nice to surf through your site….
March 1st, 2007 at 04:13
it surprised me somehow why do you want to provide this input field….??
March 1st, 2007 at 08:59
What input field, leaving comments? So I can talk with my readers. See some of the comments on other articles; we’ve had some good discussions.
I resent the implication that just because I have a comments form means I should accept whatever someone types into it. No spamming!
March 6th, 2007 at 15:37
I was talking about input field for “Website”. As it doesn’t matter for you. So why do u keep this field while posting a comment?
March 6th, 2007 at 15:48
Well, the field is useful in most circumstances. For instance, if someone posting here has a blog of their own, I’ll frequently click the link to see what their blog is like. Generally people who post comments on here are going to have similar interests to me and I might be interested in reading their stuff.
March 9th, 2007 at 13:46
Cyde, you’ve obviously heard this old Internet adage before:
Rule 1. Spammers lie.
Rule 2. See rule 1.
This is Cyde’s website, which means he can run it any way he wants. Don’t like it? Don’t post here.
Geoff
March 9th, 2007 at 14:09
I haven’t heard it put that way explicitly, but I do agree with the general thrust of it. Yes, spammers do lie. And no, I’m not going to cave in to them. I think it’s kind of funny that I’m talking with one on an individual level. Don’t they have millions of spam mails to send off? How is it time-efficient to get into arguments with individual small site operators?
March 9th, 2007 at 15:03
Maybe it’s how they have fun. I mean, it’s hardly efficient to sit around and play Supreme Commander for a whole Saturday, but…
March 7th, 2008 at 18:26
[...] already wrote about how there are too many fracking bots on the Internet. Bots have downloaded twice as many pages on Cyde Weys Musings as people in February, with many [...]