Too many fracking bots

The Internet is infested with way too many bots. Bots are currently downloading twice as many pages on this site as actual humans. It’s nearly absurd. Do I really need to be indexed twice as often as people actually read my stuff? Just looking at the Apache logs, I’ve been visited thousands of times this month by each of MSN Search’s Bot, Google bot, and Yahoo’s bot. Throw in Google’s feedfetcher bot and AdSense bot and Yahoo’s feed seeker bot for good measure. And don’t forget to add a pinch of Internet Archive bot and Technorati bot.
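
For anyone who wants to check the bot-to-human ratio on their own site, here is a rough Python sketch of how the tally could be done from an Apache access log. It is not necessarily how the numbers above were produced. It assumes the log uses the combined format (user agent as the last quoted field), the marker substrings in `BOT_MARKERS` are illustrative guesses rather than a complete list, and the sample log lines are invented.

```python
import re
from collections import Counter

# Assumed, incomplete list of substrings that identify crawler user agents.
# Check your own logs for the strings that actually show up.
BOT_MARKERS = ("googlebot", "msnbot", "slurp", "feedfetcher",
               "ia_archiver", "technorati")

# In Apache's combined log format the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def classify(line):
    """Return 'bot', 'human', or 'unknown' for one access-log line."""
    match = UA_PATTERN.search(line)
    if match is None:
        return "unknown"
    agent = match.group(1).lower()
    return "bot" if any(marker in agent for marker in BOT_MARKERS) else "human"


def tally(lines):
    """Count requests per category across an iterable of log lines."""
    return Counter(classify(line) for line in lines)


# Invented sample lines, for illustration only.
SAMPLE = [
    '192.0.2.1 - - [01/Jan/2000:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.2 - - [01/Jan/2000:10:00:05 +0000] "GET /feed HTTP/1.1" 200 2048 '
    '"-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.3 - - [01/Jan/2000:10:00:09 +0000] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0"',
]

if __name__ == "__main__":
    counts = tally(SAMPLE)
    print("bot requests:  ", counts["bot"])
    print("human requests:", counts["human"])
```

To run it against a real log, pass an open file to `tally` instead of `SAMPLE`. Anything that hides or fakes its user agent will be counted as human, so treat the result as a lower bound on bot traffic.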

It’s crazy! I’m up to my neck in bots over here! And there’s not a damn thing I can do about it. Well, theoretically I could just make my site unindexable using a robots.txt file, but that’d be like curing a roach infestation through the use of nuclear weapons. Given a choice of too many bots versus nobody new being able to find my site, I think I’ll choose the bots, thank you very much. Google and Yahoo’s RSS feed bots don’t even respect robots.txt anyway.
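
To make the nuclear option concrete, here is a small Python sketch using the standard library’s `urllib.robotparser` to show what a well-behaved crawler would conclude from two robots.txt files: one that blocks everyone, and one that blocks a single crawler while leaving the rest alone. `ExampleBot` and the `/some-post` path are made-up names for illustration, and none of this helps against bots that ignore robots.txt in the first place.

```python
from urllib.robotparser import RobotFileParser

# Blocks every compliant crawler from the whole site.
BLOCK_EVERYONE = """\
User-agent: *
Disallow: /
"""

# Blocks only the hypothetical "ExampleBot"; an empty Disallow
# line means everything is allowed for all other crawlers.
BLOCK_ONE_BOT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(robots_txt, agent, url):
    """Return True if a compliant crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for name, rules in (("block everyone", BLOCK_EVERYONE),
                        ("block one bot", BLOCK_ONE_BOT)):
        for agent in ("ExampleBot", "Googlebot"):
            verdict = allowed(rules, agent, "/some-post")
            print(f"{name:15} {agent:11} allowed={verdict}")
```

The second file is the less drastic middle ground: it turns away one particularly greedy crawler without making the site invisible to the search engines that bring in readers.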

I suspect that a graph of bot activity percentage versus traffic numbers looks sort of like an inverted V-curve. Below a certain threshold, bots don’t even care about a site (it is doomed to obscurity), so what few visits you get are mostly human. And at the high end, when you’re getting hundreds of human visitors an hour, the bot visit numbers are dwarfed. But somewhere in the middle, corresponding to a small-to-medium-sized site, is the domain where bots rule. That’s where I am right now.

To be fair, I don’t really hate bots. I realize that they are a necessary evil, and that the 1,100 incoming hits from search results so far this month couldn’t possibly have come without them. It’s just annoying that nearly half of my bandwidth is being eaten up by non-sentient entities who get the same level of robotic satisfaction from consuming my site’s bits as the common splog’s.

11 Responses to “Too many fracking bots”

  1. John Says:

    Thanks for nice info….

  2. Cyde Weys Says:

    Well that’s fun, a spam comment that actually gets through my filtering by virtue of not including any spam URLs in the comment itself, but rather, just in the author URL (which I’ve since destroyed). Nice try, “John”. You could work on the very generic comment though. That raises red flags. Maybe that’ll fly on other sites, but not here.

    To everyone else: in case you’re wondering what was being spammed, the site’s sub-domain at blogspot was dancingmoney.

  3. John Says:

    Good Catch Cyde
    Anyway, its really nice to surf through your site….

  4. John Says:

    it surprised me somehow why do you want to provide this input field….??

  5. Cyde Weys Says:

    What input field, leaving comments? So I can talk with my readers. See some of the comments on other articles; we’ve had some good discussions.

    I resent the implication that just because I have a comments form means I should accept whatever someone types into it. No spamming!

  6. John Says:

    I was talking about input field for “Website”. As it doesn’t matter for you. So why do u keep this field while posting a comment?

  7. Cyde Weys Says:

    Well, the field is useful in most circumstances. For instance, if someone posting here has a blog of their own, I’ll frequently click the link to see what their blog is like. Generally people who post comments on here are going to have similar interests to me and I might be interested in reading their stuff.

  8. llywrch Says:

    Cyde, you’ve obviously heard the old Internet adage before this:

    Rule 1. Spammers lie.

    Rule 2. See rule 1.

    This is Cyde’s website, which means he can run it any way he wants. Don’t like it? Don’t post here.

    Geoff

  9. Cyde Weys Says:

    I haven’t heard it put that way explicitly, but I do agree with the general thrust of it. Yes, spammers do lie. And no, I’m not going to cave in to them. I think it’s kind of funny that I’m talking with one on an individual level. Don’t they have millions of spam mails to send off? How is it time-efficient to get into arguments with individual small site operators?

  10. Will Says:

    Maybe it’s how they have fun. I mean, it’s hardly efficient to sit around and play Supreme Commander for a whole Saturday, but…

  11. Robotic collaboration on the net | Cyde Weys Musings Says:

    [...] already wrote about how there are too many fracking bots on the Internet. Bots have downloaded twice as many pages on Cyde Weys Musings as people in February, with many [...]
