Phillip Pearson - web + electronics notes

tech notes and web hackery from a new zealander who was vaguely useful on the web back in 2002 (see: python community server, the blogging ecosystem, the new zealand coffee review, the internet topic exchange).

2003-5-4

Weirdness in the referrer log

The Googlebot is usually fairly well-behaved, but today I saw this:

/pycs_search/htdig-pycs-snapshot-20030402.tar.gz - 245 hits (32276480 bytes)
      23: 64.68.84.42; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      20: 64.68.84.51; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      18: 64.68.84.31; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      17: 64.68.85.13; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      15: 64.68.84.76; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      15: 64.68.84.39; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      15: 64.68.84.143; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      13: 64.68.84.149; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      12: 64.68.84.16; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      12: 64.68.84.137; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      11: 64.68.85.6; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      11: 64.68.84.15; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      9: 64.68.84.6; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      9: 64.68.84.153; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      9: 64.68.84.144; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      7: 64.68.84.46; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      7: 64.68.84.132; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      7: 64.68.84.131; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      5: 64.68.85.9; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      4: 64.68.84.49; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      4: 64.68.84.134; Googlebot/2.1 (+http://www.googlebot.com/bot.html)
      2: 64.68.84.43; Googlebot/2.1 (+http://www.googlebot.com/bot.html)


It looks like a whole heap of different Googlebot instances have been downloading the file in 128K chunks. The file is about 5 MB long and the log shows roughly 31 MB transferred, so I guess they've got six different copies of it in the cache right now. I didn't realise they indexed .tar.gz files ;-)
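
A quick sanity check on those numbers, as a rough Python sketch. The hit counts and byte total are copied from the log summary above; the file size is an assumption (reading "about 5 MB" as exactly 5 MiB, since the exact size isn't in the log), so the copy count is only an estimate.

```python
# Figures copied from the log summary above.
total_hits = 245
total_bytes = 32_276_480

# Per-address hit counts, in the order they appear in the log.
hits_per_address = [23, 20, 18, 17, 15, 15, 15, 13, 12, 12, 11, 11,
                    9, 9, 9, 7, 7, 7, 5, 4, 4, 2]

# Assumption: "about 5 MB" is taken as exactly 5 MiB.
approx_file_bytes = 5 * 1024 * 1024

# Average size of one response, and how many whole-file copies that adds up to.
avg_chunk = total_bytes / total_hits
copies = total_bytes / approx_file_bytes

print(f"{len(hits_per_address)} addresses, {sum(hits_per_address)} hits")
print(f"average response: {avg_chunk:.0f} bytes ({avg_chunk / 1024:.1f} KiB)")
print(f"roughly {copies:.1f} copies of the file")
```

The per-address counts add up to the 245 hits in the header line, and the average response comes out just over 128K, which is where the chunk size guess comes from.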