A weekend project - fromthecache.com
I was playing around on the weekend, screen-scraping and analyzing word frequencies for various sites (don't ask), and was getting some slow responses (I also accidentally got my IP blocked from one site after requesting pages a few too many times). Eventually I hit upon the idea of requesting the Google Cache copy of each URL instead (the pages I was scraping had sequential ?id=xxx URLs, so it was easy to automate), with the aim of speeding things up a bit and taking some load off the target sites. With this in mind, I spent a few hours on Saturday and Sunday developing fromthecache.com - it's built on Rails and designed to provide transparent access to the Google cache, fetching the original page as a fallback if necessary. ...
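To make the cache-first, origin-as-fallback idea concrete, here is a minimal Ruby sketch. It is not the actual fromthecache.com code: the helper names (`cache_url_for`, `http_fetch`, `fetch_from_cache`) are hypothetical, and the cache URL prefix is only my assumption about what a Google cache lookup looked like, so it may not be served any more. The fetcher is passed in as a parameter so the fallback logic can be tried without touching the network.

```ruby
require "net/http"
require "uri"

# Build the cache lookup URL for a page. The prefix is an assumed
# Google cache endpoint, not something taken from the site itself.
def cache_url_for(url, prefix: "http://webcache.googleusercontent.com/search?q=cache:")
  prefix + URI.encode_www_form_component(url)
end

# Plain HTTP GET: returns the body on a 2xx response, nil on any
# other status or on a network error.
def http_fetch(url)
  begin
    response = Net::HTTP.get_response(URI.parse(url))
    response.is_a?(Net::HTTPSuccess) ? response.body : nil
  rescue StandardError
    nil
  end
end

# Try the cached copy first; if that yields nothing, fall back to the
# original page. Returns [source, body], where source is :cache or
# :origin and body may be nil if both requests fail.
def fetch_from_cache(url, fetcher: method(:http_fetch))
  body = fetcher.call(cache_url_for(url))
  return [:cache, body] if body

  [:origin, fetcher.call(url)]
end

# Example with a stub fetcher that simulates a cache miss.
stub = ->(u) { u.include?("cache:") ? nil : "<html>original page</html>" }
source, body = fetch_from_cache("http://example.com/page?id=42", fetcher: stub)
puts "#{source}: #{body}"
```

Returning the source alongside the body makes it easy to log how often the cache actually saves a request to the target site, which was the whole point of the exercise.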