My #1 concern is capturing all the requests. Could Amazon's Elastic Load Balancer pull this off?
And how would you store a massive amount of web request data from several thousand users?
How would you index that data for search, query it quickly, back it up, and visualize it?
More importantly, how would you catch malicious bot traffic?
I'm fishing for ideas on how to implement your own 'kissmetrics', because it seems like an interesting project that I can sink my teeth into.