I'd want to know the most-used technologies on these sites. Also some general layout specifics, like the percentage of sites that use side navigation vs. top navigation, etc.
Comment by Lionel A. Pierre on "What type of data would you like to see from a crawl of the top 1 million websites" | Hashnode