I want to crawl the top 1 million websites in the world, as ranked by Alexa. What kind of data would you like to see from a crawl like this?
Some ideas so far: the most common trackers and ad services, maybe the top 100 words, the JS frameworks and libraries in use, and so on.
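To make those ideas concrete, here is a rough sketch of how the per-page tallies could work, assuming the HTML of each page has already been fetched. The `TRACKER_HOSTS` and `LIBRARY_HINTS` tables and the `PageScanner`, `scan_page`, and `tally` names are made up for illustration; a real crawl would need much larger signature lists and smarter detection than matching script URLs.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical signature tables, for illustration only.
TRACKER_HOSTS = {"google-analytics.com", "doubleclick.net"}
LIBRARY_HINTS = {"jquery": "jQuery", "react": "React", "angular": "Angular"}


class PageScanner(HTMLParser):
    """Collects external script URLs and visible words from one page."""

    def __init__(self):
        super().__init__()
        self.script_srcs = []
        self.words = Counter()
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.script_srcs.append(src)
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Ignore inline JS and CSS; count only text outside those tags.
        if not self._skip:
            self.words.update(re.findall(r"[a-z]+", data.lower()))


def scan_page(html):
    """Return (trackers, libraries, word counts) found in one HTML document."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    trackers, libraries = set(), set()
    for src in scanner.script_srcs:
        host = urlparse(src).hostname or ""
        for tracker in TRACKER_HOSTS:
            if host == tracker or host.endswith("." + tracker):
                trackers.add(tracker)
        # Crude guess: look for a library name in the script's file name.
        filename = src.rsplit("/", 1)[-1].lower()
        for hint, name in LIBRARY_HINTS.items():
            if hint in filename:
                libraries.add(name)
    return trackers, libraries, scanner.words


def tally(pages):
    """Aggregate over many pages: sites per tracker/library, total word counts."""
    trackers, libraries, words = Counter(), Counter(), Counter()
    for html in pages:
        page_trackers, page_libraries, page_words = scan_page(html)
        trackers.update(page_trackers)
        libraries.update(page_libraries)
        words.update(page_words)
    return trackers, libraries, words
```

Calling `tally(pages)` on an iterable of HTML strings gives three counters, and `words.most_common(100)` would then give the top 100 words. Matching on file names will produce false positives, so treat the library counts as a first approximation.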
Lionel A. Pierre
Always learning
I'd want to know the most used technologies on these sites. Also some general layout specifics, like the percentage of sites using side navigation vs. top navigation, etc.