What would be the tech stack of a high data volume system like Google analytics?

Ankit Singhaniya
·Sep 14, 2017

I was wondering: if I were to build something like Google Analytics today, what stack should I use, and what does Google itself use? The scale of the app need not be as large as Google's, but it should work for most general use cases.

The system will have the following requirements:

  1. The volume of data can be very high and variable.
  2. The data needs to be stored for a long period of time and should remain queryable.
  3. It should be cost-effective (this goes without saying).

I have some ideas:

  1. I think serverless should be the way to go, as it can scale up and down with ease and in a cost-effective way.
  2. The database is the most confusing part, as it should be able to handle that much data while staying efficient. I have the following options:
    1. MongoDB - is queryable, and being NoSQL it should be able to handle a large volume of data
    2. Cassandra - highly scalable and made for high-volume data, but what about querying and setup?
    3. PostgreSQL - querying will be a charm, but can it handle the volume?
    4. Elasticsearch - ??
  3. I am also wondering whether I will need a service like Kafka or Kinesis on the front line.
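
To make the shape of these ideas concrete, here is a minimal sketch of the data flow: events land in a front-line buffer, get drained in batches, and are written to a store as hourly counts. Everything in it is an assumption for illustration only. `EventBuffer` is an in-memory stand-in for something like Kafka or Kinesis, `HourlyPageviewStore` is a stand-in for whichever database is chosen, and the event fields `ts` and `page` are hypothetical.

```python
from collections import Counter, deque
from datetime import datetime, timezone


class EventBuffer:
    """In-memory stand-in for a front-line queue (e.g. Kafka or Kinesis)."""

    def __init__(self):
        self._queue = deque()

    def publish(self, event):
        # Accept the event immediately; storage happens later, in batches.
        self._queue.append(event)

    def drain(self, max_batch):
        # Pull up to max_batch events off the queue in arrival order.
        batch = []
        while self._queue and len(batch) < max_batch:
            batch.append(self._queue.popleft())
        return batch


class HourlyPageviewStore:
    """Stand-in for the database: keeps one counter per (hour, page)."""

    def __init__(self):
        self.counts = Counter()

    def write_batch(self, batch):
        for event in batch:
            ts = datetime.fromtimestamp(event["ts"], tz=timezone.utc)
            hour = ts.strftime("%Y-%m-%dT%H:00Z")
            self.counts[(hour, event["page"])] += 1


def run_pipeline(events, batch_size=2):
    """Publish all events, then drain the buffer into the store in batches."""
    buffer = EventBuffer()
    store = HourlyPageviewStore()
    for event in events:
        buffer.publish(event)
    while True:
        batch = buffer.drain(batch_size)
        if not batch:
            break
        store.write_batch(batch)
    return store.counts


if __name__ == "__main__":
    sample = [
        {"ts": 0, "page": "/a"},
        {"ts": 60, "page": "/a"},
        {"ts": 3600, "page": "/b"},
    ]
    for key, count in sorted(run_pipeline(sample).items()):
        print(key, count)
```

The point of the buffer is that ingestion and storage are decoupled: a spike in traffic only grows the queue, while the writer keeps feeding the database at its own pace. Storing pre-aggregated hourly counts rather than raw events is just one possible trade-off between storage cost and query flexibility.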

Am I missing any piece here? What would you choose to build this?

There are also other products like keen.io and Treasure Data. What tech stack are they likely using currently?