Nodejs CSV Ingest to MySQL - Billion rows use case

Shahid Shaikh
Oct 7, 2017

Use case: I have a table storing file information such as path, size, etc. I need to read this table, grab the file paths, read those CSV files in parallel, and ingest them into MySQL in parallel.
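Assuming the file-information table looks roughly like this (the table and column names here are hypothetical, just for illustration), the periodic job only needs a simple query to pick up files that still have to be processed:

```sql
-- Hypothetical schema for the file-information table
CREATE TABLE file_info (
  id INT AUTO_INCREMENT PRIMARY KEY,
  file_path VARCHAR(1024) NOT NULL,
  file_size BIGINT,
  status ENUM('pending', 'processing', 'done') DEFAULT 'pending',
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

-- Query the cron job could run every 30 minutes to grab pending file paths
SELECT id, file_path FROM file_info WHERE status = 'pending';
```

A `status` column (or something equivalent) is assumed here so that a run picking up files does not re-ingest ones already handed off to the queue.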

My design:

Here is what I have come up with so far.

  • A cron job reads the file-information table every 30 minutes.
  • For each file path, open a read stream and read the files in parallel.
  • Push each stream message, i.e. the file content, into a message queue, say RabbitMQ.
  • Attach multiple listeners, say 4, at the other end of the queue, each fetching 100 messages at a time, i.e. 400 messages at once.
  • Perform the MySQL insertions in parallel and update the tables accordingly.
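As a sketch of the batching step above, here is a minimal, dependency-free helper that groups queued messages into batches of 100 and builds one multi-row INSERT per batch. The `csv_rows` table and its columns are made up for illustration; a real consumer should use parameterized queries through a driver such as mysql2 rather than string concatenation:

```javascript
// Group an array of queued messages into fixed-size batches
// (e.g. 100 messages per listener, as in the design above).
function buildBatches(messages, batchSize) {
  const batches = [];
  for (let i = 0; i < messages.length; i += batchSize) {
    batches.push(messages.slice(i, i + batchSize));
  }
  return batches;
}

// Build a single multi-row INSERT for one batch. One multi-row INSERT is far
// cheaper than one INSERT per row when ingesting very large row counts.
// NOTE: hypothetical table/columns; real code must escape values or use
// placeholders (e.g. "INSERT INTO csv_rows (path, line) VALUES ?" in mysql2).
function buildInsert(batch) {
  const values = batch
    .map((m) => `('${m.path}', '${m.line}')`)
    .join(', ');
  return `INSERT INTO csv_rows (path, line) VALUES ${values}`;
}

// Example: 250 messages split into batches of 100 -> 3 batches (100, 100, 50).
const messages = Array.from({ length: 250 }, (_, i) => ({
  path: '/data/file.csv',
  line: `row-${i}`,
}));
const batches = buildBatches(messages, 100);
```

Each of the 4 listeners would then run one `buildInsert` statement per batch inside its own connection, which is where the parallel insertion in the last step comes from.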

I need your suggestions and input; please correct me if I am doing this wrong!

Thanks in advance.