What are the most suitable datastores for storing a huge number of news articles?

Anas Rabei
·Feb 6, 2019

I have been assigned to a big new project at my current company. The project will collect a huge number of news articles from different sources.

The full requirements are still not clear, but we can anticipate some of them. For example:

  • Building some dashboards to display some statistics about the collected news.
  • Full-text search (exact, fuzzy, and synonym)
  • Providing a way for other teams (specifically the data analysis team) to query the data.

What would you suggest as a datastore for such a project?

I believe there is no one-size-fits-all solution to this type of project.

As a start, I am thinking of using Elassandra, since it combines Cassandra and Elasticsearch and may satisfy the first two points (Cassandra for aggregation and analytics, Elasticsearch for full-text search).
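To make the three search types concrete, here is a rough sketch of how they might map onto the Elasticsearch Query DSL. It only builds the JSON request bodies as Python dicts, so it runs without a cluster. The field name `body`, the analyzer name `news_synonym_analyzer`, and the synonym pairs are all placeholders I made up for illustration, not part of any real schema.

```python
import json

# Hypothetical index settings: a search-time analyzer that expands synonyms.
# The analyzer/filter names and the synonym list are assumptions.
SYNONYM_INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "filter": {
                "news_synonyms": {
                    "type": "synonym",
                    "synonyms": ["car, automobile", "pm, prime minister"],
                }
            },
            "analyzer": {
                "news_synonym_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "news_synonyms"],
                }
            },
        }
    }
}


def exact_query(field: str, phrase: str) -> dict:
    """Exact search: the words must appear together, in order."""
    return {"query": {"match_phrase": {field: phrase}}}


def fuzzy_query(field: str, text: str) -> dict:
    """Fuzzy search: tolerate small typos via edit distance."""
    return {"query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}}}


def synonym_query(field: str, text: str) -> dict:
    """Synonym search: analyze the query text with the synonym analyzer."""
    return {
        "query": {
            "match": {field: {"query": text, "analyzer": "news_synonym_analyzer"}}
        }
    }


if __name__ == "__main__":
    # Print the bodies that would be sent to the search endpoint.
    for body in (
        exact_query("body", "prime minister resigns"),
        fuzzy_query("body", "electoin results"),
        synonym_query("body", "automobile recall"),
    ):
        print(json.dumps(body, indent=2))
```

The synonym case is the only one that needs index-level configuration up front; exact and fuzzy matching work against a plain text field.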

That still leaves the third point unsatisfied. The data analysis people are more familiar with SQL, which neither Cassandra nor Elasticsearch fully supports.

The other approach I am considering is to have a separate store for the analysis team, with the application that ingests the data writing to both stores.
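A minimal sketch of what I mean by writing to both stores, assuming the search/analytics store is the primary and the SQL store for analysts is secondary. Every name here (`Article`, `InMemoryStore`, `DualWriter`) is hypothetical, and the in-memory store is only a stand-in for real database clients. The main thing it shows is that a failed secondary write has to be remembered and replayed, otherwise the two stores drift apart.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Article:
    article_id: str
    source: str
    title: str
    body: str


class ArticleStore(Protocol):
    def save(self, article: Article) -> None: ...


@dataclass
class InMemoryStore:
    """Stand-in for a real client (search cluster or SQL database)."""
    name: str
    rows: dict = field(default_factory=dict)
    fail: bool = False  # flip to simulate an outage

    def save(self, article: Article) -> None:
        if self.fail:
            raise ConnectionError(f"{self.name} is unavailable")
        # Keyed by id, so a retried write overwrites instead of duplicating.
        self.rows[article.article_id] = article


@dataclass
class DualWriter:
    primary: ArticleStore
    secondary: ArticleStore
    retry_queue: list = field(default_factory=list)

    def write(self, article: Article) -> None:
        # A primary failure propagates to the caller; nothing is written.
        self.primary.save(article)
        try:
            self.secondary.save(article)
        except ConnectionError:
            # Keep the article so the secondary can catch up later.
            self.retry_queue.append(article)

    def replay(self) -> int:
        """Retry queued secondary writes; return how many succeeded."""
        pending, self.retry_queue = self.retry_queue, []
        succeeded = 0
        for article in pending:
            try:
                self.secondary.save(article)
                succeeded += 1
            except ConnectionError:
                self.retry_queue.append(article)
        return succeeded


if __name__ == "__main__":
    search = InMemoryStore("search")
    warehouse = InMemoryStore("sql", fail=True)
    writer = DualWriter(search, warehouse)
    writer.write(Article("1", "example-source", "Headline", "Body text"))
    warehouse.fail = False
    print("replayed:", writer.replay())
```

In this sketch the retry queue lives in memory, so it is lost if the application restarts; a real setup would probably need a durable queue or a log between the writer and the stores.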

What do you think?