I am trying to build a Slack clone using Node.js, Redis, MongoDB and Socket.IO. I have read that Slack uses Redis in their stack, and I am wondering what the reason behind this is. I am struggling to understand the caching flow!
Is this diagram correct?

Gijo Varghese
A WordPress speed enthusiast
What are the advantages of caching chat messages when users send new messages every second?
If you perform reads from the DB for real-time operations like instant messaging, it would be expensive. Secondly, you'd face concurrent DB write issues because of locks. Cache reads are much faster.
If you cache too many messages in Redis, wouldn't that be a problem?
You'd usually store the latest X messages of every channel/workspace in the cache, depending on the use case. Whenever you receive a message, you'd broadcast the write request to both the cache and the DB asynchronously, through queues or similar approaches.
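A minimal sketch of that dual-write flow. The cache and DB clients are injected as plain objects so the logic is illustrative and testable; in production the cache would wrap real Redis calls (e.g. LPUSH + LTRIM on a per-channel list) and `db.insert` a MongoDB write. `MAX_CACHED` and all names here are assumptions, not Slack's actual design.

```javascript
// Keep only the latest MAX_CACHED messages per channel in the cache.
const MAX_CACHED = 100; // assumption: tune per use case

// cache: Map-like store (stand-in for Redis), db: object with an async
// insert(channelId, message) method (stand-in for MongoDB).
function handleNewMessage(cache, db, channelId, message) {
  // 1. Update the hot cache synchronously: prepend and trim the list.
  const list = cache.get(channelId) || [];
  list.unshift(message);
  cache.set(channelId, list.slice(0, MAX_CACHED));

  // 2. Persist asynchronously -- don't block the hot path on the DB write.
  return db.insert(channelId, message)
    .catch(err => console.error('DB write failed:', err));
}
```

In a real deployment step 2 would typically go through a queue so a slow or unavailable DB never delays message delivery.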
What if the server goes down or some error occurs? Am I going to lose all the data in Redis?
As mentioned above, you'd always write to the DB along with the cache, since a cache cannot be used as a persistence layer.
I'm not sure whether Slack uses Redis only for caching or also for Redis's Pub/Sub module. Pub/Sub follows the publisher-subscriber pattern: online users are subscribed to all the channels they're part of, and when a user posts a message to a channel, that message gets broadcast to all of the channel's subscribed online users.
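The pattern can be sketched with a tiny in-process publisher-subscriber, used here as a stand-in for Redis Pub/Sub (with a real Redis connection you'd call `subscribe`/`publish` on the client instead; the channel names are made up):

```javascript
// In-process stand-in for Redis Pub/Sub, just to show the pattern.
class PubSub {
  constructor() {
    this.subscribers = new Map(); // channel name -> Set of handler functions
  }
  subscribe(channel, handler) {
    if (!this.subscribers.has(channel)) this.subscribers.set(channel, new Set());
    this.subscribers.get(channel).add(handler);
  }
  publish(channel, message) {
    const handlers = this.subscribers.get(channel) || new Set();
    for (const h of handlers) h(message); // broadcast to every subscriber
    return handlers.size; // like Redis PUBLISH: number of receivers
  }
}
```

Note that Redis Pub/Sub is fire-and-forget: a user who is offline when `publish` runs never receives the message, which is exactly why the message also has to be written to the DB.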
First of all, Socket.IO is not a framework but a WebSocket server and client library. If we called it a framework, it wouldn't be fair to .NET, Symfony, Laravel, Spring and many others. As a library, it opens a socket between client and server to push message packets to all connected clients, and it is a great tool. But to me it is still unstable and opinionated: if you run a Socket.IO server, the only client option is their own client script.
Meanwhile, Redis already provides a Pub/Sub event emitter: you subscribe and listen to events, and when data is changed or added in Redis, it notifies all subscribers. npmjs.com/package/node-redis-pubsub is a great Node implementation.
Long story short: if the OP's main concern is updating clients based on the Redis cache, adding Socket.IO in the middle is unnecessary. But if the concern also includes client-to-client communication or other functionality outside of Redis, WebSockets are the way to go. Even then, Socket.IO wouldn't be my first choice.
Vishwa Bhat has written an excellent answer. Along with that:
Most DBs like MongoDB, Postgres, etc. write data to the hard disk, while Redis uses RAM. RAM is far faster than disk (even SSD) for reads and writes.
However, RAM is expensive and volatile, so you can't store all messages in Redis. Use it as a cache layer: only frequently accessed data is cached, and anything not present in the cache is fetched from the real database.
For example, take a chat between two users that has crossed 1000 messages. Store the last 100 messages in Redis, so that whenever a user fetches the latest messages, the request never hits the DB.
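That read path is the classic cache-aside pattern, sketched below with injected stand-ins for Redis and the DB (with real clients this would be an LRANGE on the per-chat list, falling back to a MongoDB query; the calls are written synchronously here purely for clarity, real Redis/Mongo calls are async):

```javascript
// Cache-aside read for "latest messages".
// cache: Map-like store, db: object with findLatest(chatId, limit).
function latestMessages(cache, db, chatId, limit = 100) {
  const cached = cache.get(chatId);
  if (cached && cached.length > 0) {
    return cached.slice(0, limit); // cache hit: no DB round-trip
  }
  const fromDb = db.findLatest(chatId, limit); // cache miss: hit the real DB
  cache.set(chatId, fromDb); // warm the cache for the next reader
  return fromDb;
}
```

Only the first reader after a restart or eviction pays the DB cost; everyone after that is served from RAM.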
You can also configure Redis to write a copy of everything to disk every X seconds, so if the server goes down, the data from disk is loaded back into RAM on restart. You should also architect it so that if Redis goes down, data can be read from the real DB.
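In Redis terms this is RDB snapshotting (plus the optional AOF log). A sketch of the relevant `redis.conf` directives, with illustrative values:

```
# redis.conf -- values are illustrative, tune to your durability needs
save 60 1000          # RDB: snapshot to disk if >= 1000 keys changed in 60 s
appendonly yes        # AOF: also log every write for tighter durability
appendfsync everysec  # fsync the AOF once per second (lose at most ~1 s)
```

Even with both enabled, the DB remains the source of truth; Redis persistence only shortens recovery, it doesn't replace the real database.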
I had a similar situation, but the data was not frequently changed, so instead of Redis I used Cloudflare. Read this post: coffeencoding.com/how-i-used-cloudflare-to-reduce…