Hey!
For starters, welcome to Hashnode. Trust me when I say this, Hashnode is one of the best (if not the best) communities to be a part of.
The problem you're facing is actually very simple to rectify. In database-uber-jargon, we call this "Highly Available, Redundant, Duplex Data Stores". Let me dissect every word there.
Depending on what you want, there are many methods to cluster database servers. There is the M/M (Master/Master) database replication; M/S (Master/Slave) database replication; and the M/S/S (Master/Slave/Slave) replication.
Very simply put, "Master" is the database where you can do R/W (read and write) operations. "Slave" is the one which replicates the data from the master, and only offers clients to read from it.
Essentially what you'd want to do is setup database replication: say M/S (it's the most common one). Then you'd want to create a Redis middleware to store the immediate queries. What the Redis queue does do is that it stores all of the queries the client makes, and then different worker processes fetch each query and execute it on the server, removing it from the Redis stack.
This takes care of two of your problems:
Another thing you can do is setup a Galera cluster for true-sync data replication, and then a load balancer in between. No retry logic, whatsoever.
If you need help in setting up any of these, or you have any other questions, feel free to ask! I hope this helps. :)