Well, 20 images at 4.2 megabytes is pretty heavy. Are those thumbnails? If so I'd try using srcset to serve them at smaller sizes and/or play with the formats and encoding more. Assuming they're all thumbnails (the only way I'd allow that many images on a single page at once) you should really be at a megabyte or less. What formats are you using? What type of images are they? Are you sure the format matches the image content? Can colour depth reduction or heavier compression be leveraged?
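To make the srcset idea concrete, here's a rough sketch of building the attribute value for one thumbnail. The file-naming scheme ("thumb-320.webp" and so on), the widths, and the function name are all made up for illustration; swap in whatever your image pipeline actually produces.

```typescript
// Builds a srcset attribute value from a list of pre-generated widths.
// ASSUMPTION: files are named "<baseName>-<width>.<ext>" (hypothetical scheme).
function buildSrcset(baseName: string, widths: number[], ext: string): string {
  return widths
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a - b)
    .map((w) => `${baseName}-${w}.${ext} ${w}w`)
    .join(", ");
}

// Example: three hypothetical thumbnail sizes.
const thumbSrcset = buildSrcset("thumb", [640, 160, 320], "webp");
// thumbSrcset is "thumb-160.webp 160w, thumb-320.webp 320w, thumb-640.webp 640w"
```

The resulting string goes into the img tag's srcset attribute alongside a sizes attribute, so the browser can pick the smallest file that fits the slot.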
THOUGH, if you have 3MB of images out of roughly 4MB total, I would be as worried about the code as the images. If you've actually got 1.2 megabytes of JS, HTML, and CSS, the bottlenecks are more likely to be there than in the images.
How many separate files are there in total? That's a huge bottleneck, since every file past the first 8 averages 200ms and can reach a second or more apiece in "handshaking" overhead. That's why combining your scripts and styles down to single files can have huge payoffs in page-load time. More so if you're doing this via HTTPS, given its increased overhead.
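For a back-of-the-envelope feel of what that costs, here's a small sketch using the rule-of-thumb numbers above (first 8 files "free", roughly 200ms per file after that). Those figures are averages from experience, not measurements of your site, and the function and constant names are just for illustration.

```typescript
// ASSUMPTION: rule-of-thumb figures from the post, not measured values.
const FREE_FILES = 8;        // files that load without the extra penalty
const AVG_OVERHEAD_MS = 200; // average per-file overhead beyond those

// Estimates the added request overhead for a page with `fileCount` files.
function estimateOverheadMs(fileCount: number, perFileMs: number = AVG_OVERHEAD_MS): number {
  const extraFiles = Math.max(0, fileCount - FREE_FILES);
  return extraFiles * perFileMs;
}

const overheadBefore = estimateOverheadMs(40); // 32 extra files * 200ms = 6400ms
const overheadAfter = estimateOverheadMs(12);  // 4 extra files * 200ms = 800ms
```

So going from a hypothetical 40 files to 12 by combining scripts and styles would shave several seconds off this estimate. Pass 1000 as the second argument to see the worst case of a second apiece.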
Is there any way you could use the inaccurately named "CSS Sprite" technique to reduce the overall number of images? Could some of the images be reduced to monochrome vectors and stored in a webfont as a single smaller (and more scalable) file?
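If you haven't used sprites before, the gist is one combined image plus a background-position offset per icon. A minimal sketch, assuming equally sized icons laid out left to right in a single strip (the layout and the function name are my assumptions, not a standard):

```typescript
// Returns the CSS background-position for the icon at `index` in a
// horizontal sprite strip where every cell is `cellWidth` pixels wide.
// ASSUMPTION: single-row strip, equal cell widths (hypothetical layout).
function spritePosition(index: number, cellWidth: number): string {
  // Shift the sheet left by one cell per index; avoid emitting "-0px".
  const x = index === 0 ? 0 : -(index * cellWidth);
  return `${x}px 0px`;
}

const firstIcon = spritePosition(0, 32);  // "0px 0px"
const fourthIcon = spritePosition(3, 32); // "-96px 0px"
```

Each icon element then shares the same background-image and differs only in that position, so twenty small icons become one file request.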
Are you storing those images as static files in one form or another so they can be served by your HTTP server and not your back-end language? That's pretty much a must-have.
How many elements are on the page? How big is the markup? Excessive / unnecessary markup can not only delay the render, it can cause slowdowns in your scripting and make the server work harder. This is one of the many reasons I dislike (actually, more like rabidly hate) frameworks, as they tend to just slop in endless pointless DIVs and classes for no good reason.
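If you want a quick number to answer that, here's a rough sketch that tallies opening tags in a chunk of markup. It's a regex approximation rather than a real HTML parser (it will miscount if an attribute value contains ">"), and the function name is made up. In the browser console, document.getElementsByTagName("*").length gives you the live element count directly.

```typescript
// Counts opening tags per tag name in an HTML string.
// ASSUMPTION: a regex is "good enough" for a rough audit; this is not a parser.
function countTags(html: string): Record<string, number> {
  const openTag = /<([a-zA-Z][a-zA-Z0-9-]*)\b[^>]*>/g;
  const counts: Record<string, number> = {};
  let match: RegExpExecArray | null;
  while ((match = openTag.exec(html)) !== null) {
    const tagName = match[1].toLowerCase();
    counts[tagName] = (counts[tagName] ?? 0) + 1;
  }
  return counts;
}

// Example: two wrapper DIVs around a single paragraph.
const tagCounts = countTags('<div><div class="wrap"><p>hi</p></div></div>');
// tagCounts.div is 2, tagCounts.p is 1
```

If the DIV count dwarfs everything else on a content page, that's usually the framework padding I'm talking about.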
What's the scripting breakdown? How much of it is off-site stuff like social media or discussion services, and how much of it is on-site template stuff?
I mean, 1.2 megs of code -- unless you've got something like disqus being loaded -- is pretty heavy. Painfully so in fact.
That you say "SSR for first load" is kind of a warning sign too, as it implies later loads are CSR only with no fallbacks. That calls into question your semantics and possibly even your mechanism of action. There's a reason I consider methodologies like those used by React and Angular to be trash that is more likely to get a developer into trouble than it is a good way of building websites.
Is there any way you could lazy-load some of this on demand? I'm not a fan of the method, but there are times when it's the most viable option. (Though if so, it should be done as an enhancement with a fallback.)
I'd have to see a sample of the markup and probably the content too, but dimes to dollars you've probably got two to ten times the HTML needed, ten times the CSS needed, and 50 times the JavaScript needed. That's a wild (and partly unfounded) guess, but typical of what I've been seeing people do with all the hot and trendy framework nonsense that's so popular right now. There ARE reasons to get things as big as you're saying (such as social media and discussion plugins), but they're not usually something talked about at the stage of development you seem to be at.
But that really hinges on what the site is actually doing.