Making fast websites is easy, stop making it hard!

Jason Knight
·Dec 21, 2018

The recent post by Sandeep Panda about making subsequent pageloads faster, and the discussion I ended up having with Emil Moe in the responses, had me thinking I should write an article on this topic, just because it's so simple a thing that so many out there give little but lip service to during their development stage, and then get duped, suckered, or otherwise bamboozled by "tools" that are supposed to help... when really that's like trying to treat a gunshot wound with painter's tape.

First off, I think the topic should be broken into three parts -- cache empty first load, cache filled subpages, and scripting overhead.

When it comes to that first-load the most important thing is keeping file counts low. Every separate file is a handshake, requiring at least three back-and-forth requests between client and server per file. Whilst yes, SPDY (and its successor HTTP/2) under HTTPS can help, those are still file requests that can tax the server even if the actual networking connections are reduced.

The same effect can be seen exaggerated in FTP. Have you ever noticed how a single 10 megabyte file will upload many, many times faster than a thousand 10k files? That's the handshaking overhead biting you. HTTP has the same flaw, and only gets around it by leveraging how HTTP requests are structured, and allowing for more than one connection at a time so the overhead is hidden behind other files' transfer times. The latter you can often do in FTP as well.

It is also the thing most oft neglected by developers, particularly those steeped in "frameworks" and/or heavy amounts of pointless presentational images. At the same time it is one of the easiest things to fix. On deployment just combine all your CSS into a single file. Combine all your JavaScript (that you can) into a single file. Use techniques like "fonts as images" -- like fontawesome -- or the incorrectly named "CSS sprites" to use fewer separate image files.

It's pretty obvious how to fix it, which is why it's so disturbing how few people even try, even as they try to scrape out a few bytes through minification.
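As a rough illustration of how little a "combine on deployment" step needs, here is a minimal Node sketch. The function name `combineFiles` and the file names in the usage comment are hypothetical, and any real build would likely do more (ordering rules, cache-busting names, and so on).

```javascript
// build-bundle.js -- hypothetical deploy-time helper, not taken from any real project.
const fs = require('fs');
const path = require('path');

// Join several CSS (or JS) source files into one file so a first-load
// costs one request instead of one request per source file.
function combineFiles(sourcePaths, outputPath) {
  const combined = sourcePaths
    .map((file) => '/* ' + path.basename(file) + ' */\n' + fs.readFileSync(file, 'utf8'))
    .join('\n');
  fs.writeFileSync(outputPath, combined);
  return combined.length; // length of the combined text, handy for a quick size check
}

// Example usage (placeholder file names):
// combineFiles(['reset.css', 'layout.css', 'theme.css'], 'site.css');
```

The comment header written before each chunk is valid in both CSS and JavaScript, so the same helper can be pointed at either kind of file.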

Combining your files can also be used to pre-cache subpages. If you have a site where people visit lots of separate pages, they are often willing to forgive a bit of bandwidth use up front if the sub-pages load as quickly as they can click on them.

With subsequent pages the same concepts apply, but if you've done your job right most things should be cached. This is usually the second big failure. There should on a normal pageload for a subpage be only two things you need to worry about if your first-load is properly pre-caching ALL your style and scripts. Those two things are your new HTML and any content media unique to that page. Even if you CSR those concepts apply though instead of new HTML you should be sending "just the data" or "just the content area markup" instead of the whole page's markup.

To make that new markup on a subpage as fast as possible -- and this helps with first-load too -- the best move you can make is to simply keep as much of your presentation and scripting out of the markup as you can. If you've heard the term "separation of presentation from content" that's exactly what they're talking about, and speed of subpages is a hefty chunk of why. I'm shocked how many think "separation of presentation from content" in regards to web development is a "server side thing"... it isn't!

It is one of the entire reasons CSS exists separate from HTML. It is a hefty chunk of why presentational tags like FONT and CENTER were deprecated in HTML 4 Strict and why you're supposed to have your markup say what things are, not what they look like.

... and it is the entire reason that saying something like class="text-center text-dark box-shadow col-4-s col-6-m" is ignorant, broken, nonsensical, and in general amounting to nothing more than making water on your codebase. Saying what you want things to look like has no business in your HTML apart from a few small corner cases where the presentation helps convey meaning; such as font-size in a tag-cloud or height/width on a HTML constructed bar-graph. That's why 98% of the time you use style="" in a codebase you're doing it all wrong, and 100% of the time the STYLE tag has not one blasted bit of business even existing.

Apologies for pulling percentages out of my backside there.

This is another of the many reasons HTML/CSS frameworks -- like bootstrap or w3.css -- are ignorant, incompetent gibberish created by people not qualified to write a single line of HTML. By their very nature they result in writing two to ten times the markup from the use of unnecessary containers, non-semantic markup, and presentational classes.

Note: w3.css is a framework by W3Schools, which has absolutely no affiliation with the W3C. Shocking how many people still don't realize that or recognize what inaccurate garbage W3fools' entire website is.

You want a fast loading, easy to write, easy to maintain page, how about not writing 100k of HTML to do 10k's job? But sure, tell me again how "easy" those "frameworks" are or how much better they make "working in a group". bullcookies

Hence the formula I use to give me a rough idea of code quality when doing audits. Given the size of a page in plaintext, the number of content media elements (content images, movies, etc), number of form elements, and number of links on a page, you should have a good idea how big a well written semantically marked up page should be.

2048 + plaintext * 1.5 + content media elements * 256 + form elements * 128 + anchors * 128

Let's run one of my personal sites homepages through this:

elementalsjs.com

It has 4.19k of plaintext, no content images, no form elements, and 33 anchors. (I think, ballparking the anchor count). As such the expected page size would be

2048 + 4190 * 1.5 + 33 * 128
= 2048 + 6285 + 4224
= 12557

So basically 12.5k should be how big the markup is. HTML size of that home page? 10.8k. Well under... and I'm not even minifying the markup.

As a rule of thumb if you exceed those figures by more than 50%, you've got problems. If you more than double it you have no business writing HTML professionally.
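The audit formula and the rule of thumb are easy to turn into a small helper. This is only a sketch: the function and property names are made up for illustration, and the counts still have to be gathered by hand or by whatever tooling you prefer.

```javascript
// Expected size, in bytes, of well-written semantic markup for a page.
function expectedMarkupSize(page) {
  return 2048
    + page.plaintextBytes * 1.5
    + page.contentMedia * 256
    + page.formElements * 128
    + page.anchors * 128;
}

// Apply the rule of thumb: over 1.5x the expected size means problems,
// over 2x means something is badly wrong.
function auditMarkup(actualBytes, page) {
  var ratio = actualBytes / expectedMarkupSize(page);
  if (ratio > 2) return 'more than double';
  if (ratio > 1.5) return 'problems';
  return 'ok';
}

// The homepage figures quoted above: 4.19k plaintext, 33 anchors, nothing else.
var home = { plaintextBytes: 4190, contentMedia: 0, formElements: 0, anchors: 33 };
console.log(expectedMarkupSize(home));       // 12557
console.log(auditMarkup(10.8 * 1024, home)); // "ok"
```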

This is why the thread about subpage speeds on hashnode had me almost chuckling, since a number of things trigger page-loads whilst the home page is 70k but doesn't even load any real content. You load it scripting blocked it has 1.35k of plaintext, eight content images, and three dozen anchors.

That's the job of 9.7k of code... so why is it sending 160k of markup? Pointless presentational classes for bootstrap, pointless data- attributes for react, endless pointless DIV for nothing, static SVG in the markup, static style in the markup, static scripting in the markup. The laugh being that's after wasting time minifying it -- which is akin to closing the barn door after the horses got out.

Want subsequent pages faster? Make your back end not waste time sending 16 times the code needed to do the job when a pageload occurs. To be brutally frank, what it sends scripting off -- currently 6 files totalling 1 megabyte (146k once gzipped) -- is doing the job of 3 files totalling well under 128k.

But that's entirely what these frameworks do and entirely what I expect when things like bootstrap and react are put together. By their very nature they create bloated HTML so people don't have to learn to use CSS properly... because that's "easier". Right.

Don't let the lies of how "easy" some framework is trick you into doing more work, as sooner or later it's going to bite you and you'll end up having to pitch it all in the trash and start over -- or worse, you'll keep throwing more and more hardware (or scammy software solutions) at it wondering why it's not helping.

So what's the alternative to presentational classes? Semantics, saying what things ARE, or would be in a professionally written document. Don't use classes to say what you want things to look like, that appearance might not even apply to all your different possible media targets. Instead say what things are or why they might receive a certain style. If you can't figure out a reason as to why -- be it a description like "current" or "mainMenu", or just plain using a semantic tag like H1, H2, p -- it probably shouldn't be styled!

There's a reason this -- lifted straight from bootstrap's tutorials:

<body>
    <div class="d-flex flex-column flex-md-row align-items-center p-3 px-md-4 mb-3 bg-white border-bottom box-shadow">
        <h5 class="my-0 mr-md-auto font-weight-normal">Company name</h5>
        <nav class="my-2 my-md-0 mr-md-3">
            <a class="p-2 text-dark" href="#">Features</a>
            <a class="p-2 text-dark" href="#">Enterprise</a>
            <a class="p-2 text-dark" href="#">Support</a>
            <a class="p-2 text-dark" href="#">Pricing</a>
        </nav>
        <a class="btn btn-outline-primary" href="#">Sign up</a>
    </div>

... is -- as I've said many times -- ignorant incompetent trash written by someone unqualified to write HTML, much less tell others how to do so. It is semantic gibberish, laden with endless pointless classes for nothing, treated by screen readers as a run-on sentence, and using tags you wouldn't even need like NAV if you just bothered using numbered headings properly. I mean, if you don't know what's wrong with starting a page with a H5, just back away from the keyboard and go take up something a bit less detail oriented like macramé!

They clearly never even bothered reading the specification on using NAV either for that matter!

There is no excuse for any competent developer to write that as anything more than:

<div id="top">
    <h1>Company name</h1>
    <ul id="mainMenu">
        <li><a href="#">Features</a></li>
        <li><a href="#">Enterprise</a></li>
        <li><a href="#">Support</a></li>
        <li><a href="#">Pricing</a></li>
        <li class="signup"><a href="#">Sign up</a></li>
    </ul>
<!-- #top --></div>

Three-fifths the code. That outer DIV's ID providing more than enough hooks for any related style, and again you don't even need NAV if you bother starting out your content with a H2 or HR like a good little doobie. Hence my saying all the new HTML 5 "structural" tags are pointless redundancies and code bloat you're best off avoiding altogether!

Though had they written it properly with all their garbage classes, actually having the list for the menu inside the nav, and knowing that they'd end up throwing even more classes at those, it's probably around a third the code without the framework.

So that's file counts, file sizes, and keeping as much as possible out of your HTML. That just leaves scripting overhead.

The overhead of scripting isn't just about how much JavaScript there is. It's also where you load it, how you hook the markup, and how often you do lookups on the DOM.

Where you load the scripting in your markup can be a huge determining factor. Scripts inside HEAD that aren't set to "ASYNC" (or "DEFER") will hold the page render until the script is loaded, even though they will still not have access to the completed DOM. It is for that reason I tell people not to put scripts inside HEAD at all. Any script that "needs" it in most cases is just poorly written junk.

Loading it in the middle of the markup can have a similar impact, the script has to run before render can continue, and having them all there can slow down how often they are queued for download.

Simply put the fastest place to load your scripts is right before you close the BODY tag. This has multiple advantages. The markup DOM is already built so if you're going to add to it, you don't have to wait for DOMContentLoaded or window.onload. It doesn't hang the load process but at the same time you can make DOM changes often before the render gets that far into things.

It can also help to post-load less important scripts and content. Advertisements, social media plugins, and discussion plugins (like disqus) are perfect for this, as instead of just slapping their scripts into your markup everywhere, wait until window.onload to add them. Just createElement('script'), give it a src, body.appendChild, and be done with it. Boom, all your on page content will load and render giving the user a useful page, with the extra doo dads loaded later. Overall it takes no more or less time to load, but it gives the illusion of a far faster page because what's really important is made available sooner.
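The createElement / src / appendChild recipe can be sketched as below. The helper name `postLoadScript` and the widget URL are placeholders, and the `doc` parameter is only there so the function can be exercised outside a browser.

```javascript
// Append a SCRIPT element to BODY so the browser fetches and runs it.
function postLoadScript(doc, src) {
  var script = doc.createElement('script');
  script.src = src;
  doc.body.appendChild(script);
  return script;
}

// In a browser: wait until the page's own content has loaded,
// then pull in the non-essential extras.
if (typeof window !== 'undefined') {
  window.addEventListener('load', function () {
    postLoadScript(document, 'https://example.com/comments-widget.js'); // placeholder URL
  });
}
```

Using addEventListener rather than assigning window.onload directly means several scripts can each queue their own late-loading extras without clobbering one another.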

How you hook onto your elements is equally important. The various getElement methods are slow, they have to parse through their corresponding maintained lists browsers keep internally of className, id, tagName, etc. The querySelector methods are even slower as they have to check all those in one way or another, often calling CSS' selector parsing to handle it.

This slowness is only exacerbated by the number of elements on a page. The more DOM elements, the slower they all are. In this way that practice of not throwing endless pointless DIV, endless pointless classes, and maintaining that separation of content from presentation, leveraging your semantics as much as possible can speed up your scripting when doing these sorts of things.

It also helps to try and only call them once per script as needed. The collections returned by the getElementsBy methods are live updated (the static lists from querySelectorAll are not), so if you need all the "buttons" in a form, and you "var inputs = form.getElementsByTagName('button');" that result will update if you dynamically add or remove buttons, meaning you do not have to call it every single time.

Likewise it just helps that if you have an event that's going to use an element over and over, store that element as a property of the event handler or even as a global or local in the same scope. (the latter being preferred). It's part of why I'm such a fan of SIF/IIFE construction as I can have a 'global' for my part of the script since it's actually all inside a function.

This is where frameworks like jQuery can bite you, as pretty much everything they do is based on grabbing onto things with selectors every blasted time you want to use them. Whilst they kind-of try to cache similar selectors, that still ends up an object table lookup that's slow.

But can we do better? Yeah! As I mentioned above, if an element exists for scripting purposes only it has no business in the markup... You document.createElement it. Since you're making the element yourself, just keep track of it in a variable! Boom, done. Never have to "get" again. You just made the thing, use it.
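A minimal sketch of that idea: the scripting-only element is created once and held in a closure, so nothing ever has to look it up again. The names `createStatusBox` and `setStatus` and the "status" class are hypothetical, and `doc` is passed in purely so the code can be tried outside a browser.

```javascript
// Create a scripting-only element and hand back a function that updates it.
// The element reference lives in the closure; no getElementById, no selectors.
function createStatusBox(doc) {
  var box = doc.createElement('div');
  box.className = 'status';
  doc.body.appendChild(box);

  return function setStatus(message) {
    box.textContent = message; // reuses the stored reference every time
    return box;
  };
}

// Browser usage:
// var setStatus = createStatusBox(document);
// setStatus('Saved');
// setStatus('Saving failed, retrying');
```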

Another thing we can do to minimize the impact a large DOM can have and the slowness of get functions is to simply walk the DOM for things. A simple example of which I did recently for a form where when an input was changed and valid, they wanted the label to change. As the label was the element before the input, rather than giving it an ID and doing a getElementById, the change handler ended up thus:

// this is dumbed down from actual use case
function inputChangeHandler(e) {
    e.currentTarget.previousElementSibling.textContent = 'New Text';
}

How simple is that? We know the currentTarget of the event would be the input in question. We know the element before it is the label -- so boom, done.

Likewise let's say you want to do something to all the immediate child LI of a UL... like add " new" to them as a textnode. Let's use UL#mainMenu as an example.

You'll see this:

var allLI = document.getElementById('mainMenu').getElementsByTagName('li');
// the live collection has no forEach of its own, so copy it to an array first
Array.from(allLI).forEach((li) => {
  li.appendChild(document.createTextNode(' new'));
});

Which is broken, it could hit LI we don't want if there's a submenu.

Or this:

var allLI = document.querySelectorAll('#mainMenu li');
allLI.forEach((li) => {
  li.appendChild(document.createTextNode(' new'));
});

Laughably we could make this faster by ditching the cryptic pointless arrow function and forEach.

var allLI = document.querySelectorAll('#mainMenu li');
for (var li of allLI) {
    li.appendChild(document.createTextNode(' new'));
}

But that too isn't addressing the real speed problem of querySelectorAll and how slow JavaScript array-like structures are.

What if I told you this was a dozen times faster and uses less RAM?

var li = document.getElementById('mainMenu').firstElementChild;
if (li) do {
    li.appendChild(document.createTextNode(' new'));
} while (li = li.nextElementSibling);

Walk the DOM. In Soviet Russia, the DOM walks you.

We already have a structure for walking from one element to the next on the tree. Like following the switches in a complex train yard. Leveraging that can speed up everything you do with your scripting.

The same goes for generating content on the page. innerHTML is slow as it re-triggers the entire document parsing process. Worse, to preserve existing changes to the DOM, some browsers (IE, Edge) will actually turn the live DOM back into HTML, add the code, then parse it as if it were a new document.

This is why whenever possible you should be building new stuff using the DOM. Avoid innerHTML as much as possible. If you really have no choice, apply the new HTML to a DOM fragment -- say a newly created DIV that's not hooked onto the BODY yet. That way the parser only does that element. Then attach that DIV to the live DOM.
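Building with the DOM does not have to be verbose. Below is one possible sketch: a tiny hypothetical helper, `make`, plus a `buildMenu` function that assembles a whole list while it is still detached and leaves attaching it to the caller. Both names, and the item shape `{ href, text }`, are assumptions for illustration.

```javascript
// Create an element, set its attributes, and append children.
// String children become text nodes; anything else is appended as-is.
function make(doc, tagName, attributes, children) {
  var element = doc.createElement(tagName);
  for (var name in attributes) {
    if (Object.prototype.hasOwnProperty.call(attributes, name)) {
      element.setAttribute(name, attributes[name]);
    }
  }
  (children || []).forEach(function (child) {
    element.appendChild(typeof child === 'string' ? doc.createTextNode(child) : child);
  });
  return element;
}

// Build the whole subtree detached from the page; the caller attaches it once.
function buildMenu(doc, items) {
  var list = make(doc, 'ul', { id: 'mainMenu' });
  items.forEach(function (item) {
    var link = make(doc, 'a', { href: item.href }, [item.text]);
    list.appendChild(make(doc, 'li', {}, [link]));
  });
  return list;
}

// Browser usage:
// document.getElementById('top').appendChild(
//   buildMenu(document, [{ href: '/features', text: 'Features' }])
// );
```

Because the text goes in through createTextNode, it is never parsed as markup, which also sidesteps the escaping worries that come with string-built HTML.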

It is also why a lot of frameworks that claim they use the DOM quite clearly do not -- or at least the people using them sure-as-shine-ola aren't!

If you have anything being loaded by the scripting during page load, sticking to actual DOM manipulation is always faster.

The final detail is to stop using JavaScript to do things CSS3 can now handle. CSS3 can animate, it can fade, it can slide things in and out. Using :target you can create modals without a line of JS. If you're willing to "abuse" checkboxes you can say the same of any toggle-able section without resorting to a line of scripting. Done properly you can even do it in a way that gracefully degrades for legacy browsers that can't handle it completely right. With the new HTML 5 input validation there is often little reason to even be throwing JavaScript at forms anymore, unless it's for really complex things like credit card number formatting -- and even then anything you verify client-side HAS to be re-checked server side, so if the new types don't work or are missing, just re-send the page for those older browsers.

Basically, stop using JavaScript to do things we haven't needed JS to handle in six or seven years.

But I know web developers, we'll still be seeing that mm_swap rubbish from Dreamweaver users a decade from now even though it should serve zero purpose any time after 1998'ish.

As overuse of JavaScript starts to hit hot and trendy levels not seen since the peak of the dotcom bubble, people are using it where inappropriate, unnecessary, or just plain wrong on accessibility grounds.

At the same time, the "dreaded" CSR -- client side rendering -- can be leveraged to speed up an already working page. Again, only send what you have to -- so write the page normally first to work as if JavaScript doesn't even exist... it lets you develop all your template stuff and have a nice clean understandable baseline to work from. Then -- and here's the magic -- intercept all anchors with on-site links, REST in just the new content, update the title tag, update the address bar with a history.pushState, DONE. Boom, you get all your nice neat CSR high speed fancyness, and a site that's still 100% functional scripting off/blocked/disabled! In fact, because you can build it "normal" first it is often easier to implement the AJAX side. You already know where everything is going to go!
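Here is one way that enhancement layer might look, as a sketch only. It assumes the server answers requests carrying an X-Requested-With header with JSON shaped like `{ title, content }`, and that the swappable region has the id "content"; both are assumptions, as is the helper name `isOnSite`. Back/forward handling via popstate is left out for brevity.

```javascript
// True when `href` resolves to the same origin as the current page.
function isOnSite(href, currentOrigin) {
  try {
    return new URL(href, currentOrigin).origin === currentOrigin;
  } catch (error) {
    return false; // unparseable URL: let the browser deal with it normally
  }
}

// Browser only: one delegated click handler covers every anchor on the page.
if (typeof document !== 'undefined') {
  document.addEventListener('click', function (event) {
    var link = event.target.closest('a[href]');
    if (!link || !isOnSite(link.href, location.origin)) return;
    event.preventDefault();

    fetch(link.href, { headers: { 'X-Requested-With': 'XMLHttpRequest' } })
      .then(function (response) { return response.json(); })
      .then(function (page) {
        // Parse the new markup on a detached element, then swap it in whole.
        var oldContent = document.getElementById('content');
        var newContent = document.createElement('div');
        newContent.id = 'content';
        newContent.innerHTML = page.content;
        oldContent.parentNode.replaceChild(newContent, oldContent);

        document.title = page.title;
        history.pushState(null, page.title, link.href);
      })
      .catch(function () {
        location.href = link.href; // anything goes wrong: fall back to a normal pageload
      });
  });
}
```

If the script never runs, every link is still an ordinary link, which is the whole point of building the page "normal" first.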

In fact, I just did this for a non-profit medical clinic's internal discussion/messaging system, where the only difference between a normal request and an AJAX one is the page template. I just check the HTTP_X_REQUESTED_WITH header to see if it equals xmlhttprequest, and if so I load the AJAX page template, which sends nothing but the new title, content, and any other changes as JSON instead of the normal full page load. As such I don't even need to play games with the URIs pulled from the anchor's href, apart from domain matching during the hook. They needed as many tricks as possible since 90% of their hardware is still Atom or even Geode thin clients, or desktops from the P4 era, a lot of it hand-me-downs from their local hospital. Hell, some of their LAN is still on 10b2 coax as they're still interfacing to an ASA 400 machine.
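The server side of that check is only a few lines in any language. The sketch below is Node-flavoured purely for illustration; the real system's language is not stated here, and `isAjaxRequest`, `buildResponse`, and the `renderFullPage` callback are hypothetical names standing in for whatever templating is actually in use.

```javascript
// Node lower-cases incoming header names, so the key is 'x-requested-with'.
function isAjaxRequest(headers) {
  var value = headers['x-requested-with'];
  return typeof value === 'string' && value.toLowerCase() === 'xmlhttprequest';
}

// Same page data either way; only the "template" differs.
function buildResponse(headers, page, renderFullPage) {
  if (isAjaxRequest(headers)) {
    return {
      contentType: 'application/json',
      body: JSON.stringify({ title: page.title, content: page.content })
    };
  }
  return { contentType: 'text/html', body: renderFullPage(page) };
}

// Usage inside an http.createServer handler might look like:
// var result = buildResponse(request.headers, page, renderFullPage);
// response.setHeader('Content-Type', result.contentType);
// response.end(result.body);
```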

None of this is rocket science -- Wernher von Braun

It's all little more than good practices, and I still don't understand why people think it's so hard, why they think any of these junk frameworks are doing them any good, or how anyone can justify megabytes of code spread out over dozens (even hundreds) of separate files to handle two dozen or less kilobytes of plaintext. Much less how using JS to do CSS' job is 'easier', writing two to ten times the markup is "easier", etc, etc... It's all lies.

... and users pay for these lies every time they visit a website slopped together in ignorance of the most basic of development practices.

TL;DR version: Keep the separate file counts as low as possible, move anything you can remove from the HTML out of the HTML, leverage your semantics and stop using presentational classes. Load JavaScript right before you close BODY and/or load any scripts that aren't important (like ads) when window.onload triggers, and stop using JavaScript in a manner that adds nothing of value to the page or worse, does HTML, CSS, or the server's job.

Compared to that, the games that once-useful tools turned marketing scams, like Google PageSpeed, tell you to play with cache-control settings, "everyone needs a CDN", "content caching" like Varnish, or any other attempt at throwing more tools and more code at the problem? All pointless hoodoo-voodoo that avoids addressing the real problems.

To paraphrase the Vulture (Red Baron to you normies)

Find the content, deliver it to users. Anything else is rubbish.