This website is (slightly) faster

As many of you may be aware, the public internet is a bloated, JavaScript-riddled mess.

If you don't believe me, I'm not the first one to talk about this. And I certainly won't be the last...

You'd think with how much internet speeds have increased in the past 20 years, particularly when compared to the dialup internet that many of us may have grown up with, we'd be loading every website with blinding speeds. But that's just not the case, now is it?

There's a multitude of reasons for this, including things like advertising, JavaScript, tracking pixels, the list goes on...

Today, I'm going to talk about a much smaller piece of this pie - the actual size of the HTML document you're reading, and how it gets delivered to your browser.

Background.§

Let's get some quick background.

When you point your browser at my website, you kick off a three-step process. I'm hand-waving some things here (like TCP connections and TLS handshakes), so just roll with it if you're more technically inclined.

Steps:

  1. Browser makes a request to the server, while also specifying what types of content encoding it can accept.
  2. The server receives the request, analyzes that list of allowed content encodings, and then finds the requested asset (think an HTML web page or CSS stylesheet).
  3. Having found the asset, the server makes sure it can "fit" into the desired content encoding specified by the browser, and may take steps to optimize the target asset for a particular content encoding.

Now, when I say "content encoding" here, I'm really meaning "compression".

Specifically, many web servers support serving content using gzip compression. This is an old standard, works pretty well for text data, is relatively fast for a web server to do on the fly, and "just works".

Now, it's not guaranteed that you'll receive a gzip-compressed blob of HTML for every website you visit. It depends on your browser and the web server configuration.
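To make that a bit more concrete, here's roughly what the exchange looks like (simplified, with example.com standing in for any site): the browser advertises what it can accept, and the server labels whatever it sends back.

GET /index.html HTTP/1.1
Host: example.com
Accept-Encoding: gzip, br

HTTP/1.1 200 OK
Content-Type: text/html
Content-Encoding: br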

So why is this website "faster"?§

Because I made my compression "more better". Yeah, I said it.

More specifically, I optimized three key aspects of how my website's assets are compressed:

  1. Utilized a better gzip compressor known as Zopfli.
  2. Added Brotli, an even more efficient compression algorithm than gzip.
  3. Compressed all assets before the web server ever sees them, meaning less work for it to do.

To be clear, I was already using compression via Caddy's encode gzip zstd option, but that requires all compression to be done on the fly (and thus repeated for every duplicate request), doesn't support Brotli, and uses a sub-par gzip compression algorithm (when compared to Zopfli).

In summary, I'm improving the existing gzip compression, utilizing a more advanced compression format, and pre-compressing all of this website's pertinent text-based assets (things like HTML, CSS, and even SVG images, if I had any...).

Great, so how did you do this?§

It all kind of started when I got a little frustrated with my web server, Caddy.

Now to be clear, Caddy is an excellent web server. I just wanted to replicate an existing setup I had on my old nginx server where I pre-compressed a number of assets before uploading them.

I'd looked at doing this a few times over the past year, but ran into a few roadblocks along the way, including:

  1. Older versions of Caddy (v1) made this easy, but this feature was removed in a large rewrite for v2, which is what I use.
  2. It was theoretically possible using their new configuration syntax, but no one had fully put the puzzle pieces together yet.

Luckily, if you wait long enough, someone else figures out how to get something done on the internet. Actually, several someones.

After doing another web search on handling pre-compressed assets with Caddy, I noticed some forum posts from the past year, and eventually stumbled on this excellent blog post.

See the bottom of that blog post for links to the pertinent forum posts.

With the "serving pre-compressed assets" problem solved, getting better gzip compression with Zopfli and adding Brotli to the mix was the easy part.

Updating Caddy to serve pre-compressed assets.§

Update 2021-05-15: Serving precompressed assets got a LOT easier in Caddy 2.4.0. Read more here
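If you're on 2.4.0 or newer, the whole song and dance below reportedly collapses into something like this (just a sketch, so double-check the linked docs for the exact syntax):

example.com {
    root * /path/to/dir
    file_server {
        precompressed br gzip
    }
}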

This is copied pretty much verbatim from the above blog post, but here is the configuration I added in order to get Caddy to correctly serve pre-compressed Brotli and gzip compressed assets whenever possible.

example.com {
    root * /path/to/dir
    
    # I specifically removed the following 'encode' option, since I'm doing compression
    # myself, and Caddy was gzip-ing my PNG and JPEG images with it on (which is wasted CPU effort).
    #encode gzip zstd

    ### Precompression support
	@brotli {
	  header Accept-Encoding *br*
	  file {
	    try_files {path}.br {path}/index.html.br {path}.html.br
	  }
	}
	handle @brotli {
	  header {
	    Content-Encoding br
	    Content-Type text/html
	  }
	  rewrite {http.matchers.file.relative}
	}

	@gzip {
	  header Accept-Encoding *gzip*
	  file {
	    try_files {path}.gz {path}/index.html.gz {path}.html.gz
	  }
	}
	handle @gzip {
	  header {
	    Content-Encoding gzip
	    Content-Type text/html
	  }
	  rewrite {http.matchers.file.relative}
	}

	@html {
	  file
	  path *.html */
	}
	header @html {
	  Content-Type text/html
	  defer
	}

	@css {
	  file
	  path *.css
	}
	header @css {
	  Content-Type text/css
	  defer
	}

	@js {
	  file
	  path *.js
	}
	header @js {
	  Content-Type text/javascript
	  defer
	}

	@svg {
	  file
	  path *.svg
	}
	header @svg {
	  Content-Type image/svg+xml
	  defer
	}

	@xml {
	  file
	  path *.xml
	}
	header @xml {
	  Content-Type application/xml
	  defer
	}

	@json {
	  file
	  path *.json
	}
	header @json {
	  Content-Type application/json
	  defer
	}
}

While long, it's actually not a scary config. It basically tells Caddy to serve pre-compressed assets if they exist, while also fixing the Content-Type header so that we correctly identify *.html files as text/html instead of compressed binary data.
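If you want to check that a server is actually handing out the pre-compressed variants, a quick curl invocation like this one works (swap your real domain in for example.com):

# Request the page with Brotli allowed, throw away the body, and print only the interesting headers
curl -s -o /dev/null -D - -H 'Accept-Encoding: br' https://example.com/ | grep -iE 'content-(encoding|type)'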

Compressing assets using Zopfli§

Now that the web server can correctly serve my assets, I tested a few iterations of Zopfli before adding it to my web publishing workflow. I already optimize my website's images using advpng and jpegtran, so I just added to my existing pattern.

Also, for those who may be confused - Zopfli is basically a "better gzip compressor". It produces files that any standard gzip utility can decompress, but it does so using more advanced algorithms than the standard gzip utility.

If you want to get started, here are some useful commands for testing zopfli.

zopfli -c index.html > index.html.gz

You'll note that I'm using a redirect here (>) to produce a new file. By default, zopfli overwrites the existing file, which I want to preserve as a fallback option on my web server in case the user's browser doesn't support compression. So instead, I force zopfli to output to stdout by using the -c flag, and then use shell redirection to create a new file.
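If you want to convince yourself that the output really is plain gzip, a quick round-trip with the standard tools does the trick:

# Test the gzip stream, then decompress it and compare it against the original
gzip -t index.html.gz
gzip -dc index.html.gz | cmp - index.html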

Do it in parallel§

Here's the fancy fd command I came up with to automatically find all of my text assets in my website's publishing directory (public/) and compress them using zopfli:

fd -e html -e xml -e css -e js -e svg -e txt -e json --exec sh -c 'zopfli -c {} > {}.gz' \; . public/

By default, fd will utilize all of your CPU cores to run whatever command you specify in bulk. This allows me to very quickly compress my entire website.
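If you don't have fd installed, roughly the same thing can be done with plain find and xargs (a sketch assuming the GNU versions and nproc; adjust the extension list to taste):

# Find the text assets and compress each one with zopfli, using all CPU cores
find public/ -type f \( -name '*.html' -o -name '*.xml' -o -name '*.css' -o -name '*.js' -o -name '*.svg' -o -name '*.txt' -o -name '*.json' \) -print0 \
  | xargs -0 -P "$(nproc)" -I{} sh -c 'zopfli -c "$1" > "$1.gz"' _ {}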

Compressing assets with Brotli§

Brotli produces even smaller files than Zopfli, and has been accepted as an IETF standard, meaning that most web browsers support it by default now.

In order to create Brotli files for my assets like I did with Zopfli above, I came up with the following:

brotli -Z -f index.html

Unlike zopfli, brotli doesn't overwrite its input file, so the above command will produce an index.html.br file by default. For good measure I added -Z to always use max compression, and -f to force overwriting of any existing files. This is useful because I statically generate this site from a set of Markdown files, and I'd rather not have to delete the existing output directory every time just to keep Brotli happy.

Do it in parallel, again§

fd -e html -e xml -e css -e js -e svg -e txt -e json --exec brotli -Z -f {} \; . public/

Same style of fd command as above...

Results§

For just a quick comparison, I compressed the home page of this blog with Zopfli and Brotli.

Here are the byte counts in four forms: uncompressed, compressed with Brotli, compressed with Zopfli, and compressed with standard gzip (using gzip -9 for maximum compression):

17,817 index.html
2,607  index.html.br (85.37% improvement over uncompressed)
3,308  index.html.gz (81.43% improvement over uncompressed) (zopfli)
3,404  index.html.gzip (80.89% improvement over uncompressed) (gzip -9)

As you can see, compression in general results in significant savings. Brotli unsurprisingly takes the cake, followed by Zopfli and standard gzip as dead last.

That said, the improvement of Zopfli over standard gzip is marginal in this case, but when applied to all text assets across the entire website, you can shave a few percentage points off your web traffic.
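If you'd like to run the same comparison on one of your own pages, something along these lines will produce all four files from the table above (the .gzip extension is just my way of keeping plain gzip's output separate from Zopfli's):

gzip -9 -c index.html > index.html.gzip
zopfli -c index.html > index.html.gz
brotli -Z -f index.html
ls -l index.html index.html.br index.html.gz index.html.gzip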

Conclusion§

So yeah, you're likely reading this via a Brotli-compressed HTML page, or maybe even a Zopfli-compressed HTML page.

In addition to the total lack of JavaScript, trackers, and any other third-party requests, you can enjoy a lightning-fast "plain text" experience on this website.

Cheers!