As I’ve mentioned before, one of the big pain points relating to this blogging business is deployment time: Haskell is slow to compile and Hakyll has multiple large dependencies, so the builds would initially take up to an hour. Yeah, you read that right. 60 minutes 😱. Something goes wrong towards the end of the build? Sucks to be you.
Thanks to Saksham Sharma and their post on speeding up Haskell CI builds, however, I have been able to bring it down to 7-8 minutes in GitLab’s CI/CD systems (excluding time spent waiting for runners to spin up etc.). That said, it wasn’t quite as easy as I’d hoped it would be (when is it ever?): Due to how Stack and Nix interact, building of the site would crash when it ran into UTF-8-encoded characters. Not cool.
Let’s fix it.
Step 1: using an image with Hakyll pre-built
In Sharma’s post, they mention that they’ve created an image that you can use for your build systems. The simplest version would look a little something like this (freely updated from their minimal configuration example):
An important thing to note is that your stack config’s resolver must match the one used in the Docker image, otherwise the build system would have to recompile Hakyll and its dependencies for your version, taking us back to the hour-long builds.
For v3 of the image, the resolver is lts-12.21, so make sure your project’s stack.yaml contains the following line:
resolver: lts-12.21
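To catch a resolver mismatch before the CI run does, a small shell check might help. This is only a sketch: `resolver_of` is a made-up helper name, and it assumes your stack.yaml keeps `resolver:` at the very start of a line.

```shell
# Hypothetical helper: print the resolver declared in a stack.yaml file.
resolver_of() {
    sed -n 's/^resolver:[[:space:]]*//p' "$1"
}

# The resolver the pre-built image is assumed to use (taken from the text above).
expected="lts-12.21"

# Warn if the project's resolver differs from the image's.
if [ -f stack.yaml ] && [ "$(resolver_of stack.yaml)" != "$expected" ]; then
    echo "stack.yaml resolver differs from $expected; expect a long rebuild" >&2
fi
```

If the warning shows up, Stack would most likely have to rebuild Hakyll and its dependencies from scratch.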
If this works for you and is all you need: great! If it doesn’t and you get errors talking about invalid byte sequences like the one below: don’t panic. I’ll sort you out.
Step 2: This one weird trick
As described in this GitHub issue, a fix for the above error is available in Stack’s master branch and will ship with the tool as of Stack v2.1 (the release candidate for which came out while I was writing this post).
From the release notes for the release candidate: “Use en_US.UTF-8 locale by default in pure Nix mode so programs won’t crash because of Unicode in their output”.
So if you’re using Stack v2.1 or later, the steps outlined in this section should not be necessary.
As evidenced by a fair few GitHub issues (a selection is listed at the end of this post), this is something that a number of users run into, and it can be difficult to troubleshoot. What it boils down to is this: when running Stack in Nix mode, it defaults to building in pure mode. This isolates the build environment by removing environment variables and other things on your system that could influence the build and lead to a lack of reproducibility. This is usually a good thing, but it also unsets the LANG variable, which Stack relies on to know how it should handle encodings.
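To get a feel for what that isolation does, here is a rough analogy in plain shell. It is not what Stack or Nix actually run; `env -i` simply starts a command with an emptied environment, which is similar in spirit to pure mode.

```shell
# LANG is set in the outer environment...
export LANG=en_US.UTF-8
sh -c 'echo "normal: ${LANG:-unset}"'

# ...but a child started with an emptied environment no longer sees it.
env -i /bin/sh -c 'echo "isolated: ${LANG:-unset}"'
```

The first command sees the exported value. The second reports the variable as unset, which is roughly the position the build finds itself in.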
Ok. So all we gotta do is re-set that variable, then? Yes. But how to do that might not be immediately apparent. You might be used to running shell commands like this:
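A minimal sketch of that pattern, assuming a POSIX shell. The variable is prefixed to a single command and only applies to that command. The `stack build` line is commented out because it needs a Stack project to run.

```shell
# Prefixing a command with VAR=value sets the variable for that command only.
LANG=en_US.UTF-8 sh -c 'echo "LANG inside the command: $LANG"'

# With Stack, the same pattern would look like this:
# LANG=en_US.UTF-8 stack build
```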
But this won’t work with Stack, because it’ll still isolate the environment. What you can do, however, is to use the --no-nix-pure option. This tells Stack not to isolate the build environment, so you’ll still be able to access external variables. Here’s an extract from my current build file that does just that:
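A rough sketch of such a build step, assuming a POSIX shell and Stack’s `--nix` and `--no-nix-pure` flags. The guard around the `stack` call is illustrative and is not taken from the actual build file.

```shell
# Hypothetical CI script step; an illustration, not the actual build file.
# Make sure a UTF-8 locale is set in the outer environment.
export LANG=en_US.UTF-8

# Build with Nix integration, but without isolating the environment,
# so the LANG value above stays visible to the build.
if command -v stack >/dev/null 2>&1 && [ -f stack.yaml ]; then
    stack --nix --no-nix-pure build
else
    echo "stack or stack.yaml not found; skipping build" >&2
fi
```

In a GitLab CI file, the `export` and `stack` lines would typically be entries under a job’s `script:` key.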
This works perfectly on GitLab’s CI runners, but if this still doesn’t solve your issue, you might want to check what the locale is actually set to by using the locale shell command. The output should look something like this:
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
If the output doesn’t show a UTF-8 locale, that seems like a good place to start (I’d try export LANG=en_US.UTF-8 before running the Stack commands), but now we’re wading out past the scope of this post, so you’re gonna have to go it on your own, I’m afraid. Sorry, kiddo.
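If you’d rather have the build script check this for you, something along these lines might do as a starting point. It is a sketch, not a tested fix: `is_utf8_locale` is a made-up helper that only looks at the locale name, and it assumes the en_US.UTF-8 locale exists on the runner.

```shell
# Hypothetical helper: succeed if a locale name ends in a UTF-8 codeset.
is_utf8_locale() {
    case "$1" in
        *.UTF-8|*.utf8|*.UTF8|*.utf-8) return 0 ;;
        *) return 1 ;;
    esac
}

# LC_ALL beats LC_CTYPE, which beats LANG, when it comes to character encoding.
effective="${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"

if ! is_utf8_locale "$effective"; then
    echo "locale '$effective' is not UTF-8; setting LANG" >&2
    # Note: if LC_ALL or LC_CTYPE is set to something else, it still wins.
    export LANG=en_US.UTF-8
fi
```

Run it before the Stack commands. If LC_ALL or LC_CTYPE is the culprit, you’d have to change that variable instead.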
And that’s it! Simple, but not immediately obvious. It’s likely that a similar approach (the prepared Nix container) would work for other Haskell projects as well, though I can’t say for certain one way or the other.
A selection of GitHub issues relating to the Unicode problem:
- hakyll can’t handle unicode?
- Enabling nix causes LANG to be lost.
- commitBuffer: invalid argument (invalid character)