Why soupault?

Estimated reading time: 9 minutes.

There are so many static site generators already that another one needs a pretty good justification. I mainly made soupault for my own use; in fact, it grew out of a set of custom scripts that used to power the baturin.org website. Still, since I'm making it public, I may as well explain the decisions behind it and the reasons why anyone else may want to use it.

If you are not following that scene, I can tell you there are lots of generators out there. A popular directory, staticgen.com, lists 261 of them now, and I'm sure it's only a fraction of all projects in existence.

This is reminiscent of the early to mid-2000s CMS boom, but the scale is even larger because there are no ongoing maintenance costs. You can keep improving your generator, but you don't need to keep up with libraries and make security updates just to keep it running. Well, usually... The static generator setup of staticgen.com recently broke down due to library incompatibilities, but that's the JS ecosystem for you. Which brings us to the first point.

Friendly to non-programmers and future-proof

So far, static website generators have mostly been a programmers' game. While most of them don't require programming knowledge to use, they often assume that the user is familiar with programmers' workflows. For example, Jekyll's website says you need to run gem install jekyll to install it. Does this mean anything to a person not familiar with the Ruby language and its tools? Gatsby's website just assumes you know how to install JS applications. UNIX-like system users have it easier since they can install from repositories/ports, but I still wonder how many non-programmers decided it's not worth the trouble after reading instructions like those.

With generators written in interpreted languages like Ruby or JS, there's also a hidden danger of software rot. Unless you bundle it with all libraries it needs, chances are it will eventually stop working when libraries change. With online software like CMSes, using an outdated version is unsafe, but with purely offline programs, there are no security risks. I believe they should be available in a form that end users can stick with even if the program becomes unmaintained or they don't like new versions. One of the VisiCalc[1] creators still uses a tool from 2001, and there seem to be quite a few HomeSite users around. Nothing wrong with that. If old tools work for them, why should they switch to something else?

Of course, both CuteSITE Builder and HomeSite are proprietary products that you cannot keep updated yourself. Soupault is free software that anyone can fork and maintain even if I stop working on it, but source code availability is not a guarantee that anyone will actually do that.

That is why I made it a single, self-contained executable that can be “installed” just by unpacking it. The Linux version is statically linked with musl so that it will work on any Linux-based OS for years to come, unless system call numbers change (which hasn't happened yet and probably won't happen). With Windows and macOS versions, backwards compatibility is on the OS vendor.

Windows support

I don't like Windows and I don't use it, but there are many people using it, and I want soupault to be available to all people regardless of their OS choice. Soupault is a native program on Windows (i.e. you don't need to install anything else to run it), and I spent quite some time testing it there to make sure it works exactly like it works on UNIX-like systems. That said, since I don't use a Windows version daily, I might have missed some sharp edges. If you find anything, let me know, and I'll do my best to fix it.

Fast but extensible

For a program that runs only once in a website deployment cycle, performance is much less critical than for a server-side web application that runs every time someone makes a request. However, it still can be a major frustration point for users, especially when they are tweaking their page layout.

Indeed, performance is one of the biggest reasons for the popularity of Hugo, to the point that people are ready to give up extensibility for it. Hugo's variety of built-in functions sort of makes up for the lack of plugin support, but the point remains.

There are two sources of performance problems: the program itself and the language it's written in. Interpreted languages like Python or Ruby have to read the program text and translate it into executable form every time the program runs; there's no way around it. Hugo would still be faster than many alternatives even if its authors didn't put any effort into optimizing for performance, simply because it's written in Go, a language that compiles to native machine code.

Interpreted languages, however, make it very easy to make programs extensible with plugins. The program is a text file that is loaded into the interpreter, and plugins are also distributed as text files that can be loaded together with the main program. With native code, extensibility takes much more effort even if the language supports dynamic code loading (Go doesn't).
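
To illustrate how little machinery this takes, here is a minimal sketch of a plugin loader in Python. It is purely hypothetical and not taken from any generator mentioned here: the names load_plugin and apply_plugins, and the convention that a plugin defines a transform() function, are made up for this example. The host executes the plugin's text in a fresh namespace and picks up the function it expects.

```python
# Hypothetical illustration: a host program that loads plugins from source text.
# The transform() convention is an assumption made up for this sketch.

PLUGIN_SOURCE = '''
def transform(page_text):
    # A plugin is just more program text; this one upper-cases the page.
    return page_text.upper()
'''


def load_plugin(source, name="<plugin>"):
    """Compile and run plugin source, returning its transform() function."""
    namespace = {}
    exec(compile(source, name, "exec"), namespace)
    transform = namespace.get("transform")
    if not callable(transform):
        raise ValueError(f"{name} does not define a transform() function")
    return transform


def apply_plugins(page_text, plugins):
    """Run the page text through each plugin in order."""
    for plugin in plugins:
        page_text = plugin(page_text)
    return page_text
```

In a real program the source would come from a file in a plugin directory rather than a string constant, but the loading step would look much the same. A natively compiled program has no interpreter on hand to do this with.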

Soupault is written in OCaml, a fast language that does support dynamic linking, but distributing plugins would not be nearly as easy since plugins are also native libraries that would have to be compiled for each OS, against a specific program version and using a matching compiler version. There's one other static site generator that supports plugins this way: Stog, also written in OCaml, but I'm not aware of any third-party plugins for it.

One way to break this cycle is to embed a small interpreted language into the program. I've revived an embeddable Lua 2.5 interpreter project, Lua-ML, with help from its original maintainer, Christian Lindig. While Lua 2.5 is not nearly as nice as modern Lua 5.x (if Lua can be described as nice at all), it does allow users to go beyond the built-in functionality and add their own.

Aside from Lua plugins, soupault can also run external scripts, feed them data extracted from pages, and include their output in the page. For example, post dates in this blog are extracted from git commits, and the blog index is generated automatically as well.
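
An external script for the post date could look roughly like the sketch below. It is not the script this blog actually uses. The helper names are made up, and the PAGE_FILE environment variable is an assumption about how the calling program tells the script which page it is processing. The script asks git for the commit dates of a file, takes the oldest one, and prints an HTML fragment on standard output.

```python
import os
import subprocess


def git_log_command(path):
    """Build a git command that lists commit dates (newest first) for one file."""
    return ["git", "log", "--follow", "--format=%ad", "--date=short", "--", path]


def first_commit_date(git_output):
    """Return the oldest date in the git log output, or None if there are no commits."""
    dates = [line.strip() for line in git_output.splitlines() if line.strip()]
    return dates[-1] if dates else None


def render_date(date):
    """Wrap a date such as 2019-10-10 in a <time> element."""
    return f'<time datetime="{date}">{date}</time>'


def main():
    # Assumption: the calling program exports the page path as PAGE_FILE.
    path = os.environ.get("PAGE_FILE")
    if not path:
        return  # Nothing to do when run outside the generator.
    result = subprocess.run(
        git_log_command(path), capture_output=True, text=True, check=True
    )
    date = first_commit_date(result.stdout)
    if date:
        print(render_date(date))


if __name__ == "__main__":
    main()
```

Keeping the git call separate from the parsing and rendering helpers means the latter two can be checked without a repository at hand.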

Web 1.0 support

Most existing generators are Web 2.0 applications in their spirit. The defining characteristic of Web 2.0 is complete separation of content and layout. With dynamic websites that visitors can modify, like Wikipedia, it's a necessity, since custom layout elements inside user-submitted data can break website functionality and can be abused for malicious purposes.

For static websites, that approach is not necessary. It can be valid of course: for example, a writer who just wants to publish stories on the web may not need anything but very basic formatting inside the stories. A blog generator that takes stories from plaintext files and inserts them into a fixed HTML page template can be a perfect tool for the job. Jekyll's tagline is “transform your plaintext files into a website”. Even though you can have custom HTML inside your page files, it's still inserted in a fixed place in a page template, unless you know how to do some advanced theme hackery.

The web, by itself, however, does not have any layout and content separation. It's a medium of its own, with unique means of expression. The main reason people are turning back to static websites and Web 1.0 is complete creative control over the presentation of their content.

Tools that can automate the tedious tasks of static site maintenance without interfering with the user's creative control need to be aware of the native language of the web, HTML. Libraries that can read and modify HTML have existed for a long time, but they were mostly used for extracting data from websites, not for making them.

That's the big idea of soupault. Rather than “filling in the blanks” like template processors do, it can transform your existing pages. It still has a concept of a “page template”, but it's really just an empty page. There are also ways to perform some actions only on specific pages, for example, insert a <script src="/scripts/tetris.js"> tag only into site/tetris.html.
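
The difference from template filling can be sketched in a few lines of Python using only the standard library's html.parser. This is an illustration of the idea and not how soupault is implemented. The HeadInserter class, the transform_page function, and the rules dictionary are all hypothetical names. The page is parsed, re-emitted as it was, and a snippet is added just before the closing head tag, but only for pages that have a rule.

```python
from html.parser import HTMLParser


class HeadInserter(HTMLParser):
    """Re-emit a page as parsed, adding a snippet just before </head>."""

    def __init__(self, snippet):
        # Keep character references as written so the text round-trips.
        super().__init__(convert_charrefs=False)
        self.snippet = snippet
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "head":
            self.out.append(self.snippet)
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def transform_page(path, html, rules):
    """Insert a snippet only into pages listed in rules; leave others untouched."""
    snippet = rules.get(path)
    if snippet is None:
        return html
    parser = HeadInserter(snippet)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With rules set to {"site/tetris.html": '<script src="/scripts/tetris.js"></script>'}, only that page gets the script tag, and every other page passes through unchanged. The page itself stays an ordinary HTML file with no template placeholders in it.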

Also, site sections are just subdirectories. The workflow isn't so different from managing a static site by hand: you can automate exactly what you want to automate.

Conclusion

Will it become the tool of choice for the Web 1.0 community? It's too early to say. But I hope it will help someone else build and maintain their websites.


[1] The first widely available spreadsheet program.