Free Reproducible Lunches: Deploying Julia and Python in a Singularity container
Free reproducible lunches #
In the development of a computing science project, be it in industry or academia, you want your code to be reproducible, and you don’t want that reproducibility to reduce your productivity. Limiting features because they may not be portable, supported, and so on is unfortunate. Several factors enable or increase reproducibility; here we focus on what I understand by the term in the context of scientific computing.
- Stable results: A computation produces the same output, given a reference input, within a tolerance that does not affect interpretation.
- Develop once, run everywhere: You write your code not just for yourself (and the paper), but ideally to solve a real-world problem, or at least for others to verify and reproduce. That means your work needs to run on a number of different systems, each probably with different versions of its dependencies. You don’t want that to mean settling for the lowest common denominator.
- Defies time: Having to revisit your work two years after writing it because some arcane dependency is no longer available is not fun, yet it is very common in academia when trying to reproduce previously published work. Unless there’s a bug to fix or a feature you want to add, you do not have time to spend making something that did work, work again, because of the unrelated actions of someone else.
- Survives improvement: Adding features should not break previous functionality.
To realize all of these lofty goals, we’ll deploy a trifecta of software engineering solutions: testing, virtualization, and versioning.
First, testing should be a no-brainer, but is rarely present in academic software. When you implement a new feature, you cannot know it does what you think it does, unless you verify it.
There are two main reasons why this is an absolute necessity, even if you are a faultless genius:
Illusion of execution: We’re long past the age where a CPU executes exactly what you write. Instead, your work is executed with the same side effects as something that runs a lot faster. “Side effects” is a bit of a misnomer, originating in part from the functional programming context. A side effect can be output, a file your program wrote, something displayed on the screen, a robotic arm being moved, and so on. C++ compilers are a great example of this: if you write C++ code that produces no observable side effects and set the optimizer to -O3, the compiler is free to eliminate your code entirely, because there are no measurable side effects. If you write a for loop to compute the sum of the first n integers, and the compiler or runtime knows how to optimize that into a closed-form formula, then the latter will be executed, not your version. The point I’m making here is that this benevolent behavior is counterintuitive for those who first become aware of it. Most programmers are taught in imperative programming environments, where they believe they tell the computer exactly what to do, step by step. In fact, what you tell the computer is which side effects your code must produce, and the compiler and execution engine figure out how to get that result in a way that’s usually 10-100x faster. Getting back on topic, this matters to our discussion because you thought you controlled execution, but you don’t. And even geniuses who are confident about the line-by-line execution of their code will be less so about the outcome, in other words the side effects, of a 10,000+ line codebase.
Scientific progress by surprising failure: Let’s assume you never write code with a bug in it. Even then, the solution to the problem you’re implementing is not perfect. Rarely, in science, is a perfect, never-to-be-improved solution to a problem found. But, by definition, you do not know where your solution will break down. You don’t know how approximate the solution will be, how accurate it can or will be, and what the surprises are. This, and not the solution itself, is a key result of scientific computing. That means you will have to extensively test, evaluate, benchmark, and verify your code. So even if you object to writing tests because you are a never-erring coder and think it’s a waste of time, as a computing scientist you will have to, because it’s the raison d’être of your code to begin with. And because we love efficiency, we’ll write those tests once, then have them executed conveniently and automatically, with informative results.
Tests require a stable environment. As an argument in extremis: you don’t expect a call to a random number generator to give identical results without a seed, do you? The same applies to your environment: why invoke non-determinism when you don’t have to? Dependencies have subtle interactions, there are timing issues in parallel execution, variable differences in the precision of approximations, and so on. A ‘fix’ or improvement in a mathematical library can increase or decrease the precision of your results. So while you’re certainly welcome to write code that runs exactly the same everywhere, including on yet-to-be-designed computing platforms, for now let’s settle on a controlled environment.
This encompasses points 2 and 3: if you control the environment, you also control when, or if, it will ever change.
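As a toy illustration of the seed analogy, here is a minimal sketch in bash. It uses bash’s built-in RANDOM variable, which is re-seeded by assignment; the exact values differ between shell versions, which is itself an argument for pinning your environment.

```shell
# Seeding RANDOM makes the sequence repeatable within one shell version.
RANDOM=42
a="$RANDOM $RANDOM $RANDOM"
RANDOM=42
b="$RANDOM $RANDOM $RANDOM"
if [ "$a" = "$b" ]; then
    echo "reproducible"
else
    echo "non-reproducible"
fi
# prints: reproducible
```

The same sequence appears both times on the same interpreter, but a different bash version may print different numbers, exactly the kind of environmental drift a container removes.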
All of the above now enables you to complete aim number 4. In your perfect environment, you can alter your work by adding yet another feature, or refactor existing code, and because you extensively tested the previous behavior, and have automated those tests, you now have the framework to improve without the risk of silently introducing breaking changes. While users of your code will beg, request, cajole, and pay for new features, the one thing that will make those features actually happen is the guarantee that they can trust your code to be as immutable in its execution as a statue.
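To make the “write tests once, run them automatically” point concrete, here is a language-agnostic regression-test sketch in shell; `compute` is a stand-in for your real program, and the reference value is one you would have recorded from a validated run.

```shell
# Minimal regression test: compare current output against a stored reference.
compute() { echo "42"; }   # placeholder for the real computation
expected="42"              # reference output, recorded earlier
actual=$(compute)
if [ "$actual" = "$expected" ]; then
    echo "PASS"
else
    echo "FAIL: expected $expected, got $actual"
    exit 1
fi
# prints: PASS
```

A CI system, or the container build we set up below, only has to run such a script and check the exit code.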
Achieving reproducibility with Julia and Singularity #
First, we install Singularity. We’re working on a Fedora-based system here, but the steps for Debian-based systems are similar.
sudo dnf install -y singularity
Next, we write a simple definition file of what we want containerized.
Creating your definition file #
Below, I’ll walk you through the sections of this minimal example.
BootStrap: docker
From: fedora:35
Singularity can reuse or extend images from Docker registries, so let’s reuse the official Fedora 35 base image.
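Docker Hub is not the only source: Singularity supports several bootstrap agents. As a sketch, the header for building from an image hosted in the Sylabs library instead would look like this (the image name is illustrative):

```
BootStrap: library
From: ubuntu:20.04
```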
%post
export TARGET=https://julialang-s3.julialang.org/bin/linux/x64/1.7/julia-1.7.1-linux-x86_64.tar.gz
dnf install -y wget openssh-clients git
mkdir -p /opt/julia && wget $TARGET && tar -xf julia-1.7.1-linux-x86_64.tar.gz -C /opt/julia && rm julia-1.7.1-linux-x86_64.tar.gz
export PATH=/opt/julia/julia-1.7.1/bin:$PATH
mkdir -p /opt/juliadepot
export JULIA_DEPOT_PATH=/opt/juliadepot
cd /opt && git clone https://github.com/<YOU>/MyPackage.jl
julia --project=/opt/MyPackage.jl -e 'using Pkg; Pkg.update(); Pkg.build(); Pkg.instantiate(); Pkg.test()'
julia --project=/opt/MyPackage.jl -e 'using MyPackage'
## Interactive Julia will write to logs, but we don't want
## the container to be writable, so link it to tmpfs on host.
rm -rf /opt/juliadepot/logs
ln -s /dev/shm/ /opt/juliadepot/logs
# Get rid of the packages not needed for runtime
dnf remove -y wget openssh-clients git
We first download our preferred Julia version, extract it, and configure the container’s PATH to point to that julia executable. Next, we configure $JULIA_DEPOT_PATH. This is a key step: without it, Julia will write to $HOME/.julia, which is outside of the container. We have no control over what’s there, and neither we nor the user want to mix things. So Julia is told to look only in /opt/juliadepot for scratch space, logs, packages, and so on.
Finally, we clone, build, test, and install your Julia package.
The only somewhat unorthodox part is the rm/ln pair. We want to deploy a read-only container, but Julia will sometimes write log files to $JULIA_DEPOT_PATH/logs. So we can either make the entire container writable, throwing away a lot of advantages for no gain, or link that directory to /dev/shm, a temporary file system in Linux that is user-writable. To save space, we uninstall the packages we won’t need at runtime.
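The trick generalizes: any path an application insists on writing to can be redirected out of a read-only tree with a symlink. A quick sketch you can run outside the container, where temporary directories stand in for /opt/juliadepot and /dev/shm:

```shell
set -e
tmpfs=$(mktemp -d)    # stands in for /dev/shm
depot=$(mktemp -d)    # stands in for /opt/juliadepot
mkdir -p "$depot/logs"
rmdir "$depot/logs"
ln -s "$tmpfs" "$depot/logs"     # the same rm/ln dance as in %post
echo "log entry" > "$depot/logs/demo.log"
cat "$tmpfs/demo.log"            # the write landed on the tmpfs side
# prints: log entry
```

The application still sees a writable logs/ directory, while the image itself stays immutable.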
Runtime configuration #
Next, we configure the container’s runtime environment.
%environment
export LC_ALL=C
export PATH=/opt/julia/julia-1.7.1/bin:$PATH
export JULIA_DEPOT_PATH=/opt/juliadepot
And add a simple token runscript that’s friendly but mostly useless. One caveat: $NOW is not defined anywhere in this definition file; to populate it, you would append an export line to $SINGULARITY_ENVIRONMENT in %post, otherwise it expands to an empty string.
%runscript
echo "Container was created $NOW"
echo "Arguments received: $*"
pwd
exec echo "$@"
We’ll add some information and usage instructions as breadcrumbs for unsuspecting users.
%labels
Author you@you.country
Version v0.4.2
%help
singularity exec image.sif julia --project=/opt/MyPackage.jl
You can simplify this by installing the distribution’s Julia package, but that is rarely as bleeding edge as we’d like. As an exercise, how would you define an alias that shortens "julia --project=/opt/MyPackage.jl" to something briefer?
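One possible answer, sketched here outside the container: shell aliases don’t survive into non-interactive singularity exec sessions, so a small wrapper script installed from %post is more robust. The name mjulia is an arbitrary choice, and echo stands in for the real julia binary so the sketch runs anywhere:

```shell
bindir=$(mktemp -d)              # in %post this would be /usr/local/bin
cat > "$bindir/mjulia" <<'EOF'
#!/bin/sh
# Wrapper: forwards all arguments to julia with the project preselected.
# 'echo' stands in for the real julia binary in this demo.
exec echo julia --project=/opt/MyPackage.jl "$@"
EOF
chmod +x "$bindir/mjulia"
"$bindir/mjulia" myscript.jl arg1
# prints: julia --project=/opt/MyPackage.jl myscript.jl arg1
```

Drop the echo, write the script to /usr/local/bin in %post, and users can simply run singularity exec image.sif mjulia myscript.jl.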
Building the image #
Next, we use the definition file to build our image:
sudo singularity build myimage.sif mydefinition.def
This will run through the %post section, and should anything fail, halt the build. This means that if the build succeeds, we know that:
- Julia is installed and up to date
- Our package is installed with all dependencies
- Our package is precompiled, saving the user some time at first use
- All the tests pass
You do need sudo for this. If you do not have it, use the Sylabs Remote Builder, or spawn a virtual machine on a VPS.
Running the container #
Next, we can use the image either to execute commands of our choice or to enter it interactively. Neither can alter the image, because the container filesystem overlay is read-only.
singularity exec myimage.sif echo "Hi there!"
It’s more likely the user will want to do something like
singularity exec myimage.sif julia --project=/opt/MyPackage.jl ajuliascript.jl arg1 arg2 42
Or perhaps use the container interactively
singularity shell myimage.sif
Singularity> julia --project=/opt/MyPackage.jl
julia> sum(1:10)
julia> using MyPackage
julia> ?MyPackage.myfunction
We now have the image, but how do we get it to users? You can share the definition file, but users do not always have sudo rights.
Sharing the container with users #
Assuming you have configured your free Sylabs.io cloud account, you can do
singularity remote login
singularity sign myimage.sif #Assuming you've set up your key, see sylabs.io docs.
singularity push myimage.sif library://<you>/<collection>/yourimage:x.y.z
Note that this defaults to public images, so make sure to configure it as private first if your code is not yet open.
Then all you need to tell your users is:
singularity pull image.sif library://<you>/<collection>/yourimage:x.y.z
singularity exec image.sif julia --project=/opt/MyPackage.jl ajuliascript.jl arg1 arg2 42
The code will be reproducible and tested, and the container and testing process are fully automated. When a user now comes to you with an issue, you and they will be working in the exact same environment, so most issues will not exist in the first place, because their image was handcrafted by you. And if there is an issue, the time otherwise spent figuring out whether it’s an actual bug or an environment problem can now be spent fixing the bug.
The observant reader, while copy-pasting, may have noted that we haven’t discussed versioning. It’s implicit: we use git for our package, and you’ll want to tell your favorite CI/CD system to build from the Singularity definition file if you like saving time. Second, the images themselves are versioned: you can have a collection for one project with three versions, say alpha, beta, and reviewer-changes. And because images are signed, users will know that you, and only you, compiled this for them.
Encore #
But wait, the title said “Python”. Julia can call Python packages, and the one thing I don’t want to do is reinvent wheels, so I often use Python packages from within Julia. In my experience, getting Python working in a reproducible fashion is non-trivial too, and as before, we do not want to access or rely on the user’s environment. Instead, we’ll use Julia’s Conda.jl package to embed an isolated Python installation in our image.
In %post, change
cd /opt && git clone https://github.com/<YOU>/MyPackage.jl
julia --project=/opt/MyPackage.jl -e 'using Pkg; Pkg.update(); Pkg.build(); Pkg.instantiate(); Pkg.test()'
julia --project=/opt/MyPackage.jl -e 'using MyPackage'
to
cd /opt && git clone https://github.com/<YOU>/MyPackage.jl
julia --project=/opt/MyPackage.jl -e 'using Pkg; Pkg.update(); ENV["PYTHON"]=""; Pkg.add("Conda"); Pkg.add("PyCall")'
julia --project=/opt/MyPackage.jl -e 'using Pkg; ENV["PYTHON"]=""; Pkg.build("PyCall");'
julia --project=/opt/MyPackage.jl -e 'using Conda; Conda.add("scikit-image")'
julia --project=/opt/MyPackage.jl -e 'using PyCall; PyCall.pyimport("skimage")'
julia --project=/opt/MyPackage.jl -e 'using Pkg; Pkg.build(); Pkg.instantiate(); Pkg.test()'
julia --project=/opt/MyPackage.jl -e 'using MyPackage'
The critical part is setting ENV["PYTHON"] to "": PyCall and Conda check this variable to decide whether they may use the system’s Python environment. If it is set to "", they will instead create a new, clean, isolated environment.
You can shorten this by adding a deps/build.jl file to MyPackage.jl and putting those extra lines there; then a single Pkg.build() will create a fresh Python install with scikit-image. We call pyimport once to make sure it works, and to make sure it’s precompiled; otherwise you may incur a one-time penalty of a few seconds at first invocation.
That’s it.
Notes #
- Singularity works equally well on Windows by leveraging WSL2; I’ll leave you in the good hands of your search engine of choice to find the instructions for that.
- Sylabs has a convenient remote builder, so in theory you can do all of the above on a phone with a somewhat user-friendly web browser.
- On the topic of genius coders who don’t write bugs, Dunning and Kruger would suggest that programmers who believe themselves infallible are exactly the ones unaware of the trail of destruction they leave behind for others to fix.