Using containers for reproducible computing
Better Code, Better Science: Chapter 6, Part 2
This is a possible section from the open-source living textbook Better Code, Better Science, which is being released in sections on Substack. The entire book can be accessed here and the GitHub repository is here. This material is released under CC-BY-NC. Thanks to Steffen Bollmann for helpful suggestions on a draft of this section.
An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures. (Buckheit & Donoho, 1995).
So far we have discussed the importance of code for reproducibility, and in a later chapter we talk extensively about the sharing of data. However, the foregoing quote from Buckheit and Donoho highlights the additional importance of the computational platform. When they wrote their paper in 1995, there were no easily accessible solutions for sharing compute platforms, but a technology known as containerization has emerged in the last decade that provides an easily implemented and widely accessible solution for the sharing of computational platforms.
To understand the concept of a container, it’s first useful to understand the related idea of the virtual machine (or VM). A VM is like a “computer-in-a-computer”, in the sense that it behaves like a fully functioning computer, despite the fact that it only exists virtually within its host system. If you have ever used a cloud system like Amazon Web Services Elastic Compute Cloud (EC2), you have run a virtual machine; the virtualization technology is how Amazon can run many virtual computers on a single physical computing node. The virtual machine runs a fully functioning version of the operating system; for example, a Windows virtual machine would run a fully functioning version of Windows, even if it’s implemented on an Apple Mac host. One challenge of this is that sharing the virtual machine with someone else requires sharing the entire operating system along with any installed components, which can often take many gigabytes of space.
A container is a way to share only the components that are required to run the intended applications, rather than sharing the entire operating system. This makes containers generally much smaller and faster to work with compared to a virtual machine. Containers were made popular by the Docker software, which allows the same container to run on a Mac, Windows, or Linux machine, because Docker runs a Linux virtual machine that supports these containers. Another tool known as Apptainer (formerly Singularity) is commonly used to run containerized applications on high-performance computing (HPC) systems, since Docker requires root access that is not available to users on most shared systems. We will focus on Docker here, given that it is broadly available and that Apptainer can easily run Docker containers as well.
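For example, Apptainer can pull an image directly from Docker Hub and convert it to its own image format. Here is a brief sketch (the default output filename may vary across Apptainer versions):

➤ apptainer pull docker://python:3.13.9
➤ apptainer exec python_3.13.9.sif python --version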
A container image is, at present, the most reproducible way to share software, because it ensures that the dependencies will remain fixed. We use containers to distribute software built by our lab, such as fMRIPrep, because it greatly reduces installation hassles for complex applications. All the user needs to do is install the Docker software, and they are up and running quickly. Without the containerized version, the user would need to install a large number of dependencies, some of which might not be available for their operating system. Containers are far from perfect, but they are currently the best solution we have for reproducible software execution.
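For example, running a containerized application such as fMRIPrep requires only pulling its published image; the version tag below is illustrative, so check the fMRIPrep documentation for the current release:

➤ docker pull nipreps/fmriprep:24.0.0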
Running a Docker container
We will start by running a container based on an existing container image, which is a file that defines the contents of the container. The Docker Hub is a portal that contains images for many different applications. For this example, we will use the official Python image, which contains the required dependencies for a basic Python installation.
We first need to pull the container image from Docker Hub onto our local system, using the docker pull command to obtain version 3.13.9 of the image:
➤ docker pull python:3.13.9
3.13.9: Pulling from library/python
2a101b2fcb53: Pull complete
f510ac7d6fe7: Pull complete
721433549fef: Pull complete
e2f695ddffd8: Pull complete
17e8deb32a49: Pull complete
bc60d97daad5: Pull complete
6275e9642344: Pull complete
Digest: sha256:12513c633252a28bcfee85839aa384e1af322f11275779c6645076c6cd0cfe52
Status: Downloaded newer image for python:3.13.9
docker.io/library/python:3.13.9

Make sure to always specify an explicit version of the image, and do not use the convenient latest tag; latest will lead to unreproducible setups, because the version of the image you get will depend on the download date, and it can also expose you to security vulnerabilities.
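For even stronger reproducibility, we can pin the image by its content digest (shown in the pull output above) rather than by tag. Unlike a tag, which can be reassigned to a different image over time, a digest always refers to exactly the same image contents:

➤ docker pull python@sha256:12513c633252a28bcfee85839aa384e1af322f11275779c6645076c6cd0cfe52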
Now that the image exists on our machine, we can use it to open a container and run a Python command:
➤ docker run python:3.13.9 python -c "import sys; print(f'Hello World from Python {sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}')"
Hello World from Python 3.13.9

We could also log into the container to see that it’s really just like any other Unix system. We do this by giving the `-it` flag to docker run, which tells it to run in interactive mode and allocate a terminal; here we use it to run the `bash` shell:
➤ docker run -it python:3.13.9 bash
root@65bb0e71462d:/# pwd
/
root@65bb0e71462d:/# whoami
root
root@65bb0e71462d:/# ls
bin dev home media opt root sbin sys usr
boot etc lib mnt proc run srv tmp var

Mounting local storage within a container
For realistic applications we will often want to access our local storage volumes, in order to process files, run local code, or save outputs. With Docker we can bind mount a local storage volume to a location within the container. Let’s say that we want to output contents from an operation within the container to a local directory called container-output, and that we want this to appear as /data within the container. We can do that using the -v flag to docker run:
➤ docker run -it -v ./container-output:/data python:3.13.9 bash
root@d14247359d45:/# cd /data
root@d14247359d45:/data# ls
root@d14247359d45:/data# echo "test output from container" > test_output.txt
root@d14247359d45:/data# more test_output.txt
test output from container
root@d14247359d45:/data# exit
exit
➤ ls container-output
test_output.txt
➤ more container-output/test_output.txt
test output from container
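If the container only needs to read local files, it is safer to mount the volume read-only by appending :ro to the bind mount; any attempt to write to /data from within the container will then fail with an error:

➤ docker run -it -v ./container-output:/data:ro python:3.13.9 bash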
Configuring a Docker image
To create a reproducible software execution environment, we will often need to create our own new Docker image that contains the necessary dependencies and application code. AI coding tools are generally quite good at creating the required Dockerfile that defines the image. We used the following prompt with Claude Sonnet 4:
I would like to generate a Dockerfile to define a Docker image based on the python:3.13.9 image. The Python package wonderwords should be installed from PyPI. A local Python script should be created that creates a random sentence using wonderwords.RandomSentence() and prints it. This script should be the entrypoint for the Docker container. Create this within src/BetterCodeBetterScience/docker-example inside the current project. Do not create a new workspace - use the existing workspace for this project.

Here is the content of the resulting `Dockerfile`:
FROM python:3.13.9
# Set working directory
WORKDIR /app
# Install wonderwords package
RUN pip install wonderwords==2.2.0
# Copy the Python script
COPY random_sentence.py .
# Set the entrypoint to run our script
ENTRYPOINT ["python", "random_sentence.py"]

The entrypoint defines the command that will be run by default when the container is run.
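The random_sentence.py script that Claude generated is not shown here; a minimal version consistent with the prompt and with the output below might look like this (the use of the sentence() method is an assumption):

from wonderwords import RandomSentence

# Generate a random sentence and print it (assumed implementation)
sentence = RandomSentence().sentence()
print(f"Random sentence: {sentence}")

We can then build the image: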
➤ docker build -t random-sentence-generator .
[+] Building 0.0s (9/9) FINISHED docker:desktop-linux
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 339B 0.0s
=> [internal] load metadata for docker.io/library/python:3.13.9 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [1/4] FROM docker.io/library/python:3.13.9 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 89B 0.0s
=> CACHED [2/4] WORKDIR /app 0.0s
=> CACHED [3/4] RUN pip install wonderwords==2.2.0 0.0s
=> CACHED [4/4] COPY random_sentence.py . 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:02794d11ad789b3a056831da2a431deb2241a5da0b20506e 0.0s
=> => naming to docker.io/library/random-sentence-generator 0.0s

We can now see it in the list of images obtained using docker images:
➤ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
random-sentence-generator latest 02794d11ad78 5 minutes ago 1.13GB
python 3.13.9 49bb15d4b6f6 2 weeks ago 1.12GB
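Note that because we did not specify a tag when building, our new image was tagged latest, which runs counter to the versioning advice given above; for anything we intend to share, it would be better to supply an explicit version tag at build time (the 0.1.0 tag here is just an illustration):

➤ docker build -t random-sentence-generator:0.1.0 .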
We then run the container to execute the command; the --rm flag removes the container once it exits:
➤ docker run --rm random-sentence-generator
Random sentence: The tangible fairy informs crazy.

Using containers as a sandbox for AI agents
In addition to allowing the sharing of reproducible environments, containers also provide a very handy tool in the context of agentic coding tools: they allow us to create a sandboxed computing environment that limits the scope of the agent’s actions. This is essential when one is using agentic tools with their access controls disabled. For example, Claude Code usually requires the user to explicitly grant permission for access to particular locations on the local disk (with the option to grant it automatically for the remainder of the session). However, it has a --dangerously-skip-permissions flag (also referred to as “YOLO mode”) that turns off these permission checks, giving the agent unrestricted ability to read and write files, run scripts or programs, and access the internet. This is primarily meant for use on “headless” computers to automate various processes, but it’s not surprising that users have tried it on their own local systems to speed up the development process. The Anthropic documentation for Claude Code explicitly cautions against this:
Letting Claude run arbitrary commands is risky and can result in data loss, system corruption, or even data exfiltration (e.g., via prompt injection attacks). To minimize these risks, use --dangerously-skip-permissions in a container without internet access.
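Combining the techniques above gives us exactly such a sandbox: Docker’s --network none option starts a container with no internet access, and a bind mount limits the agent’s file access to a single project directory. In this sketch, my-dev-image and ./my-project are placeholders for whatever image contains your development tools and the directory you want the agent to work in:

➤ docker run --rm -it --network none -v ./my-project:/workspace my-dev-image bash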
In the next post I will discuss project organization.
