1 Reproducibility with Nix
1.1 Learning Outcomes
By the end of this chapter, you will:
- Understand the need for environment reproducibility in modern workflows
- Install Nix
- Use {rix} to generate default.nix files
- Build cross-language environments for data work or software development
1.2 Why Reproducibility? Why Nix? (2h)
1.2.1 Motivation: Reproducibility in Scientific and Data Workflows
To ensure that a project is reproducible you need to deal with at least four things:
- Make sure that the required/correct version of R (or any other language) is installed;
- Make sure that the required versions of packages are installed;
- Make sure that system dependencies are installed (for example, you’d need a working Java installation to install the rJava R package on Linux);
- Make sure that you can install all of this for the hardware you have on hand.
But in practice, one or more of these requirements are often left unaddressed in projects. The goal of this course is to learn how to fulfill all of them to build reproducible projects.
1.2.2 Problems with Ad-Hoc Tools
Tools like Python’s venv or R’s renv only deal with some pieces of the reproducibility puzzle. Often, they assume an underlying OS, do not capture native system dependencies (like libxml2, pandoc, or curl), and require users to “rebuild” their environments from partial metadata. Docker helps but introduces overhead, security challenges, and complexity.
Traditional approaches fail to capture the entire dependency graph of a project in a deterministic way. This leads to “it works on my machine” syndromes, onboarding delays, and subtle bugs.
1.2.3 Nix, a declarative package manager
Nix is a tool for reproducible builds and development environments, often introduced as a package manager. It captures complete dependency trees, from your programming language interpreter to every system-level library you rely on. With Nix, environments are not recreated from documentation, but rebuilt precisely from code.
Nix can be installed on Linux distributions, macOS and it even works on Windows if you enable WSL2. In this course, we will use Nix mostly as a package manager (but towards the end also as a build automation tool).
What’s a package manager? If you’re not a Linux user, you may not know. Let me explain it this way: in R, if you want to install a package to provide some functionality not included with a vanilla installation of R, you’d run this:
install.packages("dplyr")
It turns out that Linux distributions, like Ubuntu for example, work in a similar way, but for software that you’d usually install using an installer (at least on Windows). For example you could install Firefox on Ubuntu using:
sudo apt-get install firefox
(there are also graphical interfaces that make this process “more user-friendly”). In Linux jargon, packages are simply what we call software (or I guess it’s all “apps” these days). These packages get downloaded from so-called repositories (think of CRAN, the repository of R packages) but for any type of software that you might need to make your computer work: web browsers, office suites, multimedia software and so on.
So Nix is just another package manager that you can use to install software.
But what interests us is not using Nix to install Firefox, but instead to install R and the R packages that we require for our analysis (or any other programming language that we need). But why use Nix instead of the usual ways to install software on our operating systems?
The first thing that you should know is that Nix’s repository, nixpkgs, is huge. Humongously huge. As I’m writing these lines, there are more than 120’000 pieces of software available, and the entirety of CRAN and Bioconductor is also available through nixpkgs. So instead of installing R as you usually do and then using install.packages() to install packages, you could use Nix to handle everything. But still, why use Nix at all?
Nix has an interesting feature: using Nix, it is possible to install software in (relatively) isolated environments. So using Nix, you can install as many versions of R and R packages as you need. Suppose that you start working on a new project. As you start the project, with Nix, you would install a project-specific version of R and R packages that you would only use for that particular project. If you switch projects, you’d switch versions of R and R packages.
However, Nix has quite a steep learning curve, which is why for the purposes of this course we are going to use an R package called {rix} to set up reproducible environments.
1.2.4 The rix package
The idea of {rix} is for you to declare the environment you need using the provided rix() function. rix() is the package’s main function and generates a file called default.nix which is then used by the Nix package manager to build that environment. Ideally, you would set up such an environment for each of your projects. You can then use this environment to either work interactively, or run R or Python scripts. It is possible to have as many environments as projects, and software that is common to environments will simply be re-used and not get re-installed, to save space. Environments are isolated from each other, but can still interact with your system’s files, unlike with Docker where a volume must be mounted. Environments can also interact with the software installed on your computer through the usual means, which can sometimes lead to issues. For example, if you already have R installed, and a user library of R packages, more caution is required to properly use environments managed by Nix.
You don’t need to have R installed or be an R user to use {rix}. If you have Nix installed on your system, it is possible to “drop” into a temporary environment with R and {rix} available and generate the required Nix expression from there.
But first, let’s install Nix and try to use temporary shells.
1.2.5 Installing Nix
If you are on Windows, you need the Windows Subsystem for Linux 2 (WSL2) to run Nix. If you are on a recent version of Windows 10 or 11, you can simply run this as an administrator in PowerShell:
wsl --install
You can find further installation notes at this official MS documentation.
I recommend activating systemd in Ubuntu WSL2, mainly because this supports users other than root running Nix. To set this up, please do as outlined in this official Ubuntu blog entry:
# in WSL2 Ubuntu shell
sudo -i
nano /etc/wsl.conf
This will open /etc/wsl.conf in nano, a command-line text editor. Add the following lines:
[boot]
systemd=true
Save the file with CTRL-O and then quit nano with CTRL-X. Then, type the following line in PowerShell:
wsl --shutdown
and then relaunch WSL (Ubuntu) from the start menu. For those of you running Windows, we will be working exclusively from WSL2 now. If that is not an option, then I highly recommend you set up a virtual machine with Ubuntu using VirtualBox for example, or dual-boot Ubuntu.
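Once WSL has been relaunched, one quick way to check that systemd is really in charge is to look at the name of the process with PID 1. The following is only a minimal sketch and not part of the official instructions: check_init is a hypothetical helper name, and it assumes a ps that understands the -p and -o options.

```shell
# Hypothetical helper: report whether PID 1 is systemd.
check_init() {
  # `comm=` prints only the command name, without a header line.
  init_name=$(ps -p 1 -o comm= 2>/dev/null || true)
  if [ "$init_name" = "systemd" ]; then
    echo "systemd is running as PID 1"
  else
    echo "PID 1 is '${init_name:-unknown}', not systemd; check /etc/wsl.conf and restart WSL"
  fi
}

check_init
```

If the second message shows up, the [boot] section in /etc/wsl.conf is the first thing to double-check.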
Installing (and uninstalling) Nix is quite simple, thanks to the installer from Determinate Systems, a company that provides services and tools built on Nix. The installer works the same way on Linux (native or WSL2) and macOS.
Do not use your operating system’s package manager to install Nix. Instead, simply open a terminal and run the following line. On Windows, if you cannot or have decided not to activate systemd, you have to append --init none to the command; you can find more details about this on The Determinate Nix Installer page:
curl --proto '=https' --tlsv1.2 -sSf \
-L https://install.determinate.systems/nix | \
sh -s -- install
Then, install the cachix client and configure the rstats-on-nix cache: this cache provides binary versions of many R packages, which will speed up the building process of environments:
nix-env -iA cachix -f https://cachix.org/api/v1/install
then use the cache:
cachix use rstats-on-nix
You only need to do this once per machine you want to use {rix} on. Many thanks to Cachix for sponsoring the rstats-on-nix cache!
1.2.6 Temporary shells
You now have Nix installed; before continuing, let’s check that everything works (close all your terminals and reopen them first) by dropping into a temporary shell with a tool you likely have not installed on your machine.
Open a terminal and run:
which sl
you will likely see something like this:
which: no sl in ....
now run this:
nix-shell -p sl
and then again:
which sl
this time you should see something like:
/nix/store/cndqpx74312xkrrgp842ifinkd4cg89g-sl-5.05/bin/sl
This is the path to the sl binary installed through Nix. The path starts with /nix/store: the Nix store is where all the software installed through Nix is stored. Now type sl and see what happens!
You can find the list of available packages here.
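If you find yourself running which over and over to see where a tool comes from, the check can be wrapped in a small shell function. This is just a sketch: from_nix_store is a hypothetical name, and it only looks at the path found on PATH, so tools that are reached through a symlink (for example from a user profile) will be reported as not coming from the store.

```shell
# Hypothetical helper: tell whether a command on PATH lives in the Nix store.
from_nix_store() {
  cmd_path=$(command -v "$1" 2>/dev/null) || {
    echo "$1: not found"
    return 1
  }
  case "$cmd_path" in
    /nix/store/*) echo "$1: provided by Nix ($cmd_path)" ;;
    *)            echo "$1: not from the Nix store ($cmd_path)" ;;
  esac
}

# Try it inside and outside of `nix-shell -p sl`.
from_nix_store sl || true
```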
1.3 Session 1.2 – Dev Environments with Nix (2h)
1.3.1 Some Nix concepts
While temporary shells are useful for quick testing, this is not how Nix is typically used in practice. Nix is a declarative package manager: users specify what they want to build, and Nix takes care of the rest.
To do so, users write files called default.nix that contain a so-called Nix expression. This expression will contain the definition of one or several derivations.
In Nix terminology, a derivation is a specification for running an executable on precisely defined input files to repeatably produce output files at uniquely determined file system paths. (source)
In simpler terms, a derivation is a recipe with precisely defined inputs, steps, and a fixed output. This means that given identical inputs and build steps, the exact same output will always be produced. To achieve this level of reproducibility, several important measures must be taken:
- All inputs to a derivation must be explicitly declared.
- Inputs include not just data files, but also software dependencies, configuration flags, and environment variables, essentially anything necessary for the build process.
- The build process takes place in a hermetic sandbox to ensure the exact same output is always produced.
The next sections of this document explain these three points in more detail.
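To get a feeling for what “uniquely determined file system paths” means, here is a toy model in shell. It is emphatically not how Nix computes store paths (Nix hashes a full description of the derivation); it only illustrates the idea that the output location is a function of the declared inputs. The function name toy_output_path, the /toy/store prefix and the gawk version strings are all made up for this sketch.

```shell
# Toy model: derive an output path from a name and a list of declared inputs.
toy_output_path() {
  name=$1
  shift
  # Checksum of all declared inputs, one per line (cksum is a POSIX tool).
  digest=$(printf '%s\n' "$@" | cksum | cut -d' ' -f1)
  echo "/toy/store/${digest}-${name}"
}

toy_output_path filtered_mtcars "gawk-5.3" "mtcars.csv"
toy_output_path filtered_mtcars "gawk-5.3" "mtcars.csv"   # same inputs, same path
toy_output_path filtered_mtcars "gawk-5.2" "mtcars.csv"   # one input changed, different path
```

The first two calls print the same path and the third prints a different one, which is the property that lets Nix keep several versions of the same software side by side.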
1.3.2 Derivations
Here is an example of a simple Nix expression:
let
  pkgs = import (fetchTarball "https://github.com/rstats-on-nix/nixpkgs/archive/2025-04-11.tar.gz") {};
in
pkgs.stdenv.mkDerivation {
  name = "filtered_mtcars";
  buildInputs = [ pkgs.gawk ];
  dontUnpack = true;
  src = ./mtcars.csv;
  installPhase = ''
    mkdir -p $out
    awk -F',' 'NR==1 || $9=="1" { print }' $src > $out/filtered.csv
  '';
}
I won’t go into details here, but what’s important is that this code uses awk, a common Unix data processing tool, to filter the mtcars.csv file to keep only rows where the 9th column (the am column) equals 1. As you can see, a significant amount of boilerplate code is required to perform this simple operation. However, this approach is completely reproducible: the dependencies are declared and pinned to a specific dated branch of our rstats-on-nix/nixpkgs fork (more on this later), and the only thing that could make this pipeline fail (though it’s a bit of a stretch to call this a pipeline) is if the mtcars.csv file is not provided to it. This expression can be instantiated into a derivation, and the derivation is then built into the actual output that interests us, namely the filtered mtcars data.
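If you want to see what the awk filter does without involving Nix at all, you can run it by hand on a tiny file. The sketch below uses three rows from R’s built-in mtcars dataset and assumes the CSV was exported without row names, so that am really is the 9th column; the temporary directory is only there to avoid cluttering your project.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)

# Three rows of mtcars (Mazda RX4, Hornet 4 Drive, Datsun 710), without row names.
cat > "$workdir/mtcars.csv" <<'EOF'
mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb
21,6,160,110,3.9,2.62,16.46,0,1,4,4
21.4,6,258,110,3.08,3.215,19.44,1,0,3,1
22.8,4,108,93,3.85,2.32,18.61,1,1,4,1
EOF

# Same filter as in the derivation: keep the header (NR==1) and rows where column 9 is "1".
awk -F',' 'NR==1 || $9=="1" { print }' "$workdir/mtcars.csv" > "$workdir/filtered.csv"
cat "$workdir/filtered.csv"
```

The header and the two rows with am equal to 1 are kept; the Hornet 4 Drive row is dropped. The derivation does exactly this, except that awk itself comes from the pinned nixpkgs and the result lands in $out.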
The derivation above uses mkDerivation, a function provided by the standard environment (stdenv) of nixpkgs: as its name implies, this function makes a derivation. But there is also mkShell, which is the function that builds a shell instead. Nix expressions that build a shell are the kind of expressions {rix} generates for you.
1.3.3 Using {rix} to generate development environments
If you have successfully installed Nix, but don’t yet have R installed on your system, you could install R as you would usually do on your operating system, and then install the {rix} package, and from there, generate project-specific expressions and build them. But you could also install R using Nix. Running the following line in a terminal gives you a temporary environment from which you can start an R session and begin generating expressions:
nix-shell -p R rPackages.rix
This will drop you in a temporary shell with R and {rix} available. Create an empty directory to hold a project, call it rix-session-1, and navigate to it:
mkdir rix-session-1
cd rix-session-1
Then start R and load {rix}:
R
library(rix)
you can now generate an expression by running the following code:
rix(
  date = "2025-06-02",
  r_pkgs = c("dplyr", "ggplot2"),
  py_conf = list(
    py_version = "3.13",
    py_pkgs = c("polars", "great-tables")
  ),
  ide = "positron",
  project_path = ".",
  overwrite = TRUE
)
This will write a file called default.nix in your project’s directory. This default.nix contains a Nix expression which will build a shell that comes with R, {dplyr} and {ggplot2} as they were on the 2nd of June 2025 on CRAN. This will also add Python 3.13 and the polars and great-tables Python packages as they were at the time in nixpkgs (more on this later). Finally, this also adds the Positron IDE, which is a fork of VS Code for data science. This is just an example, and you can use another IDE if you wish. See this vignette to learn how to set up your IDE with Nix.
1.3.4 Using nix-shell to Launch Environments
Once your file is in place, simply run:
nix-shell
This gives you an isolated shell session with all declared packages available. You can test code, explore APIs, or install further tools within this session.
To remove the packages that were installed, call nix-store --gc. This will call the garbage collector. To prevent an environment from being garbage-collected, build it with nix-build first (you can still enter it with nix-shell afterwards). This will create a symlink called result in your project’s root directory, and nix-store --gc won’t garbage-collect this environment until you manually remove result.
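Put together, the life cycle of an environment could look like the sketch below. It is written defensively so that it does nothing when Nix or a default.nix is missing; build_and_pin is a hypothetical helper name, and the clean-up commands are left as comments on purpose since garbage collection affects the whole store.

```shell
# Hypothetical helper: build the environment described by ./default.nix
# and keep it safe from the garbage collector via the ./result symlink.
build_and_pin() {
  if command -v nix-build >/dev/null 2>&1 && [ -f default.nix ]; then
    nix-build        # builds the environment and creates ./result
    ls -l result     # the symlink points into /nix/store
  else
    echo "Skipping: this needs Nix and a default.nix in the current directory"
  fi
}

build_and_pin

# Later, once the project is finished:
#   rm result          # the environment is no longer protected
#   nix-store --gc     # the garbage collector may now remove it
```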
1.3.5 Pinning with nixpkgs
To ensure long-term reproducibility, pin the version of Nixpkgs used. Replace <nixpkgs> with a fixed import:
let
  pkgs = import (fetchTarball "https://github.com/rstats-on-nix/nixpkgs/archive/2025-06-02.tar.gz") {};
in
pkgs.mkShell {
  buildInputs = [ pkgs.R pkgs.rPackages.dplyr ];
}
This avoids unexpected updates and lets others reproduce your environment exactly.
1.4 Configuring your IDE
We now need to configure an IDE to use both our Nix shells as development environments, and GitHub Copilot. You are free to use whatever IDE you want but the instructions below are going to focus on RStudio, VS Code and Positron.
The following are the setups we recommend you use to work using an IDE and Nix environments. To be recommended, a setup should:
- be easy to set up;
- work the same on any operating system;
- not require any type of special maintenance.
Regardless of your operating system, a general-purpose editor such as VS Code (or Codium), Emacs, or Neovim meets the above requirements. Recent releases of Positron also work quite well. (Note: Neovim is not covered here due to lack of experience; PRs welcome!) However, some editors perform better on certain platforms.
Also, we recommend you uninstall R if it’s installed system-wide and also remove your local library of packages and instead only use dedicated Nix shells to manage your projects. While we did our best to ensure that Nix shells do not interfere with a system-installed R, we recommend users get into the habit of taking a few minutes at the start of a project to properly set up their development environment.
1.4.1 Recommended setup on macOS
On macOS, RStudio is only available through Nix, and only for R versions 4.4.3 or more recent, or for dates on or after 2025-02-28 if you’re using dates. For older versions of R or older dates, RStudio is not available for macOS through Nix, so you cannot use it. In that case, we recommend either VS Code (or Codium) or Positron. Emacs or Neovim are also good options. See the relevant sections below to set up any of these editors. We also recommend installing the editor on macOS directly and configuring it to interact with Nix shells, instead of using Nix to install the editor, even though it does take some more effort to configure.
1.4.2 Recommended setup on Windows
On Windows, since you have to use Nix through WSL, your options are limited to editors that either:
- can be installed on Windows and interact with WSL, or
- can be launched directly from WSL.
We recommend using an editor you can install directly on Windows and configure to interact nicely with WSL, and it turns out that this is mostly only VS Code (or Codium) or Positron. See this section to learn how to configure VS Code (or Codium) or Positron.
If you want to use RStudio, this is also possible but:
- RStudio should ideally be installed with Nix inside WSL;
- your version of Windows needs to support WSLg which should be fine on Windows 11 or the very latest Windows 10 builds. WSLg allows you to run GUI apps from WSL.
You should also be aware that there is currently a bug in the RStudio Nix package that makes RStudio ignore project-specific .Rprofile files, which can be an issue if you also have a system-level library of packages. Instead, you can source the .Rprofile generated by rix() yourself or you can uninstall the system-level R and library of packages.
Furthermore, be aware that there is a bug in WSLg that prevents modifier keys like Alt Gr from working properly.
If you prefer Emacs or Neovim, then we recommend installing it in WSL and using it in command-line mode, not through WSLg (so starting Emacs with the -nw argument).
1.4.3 Recommended setup on Linux
On Linux distributions, the only real limitation is that RStudio cannot interact with Nix shells (just like on the other operating systems), so if you want to use RStudio then you need to install it using Nix.
You should also be aware that there is currently a bug in the RStudio Nix package that makes RStudio ignore project-specific .Rprofile files, which can be an issue if you also have a system-level library of packages. Instead, you can source the .Rprofile generated by rix() yourself or you can uninstall the system-level R and library of packages.
If you use another editor, just follow the relevant instructions below; the question you need to think about is whether you want to use Nix to install the editor inside of the development shell or if you prefer to install your editor yourself using your distribution’s package manager, and configure it to interact with Nix shells. We recommend the latter option, regardless of the editor you choose.
1.4.4 RStudio
RStudio must be installed by Nix in order to see and use Nix shells. So you cannot use the RStudio already installed on your computer to work with Nix shells. This means you need to set ide = "rstudio" if you wish to use RStudio.
You should also be aware that there is currently a bug in the RStudio Nix package that makes RStudio ignore project-specific .Rprofile files, which can be an issue if you also have a system-level library of packages. Instead, you can source the .Rprofile generated by rix() yourself or you can uninstall the system-level R and library of packages.
1.4.4.1 RStudio on macOS
To use RStudio on macOS, simply use ide = "rstudio", but be aware that this will only work for R version 4.4.3 or later, or for a date on or after 2025-02-28. If you don’t need to work with older versions of R or older dates, RStudio is an appropriate choice. Then, build the environment using nix-build and drop into the shell using nix-shell. Then, type rstudio to start RStudio. If you wish, you can even put the rstudio command in the shell hook to start it immediately as you run nix-shell.
1.4.4.2 RStudio on Linux or Windows
To use RStudio on Linux or Windows, simply use ide = "rstudio". Then, build the environment using nix-build and drop into the shell using nix-shell. Then, type rstudio to start RStudio.
If you plan to use RStudio on Ubuntu, then you need further configuration to make it work, because of newly introduced sandboxing features in Ubuntu 24.04. You will need to create an RStudio-specific AppArmor profile. To do so, create the profile file:
sudo nano /etc/apparmor.d/nix.rstudio
Populate it with:
profile nix.rstudio /nix/store/*-RStudio-*-wrapper/bin/rstudio flags=(unconfined) {
userns,
}
Save it, load the profile and start RStudio:
sudo apparmor_parser -r /etc/apparmor.d/nix.rstudio
sudo systemctl reload apparmor
You can now start RStudio from the activated Nix shell.
On Windows, you need to have WSLg enabled, which should be the case on the latest versions of Windows. If you wish, you can even put the rstudio command in the shell hook to start it immediately as you run nix-shell.
On Linux and WSL, depending on your desktop environment, and for older versions of RStudio, you might see the following error message when trying to launch RStudio:
qt.glx: qglx_findConfig: Failed to finding matching FBConfig for QSurfaceFormat(version 2.0, options QFlags<QSurfaceFormat::FormatOption>(), depthBufferSize -1, redBufferSize 1, greenBufferSize 1, blueBufferSize 1, alphaBufferSize -1, stencilBufferSize -1, samples -1, swapBehavior QSurfaceFormat::SingleBuffer, swapInterval 1, colorSpace QSurfaceFormat::DefaultColorSpace, profile QSurfaceFormat::NoProfile)
Could not initialize GLX
Aborted (core dumped)
In this case, run the following before running RStudio:
export QT_XCB_GL_INTEGRATION=none
To use GitHub Copilot with RStudio, follow these instructions.
1.4.5 VS Code or Positron
Positron is a fork of VS Code made by Posit and tailored for data science. Henceforth, I will refer to both editors simply as Code.
The same instructions apply whether your host operating system is Linux, macOS or Windows. The first step is of course to install Code on your operating system using the usual means of installing software.
If you’re on Windows, install Code on Windows, not in WSL. Code on Windows is able to interact with WSL seamlessly and before continuing here, please follow these instructions (it’s mostly about installing the right extensions after having installed Positron).
On macOS, start by installing Code using the official .dmg installer. Start Code, then open the command palette using COMMAND-SHIFT-P. In the search bar, type "Install 'positron' command in PATH" and click on it: this will make it possible to start Positron from a terminal.
Once Code is installed, you need to install a piece of software called direnv: direnv will automatically load Nix shells when you open a project that contains a default.nix file in an editor. It works on any operating system and many editors support it, including Code. Follow the instructions for your operating system here but if you’re using Windows, install direnv in WSL (even though you’ve just installed Code for Windows), so follow the instructions for whatever Linux distribution you’re using there (likely Ubuntu), or use Nix to install direnv if you prefer (this is the way I recommend to install it on macOS, unless you already use brew):
nix-env -f '<nixpkgs>' -iA direnv
This will install direnv and make it available even outside of Nix shells!
Then, we highly recommend installing the nix-direnv extension:
nix-env -f '<nixpkgs>' -iA nix-direnv
It is not mandatory to use nix-direnv if you already have direnv, but it’ll make loading environments much faster and more seamless. Finally, if you haven’t used direnv before, don’t forget this last step.
Then, in Code, install the direnv extension (and also the WSL extension if you’re on Windows, as explained in the official documentation linked above!). Finally, add a file called .envrc and simply write the following two lines in it:
use nix
mkdir $TMP
On Windows, remotely connect to WSL first, but on other operating systems, simply open the project’s folder using File > Open Folder... and you will see a pop-up stating direnv: /PATH/TO/PROJECT/.envrc is blocked and a button to allow it. Click Allow and then open an R script. You might get another pop-up asking you to restart the extension, so click Restart. Be aware that at this point, direnv will run nix-shell and so will start building the environment. If that particular environment hasn’t been built and cached yet, it might take some time before Code will be able to interact with it. You might get yet another pop-up, this time from the R Code extension complaining that R can’t be found. In this case, simply restart Code and open the project folder again: now it should work every time. For a new project, simply repeat this process:
- Generate the project’s default.nix file;
- Build it using nix-build;
- Create an .envrc and write the two lines from above in it;
- Open the project’s folder in Code and click Allow when prompted;
- Restart the extension and Code if necessary.
Another option is to create the .envrc file and write use nix in it, then open a terminal, navigate to the project’s folder, and run direnv allow. If you do this before opening Code, you should not be prompted anymore.
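From a terminal, this alternative boils down to a couple of commands. The sketch below uses a throwaway directory created with mktemp so that it does not touch a real project; in practice you would cd into your own project folder instead. It also falls back to a message when direnv is not installed yet.

```shell
# Stand-in for your project folder (a temporary directory, for illustration only).
project_dir=$(mktemp -d)
cd "$project_dir"

# Tell direnv to load the Nix shell defined by default.nix in this folder.
printf 'use nix\n' > .envrc

# Approve the .envrc ahead of time so that the editor does not have to ask.
if command -v direnv >/dev/null 2>&1; then
  direnv allow
else
  echo "direnv not found; install it first, then run: direnv allow"
fi
```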
If you’re on Windows, using Code like this is particularly interesting, because it allows you to install Code on Windows as usual, and then you can configure it to interact with a Nix shell, even if it’s running from WSL. This is a very seamless experience.
To configure VS Code to use GitHub Copilot, click here; for Positron, click here.
1.5 Hands-On Exercises
- Start a temporary shell with R and {rix} again using nix-shell -p R rPackages.rix. Start an R session (by typing R) and then load the {rix} package (using library(rix)). Run the available_dates() function: using the latest available date, generate a new default.nix.
- Inside of an activated shell, type which R and echo $PATH. Explore what is being added to your environment. What is the significance of paths like /nix/store/...?
- Break it on purpose: generate a new environment with a wrong R package name, for example dplyrnaught. Try to build the environment. What happens?
- Go to https://search.nixos.org/packages and look for packages that you usually use for your projects to see if they are available.