Bruno Rodrigues, head of statistics at the Ministry of Research and Higher Education in Luxembourg
Slides available online at https://is.gd/repro_basf
Code available at: https://github.com/b-rodrigues/repro_basf
What I mean by reproducibility
What Nix is, how it works, and how it complements Docker
What I will not discuss (but is very useful!):
We need to answer these questions
Here are the 4 main things influencing an analysis’ reproducibility:
Source: Peng, Roger D. 2011. “Reproducible Research in Computational Science.” Science 334 (6060): 1226–27
{renv} is commonly used
renv::init() generates a library snapshot as a renv.lock file
A renv.lock file looks like:
{
  "R": {
    "Version": "4.2.2",
    "Repositories": [
      {
        "Name": "CRAN",
        "URL": "https://packagemanager.rstudio.com/all/latest"
      }
    ]
  },
  "Packages": {
    "MASS": {
      "Package": "MASS",
      "Version": "7.3-58.1",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "762e1804143a332333c054759f89a706",
      "Requirements": []
    },
    "Matrix": {
      "Package": "Matrix",
      "Version": "1.5-1",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "539dc0c0c05636812f1080f473d2c177",
      "Requirements": [
        "lattice"
      ]

***and many more packages***
The renv.lock file is not just a record: renv::restore() uses it to reinstall the recorded package versions
{renv} alone is not enough
Shortcomings:
but… :
renv.lock file is “free”
Dockerizing a project could look like this:
(renv.lock)
{renv} can be tricky:
#> * installing *source* package ‘ModelMetrics’ ...
#> ** package ‘ModelMetrics’ successfully unpacked and MD5 sums checked
#> ** using staged installation
#> ** libs
#> /usr/bin/clang++ -std=gnu++11 -I"/opt/R-devel/lib64/R/include" -DNDEBUG -I'/home/docker/R/Rcpp/include' -I/usr/local/include -fpic -g -O2 -c RcppExports.cpp -o RcppExports.o
#> /usr/bin/clang++ -std=gnu++11 -I"/opt/R-devel/lib64/R/include" -DNDEBUG -I'/home/docker/R/Rcpp/include' -I/usr/local/include -fpic -g -O2 -c auc_.cpp -o auc_.o
#> auc_.cpp:2:10: fatal error: 'omp.h' file not found
#> #include <omp.h>
#> ^~~~~~~
#> 1 error generated.
#> make: *** [/opt/R-devel/lib64/R/etc/Makeconf:178: auc_.o] Error 1
#> ERROR: compilation failed for package ‘ModelMetrics’
The older the renv.lock file, the harder to restore!
Package manager: tool to install and manage packages
Package: any piece of software (not just R packages)
Example of popular package manager:
let
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};
  system_packages = builtins.attrValues {
    inherit (pkgs) R ;
  };
in
pkgs.mkShell {
  buildInputs = [ system_packages ];
  shellHook = "R --vanilla";
}
There’s a lot to discuss here!
pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};
system_packages: a variable that lists software to install
system_packages = builtins.attrValues {
  inherit (pkgs) R ;
};
pkgs.mkShell {
  buildInputs = [ system_packages ];
  shellHook = "R --vanilla";
}
The shell provides the software listed in system_packages (buildInputs)
It runs R --vanilla when started (shellHook)
{rix} will help!
{rix} (website) makes writing Nix expressions easy!
rix() function:
renv.lock files can also be used as starting points:
library(rix)
renv2nix(
  renv_lock_path = "path/to/original/renv_project/renv.lock",
  project_path = "path/to/rix_project",
  override_r_ver = "4.4.1" # <- optional
)
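For orientation, here is a rough sketch of what a Nix expression that also installs R packages can look like. It extends the expression discussed earlier with an rpkgs variable drawn from pkgs.rPackages. The variable name and the package choice (dplyr, ggplot2) are illustrative assumptions, and the file that rix() or renv2nix() actually generates will differ in its details.

```nix
let
  # Pin nixpkgs to the same commit as the earlier example
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};

  # R packages taken from the pinned nixpkgs (illustrative choice)
  rpkgs = builtins.attrValues {
    inherit (pkgs.rPackages) dplyr ggplot2;
  };

  # System-level software, here only R itself
  system_packages = builtins.attrValues {
    inherit (pkgs) R;
  };
in
pkgs.mkShell {
  # Everything listed here is available inside the shell
  buildInputs = [ rpkgs system_packages ];
}
```

Because the nixpkgs commit is pinned, the versions of R, dplyr and ggplot2 are whatever that commit ships, which is what makes the environment reproducible.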
scripts/nix_expressions/docker/ (you’ll find many other examples in the repository)
("dplyr@1.0.0")
{rix} makes it easy to run pipelines in the right environment
({targets})
scripts/nix_expressions/nix_targets_pipeline
cd /absolute/path/to/pipeline/ && nix-shell default.nix --run "Rscript -e 'targets::tar_make()'"
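As an alternative to passing --run on the command line, the shell itself can start the pipeline. The sketch below is an assumption about how such a default.nix could be written, not the file from the repository: it adds {targets} to the environment and uses shellHook to call targets::tar_make() on entry.

```nix
let
  # Same pinned nixpkgs commit as in the earlier example
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};

  # {targets} must be part of the environment for tar_make() to work
  rpkgs = builtins.attrValues {
    inherit (pkgs.rPackages) targets;
  };

  system_packages = builtins.attrValues {
    inherit (pkgs) R;
  };
in
pkgs.mkShell {
  buildInputs = [ rpkgs system_packages ];
  # Run the pipeline as soon as the shell starts
  shellHook = "Rscript -e 'targets::tar_make()'";
}
```

With this variant, running nix-shell default.nix in the pipeline directory runs the pipeline first and then hands over an interactive shell.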
{targets} pipeline on GitHub Actions
rix::tar_nix_ga() to generate the required files
renv.lock file
{targets}: not only good for reproducibility, but also an amazing tool all around
{rixpress}
Contact me if you have questions:
Thank you!