Bruno Rodrigues, head of statistics at the Ministry of Research and Higher Education in Luxembourg
Slides available online at https://is.gd/repro_basf
Code available at: https://github.com/b-rodrigues/repro_basf
What I mean by reproducibility
What Nix is, how it works, and how it complements Docker
What I will not discuss (but is very useful!):
We need to answer these questions
Here are the 4 main things influencing an analysis’ reproducibility:
Source: Peng, Roger D. 2011. “Reproducible Research in Computational Science.” Science 334 (6060): 1226–27
renv
renv is commonly used
renv::init() to generate a library snapshot as a renv.lock file
A renv.lock file looks like this:
{
  "R": {
    "Version": "4.2.2",
    "Repositories": [
      {
        "Name": "CRAN",
        "URL": "https://packagemanager.rstudio.com/all/latest"
      }
    ]
  },
  "Packages": {
    "MASS": {
      "Package": "MASS",
      "Version": "7.3-58.1",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "762e1804143a332333c054759f89a706",
      "Requirements": []
    },
    "Matrix": {
      "Package": "Matrix",
      "Version": "1.5-1",
      "Source": "Repository",
      "Repository": "CRAN",
      "Hash": "539dc0c0c05636812f1080f473d2c177",
      "Requirements": [
        "lattice"
      ]
***and many more packages***
renv.lock file
The renv.lock file is not just a record: renv::restore() reinstalls the package versions it lists
{renv} alone is not enough
Shortcomings:
but… :
renv.lock file is “free”
Dockerizing a project could look like this:
renv.lock)
{renv} can be tricky:
#> * installing *source* package ‘ModelMetrics’ ...
#> ** package ‘ModelMetrics’ successfully unpacked and MD5 sums checked
#> ** using staged installation
#> ** libs
#> /usr/bin/clang++ -std=gnu++11 -I"/opt/R-devel/lib64/R/include" -DNDEBUG -I'/home/docker/R/Rcpp/include' -I/usr/local/include -fpic -g -O2 -c RcppExports.cpp -o RcppExports.o
#> /usr/bin/clang++ -std=gnu++11 -I"/opt/R-devel/lib64/R/include" -DNDEBUG -I'/home/docker/R/Rcpp/include' -I/usr/local/include -fpic -g -O2 -c auc_.cpp -o auc_.o
#> auc_.cpp:2:10: fatal error: 'omp.h' file not found
#> #include <omp.h>
#>          ^~~~~~~
#> 1 error generated.
#> make: *** [/opt/R-devel/lib64/R/etc/Makeconf:178: auc_.o] Error 1
#> ERROR: compilation failed for package ‘ModelMetrics’
renv.lock file, the harder to restore!
Package manager: tool to install and manage packages
Package: any piece of software (not just R packages)
Examples of popular package managers:
let
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};
  system_packages = builtins.attrValues {
    inherit (pkgs) R ;
  };
in
  pkgs.mkShell {
    buildInputs = [ system_packages ];
    shellHook = "R --vanilla";
  }
There’s a lot to discuss here!
pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};
system_packages: a variable that lists software to install
system_packages = builtins.attrValues {
  inherit (pkgs) R ;
};
pkgs.mkShell {
  buildInputs = [ system_packages ];
  shellHook = "R --vanilla";
}
pkgs.mkShell: builds a shell that provides the software listed in system_packages (buildInputs)
and runs R --vanilla when started (shellHook)
{rix} will help!
{rix} (website) makes writing Nix expressions easy!
rix() function:
renv.lock files can also be used as starting points:
library(rix)
renv2nix(
  renv_lock_path = "path/to/original/renv_project/renv.lock",
  project_path = "path/to/rix_project",
  override_r_ver = "4.4.1" # <- optional
)
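To connect the hand-written expression shown earlier with R packages, here is a minimal sketch of an expression that also pulls R packages from the rPackages set of nixpkgs. The rpkgs variable name and the choice of dplyr and ggplot2 are illustrative assumptions. This is written by hand in the style of the earlier example, not the exact output of rix() or renv2nix().

```nix
let
  # Same pinned nixpkgs revision as in the earlier example
  pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};

  # R packages, taken from the rPackages set of that revision
  # (package choice is an assumption made for illustration)
  rpkgs = builtins.attrValues {
    inherit (pkgs.rPackages) dplyr ggplot2;
  };

  # System-level software: R itself
  system_packages = builtins.attrValues {
    inherit (pkgs) R;
  };
in
  pkgs.mkShell {
    # Everything listed here is available inside the shell
    buildInputs = [ rpkgs system_packages ];
    # Start R when the shell is entered
    shellHook = "R --vanilla";
  }
```

Because the nixpkgs revision is pinned, the versions of R, dplyr and ggplot2 are all determined by that single commit hash rather than by whatever is current at install time.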
scripts/nix_expressions/docker/
(you’ll find many other examples in the repository)
"dplyr@1.0.0")
{rix} makes it easy to run pipelines in the right environment
{targets})
scripts/nix_expressions/nix_targets_pipeline
cd /absolute/path/to/pipeline/ && nix-shell default.nix --run "Rscript -e 'targets::tar_make()'"
{targets} pipeline on GitHub Actions
rix::tar_nix_ga() to generate the required files
renv.lock file
{targets}: not only good for reproducibility, but also an amazing tool all around
{rixpress}
Contact me if you have questions:
Thank you!