
This function generates a pipeline.nix file from a list of derivation objects. Each derivation defines a build step; rxp_populate() chains these steps and handles the serialization of objects and their conversion between R and Python when they are passed from one language to the other. Derivations are created with rxp_r(), rxp_py() and so on. By default (build = FALSE), the pipeline is only generated, not built; it can then be built at a later stage with rxp_make(). To build the pipeline immediately after it is generated, set build to TRUE.

Usage

rxp_populate(derivs, project_path = ".", build = FALSE, py_imports = NULL, ...)

Arguments

derivs

A list of derivation objects, where each object is a list of five elements:

  • $name, character, name of the derivation;

  • $snippet, character, the Nix code snippet to build this derivation;

  • $type, character, can be R, Python or Quarto;

  • $additional_files, character vector of paths to files to make available to the build sandbox;

  • $nix_env, character, path to the Nix environment used to build this derivation.

A single derivation is the output of a function such as rxp_r(), rxp_qmd() or rxp_py().

project_path

Path to root of project, defaults to ".".

build

Logical, defaults to FALSE. Should the pipeline get built right after being generated? When FALSE, use rxp_make() to build the pipeline at a later stage.

py_imports

Named character vector of Python import rewrites. Names are the base modules that rixpress auto-imports as "import <module>", and values are the desired import lines. For example: c(numpy = "import numpy as np", xgboost = "from xgboost import XGBClassifier"). Each entry is applied by replacing "import <module>" with the provided string across the generated _rixpress Python library files.

...

Further arguments passed down to methods. Use max-jobs and cores to set parallelism during build. See the documentation of rxp_make() for more details.
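To make the shape of the derivs argument concrete, here is a minimal base R sketch of a list with the five documented elements, plus a small check that an object carries exactly those fields. The object mock_deriv and the helper has_deriv_fields() are hypothetical and not part of rixpress, and every field value is a placeholder; real derivations should be created with rxp_r(), rxp_py() or rxp_qmd(), and may carry more than what is shown here.

```r
# Hypothetical stand-in for the output of rxp_r(); all values are placeholders.
mock_deriv <- list(
  name = "mtcars_am",
  snippet = "<nix code snippet>",
  type = "R",
  additional_files = character(0),
  nix_env = "default.nix"
)

# The five element names listed in the documentation of derivs.
expected_fields <- c("name", "snippet", "type", "additional_files", "nix_env")

# Hypothetical helper: TRUE when x is a list with exactly the five documented fields.
has_deriv_fields <- function(x) {
  is.list(x) && setequal(names(x), expected_fields)
}

# derivs is a plain list of such objects, so each one can be checked in turn.
derivs <- list(mock_deriv)
all_ok <- all(vapply(derivs, has_deriv_fields, logical(1)))
print(all_ok)
```

A check like this is only useful for exploring the structure interactively; it says nothing about whether the Nix snippet itself is valid.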

Value

Nothing. The function is called for its side effect: it writes a file called pipeline.nix containing the Nix code to build the pipeline.

Details

The generated pipeline.nix expression includes:

  • the required imports of environments, typically default.nix files generated by the rix package;

  • correct handling of interdependencies of the different derivations;

  • serialization and deserialization of both R and Python objects, and conversion between them when objects are passed from one language to another;

  • correct loading of R and Python packages, or extra functions needed to build specific targets.

Inline Python import adjustments

Because Python packages are handled automatically, users may sometimes want to change the generated import statements. By default, if, say, pandas is needed to build a derivation, it is imported with import pandas. However, Python programmers typically use import pandas as pd. You can either:

  • use py_imports to rewrite these automatically during population, or

  • use adjust_import() for advanced/manual control.
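The rewrite rule behind py_imports can be sketched in a few lines of base R. This is only an illustration of the documented behavior (replacing the default import of a module with the user-supplied line), not the code rixpress runs: rewrite_imports() is a hypothetical helper, the input lines are invented, and matching whole lines is an assumption made for this sketch.

```r
# Hypothetical helper, not part of rixpress: applies a py_imports-style
# named character vector to a vector of Python source lines.
rewrite_imports <- function(lines, py_imports) {
  for (module in names(py_imports)) {
    # Assumption: only lines that read exactly "import <module>" are rewritten.
    is_default <- trimws(lines) == paste("import", module)
    lines[is_default] <- py_imports[[module]]
  }
  lines
}

# Invented example of auto-generated import lines.
generated <- c("import pandas", "import numpy", "import pickle")

rewritten <- rewrite_imports(
  generated,
  c(pandas = "import pandas as pd", numpy = "import numpy as np")
)
print(rewritten)
```

Modules without an entry in the named vector, such as pickle here, keep their default import line.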

See also

Other pipeline functions: rxp_make()

Examples

if (FALSE) { # \dontrun{
# Create derivation objects
d1 <- rxp_r(mtcars_am, filter(mtcars, am == 1))
d2 <- rxp_r(mtcars_head, head(mtcars_am))
list_derivs <- list(d1, d2)

# Generate and build in one go
rxp_populate(derivs = list_derivs, project_path = ".", build = TRUE)

# Or only populate, with inline Python import adjustments
rxp_populate(
  derivs = list_derivs,
  project_path = ".",
  build = FALSE,
  py_imports = c(pandas = "import pandas as pd")
)
# Then later:
rxp_make()
} # }