This function generates a pipeline.nix file based on a list of derivation objects. Each derivation defines a build step, and rxp_populate() chains these steps and handles the serialization and conversion of Python objects into R objects (or vice-versa). Derivations are created with rxp_r(), rxp_py() and so on. The pipeline can also be built immediately after being generated by setting build to TRUE. With the default, build = FALSE, the pipeline is only generated and can then be built using rxp_make() at a later stage.
Arguments
- derivs
A list of derivation objects, where each object is a list of five elements:
  - $name, character, name of the derivation
  - $snippet, character, the Nix code snippet to build this derivation
  - $type, character, can be R, Python or Quarto
  - $additional_files, character vector of paths to files to make available to the build sandbox
  - $nix_env, character, path to the Nix environment used to build this derivation
A single derivation is the output of the rxp_r(), rxp_qmd() or rxp_py() function.
- project_path
Path to root of project, defaults to ".".
- build
Logical, defaults to FALSE. Should the pipeline get built right after being generated? When FALSE, use rxp_make() to build the pipeline at a later stage.
- py_imports
Named character vector of Python import rewrites. Names are the base modules that rixpress auto-imports as "import <module>", and values are the desired import lines. For example: c(numpy = "import numpy as np", xgboost = "from xgboost import XGBClassifier"). Each entry is applied by replacing "import <module>" with the provided string across the generated _rixpress Python library files.
- ...
Further arguments passed down to methods. Use max-jobs and cores to set parallelism during build. See the documentation of rxp_make() for more details.
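As a rough illustration of the derivs structure described above, the sketch below builds a mock derivation by hand and checks that it carries the five documented elements. The helper has_deriv_fields() and all field values are hypothetical and not part of rixpress. In practice derivations come from rxp_r(), rxp_py() or rxp_qmd(), and the real objects may carry a class or other details not shown here.

```r
# Hypothetical helper, not part of rixpress: checks for the five documented fields.
deriv_fields <- c("name", "snippet", "type", "additional_files", "nix_env")

has_deriv_fields <- function(x) {
  is.list(x) && all(deriv_fields %in% names(x))
}

# Hand-built mock for illustration only; the values are placeholders.
mock_deriv <- list(
  name = "mtcars_am",
  snippet = "# Nix code snippet would go here",
  type = "R",
  additional_files = character(0),
  nix_env = "default.nix"
)

has_deriv_fields(mock_deriv)                 # TRUE
has_deriv_fields(list(name = "incomplete"))  # FALSE
```

A check of this kind could be run over a list of derivations before passing it to rxp_populate(), although whether rxp_populate() performs a similar validation itself is not stated here.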
Details
The generated pipeline.nix expression includes:
- the required imports of environments, typically default.nix files generated by the rix package;
- correct handling of interdependencies of the different derivations;
- serialization and deserialization of both R and Python objects, and conversion between them when objects are passed from one language to another;
- correct loading of R and Python packages, or extra functions needed to build specific targets.
Inline Python import adjustments
In some cases, due to the automatic handling of Python packages, users might want to change import statements. By default, if, say, pandas is needed to build a derivation, it will be imported with import pandas. However, Python programmers typically use import pandas as pd. You can either:
- use py_imports to rewrite these automatically during population, or
- use adjust_import() for advanced/manual control.
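To make the rewrite rule concrete, here is a minimal base R sketch of how a py_imports vector could be applied to a set of generated import lines. The function rewrite_imports() is hypothetical and is not the rixpress implementation. It assumes whole-line matching, whereas the actual matching rules used by rxp_populate() may differ.

```r
# Hypothetical helper, not the rixpress implementation: replaces each
# "import <module>" line with the string supplied for that module.
rewrite_imports <- function(lines, py_imports) {
  for (module in names(py_imports)) {
    target <- paste0("import ", module)
    # Whole-line match (an assumption), so "import pandas" leaves
    # a line such as "import pandas_gbq" untouched.
    lines[trimws(lines) == target] <- py_imports[[module]]
  }
  lines
}

generated <- c("import pandas", "import numpy", "import os")
py_imports <- c(pandas = "import pandas as pd", numpy = "import numpy as np")

rewrite_imports(generated, py_imports)
# "import pandas as pd" "import numpy as np" "import os"
```

Modules without an entry in py_imports, such as os here, keep their default import line.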
See also
Other pipeline functions: rxp_make()
Examples
if (FALSE) { # \dontrun{
  library(rixpress)

  # Create derivation objects
  d1 <- rxp_r(mtcars_am, filter(mtcars, am == 1))
  d2 <- rxp_r(mtcars_head, head(mtcars_am))
  list_derivs <- list(d1, d2)

  # Generate and build in one go
  rxp_populate(derivs = list_derivs, project_path = ".", build = TRUE)

  # Or only populate, with inline Python import adjustments
  rxp_populate(
    derivs = list_derivs,
    project_path = ".",
    build = FALSE,
    py_imports = c(pandas = "import pandas as pd")
  )

  # Then later, build the previously generated pipeline:
  rxp_make()
} # }