Creates a Nix expression that reads in a file (or folder of data) using R.
Usage
rxp_r_file(
  name,
  path,
  read_function,
  nix_env = "default.nix",
  copy_data_folder = FALSE,
  env_var = NULL
)
Arguments
- name
Symbol, the name of the derivation.
- path
Character, the file path to include (e.g., "data/mtcars.shp") or a folder path (e.g., "data"). See details.
- read_function
Function, an R function to read the data, taking one argument (the path).
- nix_env
Character, path to the Nix environment file, default is "default.nix".
- copy_data_folder
Logical, if TRUE then the entire folder is copied recursively into the build sandbox.
- env_var
List, defaults to NULL. A named list of environment variables to set before running the R script, e.g., c(VAR = "hello"). Each entry will be added as an export statement in the build phase.
Details
There are three ways to read in data in a rixpress pipeline. The first is to point directly to a file, for example rxp_r_file(mtcars, path = "data/mtcars.csv", read_function = read.csv).
The second is to point to a file but also include the other files in the "data/" folder (the folder can be named something else). This is needed when data is split across several files, such as a shapefile, which typically also needs companion files such as .shx and .dbf files. For this, copy_data_folder must be set to TRUE.
The third is to point only to a folder and use a function that reads in all the data files it contains. For example, in rxp_r_file(many_csvs, path = "data", read_function = \(x)(readr::read_csv(list.files(x, full.names = TRUE, pattern = "\\.csv$")))) the provided anonymous function will read all the .csv files in the data/ folder.
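Because read_function receives only the path, a folder reader can be tried out on its own before it is passed to rxp_r_file. The following is a minimal sketch using only base R; the helper name read_csv_folder and the temporary files are hypothetical and serve only as an illustration, not as part of rixpress.

```r
# Hypothetical helper: read every .csv file in a folder and row-bind the results.
# It takes a single argument (the folder path), as read_function requires.
read_csv_folder <- function(path) {
  files <- list.files(path, pattern = "\\.csv$", full.names = TRUE)
  do.call(rbind, lapply(files, read.csv))
}

# Build a throwaway folder with two small CSV files to exercise the helper.
data_dir <- tempfile("rxp_demo_data_")
dir.create(data_dir)
write.csv(head(mtcars, 3), file.path(data_dir, "part1.csv"), row.names = FALSE)
write.csv(tail(mtcars, 2), file.path(data_dir, "part2.csv"), row.names = FALSE)

# Call the helper exactly as a pipeline step would: with the path only.
combined <- read_csv_folder(data_dir)
nrow(combined) # 5 rows: 3 from part1.csv plus 2 from part2.csv
```

The same function body could then be written inline as the anonymous function given to read_function, with path pointing at the real data folder.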
Examples
if (FALSE) { # \dontrun{
# Read a single pipe-delimited file
rxp_r_file(
  name = mtcars,
  path = "data/mtcars.csv",
  read_function = \(x) (read.csv(file = x, sep = "|"))
)
# Read all CSV files in a directory using a lambda function
rxp_r_file(
  name = mtcars_r,
  path = "data",
  read_function = \(x)
    (readr::read_delim(list.files(x, full.names = TRUE), delim = "|")),
  copy_data_folder = TRUE
)
} # }