This vignette details how to effectively use the {cmdstanr} package within a rixpress pipeline for Bayesian statistical modeling with Stan. For a general introduction to rixpress and its core concepts, please refer to vignette("a-intro-concepts") and vignette("b-core-functions").
{cmdstanr} provides a user-friendly R interface to cmdstan, Stan’s command-line interface. While powerful, its reliance on external processes and file system interactions requires careful handling within the hermetic build environment of rixpress.
Setting up the Environment
As with any rixpress pipeline, the first step is to define the execution environment using rix:
library(rix)

rix(
  date = "2025-04-29",
  r_pkgs = c("readr", "dplyr", "ggplot2"), # Add other R packages as needed
  system_pkgs = "cmdstan", # Crucial: include cmdstan as a system dependency
  git_pkgs = list(
    list(
      package_name = "cmdstanr",
      repo_url = "https://github.com/stan-dev/cmdstanr",
      commit = "79d37792d8e4ffcf3cf721b8d7ee4316a1234b0c" # Pin to a specific commit
    ),
    list(
      package_name = "rixpress",
      repo_url = "https://github.com/b-rodrigues/rixpress",
      commit = "HEAD" # Or pin to a specific commit
    )
  ),
  ide = "none", # Or your preferred IDE
  project_path = ".",
  overwrite = TRUE
)
Key points in this environment definition:
- cmdstan is included in system_pkgs. This makes the cmdstan executables available to the pipeline.
- {cmdstanr} is installed from its GitHub repository, as it’s not available on CRAN. Pinning to a specific commit is recommended for maximum reproducibility.
With the environment set up, we can define the pipeline.
Setting up the Pipeline
The Stan model code itself should reside in a .stan file. We use rxp_r_file() with readLines to bring its contents into the pipeline as a character vector, one element per line of the file.
rxp_r_file(
  bayesian_linear_regression_model,
  "model.stan",
  readLines
)
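The contents of model.stan are not shown in this vignette. As a rough sketch, assuming a standard linear regression whose parameters are named alpha, beta and sigma to match the simulated data, the file could look like the Stan program written out below. The snippet also shows what readLines() hands to the pipeline: a character vector with one element per line. The model code and the temporary file location are illustrative assumptions.

```r
# Hypothetical contents of model.stan (an assumption, not taken from the vignette)
stan_code <- c(
  "data {",
  "  int<lower=0> N;",
  "  vector[N] x;",
  "  vector[N] y;",
  "}",
  "parameters {",
  "  real alpha;",
  "  real beta;",
  "  real<lower=0> sigma;",
  "}",
  "model {",
  "  y ~ normal(alpha + beta * x, sigma);",
  "}"
)

# Write the file to a temporary location so the example is self-contained
stan_path <- file.path(tempdir(), "model.stan")
writeLines(stan_code, stan_path)

# This mirrors what rxp_r_file(..., readLines) reads from disk
model_lines <- readLines(stan_path)
is.character(model_lines)
length(model_lines)
```

Because the object is a character vector, it can be passed straight to writeLines() later on without any further processing.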
Next, we define parameters and simulate some data for our model.
rxp_r(
  parameters,
  list(
    N = 100,
    alpha = 2,
    beta = -0.5,
    sigma = 1.e-1
  )
),
rxp_r(
  x,
  rnorm(parameters$N, 0, 1)
),
rxp_r(
  y,
  rnorm(
    n = parameters$N,
    mean = parameters$alpha + parameters$beta * x,
    sd = parameters$sigma
  )
),
rxp_r(
  # Prepare the data list for cmdstanr
  inputs,
  list(N = parameters$N, x = x, y = y)
),
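Before running the full pipeline, it can help to check the simulation logic in a plain R session. The sketch below repeats the same data-generating steps outside rixpress and uses lm() as a quick, non-Bayesian check that the true alpha and beta can be recovered from the simulated data. The seed value and the use of lm() are choices made for this illustration only and are not part of the pipeline.

```r
set.seed(123)  # arbitrary seed, chosen only for this illustration

# Same true values as in the pipeline step `parameters`
parameters <- list(N = 100, alpha = 2, beta = -0.5, sigma = 1e-1)

# Same simulation logic as the pipeline steps `x` and `y`
x <- rnorm(parameters$N, 0, 1)
y <- rnorm(
  n = parameters$N,
  mean = parameters$alpha + parameters$beta * x,
  sd = parameters$sigma
)

# Data list in the shape the wrapper will later pass to Stan
inputs <- list(N = parameters$N, x = x, y = y)

# Ordinary least squares should land close to alpha = 2 and beta = -0.5
ols <- coef(lm(y ~ x))
ols
```

If the least-squares estimates are far from the true values, the problem lies in the simulation code rather than in the Stan model or the sandbox setup.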
Compiling and Sampling the Model
Interfacing with cmdstan from within rixpress requires a specific strategy due to the hermetic nature of Nix sandboxes. We’ll use a wrapper function to handle model compilation and sampling within a single rxp_r() step.
First, let’s define the wrapper function (e.g., in a functions.R file that we’ll include):
# In functions.R
cmdstan_model_wrapper <- function(
  stan_string, # The Stan model code as a character vector
  inputs,      # Data list for the model
  seed,        # Seed for reproducibility
  ...          # Additional arguments passed on to model$sample()
) {
  # Create a temporary .stan file within the sandbox
  stan_file <- tempfile(pattern = "model_", fileext = ".stan")
  writeLines(stan_string, con = stan_file)

  # Compile the Stan model
  # cmdstanr will find cmdstan via the CMDSTAN environment variable
  model <- cmdstanr::cmdstan_model(stan_file = stan_file)

  # Sample from the posterior; `...` is forwarded only here, because
  # sampling arguments are not valid compilation arguments
  fitted_model <- model$sample(
    data = inputs,
    seed = seed,
    ...
  )

  return(fitted_model)
}
Now, we use this wrapper in our pipeline:
# ... (continuation of pipeline_steps list)
rxp_r(
  model, # Target name for the fitted model object
  cmdstan_model_wrapper(
    stan_string = bayesian_linear_regression_model,
    inputs = inputs,
    seed = 22
  ),
  additional_files = "functions.R",
  serialize_function = "save_model",
  env_var = c("CMDSTAN" = "${defaultPkgs.cmdstan}/opt/cmdstan")
)
Explanation of the Wrapper Approach:
- stan_string = bayesian_linear_regression_model: We pass the model code (read by rxp_r_file) as a character vector to our wrapper.
- writeLines(stan_string, con = stan_file): Inside the wrapper, the Stan code is written to a temporary .stan file. This file exists within the sandbox of the current rxp_r step. This is crucial because cmdstan_model needs a file path. Attempting to pass the original model.stan path directly via additional_files to cmdstan_model can lead to permission or path issues when cmdstan tries to compile it from a different working directory or context.
- cmdstanr::cmdstan_model(): Compiles the model from the temporary stan_file.
- model$sample(): Samples from the compiled model.
- Single Step: Both compilation and sampling must happen within the same rxp_r step (and thus the same sandbox). This is because the model object returned by cmdstan_model() contains paths to the compiled executable. If these were separate steps, the paths from the compilation sandbox wouldn’t be valid in the sampling sandbox.
- env_var = c("CMDSTAN" = "${defaultPkgs.cmdstan}/opt/cmdstan"): This sets the CMDSTAN environment variable within the sandbox for this specific step. {cmdstanr} uses this variable to locate the cmdstan installation. The ${defaultPkgs.cmdstan} is a Nix interpolation that resolves to the path of the cmdstan package in the Nix store. If the environment providing cmdstan were named differently, for example cmdstan-env.nix, then you would need to use ${cmdstan_envPkgs.cmdstan}.
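A missing or wrong CMDSTAN path tends to surface only as an error deep inside the build, so a small guard at the top of the wrapper can make such failures easier to read. The helper below is a hypothetical addition: check_cmdstan_env() is not part of {cmdstanr} or rixpress, it uses only base R, and it assumes that the variable should point at an existing directory.

```r
# Hypothetical helper: fail early if CMDSTAN is unset or points nowhere
check_cmdstan_env <- function() {
  path <- Sys.getenv("CMDSTAN", unset = "")
  if (!nzchar(path)) {
    stop("CMDSTAN is not set; pass it via env_var in the rxp_r() step.")
  }
  if (!dir.exists(path)) {
    stop("CMDSTAN points to a directory that does not exist: ", path)
  }
  invisible(path)
}

# Usage: simulate a configured sandbox with a temporary directory
Sys.setenv(CMDSTAN = tempdir())
cmdstan_dir <- check_cmdstan_env()
```

Calling check_cmdstan_env() as the first line of cmdstan_model_wrapper() would stop the step before any compilation is attempted.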
Custom Serialization
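{cmdstanr} provides a specific method for saving fitted model objects, $save_object(), which reads the posterior draws and diagnostics from CmdStan’s CSV output into memory before writing the object to disk, so that all necessary components are preserved. We define a simple wrapper for this to use with rixpress.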
save_model <- function(fitted_model, path, ...) {
  fitted_model$save_object(file = path, ...)
}
Specifying serialize_function = "save_model" in the rxp_r() call tells rixpress to use this function instead of the default saveRDS(). The fitted model can then be read using rxp_read("model"), which will internally use readRDS().
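The contract for a custom serializer is simply a function that takes the object and a target path. The sketch below exercises save_model() with a stand-in object so that it runs without Stan. The mock_fit object and its contents are invented for illustration; in the pipeline, the object would be the fitted model returned by the wrapper.

```r
save_model <- function(fitted_model, path, ...) {
  fitted_model$save_object(file = path, ...)
}

# Stand-in for a fitted model: a list with a save_object() method that
# writes its draws to an RDS file (hypothetical, for illustration only)
mock_fit <- list(draws = c(alpha = 2, beta = -0.5))
mock_fit$save_object <- function(file, ...) {
  saveRDS(mock_fit$draws, file = file, ...)
}

# The serializer receives the object and the output path
out_path <- tempfile(fileext = ".rds")
save_model(mock_fit, out_path)

# Reading back with readRDS(), as rxp_read() is described as doing
restored <- readRDS(out_path)
```

The same round trip applies to the real fitted model, since $save_object() also produces a file that readRDS() can load.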
Summary
Using {cmdstanr} with rixpress involves these key considerations:
- Include cmdstan in system_pkgs and {cmdstanr} (from Git) in your rix environment definition.
- Read your .stan file into the pipeline using rxp_r_file().
- Implement a wrapper function that:
  - Takes the model code string and writes it to a temporary .stan file inside the wrapper.
  - Calls cmdstanr::cmdstan_model() on this temporary file.
  - Calls model$sample() to fit the model.
  - Returns the fitted model object.
- Perform model compilation and sampling within the same rxp_r() call using the wrapper.
- Set the CMDSTAN environment variable for the rxp_r() step that runs the wrapper, pointing to the Nix store path of cmdstan.
- Use {cmdstanr}’s $save_object() method via a custom serialize_function for robust saving of the fitted model.
This approach ensures that cmdstan can operate correctly within the isolated and reproducible environment provided by rixpress and Nix.