This vignette shows the general purpose and syntax of the
tern.rbmi
R package. The tern.rbmi
provides an
interface for Reference Based Multiple Imputation (rbmi
)
within the tern framework. For details of the rbmi
package,
please see Reference Based
Multiple Imputation (rbmi). The basic usage of rbmi
core functions is described in the quickstart
vignette:
tern.rbmi
The rbmi
package consists of 4 core functions (plus
several helper functions) which are typically called in sequence:
draws()
- fits the imputation models and stores their
parametersimpute()
- creates multiple imputed datasetsanalyse()
- analyses each of the multiple imputed
datasetspool()
- combines the analysis results across imputed
datasets into a single statisticWe use a publicly available example dataset from an antidepressant
clinical trial of an active drug versus placebo. The relevant endpoint
is the Hamilton 17-item depression rating scale (HAMD17
)
which was assessed at baseline and at weeks 1, 2, 4, and 6. Study drug
discontinuation occurred in 24% of subjects from the active drug and 26%
of subjects from placebo. All data after study drug discontinuation are
missing and there is a single additional intermittent missing
observation.
library(tern.rbmi)
#> Loading required package: rbmi
#> Loading required package: tern
#> Loading required package: rtables
#> Loading required package: formatters
#>
#> Attaching package: 'formatters'
#> The following object is masked from 'package:base':
#>
#> %||%
#> Loading required package: magrittr
#>
#> Attaching package: 'rtables'
#> The following object is masked from 'package:utils':
#>
#> str
#> Registered S3 method overwritten by 'tern':
#> method from
#> tidy.glm broom
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
data <- antidepressant_data
levels(data$THERAPY) <- c("PLACEBO", "DRUG") # This is important! The order defines the computation order later
missing_var <- "CHANGE"
vars <- list(
id = "PATIENT",
visit = "VISIT",
expand_vars = c("BASVAL", "THERAPY"),
group = "THERAPY"
)
covariates <- list(
draws = c("BASVAL*VISIT", "THERAPY*VISIT"),
analyse = c("BASVAL")
)
data <- data %>%
dplyr::select(PATIENT, THERAPY, VISIT, BASVAL, THERAPY, CHANGE) %>%
dplyr::mutate(dplyr::across(.cols = vars$id, ~ as.factor(.x))) %>%
dplyr::arrange(dplyr::across(.cols = c(vars$id, vars$visit)))
# Use expand_locf to add rows corresponding to visits with missing outcomes to the dataset
data_full <- do.call(
expand_locf,
args = list(
data = data,
vars = c(vars$expand_vars, vars$group),
group = vars$id,
order = c(vars$id, vars$visit)
) %>%
append(lapply(data[c(vars$id, vars$visit)], levels))
)
data_full <- data_full %>%
dplyr::group_by(dplyr::across(vars$id)) %>%
dplyr::mutate(!!vars$group := Filter(Negate(is.na), .data[[vars$group]])[1])
# there are duplicates - use first value
data_full <- data_full %>%
dplyr::group_by(dplyr::across(c(vars$id, vars$group, vars$visit))) %>%
dplyr::slice(1) %>%
dplyr::ungroup()
# need to have a single ID column
data_full <- data_full %>%
tidyr::unite("TMP_ID", dplyr::all_of(vars$id), sep = "_#_", remove = FALSE) %>%
dplyr::mutate(TMP_ID = as.factor(TMP_ID))
Set the imputation strategy to MAR for each patient with at least one missing observation.
The rbmi::draws()
function fits the imputation models
and stores the corresponding parameter estimates or Bayesian posterior
parameter draws. The three main inputs to the rbmi::draws()
function are:
Define the names of key variables in our dataset and the covariates
included in the imputation model using rbmi::set_vars()
.
Note that the covariates argument can also include interaction
terms.
debug_mode <- FALSE
draws_vars <- rbmi::set_vars(
outcome = missing_var,
visit = vars$visit,
group = vars$group,
covariates = covariates$draws
)
draws_vars$subjid <- "TMP_ID"
Define which imputation method to use, then create samples for the
imputation parameters by running the draws()
function.
draws_method <- method_bayes()
draws_obj <- rbmi::draws(
data = data_full,
data_ice = data_ice,
vars = draws_vars,
method = draws_method
)
#>
#> SAMPLING FOR MODEL 'rbmi_mmrm' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 0.000251 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.51 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 1200 [ 0%] (Warmup)
#> Chain 1: Iteration: 120 / 1200 [ 10%] (Warmup)
#> Chain 1: Iteration: 201 / 1200 [ 16%] (Sampling)
#> Chain 1: Iteration: 320 / 1200 [ 26%] (Sampling)
#> Chain 1: Iteration: 440 / 1200 [ 36%] (Sampling)
#> Chain 1: Iteration: 560 / 1200 [ 46%] (Sampling)
#> Chain 1: Iteration: 680 / 1200 [ 56%] (Sampling)
#> Chain 1: Iteration: 800 / 1200 [ 66%] (Sampling)
#> Chain 1: Iteration: 920 / 1200 [ 76%] (Sampling)
#> Chain 1: Iteration: 1040 / 1200 [ 86%] (Sampling)
#> Chain 1: Iteration: 1160 / 1200 [ 96%] (Sampling)
#> Chain 1: Iteration: 1200 / 1200 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0.496 seconds (Warm-up)
#> Chain 1: 1.56 seconds (Sampling)
#> Chain 1: 2.056 seconds (Total)
#> Chain 1:
#> Warning in fit_mcmc(designmat = model_df_scaled[, -1, drop = FALSE], outcome = model_df_scaled[, : The largest R-hat is 1.27, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
The next step is to use the parameters from the imputation model to
generate the imputed datasets. This is done via the
rbmi::impute()
function. The function only has two key
inputs: the imputation model output from rbmi::draws()
and
the reference groups relevant to reference-based imputation methods. Its
usage is thus:
The next step is to run the analysis model on each imputed dataset.
This is done by defining an analysis function and then calling
rbmi::analyse()
to apply this function to each imputed
dataset.
# Define analysis model
analyse_fun <- ancova
ref_levels <- levels(impute_obj$data$group[[1]])
names(ref_levels) <- c("ref", "alt")
analyse_obj <- rbmi::analyse(
imputations = impute_obj,
fun = analyse_fun,
vars = rbmi::set_vars(
subjid = "TMP_ID",
outcome = missing_var,
visit = vars$visit,
group = vars$group,
covariates = covariates$analyse
)
)
The rbmi::pool()
function can be used to summarize the
analysis results across multiple imputed datasets to provide an overall
statistic with a standard error, confidence intervals and a p-value for
the hypothesis test of the null hypothesis that the effect is equal to
0.
Finally create output with rtables
and tern
packages
library(broom)
df <- tidy(pool_obj)
df
#> group est se_est lower_cl_est upper_cl_est est_contr se_contr
#> 1 ref -1.615820 0.4862316 -2.575771 -0.6558685 NA NA
#> 2 alt -1.707626 0.4749573 -2.645319 -0.7699335 -0.09180645 0.6826279
#> 3 ref -4.239118 0.6577632 -5.538298 -2.9399373 NA NA
#> 4 alt -2.824275 0.6449649 -4.098307 -1.5502435 1.41484246 0.9289560
#> 5 ref -6.396271 0.7069625 -7.793378 -4.9991638 NA NA
#> 6 alt -4.188299 0.7232842 -5.620524 -2.7560738 2.20797170 1.0043783
#> 7 ref -7.671683 0.7876724 -9.229894 -6.1134722 NA NA
#> 8 alt -4.903380 0.8208205 -6.532983 -3.2737772 2.76830290 1.1476188
#> lower_cl_contr upper_cl_contr p_value relative_reduc visit conf_level
#> 1 NA NA NA NA 4 0.95
#> 2 -1.4394968 1.255884 0.89317724 0.05681725 4 0.95
#> 3 NA NA NA NA 5 0.95
#> 4 -0.4202815 3.249966 0.12979562 -0.33375871 5 0.95
#> 5 NA NA NA NA 6 0.95
#> 6 0.2222466 4.193697 0.02956838 -0.34519672 6 0.95
#> 7 NA NA NA NA 7 0.95
#> 8 0.4937275 5.042878 0.01752783 -0.36084689 7 0.95
Final product, reshape rbmi
final results to nicely
formatted rtable
object.
basic_table() %>%
split_cols_by("group", ref_group = levels(df$group)[1]) %>%
split_rows_by("visit", split_label = "Visit", label_pos = "topleft") %>%
summarize_rbmi() %>%
build_table(df)
#> Visit ref alt
#> —————————————————————————————————————————————————————————————————————————
#> 4
#> Adjusted Mean (SE) -1.616 (0.486) -1.708 (0.475)
#> 95% CI (-2.576, -0.656) (-2.645, -0.770)
#> Difference in Adjusted Means (SE) -0.092 (0.683)
#> 95% CI (-1.439, 1.256)
#> Relative Reduction (%) 5.7%
#> p-value (RBMI) 0.8932
#> 5
#> Adjusted Mean (SE) -4.239 (0.658) -2.824 (0.645)
#> 95% CI (-5.538, -2.940) (-4.098, -1.550)
#> Difference in Adjusted Means (SE) 1.415 (0.929)
#> 95% CI (-0.420, 3.250)
#> Relative Reduction (%) -33.4%
#> p-value (RBMI) 0.1298
#> 6
#> Adjusted Mean (SE) -6.396 (0.707) -4.188 (0.723)
#> 95% CI (-7.793, -4.999) (-5.621, -2.756)
#> Difference in Adjusted Means (SE) 2.208 (1.004)
#> 95% CI (0.222, 4.194)
#> Relative Reduction (%) -34.5%
#> p-value (RBMI) 0.0296
#> 7
#> Adjusted Mean (SE) -7.672 (0.788) -4.903 (0.821)
#> 95% CI (-9.230, -6.113) (-6.533, -3.274)
#> Difference in Adjusted Means (SE) 2.768 (1.148)
#> 95% CI (0.494, 5.043)
#> Relative Reduction (%) -36.1%
#> p-value (RBMI) 0.0175