cmest_rb implements the regression-based approach
of Valeri & VanderWeele (2013) and VanderWeele & Vansteelandt (2014) for causal mediation analysis
with a single exposure, a single outcome, and one or more mediators.
cmest_rb(
data = NULL,
outcome = NULL,
event = NULL,
exposure = NULL,
mediator = NULL,
EMint = NULL,
basec = NULL,
yreg = NULL,
mreg = NULL,
estimation = "imputation",
inference = "bootstrap",
astar = NULL,
a = NULL,
mval = NULL,
basecval = NULL,
yval = NULL,
nboot = 200,
boot.ci.type = "per",
casecontrol = FALSE,
yrare = NULL,
yprevalence = NULL,
multimp = FALSE,
args_mice = NULL
)
data: a data frame (or object coercible by as.data.frame to a data frame) containing the variables in the model.
outcome: variable name of the outcome.
event: (required when yreg is coxph, aft_exp, or aft_weibull) variable name of the event.
exposure: variable name of the exposure.
mediator: a vector of variable name(s) of mediator(s).
EMint: a logical value. TRUE indicates there is exposure-mediator interaction in yreg.
basec: a vector of variable names of confounders. See Details.
yreg: outcome regression model. See Details.
mreg: a list of mediator regression models following the order in mediator. See Details.
estimation: estimation method. paramfunc and imputation are implemented (the first 4 letters are sufficient). Default is imputation. See Details.
inference: inference method. delta and bootstrap are implemented (the first 4 letters are sufficient). Default is bootstrap. See Details.
astar: the control value of the exposure.
a: the treatment value of the exposure.
mval: a list of values at which each mediator is controlled to calculate the cde, following the order in mediator.
basecval: (used when estimation is paramfunc and EMint is TRUE) a list of values at which each confounder is conditioned on, following the order in basec. If NULL, the mean of each confounder is used.
yval: (used when the outcome is categorical) the level of the outcome at which causal effects on the risk ratio scale are estimated. If NULL, the last level is used.
nboot: (used when inference is bootstrap) the number of bootstrap resamples. Default is 200.
boot.ci.type: (used when inference is bootstrap) the type of bootstrap confidence interval. If per, percentile bootstrap confidence intervals are estimated; if bca, bias-corrected and accelerated (BCa) bootstrap confidence intervals are estimated. Default is per.
casecontrol: a logical value. TRUE indicates a case-control study in which the first level of the outcome is treated as the control and the second level of the outcome is treated as the case. Default is FALSE.
yrare: (used when casecontrol is TRUE) a logical value. TRUE indicates the case is rare.
yprevalence: (used when casecontrol is TRUE) the prevalence of the case.
multimp: a logical value. If TRUE, conduct multiple imputations using the mice function. Default is FALSE.
args_mice: a list of additional arguments passed to the mice function. See mice for details.
an object of class cmest.
an object of class cmest.
minimal number of significant digits. See print.default.
An object of classes cmest and cmest_rb is returned:
the function call,
the data frame,
a list of methods used which may include estimation, inference,
nboot, boot.ci.type, casecontrol, yrare, and yprevalence,
a list of variables used which may include outcome, event,
exposure, mediator, EMint, and basec,
reference values used which may include astar, a, mval,
basecval and yval,
a list of regressions input,
a list of regressions output. If multimp is TRUE,
reg.output contains regression models fitted by each imputed dataset,
a list of arguments used for multiple imputation,
point estimates of causal effects,
standard errors of causal effects,
lower limits of the 95% confidence intervals of causal effects,
upper limits of the 95% confidence intervals of causal effects,
p-values of causal effects,
...
Assumptions of the regression-based approach
There is no unmeasured exposure-outcome confounding: given basec,
exposure is independent of outcome.
There is no unmeasured mediator-outcome confounding: given exposure and
basec, mediator is independent of outcome.
There is no unmeasured exposure-mediator confounding: given basec,
exposure is independent of mediator.
There is no mediator-outcome confounder affected by the exposure: there is no
variable in basec affected by exposure.
Regression models
Each regression model in yreg and mreg can be specified by a fitted regression
object or the character name of a regression model.
The Character Name of a Regression Model:
linear: linear regression fitted by glm with family = gaussian()
logistic: logistic regression fitted by glm with family = binomial()
loglinear: loglinear regression fitted by glm with
family = poisson()
poisson: Poisson regression fitted by glm with
family = poisson()
quasipoisson: quasipoisson regression fitted by glm with
family = quasipoisson()
negbin: negative binomial regression fitted by glm.nb
multinomial: multinomial regression fitted by multinom
ordinal: ordered logistic regression fitted by polr
coxph: Cox proportional hazards model fitted by coxph
aft_exp: accelerated failure time model fitted by survreg
with dist = "exponential"
aft_weibull: accelerated failure time model fitted by survreg
with dist = "weibull"
coxph, aft_exp and aft_weibull are currently not implemented for mreg.
If EMint is TRUE and yreg is specified by the character name of a regression
model, yreg is fitted with the interaction between the exposure and each mediator.
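As a rough illustration of what "fitted with the interaction between the exposure and each mediator" means for the model formula, the sketch below builds an outcome formula by hand. The helper build_yreg_formula is hypothetical and is not part of CMAverse; the package's internal formula construction may differ in detail.

```r
# Hypothetical helper (NOT a CMAverse function): builds an outcome formula with
# main effects for the exposure, mediators and confounders, plus one
# exposure-by-mediator product term per mediator when EMint is TRUE.
build_yreg_formula <- function(outcome, exposure, mediator,
                               basec = NULL, EMint = FALSE) {
  rhs <- c(exposure, mediator)
  if (EMint) {
    # one "A:M" term for each mediator
    rhs <- c(rhs, paste(exposure, mediator, sep = ":"))
  }
  rhs <- c(rhs, basec)
  reformulate(rhs, response = outcome)
}

# Variable names follow the examples on this page
f <- build_yreg_formula("contY", "A", c("M1", "M2"),
                        basec = c("C1", "C2"), EMint = TRUE)
print(f)
```

When yreg is supplied as a fitted regression object instead, the same product terms have to be written into the model formula by the user.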
A Fitted Regression Object:
Regression objects can be fitted by lm, glm, glm.nb, gam, multinom, polr, coxph and survreg.
Regression objects fitted by coxph and survreg
are currently not supported for mreg.
yreg should regress outcome on exposure,
mediator and basec.
For p=1,...,k, mreg[p] should regress mediator[p] on
exposure and basec, where k is the number of mediators.
yreg cannot include mediator-mediator interactions when there
are multiple mediators (VanderWeele & Vansteelandt, 2014).
Estimation Methods
paramfunc: (only available for a single
mediator) closed-form parameter function estimation by Valeri & VanderWeele (2013).
Each causal effect is estimated by a closed-form formula of regression coefficients.
imputation: direct counterfactual imputation estimation by Imai et al. (2010).
Each causal effect is estimated by imputing counterfactuals directly.
To use paramfunc, yreg and mreg must be specified by the character name
of a regression model. yreg can be chosen from linear, logistic, loglinear,
poisson, quasipoisson, negbin, coxph, aft_exp and
aft_weibull. mreg can be chosen from linear, logistic and
multinomial.
To use paramfunc with yreg = "logistic" or yreg = "coxph", the outcome must
be rare.
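The two estimation methods can be compared on a toy problem. The sketch below uses base R only and simulated data (all variable names and coefficient values are made up for illustration). It assumes a linear outcome model with exposure-mediator interaction and a linear mediator model, applies the closed-form expressions of Valeri & VanderWeele (2013) for pnde and tnie, and then repeats the calculation by plugging counterfactual mediator values into the outcome model. As a simplification, the counterfactual mediator is replaced by its predicted mean rather than a random draw, which is harmless here only because both models are linear; this is not the package's implementation.

```r
set.seed(1)
n  <- 2000
C1 <- rnorm(n)
A  <- rbinom(n, 1, plogis(0.3 * C1))
M  <- 0.5 + 0.8 * A + 0.4 * C1 + rnorm(n)
Y  <- 1 + 0.6 * A + 0.7 * M + 0.3 * A * M + 0.5 * C1 + rnorm(n)
d  <- data.frame(Y, A, M, C1)

mfit <- lm(M ~ A + C1, data = d)           # mediator regression
yfit <- lm(Y ~ A + M + A:M + C1, data = d) # outcome regression with interaction
b  <- coef(mfit)
th <- coef(yfit)

astar <- 0           # control value of the exposure
a     <- 1           # treatment value of the exposure
cbar  <- mean(d$C1)  # confounder fixed at its mean

# Closed-form ("paramfunc"-style) effects, conditional on C1 = cbar
pnde_pf <- unname((th["A"] + th["A:M"] *
                     (b["(Intercept)"] + b["A"] * astar + b["C1"] * cbar)) *
                    (a - astar))
tnie_pf <- unname((th["M"] * b["A"] + th["A:M"] * b["A"] * a) * (a - astar))

# Imputation-style effects: predict the mediator under each exposure value,
# then average predicted outcomes over the sample
m_astar <- predict(mfit, newdata = transform(d, A = astar))
m_a     <- predict(mfit, newdata = transform(d, A = a))
ey <- function(a_val, m_val) {
  mean(predict(yfit, newdata = transform(d, A = a_val, M = m_val)))
}
pnde_imp <- ey(a, m_astar) - ey(astar, m_astar)
tnie_imp <- ey(a, m_a) - ey(a, m_astar)

round(c(pnde_pf = pnde_pf, pnde_imp = pnde_imp,
        tnie_pf = tnie_pf, tnie_imp = tnie_imp), 4)
```

With linear models the two routes agree up to floating-point error; with nonlinear models (for example a logistic yreg) they generally do not, which is one reason the closed-form route carries extra conditions such as the rare-outcome requirement.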
Inference Methods
delta: (only available when estimation = "paramfunc") inferences
about causal effects are obtained by the delta method.
bootstrap: inferences about causal effects are obtained by bootstrapping.
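To make the two inference routes concrete, here is a minimal base R sketch on simulated data for the simplest case: linear models without interaction, where the natural indirect effect for a one-unit exposure change is the product of the exposure coefficient in the mediator model and the mediator coefficient in the outcome model. The delta-method standard error below is the first-order (Sobel-type) approximation and treats the two coefficient estimates as uncorrelated, which is an assumption of this sketch; the bootstrap interval is a plain percentile interval. Neither is taken from the package source.

```r
set.seed(2)
n <- 500
A <- rbinom(n, 1, 0.5)
M <- 0.8 * A + rnorm(n)
Y <- 0.6 * A + 0.7 * M + rnorm(n)
d <- data.frame(Y, A, M)

# Indirect effect = (effect of A on M) * (effect of M on Y given A)
nie_hat <- function(dat) {
  b1  <- coef(lm(M ~ A, data = dat))["A"]
  th2 <- coef(lm(Y ~ A + M, data = dat))["M"]
  unname(b1 * th2)
}

# Delta method: Var(b1 * th2) ~ th2^2 * Var(b1) + b1^2 * Var(th2)
mfit <- lm(M ~ A, data = d)
yfit <- lm(Y ~ A + M, data = d)
b1   <- unname(coef(mfit)["A"])
th2  <- unname(coef(yfit)["M"])
se_delta <- sqrt(th2^2 * vcov(mfit)["A", "A"] + b1^2 * vcov(yfit)["M", "M"])

# Percentile bootstrap: resample rows, re-estimate, take empirical quantiles
nboot    <- 200
boot_est <- replicate(nboot, nie_hat(d[sample(nrow(d), replace = TRUE), ]))
ci_per   <- quantile(boot_est, c(0.025, 0.975))

nie_hat(d)
se_delta
ci_per
```

The bootstrap refits every regression on each resample, so its cost grows with nboot, while the delta method needs only the fitted coefficients and their covariance matrices.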
Estimated Causal Effects
For a continuous outcome, causal effects on the difference scale are estimated. For a categorical, count or survival outcome, causal effects on the ratio scale are estimated. Depending on the outcome type, the ratio can be risk ratio for a categorical outcome, rate ratio for a count outcome, hazard ratio for a survival outcome fitted by coxph, mean survival ratio for a survival outcome fitted by survreg, etc.
When EMint is FALSE, two-way decomposition (Valeri & VanderWeele, 2013) is conducted, i.e.,
for a continuous outcome: cde (controlled direct effect), pnde (pure natural
direct effect), tnde (total natural direct effect), pnie (pure natural indirect
effect), tnie (total natural indirect effect), te (total effect), and
pm (proportion mediated) are estimated.
for a categorical, count or survival outcome: Rcde (cde ratio), Rpnde (pnde ratio),
Rtnde (tnde ratio), Rpnie (pnie ratio), Rtnie (tnie ratio),
Rte (te ratio), and pm are estimated.
When EMint is TRUE, the four-way decomposition (VanderWeele, 2014) is additionally conducted, i.e.,
for a continuous outcome: intref
(reference interaction), intmed (mediated interaction),
cde(prop) (proportion cde), intref(prop) (proportion
intref), intmed(prop) (proportion intmed), pnie(prop)
(proportion pnie), int (proportion
attributable to interaction), and pe (proportion eliminated) are estimated.
for a categorical, count or survival outcome: ERcde (excess ratio due to cde), ERintref (excess
ratio due to intref), ERintmed (excess ratio due to intmed), ERpnie
(excess ratio due to pnie), ERcde(prop) (proportion ERcde),
ERintref(prop) (proportion ERintref), ERintmed(prop) (proportion ERintmed),
ERpnie(prop) (proportion ERpnie), int, and pe are estimated.
When EMint is TRUE and estimation is paramfunc,
causal effects conditional on basecval are estimated.
Otherwise, marginal causal effects are estimated.
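The relationships among these quantities for a continuous outcome can be checked with a few lines of arithmetic. The sketch below uses the linear-model expressions from VanderWeele (2014) with made-up coefficient values: th1, th2 and th3 stand for the exposure, mediator and interaction coefficients of a hypothetical yreg, and b0, b1 and b2 for the intercept, exposure and confounder coefficients of a hypothetical linear mreg. The definitions of pm, int and pe as ratios to te are written out as this sketch understands them, not copied from the package.

```r
# Hypothetical regression coefficients (illustration only)
th1 <- 0.6; th2 <- 0.7; th3 <- 0.3   # outcome model: A, M, A:M
b0  <- 0.5; b1  <- 0.8; b2  <- 0.4   # mediator model: intercept, A, C

astar <- 0    # control value of the exposure
a     <- 1    # treatment value of the exposure
m     <- 0    # mediator value at which the cde is evaluated (mval)
cval  <- 0.2  # confounder value conditioned on (basecval)

d_a      <- a - astar
em_astar <- b0 + b1 * astar + b2 * cval   # E[M | A = astar, C = cval]

# Four components
cde    <- (th1 + th3 * m) * d_a
intref <- th3 * (em_astar - m) * d_a
intmed <- th3 * b1 * d_a^2
pnie   <- (th2 * b1 + th3 * b1 * astar) * d_a

# Two-way quantities recovered from the four components
pnde <- cde + intref
tnie <- pnie + intmed
te   <- cde + intref + intmed + pnie

# Proportions
pm  <- tnie / te                # proportion mediated
int <- (intref + intmed) / te   # proportion attributable to interaction
pe  <- (te - cde) / te          # proportion eliminated

round(c(cde = cde, intref = intref, intmed = intmed, pnie = pnie,
        pnde = pnde, tnie = tnie, te = te, pm = pm, int = int, pe = pe), 4)
```

Setting th3 to 0 makes intref and intmed vanish, so pnde equals cde and tnie equals pnie, which is the two-way case.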
Valeri L, VanderWeele TJ (2013). Mediation analysis allowing for exposure-mediator interactions and causal interpretation: theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods. 18(2): 137 - 150.
VanderWeele TJ, Vansteelandt S (2014). Mediation analysis with multiple mediators. Epidemiologic Methods. 2(1): 95 - 115.
VanderWeele TJ (2014). A unification of mediation and interaction: a 4-way decomposition. Epidemiology. 25(5): 749 - 61.
Imai K, Keele L, Tingley D (2010). A general approach to causal mediation analysis. Psychological Methods. 15(4): 309 - 334.
Schomaker M, Heumann C (2018). Bootstrap inference when using multiple imputation. Statistics in Medicine. 37(14): 2252 - 2266.
Efron B (1987). Better Bootstrap Confidence Intervals. Journal of the American Statistical Association. 82(397): 171-185.
cmest_gformula, cmest_wb, cmest_iorw, cmest_msm, cmest_multistate, ggcmest, cmdag, cmsens.
if (FALSE) { # \dontrun{
library(CMAverse)
# single-mediator case without exposure-mediator interaction
exp1 <- cmest_rb(data = cma2020, outcome = "contY",
exposure = "A", mediator = "M1", basec = c("C1", "C2"),
EMint = FALSE, yreg = "linear", mreg = list("logistic"),
estimation = "paramfunc", inference = "delta", astar = 0, a = 1, mval = list(1))
summary(exp1)
# single-mediator case with exposure-mediator interaction
exp2 <- cmest_rb(data = cma2020, outcome = "contY",
exposure = "A", mediator = "M2", basec = c("C1", "C2"),
EMint = TRUE, yreg = "linear", mreg = list("multinomial"),
estimation = "paramfunc", inference = "delta", astar = 0, a = 1, mval = list("M2_0"))
summary(exp2)
# multiple-mediators case
exp3 <- cmest_rb(data = cma2020, outcome = "contY",
exposure = "A", mediator = c("M1", "M2"), EMint = TRUE, basec = c("C1", "C2"),
yreg = "linear", mreg = list("logistic", "multinomial"),
estimation = "imputation", inference = "bootstrap",
astar = 0, a = 1, mval = list(0, "M2_0"),
nboot = 100, boot.ci.type = "bca")
summary(exp3)
# specify regression models by fitted regression objects
exp4 <- cmest_rb(data = cma2020, outcome = "contY",
exposure = "A", mediator = c("M1", "M2"), EMint = TRUE, basec = c("C1", "C2"),
yreg = lm(contY ~ A + M1 + M2 + A*M1 + A*M2 + C1 + C2, data = cma2020),
mreg = list(glm(M1 ~ A + C1 + C2, data = cma2020, family = binomial()),
nnet::multinom(M2 ~ A + C1 + C2, data = cma2020)),
estimation = "imputation", inference = "bootstrap",
astar = 0, a = 1, mval = list(0, "M2_0"),
nboot = 100, boot.ci.type = "bca")
summary(exp4)
} # }