This document describes how the six causal mediation analysis approaches including the regression-based approach by Valeri et al. (2013) and VanderWeele et al. (2014), the weighting-based approach by VanderWeele et al. (2014), the inverse odd-ratio weighting approach by Tchetgen Tchetgen (2013), the natural effect model by Vansteelandt et al. (2012), the marginal structural model by VanderWeele et al. (2017), and the gg-formula approach by Robins (1986) are implemented by the CMAverse package. See publications of these approaches for methodological details.

CMAverse currently supports a single exposure, multiple sequential mediators and a single outcome. When multiple mediators are of interest, CMAverse estimates the joint mediated effect through the set of mediators. CMAverse also supports time varying confounders preceding the mediators.

We categorize the causal mediation analysis approaches based on whether the approach can deal with mediator-outcome confounders affected by the exposure. Among the six approaches, only The marginal structural model and the gg-formula approach are able to deal with mediator-outcome confounders affected by the exposure.

In this document, the outcome and the exposure are denoted as YY and AA respectively. The set of exposure-mediator confounders, exposure-outcome confounders and mediator-outcome confounders not affected by the exposure is denoted as CC. The set of mediators is denoted as MM and M=(M1,...,Mk)M=(M_1,...,M_k) follows the temporal order. The set of mediator-outcome confounders affected by the exposure is denoted as LL and L=(L1,...,Ls)L=(L_1,...,L_s) follows the temporal order.

Since weights calculated from noncategorical variables are unstable, which hurts the performance of effect estimation and inference, weighted approaches can be implemented only for categorical exposure and mediator(s).

CMAverse Function Parameters

The CMAverse package provides a unified interface for conducting causal mediation analysis through several main functions. Users specify the statistical models and estimation approaches using the following key parameters:

Main Parameters

yreg - Specifies the regression model for the outcome (Y). Options include: - "linear" - Linear regression - "logistic" - Logistic regression for binary outcomes
- "loglinear" - Log-linear regression for count data - "poisson" - Poisson regression for count outcomes - "quasipoisson" - Quasi-Poisson regression - "negbin" - Negative binomial regression - "coxph" - Cox proportional hazards model for survival outcomes - "aft_exp" - Accelerated failure time model with exponential distribution - "aft_weibull" - Accelerated failure time model with Weibull distribution

mreg - Specifies the regression model(s) for the mediator(s) (M). Can be a list for multiple mediators. Options include: - "linear" - Linear regression - "logistic" - Logistic regression for binary mediators - "multinomial" - Multinomial regression for categorical mediators with more than 2 levels

dreg - Specifies the regression model for the confounder model in weighted approaches. Used in inverse probability weighting to estimate treatment weights. Options include: - "linear" - Linear regression - "logistic" - Logistic regression for binary treatment/exposure

ereg - Specifies the regression model for the outcome propensity in certain sensitivity analyses and weighted approaches.

a - The treatment/exposure level of interest (active treatment value)

astar - The reference treatment/exposure level (control value)

m - The value at which to control the mediator (for controlled direct effects)

basecval - A named list specifying baseline confounder values for conditional inference in closed-form parameter function estimation

data - The dataset containing all variables

Additional Common Parameters

nboot - Number of bootstrap samples for confidence interval estimation

CI - Confidence level (e.g., 0.95 for 95% CI)

boot.ci.type - Type of bootstrap confidence intervals (“perc”, “normal”, “basic”, “bca”)

exposure - Name of the exposure/treatment variable

mediators - Name(s) of the mediator variable(s)

outcome - Name of the outcome variable

covariates - Names of baseline confounders (C variables)

postexposure_confounder - Names of post-exposure confounders affected by treatment (L variables)

Function-Specific Parameters

For time-to-event analyses with competing risks: - pathcomprisk() - Implements the mediational g-formula for time-varying mediators and competing risks on the hazard scale - D - List of competing risk/event models
- yreg - The outcome model (typically Aalen or Cox model) - time_points - Time points at which to evaluate mediators - peryr - Person-years divisor for rates - refit - Logical indicating whether to use exact match bootstrapping mode

For causal mediation with multiple mediators: - cmest() - Main function for estimation with multiple sequential mediators - cmsens() - Sensitivity analysis for unmeasured confounding

Users select combinations of these parameters based on their data structure and research question to implement the desired causal mediation analysis approach.

Basic Usage Example

Here is a simple example of conducting causal mediation analysis using CMAverse:

library(CMAverse)

# Example: Regression-based mediation analysis with a continuous outcome
# Suppose we have a dataset 'mydata' with:
# - Exposure: A
# - Mediator: M  
# - Outcome: Y
# - Confounders: C1, C2

# Conduct mediation analysis with linear models
result <- cmest(
  data = mydata,
  model = "rb",                          # regression-based approach
  outcome = "Y",
  exposure = "A",
  mediator = "M",
  basec = c("C1", "C2"),                 # baseline confounders
  yreg = "linear",                       # outcome model: linear regression
  mreg = "linear",                       # mediator model: linear regression
  a = 1,                                 # treatment value (e.g., exposure = 1)
  astar = 0,                             # control value (e.g., exposure = 0)
  mval = 0,                              # mediator value for CDE
  estimation = "paramfunc",              # closed-form parameter functions
  inference = "bootstrap",               # bootstrap for confidence intervals
  nboot = 1000                           # 1000 bootstrap samples
)

# View results
summary(result)

# For time-to-event outcomes with competing risks:
# result <- pathcomprisk(
#   D = list_of_competing_risk_models,
#   mreg = list_of_mediator_models,
#   mvar = c("M1", "M2"),
#   yreg = aalen_cox_model,
#   avar = "A",
#   a = 1,
#   astar = 0,
#   data = mydata,
#   nboot = 1000
# )

No Confounders Affected by the Exposure

DAG

Estimands

For a continuous outcome, causal effects are estimated on the difference scale (summarized in table 1). For a categorical, count, or survival outcome, causal effects are estimated on the ratio scale (summarized in table 2). See Valeri et al. (2013) and VanderWeele (2015) for details about these effects.

Table 1: Causal Effects on the Difference Scale
Full Name Abbreviation Formula
Controlled Direct Effect CDECDE E[YamYa*m]E[Y_{am}-Y_{a^*m}]
Pure Natural Direct Effect PNDEPNDE E[YaMa*Ya*Ma*]E[Y_{aM_a^*}-Y_{a^*M_a^*}]
Total Natural Direct Effect TNDETNDE E[YaMaYa*Ma]E[Y_{aM_a}-Y_{a^*M_a}]
Pure Natural Indirect Effect PNIEPNIE E[Ya*MaYa*Ma*]E[Y_{a^*M_a}-Y_{a^*M_a^*}]
Total Natural Indirect Effect TNIETNIE E[YaMaYaMa*]E[Y_{aM_a}-Y_{aM_a^*}]
Total Effect TETE PNDE+TNIEPNDE+TNIE or TNDE+PNIETNDE+PNIE
Reference Interaction INTrefINT_{ref} PNDECDEPNDE-CDE
Mediated Interaction INTmedINT_{med} TNIEPNIETNIE-PNIE
Proportion CDECDE propCDEprop^{CDE} CDE/TECDE/TE
Proportion INTrefINT_{ref} propINTrefprop^{INT_{ref}} INTref/TEINT_{ref}/TE
Proportion INTmedINT_{med} propINTmedprop^{INT_{med}} INTmed/TEINT_{med}/TE
Proportion PNIEPNIE propPNIEprop^{PNIE} PNIE/TEPNIE/TE
Proportion Mediated PMPM TNIE/TETNIE/TE
Proportion Attributable to Interaction INTINT (INTref+INTmed)/TE(INT_{ref}+INT_{med})/TE
Proportion Eliminated PEPE (INTref+INTmed+PNIE)/TE(INT_{ref}+INT_{med}+PNIE)/TE
Residual Disparity RDRD $P(S_g&gt;s&amp;#124;A=a,C) - P(S_g&gt;s&amp;#124;A=a^*,C)$
Shifting Distribution Effect SDSD $P(S_{g'}&gt;s&amp;#124;A=a,C) - P(S_g&gt;s&amp;#124;A=a,C)$
Note:
aa and a*a^* are the active and control values for AA respectively. mm is the value at which MM is controlled. MaM_a denotes the counterfactual value of MM that would have been observed had AA been set to be aa. YamY_{am} denotes the counterfactual value of YY that would have been observed had AA been set to be aa, and MM to be mm. YaMa*Y_{aMa*} denotes the counterfactual value of YY that would have been observed had AA been set to be aa, and MM to be the counterfactual value Ma*M_{a*}.
Table 2: Causal Effects on the Ratio Scale
Full Name Abbreviation Formula
Controlled Direct Effect RCDER^{CDE} E[Yam]/E[Ya*m]E[Y_{am}]/E[Y_{a^*m}]
Pure Natural Direct Effect RPNDER^{PNDE} E[YaMa*]/E[Ya*Ma*]E[Y_{aM_a^*}]/E[Y_{a^*M_a^*}]
Total Natural Direct Effect RTNDER^{TNDE} E[YaMa]/E[Ya*Ma]E[Y_{aM_a}]/E[Y_{a^*M_a}]
Pure Natural Indirect Effect RPNIER^{PNIE} E[Ya*Ma]/E[Ya*Ma*]E[Y_{a^*M_a}]/E[Y_{a^*M_a^*}]
Total Natural Indirect Effect RTNIER^{TNIE} E[YaMa]/E[YaMa*]E[Y_{aM_a}]/E[Y_{aM_a^*}]
Total Effect RTER^{TE} RPNDE×RTNIER^{PNDE}\times R^{TNIE} or RTNDE×RPNIER^{TNDE}\times R^{PNIE}
Excess Ratio due to Controlled Direct Effect ERCDEER^{CDE} (E[YamYa*m])/E[Ya*Ma*](E[Y_{am}-Y_{a^*m}])/E[Y_{a^*M_a^*}]
Excess Ratio due to Reference Interaction ERINTrefER^{INT_{ref}} RPNDE1ERCDER^{PNDE}-1-ER^{CDE}
Excess Ratio due to Mediated Interaction ERINTmedER^{INT_{med}} RTNIE*RPNDERPNDERPNIE+1R^{TNIE}*R^{PNDE}-R^{PNDE}-R^{PNIE}+1
Excess Ratio due to Pure Natural Indirect Effect ERPNIEER^{PNIE} RPNIE1R^{PNIE}-1
Proportion ERCDEER^{CDE} propERCDEprop^{ER^{CDE}} ERCDE/(RTE1)ER^{CDE}/(R^{TE}-1)
Proportion ERINTrefER^{INT_{ref}} propERINTrefprop^{ER^{INT_{ref}}} ERINTref/(RTE1)ER^{INT_{ref}}/(R^{TE}-1)
Proportion ERINTmedER^{INT_{med}} propERINTmedprop^{ER^{INT_{med}}} ERINTmed/(RTE1)ER^{INT_{med}}/(R^{TE}-1)
Proportion ERPNIEER^{PNIE} propERPNIEprop^{ER^{PNIE}} ERPNIE/(RTE1)ER^{PNIE}/(R^{TE}-1)
Proportion Mediated PMPM (RPNDE*(RTNIE1))/(RTE1)(R^{PNDE}*(R^{TNIE}-1))/(R^{TE}-1)
Proportion Attributable to Interaction INTINT (ERINTref+ERINTmed)/(RTE1)(ER^{INT_{ref}}+ER^{INT_{med}})/(R^{TE}-1)
Proportion Eliminated PEPE (ERINTref+ERINTmed+ERPNIE)/(RTE1)(ER^{INT_{ref}}+ER^{INT_{med}}+ER^{PNIE})/(R^{TE}-1)
Note:
aa and a*a^* are the active and control values for AA respectively. mm is the value at which MM is controlled. MaM_a denotes the counterfactual value of MM that would have been observed had AA been set to be aa. YamY_{am} denotes the counterfactual value of YY that would have been observed had AA been set to be aa, and MM to be mm. YaMa*Y_{aMa*} denotes the counterfactual value of YY that would have been observed had AA been set to be aa, and MM to be the counterfactual value Ma*M_{a*}. If YY is categorical, E[Y]E[Y] represents the probability of Y=yY=y where yy is a pre-specified value of YY.

The Regression-based Approach

With the regression-based approach, all causal effects are estimated through either closed-form parameter function estimation or direct counterfactual imputation estimation. Standard errors of causal effects are estimated through either the delta method or bootstrapping.

Closed-form Parameter Function Estimation

Closed-form parameter function estimation is available when there is only a single mediator, i.e., M=M1M=M_1. Also, yreg must be chosen from linear, logistic, loglinear, poisson, quasipoisson, negbin, coxph, aft_exp and aft_weibull. mreg must be chosen from linear, logistic and multinomial. To use yreg = "logistic" and yreg = "coxph" in closed-form parameter function estimation, the outcome must be rare. Additionally, the causal effects estimated through closed-form parameter function estimation are conditional on the value of CC specified by the basecval argument. Closed-form parameter functions are summarized below.

Linear yreg, Linear mreg and Noncategorical Exposure

If the exposure is not categorical, yreg="linear" and mreg=list("linear"), CMAverse estimates the causal effects by the following steps:

  1. Fit a linear regression model for the mediator: E[M|A,C]=β0+β1A+β2CE[M|A,C]=\beta_0+\beta_1A+\beta_2'C

  2. Fit a linear regression model for the outcome: E[Y|A,M,C]=θ0+θ1A+θ2M+θ3AM+θ4CE[Y|A,M,C]=\theta_0+\theta_1A+\theta_2M+\theta_3AM+\theta_4'C

  3. Estimate CDECDE, PNDEPNDE, TNDETNDE, PNIEPNIE and TNIETNIE by the following parameter functions:

    • CDE=(θ1+θ3m)(aa*)CDE=(\theta_1+\theta_3m)(a-a^*)
    • PNDE={θ1+θ3(β0+β1a*+β2c)}(aa*)PNDE=\{\theta_1+\theta_3(\beta_0+\beta_1a^*+\beta_2'c)\}(a-a^*)
    • TNDE={θ1+θ3(β0+β1a+β2c)}(aa*)TNDE=\{\theta_1+\theta_3(\beta_0+\beta_1a+\beta_2'c)\}(a-a^*)
    • PNIE=(θ2β1+θ3β1a*)(aa*)PNIE=(\theta_2\beta_1+\theta_3\beta_1a^*)(a-a^*)
    • TNIE=(θ2β1+θ3β1a)(aa*)TNIE=(\theta_2\beta_1+\theta_3\beta_1a)(a-a^*)
  4. Calculate other effects using formulas in table 1.

Linear yreg, Linear mreg and Categorical Exposure

If the exposure is categorical, yreg="linear" and mreg=list("linear"), CMAverse estimates the causal effects by the following steps:

  1. Fit a linear regression model for the mediator: E[M|A,C]=β0+h=1Hβ1hI{A=h}+β2CE[M|A,C]=\beta_0+\sum_{h=1}^H\beta_{1h}I\{A=h\}+\beta_2'C

  2. Fit a linear regression model for the outcome: E[Y|A,M,C]=θ0+h=1Hθ1hI{A=h}+θ2M+h=1Hθ3hI{A=h}M+θ4CE[Y|A,M,C]=\theta_0+\sum_{h=1}^H\theta_{1h}I\{A=h\}+\theta_2M+\sum_{h=1}^H\theta_{3h}I\{A=h\}M+\theta_4'C

  3. Estimate CDECDE, PNDEPNDE, TNDETNDE, PNIEPNIE and TNIETNIE by the following parameter functions:

    • CDE=h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})mCDE=\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})m
    • PNDE=h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})(β0+h=1Hβ1hI{a*=h}+β2c)PNDE=\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)
    • TNDE=h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})(β0+h=1Hβ1hI{a=h}+β2c)TNDE=\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)
    • PNIE=(θ2+h=1Hθ3hI{a*=h})(h=1Hβ1hI{a=h}h=1Hβ1hI{a*=h})PNIE=(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a^*=h\})(\sum_{h=1}^H\beta_{1h}I\{a=h\}-\sum_{h=1}^H\beta_{1h}I\{a^*=h\})
    • TNIE=(θ2+h=1Hθ3hI{a=h})(h=1Hβ1hI{a=h}h=1Hβ1hI{a*=h})TNIE=(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a=h\})(\sum_{h=1}^H\beta_{1h}I\{a=h\}-\sum_{h=1}^H\beta_{1h}I\{a^*=h\})
  4. Calculate other effects using formulas in table 1.

Linear yreg, Logistic mreg and Noncategorical Exposure

If the exposure is not categorical, yreg="linear" and mreg=list("logistic"), CMAverse estimates the causal effects by the following steps:

  1. Fit a logistic regression model for the mediator: logitE[M|A,C]=β0+β1A+β2ClogitE[M|A,C]=\beta_0+\beta_1A+\beta_2'C

  2. Fit a linear regression model for the outcome: E[Y|A,M,C]=θ0+θ1A+θ2M+θ3AM+θ4CE[Y|A,M,C]=\theta_0+\theta_1A+\theta_2M+\theta_3AM+\theta_4'C

  3. Estimate CDECDE, PNDEPNDE, TNDETNDE, PNIEPNIE and TNIETNIE by the following parameter functions:

    • CDE=(θ1+θ3m)(aa*)CDE=(\theta_1+\theta_3m)(a-a^*)
    • PNDE={θ1+θ3exp(β0+β1a*+β2c)1+exp(β0+β1a*+β2c)}(aa*)PNDE=\{\theta_1+\theta_3\frac{exp(\beta_0+\beta_1a^*+\beta_2'c)}{1+exp(\beta_0+\beta_1a^*+\beta_2'c)}\}(a-a^*)
    • TNDE={θ1+θ3exp(β0+β1a+β2c)1+exp(β0+β1a+β2c)}(aa*)TNDE=\{\theta_1+\theta_3\frac{exp(\beta_0+\beta_1a+\beta_2'c)}{1+exp(\beta_0+\beta_1a+\beta_2'c)}\}(a-a^*)
    • PNIE=(θ2+θ3a*)(exp(β0+β1a+β2c)1+exp(β0+β1a+β2c)exp(β0+β1a*+β2c)1+exp(β0+β1a*+β2c))PNIE=(\theta_2+\theta_3a^*)(\frac{exp(\beta_0+\beta_1a+\beta_2'c)}{1+exp(\beta_0+\beta_1a+\beta_2'c)}-\frac{exp(\beta_0+\beta_1a^*+\beta_2'c)}{1+exp(\beta_0+\beta_1a^*+\beta_2'c)})
    • TNIE=(θ2+θ3a)(exp(β0+β1a+β2c)1+exp(β0+β1a+β2c)exp(β0+β1a*+β2c)1+exp(β0+β1a*+β2c))TNIE=(\theta_2+\theta_3a)(\frac{exp(\beta_0+\beta_1a+\beta_2'c)}{1+exp(\beta_0+\beta_1a+\beta_2'c)}-\frac{exp(\beta_0+\beta_1a^*+\beta_2'c)}{1+exp(\beta_0+\beta_1a^*+\beta_2'c)})
  4. Calculate other effects using formulas in table 1.

Linear yreg, Logistic mreg and Categorical Exposure

If the exposure is categorical, yreg="linear" and mreg=list("logistic"), CMAverse estimates the causal effects by the following steps:

  1. Fit a logistic regression model for the mediator: logitE[M|A,C]=β0+h=1Hβ1hI{A=h}+β2ClogitE[M|A,C]=\beta_0+\sum_{h=1}^H\beta_{1h}I\{A=h\}+\beta_2'C

  2. Fit a linear regression model for the outcome: E[Y|A,M,C]=θ0+h=1Hθ1hI{A=h}+θ2M+h=1Hθ3hI{A=h}M+θ4CE[Y|A,M,C]=\theta_0+\sum_{h=1}^H\theta_{1h}I\{A=h\}+\theta_2M+\sum_{h=1}^H\theta_{3h}I\{A=h\}M+\theta_4'C

  3. Estimate CDECDE, PNDEPNDE, TNDETNDE, PNIEPNIE and TNIETNIE by the following parameter functions:

    • CDE=h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})mCDE=\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})m
    • PNDE=h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})exp(β0+h=1Hβ1hI{a*=h}+β2c)1+exp(β0+h=1Hβ1hI{a*=h}+β2c)PNDE=\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})\frac{exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)}{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)}
    • TNDE=h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})exp(β0+h=1Hβ1hI{a=h}+β2c)1+exp(β0+h=1Hβ1hI{a=h}+β2c)TNDE=\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})\frac{exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)}{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)}
    • PNIE=(θ2+h=1Hθ3hI{a*=h})(exp(β0+h=1Hβ1hI{a=h}+β2c)1+exp(β0+h=1Hβ1hI{a=h}+β2c)exp(β0+h=1Hβ1hI{a*=h}+β2c)1+exp(β0+h=1Hβ1hI{a*=h}+β2c))PNIE=(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a^*=h\})(\frac{exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)}{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)}-\frac{exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)}{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)})
    • TNIE=(θ2+h=1Hθ3hI{a=h})(exp(β0+h=1Hβ1hI{a=h}+β2c)1+exp(β0+h=1Hβ1hI{a=h}+β2c)exp(β0+h=1Hβ1hI{a*=h}+β2c)1+exp(β0+h=1Hβ1hI{a*=h}+β2c))TNIE=(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a=h\})(\frac{exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)}{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)}-\frac{exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)}{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)})
  4. Calculate other effects using formulas in table 1.

Linear yreg, Multinomial mreg and Noncategorical Exposure

If the exposure is not categorical, yreg="linear" and mreg=list("multinomial"), CMAverse estimates the causal effects by the following steps:

  1. Fit a multinomial regression model for the mediator: logE[M=j|A,C]E[M=0|A,C]=β0j+β1jA+β2jC,j=1,2,...,Jlog\frac{E[M=j|A,C]}{E[M=0|A,C]}=\beta_{0j}+\beta_{1j}A+\beta_{2j}'C, j=1,2,...,J

  2. Fit a linear regression model for the outcome: E[Y|A,M,C]=θ0+θ1A+j=1Jθ2jI{M=j}+Aj=1Jθ3jI{M=j}+θ4CE[Y|A,M,C]=\theta_0+\theta_1A+\sum_{j=1}^J\theta_{2j}I\{M=j\}+A\sum_{j=1}^J\theta_{3j}I\{M=j\}+\theta_4'C

  3. Estimate CDECDE, PNDEPNDE, TNDETNDE, PNIEPNIE and TNIETNIE by the following parameter functions:

    • CDE=(θ1+j=1Jθ3jI{m=j})(aa*)CDE=(\theta_1+\sum_{j=1}^J\theta_{3j}I\{m=j\})(a-a^*)
    • PNDE={θ1+j=1Jθ3jexp(β0j+β1ja*+β2jc)1+j=1Jexp(β0j+β1ja*+β2jc)}(aa*)PNDE=\{\theta_1+\frac{\sum_{j=1}^J\theta_{3j}exp(\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)}\}(a-a^*)
    • TNDE={θ1+j=1Jθ3jexp(β0j+β1ja+β2jc)1+j=1Jexp(β0j+β1ja+β2jc)}(aa*)TNDE=\{\theta_1+\frac{\sum_{j=1}^J\theta_{3j}exp(\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)}\}(a-a^*)
    • PNIE=j=1J(θ2j+θ3ja*)exp(β0j+β1ja+β2jc)1+j=1Jexp(β0j+β1ja+β2jc)j=1J(θ2j+θ3ja*)exp(β0j+β1ja*+β2jc)1+j=1Jexp(β0j+β1ja*+β2jc)PNIE=\frac{\sum_{j=1}^J(\theta_{2j}+\theta_{3j}a^*)exp(\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)}-\frac{\sum_{j=1}^J(\theta_{2j}+\theta_{3j}a^*)exp(\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)}
    • TNIE=j=1J(θ2j+θ3ja)exp(β0j+β1ja+β2jc)1+j=1Jexp(β0j+β1ja+β2jc)j=1J(θ2j+θ3ja)exp(β0j+β1ja*+β2jc)1+j=1Jexp(β0j+β1ja*+β2jc)TNIE=\frac{\sum_{j=1}^J(\theta_{2j}+\theta_{3j}a)exp(\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)}-\frac{\sum_{j=1}^J(\theta_{2j}+\theta_{3j}a)exp(\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)}
  4. Calculate other effects using formulas in table 1.

Linear yreg, Multinomial mreg and Categorical Exposure

If the exposure is categorical, yreg="linear" and mreg=list("multinomial"), CMAverse estimates the causal effects by the following steps:

  1. Fit a multinomial regression model for the mediator: logE[M=j|A,C]E[M=0|A,C]=β0j+h=1Hβ1jhI{A=h}+β2jC,j=1,2,...,Jlog\frac{E[M=j|A,C]}{E[M=0|A,C]}=\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{A=h\}+\beta_{2j}'C, j=1,2,...,J

  2. Fit a linear regression model for the outcome: E[Y|A,M,C]=θ0+h=1Hθ1hI{A=h}+j=1Jθ2jI{M=j}+j=1Jh=1Hθ3jhI{M=j}I{A=h}+θ4CE[Y|A,M,C]=\theta_0+\sum_{h=1}^H\theta_{1h}I\{A=h\}+\sum_{j=1}^J\theta_{2j}I\{M=j\}+\sum_{j=1}^J\sum_{h=1}^H\theta_{3jh}I\{M=j\}I\{A=h\}+\theta_4'C

  3. Estimate CDECDE, PNDEPNDE, TNDETNDE, PNIEPNIE and TNIETNIE by the following parameter functions:

    • CDE=h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(j=1Jh=1Hθ3jhI{m=j}I{a=h}j=1Jh=1Hθ3jhI{m=j}I{a*=h})CDE=\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{j=1}^J\sum_{h=1}^H\theta_{3jh}I\{m=j\}I\{a=h\}-\sum_{j=1}^J\sum_{h=1}^H\theta_{3jh}I\{m=j\}I\{a^*=h\})
    • PNDE=h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+j=1J(h=1Hθ3jhI{a=h}h=1Hθ3jhI{a*=h})exp(β0j+h=1Hβ1jhI{a*=h}+β2jc)1+j=1Jexp(β0j+h=1Hβ1jhI{a*=h}+β2jc)PNDE=\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+\frac{\sum_{j=1}^J(\sum_{h=1}^H\theta_{3jh}I\{a=h\}-\sum_{h=1}^H\theta_{3jh}I\{a^*=h\})exp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)}
    • TNDE=h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+j=1J(h=1Hθ3jhI{a=h}h=1Hθ3jhI{a*=h})exp(β0j+h=1Hβ1jhI{a=h}+β2jc)1+j=1Jexp(β0j+h=1Hβ1jhI{a=h}+β2jc)TNDE=\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+\frac{\sum_{j=1}^J(\sum_{h=1}^H\theta_{3jh}I\{a=h\}-\sum_{h=1}^H\theta_{3jh}I\{a^*=h\})exp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)}
    • PNIE=j=1J(θ2j+h=1Hθ3jhI{a*=h})exp(β0j+h=1Hβ1jhI{a=h}+β2jc)1+j=1Jexp(β0j+h=1Hβ1jhI{a=h}+β2jc)j=1J(θ2j+h=1Hθ3jhI{a*=h})exp(β0j+h=1Hβ1jhI{a*=h}+β2jc)1+j=1Jexp(β0j+h=1Hβ1jhI{a*=h}+β2jc)PNIE=\frac{\sum_{j=1}^J(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a^*=h\})exp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)}-\frac{\sum_{j=1}^J(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a^*=h\})exp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)}
    • TNIE=j=1J(θ2j+h=1Hθ3jhI{a=h})exp(β0j+h=1Hβ1jhI{a=h}+β2jc)1+j=1Jexp(β0j+h=1Hβ1jhI{a=h}+β2jc)j=1J(θ2j+h=1Hθ3jhI{a=h})exp(β0j+h=1Hβ1jhI{a*=h}+β2jc)1+j=1Jexp(β0j+h=1Hβ1jhI{a*=h}+β2jc)TNIE=\frac{\sum_{j=1}^J(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a=h\})exp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)}-\frac{\sum_{j=1}^J(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a=h\})exp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)}{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)}
  4. Calculate other effects using formulas in table 1.

Nonlinear yreg, Linear mreg and Noncategorical Exposure

If the exposure is not categorical, yreg!="linear" and mreg=list("linear"), CMAverse estimates the causal effects by the following steps:

  1. Fit a linear regression model for the mediator: E[M|A,C]=β0+β1A+β2C+ϵM,ϵMN(0,σ2)E[M|A,C]=\beta_0+\beta_1A+\beta_2'C+\epsilon_M,\epsilon_M\sim N(0,\sigma^2)

  2. Fit the specified regression model for the outcome: g(E[Y|A,M,C])=θ0+θ1A+θ2M+θ3AM+θ4Cg(E[Y|A,M,C])=\theta_0+\theta_1A+\theta_2M+\theta_3AM+\theta_4'C

  3. Estimate RCDER^{CDE}, RPNDER^{PNDE}, RTNDER^{TNDE}, RPNIER^{PNIE}, RTNIER^{TNIE} and ERCDEER^{CDE} by the following parameter functions:

    • RCDE=exp((θ1+θ3m)(aa*))R^{CDE}=exp((\theta_1+\theta_3m)(a-a^*))
    • RPNDE=exp({θ1+θ3(β0+β1a*+β2c+θ2σ2)}(aa*)+0.5θ32σ2(a2a*2))R^{PNDE}=exp(\{\theta_1+\theta_3(\beta_0+\beta_1a^*+\beta_2'c+\theta_2\sigma^2)\}(a-a^*)+0.5\theta_3^2\sigma^2(a^2-a^{*2}))
    • RTNDE=exp({θ1+θ3(β0+β1a+β2c+θ2σ2)}(aa*)+0.5θ32σ2(a2a*2))R^{TNDE}=exp(\{\theta_1+\theta_3(\beta_0+\beta_1a+\beta_2'c+\theta_2\sigma^2)\}(a-a^*)+0.5\theta_3^2\sigma^2(a^2-a^{*2}))
    • RPNIE=exp((θ2β1+θ3β1a*)(aa*))R^{PNIE}=exp((\theta_2\beta_1+\theta_3\beta_1a^*)(a-a^*))
    • RTNIE=exp((θ2β1+θ3β1a)(aa*))R^{TNIE}=exp((\theta_2\beta_1+\theta_3\beta_1a)(a-a^*))
    • ERCDE=(exp(θ1(aa*)+θ3am)exp(θ3a*m))exp(θ2m(θ2+θ3a*)(β0+β1a*+β2c)0.5(θ2+θ3a*)2σ2)ER^{CDE}=(exp(\theta_1(a-a^*)+\theta_3am)-exp(\theta_3a^*m))exp(\theta_2m-(\theta_2+\theta_3a^*)(\beta_0+\beta_1a^*+\beta_2'c)-0.5(\theta_2+\theta_3a^*)^2\sigma^2)
  4. Calculate other effects using formulas in table 2.

Nonlinear yreg, Linear mreg and Categorical Exposure

If the exposure is categorical, yreg!="linear" and mreg=list("linear"), CMAverse estimates the causal effects by the following steps:

  1. Fit a linear regression model for the mediator: E[M|A,C]=β0+h=1Hβ1hI{A=h}+β2C+ϵM,ϵMN(0,σ2)E[M|A,C]=\beta_0+\sum_{h=1}^H\beta_{1h}I\{A=h\}+\beta_2'C+\epsilon_M,\epsilon_M\sim N(0,\sigma^2)

  2. Fit the specified regression model for the outcome: g(E[Y|A,M,C])=θ0+h=1Hθ1hI{A=h}+θ2M+h=1Hθ3hI{A=h}M+θ4Cg(E[Y|A,M,C])=\theta_0+\sum_{h=1}^H\theta_{1h}I\{A=h\}+\theta_2M+\sum_{h=1}^H\theta_{3h}I\{A=h\}M+\theta_4'C

  3. Estimate RCDER^{CDE}, RPNDER^{PNDE}, RTNDER^{TNDE}, RPNIER^{PNIE}, RTNIER^{TNIE} and ERCDEER^{CDE} by the following parameter functions:

    • RCDE=exp(h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})m)R^{CDE}=exp(\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})m)
    • RPNDE=exp(h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})(β0+h=1Hβ1hI{a*=h}+β2c+θ2σ2)+0.5σ2(h=1Hθ3h2I{a=h}h=1Hθ3h2I{a*=h}))R^{PNDE}=exp(\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c+\theta_2\sigma^2)+0.5\sigma^2(\sum_{h=1}^H\theta_{3h}^2I\{a=h\}-\sum_{h=1}^H\theta_{3h}^2I\{a^*=h\}))
    • RTNDE=exp(h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})(β0+h=1Hβ1hI{a=h}+β2c+θ2σ2)+0.5σ2(h=1Hθ3h2I{a=h}h=1Hθ3h2I{a*=h}))R^{TNDE}=exp(\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c+\theta_2\sigma^2)+0.5\sigma^2(\sum_{h=1}^H\theta_{3h}^2I\{a=h\}-\sum_{h=1}^H\theta_{3h}^2I\{a^*=h\}))
    • RPNIE=exp(θ2(h=1Hβ1hI{a=h}h=1Hβ1hI{a*=h})+h=1Hθ3hI{a*=h}(h=1Hβ1hI{a=h}h=1Hβ1hI{a*=h}))R^{PNIE}=exp(\theta_2(\sum_{h=1}^H\beta_{1h}I\{a=h\}-\sum_{h=1}^H\beta_{1h}I\{a^*=h\})+\sum_{h=1}^H\theta_{3h}I\{a^*=h\}(\sum_{h=1}^H\beta_{1h}I\{a=h\}-\sum_{h=1}^H\beta_{1h}I\{a^*=h\}))
    • RTNIE=exp(θ2(h=1Hβ1hI{a=h}h=1Hβ1hI{a*=h})+h=1Hθ3hI{a=h}(h=1Hβ1hI{a=h}h=1Hβ1hI{a*=h}))R^{TNIE}=exp(\theta_2(\sum_{h=1}^H\beta_{1h}I\{a=h\}-\sum_{h=1}^H\beta_{1h}I\{a^*=h\})+\sum_{h=1}^H\theta_{3h}I\{a=h\}(\sum_{h=1}^H\beta_{1h}I\{a=h\}-\sum_{h=1}^H\beta_{1h}I\{a^*=h\}))
    • ERCDE=(exp(h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+h=1Hθ3hI{a=h}m)exp(h=1Hθ3hI{a*=h}m))exp(θ2m(θ2+h=1Hθ3hI{a*=h})(β0+h=1Hβ1hI{a*=h}+β2c)0.5(θ2+h=1Hθ3hI{a*=h})2σ2)ER^{CDE}=(exp(\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+\sum_{h=1}^H\theta_{3h}I\{a=h\}m)-exp(\sum_{h=1}^H\theta_{3h}I\{a^*=h\}m))exp(\theta_2m-(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a^*=h\})(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)-0.5(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a^*=h\})^2\sigma^2)
  4. Calculate other effects using formulas in table 2.

Nonlinear yreg, Logistic mreg and Noncategorical Exposure

If the exposure is not categorical, yreg!="linear" and mreg=list("logistic"), CMAverse estimates the causal effects by the following steps:

  1. Fit a logistic regression model for the mediator: logitE[M|A,C]=β0+β1A+β2ClogitE[M|A,C]=\beta_0+\beta_1A+\beta_2'C

  2. Fit the specified regression model for the outcome: g(E[Y|A,M,C])=θ0+θ1A+θ2M+θ3AM+θ4Cg(E[Y|A,M,C])=\theta_0+\theta_1A+\theta_2M+\theta_3AM+\theta_4'C

  3. Estimate RCDER^{CDE}, RPNDER^{PNDE}, RTNDER^{TNDE}, RPNIER^{PNIE}, RTNIER^{TNIE} and ERCDEER^{CDE} by the following parameter functions:

    • RCDE=exp((θ1+θ3m)(aa*))R^{CDE}=exp((\theta_1+\theta_3m)(a-a^*))
    • RPNDE=exp(θ1a){1+exp(θ2+θ3a+β0+β1a*+β2c)}exp(θ1a*){1+exp(θ2+θ3a*+β0+β1a*+β2c)}R^{PNDE}=\frac{exp(\theta_1a)\{1+exp(\theta_2+\theta_3a+\beta_0+\beta_1a^*+\beta_2'c)\}}{exp(\theta_1a^*)\{1+exp(\theta_2+\theta_3a^*+\beta_0+\beta_1a^*+\beta_2'c)\}}
    • RTNDE=exp(θ1a){1+exp(θ2+θ3a+β0+β1a+β2c)}exp(θ1a*){1+exp(θ2+θ3a*+β0+β1a+β2c)}R^{TNDE}=\frac{exp(\theta_1a)\{1+exp(\theta_2+\theta_3a+\beta_0+\beta_1a+\beta_2'c)\}}{exp(\theta_1a^*)\{1+exp(\theta_2+\theta_3a^*+\beta_0+\beta_1a+\beta_2'c)\}}
    • RPNIE={1+exp(β0+β1a*+β2c)}{1+exp(θ2+θ3a*+β0+β1a+β2c)}{1+exp(β0+β1a+β2c)}{1+exp(θ2+θ3a*+β0+β1a*+β2c)}R^{PNIE}=\frac{\{1+exp(\beta_0+\beta_1a^*+\beta_2'c)\}\{1+exp(\theta_2+\theta_3a^*+\beta_0+\beta_1a+\beta_2'c)\}}{\{1+exp(\beta_0+\beta_1a+\beta_2'c)\}\{1+exp(\theta_2+\theta_3a^*+\beta_0+\beta_1a^*+\beta_2'c)\}}
    • RTNIE={1+exp(β0+β1a*+β2c)}{1+exp(θ2+θ3a+β0+β1a+β2c)}{1+exp(β0+β1a+β2c)}{1+exp(θ2+θ3a+β0+β1a*+β2c)}R^{TNIE}=\frac{\{1+exp(\beta_0+\beta_1a^*+\beta_2'c)\}\{1+exp(\theta_2+\theta_3a+\beta_0+\beta_1a+\beta_2'c)\}}{\{1+exp(\beta_0+\beta_1a+\beta_2'c)\}\{1+exp(\theta_2+\theta_3a+\beta_0+\beta_1a^*+\beta_2'c)\}}
    • ERCDE=exp(θ2m)(exp(θ1(aa*)+θ3am)exp(θ3a*m))(1+exp(β0+β1a*+β2c))1+exp(β0+β1a*+β2c+θ2+θ3a*)ER^{CDE}=\frac{exp(\theta_2m)(exp(\theta_1(a-a^*)+\theta_3am)-exp(\theta_3a^*m))(1+exp(\beta_0+\beta_1a^*+\beta_2'c))}{1+exp(\beta_0+\beta_1a^*+\beta_2'c+\theta_2+\theta_3a^*)}
  4. Calculate other effects using formulas in table 2.

Nonlinear yreg, Logistic mreg and Categorical Exposure

If the exposure is categorical, yreg!="linear" and mreg=list("logistic"), CMAverse estimates the causal effects by the following steps:

  1. Fit a logistic regression model for the mediator: logitE[M|A,C]=β0+h=1Hβ1hI{A=h}+β2ClogitE[M|A,C]=\beta_0+\sum_{h=1}^H\beta_{1h}I\{A=h\}+\beta_2'C

  2. Fit the specified regression model for the outcome: g(E[Y|A,M,C])=θ0+h=1Hθ1hI{A=h}+θ2M+h=1Hθ3hI{A=h}M+θ4Cg(E[Y|A,M,C])=\theta_0+\sum_{h=1}^H\theta_{1h}I\{A=h\}+\theta_2M+\sum_{h=1}^H\theta_{3h}I\{A=h\}M+\theta_4'C

  3. Estimate RCDER^{CDE}, RPNDER^{PNDE}, RTNDER^{TNDE}, RPNIER^{PNIE}, RTNIER^{TNIE} and ERCDEER^{CDE} by the following parameter functions:

    • RCDE=exp(h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+(h=1Hθ3hI{a=h}h=1Hθ3hI{a*=h})m)R^{CDE}=exp(\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+(\sum_{h=1}^H\theta_{3h}I\{a=h\}-\sum_{h=1}^H\theta_{3h}I\{a^*=h\})m)
    • RPNDE=exp(h=1Hθ1hI{a=h}){1+exp(θ2+h=1Hθ3hI{a=h}+β0+h=1Hβ1hI{a*=h}+β2c)}exp(h=1Hθ1hI{a*=h}){1+exp(θ2+h=1Hθ3hI{a*=h}+β0+h=1Hβ1hI{a*=h}+β2c)}R^{PNDE}=\frac{exp(\sum_{h=1}^H\theta_{1h}I\{a=h\})\{1+exp(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a=h\}+\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)\}}{exp(\sum_{h=1}^H\theta_{1h}I\{a^*=h\})\{1+exp(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a^*=h\}+\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)\}}
    • RTNDE=exp(h=1Hθ1hI{a=h}){1+exp(θ2+h=1Hθ3hI{a=h}+β0+h=1Hβ1hI{a=h}+β2c)}exp(h=1Hθ1hI{a*=h}){1+exp(θ2+h=1Hθ3hI{a*=h}+β0+h=1Hβ1hI{a=h}+β2c)}R^{TNDE}=\frac{exp(\sum_{h=1}^H\theta_{1h}I\{a=h\})\{1+exp(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a=h\}+\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)\}}{exp(\sum_{h=1}^H\theta_{1h}I\{a^*=h\})\{1+exp(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a^*=h\}+\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)\}}
    • RPNIE={1+exp(β0+h=1Hβ1hI{a*=h}+β2c)}{1+exp(θ2+h=1Hθ3hI{a*=h}+β0+h=1Hβ1hI{a=h}+β2c)}{1+exp(β0+h=1Hβ1hI{a=h}+β2c)}{1+exp(θ2+h=1Hθ3hI{a*=h}+β0+h=1Hβ1hI{a*=h}+β2c)}R^{PNIE}=\frac{\{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)\}\{1+exp(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a^*=h\}+\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)\}}{\{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)\}\{1+exp(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a^*=h\}+\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)\}}
    • RTNIE={1+exp(β0+h=1Hβ1hI{a*=h}+β2c)}{1+exp(θ2+h=1Hθ3hI{a=h}+β0+h=1Hβ1hI{a=h}+β2c)}{1+exp(β0+h=1Hβ1hI{a=h}+β2c)}{1+exp(θ2+h=1Hθ3hI{a=h}+β0+h=1Hβ1hI{a*=h}+β2c)}R^{TNIE}=\frac{\{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)\}\{1+exp(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a=h\}+\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)\}}{\{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a=h\}+\beta_2'c)\}\{1+exp(\theta_2+\sum_{h=1}^H\theta_{3h}I\{a=h\}+\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c)\}}
    • ERCDE=exp(θ2m)(exp(h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+h=1Hθ3hI{a=h}m)exp(h=1Hθ3hI{a*=h}m))(1+exp(β0+h=1Hβ1hI{a*=h}+β2c))1+exp(β0+h=1Hβ1hI{a*=h}+β2c+θ2+h=1Hθ3hI{a*=h})ER^{CDE}=\frac{exp(\theta_2m)(exp(\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+\sum_{h=1}^H\theta_{3h}I\{a=h\}m)-exp(\sum_{h=1}^H\theta_{3h}I\{a^*=h\}m))(1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c))}{1+exp(\beta_0+\sum_{h=1}^H\beta_{1h}I\{a^*=h\}+\beta_2'c+\theta_2+\sum_{h=1}^H\theta_{3h}I\{a^*=h\})}
  4. Calculate other effects using formulas in table 2.

Nonlinear yreg, Multinomial mreg and Noncategorical Exposure

If the exposure is not categorical, yreg!="linear" and mreg=list("multinomial"), CMAverse estimates the causal effects by the following steps:

  1. Fit a multinomial regression model for the mediator: logE[M=j|A,C]E[M=0|A,C]=β0j+β1jA+β2jC,j=1,2,...,Jlog\frac{E[M=j|A,C]}{E[M=0|A,C]}=\beta_{0j}+\beta_{1j}A+\beta_{2j}'C, j=1,2,...,J

  2. Fit the specified regression model for the outcome: g(E[Y|A,M,C])=θ0+θ1A+j=1Jθ2jI{M=j}+Aj=1Jθ3jI{M=j}+θ4Cg(E[Y|A,M,C])=\theta_0+\theta_1A+\sum_{j=1}^J\theta_{2j}I\{M=j\}+A\sum_{j=1}^J\theta_{3j}I\{M=j\}+\theta_4'C

  3. Estimate RCDER^{CDE}, RPNDER^{PNDE}, RTNDER^{TNDE}, RPNIER^{PNIE}, RTNIER^{TNIE} and ERCDEER^{CDE} by the following parameter functions:

    • RCDE=exp((θ1+j=1Jθ3jI{m=j})(aa*))R^{CDE}=exp((\theta_1+\sum_{j=1}^J\theta_{3j}I\{m=j\})(a-a^*))
    • RPNDE=exp(θ1a){1+j=1Jexp(θ2j+θ3ja+β0j+β1ja*+β2jc)}exp(θ1a*){1+j=1Jexp(θ2j+θ3ja*+β0j+β1ja*+β2jc)}R^{PNDE}=\frac{exp(\theta_1a)\{1+\sum_{j=1}^Jexp(\theta_{2j}+\theta_{3j}a+\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)\}}{exp(\theta_1a^*)\{1+\sum_{j=1}^Jexp(\theta_{2j}+\theta_{3j}a^*+\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)\}}
    • RTNDE=exp(θ1a){1+j=1Jexp(θ2j+θ3ja+β0j+β1ja+β2jc)}exp(θ1a*){1+j=1Jexp(θ2j+θ3ja*+β0j+β1ja+β2jc)}R^{TNDE}=\frac{exp(\theta_1a)\{1+\sum_{j=1}^Jexp(\theta_{2j}+\theta_{3j}a+\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)\}}{exp(\theta_1a^*)\{1+\sum_{j=1}^Jexp(\theta_{2j}+\theta_{3j}a^*+\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)\}}
    • RPNIE={1+j=1Jexp(β0j+β1ja*+β2jc)}{1+j=1Jexp(θ2j+θ3ja*+β0j+β1ja+β2jc)}{1+j=1Jexp(β0j+β1ja+β2jc)}{1+j=1Jexp(θ2j+θ3ja*+β0j+β1ja*+β2jc)}R^{PNIE}=\frac{\{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)\}\{1+\sum_{j=1}^Jexp(\theta_{2j}+\theta_{3j}a^*+\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)\}}{\{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)\}\{1+\sum_{j=1}^Jexp(\theta_{2j}+\theta_{3j}a^*+\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)\}}
    • RTNIE={1+j=1Jexp(β0j+β1ja*+β2jc)}{1+j=1Jexp(θ2j+θ3ja+β0j+β1ja+β2jc)}{1+j=1Jexp(β0j+β1ja+β2jc)}{1+j=1Jexp(θ2j+θ3ja+β0j+β1ja*+β2jc)}R^{TNIE}=\frac{\{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)\}\{1+\sum_{j=1}^Jexp(\theta_{2j}+\theta_{3j}a+\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)\}}{\{1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a+\beta_{2j}'c)\}\{1+\sum_{j=1}^Jexp(\theta_{2j}+\theta_{3j}a+\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)\}}
    • ERCDE=exp(j=1Jθ2jI{m=j})(1+j=1Jexp(β0j+β1ja*+β2jc))(exp(θ1(aa*)+aj=1Jθ3jI{m=j})exp(a*j=1Jθ3jI{m=j}))1+j=1Jexp(θ2j+θ3ja*+β0j+β1ja*+β2jc)ER^{CDE}=\frac{exp(\sum_{j=1}^J\theta_{2j}I\{m=j\})(1+\sum_{j=1}^Jexp(\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c))(exp(\theta_1(a-a^*)+a\sum_{j=1}^J\theta_{3j}I\{m=j\})-exp(a^*\sum_{j=1}^J\theta_{3j}I\{m=j\}))}{1+\sum_{j=1}^Jexp(\theta_{2j}+\theta_{3j}a^*+\beta_{0j}+\beta_{1j}a^*+\beta_{2j}'c)}
  4. Calculate other effects using formulas in table 2.

Nonlinear yreg, Multinomial mreg and Categorical Exposure

If the exposure is categorical, yreg!="linear" and mreg=list("multinomial"), CMAverse estimates the causal effects by the following steps:

  1. Fit a multinomial regression model for the mediator: logE[M=j|A,C]E[M=0|A,C]=β0j+h=1Hβ1jhI{A=h}+β2jC,j=1,2,...,Jlog\frac{E[M=j|A,C]}{E[M=0|A,C]}=\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{A=h\}+\beta_{2j}'C, j=1,2,...,J

  2. Fit the specified regression model for the outcome: g(E[Y|A,M,C])=θ0+h=1Hθ1hI{A=h}+j=1Jθ2jI{M=j}+j=1Jh=1Hθ3jhI{M=j}I{A=h}+θ4Cg(E[Y|A,M,C])=\theta_0+\sum_{h=1}^H\theta_{1h}I\{A=h\}+\sum_{j=1}^J\theta_{2j}I\{M=j\}+\sum_{j=1}^J\sum_{h=1}^H\theta_{3jh}I\{M=j\}I\{A=h\}+\theta_4'C

  3. Estimate RCDER^{CDE}, RPNDER^{PNDE}, RTNDER^{TNDE}, RPNIER^{PNIE}, RTNIER^{TNIE} and ERCDEER^{CDE} by the following parameter functions:

    • RCDE=exp(h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h}+j=1Jh=1Hθ3jhI{m=l}I{a=h}j=1Jh=1Hθ3jhI{m=l}I{a*=h})R^{CDE}=exp(\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\}+\sum_{j=1}^J\sum_{h=1}^H\theta_{3jh}I\{m=l\}I\{a=h\}-\sum_{j=1}^J\sum_{h=1}^H\theta_{3jh}I\{m=l\}I\{a^*=h\})
    • RPNDE=exp(h=1Hθ1hI{a=h}){1+j=1Jexp(θ2j+h=1Hθ3jhI{a=h}+β0j+h=1Hβ1jhI{a*=h}+β2jc)}exp(h=1Hθ1hI{a*=h}){1+j=1Jexp(θ2j+h=1Hθ3jhI{a*=h}+β0j+h=1Hβ1jhI{a*=h}+β2jc)}R^{PNDE}=\frac{exp(\sum_{h=1}^H\theta_{1h}I\{a=h\})\{1+\sum_{j=1}^Jexp(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a=h\}+\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)\}}{exp(\sum_{h=1}^H\theta_{1h}I\{a^*=h\})\{1+\sum_{j=1}^Jexp(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a^*=h\}+\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)\}}
    • RTNDE=exp(h=1Hθ1hI{a=h}){1+j=1Jexp(θ2j+h=1Hθ3jhI{a=h}+β0j+h=1Hβ1jhI{a=h}+β2jc)}exp(h=1Hθ1hI{a*=h}){1+j=1Jexp(θ2j+h=1Hθ3jhI{a*=h}+β0j+h=1Hβ1jhI{a=h}+β2jc)}R^{TNDE}=\frac{exp(\sum_{h=1}^H\theta_{1h}I\{a=h\})\{1+\sum_{j=1}^Jexp(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a=h\}+\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)\}}{exp(\sum_{h=1}^H\theta_{1h}I\{a^*=h\})\{1+\sum_{j=1}^Jexp(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a^*=h\}+\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)\}}
    • RPNIE={1+j=1Jexp(β0j+h=1Hβ1jhI{a*=h}+β2jc)}{1+j=1Jexp(θ2j+h=1Hθ3jhI{a*=h}+β0j+h=1Hβ1jhI{a=h}+β2jc)}{1+j=1Jexp(β0j+h=1Hβ1jhI{a=h}+β2jc)}{1+j=1Jexp(θ2j+h=1Hθ3jhI{a*=h}+β0j+h=1Hβ1jhI{a*=h}+β2jc)}R^{PNIE}=\frac{\{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)\}\{1+\sum_{j=1}^Jexp(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a^*=h\}+\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)\}}{\{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)\}\{1+\sum_{j=1}^Jexp(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a^*=h\}+\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)\}}
    • RTNIE={1+j=1Jexp(β0j+h=1Hβ1jhI{a*=h}+β2jc)}{1+j=1Jexp(θ2j+h=1Hθ3jhI{a=h}+β0j+h=1Hβ1jhI{a=h}+β2jc)}{1+j=1Jexp(β0j+h=1Hβ1jhI{a=h}+β2jc)}{1+j=1Jexp(θ2j+h=1Hθ3jhI{a=h}+β0j+h=1Hβ1jhI{a*=h}+β2jc)}R^{TNIE}=\frac{\{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)\}\{1+\sum_{j=1}^Jexp(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a=h\}+\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)\}}{\{1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a=h\}+\beta_{2j}'c)\}\{1+\sum_{j=1}^Jexp(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a=h\}+\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)\}}
    • ERCDE=exp(j=1Jθ2jI{m=j})(1+j=1Jexp(β0j+h=1Hβ1jhI{a*=h}+β2jc))(exp((h=1Hθ1hI{a=h}h=1Hθ1hI{a*=h})+j=1Jh=1Hθ3jhI{a=h}I{m=j})exp(j=1Jh=1Hθ3jhI{a*=h}I{m=j}))1+j=1Jexp(θ2j+h=1Hθ3jhI{a*=h}+β0j+h=1Hβ1jhI{a*=h}+β2jc)ER^{CDE}=\frac{exp(\sum_{j=1}^J\theta_{2j}I\{m=j\})(1+\sum_{j=1}^Jexp(\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c))(exp((\sum_{h=1}^H\theta_{1h}I\{a=h\}-\sum_{h=1}^H\theta_{1h}I\{a^*=h\})+\sum_{j=1}^J\sum_{h=1}^H\theta_{3jh}I\{a=h\}I\{m=j\})-exp(\sum_{j=1}^J\sum_{h=1}^H\theta_{3jh}I\{a^*=h\}I\{m=j\}))}{1+\sum_{j=1}^Jexp(\theta_{2j}+\sum_{h=1}^H\theta_{3jh}I\{a^*=h\}+\beta_{0j}+\sum_{h=1}^H\beta_{1jh}I\{a^*=h\}+\beta_{2j}'c)}
  4. Calculate other effects using formulas in table 2.

Direct Counterfactual Imputation Estimation

CMAverse conducts direct counterfactual imputation estimation by the following steps:

  1. Fit a regression model for E(Y|A,M,C)E(Y|A,M,C). This regression model is specified by the yreg argument.

  2. For p=1,...,kp=1,...,k, fit a regression model for the distribution of MpM_p given AA and CC. These regression models are specified by the mreg argument.

  3. For p=1,...,kp=1,...,k and i=1,...,ni=1,...,n, simulate the counterfactuals Ma,p,iM_{a,p,i} and Ma*,p,iM_{a^*,p,i}.

    • Simulate Ma,p,iM_{a,p,i} by randomly drawing a value from the distribution of MpM_{p} given A=a,C=CiA=a,C=C_i. Denote Ma,i=(Ma,1,i,...,Ma,k,i)TM_{a,i}=(M_{a,1,i},...,M_{a,k,i})^T.
    • Simulate Ma*,p,iM_{a^*,p,i} by randomly drawing a value from the distribution of MpM_{p} given A=a*,C=CiA=a^*,C=C_i. Denote Ma*,i=(Ma*,1,i,...,Ma*,k,i)TM_{a^*,i}=(M_{a^*,1,i},...,M_{a^*,k,i})^T.
  4. For i=1,...,ni=1,...,n, obtain E[Yi|A=a*,M=m,C=Ci]E[Y_i|A=a^*,M=m,C=C_i], E[Yi|A=a,M=m,C=Ci]E[Y_i|A=a,M=m,C=C_i], E[Yi|A=a*,M=Ma*,i,C=Ci]E[Y_i|A=a^*,M=M_{a^*,i},C=C_i], E[Yi|A=a*,M=Ma,i,C=Ci]E[Y_i|A=a^*,M=M_{a,i},C=C_i], E[Yi|A=a,M=Ma*,i,C=Ci]E[Y_i|A=a,M=M_{a^*,i},C=C_i] and E[Yi|A=a,M=Ma,i,C=Ci]E[Y_i|A=a,M=M_{a,i},C=C_i] from the regression model in step 1.

  5. Impute the counterfactuals E[Ya*m]E[Y_{a^*m}], E[Yam]E[Y_{am}], E[Ya*Ma*]E[Y_{a^*Ma^*}], E[YaMa]E[Y_{aMa}], E[YaMa*]E[Y_{aMa^*}] and E[Ya*Ma]E[Y_{a^*Ma}].

    • Impute E[Ya*m]E[Y_{a^*m}] by taking an average of {E[Yi|A=a*,M=m,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,M=m,C=C_i]\}_{i=1,...,n}.
    • Impute E[Yam]E[Y_{am}] by taking an average of {E[Yi|A=a,M=m,C=Ci]}i=1,...,n\{E[Y_i|A=a,M=m,C=C_i]\}_{i=1,...,n}.
    • Impute E[Ya*Ma*]E[Y_{a^*Ma^*}] by taking an average of {E[Yi|A=a*,M=Ma*,i,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,M=M_{a^*,i},C=C_i]\}_{i=1,...,n}.
    • Impute E[YaMa]E[Y_{aMa}] by taking an average of {E[Yi|A=a,M=Ma,i,C=Ci]}i=1,...,n\{E[Y_i|A=a,M=M_{a,i},C=C_i]\}_{i=1,...,n}.
    • Impute E[YaMa*]E[Y_{aMa^*}] by taking an average of {E[Yi|A=a,M=Ma*,i,C=Ci]}i=1,...,n\{E[Y_i|A=a,M=M_{a^*,i},C=C_i]\}_{i=1,...,n}.
    • Impute E[Ya*Ma]E[Y_{a^*Ma}] by taking an average of {E[Yi|A=a*,M=Ma,i,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,M=M_{a,i},C=C_i]\}_{i=1,...,n}.
  6. Calculate causal effects with formulas in table 1 or table 2.

The Weighting-based Approach

With the weighting-based approach, CMAverse estimates causal effects through direct counterfactual imputation estimation by the following steps:

  1. Fit a regression model for the distribution of E(Y|A,M,C)E(Y|A,M,C). This regression model is specified by the yreg argument.

  2. If CC is not empty, fit a regression model for P(A|C)P(A|C) and obtain P(A=Ai|C=Ci)P(A=A_i|C=C_i) for i=1,...,ni=1,...,n. This regression model is specified by the ereg argument.

  3. For i=1,...,ni=1,...,n, obtain E[Yi|A=a*,M=m,C=Ci]E[Y_i|A=a^*,M=m,C=C_i], E[Yi|A=a,M=m,C=Ci]E[Y_i|A=a,M=m,C=C_i], E[Yi|A=a*,M=Mi,C=Ci]E[Y_i|A=a^*,M=M_i,C=C_i] and E[Yi|A=a,M=Mi,C=Ci]E[Y_i|A=a,M=M_i,C=C_i] from the regression model in step 1.

  4. Impute the counterfactuals E[Ya*m]E[Y_{a^*m}], E[Yam]E[Y_{am}], E[Ya*Ma*]E[Y_{a^*Ma^*}], E[YaMa]E[Y_{aMa}], E[YaMa*]E[Y_{aMa^*}] and E[Ya*Ma]E[Y_{a^*Ma}].

    • Impute E[Ya*m]E[Y_{a^*m}] by taking an average of {E[Yi|A=a*,M=m,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,M=m,C=C_i]\}_{i=1,...,n}.
    • Impute E[Yam]E[Y_{am}] by taking an average of {E[Yi|A=a,M=m,C=Ci]}i=1,...,n\{E[Y_i|A=a,M=m,C=C_i]\}_{i=1,...,n}.
    • Impute E[Ya*Ma*]E[Y_{a^*Ma^*}] by taking a weighted average of {Yi}i{i:Ai=a*}\{Y_i\}_{i\in \{i:A_i=a^*\}}, and each subject i is given a weight P(A=Ai)P(A=Ai|C=Ci)\frac{P(A=A_i)}{P(A=A_i|C=C_i)}.
    • Impute E[YaMa]E[Y_{aMa}] by taking a weighted average of {Yi}i{i:Ai=a}\{Y_i\}_{i\in \{i:A_i=a\}}, and each subject i is given a weight P(A=Ai)P(A=Ai|C=Ci)\frac{P(A=A_i)}{P(A=A_i|C=C_i)}.
    • Impute E[YaMa*]E[Y_{aMa^*}] by taking a weighted average of {E[Yi|A=a,M=Mi,C=Ci]}i{i:Ai=a*}\{E[Y_i|A=a,M=M_i,C=C_i]\}_{i\in \{i:A_i=a^*\}}, and each subject i is given a weight P(A=Ai)P(A=Ai|C=Ci)\frac{P(A=A_i)}{P(A=A_i|C=C_i)}.
    • Impute E[Ya*Ma]E[Y_{a^*Ma}] by taking a weighted average of {E[Yi|A=a*,M=Mi,C=Ci]}i{i:Ai=a}\{E[Y_i|A=a^*,M=M_i,C=C_i]\}_{i\in \{i:A_i=a\}}, and each subject i is given a weight P(A=Ai)P(A=Ai|C=Ci)\frac{P(A=A_i)}{P(A=A_i|C=C_i)}.
  5. Calculate causal effects with formulas in table 1 or table 2.

The Inverse Odds Ratio Weighting Approach

With the inverse odds ratio weighting approach, CMAverse estimates causal effects through direct counterfactual imputation estimation by the following steps:

  1. Fit a regression model for P(A|M,C)P(A|M,C) and obtain P(A=0|M=Mi,C=Ci)P(A=Ai|M=Mi,C=Ci)\frac{P(A=0|M=M_i,C=C_i)}{P(A=A_i|M=M_i,C=C_i)} for i=1,...,ni=1,...,n. This regression model is specified by the ereg argument.

  2. Fit a regression model for E(Y|A,C)E(Y|A,C). This regression model is specified by the yreg argument.

  3. Fit a weighted regression model for E(Y|A,C)E(Y|A,C) and each subject ii is given a weight P(A=0|M=Mi,C=Ci)P(A=Ai|M=Mi,C=Ci)\frac{P(A=0|M=M_i,C=C_i)}{P(A=A_i|M=M_i,C=C_i)}. This regression model is obtained by adding the weights to the regression model in step 2.

  4. Impute the counterfactuals Etot[Ya]E_{tot}[Y_{a}], Etot[Ya*]E_{tot}[Y_{a^*}], Edir[Ya]E_{dir}[Y_{a}] and Edir[Ya*]E_{dir}[Y_{a^*}].

    • Impute Etot[Ya]E_{tot}[Y_{a}] by taking an average of {E[Yi|A=a,C=Ci]}i=1,...,n\{E[Y_i|A=a,C=C_i]\}_{i=1,...,n} obtained from the regression model in step 2.
    • Impute Etot[Ya*]E_{tot}[Y_{a^*}] by taking an average of {E[Yi|A=a*,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,C=C_i]\}_{i=1,...,n} obtained from the regression model in step 2.
    • Impute Edir[Ya]E_{dir}[Y_{a}] by taking a weighted average of {E[Yi|A=a,C=Ci]}i=1,...,n\{E[Y_i|A=a,C=C_i]\}_{i=1,...,n} obtained from the regression model in step 3 and each subject ii is given a weight P(A=0|M=Mi,C=Ci)P(A=Ai|M=Mi,C=Ci)\frac{P(A=0|M=M_i,C=C_i)}{P(A=A_i|M=M_i,C=C_i)}.
    • Impute Edir[Ya*]E_{dir}[Y_{a^*}] by taking a weighted average of {E[Yi|A=a*,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,C=C_i]\}_{i=1,...,n} obtained from the regression model in step 3 and each subject ii is given a weight P(A=0|M=Mi,C=Ci)P(A=Ai|M=Mi,C=Ci)\frac{P(A=0|M=M_i,C=C_i)}{P(A=A_i|M=M_i,C=C_i)}.
    • For a continuous outcome, calculate TETE by Etot[Ya]Etot[Ya*]E_{tot}[Y_{a}]-E_{tot}[Y_{a^*}], calculate PNDEPNDE by Edir[Ya]Edir[Ya*]E_{dir}[Y_{a}]-E_{dir}[Y_{a^*}] and calculate TNIETNIE by TEPNDETE-PNDE.
    • For a categorical, count or survival outcome, calculate TETE by Etot[Ya]/Etot[Ya*]E_{tot}[Y_{a}]/E_{tot}[Y_{a^*}], calculate PNDEPNDE by Edir[Ya]/Edir[Ya*]E_{dir}[Y_{a}]/E_{dir}[Y_{a^*}] and calculate TNIETNIE by TE/PNDETE/PNDE.

The Natural Effect Model

With the natural effect model, CMAverse estimates causal effects through direct counterfactual imputation estimation by the following steps:

  1. Fit a regression model for E(Y|A,M,C)E(Y|A,M,C). This regression model is specified by the yreg argument.

  2. Expand the dataset using the regression model in step 1 and the neImpute function in the medflex package. The expanded dataset gives A0A0 for the direct effect and A1A1 for the indirect effect.

  3. Fit the regression model in step 1 by the expanded dataset in step 2 with the exposure in the regression formula replaced by A0A0 and mediators in the regression formula replaced by A1A1, i.e., YA0+A1+A0*A1+CY\sim A0+A1+A0*A1+C if the regression formula in step 1 is YA+M1+M2+A*M1+CY\sim A+M_1+M_2+A*M_1+C.

  4. For i=1,...,ni=1,...,n, obtain E[Yi|A=a*,M=m,C=Ci]E[Y_i|A=a^*,M=m,C=C_i] and E[Yi|A=a,M=m,C=Ci]E[Y_i|A=a,M=m,C=C_i] from the regression model in step 1; obtain E[Yi|A0=a*,A1=a*,C=Ci]E[Y_i|A0=a^*,A1=a^*,C=C_i], E[Yi|A0=a*,A1=a,C=Ci]E[Y_i|A0=a^*,A1=a,C=C_i], E[Yi|A0=a,A1=a*,C=Ci]E[Y_i|A0=a,A1=a^*,C=C_i] and E[Yi|A0=a,A1=a,C=Ci]E[Y_i|A0=a,A1=a,C=C_i] from the regression model in step 3.

  5. Impute the counterfactuals E[Ya*m]E[Y_{a^*m}], E[Yam]E[Y_{am}], E[Ya*Ma*]E[Y_{a^*Ma^*}], E[YaMa]E[Y_{aMa}], E[YaMa*]E[Y_{aMa^*}] and E[Ya*Ma]E[Y_{a^*Ma}].

    • Impute E[Ya*m]E[Y_{a^*m}] by taking an average of {E[Yi|A=a*,M=m,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,M=m,C=C_i]\}_{i=1,...,n}.
    • Impute E[Yam]E[Y_{am}] by taking an average of {E[Yi|A=a,M=m,C=Ci]}i=1,...,n\{E[Y_i|A=a,M=m,C=C_i]\}_{i=1,...,n}.
    • Impute E[Ya*Ma*]E[Y_{a^*Ma^*}] by taking an average of {E[Yi|A0=a*,A1=a*,C=Ci]}i=1,...,n\{E[Y_i|A0=a^*,A1=a^*,C=C_i]\}_{i=1,...,n}.
    • Impute E[YaMa]E[Y_{aMa}] by taking an average of {E[Yi|A0=a,A1=a,C=Ci]}i=1,...,n\{E[Y_i|A0=a,A1=a,C=C_i]\}_{i=1,...,n}.
    • Impute E[YaMa*]E[Y_{aMa^*}] by taking an average of {E[Yi|A0=a,A1=a*,C=Ci]}i=1,...,n\{E[Y_i|A0=a,A1=a^*,C=C_i]\}_{i=1,...,n}.
    • Impute E[Ya*Ma]E[Y_{a^*Ma}] by taking an average of {E[Yi|A0=a*,A1=a,C=Ci]}i=1,...,n\{E[Y_i|A0=a^*,A1=a,C=C_i]\}_{i=1,...,n}.
  6. Calculate causal effects with formulas in table 1 or table 2.

Confounders Affected by the exposure

DAG

Estimands

For a continuous outcome, causal effects are estimated on the difference scale (summarized in table 3). For a categorical, count, or survival outcome, causal effects are estimated on the ratio scale (summarized in table 4). Because of the existence of LL, some causal effects in table 1 and table 2 are not identifiable. However, their randomized analogues are still identifiable. See VanderWeele et al. (2014) for details about randomized analogues of causal effects.

Table 3: Causal Effects on the Difference Scale
Full Name Abbreviation Formula
Controlled Direct Effect CDECDE E[YamYa*m]E[Y_{am}-Y_{a^*m}]
Randomized Analogue of PNDEPNDE rPNDErPNDE E[YaGa*Ya*Ga*]E[Y_{aG_{a^*}}-Y_{a^*G_{a^*}}]
Randomized Analogue of TNDETNDE rTNDErTNDE E[YaGaYa*Ga]E[Y_{aG_{a}}-Y_{a^*G_{a}}]
Randomized Analogue of PNIEPNIE rPNIErPNIE E[Ya*GaYa*Ga*]E[Y_{a^*G_{a}}-Y_{a^*G_{a^*}}]
Randomized Analogue of TNIETNIE rTNIErTNIE E[YaGaYaGa*]E[Y_{aG_{a}}-Y_{aG_{a^*}}]
Total Effect TETE rPNDE+rTNIErPNDE+rTNIE or rTNDE+rPNIErTNDE+rPNIE
Randomized Analogue of INTrefINT_{ref} rINTrefrINT_{ref} rPNDECDErPNDE-CDE
Randomized Analogue of INTmedINT_{med} rINTmedrINT_{med} rTNIErPNIErTNIE-rPNIE
Proportion CDECDE propCDEprop^{CDE} CDE/TECDE/TE
Proportion rINTrefrINT_{ref} proprINTrefprop^{rINT_{ref}} rINTref/TErINT_{ref}/TE
Proportion rINTmedrINT_{med} proprINTmedprop^{rINT_{med}} rINTmed/TErINT_{med}/TE
Proportion rPNIErPNIE proprPNIEprop^{rPNIE} rPNIE/TErPNIE/TE
Randomized Analogue of PMPM rPMrPM rTNIE/TErTNIE/TE
Randomized Analogue of INTINT rINTrINT (rINTref+rINTmed)/TE(rINT_{ref}+rINT_{med})/TE
Randomized Analogue of PEPE rPErPE (rINTref+rINTmed+rPNIE)/TE(rINT_{ref}+rINT_{med}+rPNIE)/TE
Note:
aa and a*a^* are the active and control values for AA. mm is the value at which MM is controlled. GaG_{a} denotes a random draw from the distribution of MM had A=aA=a. YamY_{am} denotes the counterfactual value of YY that would have been observed had AA been set to be aa, and MM to be mm. YaGa*Y_{aG_{a*}} denotes the counterfactual value of YY that would have been observed had AA been set to be aa, and MM to be the counterfactual value Ga*G_{a*}.
Table 4: Causal Effects on the Ratio Scale
Full Name Abbreviation Formula
Controlled Direct Effect RCDER^{CDE} E[Yam]/E[Ya*m]E[Y_{am}]/E[Y_{a^*m}]
Randomized Analogue of PNDEPNDE rRPNDErR^{PNDE} E[YaGa*]/E[Ya*Ga*]E[Y_{aG_{a^*}}]/E[Y_{a^*G_{a^*}}]
Randomized Analogue of TNDETNDE rRTNDErR^{TNDE} E[YaGa]/E[Ya*Ga]E[Y_{aG_{a}}]/E[Y_{a^*G_{a}}]
Randomized Analogue of PNIEPNIE rRPNIErR^{PNIE} E[Ya*Ga]/E[Ya*Ga*]E[Y_{a^*G_{a}}]/E[Y_{a^*G_{a^*}}]
Randomized Analogue of TNIETNIE rRTNIErR^{TNIE} E[YaGa]/E[YaGa*]E[Y_{aG_{a}}]/E[Y_{aG_{a^*}}]
Total Effect RTER^{TE} rRPNDE×rRTNIErR^{PNDE}\times rR^{TNIE} or rRTNDE×rRPNIErR^{TNDE}\times rR^{PNIE}
Excess Ratio due to Controlled Direct Effect ERCDEER^{CDE} (E[YamYa*m])/E[Ya*Ma*](E[Y_{am}-Y_{a^*m}])/E[Y_{a^*M_a^*}]
Randomized Analogue of ERINTrefER^{INT_{ref}} rERINTrefrER^{INT_{ref}} rRPNDE1ERCDErR^{PNDE}-1-ER^{CDE}
Randomized Analogue of ERINTmedER^{INT_{med}} rERINTmedrER^{INT_{med}} rRTNIE*rRPNDErRPNDErRPNIE+1rR^{TNIE}*rR^{PNDE}-rR^{PNDE}-rR^{PNIE}+1
Randomized Analogue of ERPNIEER^{PNIE} rERPNIErER^{PNIE} rRPNIE1rR^{PNIE}-1
Proportion ERCDEER^{CDE} propERCDEprop^{ER^{CDE}} ERCDE/(rRTE1)ER^{CDE}/(rR^{TE}-1)
Proportion rERINTrefrER^{INT_{ref}} proprERINTrefprop^{rER^{INT_{ref}}} rERINTref/(RTE1)rER^{INT_{ref}}/(R^{TE}-1)
Proportion rERINTmedrER^{INT_{med}} proprERINTmedprop^{rER^{INT_{med}}} rERINTmed/(RTE1)rER^{INT_{med}}/(R^{TE}-1)
Proportion rERPNIErER^{PNIE} proprERPNIEprop^{rER^{PNIE}} rERPNIE/(RTE1)rER^{PNIE}/(R^{TE}-1)
Randomized Analogue of PMPM rPMrPM (rRPNDE*(rRTNIE1))/(RTE1)(rR^{PNDE}*(rR^{TNIE}-1))/(R^{TE}-1)
Randomized Analogue of INTINT rINTrINT (rERINTref+rERINTmed)/(RTE1)(rER^{INT_{ref}}+rER^{INT_{med}})/(R^{TE}-1)
Randomized Analogue of PEPE rPErPE (rERINTref+rERINTmed+rERPNIE)/(RTE1)(rER^{INT_{ref}}+rER^{INT_{med}}+rER^{PNIE})/(R^{TE}-1)
Note:
aa and a*a^* are the active and control values for AA. mm is the value at which MM is controlled. GaG_{a} denotes a random draw from the distribution of MM among those with A=aA=a. YamY_{am} denotes the counterfactual value of YY that would have been observed had AA been set to be aa, and MM to be mm. YaGa*Y_{aG_{a*}} denotes the counterfactual value of YY that would have been observed had AA been set to be aa, and MM to be the counterfactual value Ga*G_{a*}. If YY is categorical, E[Y]E[Y] represents the probability of Y=yY=y where yy is a pre-specified value of YY.

The Marginal Structural Model

With the marginal structural model, CMAverse estimates causal effects through direct counterfactual imputation estimation by the following steps:

  1. For p=1,...,kp=1,...,k, fit the regression model specified by wmnomreg[p] for P(Mp|A,M1,...,Mp1)P(M_p|A,M_1, ...,M_{p-1}) and obtain P(Mp=Mp,i|A=Ai,M1=M1,i,...,Mp1=Mp1,i)P(M_p=M_{p,i}|A=A_i,M_1=M_{1,i},...,M_{p-1}=M_{p-1,i}) for i=1,...,ni=1,...,n.

  2. For p=1,...,kp=1,...,k, fit the regression model specified by wmdenomreg[p] for P(Mp|A,M1,...,Mp1,L,C)P(M_p|A,M_1, ...,M_{p-1},L,C) and obtain P(Mp=Mp,i|A=Ai,M1=M1,i,...,Mp1=Mp1,i,L=Li,C=Ci)P(M_p=M_{p,i}|A=A_i,M_1=M_{1,i},...,M_{p-1}=M_{p-1,i},L=L_i,C=C_i) for i=1,...,ni=1,...,n.

  3. If CC is not empty, fit the regression model specified by ereg for P(A|C)P(A|C) and obtain P(A=Ai|C=Ci)P(A=A_i|C=C_i) for i=1,...,ni=1,...,n.

  4. Add weights to the regression model specified by yreg for E(Y|A,M)E(Y|A,M) and each subject i,i=1,...,ni,i=1,...,n is given a weight P(A=Ai)P(A=Ai|C=Ci)P(M1=M1,i|A=Ai)P(M1=M1,i|A=Ai,C=Ci,L=Li)...P(Mk=Mk,i|A=Ai,M1=M1,i,...,Mk1=Mk1,i)P(Mk=Mk,i|A=Ai,M1=M1,i,...,Mk1=Mk1,i,C=Ci,L=Li)\frac{P(A=A_i)}{P(A=A_i|C=C_i)}\frac{P(M_1=M_{1,i}|A=A_i)}{P(M_1=M_{1,i}|A=A_i,C=C_i,L=L_i)}...\frac{P(M_k=M_{k,i}|A=A_i,M_1=M_{1,i},...,M_{k-1}=M_{k-1,i})}{P(M_k=M_{k,i}|A=A_i,M_1=M_{1,i},...,M_{k-1}=M_{k-1,i},C=C_i,L=L_i)}.

  5. For p=1,...,kp=1,...,k, add weights to the regression model specified by mreg[p] for the distribution of MpM_p given AA and each subject i,i=1,...,ni,i=1,...,n is given a weight P(A=Ai)P(A=Ai|C=Ci)\frac{P(A=A_i)}{P(A=A_i|C=C_i)}.

  6. For p=1,...,kp=1,...,k and i=1,...,ni=1,...,n, simulate the counterfactuals Ma,p,iM_{a,p,i} and Ma*,p,iM_{a^*,p,i} from the regression models in step 5.

    • Simulate Ma,p,iM_{a,p,i} by randomly drawing a value from the distribution of MpM_{p} given A=aA=a. Denote Ma,i=(Ma,1,i,...,Ma,k,i)TM_{a,i}=(M_{a,1,i},...,M_{a,k,i})^T.
    • Simulate Ma*,p,iM_{a^*,p,i} by randomly drawing a value from the distribution of MpM_{p} given A=a*A=a^*. Denote Ma*,i=(Ma*,1,i,...,Ma*,k,i)TM_{a^*,i}=(M_{a^*,1,i},...,M_{a^*,k,i})^T.
  7. For i=1,...,ni=1,...,n, obtain E[Yi|A=a*,M=m]E[Y_i|A=a^*,M=m], E[Yi|A=a,M=m]E[Y_i|A=a,M=m], E[Yi|A=a*,M=Ma*,i]E[Y_i|A=a^*,M=M_{a^*,i}], E[Yi|A=a*,M=Ma,i]E[Y_i|A=a^*,M=M_{a,i}], E[Yi|A=a,M=Ma*,i]E[Y_i|A=a,M=M_{a^*,i}] and E[Yi|A=a,M=Ma,i]E[Y_i|A=a,M=M_{a,i}] from the regression model in step 4.

  8. Impute the counterfactuals E[Ya*m]E[Y_{a^*m}], E[Yam]E[Y_{am}], E[Ya*Ga*]E[Y_{a^*Ga^*}], E[YaGa]E[Y_{aGa}], E[YaGa*]E[Y_{aGa^*}] and E[Ya*Ga]E[Y_{a^*Ga}].

    • Impute E[Ya*m]E[Y_{a^*m}] by taking a weighted average of {E[Yi|A=a*,M=m]}i=1,...,n\{E[Y_i|A=a^*,M=m]\}_{i=1,...,n};
    • Impute E[Yam]E[Y_{am}] by taking a weighted average of {E[Yi|A=a,M=m]}i=1,...,n\{E[Y_i|A=a,M=m]\}_{i=1,...,n};
    • Impute E[Ya*Ga*]E[Y_{a^*Ga^*}] by taking a weighted average of {E[Yi|A=a*,M=Ma*,i]}i=1,...,n\{E[Y_i|A=a^*,M=M_{a^*,i}]\}_{i=1,...,n};
    • Impute E[YaGa]E[Y_{aGa}] by taking a weighted average of {E[Yi|A=a,M=Ma,i]}i=1,...,n\{E[Y_i|A=a,M=M_{a,i}]\}_{i=1,...,n};
    • Impute E[YaGa*]E[Y_{aGa^*}] by taking a weighted average of {E[Yi|A=a,M=Ma*,i]}i=1,...,n\{E[Y_i|A=a,M=M_{a^*,i}]\}_{i=1,...,n};
    • Impute E[Ya*Ga]E[Y_{a^*Ga}] by taking a weighted average of {E[Yi|A=a*,M=Ma,i]}i=1,...,n\{E[Y_i|A=a^*,M=M_{a,i}]\}_{i=1,...,n},

    each subject i,i=1,...,ni,i=1,...,n is given a weight P(A=Ai)P(A=Ai|C=Ci)P(M1=M1,i|A=Ai)P(M1=M1,i|A=Ai,C=Ci,L=Li)...P(Mk=Mk,i|A=Ai,M1=M1,i,...,Mk1=Mk1,i)P(Mk=Mk,i|A=Ai,M1=M1,i,...,Mk1=Mk1,i,C=Ci,L=Li)\frac{P(A=A_i)}{P(A=A_i|C=C_i)}\frac{P(M_1=M_{1,i}|A=A_i)}{P(M_1=M_{1,i}|A=A_i,C=C_i,L=L_i)}...\frac{P(M_k=M_{k,i}|A=A_i,M_1=M_{1,i},...,M_{k-1}=M_{k-1,i})}{P(M_k=M_{k,i}|A=A_i,M_1=M_{1,i},...,M_{k-1}=M_{k-1,i},C=C_i,L=L_i)}.

  9. Calculate causal effects with formulas in table 3 or table 4.

The g-formula Approach

With the gg-formula approach, CMAverse estimates causal effects through direct counterfactual imputation estimation by the following steps:

  1. For q=1,...,sq=1,...,s, fit the regression model specified by postcreg[q] for the distribution of LqL_q given AA and CC.

  2. For q=1,...,sq=1,...,s and i=1,...,ni=1,...,n, simulate the counterfactuals La,q,iL_{a,q,i} and La*,q,iL_{a^*,q,i} from the regression models in step 1.

    • Simulate La,q,iL_{a,q,i} by randomly drawing a value from the distribution of LqL_{q} given A=a,C=CiA=a,C=C_i. Denote La,i=(La,1,i,...,La,s,i)TL_{a,i}=(L_{a,1,i},...,L_{a,s,i})^T.
    • Simulate La*,q,iL_{a^*,q,i} by randomly drawing a value from the distribution of LqL_{q} given A=a*,C=CiA=a^*,C=C_i. Denote La*,i=(La*,1,i,...,La*,s,i)TL_{a^*,i}=(L_{a^*,1,i},...,L_{a^*,s,i})^T.
  3. For p=1,...,kp=1,...,k, fit the regression model specified by mreg[p] for the distribution of MpM_p given AA, LL and CC.

  4. For p=1,...,kp=1,...,k and i=1,...,ni=1,...,n, simulate the counterfactuals Ma,p,iM_{a,p,i} and Ma*,p,iM_{a^*,p,i} from the regression models in step 3.

    • Simulate Ma,p,iM_{a,p,i} by randomly drawing a value from the distribution of MpM_{p} given A=a,L=La,i,C=CiA=a,L=L_{a,i},C=C_i. Denote Ma,i=(Ma,1,i,...,Ma,k,i)TM_{a,i}=(M_{a,1,i},...,M_{a,k,i})^T.
    • Simulate Ma*,p,iM_{a^*,p,i} by randomly drawing a value from the distribution of MpM_{p} given A=a*,L=La*,i,C=CiA=a^*,L=L_{a^*,i},C=C_i. Denote Ma*,i=(Ma*,1,i,...,Ma*,k,i)TM_{a^*,i}=(M_{a^*,1,i},...,M_{a^*,k,i})^T.
  5. Obtain {Ga,i}i=1,...,n\{G_{a,i}\}_{i=1,...,n} by randomly permuting {Ma,i}i=1,...,n\{M_{a,i}\}_{i=1,...,n} and obtain {Ga*,i}i=1,...,n\{G_{a^*,i}\}_{i=1,...,n} by randomly permuting {Ma*,i}i=1,...,n\{M_{a^*,i}\}_{i=1,...,n}.

  6. Fit the regression model specified by yreg for E(Y|A,M,L,C)E(Y|A,M,L,C).

  7. For i=1,...,ni=1,...,n, obtain E[Yi|A=a*,M=m,L=La*,i,C=Ci]E[Y_i|A=a^*,M=m,L=L_{a^*,i},C=C_i], E[Yi|A=a,M=m,L=La,i,C=Ci]E[Y_i|A=a,M=m,L=L_{a,i},C=C_i], E[Yi|A=a*,M=Ga*,i,L=La*,i,C=Ci]E[Y_i|A=a^*,M=G_{a^*,i},L=L_{a^*,i},C=C_i], E[Yi|A=a*,M=Ga,i,L=La*,i,C=Ci]E[Y_i|A=a^*,M=G_{a,i},L=L_{a^*,i},C=C_i], E[Yi|A=a,M=Ga*,i,L=La,i,C=Ci]E[Y_i|A=a,M=G_{a^*,i},L=L_{a,i},C=C_i] and E[Yi|A=a,M=Ga,i,L=La,i,C=Ci]E[Y_i|A=a,M=G_{a,i},L=L_{a,i},C=C_i] from the regression model in step 5.

  8. Impute the counterfactuals E[Ya*m]E[Y_{a^*m}], E[Yam]E[Y_{am}], E[Ya*Ga*]E[Y_{a^*Ga^*}], E[YaGa]E[Y_{aGa}], E[YaGa*]E[Y_{aGa^*}] and E[Ya*Ga]E[Y_{a^*Ga}].

    • Impute E[Ya*m]E[Y_{a^*m}] by taking an average of {E[Yi|A=a*,M=m,L=La*,i,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,M=m,L=L_{a^*,i},C=C_i]\}_{i=1,...,n};
    • Impute E[Yam]E[Y_{am}] by taking an average of {E[Yi|A=a,M=m,L=La,i,C=Ci]}i=1,...,n\{E[Y_i|A=a,M=m,L=L_{a,i},C=C_i]\}_{i=1,...,n};
    • Impute E[Ya*Ga*]E[Y_{a^*Ga^*}] by taking an average of {E[Yi|A=a*,M=Ga*,i,L=La*,i,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,M=G_{a^*,i},L=L_{a^*,i},C=C_i]\}_{i=1,...,n};
    • Impute E[YaGa]E[Y_{aGa}] by taking an average of {E[Yi|A=a,M=Ga,i,L=La,i,C=Ci]}i=1,...,n\{E[Y_i|A=a,M=G_{a,i},L=L_{a,i},C=C_i]\}_{i=1,...,n};
    • Impute E[YaGa*]E[Y_{aGa^*}] by taking an average of {E[Yi|A=a,M=Ga*,i,L=La,i,C=Ci]}i=1,...,n\{E[Y_i|A=a,M=G_{a^*,i},L=L_{a,i},C=C_i]\}_{i=1,...,n};
    • Impute E[Ya*Ga]E[Y_{a^*Ga}] by taking an average of {E[Yi|A=a*,M=Ga,i,L=La*,i,C=Ci]}i=1,...,n\{E[Y_i|A=a^*,M=G_{a,i},L=L_{a^*,i},C=C_i]\}_{i=1,...,n},
  9. Calculate causal effects with formulas in table 3 or table 4.