graph LR
X((X)) -->|a| M((M))
M -->|b| Y((Y))
X -->|c| Y
Mediation analysis (Yuan and MacKinnon 2009) allows researchers to investigate the mechanism by which an independent variable (X) influences a dependent variable (Y). Rather than just asking “Does X affect Y?”, mediation asks “Does X affect Y through an intermediate variable M?”
Common examples include:
- Psychology: Does a therapy (X) reduce anxiety (M), which in turn improves sleep quality (Y)?
- Medicine: Does a new drug (X) lower blood pressure (M), thereby decreasing the risk of heart attack (Y)?
In this vignette, we demonstrate how to estimate a simple mediation model using INLAvaan. We will fit a standard three-variable mediation model:
- a: The effect of X on M.
- b: The effect of M on Y.
- c: The direct effect of X on Y.
- a × b: The indirect effect (the mediation effect).
In a mediation model, the Total Effect represents the overall impact of X on Y, ignoring the specific pathway. It answers the question: “If I change X, how much does Y change in total, regardless of whether it goes through M or not?”
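Written out as regression equations, the three-variable model makes the total effect explicit. The intercepts i_M and i_Y and the residuals ε_M and ε_Y are notation assumed here for illustration; they do not come from the package documentation.

$$
\begin{aligned}
M &= i_M + aX + \varepsilon_M, \\
Y &= i_Y + cX + bM + \varepsilon_Y \\
  &= (i_Y + b\,i_M) + (c + ab)\,X + (b\,\varepsilon_M + \varepsilon_Y).
\end{aligned}
$$

Substituting the first equation into the second leaves a coefficient of c + ab on X, which is the total effect: the direct effect c plus the indirect effect ab.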
Data Simulation
To verify that INLAvaan recovers the correct parameters, we simulate data where the “truth” is known. The logic is as follows. We generate:
- X from a standard normal distribution;
- M dependent on X with a coefficient of 0.5; and
- Y dependent only on M with a coefficient of 0.7.
Critically, we do not add X to the generation of Y. This means the true direct effect (c) is 0, and the relationship is fully mediated. We expect our model to estimate a ≈ 0.5, b ≈ 0.7, and the indirect effect a × b ≈ 0.35. The direct effect c should be close to zero.
set.seed(11)
n <- 100 # sample size
# 1. Predictor
X <- rnorm(n)
# 2. Mediator (Path a = 0.5)
M <- 0.5 * X + rnorm(n)
# 3. Outcome (Path b = 0.7, Path c = 0)
Y <- 0.7 * M + rnorm(n)
dat <- data.frame(X = X, Y = Y, M = M)
Model Specification and Fit
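Before fitting the Bayesian model, a quick least-squares check can confirm that the simulated data behave as intended. The sketch below uses only base R; the object names (`fit_m`, `fit_y`, `ols_check`) are illustrative and not part of INLAvaan. It regenerates the data with the same seed so that it runs on its own.

```r
# Re-create the simulated data so this block runs on its own
set.seed(11)
n <- 100
X <- rnorm(n)
M <- 0.5 * X + rnorm(n)
Y <- 0.7 * M + rnorm(n)
dat <- data.frame(X = X, Y = Y, M = M)

# Two ordinary least-squares regressions mirror the two model equations
fit_m <- lm(M ~ X, data = dat)      # path a
fit_y <- lm(Y ~ X + M, data = dat)  # paths c (on X) and b (on M)

a_hat <- unname(coef(fit_m)["X"])
b_hat <- unname(coef(fit_y)["M"])
c_hat <- unname(coef(fit_y)["X"])

# Plug-in indirect and total effects from the point estimates
ols_check <- c(a = a_hat, b = b_hat, c = c_hat,
               ab = a_hat * b_hat, total = c_hat + a_hat * b_hat)
round(ols_check, 3)
```

With weakly informative priors, the posterior summaries from the Bayesian fit should land close to these least-squares values, so a large disagreement would suggest a problem with the data or the model string rather than with the estimator.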
The standard lavaan syntax for a mediation model is straightforward. Note the use of the := operator to define the indirect and total effects as new parameters:
mod <- "
# Direct effect (path c)
Y ~ c*X
# Mediator paths (path a and b)
M ~ a*X
Y ~ b*M
# Define Indirect effect (a*b)
ab := a*b
# Define Total effect
total := c + (a*b)
"
The model is fit using asem(). The meanstructure = TRUE argument is supplied to estimate intercepts for the variables.
library(INLAvaan)
fit <- asem(mod, dat, meanstructure = TRUE)
#> ℹ Finding posterior mode.
#> ✔ Finding posterior mode. [34ms]
#>
#> ℹ Computing the Hessian.
#> ✔ Computing the Hessian. [83ms]
#>
#> ℹ Performing VB correction.
#> ✔ VB correction; mean |δ| = 0.011σ. [166ms]
#>
#> ⠙ Fitting skew normal to 0/7 marginals.
#> ✔ Fitting skew normal to 7/7 marginals. [210ms]
#>
#> ⠙ Computing ppp and DIC.
#> ✔ Computing ppp and DIC. [352ms]
#>
The user may wish to specify different prior distributions for the parameters. See the relevant section in the Get started vignette for further details.
Results
The summary output provides the posterior mean, standard deviation, and 95% credible intervals for all paths.
summary(fit)
#> INLAvaan 0.2.3.9004 ended normally after 5 iterations
#>
#> Estimator BAYES
#> Optimization method NLMINB
#> Number of model parameters 7
#>
#> Number of observations 100
#>
#> Model Test (User Model):
#>
#> Marginal log-likelihood -311.904
#> PPP (Chi-square) 0.598
#>
#> Information Criteria:
#>
#> Deviance (DIC) 568.397
#> Effective parameters (pD) 6.756
#>
#> Parameter Estimates:
#>
#> Marginalisation method SKEWNORM
#> VB correction TRUE
#>
#> Regressions:
#> Estimate SD 2.5% 97.5% NMAD Prior
#> Y ~
#> X (c) -0.060 0.118 -0.291 0.171 0.000 normal(0,10)
#> M ~
#> X (a) 0.525 0.108 0.315 0.736 0.000 normal(0,10)
#> Y ~
#> M (b) 0.771 0.099 0.577 0.964 0.000 normal(0,10)
#>
#> Intercepts:
#> Estimate SD 2.5% 97.5% NMAD Prior
#> .Y -0.071 0.098 -0.263 0.122 0.000 normal(0,32)
#> .M 0.126 0.099 -0.068 0.319 0.000 normal(0,32)
#>
#> Variances:
#> Estimate SD 2.5% 97.5% NMAD Prior
#> .Y 0.977 0.145 0.733 1.303 0.006 gamma(1,.5)[sd]
#> .M 0.998 0.147 0.749 1.325 0.006 gamma(1,.5)[sd]
#>
#> Defined Parameters:
#> Estimate SD 2.5% 97.5% NMAD Prior
#> ab 0.405 0.091 0.235 0.582
#> total 0.351 0.136 0.107 0.620
Looking at the Regressions, Intercepts, and Defined Parameters sections of the output:
- Both intercepts have 95% credible intervals that include zero, as expected given that the data were simulated with true intercepts of zero.
- Path a (M ~ X) estimated at 0.525 (true value 0.5).
- Path b (Y ~ M) estimated at 0.771 (true value 0.7).
- Path c (Y ~ X) estimated at -0.060. The 95% credible interval [-0.291, 0.171] includes zero, consistent with the absence of a direct effect.
- Indirect Effect (ab) estimated at 0.405 (true value 0.35). The interval [0.235, 0.582] excludes zero, providing evidence of mediation.
- Total Effect estimated at 0.351.
  - This is the sum of the direct and indirect effects (c + a × b).
  - It tells us that a 1-unit increase in X leads to a total increase of roughly 0.351 in Y.
  - Note: In this simulation, even though the direct effect is close to zero (its interval includes zero), the interval for the total effect [0.107, 0.620] excludes zero because the mechanism via M is strong. This illustrates a “full mediation” scenario: X affects Y, but only because of M.
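To see why a defined parameter such as ab gets its own posterior summary rather than a simple product of point estimates, the uncertainty in a and b can be propagated through the product by simulation. The sketch below is only an illustration under a strong assumption: it treats a, b, and c as independent normals built from the means and SDs reported above, whereas the fitted marginals are skew-normal and the parameters need not be independent. The helper `summarise_draws()` is hypothetical and not an INLAvaan function.

```r
set.seed(1)      # arbitrary seed for this illustration
n_draws <- 1e5

# ASSUMPTION: independent normal approximations using the reported
# posterior means and SDs; this ignores skewness and any correlation.
a_draws <- rnorm(n_draws, mean = 0.525, sd = 0.108)
b_draws <- rnorm(n_draws, mean = 0.771, sd = 0.099)
c_draws <- rnorm(n_draws, mean = -0.060, sd = 0.118)

# Derived quantities are computed draw by draw
ab_draws    <- a_draws * b_draws
total_draws <- c_draws + ab_draws

# Hypothetical helper: mean, SD, and 95% interval of a vector of draws
summarise_draws <- function(x) {
  c(mean = mean(x), sd = sd(x), quantile(x, c(0.025, 0.975)))
}

round(rbind(ab = summarise_draws(ab_draws),
            total = summarise_draws(total_draws)), 3)
```

Because of the independence and normality shortcuts, these numbers will generally not match the INLAvaan output exactly. The point is that the interval for ab comes from the distribution of the product itself, which need not be symmetric.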
