Nonlinear latent variable models
Maximum likelihood inference in non-linear structural equation models (SEMs) is complex: computationally intensive methods based on numerical integration are needed, and the results are sensitive to distributional assumptions.
In a recent paper, A two-stage estimation procedure for non-linear structural equation models by Klaus Kähler Holst & Esben Budtz-Jørgensen (https://doi.org/10.1093/biostatistics/kxy082), we consider two-stage estimators as a computationally simple alternative to MLE. Here both steps are based on linear models: first we predict the non-linear latent terms, and in the second step these predictions are related to the latent outcomes.

Introduction: the measurement error problem
Measurement error in covariates is a common source of bias in regression analyses. In a simple linear regression setting, let

$$Y_i = \beta_0 + \beta\xi_i + \epsilon_i,$$

where instead of the true exposure $\xi_i$ we observe the error-prone measurement

$$X_i = \xi_i + U_i,$$

where the measurement error terms $U_i$ are independent of $\xi_i$ and $\epsilon_i$. Regressing $Y_i$ on $X_i$ then attenuates the slope: the least-squares estimate converges to $\beta\,\sigma_\xi^2/(\sigma_\xi^2 + \sigma_U^2)$ rather than $\beta$.
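The attenuation is easy to illustrate with a small R simulation (a minimal sketch with hypothetical parameter values; here the true slope is 1 and the reliability ratio $\sigma_\xi^2/(\sigma_\xi^2+\sigma_U^2)$ is 1/2):

## Hypothetical illustration of regression attenuation
set.seed(1)
n  <- 1e5
xi <- rnorm(n)           # true exposure, variance 1
x  <- xi + rnorm(n)      # error-prone measurement, error variance 1
y  <- xi + rnorm(n)      # outcome, true slope 1
coef(lm(y ~ x))["x"]     # approx. 0.5 = 1/(1+1), the reliability ratio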
In the following, we consider generalizations of this problem to general latent variable models with non-linear effects of the exposure variable.
Statistical model
For subjects $i = 1, \ldots, n$ we observe a vector of measurements, $X_i = (X_{i1}, \ldots, X_{iq})^\top$, of the latent predictor $\xi_i$, described by the linear measurement model

$$X_i = \nu_1 + \Lambda_1 \xi_i + K_1 Z_i + \delta_i, \qquad (1)$$

where $Z_i$ is a vector of observed covariates and $\delta_i$ is a vector of residual measurement errors. The latent predictors ($\xi_i$) are described by the linear structural model

$$\xi_i = \alpha_1 + \Gamma_1 Z_i + \zeta_{1i}, \qquad (2)$$

where diagonal elements in the covariance matrix of $\delta_i$ are the free measurement error variances (off-diagonal elements are typically fixed at zero), and $\zeta_{1i}$ is a zero-mean Gaussian residual independent of $\delta_i$.

The observed variables $Y_i = (Y_{i1}, \ldots, Y_{ip})^\top$ are linked to the latent outcome $\eta_i$ via the measurement model

$$Y_i = \nu_2 + \Lambda_2 \eta_i + K_2 Z_i + \epsilon_i, \qquad (3)$$

where the error terms $\epsilon_i$ are zero-mean Gaussian and independent of all other residual terms, and where the latent outcome is described by the structural model

$$\eta_i = \alpha_2 + B\,\phi(\xi_i) + \Gamma_2 Z_i + \zeta_{2i}, \qquad (4)$$

with $\phi$ a known (potentially non-linear) function of the latent predictor.
Two-stage estimator (2SSEM)
- The linear SEM given by equations (1) and (2) is fitted (e.g., by MLE) to obtain a consistent estimate $\widehat{\theta}_1$ of the stage 1 parameter $\theta_1$.
- The stage 2 parameter $\theta_2$ is estimated via a linear SEM with measurement model given by equation (3) and structural model given by equation (4), where the non-linear term of the latent predictor, $\phi(\xi_i)$, is replaced by the Empirical Bayes plug-in estimate $E_{\widehat{\theta}_1}\{\phi(\xi_i) \mid X_i, Z_i\}$.
Prediction of non-linear latent terms
For important classes of non-linear functions ($\phi$) the conditional expectation $E_{\theta_1}\{\phi(\xi_i) \mid X_i, Z_i\}$ is available in closed form, exploiting that under the stage 1 model the conditional distribution of the latent predictor given the observed data is Gaussian,

$$\xi_i \mid X_i, Z_i \sim \mathcal{N}(\mu_i, \Sigma),$$

where $\mu_i$ is the usual (Empirical Bayes) prediction of the latent variable and $\Sigma$ the corresponding conditional covariance.
Polynomials
Here $E(\xi_i^k \mid X_i, Z_i)$ follows from the moments of the normal distribution. In the univariate case, $\xi_i \mid X_i, Z_i \sim \mathcal{N}(\mu_i, \sigma^2)$, the moments can for instance be computed via the recursion

$$E(\xi_i^k \mid X_i, Z_i) = \mu_i\,E(\xi_i^{k-1} \mid X_i, Z_i) + (k-1)\,\sigma^2\,E(\xi_i^{k-2} \mid X_i, Z_i), \qquad k \geq 2.$$
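A direct implementation of this recursion, checked against a Monte Carlo estimate (a small sketch, not part of lava; norm_moment is a hypothetical helper):

## k-th raw moment of N(mu, sigma^2) via the recursion above
norm_moment <- function(k, mu, sigma2) {
  m <- c(1, mu)   # raw moments of order 0 and 1
  if (k >= 2) {
    for (j in 2:k) m[j + 1] <- mu * m[j] + (j - 1) * sigma2 * m[j - 1]
  }
  m[k + 1]
}
norm_moment(3, mu=1, sigma2=2)      # 1^3 + 3*1*2 = 7
mean(rnorm(1e5, 1, sqrt(2))^3)      # Monte Carlo check, approx. 7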
Exponentials
For the exponential function the conditional expectation follows directly from the moment-generating function of the normal distribution,

$$E\{\exp(\xi_i) \mid X_i, Z_i\} = \exp(\mu_i + \tfrac{1}{2}\sigma^2).$$

The conditional mean of functions on the form $\xi_i^k \exp(\xi_i)$ can also be derived in closed form (e.g., by differentiating the moment-generating function).
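The identity is simply the Gaussian moment-generating function evaluated at one, which is easily verified by simulation (arbitrary example values):

mu <- 0.5; sigma2 <- 0.8
exp(mu + sigma2/2)                        # closed-form E{exp(xi)}
mean(exp(rnorm(1e5, mu, sqrt(sigma2))))   # Monte Carlo check, approx. the same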
Interactions
In the product-interaction model the structural model (4) contains the product term $\xi_{1i}\xi_{2i}$ of two latent predictors; now

$$E(\xi_{1i}\xi_{2i} \mid X_i, Z_i) = \mu_{1i}\,\mu_{2i} + \Sigma_{12},$$

where $\mu_{1i}$ and $\mu_{2i}$ are the conditional means and $\Sigma_{12}$ the conditional covariance of the two latent predictors given the observed data.
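This identity can be checked directly by simulating from a bivariate normal (using MASS::mvrnorm; the means and covariance below are arbitrary example values):

library(MASS)
S  <- matrix(c(1, 0.5, 0.5, 1), 2)   # conditional covariance, Sigma_12 = 0.5
xi <- mvrnorm(1e5, mu=c(1, 2), Sigma=S)
mean(xi[,1]*xi[,2])                  # approx. 1*2 + 0.5 = 2.5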
Splines
A natural cubic spline with knots $\tau_1 < \cdots < \tau_K$ can be expressed in the truncated power basis,

$$\phi(\xi_i) = \beta_0 + \beta_1\xi_i + \sum_{k=1}^{K}\gamma_k\,(\xi_i - \tau_k)_+^3,$$

with $(x)_+ = \max(x, 0)$ and with linear constraints on the coefficients ensuring linearity beyond the boundary knots, where the required conditional expectations $E\{(\xi_i - \tau_k)_+^j \mid X_i, Z_i\}$ are available in closed form via the moments of the truncated normal distribution (expressions involving the Gaussian density $\varphi$ and distribution function $\Phi$).
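As an illustration of these truncated moment calculations, the first-order case $E\{(\xi_i-\tau)_+ \mid X_i, Z_i\} = (\mu_i-\tau)\Phi\{(\mu_i-\tau)/\sigma\} + \sigma\varphi\{(\mu_i-\tau)/\sigma\}$ can be verified by simulation (the cubic terms needed for the spline basis follow analogously; the example values below are arbitrary):

mu <- 1; sigma <- 1.5; tau <- 0.5
z  <- (mu - tau)/sigma
(mu - tau)*pnorm(z) + sigma*dnorm(z)        # closed form for E{(xi - tau)_+}
mean(pmax(rnorm(1e5, mu, sigma) - tau, 0))  # Monte Carlo check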
Relaxing distributional assumptions
Let the conditional distribution of the latent predictor given covariates be given by a finite mixture of Gaussians,

$$\xi_i \mid Z_i \sim \sum_{k=1}^{K} \pi_k\,\mathcal{N}\{\mu_k(Z_i), \Psi_k\},$$

with mixing probabilities $\pi_1, \ldots, \pi_K$ summing to one. The conditional distribution of $\xi_i$ given $X_i, Z_i$ is then again a Gaussian mixture, and the closed-form conditional moments of the previous section apply component-wise. Results can be extended also to the case where the stage 1 model is estimated under such a mixture specification, thus relaxing the normality assumption on the latent predictor.
Asymptotics
Theorem. Under a correctly specified model, 2SSEM yields consistent estimation of all parameters.
Proof (intuitive version: linear models and Berkson error). Let

$$Y = \beta X + \epsilon$$

with Berkson error in the covariate, $X = W + U$, where the observed $W$ is independent of the error $U$ and $E(U) = 0$. Let $\widehat{\beta}$ denote the least-squares estimate from regressing $Y$ on $W$. Berkson error does not lead to bias in $\widehat{\beta}$, but the residual variance is too high. This follows from iterated conditional means: $E(Y \mid W) = \beta\,E(X \mid W) = \beta W$. Consistency of 2SSEM then follows from noting that we have exactly Berkson error in the second step:

$$\phi(\xi_i) = E_{\theta_1}\{\phi(\xi_i) \mid X_i, Z_i\} + U_i, \qquad E(U_i \mid X_i, Z_i) = 0.$$
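The Berkson argument can be illustrated numerically: regressing $Y$ on $W$ recovers the true slope even though the true predictor is unobserved, while the residual variance is inflated (a small sketch with hypothetical parameter values):

set.seed(1)
n <- 1e5
w <- rnorm(n)                  # observed predictor
x <- w + rnorm(n)              # true predictor subject to Berkson error
y <- 2*x + rnorm(n)            # true slope beta = 2, error variance 1
coef(lm(y ~ w))["w"]           # approx. 2: no bias
summary(lm(y ~ w))$sigma^2     # approx. 2^2*1 + 1 = 5: inflated residual variance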
Assume that we have i.i.d. observations $(Y_i, X_i, Z_i)$, $i = 1, \ldots, n$. We will assume that the stage 1 model estimator is obtained as the solution to the following score equation:

$$\sum_{i=1}^{n} U_1(\theta_1; X_i, Z_i) = 0,$$

that the two-stage estimator $\widehat{\theta}_2$ solves the corresponding stage 2 estimating equation $\sum_{i=1}^{n} U_2(\theta_2; \widehat{\theta}_1, Y_i, X_i, Z_i) = 0$, and that

- The estimator of the stage 1 model is consistent, regular, and asymptotically linear, with influence function $IC_1$.
- $U_2$ is twice continuously differentiable in a neighbourhood around the true (limiting) parameters $(\theta_1^*, \theta_2^*)$. Further, $n^{-1}\sum_{i=1}^{n} \partial U_2/\partial\theta^\top$ converges uniformly to $E(\partial U_2/\partial\theta^\top)$ in a neighbourhood around $(\theta_1^*, \theta_2^*)$, and when evaluated here $-E(\partial U_2/\partial\theta_2^\top)$ is positive definite.

This implies the following i.i.d. decomposition of the two-stage estimator:

$$\sqrt{n}\,(\widehat{\theta}_2 - \theta_2^*) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} IC_2(Y_i, X_i, Z_i) + o_P(1),$$

where the influence function takes the sandwich form

$$IC_2 = -E\Big(\frac{\partial U_2}{\partial \theta_2^\top}\Big)^{-1} \Big\{ U_2(\theta_2^*; \theta_1^*, \cdot) + E\Big(\frac{\partial U_2}{\partial \theta_1^\top}\Big)\, IC_1(\cdot) \Big\},$$

from which asymptotic standard errors are readily obtained.
Implementation
The method is implemented in the lava package. Installation in R:

install.packages("lava")
library(lava)
library(magrittr)
Simulation
f <- function(x) cos(1.25*x) + x - 0.25*x^2
m <- lvm(x1+x2+x3 ~ eta1,
         y1+y2+y3 ~ eta2,
         eta1+eta2 ~ z,
         latent=~eta1+eta2)
functional(m, eta2~eta1) <- f
# Default: all parameter values are 1. Here we change the covariate effect on eta1
d <- sim(m, n=200, seed=1, p=c('eta1~z'=-1))
head(d)
x1 x2 x3 eta1 y1 y2 y3
1 -1.3117148 -0.2758592 0.3891800 -0.6852610 0.6565382 2.8784121 0.1864112
2 1.6733480 3.1785780 3.3853595 1.4897047 -0.9867733 1.9512415 2.7624733
3 0.5661292 2.9883463 0.7987605 1.4017578 0.5694039 -1.2966555 -2.2827075
4 1.7946719 -0.1315167 -0.1914767 0.1993911 0.7221921 0.9447854 -1.3720646
5 0.3702222 -2.2445211 -0.3755076 0.0407144 -0.3144152 0.3546089 0.9828617
6 -2.8786355 0.4394945 -2.4338245 -2.0581671 -4.1310534 -5.6157543 -3.0456611
eta2 z
1 1.7434470 0.34419403
2 0.8393097 0.01271984
3 -0.4258779 -0.87345013
4 0.7340538 0.34280028
5 0.2852132 -0.17738775
6 -3.9531055 0.92143325
Specification of stage 1
m1 <- lvm(x1+x2+x3 ~ eta1, eta1 ~ z, latent = ~ eta1)
Specification of stage 2:
m2 <- lvm() %>%
  regression(y1+y2+y3 ~ eta2) %>%
  regression(eta2 ~ z) %>%
  latent(~ eta2) %>%
  nonlinear(type="quadratic", eta2 ~ eta1)
Estimation:
(mm <- twostage(m1, m2, data=d))
Estimate Std. Error Z-value P-value
Measurements:
y2~eta2 0.98628 0.04612 21.38536 <1e-12
y3~eta2 0.97614 0.05203 18.76166 <1e-12
Regressions:
eta2~z 1.08890 0.17735 6.13990 8.257e-10
eta2~eta1_1 1.13932 0.15548 7.32770 <1e-12
eta2~eta1_2 -0.38404 0.05770 -6.65582 2.817e-11
Intercepts:
y2 -0.09581 0.11350 -0.84413 0.3986
y3 0.01440 0.10849 0.13273 0.8944
eta2 0.50414 0.17550 2.87264 0.004071
Residual Variances:
y1 1.27777 0.18980 6.73206
y2 1.02924 0.13986 7.35895
y3 0.82589 0.14089 5.86181
eta2 1.94918 0.26911 7.24305
pf <- function(p) p["eta2"]+p["eta2~eta1_1"]*u + p["eta2~eta1_2"]*u^2
plot(mm,f=pf,data=data.frame(u=seq(-2,2,length.out=100)),lwd=2)
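As a rough visual check (not part of the lava output), the true structural function f from the simulation can be overlaid on the estimated curve; the quadratic specification can of course only approximate the cosine term:

u <- seq(-2, 2, length.out=100)
lines(u, f(u), col="red", lty=2)  # true (non-polynomial) function for comparison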
