A two-stage estimation procedure for non-linear structural equation models


Applications of structural equation models (SEMs) are often restricted to linear associations between variables. Maximum likelihood (ML) estimation in non-linear models may be complex and require numerical integration. Furthermore, ML inference is sensitive to distributional assumptions. In this paper, we introduce a simple two-stage estimation technique for estimation of non-linear associations between latent variables. Here both steps are based on fitting linear SEMs: first a linear model is fitted to data on the latent predictor and terms describing the non-linear effect are predicted by their conditional means. In the second step, the predictions are included in a linear model for the latent outcome variable. We show that this procedure is consistent and identifies its asymptotic distribution. We also illustrate how this framework easily allows the association between latent variables to be modelled using restricted cubic splines and we develop a modified estimator which is robust to non-normality of the latent predictor. In a simulation study, we compare the proposed method to MLE and alternative two-stage estimation techniques.

Biostatistics 2019 (To appear)