influence functions

Influence Function Calculus

Influence functions (IFs), also known as influence curves or canonical gradients, are essential for characterizing regular and asymptotically linear estimators. They enable the direct calculation of properties such as the asymptotic variance and facilitate the construction of new estimators through straightforward combinations and transformations.
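As a minimal sketch of the idea, assuming i.i.d. observations and using only the Python standard library: the influence function of the sample mean is the centered observation, a smooth transformation is handled by the delta method, and the standard error follows from the empirical second moment of the IF values. The function names `mean_if`, `transform_if`, and `std_error` and the simulated data are illustrative assumptions, not part of any package mentioned on this page.

```python
import math
import random


def mean_if(xs):
    """Return the sample mean and its estimated influence function values."""
    n = len(xs)
    est = sum(xs) / n
    # For the mean, IF(x) = x - mean.
    return est, [x - est for x in xs]


def transform_if(est, ifs, g, dg):
    """Delta method: the IF of g(est) is g'(est) times the IF of est."""
    slope = dg(est)
    return g(est), [slope * v for v in ifs]


def std_error(ifs):
    """Standard error from IF values: sqrt(mean(IF^2) / n)."""
    n = len(ifs)
    return math.sqrt(sum(v * v for v in ifs) / n / n)


# Simulated data (an assumption for illustration): exponential with mean 2.
random.seed(1)
xs = [random.expovariate(0.5) for _ in range(2000)]

mu, if_mu = mean_if(xs)
log_mu, if_log = transform_if(mu, if_mu, math.log, lambda m: 1.0 / m)

se_mu = std_error(if_mu)
se_log = std_error(if_log)
print(f"mean: {mu:.3f} (se {se_mu:.3f}), log-mean: {log_mu:.3f} (se {se_log:.3f})")
```

Because the transformed IF is just a rescaled copy of the original one, the standard error of the log-mean equals the standard error of the mean divided by the mean, which is the usual delta-method result.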

September 2025 · Klaus Kähler Holst
targeted inference

Targeted Inference `targeted`

The `targeted` package implements various methods for targeted learning and semiparametric inference, including augmented inverse probability weighted (AIPW) estimators for missing data and causal inference (Bang and Robins (2005) <doi:10.1111/j.1541-0420.2005.00377.x>), variable importance and conditional average treatment effects (CATE) (van der Laan (2006) <doi:10.2202/1557-4679.1008>), estimators for risk differences and relative risks (Richardson et al. (2017) <doi:10.1080/01621459.2016.1192546>), and assumption-lean inference for generalized linear model parameters (Vansteelandt et al. (2022) <doi:10.1111/rssb.12504>).

September 2025 · Klaus Kähler Holst
Truncation by death

A framework for joint assessment of a terminal event and a score existing only in the absence of the terminal event

Analysis of data from randomized controlled trials in vulnerable populations requires special attention when assessing treatment effect by a score measuring, e.g., disease stage or activity together with onset of prevalent terminal events. In reality, it is impossible to disentangle a disease score from the terminal event, since the score is not clinically meaningful after this event. In this work, we propose to assess treatment interventions simultaneously on the terminal event and the disease score in the absence of a terminal event. Our proposal is based on a natural data-generating mechanism respecting that a disease score does not exist beyond the terminal event. We use modern semi-parametric statistical methods to provide robust and efficient estimation of the risk of terminal event and expected disease score conditional on no terminal event at a pre-specified landmark time. We also use the simultaneous asymptotic behavior of our estimators to develop a powerful closed testing procedure for confirmatory assessment of treatment effect on both onset of terminal event and level of disease score in the absence of a terminal event. A simulation study mimicking a large-scale outcome trial in chronic kidney patients as well as an analysis of that trial is provided to assess performance.

June 2025 · Klaus Kähler Holst, Andreas Nordland, Julie Funch Furberg, Lars Holm Damgaard, Christian Bressen Pipper
Policy Learning

Policy Learning with the polle package

The R package `polle` is a unifying framework for learning and evaluating finite-stage policies based on observational data. The package implements a collection of existing and novel methods for causal policy learning, including doubly robust restricted Q-learning, policy tree learning, and outcome weighted learning. The package deals with (near) positivity violations by only considering realistic policies. Highly flexible machine learning methods can be used to estimate the nuisance components, and valid inference for the policy value is ensured via cross-fitting. The library is built around a simple syntax with four main functions, `policy_data()`, `policy_def()`, `policy_learn()`, and `policy_eval()`, used to specify the data structure, define user-specified policies, specify policy learning methods, and evaluate (learned) policies. The functionality of the package is illustrated via extensive reproducible examples.

January 2025 · Andreas Nordland, Klaus Kähler Holst
relative risk

Regression models for the relative risk

Relative risks (and risk differences) are collapsible and generally considered easier to interpret than odds ratios. In a recent publication, Richardson et al. (JASA, 2017) proposed a new regression model for a binary exposure that solves the computational problems associated with using, for example, binomial regression with a log link (or an identity link for the risk difference) to obtain such parameter estimates. Let $Y$ be the binary response, $A$ the binary exposure, and $V$ a vector of covariates. The target parameter is then $\mathrm{RR}(v) = \frac{P(Y=1 \mid A=1, V=v)}{P(Y=1 \mid A=0, V=v)}$. With $p_a(V) = P(Y=1 \mid A=a, V)$, $a \in \{0,1\}$, the idea is to posit a linear model for $\theta(v) = \log(\mathrm{RR}(v))$ and a nuisance model for the odds-product $\phi(v) = \log\left(\frac{p_0(v)\,p_1(v)}{(1-p_0(v))(1-p_1(v))}\right)$, noting that these two parameters are variation independent, which can be seen from the L’Abbé plot below. Similarly, a model can be constructed for the risk difference on the scale $\theta(v) = \operatorname{arctanh}(\mathrm{RD}(v))$. ...
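Variation independence can be checked numerically. The sketch below, in plain Python, maps a pair (log relative risk, log odds-product) back to the two risks by solving the quadratic equation implied by the two definitions, and confirms on a small grid that every pair yields valid probabilities. The function names `probs_from_rr_op` and `rr_op_from_probs` are hypothetical, and this is an illustration of the parameterization, not the implementation used by any package.

```python
import math


def probs_from_rr_op(theta, phi):
    """Map (log relative risk, log odds-product) to (p0, p1).

    With rr = exp(theta) and op = exp(phi), substituting p1 = rr * p0 into
    the odds-product definition gives the quadratic
        rr * (1 - op) * p0^2 + op * (1 + rr) * p0 - op = 0.
    The root below is written in a form that stays stable when op is near 1.
    """
    rr, op = math.exp(theta), math.exp(phi)
    b = op * (1.0 + rr)
    disc = b * b + 4.0 * rr * op * (1.0 - op)  # always positive
    p0 = 2.0 * op / (b + math.sqrt(disc))
    return p0, rr * p0


def rr_op_from_probs(p0, p1):
    """Map (p0, p1) to (log relative risk, log odds-product)."""
    theta = math.log(p1 / p0)
    phi = math.log(p0 * p1 / ((1.0 - p0) * (1.0 - p1)))
    return theta, phi


# Every (theta, phi) pair on the grid should give probabilities in (0, 1).
grid = [x / 2.0 for x in range(-6, 7)]
all_valid = all(
    0.0 < p < 1.0
    for theta in grid
    for phi in grid
    for p in probs_from_rr_op(theta, phi)
)
print("all grid points give valid risks:", all_valid)
```

A log-linear model for $p_1(v)$ alone has no such guarantee, since a fitted value can exceed one; pairing the log relative risk with the odds-product is what keeps both risks inside the unit interval.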

August 2019 · Klaus Kähler Holst