Template

Modeling the cumulative incidence function of multivariate competing risks data allowing for within-cluster dependence of risk and timing

We propose to model the cause-specific cumulative incidence function of multivariate competing risks data using a random effects model that allows for within-cluster dependence of both risk and timing. The model contains parameters that make it possible to assess how the two are connected, e.g., whether high risk is related to early onset. Under the proposed model, the cumulative incidences of all failure causes are modeled and all cause-specific and cross-cause associations are specified. Consequently, left-truncation and right-censoring are easily dealt with. The proposed model is assessed using simulation studies and applied in an analysis of Danish register-based family data on breast cancer.

April 2018 · Luise Cederkvist, Klaus Kähler Holst, Klaus Kaa Andersen, Thomas Scheike
Cox regression with missing covariates

Cox regression with missing covariate data using a modified partial likelihood method

Missing covariate values are a common problem in survival analysis. In this paper we propose a novel method for the Cox regression model that is close to maximum likelihood but avoids the use of the EM-algorithm. It exploits the fact that the observed hazard function is multiplicative in the baseline hazard function, the idea being to profile out this function before estimating the parameter of interest. In this step one uses a Breslow-type estimator to estimate the cumulative baseline hazard function. We focus on the situation where the observed covariates are categorical, which allows us to calculate estimators without having to assume anything about the distribution of the covariates. We show that the proposed estimator is consistent and asymptotically normal, and derive a consistent estimator of the variance–covariance matrix that does not involve any choice of a perturbation parameter. Moderate sample size performance of the estimators is investigated via simulation and by application to a real data example.

January 2017 · Torben Martinussen, Klaus Kähler Holst, Thomas H. Scheike
The extended liability model

The liability threshold model for censored twin data

Family studies provide an important tool for understanding the etiology of diseases, with the key aims of discovering evidence of family aggregation and determining whether such aggregation can be attributed to genetic components. Heritability and concordance estimates are routinely calculated in twin studies of diseases, as a way of quantifying such genetic contribution. The endpoint in these studies is typically defined as occurrence of a disease versus death without the disease. However, a large fraction of the subjects may still be alive at the time of follow-up without having experienced the disease, thus still being at risk. Ignoring this right-censoring can lead to severely biased estimates. The classical liability threshold model can be extended with inverse probability of censoring weighting of complete observations. This leads to a flexible way of modelling twin concordance and obtaining consistent estimates of heritability. The method is demonstrated in simulations and applied to data from the population-based Danish twin cohort to describe the dependence in prostate cancer occurrence in twins.

January 2016 · Klaus Kähler Holst, Thomas H. Scheike, Jacob Hjelmborg
Measuring early or late dependence

Measuring early or late dependence for bivariate lifetimes of twins

We consider data from the Danish twin registry and aim to study in detail how lifetimes for twin-pairs are correlated. We consider models where we specify the marginals using a regression structure, here Cox’s regression model or the additive hazards model. The best known such model is the Clayton-Oakes model. This model can be extended in several directions. One extension is to allow the dependence parameter to depend on covariates. Another extension is to model dependence via piecewise constant cross-hazard ratio models. We show how both these models can be implemented for large sample data, and suggest a computational solution for obtaining standard errors for such models for large registry data. In addition, we consider alternative models that have some computational advantages and use different dependence parameters, based on odds ratios of the survival function using the Plackett distribution. We also suggest a way of assessing whether and how the dependence is changing over time, by considering either truncated or right-censored versions of the data to measure late or early dependence. This can be used for formally testing whether the dependence is constant, decreasing, or increasing. The proposed procedures are applied to Danish twin data to describe dependence in the lifetimes of the twins. Here we show that the early deaths are more correlated than the later deaths, and by comparing MZ and DZ associations we suggest that early deaths might be more driven by genetic factors. This conclusion requires models that are able to look at more local dependence measures. We further show that the dependence differs for MZ and DZ twins and appears to be the same for males and females, and that there are indications that the dependence increases over calendar time.

April 2013 · Thomas H. Scheike, Klaus Kähler Holst, Jacob B. Hjelmborg