Li Yin

Statistician

About me

I work as an applied statistician at the Department of Medical Epidemiology and Biostatistics.

Research description

CAUSAL INFERENCE FROM COMPLEX LONGITUDINAL DATA: CHALLENGE AND BREAKTHROUGH

In medical practice, a single treatment is rarely enough to determine an outcome such as the survival of a patient. Instead, treatments are assigned in a sequence to influence the outcome. Statistically, we should therefore infer the causal effect of a treatment sequence rather than that of a single treatment.

Causal effects of interest:

The net effects of individual treatments in the sequence. The net effect has the following medical significance:

  • The net effect of a treatment distinguishes the effects of earlier treatments from those of later treatments on a certain outcome.
  • The net effects allow us to find the optimal treatment: given a patient's condition, say age and prognosis, we can determine which treatment is optimal for a certain outcome.
  • They also allow us to identify factors relevant to the net effects, for example whether socioeconomic factors such as income matter for treatment under the Swedish health care system.

The causal effect of a treatment sequence. The sequential causal effect has the following medical significance:

  • The sequential causal effect compares the effects of different treatment regimes on the outcome.
  • It may give an optimized treatment regime for a subpopulation, for instance, young patients.
  • It may also give an optimized treatment regime for the whole patient population.

Estimation of the causal effects

The well-known G-formula identifies the causal effects from longitudinal data, such as clinical data, in which time-dependent factors (for example prognostic factors and side effects) are both outcomes of the earlier treatments and confounders of the subsequent treatments. This leads to a well-known problem: a high-dimensional (even infinite-dimensional) saturated model is needed, which makes it extremely difficult to estimate and test these causal effects.
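
As an illustration only, the standard G-formula for two treatments A1 and A2 with one time-dependent factor L measured between them can be written as E[Y(a1, a2)] = sum over l of E[Y | A1=a1, L=l, A2=a2] * P(L=l | A1=a1). The Python sketch below is a plug-in version of this textbook formula for discrete data. It is not the new G-formula from our work, and the function name, the record layout and the toy data are assumptions made up for the example. It also assumes no baseline covariates and that every needed treatment and covariate combination is observed.

```python
from collections import defaultdict

# Hypothetical toy records: (a1, l, a2, y), where a1 and a2 are binary
# treatments, l is a binary time-dependent factor measured after a1 and
# before a2, and y is the outcome. These values are invented for illustration.
TOY_RECORDS = [
    (1, 0, 1, 1.0),
    (1, 0, 0, 0.0),
    (1, 1, 1, 1.0),
    (1, 1, 1, 0.0),
    (0, 0, 0, 0.0),
    (0, 0, 1, 1.0),
    (0, 1, 0, 0.0),
    (0, 1, 0, 1.0),
]


def g_formula_mean(records, a1, a2):
    """Plug-in estimate of E[Y(a1, a2)] by the standard G-formula:

        sum over l of E[Y | A1=a1, L=l, A2=a2] * P(L=l | A1=a1)
    """
    n_a1 = 0                     # number of records with A1 = a1
    n_l = defaultdict(int)       # count of L = l among records with A1 = a1
    sum_y = defaultdict(float)   # sum of Y among A1 = a1, L = l, A2 = a2
    n_y = defaultdict(int)       # count of records with A1 = a1, L = l, A2 = a2

    for r_a1, l, r_a2, y in records:
        if r_a1 != a1:
            continue
        n_a1 += 1
        n_l[l] += 1
        if r_a2 == a2:
            sum_y[l] += y
            n_y[l] += 1

    if n_a1 == 0:
        raise ValueError("no records with the requested first treatment")

    total = 0.0
    for l, count in n_l.items():
        if n_y[l] == 0:
            # The conditional mean cannot be estimated from the data.
            raise ValueError(f"no records with A1={a1}, L={l}, A2={a2}")
        total += (sum_y[l] / n_y[l]) * (count / n_a1)
    return total


def sequential_effect(records, regime, reference):
    """Difference in G-formula means between two static treatment regimes."""
    return g_formula_mean(records, *regime) - g_formula_mean(records, *reference)
```

For example, sequential_effect(TOY_RECORDS, (1, 1), (0, 0)) compares always treating with never treating. Even in this small case, the estimate needs one conditional mean for every combination of treatments and time-dependent factor, which hints at why the saturated model grows so quickly with longer sequences and richer factors.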

In our work (to appear in Annals of Statistics), we derived a new G-formula in which the time-dependent factors act only as confounders of the subsequent treatments, so only a low-dimensional, unsaturated model is needed to estimate and test these causal effects.

References

Wang, X. and Yin, L. (2019). New G-Formula for the Sequential Causal Effect and Blip Effect of Treatment in Sequential Causal Inference. To appear in Annals of Statistics.

Wang, X. and Yin, L. (2015). Identifying and Estimating Net Effects of Treatments in Sequential Causal Inference. Electronic Journal of Statistics, 9: 1608–1643.

 

 

 

Education

1982 BSc in Chemistry from Sichuan University, Chengdu, China

1989 PhD in Quantum Chemistry from Uppsala University, Sweden

1992 Postdoctoral fellow in Quantum Chemistry, University of Minnesota, USA
