IV Graph from Imbens (2014)
Confounding refers to the statistical problem that some unobserved characteristic of the patient determines both the patient's observed treatment and the patient's outcome.
For example, this study shows that older stage III colon cancer patients are much less likely to receive oxaliplatin as an adjuvant therapy than younger patients. This may be why, in the Medicare data, oxaliplatin is associated with bigger survival effects than in the randomized controlled trials: the Medicare data suffers from a confounding problem. Doctors of sicker patients may be less willing to prescribe oxaliplatin because of its side effect profile. The observed difference in survival may not be due to the use of oxaliplatin; it may simply reflect the fact that the non-oxaliplatin patients are sicker.
In the graph to the right, the unobserved variable (patient "sickness") is represented by the red U. The patient's treatment (oxaliplatin or not) is represented by the black X and the patient's survival is represented by the black Y. We would like to know whether there is a blue line from X to Y, representing the treatment effect of oxaliplatin on survival. But we can't determine the treatment effect from the observed data alone, because U affects both X and Y through the red lines from U to X and from U to Y. Sicker patients are less likely to get oxaliplatin (red line from U to X) and sicker patients have lower survival (red line from U to Y).
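To see how the red lines alone can produce a survival gap, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than anything estimated from the Medicare data: the function name `simulate_confounded`, the treatment probabilities (0.2 for sicker patients, 0.8 otherwise), and the survival scale are all made up. The true treatment effect is set to zero, so there is no blue line from X to Y.

```python
import random


def simulate_confounded(n=100_000, true_effect=0.0, seed=0):
    """Naive treated-minus-untreated survival gap when U drives both X and Y.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        u = rng.random() < 0.5        # U: patient is sicker (unobserved)
        p_treat = 0.2 if u else 0.8   # red line U -> X: sicker patients are treated less often
        x = rng.random() < p_treat    # X: patient receives the treatment
        # Red line U -> Y: sickness lowers survival by 2 (arbitrary units).
        y = 5.0 - 2.0 * u + true_effect * x + rng.gauss(0, 1)
        (treated if x else untreated).append(y)
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)


# The treatment does nothing here, yet treated patients appear to survive longer.
naive_gap = simulate_confounded(true_effect=0.0)
print(f"naive survival gap with zero true effect: {naive_gap:.2f}")
```

Under these assumed numbers, the treated group is mostly healthier patients, so the naive comparison shows a clearly positive gap even though the treatment has no effect at all.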
A standard way to solve the confounding problem is to observe (or introduce) a fourth variable (Z), which is called an "instrumental variable." As the graph shows, the instrument is some observed characteristic of the patient that affects the patient's treatment choice but is unrelated to the patient's unobserved characteristic and affects the patient's survival only through the treatment. In randomized controlled trials the instrument is the random number generating process that is used to assign patients to treatment arms.
In the Medicare data on the use of oxaliplatin, the instrument may be the date of the diagnosis. Patients diagnosed earlier were much less likely to receive oxaliplatin than patients diagnosed at a later date. By looking at changes in survival over the time period of the introduction of oxaliplatin we can determine the causal effect of oxaliplatin on survival (assuming no other major changes to treatment during the same time period).
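A minimal sketch of this idea uses the simple ratio (Wald) form of the instrumental variable estimator: the change in average survival between the early and late diagnosis periods, divided by the change in the share of patients treated. The data below are simulated under assumptions that are not from the Medicare study: a binary instrument Z (diagnosed in the later period), a constant treatment effect of 1.0, and hypothetical treatment probabilities. The function name `wald_iv_estimate` is also just illustrative.

```python
import random


def wald_iv_estimate(n=200_000, true_effect=1.0, seed=1):
    """Ratio (Wald) IV estimate on simulated data with a binary instrument.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Per value of Z: [sum of X, sum of Y, number of patients]
    totals = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for _ in range(n):
        z = int(rng.random() < 0.5)   # Z: diagnosed in the later period
        u = int(rng.random() < 0.5)   # U: sicker patient, independent of Z
        # Later diagnosis raises the chance of treatment; sickness halves it.
        p_treat = (0.1 + 0.6 * z) * (0.5 if u else 1.0)
        x = int(rng.random() < p_treat)
        # Z enters survival only through X (no direct Z -> Y term).
        y = 5.0 - 2.0 * u + true_effect * x + rng.gauss(0, 1)
        cell = totals[z]
        cell[0] += x
        cell[1] += y
        cell[2] += 1
    mean_x = {z: t[0] / t[2] for z, t in totals.items()}
    mean_y = {z: t[1] / t[2] for z, t in totals.items()}
    # Change in survival across periods, scaled by the change in treatment rates.
    return (mean_y[1] - mean_y[0]) / (mean_x[1] - mean_x[0])


iv_estimate = wald_iv_estimate(true_effect=1.0)
print(f"IV estimate of the treatment effect: {iv_estimate:.2f}")
```

The estimate lands close to the assumed effect of 1.0 even though U is never used in the calculation. That only works because, in the simulation, the diagnosis period is unrelated to sickness and changes survival only by changing treatment.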
An alternative way to solve the confounding problem is to measure all the confounding characteristics. If we observe U then we can simply measure the effect of X on Y while accounting for U. If we observe the co-morbidities of the patient, we can measure the effect of oxaliplatin on survival after adjusting for those co-morbidities. The problem with this approach is that we may not observe all the confounding factors.
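As a rough sketch of what adjusting for an observed U looks like, the simulation below compares treated and untreated patients within each sickness group and then averages the within-group gaps, weighted by group size. The setup is the same kind of hypothetical one as before (binary U, made-up treatment probabilities, a constant effect of 1.0), and `stratified_effect` is an illustrative name.

```python
import random


def stratified_effect(n=200_000, true_effect=1.0, seed=2):
    """Treatment effect estimated by stratifying on an observed confounder U.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # cells[(u, x)] = [sum of Y, number of patients]
    cells = {(u, x): [0.0, 0] for u in (0, 1) for x in (0, 1)}
    for _ in range(n):
        u = int(rng.random() < 0.5)                    # U: sicker patient, now observed
        x = int(rng.random() < (0.2 if u else 0.8))    # sicker patients are treated less often
        y = 5.0 - 2.0 * u + true_effect * x + rng.gauss(0, 1)
        cells[(u, x)][0] += y
        cells[(u, x)][1] += 1
    mean_y = {key: total / count for key, (total, count) in cells.items()}
    group_size = {u: cells[(u, 0)][1] + cells[(u, 1)][1] for u in (0, 1)}
    # Compare like with like inside each sickness group, then average over groups.
    return sum(
        (mean_y[(u, 1)] - mean_y[(u, 0)]) * group_size[u] / n for u in (0, 1)
    )


adjusted_estimate = stratified_effect(true_effect=1.0)
print(f"effect after adjusting for U: {adjusted_estimate:.2f}")
```

Within each sickness group the treated and untreated patients are comparable, so the adjusted estimate is close to the assumed effect. If part of U were missing from the data, the within-group comparison would still be confounded.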
A new paper of mine (see discussion here) suggests an alternative approach. Instead of attempting to directly measure U, we infer U from observable characteristics of the patient. Instead of attempting to directly measure the "sickness" of the patient, we look at observable characteristics of the patient, like their age, and use those signals to determine the distribution of the patient's latent sickness type.
This mixture model approach has the advantage of not requiring instruments and not requiring that we observe every possible characteristic of the patient that may be determining the treatment choice.