Subject: [Biost514a_au09] Q & A: effect modification, confounding, precision
QUESTION:
I'm pretty confused over the way the statistical roles of variables have been
defined in class - it seems that some of the different types (precision vs
confounder) are nonexclusive, and that some types (a variable that IS on a
causal pathway between predictor of interest and outcome, which you might want
to exclude due to collinearity) don't quite fit into the scheme. I think that
perhaps I'm mostly confused about what a "precision variable" is - the way I've
gathered from lecture is that it's any variable associated with the outcome of
interest, which you would want to include in your model to decrease unexplained variance.
ANSWER:
First, I should have added "exclusions due to collinearity" as a term that
people with prior statistical courses should unlearn. There is an issue there,
but collinearity is not really the deciding factor of whether a variable is
included in the model. Our guesses about causal mechanisms for previously
studied "third variables" and the predictor of interest are instead what we
should be considering. I find that most times people use the term "collinear"
as justification for excluding a variable, they have not really thought through
all the scientific issues, and they are instead just thinking about technical
properties.
The real decision point about any "third variable" must consider both the
outcome and the predictor of interest.
Now to the major point.
Of the three types of "third variables":
Effect modifier: The association between the predictor of interest (POI) and
the outcome variable is different depending upon the value of the effect
modifier.
Confounder: Both of the following criteria must be met:
-- The confounder is causally associated with the outcome variable (in truth),
but not as a "mediator" of any causal association of interest between the POI
and the outcome. (Statistics cannot necessarily tell us about causal
associations between POI and outcome, but we are typically interested in a
causal mechanism.)
-- The confounder is associated with the POI in the sample (this need not be
causal in either direction, and it will still cause a problem).
Precision: A precision variable is thought to be causally associated with
the outcome variable (in truth), but is not associated with the POI in the
sample. (Note that randomization might be used to ensure no association between
the third variable and the POI.)
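
To make the confounder vs precision distinction concrete, here is a small
simulation sketch in Python. Everything in it is an assumption made for
illustration (the linear model, the coefficients 1.0, 2.0 and 0.8, the sample
size, the seeds, and all function names); it is not an analysis from class.
In both scenarios the third variable z truly affects the outcome y. The only
difference is whether z is associated with the POI x.

```python
import math
import random


def centered_sum(a, b):
    """Return sum((a - mean(a)) * (b - mean(b)))."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))


def slope_unadjusted(x, y):
    """Slope and standard error for x in the least squares fit y ~ x."""
    n = len(x)
    sxx, sxy, syy = centered_sum(x, x), centered_sum(x, y), centered_sum(y, y)
    beta = sxy / sxx
    rss = syy - beta * sxy
    return beta, math.sqrt(rss / (n - 2) / sxx)


def slope_adjusted(x, z, y):
    """Slope and standard error for x in the least squares fit y ~ x + z."""
    n = len(x)
    sxx, szz, sxz = centered_sum(x, x), centered_sum(z, z), centered_sum(x, z)
    sxy, szy, syy = centered_sum(x, y), centered_sum(z, y), centered_sum(y, y)
    det = sxx * szz - sxz ** 2
    beta_x = (sxy * szz - sxz * szy) / det
    beta_z = (szy * sxx - sxz * sxy) / det
    rss = syy - beta_x * sxy - beta_z * szy
    return beta_x, math.sqrt(rss / (n - 3) * szz / det)


def simulate(n, x_depends_on_z, seed):
    """Hypothetical data: true POI slope is 1.0, z always affects y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [(0.8 * zi if x_depends_on_z else 0.0) + rng.gauss(0, 1) for zi in z]
    y = [1.0 * xi + 2.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]
    return x, z, y


# Scenario 1: z is associated with the POI (confounder).
x, z, y = simulate(2000, x_depends_on_z=True, seed=1)
unadj_conf, se_unadj_conf = slope_unadjusted(x, y)
adj_conf, se_adj_conf = slope_adjusted(x, z, y)

# Scenario 2: z is generated independently of the POI (precision variable).
x, z, y = simulate(2000, x_depends_on_z=False, seed=2)
unadj_prec, se_unadj_prec = slope_unadjusted(x, y)
adj_prec, se_adj_prec = slope_adjusted(x, z, y)

print(f"confounder: unadjusted {unadj_conf:.2f}, adjusted {adj_conf:.2f}")
print(f"precision:  unadjusted {unadj_prec:.2f} (SE {se_unadj_prec:.3f}), "
      f"adjusted {adj_prec:.2f} (SE {se_adj_prec:.3f})")
```

Under these assumptions, leaving out the confounder moves the estimated slope
well away from 1.0, while leaving out the precision variable leaves the slope
near 1.0 but with a larger standard error. Note that in the second scenario
the sample correlation between z and x is only approximately zero.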
Using the above definitions, we generally first decide if some third variable
is an effect modifier. If it is, and if our question would want to describe
associations between the POI and outcome separately for each level of the
effect modifier, then we do not really care about whether it is also a
confounder (the definition of this variable as a confounder becomes somewhat
murky here).
Then, if the third variable is not causally associated with the outcome, then
it is neither a confounder nor a precision variable, so who cares? Drop it.
(And drop it like a hand grenade, if it is associated with the POI but not
causally with the outcome.)
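
One way to see the cost of keeping such a variable, assuming ordinary linear
regression and a large sample: a covariate that explains none of the outcome
does not shrink the residual variance, but its correlation r with the POI
still inflates the standard error of the POI's coefficient by roughly
1 / sqrt(1 - r^2). The sketch below just evaluates that factor; the function
name and the example values of r are hypothetical.

```python
import math


def se_inflation_factor(r):
    """Approximate factor by which the POI's standard error grows when a
    covariate unrelated to the outcome, with sample correlation r to the
    POI, is added to a linear regression (ignores the lost degree of freedom).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / math.sqrt(1.0 - r ** 2)


for r in (0.0, 0.5, 0.9, 0.99):
    print(f"r = {r:4.2f}  SE multiplied by about {se_inflation_factor(r):.2f}")
```

The stronger the association with the POI, the more precision is given up
for no gain.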
If the third variable is generally thought to be causally associated with the
outcome, then in order to determine whether it is a confounder or a precision
variable, we need to know what is going on between that third variable and the
POI in the sample. Generally, in an observational study, we consider the
possibility that any variable thought to be associated causally with the
outcome is a "potential confounder". In a randomized study, we can in some
sense state that the third variable cannot be a confounder, and so would be a
precision variable (but more on this later).
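
The order of decisions described above can be written down as a short Python
sketch. The function and argument names are made up for illustration, and the
four yes/no inputs stand for scientific judgments, not things the data will
hand you. Mediators on the causal pathway of interest are deliberately not
handled here.

```python
def classify_third_variable(is_effect_modifier,
                            want_stratum_specific_answers,
                            causally_associated_with_outcome,
                            associated_with_poi_in_sample):
    """Return the role of a third variable, following the order in the text."""
    # Step 1: effect modification comes first, when the question calls for
    # describing the POI-outcome association within each level.
    if is_effect_modifier and want_stratum_specific_answers:
        return "effect modifier"
    # Step 2: no causal association with the outcome means it is neither a
    # confounder nor a precision variable.
    if not causally_associated_with_outcome:
        return "drop"
    # Step 3: what separates confounder from precision variable is the
    # association with the POI in the sample.
    if associated_with_poi_in_sample:
        return "confounder"
    return "precision variable"


# Observational study: prognostic variable that also tracks the POI.
print(classify_third_variable(False, False, True, True))
# Randomized study: prognostic variable balanced across arms by design.
print(classify_third_variable(False, False, True, False))
```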
We will continue to explore the roles of effect modification, confounding, and
precision variables throughout this and all ensuing courses in
statistics...these are the major difficulties in statistical analyses.
(Arguably, this is the only topic of statistics.)
Scott