Wednesday, April 20, 2016

On estimation of confounders

I came to this paper on controlling for statistical confounders via Scott Alexander's Slate Star Codex website, which covers statistical topics.

Westfall and Yarkoni remind us that we use proxies to estimate the true latent constructs, e.g. a "specific survey item asking about respondents’ income bracket" to estimate "socioeconomic status". A statistical analysis therefore cannot directly control for the potentially confounding latent construct itself, only for a measure of that construct, and that measure carries its own measurement error.

At the root of the problem lies an insight that contravenes conventional wisdom:
"The relationship between n and Type 1 error may be less obvious: all else equal, as sample size increases, error rates also increase."
The problem is exacerbated by the way reliability interacts with error rates, giving a non-monotonic relationship:
"In the middle [of the reliability range, RCK], however, there exists a territory where effects are large enough to afford detection, but reliability is too low to prevent misattribution, leading to particularly high Type 1 error rates."
Furthermore, this problem is independent of the statistical analysis approach used (frequentist vs Bayesian, parameter estimation) as long as reliability is not explicitly accounted for.
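
The sample-size claim can be made concrete with a small simulation. The sketch below is not the paper's simulation design; the data-generating model, the function names `z_stat_for_x` and `type1_rate`, and all parameter values are assumptions chosen for illustration. A latent confounder c drives both x and y, x has no direct effect on y, and the regression controls only for a noisy proxy m of c. The significance test uses a normal approximation to the t-test, which is slightly liberal for small n.

```python
import random
from statistics import NormalDist


def z_stat_for_x(y, x, m):
    """z-statistic for the coefficient of x in an OLS fit of y on x and m (with intercept)."""
    n = len(y)

    def centered(v):
        mean = sum(v) / n
        return [vi - mean for vi in v]

    # Centering the variables absorbs the intercept.
    y, x, m = centered(y), centered(x), centered(m)
    sxx = sum(a * a for a in x)
    smm = sum(a * a for a in m)
    sxm = sum(a * b for a, b in zip(x, m))
    sxy = sum(a * b for a, b in zip(x, y))
    smy = sum(a * b for a, b in zip(m, y))
    det = sxx * smm - sxm ** 2
    bx = (smm * sxy - sxm * smy) / det
    bm = (sxx * smy - sxm * sxy) / det
    rss = sum((yi - bx * xi - bm * mi) ** 2 for yi, xi, mi in zip(y, x, m))
    sigma2 = rss / (n - 3)  # three estimated parameters: intercept, bx, bm
    se_bx = (sigma2 * smm / det) ** 0.5
    return bx / se_bx


def type1_rate(n, reliability, n_sims=200, alpha=0.05, seed=0):
    """Share of simulated studies in which x looks significant although its true effect is zero."""
    rng = random.Random(seed)
    # With var(c) = 1, this error SD gives the proxy the requested reliability var(c) / var(m).
    err_sd = (1.0 / reliability - 1.0) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_sims):
        c = [rng.gauss(0, 1) for _ in range(n)]       # latent confounder
        x = [ci + rng.gauss(0, 1) for ci in c]        # predictor, driven by c
        y = [ci + rng.gauss(0, 1) for ci in c]        # outcome, driven by c only
        m = [ci + rng.gauss(0, err_sd) for ci in c]   # noisy proxy of c
        if abs(z_stat_for_x(y, x, m)) > z_crit:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for n in (20, 100, 500):
        print(n, type1_rate(n, reliability=0.5))
```

Under these assumptions the false-positive rate for x grows with n when the proxy's reliability is 0.5, and stays close to the nominal level when the proxy is perfectly reliable (reliability=1.0).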

Indeed, getting a grip on the reliability is the core of the problem. But that is easier said than done. Westfall and Yarkoni point out that
"econometric studies attempting to control for SES [i.e. socioeconomic status, cf. above, RCK] hardly ever estimate or report the reliability of the actual survey item(s) used to operationalize the SES construct ...."
This is why Westfall and Yarkoni recommend structural equation modeling (SEM) in order to avoid the trap altogether. Lacking any estimate of the reliability, researchers can then at least plot their results across a range of assumed reliabilities to see how sensitive their findings are to that assumption.
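
A rough sketch of such a sensitivity sweep follows. It is not the SEM approach the paper recommends; it swaps in the classical correction for attenuation, under the additional assumption that x and y themselves are measured without error. The function name `corrected_partial_r` and the three observed correlations of 0.5 are hypothetical values chosen for illustration, not results from the paper.

```python
def corrected_partial_r(r_xy, r_xm, r_my, reliability):
    """Partial correlation of x and y given the latent confounder,
    after disattenuating the proxy m for an assumed reliability.

    Returns None if the assumed reliability is too low to be consistent
    with the observed correlations (implied latent correlation >= 1).
    """
    # Correction for attenuation: correlations with the latent construct
    # are the observed correlations divided by sqrt(reliability).
    r_xc = r_xm / reliability ** 0.5
    r_yc = r_my / reliability ** 0.5
    if r_xc ** 2 >= 1 or r_yc ** 2 >= 1:
        return None
    return (r_xy - r_xc * r_yc) / ((1 - r_xc ** 2) * (1 - r_yc ** 2)) ** 0.5


if __name__ == "__main__":
    # Hypothetical observed correlations between x, y and the proxy m.
    r_xy, r_xm, r_my = 0.5, 0.5, 0.5
    for reliability in (1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4):
        print(reliability, corrected_partial_r(r_xy, r_xm, r_my, reliability))
```

With these illustrative inputs, the apparent effect of x after controlling for the confounder shrinks as the assumed reliability drops and vanishes at a reliability of 0.5, which is the kind of dependence a sensitivity plot would reveal.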

Bibliographic Record:
Westfall J, Yarkoni T (2016) Statistically Controlling for Confounding Constructs Is Harder than You Think. PLoS ONE 11(3): e0152719. doi:10.1371/journal.pone.0152719
