Principal components analysis corrects for stratification in genome-wide association studies

AL Price, NJ Patterson, RM Plenge, ME Weinblatt, NA Shadick, D Reich
Nature Genetics, 2006
Abstract
Population stratification—allele frequency differences between cases and controls due to systematic ancestry differences—can cause spurious associations in disease studies. We describe a method that enables explicit detection and correction of population stratification on a genome-wide scale. Our method uses principal components analysis to explicitly model ancestry differences between cases and controls. The resulting correction is specific to a candidate marker's variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. Our simple, efficient approach can easily be applied to disease studies with hundreds of thousands of markers.
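The abstract does not spell out the computation, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes genotypes coded as 0/1/2 allele counts, ancestry axes taken from an SVD of the standardized genotype matrix, adjustment by linear regression of both the candidate marker and the phenotype on those axes, and a test statistic of (n - k - 1) times the squared correlation of the residuals. The function names (top_pcs, residualize, adjusted_stat) are hypothetical.

```python
import numpy as np


def top_pcs(G, k):
    """Return the top-k axes of variation (n_samples x k, orthonormal columns).

    G is an (n_samples, n_markers) array of 0/1/2 genotype counts.
    Assumption: each marker is centered and scaled by its sample standard deviation.
    """
    X = G - G.mean(axis=0)
    sd = X.std(axis=0)
    keep = sd > 0                      # drop monomorphic markers
    X = X[:, keep] / sd[keep]
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]


def residualize(v, pcs):
    """Center v and remove its projection onto the ancestry axes."""
    v = v - v.mean()
    return v - pcs @ (pcs.T @ v)


def adjusted_stat(g, y, pcs):
    """Association statistic between marker g and phenotype y after adjustment.

    Assumption: (n - k - 1) * r^2, where r is the correlation of the residuals.
    """
    n, k = pcs.shape
    g_adj = residualize(np.asarray(g, dtype=float), pcs)
    y_adj = residualize(np.asarray(y, dtype=float), pcs)
    denom = (g_adj @ g_adj) * (y_adj @ y_adj)
    if denom == 0:
        return 0.0
    r2 = (g_adj @ y_adj) ** 2 / denom
    return (n - k - 1) * r2
```

Because the marker itself is adjusted along the inferred axes, a marker whose frequency barely differs across ancestries loses little signal, which is one way to read the abstract's claim that the correction is marker-specific. Passing an array with zero columns as pcs gives the unadjusted statistic for comparison.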