Outliers Analysis in Human Case-Control Association Mapping

Yuanyuan Shen, Zhe Liu, Jurg Ott -- March 2011


Background/Aims: In human case-control association studies, population heterogeneity is often present and can lead to increased false-positive results. Various methods have been proposed and are in current use to remedy this situation.
Methods: We assume that heterogeneity is due to a relatively small number of individuals whose allele frequencies differ from those of the remainder of the sample. For this situation, we propose a new method of handling heterogeneity by removing outliers in a controlled manner. In a coordinate system defined by the c largest principal components in multidimensional scaling (MDS), we systematically remove the most extreme outlying individuals one at a time and, after each removal, recompute the largest association test statistic. The smallest p value obtained within M removals serves as our test statistic, whose significance level is assessed by randomization.
Results: In power simulations of our method and three methods in current use, averaged over several different scenarios, the best method turned out to be logistic regression analysis (based on all individuals) with MDS components as covariates.
Conclusion: Our proposed method ranked closely behind logistic regression analysis with MDS components but ahead of other commonly used approaches. In analyses of real datasets our method performed best. Copyright 2010 S. Karger AG, Basel
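
The procedure summarized under Methods can be illustrated with a minimal Python sketch. It is not the MDSOutlier program, and several details are assumptions that the abstract does not specify: classical MDS is computed once from Euclidean distances between 0/1/2 genotype vectors, "most extreme" means farthest from the centroid of the c MDS components, the association test is the Armitage trend test (N times the squared correlation, 1 df), and the count of M removals includes the unmodified sample. All function and parameter names are hypothetical.

```python
import math
import numpy as np


def mds_coordinates(genotypes, c):
    """Classical MDS on Euclidean distances between genotype vectors (assumed metric)."""
    g = np.asarray(genotypes, dtype=float)
    sq = np.sum(g * g, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * g @ g.T    # squared pairwise distances
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n               # centering matrix
    b = -0.5 * j @ d2 @ j                             # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(b)                    # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:c]                  # indices of the c largest
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))


def removal_order(coords):
    """Individuals sorted from most to least extreme (assumed: distance from centroid)."""
    dist = np.linalg.norm(coords - coords.mean(axis=0), axis=1)
    return np.argsort(dist)[::-1]


def max_trend_statistic(genotypes, status):
    """Largest Armitage trend statistic (N * r^2) over all markers."""
    g = genotypes - genotypes.mean(axis=0)
    y = status - status.mean()
    denom = np.sqrt((g * g).sum(axis=0) * (y * y).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        # Monomorphic markers have zero variance; give them r = 0.
        r = np.where(denom > 0, (g * y[:, None]).sum(axis=0) / denom, 0.0)
    return len(status) * float(np.max(r * r))


def chi2_1df_pvalue(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def min_p_within_removals(genotypes, status, order, max_removals):
    """Smallest p value seen while removing 0, 1, ..., max_removals outliers."""
    keep = np.ones(len(status), dtype=bool)
    best = 1.0
    for k in range(max_removals + 1):
        if k > 0:
            keep[order[k - 1]] = False                # drop the next most extreme individual
        stat = max_trend_statistic(genotypes[keep], status[keep])
        best = min(best, chi2_1df_pvalue(stat))
    return best


def randomization_pvalue(genotypes, status, c=2, max_removals=10, n_perm=200, seed=0):
    """Observed min-p statistic and its significance from permuted case/control labels."""
    genotypes = np.asarray(genotypes, dtype=float)
    status = np.asarray(status)
    order = removal_order(mds_coordinates(genotypes, c))
    observed = min_p_within_removals(genotypes, status, order, max_removals)
    rng = np.random.default_rng(seed)
    hits = sum(
        min_p_within_removals(genotypes, rng.permutation(status), order, max_removals)
        <= observed
        for _ in range(n_perm)
    )
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated data only: 60 individuals, 100 markers, random case/control labels.
    rng = np.random.default_rng(42)
    g = rng.binomial(2, 0.3, size=(60, 100))
    y = rng.integers(0, 2, size=60)
    print(randomization_pvalue(g, y, c=2, max_removals=5, n_perm=100))
```

Because the minimum is taken over several nested subsamples, the smallest p value is not itself a valid significance level; the sketch therefore applies the same removal sequence to each permuted label set and reports the proportion of permutations whose statistic is at least as small as the observed one.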


The MDSOutlier computer program is currently available only for Unix/Linux and Mac OS X systems and may be downloaded by clicking here. Only the latest version is required.


Shen Y, Liu Z, Ott J: Systematic removal of outliers to reduce heterogeneity in case-control association studies. Hum Hered 2010;70:227-231 (free download available)