Inflation of polygenic risk scores caused by sample overlap and relatedness: Examples of a major risk of bias
- Author(s)
- Ellis, CA; Oliver, KL; Harris, RV; Ottman, R; Scheffer, IE; Mefford, HC; Epstein, MP; Berkovic, SF; Bahlo, M
- Details
- Publication Year 2024-08-13, Volume 111, Issue #9, Page 1805-1809
- Journal Title
- American Journal of Human Genetics
- Abstract
- Polygenic risk scores (PRSs) are an important tool for understanding the role of common genetic variants in human disease. Standard best practices recommend that PRSs be analyzed in cohorts that are independent of the genome-wide association study (GWAS) used to derive the scores without sample overlap or relatedness between the two cohorts. However, identifying sample overlap and relatedness can be challenging in an era of GWASs performed by large biobanks and international research consortia. Although most genomics researchers are aware of best practices and theoretical concerns about sample overlap and relatedness between GWAS and PRS cohorts, the prevailing assumption is that the risk of bias is small for very large GWASs. Here, we present two real-world examples demonstrating that sample overlap and relatedness is not a minor or theoretical concern but an important potential source of bias in PRS studies. Using a recently developed statistical adjustment tool, we found that excluding overlapping and related samples was equal to or more powerful than adjusting for overlap bias. Our goal is to make genomics researchers aware of the magnitude of risk of bias from sample overlap and relatedness and to highlight the need for mitigation tools, including independent validation cohorts in PRS studies, continued development of statistical adjustment methods, and tools for researchers to test their cohorts for overlap and relatedness with GWAS cohorts without sharing individual-level data.
- Publisher
- Cell Press
- Keywords
- Humans; *Multifactorial Inheritance/genetics; *Genome-Wide Association Study; *Bias; *Genetic Predisposition to Disease; Cohort Studies; Polymorphism, Single Nucleotide; Female; Risk Factors; Genetic Risk Score
- Research Division(s)
- Population Health And Immunity
- PubMed ID
- 39168121
- Publisher's Version
- https://doi.org/10.1016/j.ajhg.2024.07.014
- Terms of Use/Rights Notice
- Refer to copyright notice on published article.
Creation Date: 2024-08-23 02:54:11
Last Modified: 2024-10-03 09:19:51