TY - JOUR
T1 - Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts
AU - BBJ
AU - BioMe
AU - BioVU
AU - Canadian Partnership for Tomorrow's Health/OHS
AU - China Kadoorie Biobank Collaborative Group
AU - Colorado Center for Personalized Medicine
AU - deCODE Genetics
AU - ESTBB
AU - FinnGen
AU - Generation Scotland
AU - Genes & Health
AU - LifeLines
AU - Mass General Brigham Biobank
AU - Michigan Genomics Initiative
AU - QIMR Berghofer Biobank
AU - Taiwan Biobank
AU - The HUNT Study
AU - UCLA ATLAS Community Health Initiative
AU - UKBB
AU - Wang, Ying
AU - Namba, Shinichi
AU - Lopera, Esteban
AU - Kerminen, Sini
AU - Tsuo, Kristin
AU - Läll, Kristi
AU - Kanai, Masahiro
AU - Zhou, Wei
AU - Wu, Kuan Han H.
AU - Favé, Marie Julie
AU - Bhatta, Laxmi
AU - Awadalla, Philip
AU - Brumpton, Ben
AU - Deelen, Patrick
AU - Hveem, Kristian
AU - Lo Faro, Valeria
AU - Mägi, Reedik
AU - Murakami, Yoshinori
AU - Sanna, Serena
AU - Smoller, Jordan W.
AU - Uzunovic, Jasmina
AU - Wolford, Brooke N.
AU - Rasheed, Humaira
AU - Hirbo, Jibril B.
AU - Bhattacharya, Arjun
AU - Zhao, Huiling
AU - Surakka, Ida
AU - Lopera-Maya, Esteban A.
AU - Chapman, Sinéad B.
AU - Karjalainen, Juha
AU - Kurki, Mitja
AU - Mutaamba, Maasha
AU - Partanen, Juulia J.
AU - Chavan, Sameer
AU - Chen, Tzu Ting
AU - Daya, Michelle
AU - Ding, Yi
AU - Feng, Yen Chen A.
AU - Gignoux, Christopher R.
AU - Graham, Sarah E.
AU - Hornsby, Whitney E.
AU - Ingold, Nathan
AU - Johnson, Ruth
AU - de Bock, Geertruida H.
AU - Boezen, Marike
AU - Franke, Lude
AU - Snieder, Harold
AU - Vonk, Judith M.
AU - Wijmenga, Cisca
AU - Martin, Alicia R.
N1 - Funding Information:
A.R.M. is funded by K99/R00MH117229. E.L. is funded by the Colciencias fellowship ed.783. S.N. was supported by Takeda Science Foundation. Y.O. was supported by JSPS KAKENHI (22H00476) and AMED (JP21gm4010006, JP22km0405211, JP22ek0410075, JP22km0405217, and JP22ek0109594); JST Moonshot R&D (JPMJMS2021 and JPMJMS2024); Takeda Science Foundation; and Bioinformatics Initiative of Osaka University Graduate School of Medicine, Osaka University. E.R.G. is supported by NIH awards R35HG010718, R01HG011138, and R01GM140287 and NIH/NIA AG068026. V.L.F. was supported by the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement no. 675033 (EGRET plus). L.B. and B.B. receive support from the K.G. Jebsen Center for Genetic Epidemiology funded by Stiftelsen Kristian Gerhard Jebsen; the Faculty of Medicine and Health Sciences, NTNU; the Liaison Committee for education, research and innovation in Central Norway; and the Joint Research Committee between St. Olavs Hospital and the Faculty of Medicine and Health Sciences, NTNU. K.L. and R.M. were supported by the Estonian Research Council grant PUT (PRG687) and by INTERVENE. This project has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement no. 101016775. W.Z. was supported by the NHGRI of the NIH under award number T32HG010464. The work of the contributing biobanks was supported by numerous grants from governmental and charitable bodies (Data S1).
Study design, A.R.M., J.H., Y.O., and Y.W.; data collection/contribution, L.B., P.A., B.B., P.D., K.H., R.M., Y.M., S.S., J.U., C.W., N.J.C., I.S., and J.H.; data analysis, Y.W., S.N., E.L., S.K., K.T., K.L., M.K., W.Z., K.-H.W., M.-J.F., L.B., V.L.F., and J.H.; writing, Y.W., S.N., E.L., Y.O., A.R.M., and J.H.; revision, Y.W., S.N., E.L., K.T., W.Z., S.S., J.W.S., B.N.W., C.W., E.R.G., N.J.C., Y.O., A.R.M., and J.H.
E.R.G. received an honorarium from the journal Circulation Research of the American Heart Association as a member of the editorial board.
Publisher Copyright:
© 2022 The Author(s)
PY - 2023/1/11
Y1 - 2023/1/11
N2 - Polygenic risk scores (PRSs) have been widely explored in precision medicine. However, few studies have thoroughly investigated their best practices in global populations across different diseases. We here utilized data from Global Biobank Meta-analysis Initiative (GBMI) to explore methodological considerations and PRS performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRSs using pruning and thresholding (P + T) and PRS-continuous shrinkage (CS). For both methods, using a European-based linkage disequilibrium (LD) reference panel resulted in comparable or higher prediction accuracy compared with several other non-European-based panels. PRS-CS overall outperformed the classic P + T method, especially for endpoints with higher SNP-based heritability. Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using GBMI resources and highlight the importance of best practices for PRS in the biobank-scale genomics era.
KW - accuracy heterogeneity
KW - Global-Biobank Meta-analysis Initiative
KW - multi-ancestry genetic prediction
KW - polygenic risk scores
U2 - 10.1016/j.xgen.2022.100241
DO - 10.1016/j.xgen.2022.100241
M3 - Article
AN - SCOPUS:85147104921
SN - 2666-979X
VL - 3
JO - Cell Genomics
JF - Cell Genomics
IS - 1
M1 - 100241
ER -