TY - JOUR
T1 - The necessity of incorporating non-genetic risk factors into polygenic risk score models
AU - van Dam, Sipko
AU - Folkertsma, Pytrik
AU - Castela Forte, Jose
AU - de Vries, Dylan H.
AU - Herrera Cunillera, Camila
AU - Gannamani, Rahul
AU - Wolffenbuttel, Bruce H.R.
N1 - Funding Information:
We thank the UKB for the data access granted through application 55495 and Lifelines for the data access granted through application OV20_00020. Additionally, we thank the UGLI consortium for the QC on Lifelines genotyping data and the related documentation. The Lifelines Biobank initiative has been made possible by subsidy from the Dutch Ministry of Health, Welfare and Sport, the Dutch Ministry of Economic Affairs, the University Medical Center Groningen (UMCG, the Netherlands), the University of Groningen and the Northern Provinces of the Netherlands. This project was funded by the UMCG under project number PPP-2019_023 and by Ancora Health B.V. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Publisher Copyright:
© 2023, The Author(s).
PY - 2023/2
Y1 - 2023/2
N2 - The growing public interest in genetic risk scores for various health conditions can be harnessed to inspire preventive health action. However, currently available commercial genetic risk scores can be misleading, as they do not consider other, easily obtainable risk factors, such as sex, BMI, age, smoking habits, parental disease status and physical activity. Recent scientific literature shows that adding these factors can significantly improve PGS-based predictions. However, implementing existing PGS-based models that also consider these factors requires reference data based on a specific genotyping chip, which is not always available. In this paper, we offer a method that is agnostic to the genotyping chip used. We train these models using UK Biobank data and test them externally in the Lifelines cohort. We show improved performance at identifying the 10% most at-risk individuals for type 2 diabetes (T2D) and coronary artery disease (CAD) by including common risk factors. For T2D, incidence in the highest-risk group increases from 3.0-fold for the genetics-based model and 4.0-fold for the common risk factor-based model to 5.8-fold for the combined model. Similarly, for CAD we observe an increase from 2.4- and 3.0-fold to 4.7-fold. As such, we conclude that it is paramount that these additional variables are considered when reporting risk, unlike current practice with currently available genetic tests.
AB - The growing public interest in genetic risk scores for various health conditions can be harnessed to inspire preventive health action. However, currently available commercial genetic risk scores can be misleading, as they do not consider other, easily obtainable risk factors, such as sex, BMI, age, smoking habits, parental disease status and physical activity. Recent scientific literature shows that adding these factors can significantly improve PGS-based predictions. However, implementing existing PGS-based models that also consider these factors requires reference data based on a specific genotyping chip, which is not always available. In this paper, we offer a method that is agnostic to the genotyping chip used. We train these models using UK Biobank data and test them externally in the Lifelines cohort. We show improved performance at identifying the 10% most at-risk individuals for type 2 diabetes (T2D) and coronary artery disease (CAD) by including common risk factors. For T2D, incidence in the highest-risk group increases from 3.0-fold for the genetics-based model and 4.0-fold for the common risk factor-based model to 5.8-fold for the combined model. Similarly, for CAD we observe an increase from 2.4- and 3.0-fold to 4.7-fold. As such, we conclude that it is paramount that these additional variables are considered when reporting risk, unlike current practice with currently available genetic tests.
U2 - 10.1038/s41598-023-27637-w
DO - 10.1038/s41598-023-27637-w
M3 - Article
C2 - 36807592
AN - SCOPUS:85148678780
SN - 2045-2322
VL - 13
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 1351
ER -