Introduction In primary care, identifying patients with type 2 diabetes (T2D) who are at increased risk of hypoglycaemia is important for the prevention of hypoglycaemic events. We aimed to develop a screening tool based on machine learning to identify such patients using routinely available demographic and medication data.
Methods We used a cohort study design and the Groningen Initiative to ANalyse Type 2 diabetes Treatment (GIANTT) medical record database to develop models for hypoglycaemia risk. The outcome was the first hypoglycaemic event in the observation period (2007-2013). Demographic and medication data served as predictor variables to train machine learning models. We compared the performance of these models with that of a model that also included clinical data, using fivefold cross-validation with the area under the receiver operating characteristic curve (AUC) as the metric.
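As a minimal sketch of the evaluation described above, the AUC can be read as the probability that a randomly chosen patient with an event receives a higher predicted risk than a randomly chosen patient without one. The function names `auc` and `fivefold_indices`, the random seed, and the shuffled fold assignment are illustrative assumptions and are not taken from the study.

```python
import random


def auc(labels, scores):
    """AUC as the share of (event, non-event) pairs ranked correctly.

    labels: 1 for a hypoglycaemic event, 0 otherwise.
    scores: predicted risk for each patient; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fivefold_indices(n_patients, seed=0):
    """Return five (train, test) index splits; each patient is tested once.

    The shuffle and seed are assumptions made for this sketch.
    """
    idx = list(range(n_patients))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::5] for i in range(5)]
    all_idx = set(idx)
    return [(sorted(all_idx - set(fold)), sorted(fold)) for fold in folds]
```

In such a setup, a model would be fitted on each training split, scored on the matching test split with `auc`, and the five values averaged. The pairwise loop is quadratic in cohort size, so a rank-based computation would be preferable for large cohorts.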
Results We included 13,876 patients with T2D. The best-performing model using only demographic and medication data was logistic regression with the least absolute shrinkage and selection operator (LASSO), with an AUC of 0.71. The model included ten variables (odds ratios in parentheses): male gender (0.997), age (0.990), total drug count (1.012), glucose-lowering drug count (1.039), sulfonylurea use (1.62), insulin use (1.769), pre-mixed insulin use (1.109), insulin count (1.827), insulin duration (1.193), and antidepressant use (1.05). This model performed similarly to the model using additional clinical data.
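To illustrate how the reported odds ratios combine, the sketch below converts each odds ratio to a log-odds coefficient and multiplies the effects for a difference between two patient profiles. The dictionary keys and the function `relative_odds` are hypothetical names. The abstract reports neither the intercept nor the units of the continuous predictors, so the sketch yields only relative odds between two profiles, not an absolute risk, and assumes that each difference is expressed in the model's own units.

```python
import math

# Odds ratios as reported in the Results; key names are illustrative only.
ODDS_RATIOS = {
    "male_gender": 0.997,
    "age": 0.990,
    "total_drug_count": 1.012,
    "glucose_lowering_drug_count": 1.039,
    "sulfonylurea_use": 1.62,
    "insulin_use": 1.769,
    "premixed_insulin_use": 1.109,
    "insulin_count": 1.827,
    "insulin_duration": 1.193,
    "antidepressant_use": 1.05,
}


def relative_odds(differences):
    """Odds of hypoglycaemia for profile A relative to profile B.

    differences maps a predictor name to (value in A) - (value in B).
    In logistic regression, log-odds add, so odds ratios multiply.
    """
    log_odds_diff = sum(
        math.log(ODDS_RATIOS[name]) * diff for name, diff in differences.items()
    )
    return math.exp(log_odds_diff)
```

For example, `relative_odds({"sulfonylurea_use": 1, "antidepressant_use": 1})` gives 1.62 × 1.05 ≈ 1.70, the odds for a patient using both drug classes relative to an otherwise identical patient using neither.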
Conclusion We developed a machine learning model that uses only demographic and medication data to identify patients at increased risk of hypoglycaemia. This model can be used as a tool in primary care to screen for patients with T2D who may need additional attention to prevent or reduce hypoglycaemic events.