Epidemiol Health System J. 2024;11(3): 128-136.
doi: 10.34172/ehsj.26113

Original Article

Exploring Risk Factors of Type 2 Diabetes Mellitus Using Decision Tree and Random Forest Models: Baseline Data From Kharameh Cohort Study

Maryam Jalali 1, Hamid Reza Niazkar 2, Masoumeh Ghoddusi Johari 2*, Amir Hossein Saem 3, Abbas Rezaianzadeh 1

1 Colorectal Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
2 Breast Diseases Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
3 School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran
*Corresponding Author: Masoumeh Ghoddusi Johari, Email: m.ghoddusi94@yahoo.com

Abstract

Background and aims: Identifying individuals at risk of type 2 diabetes mellitus (T2DM) and determining the associated risk factors are highly important. This study therefore aimed to explore the risk factors for T2DM and to build a prediction model using decision tree (DT) and random forest (RF) models.

Methods: This cross-sectional study is part of the Kharameh Cohort Study. The Kharameh Cohort is a part of the Fars Cohort, which started in 2014 with 10663 people aged 40–70 years. In this study, the risk factors of T2DM were explored using two data mining methods (DT and RF). Accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were used to evaluate the models. The data were statistically analyzed using R software.
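The four evaluation criteria named above can be illustrated with a minimal sketch. It is written in Python with the standard library only, not the R code used in the study. The function names, the 0.5 cut-off, and the toy labels and scores are assumptions made for illustration and are not taken from the cohort data.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def evaluate(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute sensitivity, specificity, accuracy, and AUC for a binary outcome."""
    # Turn predicted probabilities into class labels (threshold is an assumption).
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "accuracy": (tp + tn) / len(y_true),
        "auc": auc(y_true, scores),
    }


# Toy data (hypothetical): 1 = T2DM, 0 = no T2DM.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
result = evaluate(y_true, scores)
print(result)
```

Sensitivity, specificity, and accuracy depend on the chosen cut-off, whereas AUC summarizes ranking quality across all cut-offs, which is why the two kinds of measure are usually reported together.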

Results: DT modeling showed that age, triglycerides (TG), physical activity, systolic blood pressure, low-density lipoprotein (LDL), and body mass index (BMI) were the factors most strongly associated with T2DM, while RF identified fasting blood sugar, cholesterol, creatinine, TG, gamma-glutamyl transferase, physical activity, BMI, and LDL as the most important factors for T2DM. The RF model was superior to the DT based on the applied criteria. Sensitivity, specificity, accuracy, and AUC for the RF were 73.4, 70.10, 73.5, and 79.1, respectively; the corresponding values for the DT were 63.8, 69.7, 62.8, and 66.8.
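To make the comparison between the two models concrete, the short sketch below computes the RF minus DT gap for each criterion from the values reported in the Results. The dictionary layout and variable names are assumptions for illustration; the numbers are copied from the text, whose units are not stated there.

```python
# Metric values as reported in the Results paragraph.
rf = {"sensitivity": 73.4, "specificity": 70.10, "accuracy": 73.5, "auc": 79.1}
dt = {"sensitivity": 63.8, "specificity": 69.7, "accuracy": 62.8, "auc": 66.8}

# Gap (RF minus DT) per criterion, rounded to one decimal place
# to avoid floating-point noise in the subtraction.
gaps = {name: round(rf[name] - dt[name], 1) for name in rf}

for name, gap in gaps.items():
    print(f"{name}: RF {rf[name]} vs DT {dt[name]} (gap {gap:+.1f})")
```

On these figures the RF leads on every criterion, with the smallest gap in specificity (0.4) and the largest in AUC (12.3).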

Conclusion: Based on the findings, several risk factors were strongly associated with the risk of T2DM. Therefore, predictive analytics using the RF model can also be applied to identify the risk factors of other chronic diseases.

Submitted: 06 Dec 2023
Accepted: 19 Aug 2024
ePublished: 12 Nov 2024