All published articles of this journal are available on ScienceDirect.
Enhancing Early Diagnosis of Type II Diabetes through Feature Selection and Hybrid Metaheuristic Optimization Techniques
Abstract
Background
Type-II Diabetes Mellitus (T2DM) is a chronic metabolic disorder characterized by elevated blood glucose levels and is a critical global health challenge. It is largely attributed to lifestyle changes, unhealthy dietary habits, and lack of awareness. If not diagnosed early, T2DM can lead to severe complications, including damage to the kidneys, heart, and nerves. Although timely and accurate diagnosis is crucial, current diagnostic procedures are often costly and time-intensive, which calls for new approaches to early detection.
Objective
This study aimed to enhance the early prediction of T2DM by using hybrid metaheuristic optimization algorithms to improve model accuracy and efficiency while reducing computational time.
Method
The methodology comprised three steps: feature selection and refinement, model optimization, and evaluation. For feature selection, SHAP (SHapley Additive exPlanations) was integrated with Support Vector Machines (SVMs) to identify the most significant predictive features. Particle Swarm Optimization (PSO) then refined this selection into a concise yet informative feature set. For model optimization, Genetic Algorithms (GAs) were applied to tune the hyperparameters of machine learning models, including Artificial Neural Networks (ANNs), Random Forest (RF), and SVM, and Bayesian Optimization (BO) then refined these hyperparameters further. Finally, the models were evaluated using standard classification metrics, such as accuracy, Receiver Operating Characteristic (ROC) curves, and F1 scores, to assess the robustness and reliability of the proposed approach.
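To make the PSO feature-refinement step concrete, the following is a minimal sketch of a standard binary PSO in pure Python. It is an illustration under stated assumptions, not the study's implementation: the function names, the swarm parameters (`w`, `c1`, `c2`, `v_max`), and in particular `toy_fitness` are hypothetical. In a real pipeline the fitness would presumably be a cross-validated model score on the selected features; here a toy function stands in so the example runs without any dataset.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximize `fitness` over 0/1 feature masks with a binary PSO."""
    rng = random.Random(seed)
    # Each particle is a 0/1 mask: 1 keeps the feature, 0 drops it.
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_particles)]
    velocities = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
                  for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p) for p in positions]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal and global bests.
                v = (w * velocities[i][d]
                     + c1 * r1 * (pbest[i][d] - positions[i][d])
                     + c2 * r2 * (gbest[d] - positions[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp to keep sigmoid stable
                velocities[i][d] = v
                # The sigmoid of the velocity is the probability the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                positions[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(positions[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
    return gbest, gbest_fit


# ASSUMPTION: hypothetical stand-in for a model-based score. It rewards
# three "informative" features and charges a small cost per kept feature.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask):
    reward = sum(mask[i] for i in INFORMATIVE)
    return reward - 0.1 * sum(mask)


best_mask, best_fit = binary_pso(toy_fitness, n_features=8)
print("selected features:", [i for i, bit in enumerate(best_mask) if bit])
print("fitness:", round(best_fit, 3))
```

The per-feature cost in the toy fitness reflects the stated goal of a concise feature set: a mask that adds uninformative features scores lower than one that keeps only the informative ones.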
Result
The hybrid metaheuristic optimization approach, which integrated Random Forest with SHAP, PSO, GA, and Bayesian Optimization, performed best among all evaluated methods. It achieved an accuracy of 99.0%, an F1-score of 94.8%, and the largest Area Under the Curve (AUC) of all evaluated approaches. This method also significantly reduced computational time while maintaining reliability and precision.
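For readers less familiar with the reported metrics, the sketch below shows how accuracy, F1-score, and ROC AUC can be computed for a binary classifier in pure Python. The labels, scores, and 0.5 decision threshold are made-up toy values chosen for illustration; they are not data from this study, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only): 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.3f}  f1={f1:.3f}  auc={auc:.3f}")
# prints: accuracy=0.750  f1=0.667  auc=0.933
```

Accuracy counts every correct prediction, whereas F1 depends only on how the positive class is handled, so the two can differ noticeably on the same predictions, as they do in this toy example.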
Conclusion
The hybrid algorithm demonstrated superior efficiency and reliability, making it a promising tool for early T2DM diagnosis. By integrating metaheuristic optimization techniques with robust machine learning models, the study establishes a framework for improving diagnostic accuracy and computational efficiency in medical decision support systems. This research highlights the potential of hybrid optimization to advance healthcare diagnostics.