Electronic Health Record (EHR) System Development for Study on EHR Data-based Early Prediction of Diabetes Using Machine Learning Algorithms

Jagadamba G1, Shashidhar R2, Gururaj H L3, *, Vinayakumar Ravi4, *, Meshari Almeshari5, Yasser Alzamil5
1 Department of Information Science and Engineering, Siddaganga Institute of Technology, Tumakuru, Karnataka- 57210, India
2 Department of Electronics and Communication Engineering, JSS Science and Technology University, Mysuru, Karnataka 570006, India
3 Department of Information Technology, Manipal Institute of Technology, Bengaluru, Manipal Academy of Higher Education, Manipal, 576104, India
4 Center for Artificial Intelligence, Prince Mohammad Bin Fahd University, Khobar, Saudi Arabia
5 Department of Diagnostic Radiology, College of Applied Medical Sciences, University of Ha'il, Ha'il, Saudi Arabia

Creative Commons License
© 2023 Jagadamba et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to these authors at the Center for Artificial Intelligence, Prince Mohammad Bin Fahd University, Khobar, Saudi Arabia.



This work aims to develop an interoperable electronic health record (EHR) system that supports the early detection of diabetes using machine learning (ML) algorithms. A decision support system built on several ML algorithms can help optimize preventive-care decisions within the health information system.


The proposed system consisted of two modules. The first module covered the development of an interoperable EHR system with a precisely defined database structure. The second module comprised data extraction from the EHR system, data cleaning, data processing, and prediction. The health records of about 1080 patients were used for training and testing. Of these, 1000 records came from a Kaggle dataset, and 80 records held demographic information on patients who visited our health center of the Siddaganga organization for a regular checkup or during an emergency. The demographic information was collected through the proposed EHR system.
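The extraction and cleaning steps of the second module could look roughly like the sketch below. It is an illustration only: the field names, the toy values, the rule of dropping unlabelled records, and the median imputation are all assumptions, since the abstract does not describe the EHR schema or the cleaning rules that were actually used.

```python
from statistics import median

# Hypothetical field names; the abstract does not list the EHR schema.
FIELDS = ["glucose", "blood_pressure", "bmi", "age"]


def clean_records(records):
    """Drop records without an outcome label, then fill missing values
    with the per-field median (an assumed cleaning rule)."""
    labelled = [r for r in records if r.get("outcome") in (0, 1)]

    # Median of the values that are actually present for each field.
    medians = {}
    for field in FIELDS:
        present = [r[field] for r in labelled if r.get(field) is not None]
        medians[field] = median(present) if present else None

    cleaned = []
    for r in labelled:
        row = {f: (r[f] if r.get(f) is not None else medians[f]) for f in FIELDS}
        row["outcome"] = r["outcome"]
        cleaned.append(row)
    return cleaned


# Made-up rows standing in for the public dataset and the EHR export.
public_rows = [
    {"glucose": 148, "blood_pressure": 72, "bmi": 33.6, "age": 50, "outcome": 1},
    {"glucose": 85, "blood_pressure": None, "bmi": 26.6, "age": 31, "outcome": 0},
]
ehr_rows = [
    {"glucose": 110, "blood_pressure": 80, "bmi": None, "age": 45, "outcome": 0},
    {"glucose": 130, "blood_pressure": 76, "bmi": 30.1, "age": 52},  # no label: dropped
]

dataset = clean_records(public_rows + ehr_rows)
```

Combining both sources before cleaning keeps the imputation values consistent across the public and the locally collected records.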


The proposed system was tested for the interoperability of the EHR system and for the accuracy of diabetes prediction by the proposed decision support system. Interoperability was tested through random updates issued from various systems maintained in the laboratory, each acting as the admin system of a different hospital. The EHR system was tested for load handling and interoperability by considering user view status, the match between the system and the real world, consistency of data updates, security, etc. The prediction phase concentrated on diabetes. The features were not chosen at random; they were prescribed by a doctor, who held that they were sufficient for an initial prediction. The reports collected from the doctors revealed several features they considered before giving the test details. The dataset was split into training and test sets, with eight features taken as input and one column as the target variable holding the result. The model was then imported from the standard “sklearn” library and fitted with the required number of estimators, that is, the number of decision trees. The features included pregnancies, glucose level, blood pressure, skin thickness, insulin level, body mass index, diabetes pedigree function, age, weight, etc. Overall, the research work concentrated on developing an interoperable EHR system, predicting diabetic and non-diabetic conditions, and demonstrating the accuracy of the system.
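The split-and-fit step described above can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions, not the authors' code: the data are synthetic (generated with make_classification to mimic 1080 records with eight input features), and the 75/25 split, the 100 trees, and the random seeds are illustrative choices that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the study data: 1080 records, eight input
# features, one binary target (diabetic / non-diabetic).
X, y = make_classification(
    n_samples=1080, n_features=8, n_informative=5, random_state=0
)

# Assumed 75/25 split; the abstract does not state the ratio used.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# n_estimators is the number of decision trees in the forest.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Accuracy on the held-out records only.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.3f}")
```

The printed accuracy reflects the synthetic data only and says nothing about the accuracy reported in the study.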


The first aim of this study was to design an interoperable EHR system that helps accumulate, store, and share a patient's health records in a timely manner over a lifetime. The second aim was to use EHR data for the early prediction of diabetes in the user. To confirm its accuracy, the system was tested for interoperability and for its ability to support early prediction through a decision support system.

Keywords: Electronic health record (EHR), Electronic medical record (EMR), Machine learning, Artificial intelligence, Random forest algorithm, Healthcare, Classification, Diabetic prediction.