All published articles of this journal are available on ScienceDirect.

RESEARCH ARTICLE

Early Detection of Neonatal Infection in NICU Using Machine Learning Models: A Retrospective Observational Cohort Study

The Open Bioinformatics Journal, 01 December 2025. DOI: 10.2174/0118750362416494251124105928

Abstract

Introduction

Neonatal infections remain a major threat in intensive care units (ICUs), often progressing rapidly and asymptomatically within the first hours of admission. Early detection is critical to improve outcomes, yet timely and reliable risk prediction remains a challenge.

Method

We conducted a retrospective observational cohort study that used explainable machine learning to predict early neonatal infection from high-resolution data in the MIMIC-III database. Two time windows, 30 and 120 minutes after ICU admission, were analyzed. Physiological and hematological variables were aggregated, and missing data were imputed using iterative imputation. Multiple classification models were evaluated through stratified five-fold cross-validation, and model interpretability was assessed with feature importance and SHAP analysis.
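As an illustration of the windowed aggregation step, the following minimal sketch builds one feature row per observation window from time-stamped readings. The record layout, the variable names, the use of the mean as the aggregate, and the sample values are all assumptions made for illustration; the abstract does not specify them. Variables with no reading inside a window are left as `None`, standing in for the missing values that would later be imputed.

```python
from statistics import mean

# Observation windows named in the abstract (minutes after ICU admission).
WINDOWS_MIN = (30, 120)


def aggregate_window(measurements, window_min):
    """Average each variable over readings taken within the window.

    measurements: iterable of (minutes_since_admission, variable_name, value).
    Returns {variable_name: mean_value}; variables without a reading are absent.
    """
    by_variable = {}
    for minutes, name, value in measurements:
        if 0 <= minutes <= window_min and value is not None:
            by_variable.setdefault(name, []).append(value)
    return {name: mean(values) for name, values in by_variable.items()}


def build_feature_row(measurements, variables, window_min):
    """Return values in a fixed variable order, with None marking missing data."""
    aggregated = aggregate_window(measurements, window_min)
    return [aggregated.get(name) for name in variables]


# Hypothetical readings for a single admission (not real patient data).
readings = [
    (5, "heart_rate", 150.0),
    (20, "heart_rate", 160.0),
    (25, "temperature", 36.8),
    (90, "heart_rate", 170.0),
    (100, "wbc", 12.0),
]
VARIABLES = ["heart_rate", "temperature", "wbc"]

row_30 = build_feature_row(readings, VARIABLES, WINDOWS_MIN[0])
row_120 = build_feature_row(readings, VARIABLES, WINDOWS_MIN[1])
print(row_30)
print(row_120)
```

In this toy case the 30-minute row has no white blood cell value while the 120-minute row does, which mirrors the point that longer windows carry more complete data.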

Result

CatBoost demonstrated the highest performance in the 30-minute window (F1-score = 0.76), while Gradient Boosting achieved the best results in the 120-minute window (F1-score ≈ 0.80). Key predictors included heart rate, white blood cell count, and temperature, reflecting both physiological stability and immune response. Model performance improved with longer observation windows, underscoring the role of data availability in predictive accuracy.

Discussion

A staged deployment, with CatBoost at admission followed by Gradient Boosting after 2 hours, could balance immediacy and precision. High missing-value rates were manageable with model-based imputation, yet external validation is still required, given the single-centre data and device heterogeneity.
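The staged idea can be sketched as a small routing function that uses the early model until the later window's features are available. This is only a sketch under stated assumptions: the function name `staged_risk`, the 120-minute switch point as a parameter, and the stand-in scoring callables are hypothetical, and the callables here are placeholders rather than trained CatBoost or Gradient Boosting models.

```python
from typing import Callable, Optional, Sequence, Tuple

Features = Sequence[Optional[float]]
Scorer = Callable[[Features], float]


def staged_risk(
    minutes_since_admission: float,
    features_30: Features,
    features_120: Optional[Features],
    early_model: Scorer,
    late_model: Scorer,
    switch_min: float = 120.0,
) -> Tuple[str, float]:
    """Return (stage, risk) from whichever model fits the elapsed time.

    The late model is used only once the switch point has passed and its
    features exist; otherwise the early model's score is reported.
    """
    if minutes_since_admission >= switch_min and features_120 is not None:
        return "late", late_model(features_120)
    return "early", early_model(features_30)


# Stand-in scorers (assumptions): constant outputs in place of fitted models.
def early_stub(features: Features) -> float:
    return 0.3


def late_stub(features: Features) -> float:
    return 0.7


print(staged_risk(45, [155.0, 36.8, None], None, early_stub, late_stub))
print(staged_risk(130, [155.0, 36.8, None], [160.0, 36.8, 12.0], early_stub, late_stub))
```

Falling back to the early model when the later features are absent keeps a score available throughout the first two hours, at the cost of the lower precision reported for the 30-minute window.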

Conclusion

This study proposes a two-stage decision-support system that adapts to data collected during early and later ICU admission periods. By combining accurate prediction with model interpretability, the framework may enable timely diagnosis and targeted interventions, ultimately reducing neonatal morbidity and mortality.

Keywords: Neonatal infection, Machine learning, Early prediction, MIMIC-III, CatBoost, Gradient boosting, NICU.