RESEARCH ARTICLE


Unsupervised Clustering in Epidemiological Factor Analysis



Serge Dolgikh1, 2, *
1 Department of Systems Engineering, Solana Networks, 301 Moodie Dr., Ottawa, K2H9C4, Canada
2 National Aviation University, 1 Liubomyra Huzara Ave, 1, Kyiv, 03058, Ukraine




Creative Commons License
© 2021 Serge Dolgikh.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of Systems Engineering, Solana Networks, 301 Moodie Dr., Ottawa, K2H9C4, Canada; E-mail: serged.7@gmail.com


Abstract

Background:

Analyzing epidemiological data in the early phase of an epidemic, before the contributing factors have been confidently correlated with outcomes, can be challenging for conventional methods of data analysis.

Objective:

This study aimed to develop approaches for the early analysis of epidemiological data that remain effective when little labeled data is available.

Methods:

A combined dataset of epidemiological statistics from national and subnational jurisdictions, aligned at approximately two months after the first local exposure to COVID-19, was analyzed with unsupervised machine learning methods, including principal component analysis and deep neural network dimensionality reduction, to identify the principal factors of influence.
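
As an illustration of the principal component analysis step only, the following minimal sketch projects standardized factor data onto its two leading principal components. It is not the study's code: the helper `pca_project`, the four factors, and the synthetic "mild" and "severe" groups are hypothetical stand-ins, and the neural network dimensionality reduction is not shown.

```python
import numpy as np

def pca_project(features, n_components=2):
    """Standardize columns, then project onto the leading principal components."""
    x = np.asarray(features, dtype=float)
    std = x.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    z = (x - x.mean(axis=0)) / std
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)  # fraction of variance per component
    return z @ vt[:n_components].T, explained[:n_components]

# Synthetic stand-in data (NOT the study's dataset): 40 "mild" and 10 "severe"
# jurisdictions, each described by 4 made-up factors.
rng = np.random.default_rng(0)
mild = rng.normal(loc=0.0, scale=1.0, size=(40, 4))
severe = rng.normal(loc=4.0, scale=1.0, size=(10, 4))
data = np.vstack([mild, severe])

coords, explained = pca_project(data)
print(coords.shape, explained)
```

In this toy setting the two groups differ along every factor, so they should separate along the first component; real epidemiological data would likely need careful feature selection and scaling before such a projection is informative.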

Results:

The approach and methods used in the study clearly separated milder background cases from those with the most rapid and aggressive onset of the epidemic.

Conclusion:

The findings can be used to evaluate possible epidemiological scenarios, and the approach can serve as an effective modeling tool for identifying potentially adverse scenarios and designing corrective and preventive measures before situations with severe impacts develop.

Keywords: Infectious diseases, Epidemiology, COVID-19, Machine learning, Unsupervised learning, Clustering.