Using Unsupervised Machine Learning to Identify Age- and Sex-Independent Severity Subgroups Among Patients with COVID-19: Observational Longitudinal Study.

Date

2021-05-27

Authors

Benito-León, Julián
Del Castillo, Mª Dolores
Estirado, Alberto
Ghosh, Ritwik
Dubey, Souvik
Serrano, J Ignacio

Abstract

Early detection and intervention are the key factors for improving outcomes in patients with COVID-19. The objective of this observational longitudinal study was to identify nonoverlapping severity subgroups (ie, clusters) among patients with COVID-19, based exclusively on clinical data and standard laboratory tests obtained during patient assessment in the emergency department. We applied unsupervised machine learning to a data set of 853 patients with COVID-19 from the HM group of hospitals (HM Hospitales) in Madrid, Spain. Age and sex were not considered while building the clusters, as these variables could introduce biases in machine learning algorithms and raise ethical implications or enable discrimination in triage protocols. From 850 clinical and laboratory variables, four tests, namely the serum levels of aspartate transaminase (AST), lactate dehydrogenase (LDH), and C-reactive protein (CRP) and the number of neutrophils, were enough to segregate the entire patient pool into three separate clusters. Further, the percentage of monocytes and lymphocytes and the levels of alanine transaminase (ALT) distinguished cluster 3 patients from the other two clusters. The highest proportion of deceased patients; the highest levels of AST, ALT, LDH, and CRP; the highest number of neutrophils; and the lowest percentages of monocytes and lymphocytes characterized cluster 1. Cluster 2 included a lower proportion of deceased patients and intermediate levels of the previous laboratory tests. The lowest proportion of deceased patients; the lowest levels of AST, ALT, LDH, and CRP; the lowest number of neutrophils; and the highest percentages of monocytes and lymphocytes characterized cluster 3. A few standard laboratory tests, deemed available in all emergency departments, have shown good discriminative power for the characterization of severity subgroups among patients with COVID-19.
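
To make the clustering idea in the abstract concrete, the following is a minimal sketch of grouping patients into three clusters from four laboratory values, with age and sex left out of the feature set. The abstract does not name the unsupervised algorithm that was used, so plain k-means is swapped in here purely for illustration. The feature names, the helper functions, and every numeric value are hypothetical: the data are synthetic and are not taken from the study.

```python
import random
import statistics

# Hypothetical feature names; age and sex are deliberately not included.
FEATURES = ("ast", "ldh", "crp", "neutrophils")


def standardize(rows):
    """Z-score every column so that no single test dominates by scale."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    sds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd-style k-means; stops when assignments no longer change."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [statistics.fmean(col) for col in zip(*members)]
    return labels, centroids


def rank_by_severity(labels, centroids):
    """Renumber clusters so that 1 has the highest standardized lab values
    and the last cluster the lowest (an illustrative ordering choice)."""
    order = sorted(range(len(centroids)), key=lambda j: -sum(centroids[j]))
    rename = {old: new + 1 for new, old in enumerate(order)}
    return [rename[lab] for lab in labels]


def synthetic_patients(n_per_group=30, seed=0):
    """Generate made-up, well-separated groups (NOT study data)."""
    rng = random.Random(seed)
    # Invented group centers in FEATURES order: high, intermediate, low.
    group_means = [(80, 450, 150, 9.0), (45, 300, 60, 5.5), (25, 200, 10, 3.0)]
    rows, truth = [], []
    for group, means in enumerate(group_means, start=1):
        for _ in range(n_per_group):
            rows.append([m * (1 + rng.uniform(-0.05, 0.05)) for m in means])
            truth.append(group)
    return rows, truth


rows, truth = synthetic_patients()
labels, centroids = kmeans(standardize(rows), k=3)
clusters = rank_by_severity(labels, centroids)
print({c: clusters.count(c) for c in (1, 2, 3)})
```

On real emergency department data the groups would overlap far more than in this toy example, so the choice of algorithm, the number of clusters, and the feature selection step would all need to be validated rather than assumed.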

MeSH Terms

Alanine Transaminase
Aspartate Aminotransferases
C-Reactive Protein
COVID-19
Cell Count
Cluster Analysis
Datasets as Topic
Emergency Service, Hospital
Humans
L-Lactate Dehydrogenase
Longitudinal Studies
Lymphocytes
Monocytes
Neutrophils
Prognosis
Spain
Triage
Unsupervised Machine Learning

Keywords

COVID-19, characterization, data set, detection, emergency, intervention, machine learning, outcome, severity, subgroup, testing
