Publication:
Systematic Review and Comparison of Publicly Available ICU Data Sets-A Decision Guide for Clinicians and Data Scientists.

dc.contributor.author Sauer, Christopher M
dc.contributor.author Dam, Tariq A
dc.contributor.author Celi, Leo A
dc.contributor.author Faltys, Martin
dc.contributor.author de la Hoz, Miguel A A
dc.contributor.author Adhikari, Lasith
dc.contributor.author Ziesemer, Kirsten A
dc.contributor.author Girbes, Armand
dc.contributor.author Thoral, Patrick J
dc.contributor.author Elbers, Paul
dc.date.accessioned 2023-05-03T13:27:58Z
dc.date.available 2023-05-03T13:27:58Z
dc.date.issued 2022-03-02
dc.description.abstract As data science and artificial intelligence continue to rapidly gain traction, the publication of freely available ICU datasets has become invaluable to propel data-driven clinical research. In this guide for clinicians and researchers, we aim to: 1) systematically search and identify all publicly available adult clinical ICU datasets, 2) compare their characteristics, data quality, and richness and critically appraise their strengths and weaknesses, and 3) provide researchers with suggestions on which datasets are appropriate for answering their clinical questions. A systematic search was performed in PubMed, arXiv, medRxiv, and bioRxiv. We selected all studies that reported on publicly available adult patient-level intensive care datasets. A total of four publicly available, adult, critical care, patient-level databases were included (Amsterdam University Medical Center database [AmsterdamUMCdb], eICU Collaborative Research Database [eICU-CRD], High time-resolution intensive care unit dataset [HiRID], and Medical Information Mart for Intensive Care-IV). Databases were compared using a priori defined categories, including demographics, patient characteristics, and data richness. The study protocol and search strategy were prospectively registered. Four ICU databases fulfilled all criteria for inclusion and were queried using SQL (PostgreSQL version 12; PostgreSQL Global Development Group) and analyzed using R (R Foundation for Statistical Computing, Vienna, Austria). The number of unique patient admissions varied between 23,106 (AmsterdamUMCdb) and 200,859 (eICU-CRD). Frequency of laboratory values and vital signs was highest in HiRID, for example, 5.2 (±3.4) lactate values per day and 29.7 (±10.2) systolic blood pressure values per hour. Treatment intensity varied, with vasopressor and ventilatory support in 69.0% and 83.0% of patients in AmsterdamUMCdb versus 12.0% and 21.0% in eICU-CRD, respectively. ICU mortality ranged from 5.5% in eICU-CRD to 9.9% in AmsterdamUMCdb. We identified four publicly available adult clinical ICU datasets. Sample size, severity of illness, treatment intensity, and frequency of reported parameters differ markedly between the databases. These differences should guide clinicians and researchers in choosing which database best answers their clinical questions.
dc.identifier.doi 10.1097/CCM.0000000000005517
dc.identifier.essn 1530-0293
dc.identifier.pmc PMC9150442
dc.identifier.pmid 35234175
dc.identifier.pubmedURL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9150442/pdf
dc.identifier.unpaywallURL https://boris.unibe.ch/166313/1/Systematic_Review_and_Comparison_of_Publicly.94974.pdf
dc.identifier.uri http://hdl.handle.net/10668/19845
dc.issue.number 6
dc.journal.title Critical care medicine
dc.journal.titleabbreviation Crit Care Med
dc.language.iso en
dc.organization Fundación Pública Andaluz Progreso y Salud-FPS
dc.page.number e581-e588
dc.pubmedtype Journal Article
dc.rights.accessRights open access
dc.subject.mesh Adult
dc.subject.mesh Artificial Intelligence
dc.subject.mesh Critical Care
dc.subject.mesh Data Accuracy
dc.subject.mesh Databases, Factual
dc.subject.mesh Humans
dc.subject.mesh Intensive Care Units
dc.subject.mesh Systematic Reviews as Topic
dc.title Systematic Review and Comparison of Publicly Available ICU Data Sets-A Decision Guide for Clinicians and Data Scientists.
dc.type research article
dc.type.hasVersion VoR
dc.volume.number 50
dspace.entity.type Publication
