Enhancing Cardiovascular Risk Prediction: Development of an Advanced Xgboost Model with Hospital-Level Random Effects

FMUP authors
Participants from outside FMUP
- Dong, Tim
- Oronti, Iyabosola Busola
- Sinha, Shubhra
- Zhai, Bing
- Chan, Jeremy
- Fudulu, Daniel P.
- Caputo, Massimo
- Angelini, Gianni D.
Research units
Abstract
Background: Ensemble tree-based models such as Xgboost are highly prognostic in cardiovascular medicine, as measured by the Clinical Effectiveness Metric (CEM). However, their ability to handle correlated data, such as hospital-level effects, is limited.
Objectives: This work aims to develop a binary-outcome mixed-effects Xgboost (BME) model that integrates random effects at the hospital level. To ascertain how well the model handles correlated data in cardiovascular outcomes, its performance is assessed and compared with fixed-effects Xgboost and traditional logistic regression models.
Methods: A total of 227,087 patients over 17 years of age, undergoing cardiac surgery at 42 UK hospitals between 1 January 2012 and 31 March 2019, were included. The dataset was split into two cohorts: training/validation (n = 157,196; 2012-2016) and holdout (n = 69,891; 2017-2019). The outcome variable was 30-day mortality, with hospital as the clustering variable. Logistic regression, mixed-effects logistic regression, Xgboost and binary-outcome mixed-effects Xgboost (BME) models were fitted to both standardized and unstandardized datasets across a range of sample sizes, and the estimated prediction power metrics were compared to identify the best approach.
Results: The exploratory study found high variability in hospital-related mortality across datasets, which supported the adoption of the mixed-effects models. Unstandardized Xgboost BME demonstrated marked improvements in prediction power over the Xgboost model at small sample sizes, but the performance differences decreased as dataset size increased. Generalized linear models (glms) and generalized linear mixed-effects models (glmers) showed a similar pattern, with the Xgboost models also excelling at greater sample sizes.
Conclusions: These findings suggest that integrating mixed effects into machine learning models can enhance their performance on datasets where the sample size is small.
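The idea of a hospital-level random effect on a binary outcome can be illustrated with a minimal sketch. This is not the authors' BME algorithm: it assumes the fixed-effect log-odds are already available (here simulated, standing in for the output of a model such as Xgboost), it holds the random-effect variance `sigma2` fixed instead of estimating it, and the names `fit_hospital_intercepts` and `simulate` are hypothetical. Each hospital receives one intercept, estimated by Newton steps on a log-likelihood penalized by a normal prior, which shrinks the estimates for small hospitals toward zero.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_hospital_intercepts(offsets, hospitals, outcomes, sigma2=1.0, n_iter=25):
    """Estimate one random intercept per hospital (hypothetical helper).

    offsets   : fixed-effect log-odds per patient (assumed given)
    hospitals : hospital id per patient (the clustering variable)
    outcomes  : 0/1 outcome per patient (e.g. 30-day mortality)
    sigma2    : assumed variance of the random intercepts (held fixed here)
    """
    # Start every hospital at zero: no deviation from the fixed-effect prediction.
    b = {h: 0.0 for h in set(hospitals)}
    for _ in range(n_iter):
        # Gradient and curvature of the penalized log-likelihood, per hospital.
        grad = {h: -b[h] / sigma2 for h in b}
        curv = {h: 1.0 / sigma2 for h in b}
        for f, h, y in zip(offsets, hospitals, outcomes):
            p = sigmoid(f + b[h])
            grad[h] += y - p
            curv[h] += p * (1.0 - p)
        for h in b:
            b[h] += grad[h] / curv[h]  # one Newton step per hospital
    return b


def simulate(true_b, n_per_hospital=2000, seed=0):
    """Simulate clustered binary outcomes (illustrative data only)."""
    rng = random.Random(seed)
    offsets, hospitals, outcomes = [], [], []
    for h, b_h in true_b.items():
        for _ in range(n_per_hospital):
            f = -2.0 + rng.gauss(0.0, 1.0)  # stand-in for a model's log-odds
            y = 1 if rng.random() < sigmoid(f + b_h) else 0
            offsets.append(f)
            hospitals.append(h)
            outcomes.append(y)
    return offsets, hospitals, outcomes


# Three simulated hospitals with low, average and high baseline risk.
true_b = {"A": -1.0, "B": 0.0, "C": 1.0}
offsets, hospitals, outcomes = simulate(true_b)
b_hat = fit_hospital_intercepts(offsets, hospitals, outcomes)
for h in sorted(b_hat):
    print(h, round(b_hat[h], 2))
```

In a full mixed-effects procedure, a step like this would typically alternate with refitting the fixed-effect model and re-estimating the variance; the sketch shows only the hospital-intercept update under the stated assumptions.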
Publication data
- ISSN/eISSN: 2306-5354, 2306-5354
- Type: Article
- Pages: -
- Link to another resource: www.scopus.com
BIOENGINEERING-BASEL MDPI AG
Affiliations
Keywords
- machine learning; AI; random effects; cardiovascular medicine; risk prediction; expectation-maximization; xgboost
Associated projects
Stimulate continuous monitoring in personal and physical health.
Principal Investigator: José Alberto da Silva Freitas
Academic Observational Study (INNO4HEALTH). FCT. 2021
Studies assessing the feasibility, usability and use of a mobile phone app for the management of type 2 diabetes.
Principal Investigator: José Alberto da Silva Freitas
Academic Observational Study (FoodFriend). FCT. 2022
Portuguese Public Hospitals Financial Performance between 2014-2020
Principal Investigator: José Alberto da Silva Freitas
Academic Clinical Study (Financial Performance). 2023
Trends in Heart Failure Hospitalizations over a Sixteen-Year Period: Nationwide Data for Portugal
Principal Investigator: José Alberto da Silva Freitas
Academic Clinical Study (Hospitalizações IC). 2022
The use of secondary data in Mental Health research
Principal Investigator: José Alberto da Silva Freitas
Academic Clinical Study. 2023
Health priorities in the European Union - a novel framework
Principal Investigator: José Alberto da Silva Freitas
Academic Clinical Study. 2023
Healthcare Human Resources and Quality Indicators: Approaches to Strengthening Primary Care.
Principal Investigator: José Alberto da Silva Freitas
Academic Clinical Study. 2022
A machine learning-based approach to support the assessment of clinical coded data quality in the context of Diagnosis-Related Groups classification systems
Principal Investigator: José Alberto da Silva Freitas
Academic Clinical Study. 2020
Cite this publication
Dong T, Oronti IB, Sinha S, Freitas A, Zhai B, Chan J, Fudulu DP, Caputo M, Angelini GD. Enhancing Cardiovascular Risk Prediction: Development of an Advanced Xgboost Model with Hospital-Level Random Effects. Bioeng. 2024. 11. (10):1039. IF:4,600. (2).