Objective To develop an interpretable machine learning model for predicting mortality risk in elderly intensive care unit (ICU) patients with type 2 diabetes mellitus (T2DM) and cerebral infarction, and to identify critical prognostic factors.

Methods We extracted data on 514 elderly patients with T2DM and cerebral infarction from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database. The dataset was partitioned into training and test sets (7:3 ratio) using scikit-learn. Within the training set, collinearity analysis was conducted and features with a variance inflation factor >5 were excluded; Lasso regression was then used to refine the feature selection. Six machine learning models (eXtreme Gradient Boosting [XGBoost], logistic regression, LightGBM, AdaBoost, decision tree, and gradient boosting decision tree) were constructed and evaluated with five-fold cross-validation. The best-performing model was interpreted with SHAP analysis on the test set to rank the mortality-associated predictors and characterize their nonlinear interactions.

Results The XGBoost model showed the best training performance and generalization ability. The areas under the receiver operating characteristic curve for 30-day and 365-day mortality risk were 0.928 (95% CI 0.853-0.995) and 0.882 (95% CI 0.800-0.963), respectively. SHAP analysis identified the Oxford Acute Severity of Illness Score, length of hospital stay, congestive heart failure, length of ICU stay, peripheral capillary oxygen saturation, and heart rate as the top six predictors of 30-day mortality risk, whereas blood urea nitrogen, Oxford Acute Severity of Illness Score, peripheral capillary oxygen saturation, age, heart rate, and respiratory rate were the top six predictors of 365-day mortality risk.

Conclusion The XGBoost model shows considerable potential for predicting mortality risk in elderly ICU patients with T2DM and cerebral infarction and highlights the importance of key clinical predictors.
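The Methods steps (7:3 split, variance inflation factor filtering at 5, Lasso selection, five-fold cross-validation, test-set AUC) can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions: the data are synthetic rather than MIMIC-IV, the split is stratified, the feature with the highest VIF is dropped iteratively, a linear Lasso is fitted on the 0/1 outcome, and scikit-learn's GradientBoostingClassifier stands in for XGBoost to keep dependencies light. The SHAP step is omitted. All variable and function names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler


def vif(X, j):
    """VIF of column j: 1 / (1 - R^2) from regressing it on the other columns."""
    others = np.delete(X, j, axis=1)
    r2 = LinearRegression().fit(others, X[:, j]).score(others, X[:, j])
    return 1.0 / max(1.0 - r2, 1e-12)  # guard against perfect collinearity


def drop_collinear(X, threshold=5.0):
    """Assumed procedure: drop the highest-VIF column until all VIFs <= threshold."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        vifs = [vif(X[:, keep], j) for j in range(len(keep))]
        worst = int(np.argmax(vifs))
        if vifs[worst] <= threshold:
            break
        keep.pop(worst)
    return keep


# Synthetic stand-in data: 514 rows, six independent features plus a
# near-duplicate of the first one (column 6) to trigger the VIF filter.
rng = np.random.default_rng(0)
n = 514
X = rng.normal(size=(n, 6))
X = np.column_stack([X, X[:, 0] + 0.1 * rng.normal(size=n)])
logit = 1.5 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# 7:3 split (stratification is an assumption, not stated in the abstract).
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Collinearity filter, computed on the training set only.
kept = drop_collinear(X_tr, threshold=5.0)
X_tr, X_te = X_tr[:, kept], X_te[:, kept]

# Lasso selection on standardized training features.
scaler = StandardScaler().fit(X_tr)
lasso = LassoCV(cv=5, random_state=0).fit(scaler.transform(X_tr), y_tr)
selected = lasso.coef_ != 0
if not selected.any():  # fall back to all features if Lasso zeroes everything
    selected[:] = True
X_tr, X_te = X_tr[:, selected], X_te[:, selected]

# Five-fold cross-validated AUC on the training set, then held-out AUC.
model = GradientBoostingClassifier(random_state=0)  # stand-in for XGBoost
cv_auc = cross_val_score(model, X_tr, y_tr, cv=5, scoring="roc_auc")
test_auc = roc_auc_score(y_te, model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])

print("kept after VIF filter:", kept)
print("mean CV AUC:", cv_auc.mean(), "test AUC:", test_auc)
```

Running the VIF filter and the Lasso step on the training set only, as the abstract describes, keeps information from the test set out of feature selection, so the held-out AUC remains an honest estimate of generalization.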