Prediction of Acute Kidney Injury Following Isolated Coronary Artery Bypass Grafting in Heart Failure Patients with Preserved Ejection Fraction Using Machine Learning with a Novel Nomogram

Background: The incidence of postoperative acute kidney injury (AKI) is high in patients with heart failure because of insufficient renal perfusion. Heart failure patients are highly heterogeneous; restricting analysis to those with preserved ejection fraction (HFpEF) may therefore yield more accurate results. Few studies have predicted AKI after coronary artery bypass grafting (CABG) in HFpEF patients, especially using machine learning. Methods: Patients were recruited into this study from 2018 to 2022. AKI was defined according to the Kidney Disease Improving Global Outcomes (KDIGO) criteria. The machine learning methods adopted included logistic regression, random forest (RF), extreme gradient boosting (XGBoost), Gaussian naive Bayes (GNB), and light gradient boosting machine (LGBM). We used the receiver operating characteristic (ROC) curve to evaluate the performance of these models, and the integrated discrimination improvement (IDI) and net reclassification improvement (NRI) to compare the prediction models. Results: In our study, 417 (23.6%) patients developed AKI. Among the five models, random forest was the best predictor of AKI, with an area under the curve (AUC) of 0.834 (95% confidence interval (CI) 0.80–0.86); its IDI and NRI were also better than those of the other models. Ejection fraction (EF), estimated glomerular filtration rate (eGFR), age, albumin (Alb), uric acid (UA), and lactate dehydrogenase (LDH) were significant risk factors in the random forest model. Conclusions: EF, eGFR, age, Alb, UA, and LDH are independent risk factors for AKI in HFpEF patients after CABG according to the random forest model. EF, eGFR, and Alb positively correlated with age; UA and LDH had a negative correlation. The application of machine learning can better predict the occurrence of AKI after CABG and may help to improve the prognosis of HFpEF patients.


Introduction
The incidence of acute kidney injury (AKI) after coronary artery bypass grafting (CABG) is high and has been reported to range from 6.7% to 39% [1–3]. AKI is associated with increased morbidity and mortality after CABG [3–5], which rise further in the more severe stages of AKI, and with increased short-term and long-term mortality [4,6–9]. AKI after cardiac surgery also increases intensive care unit (ICU) length of stay and resource utilization [2,10].
The development of AKI involves a variety of mechanisms, including ischemia-reperfusion injury, renal toxin release, hemolysis, oxidative stress, and cytokine secretion, which can cause a systemic inflammatory response, endothelial damage, and renal tubular cell damage [1,11–13]. Previous studies have shown that older age, low ejection fraction, a previous history of kidney disease, and increased time on cardiopulmonary bypass are important predictors of the development of AKI [6,14–16].
While patients with heart failure and reduced ejection fraction are more likely to develop AKI after CABG, patients with preserved ejection fraction may also develop AKI after surgery, which likewise seriously affects their prognosis. The research methods for disease risk prediction models are constantly being updated, and the introduction of machine learning now offers another technique to predict the occurrence of adverse events after surgery [17]. Traditional logistic analysis generally handles data sets of moderate size with a single data type, structured data, and simple parametric models that satisfy certain assumptions. For large and complex data, machine learning methods can provide more accurate risk prediction. In this study, we sought to predict AKI in patients with preserved ejection fraction after isolated CABG using machine learning methodology.

Patients and Setting
A total of 1767 patients who underwent CABG for the first time from 2018 to 2022 were recruited into this study. According to the Kidney Disease Improving Global Outcomes (KDIGO) diagnostic criteria for AKI [18], patients were divided into those who developed AKI (AKI group) and those who did not (non-AKI group).

Definition of AKI
AKI was defined according to the KDIGO criteria [18]: an increase in serum creatinine (Scr) of ≥0.3 mg/dL, an increase in Scr to ≥1.5 times baseline within 7 days after surgery, or a urine volume ≤0.5 mL/kg/h for 6 h.
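As an illustration, the creatinine portion of this definition maps onto a simple check. The function below is a hedged sketch (the name `is_aki` and its structure are ours, not the study's); the urine-output criterion is omitted because it was not used in this study (see Limitations).

```python
def is_aki(baseline_scr_mg_dl: float, postop_scr_mg_dl: float) -> bool:
    """Illustrative KDIGO creatinine criterion: an absolute rise of
    >= 0.3 mg/dL, or a rise to >= 1.5x baseline within 7 days of surgery.
    (Sketch only; the urine-output criterion is not implemented.)"""
    absolute_rise = postop_scr_mg_dl - baseline_scr_mg_dl
    relative_rise = postop_scr_mg_dl / baseline_scr_mg_dl
    return absolute_rise >= 0.3 or relative_rise >= 1.5
```

For example, a patient whose Scr rises from 1.0 to 1.3 mg/dL meets the absolute criterion, while a rise from 1.0 to 1.25 mg/dL meets neither.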

Data Collection
Detailed clinical information included age, sex, body mass index (BMI), previous cardiac history (previous myocardial infarction and previous percutaneous coronary intervention (PCI)), diabetes, hypertension, carotid artery stenosis, previous stroke or chronic obstructive pulmonary disease, smoking, baseline renal function (estimated glomerular filtration rate, eGFR), anemia, and preoperative intra-aortic balloon pump (IABP) implantation.

Model Development
We used logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), Gaussian naive Bayes (GNB), and light gradient boosting machine (LGBM) algorithms to filter out significant variables. The significant variables were then used to train and validate the models. In our study, 80% of the population formed the training group, while the remaining 20% served as the validation group. The process was repeated five times so that each subset served once as the validation set, which accounts for differences between patients and provides risk estimates for all cases. The software (version 4.1.0) packages, including XGBClassifier, LGBMClassifier, sklearn.naive_bayes, sklearn.model_selection, sklearn.metrics, and sklearn.ensemble, were used for analysis, as shown in Fig. 1. We used the receiver operating characteristic (ROC) curve to evaluate the performance of these models. The integrated discrimination improvement (IDI) and net reclassification improvement (NRI) were also used to evaluate the prediction models.
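The pipeline described above can be sketched in scikit-learn as follows. This is an illustrative reconstruction on synthetic data, not the study's code or data; to keep the sketch self-contained, XGBoost and LightGBM are stood in for by scikit-learn's GradientBoostingClassifier, and the synthetic outcome rate is set near the study's 23.6% AKI incidence.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the cohort: 1767 patients, ~23.6% positives.
X, y = make_classification(n_samples=1767, n_features=20,
                           weights=[0.764], random_state=0)

# 80/20 split, stratified on the outcome as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=300, random_state=0),
    "GNB": GaussianNB(),
    "GB": GradientBoostingClassifier(random_state=0),  # stand-in for XGBoost/LGBM
}

# Fit each candidate and score it by ROC-AUC on the held-out set.
aucs = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    aucs[name] = roc_auc_score(y_test, proba)
```

In the study the repetition over five splits would wrap this loop so every patient appears once in a validation fold.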

Outcome Measures
The most important variables were screened out by the five models, and then the area under the curve (AUC), NRI, and IDI of each model were compared. The model with the highest AUC was selected as the best prediction model, and its calibration was then checked. The most significant factors were included in the nomogram.
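For reference, the IDI and the category-free NRI used to compare models can be computed as below. These are standard Pencina-style formulations written by us for illustration, not code from the study.

```python
import numpy as np

def idi(y, p_old, p_new):
    """Integrated discrimination improvement: the change in mean predicted
    risk among events minus the change among non-events."""
    y, p_old, p_new = (np.asarray(a, float) for a in (y, p_old, p_new))
    ev, ne = y == 1, y == 0
    return ((p_new[ev].mean() - p_old[ev].mean())
            - (p_new[ne].mean() - p_old[ne].mean()))

def nri(y, p_old, p_new):
    """Continuous (category-free) net reclassification improvement:
    net proportion of events moved up plus non-events moved down."""
    y, p_old, p_new = (np.asarray(a, float) for a in (y, p_old, p_new))
    up, down = p_new > p_old, p_new < p_old
    ev, ne = y == 1, y == 0
    return ((up[ev].mean() - down[ev].mean())
            + (down[ne].mean() - up[ne].mean()))
```

A positive IDI or NRI indicates that the new model's predicted risks separate events from non-events better than the reference model's.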

Statistical Analysis
SPSS 23.0 for Mac (IBM SPSS Statistics, Armonk, NY, USA), R (version 4.1.0, R Foundation for Statistical Computing, Vienna, Austria) and Python (version 3.5) were used for statistical analysis. Continuous variables were reported as the mean ± standard deviation or median (interquartile range (IQR)). Categorical variables were reported as absolute frequencies and percentages. Student's t-test was applied for normally distributed continuous data with equal or unequal variances. The Mann-Whitney U test was applied for continuous data that were not normally distributed. Pearson's χ² and Fisher's exact tests were used for categorical data. A p < 0.05 was considered statistically significant.
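The univariate comparisons above map directly onto SciPy; the sketch below uses hypothetical toy values, not study data.

```python
import numpy as np
from scipy import stats

# Hypothetical ages in the AKI and non-AKI groups (toy data).
age_aki = np.array([68, 72, 75, 70, 69])
age_no_aki = np.array([61, 64, 66, 63, 60])

# Student's t-test; equal_var=False gives Welch's variant for unequal variances.
t_stat, t_p = stats.ttest_ind(age_aki, age_no_aki, equal_var=False)

# Mann-Whitney U test for non-normally distributed continuous data.
u_stat, u_p = stats.mannwhitneyu(age_aki, age_no_aki)

# Pearson chi-square on a 2x2 table (e.g., a comorbidity vs AKI; counts are made up).
table = np.array([[120, 297],
                  [300, 1050]])
chi2_stat, chi_p, dof, expected = stats.chi2_contingency(table)
```

Fisher's exact test (`stats.fisher_exact`) would replace the chi-square when expected cell counts are small.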

Patient Characteristics
A total of 1767 patients with HFpEF were included. The baseline clinical data of the training and test groups are shown in Table 1. There was no significant difference in baseline characteristics between the training group and the validation group. The incidence of AKI was 23.6%. The comparison of ROC curves among the five models is shown in Fig. 2. The RF model performed best, with the highest C-statistic (0.834, 95% CI 0.80–0.86; Brier score: 0.142; NRI: 0.044; IDI: 0.172). The results of the other models are shown in Table 2.

Predictor Variables
The five models each screened out the most important predictors, and random forest showed the best predictive performance. Ejection fraction (EF), eGFR, age, albumin (Alb), uric acid (UA), and lactate dehydrogenase (LDH) were the most important risk factors in the random forest model.
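A variable ranking like the one reported here can be read from a fitted random forest's impurity-based importances. The sketch below uses synthetic data, with the study's six predictor names attached for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# The six predictors reported by the study; the data are synthetic.
feature_names = ["EF", "eGFR", "age", "Alb", "UA", "LDH"]
X, y = make_classification(n_samples=500, n_features=6,
                           n_informative=4, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# feature_importances_ are normalized to sum to 1; sort descending.
ranking = sorted(zip(feature_names, rf.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
```

Permutation importance (`sklearn.inspection.permutation_importance`) is a common alternative when impurity-based importances are biased by feature cardinality.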

Calibration of the Model
The calibration of the best model was then assessed. The Hosmer-Lemeshow goodness-of-fit test was used to evaluate the calibration of the prediction model. The p value was 0.14, indicating that the calibration of the RF model was good. The calibration curve is shown in Fig. 3.
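The Hosmer-Lemeshow test has no single canonical library implementation; the function below is an assumed decile-based version written for illustration, following the standard formulation rather than the study's code.

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow goodness-of-fit test (illustrative sketch).

    Sorts predicted probabilities p into `groups` bins, compares observed
    event counts with expected counts in each bin, and returns the
    chi-square statistic and its p-value on (groups - 2) degrees of freedom.
    """
    y, p = np.asarray(y, float), np.asarray(p, float)
    order = np.argsort(p)
    y, p = y[order], p[order]
    stat = 0.0
    for idx in np.array_split(np.arange(len(p)), groups):
        observed = y[idx].sum()          # observed events in this decile
        expected = p[idx].sum()          # expected events in this decile
        n = len(idx)
        stat += (observed - expected) ** 2 / (expected * (1 - expected / n))
    return stat, chi2.sf(stat, groups - 2)
```

A p value above 0.05, as in this study (p = 0.14), means no significant lack of fit is detected between predicted and observed risks.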

Construction of Tools for Patient Classification
To calculate the probability of postoperative AKI, we included the most important risk factors in a nomogram. Using the nomogram, the incidence of AKI can be estimated quickly, providing more accurate data for clinical practice. The nomogram is shown in Fig. 4.

Discussion
Our study used machine learning methods, whereas previous studies included small populations and mixed surgical types. Our results suggest that EF, eGFR, age, Alb, UA, and LDH are independent risk factors for AKI in HFpEF patients after CABG according to the random forest model. The incidence of AKI was 23.6% in our study, which is similar to previous studies [1–3].
While there are many studies on acute kidney injury after cardiac surgery, only a few have used machine learning to predict AKI [20–22], and even fewer have examined patients with HFpEF undergoing isolated CABG. In this study, we used machine learning to build a risk prediction model for this group of patients, to better predict the occurrence of AKI following CABG surgery and thereby decrease morbidity and mortality.
The eGFR, EF, and age are important risk factors for predicting postoperative AKI in this study, which is consistent with many previous studies [1,2,16]. The eGFR is an index reflecting baseline kidney function: an abnormal eGFR before surgery indicates poor renal function and identifies patients more prone to acute kidney injury after surgery. Increased age is a risk factor for AKI because renal function gradually declines with age. EF is an important indicator of cardiac function; a low EF leads to low renal perfusion, which can cause oliguria and predisposes to acute kidney injury [19,23–25].
Although LDH is not specifically produced by the kidney, it can predict the occurrence of renal injury, as noted in previous studies [22]. Previous studies had not identified preoperative albumin as a risk factor for predicting AKI after cardiac surgery. However, prior work [26] suggested that albumin infusion before CABG can reduce the occurrence of acute renal injury after surgery, and other studies [27] have shown that increased albumin reabsorption by the renal tubules can reduce the occurrence of AKI. Albumin, a risk factor identified in our study, can be used to predict AKI after bypass surgery; it may further improve prediction of acute kidney injury and allow patients at risk to be identified at an early stage. In addition, our results show that uric acid is an independent risk factor for AKI. Tang H et al. [28] showed that elevated uric acid before cardiac surgery increases the risk of AKI after cardiac surgery, and other work has also found that increased preoperative uric acid levels are an independent risk factor for AKI after cardiac surgery [29].
This study has limitations. It was a single-center, retrospective study, with some selection bias. Moreover, all included patients had HFpEF and were not compared with other heart failure patients. In the future, we will try to include additional variables to further improve the prediction model. The diagnosis of AKI was based on the KDIGO criteria; because diuretics were used in many patients after surgery, urine volume was not used as one of the diagnostic criteria for AKI.

Conclusions
Ejection fraction, estimated glomerular filtration rate, age, albumin, uric acid, and lactate dehydrogenase are independent risk factors for acute kidney injury in heart failure patients with preserved ejection fraction after coronary artery bypass grafting, according to the random forest model. The application of machine learning can better predict such clinical events.

Table 2. Comparison of the predictive performance of the five models.
Precision: measures the precision of the model. Recall: measures the recall of the model. F1 score: the harmonic mean of precision and recall. Brier: evaluates the overall performance of the model. IDI, integrated discrimination improvement; NRI, net reclassification improvement; RF, random forest; LR, logistic regression; LGBM, light gradient boosting machine; XGBoost, extreme gradient boosting; GNB, Gaussian naive Bayes.