Introduction
As radiography and percutaneous coronary intervention (PCI) have become more widely used, contrast-induced acute kidney injury (CIAKI) has become the third most prevalent cause of iatrogenic acute kidney injury [1], especially in patients with diabetes mellitus (DM), whose vascular condition is often poor [2, 3]. As many as 21.2% of DM patients may develop CIAKI [4], and the associated mortality may be as high as 30% [5]. Therefore, an early prediction system that stratifies diabetic patients by their risk of CIAKI is crucial for reducing its incidence.
The current definition of CIAKI still relies on serum creatinine levels, which can delay diagnosis. Although some novel biomarkers have been shown to predict CIAKI [6, 7], their cost-effectiveness limits their wide application [8]. Clinical risk scores such as the Mehran score [9] have been used in clinical practice for decades. However, their predictive power has been inadequate across different races and populations. Recently, several studies have demonstrated that machine learning (ML) models predict kidney disease better than traditional statistical models [10–13]. ML models can predict more accurately because they estimate the probability of an event for each individual rather than assigning patients to risk groups.
However, these ML studies rarely explained the contribution of model variables because of the black-box nature of ML algorithms. Most also lacked validation on external data sets. Furthermore, few prediction models are available as web-based tools for clinical use. We therefore applied a range of ML algorithms to build prediction models and compared their performance with that of the Mehran score [9]. In addition, we used data from multi-centre hospitals as an external cohort and one of the centres as a prospective cohort to validate our model. Finally, we established a dynamic and explainable web-based tool for predicting CIAKI in patients with diabetes.
Methods
Study design and participants
The study was divided into two steps. Firstly, we retrospectively reviewed medical records from multi-centre hospitals to build and validate the predictive model. The hospitals were Affiliated Sir Run Run Hospital of Nanjing Medical University, Nanjing First Hospital, Affiliated Shu Yang Hospital of Nanjing University of Chinese Medicine and Xu Zhou Medical University Hospital. The study population included diabetic patients who underwent coronary angiography (CAG) and PCI between January 2014 and January 2020. We excluded patients who met any of the following criteria: (1) missing serum creatinine levels before and after CAG and PCI; (2) need for dialysis before CAG and PCI; (3) repeated hospitalization for PCI; and (4) acute kidney injury before CAG and PCI. Our research was carried out in accordance with the Declaration of Helsinki. The participating hospitals approved the study and, because of its retrospective design, waived the need for informed consent.
Secondly, we conducted a prospective study at Affiliated Sir Run Run Hospital of Nanjing Medical University to assess early prediction of CIAKI with our online CIAKI calculator. The study population included adult diabetic patients who underwent CAG and PCI from June 2021 to April 2022. The Ethics Committee approved this study (Ethics number: 2021-SR-041) and waived the requirement for informed consent to use identifiable data. We reported our work following the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement [14], the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement [15] and guidelines for ML predictive modeling [16].
Clinical endpoints
CIAKI was the primary outcome of our study. Following the Contrast Media Safety Committee (CMSC), it was defined as an increase in serum creatinine of at least 44.2 μmol/L (0.5 mg/dl), or to at least 1.25 times the baseline level, within 72 h of exposure to contrast agent, after alternative causes of acute kidney injury had been excluded. The baseline creatinine was the lowest serum creatinine level within 7 days before CAG. All serum creatinine values in the 72 h following CAG and PCI were collected. Dialysis, stroke, length of in-hospital stay, new-onset or recurrent myocardial infarction, and other adverse cardiovascular events such as worsening heart failure and death were also included as outcomes.
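The creatinine-based part of this endpoint can be written as a small check. The following is a minimal sketch, assuming creatinine values are given in μmol/L and that the caller has already restricted them to the relevant time windows; the function names are illustrative, and the exclusion of alternative causes of acute kidney injury is a clinical judgment that the code does not capture.

```python
def baseline_creatinine(pre_values_umol):
    """Baseline = lowest serum creatinine (umol/L) within 7 days before CAG.

    Assumes the caller passes only values from that 7-day window.
    """
    return min(pre_values_umol)


def meets_ciaki_criteria(baseline_umol, post_values_umol):
    """Check the two creatinine criteria against values from the 72 h after contrast.

    Returns True if the peak value rises by at least 44.2 umol/L (0.5 mg/dl)
    or reaches at least 1.25 times the baseline.
    """
    peak = max(post_values_umol)
    absolute_rise = (peak - baseline_umol) >= 44.2
    relative_rise = peak >= 1.25 * baseline_umol
    return absolute_rise or relative_rise


# Hypothetical patient: baseline 80 umol/L, peak 101 umol/L (a 26% rise).
baseline = baseline_creatinine([92.0, 80.0, 85.0])
print(meets_ciaki_criteria(baseline, [90.0, 101.0]))
```

Either criterion alone is sufficient, so a patient with a high baseline can qualify through the absolute rise even when the relative rise stays below 25%.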
Other definitions
DM was defined by treatment with dietary, oral, or insulin therapy, or by a fasting blood glucose value ≥ 126 mg/dl, based on the practice guidelines of the American Diabetes Association [17]. Congestive heart failure (CHF) was diagnosed if patients were in New York Heart Association (NYHA) class III or higher or had a history of pulmonary edema. Clinicians diagnosed acute coronary syndromes (ACS) comprehensively according to symptoms of myocardial ischemia, electrocardiogram changes, and myocardial injury biomarkers [18]. Chronic kidney disease (CKD) was defined as proteinuria, estimated glomerular filtration rate (eGFR) < 60 ml/min/1.73 m², or both, on at least two occasions at least three months apart [19, 20].
Data collection and preprocessing
In each institution, demographic data, preoperative medications, and laboratory tests were collected, including gender, age, pre-CAG blood pressure, body mass index (BMI), coronary artery disease, primary disease, contrast agents, and periprocedural biochemical markers. We removed variables that were missing in 11% or more of the samples. Abnormal values were rechecked against electronic hospital records; values that could not be verified were treated as missing. Categorical variables were processed with one-hot encoding and label encoding. One-hot encoding creates a separate binary feature for each category and is suitable for categorical variables without a specific order or hierarchy. For example, we converted the gender “male” or “female” to “female or not”. Label encoding maps each category to a unique integer and is suitable for categorical variables with a clear order or hierarchy, such as ordinal variables. For example, the variable “Diabetes history (yrs)” with categories “< 1 year, 1–5 years, 5–10 years, 10–20 years, ≥ 20 years” was converted to “1, 2, 3, 4, 5”. Variance inflation factor (VIF) and generalized variance inflation factor (GVIF) were used to assess collinearity for continuous and categorical variables, respectively. Continuous variables with VIF > 10 were removed. For categorical variables, we set the category with the largest proportion as the reference level and removed variables with GVIF^[1/(2 × Df)] > 10^(1/2) as highly collinear, where Df refers to the degrees of freedom. We divided the data into training, internal validation and external validation cohorts. We randomly assigned 80% of the data from Nanjing First Hospital to model training and the remaining 20% to internal validation, and used data from the other centres for external validation.
We used the missForest method, which can impute missing values in data sets that combine continuous and categorical variables, to fill the remaining missing values in each of the three cohorts separately [21]. In addition, variables were standardized before training and prediction by removing the mean and scaling to unit variance.
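The encoding, collinearity cut-off and standardization steps described above can be illustrated with a short sketch. It uses only the Python standard library and is not the authors' pipeline: the function names and category labels are illustrative, and the use of the population standard deviation for scaling is an assumption.

```python
from statistics import mean, pstdev

# Ordered categories of "Diabetes history (yrs)" as listed in the text
# (the string labels themselves are illustrative).
DIABETES_HISTORY_LEVELS = ["<1", "1-5", "5-10", "10-20", ">=20"]


def encode_gender(gender):
    """One-hot style binary feature "female or not": 1 if female, else 0."""
    return 1 if gender == "female" else 0


def encode_diabetes_history(level):
    """Label encoding: map the ordered categories to 1, 2, 3, 4, 5."""
    return DIABETES_HISTORY_LEVELS.index(level) + 1


def exceeds_gvif_cutoff(gvif, df):
    """Collinearity rule from the text: GVIF^(1/(2*Df)) > 10^(1/2).

    For Df = 1 this is equivalent to the familiar VIF > 10 rule.
    """
    return gvif ** (1 / (2 * df)) > 10 ** 0.5


def standardize(values):
    """Remove the mean and scale to unit variance.

    Assumes population SD and at least two distinct values.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


print(encode_gender("female"), encode_diabetes_history("5-10"))
print(exceeds_gvif_cutoff(12.0, 1), exceeds_gvif_cutoff(12.0, 4))
print(standardize([60.0, 70.0, 80.0]))
```

The exponent 1/(2 × Df) puts categorical variables with several dummy columns on the same scale as a single continuous variable, which is why one cut-off can serve both cases.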
In the prospective design, we recorded the time at which each variable in the CIAKI model was obtained and the time of clinical diagnosis of CIAKI, in order to determine the earliest time at which the model predicted the occurrence of CIAKI. Because of the prospective design, none of the required variables had missing values.
Data balancing
To address the imbalance between positive and negative samples, we applied a variety of balancing methods to the training set, including oversampling and undersampling. The oversampling methods were the Synthetic Minority Oversampling Technique (SMOTE), the ADAptive SYNthetic (ADASYN) technique, and random oversampling. The undersampling methods were random undersampling and TomekLinks (Additional file 1: Table S4). All balancing methods performed equally well on the internal validation set, but TomekLinks performed better on the external validation set, so we chose TomekLinks. Specifically, TomekLinks focuses on pairs of neighboring samples in which one sample belongs to the minority class and the other to the majority class. The samples in such a pair are close to each other and form a link. These links are considered potential noise or outliers, which may harm model training and performance. By identifying and addressing these links, we can reduce noise or outliers in the data and improve the performance of the classification model.
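To make the idea of a link concrete, the sketch below implements the usual definition of a Tomek link, a pair of samples from opposite classes that are each other's nearest neighbor, and removes the majority-class member of each link. It is a brute-force illustration in plain Python with Euclidean distance and hypothetical toy data, not the implementation used in the study.

```python
import math


def nearest_neighbor(i, X):
    """Index of the sample closest to X[i] (Euclidean), excluding i itself."""
    best, best_dist = None, math.inf
    for j, x in enumerate(X):
        if j == i:
            continue
        dist = math.dist(X[i], x)
        if dist < best_dist:
            best, best_dist = j, dist
    return best


def tomek_links(X, y):
    """Pairs (i, j) with i < j that are mutual nearest neighbors of opposite class."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def undersample(X, y, majority_label=0):
    """Drop the majority-class member of every Tomek link."""
    drop = set()
    for i, j in tomek_links(X, y):
        drop.add(i if y[i] == majority_label else j)
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


# Toy data: label 0 = no CIAKI (majority in real data), 1 = CIAKI.
X = [(0, 0), (0, 1), (5, 5), (5, 5.5), (10, 10), (10, 11)]
y = [0, 0, 0, 1, 1, 1]
print(tomek_links(X, y))
X_res, y_res = undersample(X, y)
print(len(X_res), y_res)
```

Only samples that sit right on the class boundary are removed, so the method changes the class ratio far less than random undersampling does.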
Mehran risk score
The Mehran risk score [9] is calculated from 8 variables: hypotension, CHF, intra-aortic balloon pump (IABP), anemia, age, diabetes, contrast media volume, and serum creatinine or eGFR. We calculated the total Mehran risk score for each patient as the sum of the points for the 8 variables: 5 points for hypotension (systolic blood pressure below 80 mmHg for at least 1 h, requiring inotropic support), 5 points for IABP use, 5 points for CHF (NYHA class III/IV or history of pulmonary edema), 4 points for age over 75 years, 3 points for anemia (hematocrit below 39% in men or below 36% in women), 3 points for diabetes, 1 point per 100 ml of contrast media, and either 4 points for serum creatinine > 1.5 mg/dl or, when eGFR is used, 2 points for eGFR 40–60 ml/min/1.73 m², 4 points for eGFR 20–40 ml/min/1.73 m², and 6 points for eGFR < 20 ml/min/1.73 m².
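The scoring rules above translate directly into a small function. The following is a sketch under stated assumptions rather than a validated clinical implementation: the function and parameter names are illustrative, contrast volume is scored per completed 100 ml, eGFR is preferred over serum creatinine when both are supplied, and values that fall exactly on a band boundary (40 or 60) are assigned to the higher-eGFR band.

```python
def is_anemic(hematocrit_pct, female):
    """Anemia per the text: hematocrit < 39% in men, < 36% in women."""
    return hematocrit_pct < (36 if female else 39)


def mehran_score(hypotension, iabp, chf, age, anemia, diabetes,
                 contrast_ml, egfr=None, creatinine_mgdl=None):
    """Sum the Mehran points for one patient."""
    score = 0
    score += 5 if hypotension else 0
    score += 5 if iabp else 0
    score += 5 if chf else 0
    score += 4 if age > 75 else 0
    score += 3 if anemia else 0
    score += 3 if diabetes else 0
    score += int(contrast_ml // 100)      # 1 point per 100 ml (assumed: completed units)
    if egfr is not None:                  # eGFR-based renal points (assumed preferred)
        if egfr < 20:
            score += 6
        elif egfr < 40:
            score += 4
        elif egfr < 60:
            score += 2
    elif creatinine_mgdl is not None and creatinine_mgdl > 1.5:
        score += 4                        # creatinine-based alternative
    return score


# Hypothetical diabetic patient: 78 years old, 250 ml contrast, eGFR 45.
print(mehran_score(False, False, False, 78, False, True, 250, egfr=45))
```

Because every patient in this study is diabetic, each one starts from 3 points, which compresses the range of scores available to separate low-risk from high-risk patients.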
Machine learning development
Six ML models were constructed: an extreme gradient boosting trees (XGBT) model, a random forest (RF) model, a support vector machine (SVM) model, a logistic regression (LR) model, a least absolute shrinkage and selection operator (LASSO) with LR (LASSO + LR) model, and a gradient boosted decision trees (GBDT) model. Additional file 1 provides a full description of the six ML models.
ML models were trained using ten-fold cross-validation (Additional file 1: Figure S1 and Additional file 1: Table S3). The initial samples were randomly split into ten equal-sized subsamples; each subsample was used in turn for validation while the other nine served as training samples. For each model, a grid search with ten-fold cross-validation was used to select the optimal hyperparameters. Furthermore, we computed SHapley Additive exPlanations (SHAP) values, which quantify each variable’s impact on the overall model as well as its contribution to individual predictions. SHAP plots were also used to reveal the complex relationships between predictors and outcome captured by the XGBT model. Finally, to predict the risk of CIAKI in diabetic patients, we developed an explainable web-based risk calculator.
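The combination of ten-fold cross-validation and grid search can be sketched without any ML library. In the sketch below, `fit_and_score` is a hypothetical callback that would train a model on the training indices with the given hyperparameters and return a validation score such as AUC; the toy callback at the end only stands in for it so that the example runs.

```python
import random
from itertools import product


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation.

    Folds are equal-sized when n_samples is divisible by k; otherwise
    their sizes differ by at most one.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, val


def grid_search(param_grid, fit_and_score, n_samples, k=10):
    """Return the hyperparameters with the best mean validation score."""
    best_params, best_score = None, float("-inf")
    keys = sorted(param_grid)
    for values in product(*(param_grid[key] for key in keys)):
        params = dict(zip(keys, values))
        scores = [fit_and_score(params, train, val)
                  for train, val in k_fold_indices(n_samples, k)]
        avg = sum(scores) / len(scores)
        if avg > best_score:
            best_params, best_score = params, avg
    return best_params, best_score


def toy_fit_and_score(params, train_idx, val_idx):
    """Stand-in scorer that simply prefers max_depth == 3 (not a real model)."""
    return -abs(params["max_depth"] - 3)


print(grid_search({"max_depth": [1, 3, 5]}, toy_fit_and_score, n_samples=100))
```

Every candidate setting is scored on the same ten folds, so differences in mean score reflect the hyperparameters rather than the random split.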
All models were evaluated in both the internal and external validation sets. For each model, we compared the area under the receiver operating characteristic curve (AUC), accuracy, positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, and F1 score. We chose the CIAKI prediction threshold by maximizing the F1 score in the training set. The 95% confidence intervals (CIs) were estimated from 1000 iterations of bootstrap sampling with replacement. A calibration curve was used to examine the agreement between predicted probability and observed CIAKI frequency in the population. Moreover, the net benefit of each model was calculated using decision curve analysis (DCA) based on the difference between the predicted benefit and the expected risk associated with CIAKI.
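Three of these evaluation steps are easy to state in code: choosing the threshold that maximizes F1, the net benefit used in decision curve analysis, and a percentile bootstrap interval. The sketch below is illustrative only. It assumes binary labels (1 = CIAKI), uses the standard DCA formula TP/n − (FP/n) × pt/(1 − pt), and takes the percentile method for the bootstrap, which the text does not specify.

```python
import random


def confusion(y_true, y_prob, threshold):
    """Return (tp, fp, fn, tn) when predicting positive if prob >= threshold."""
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    fn = sum(1 for t, p in zip(y_true, y_prob) if p < threshold and t == 1)
    tn = len(y_true) - tp - fp - fn
    return tp, fp, fn, tn


def f1_score(y_true, y_prob, threshold):
    tp, fp, fn, _ = confusion(y_true, y_prob, threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, y_prob):
    """Scan the observed probabilities and keep the one maximizing F1."""
    return max(sorted(set(y_prob)), key=lambda t: f1_score(y_true, y_prob, t))


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at risk threshold pt (0 < pt < 1), as used in DCA."""
    tp, fp, _, _ = confusion(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def bootstrap_ci(y_true, y_prob, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI from n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_prob[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical labels and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.35, 0.4, 0.8, 0.9]
threshold = best_f1_threshold(y_true, y_prob)
print(threshold, net_benefit(y_true, y_prob, 0.5))
print(bootstrap_ci(y_true, y_prob, lambda t, p: f1_score(t, p, threshold)))
```

Net benefit weights false positives by the odds of the risk threshold, which is why a model can discriminate well overall and still offer no benefit at some thresholds.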
Statistical analysis
For descriptive analyses, categorical variables were expressed as counts and percentages and compared using chi-square tests. Continuous variables were expressed as mean and standard deviation or as median and interquartile range, and were compared using either the independent-samples t-test or the Mann–Whitney U test. All analyses were carried out with Python version 3.9.7, R version 4.1.0, and SPSS version 22.0. P < 0.05 was considered statistically significant.
Discussion
In this study, we employed ML algorithms to develop an innovative prediction tool. Our results showed that ML models outperformed both the Mehran risk score and traditional logistic regression. Notably, the XGBT model performed best in both the internal and external validation cohorts. We then selected the top 15 predictors in the XGBT model as the variables of the BCPMD model, because these variables can be collected easily in clinical practice. The BCPMD model achieved AUCs for CIAKI of 0.819 (95% CI 0.783–0.855), 0.805 (95% CI 0.755–0.850) and 0.801 (95% CI 0.688–0.887) in the internal validation, external validation, and prospective validation cohorts, respectively. In addition, we used SHAP to provide a personalized interpretation for each patient. Finally, we established a web-based risk calculator for CIAKI in diabetes that can predict the occurrence of CIAKI within 1 h of a patient's arrival at the hospital.
A previous study indicated that the Mehran score could predict CIAKI with an AUC of 0.67 in the validation cohort [9]. In our data, the AUC of the Mehran score for CIAKI in patients with diabetes was 0.654 in the internal validation cohort and 0.656 in the external validation cohort. The Mehran score models were updated in 2021, with model 1 including indicators available before CAG and model 2 adding procedural features, giving a better AUC of 0.84 [22]. However, with the development of biomarkers and algorithms, ML technology is gradually emerging as a better tool for building prediction models. Yin et al. [23] constructed a CIAKI prediction model from 13 preprocedural indicators using an RF algorithm, reporting an AUC of 0.907 and an accuracy of 80.8%. Other researchers also found that GBDT [24] and RNN [25] could perform well in predicting CIAKI. Moreover, Sun et al. [26] showed that in patients with ACS, a LASSO + LR-based nomogram model predicted CIAKI better than the Mehran score (AUC 0.835 versus 0.762). According to our results, in diabetic patients, ML models (including LASSO + LR, GBDT, XGBT, and SVM) demonstrated better discriminative power than traditional LR and the Mehran score. Additionally, XGBT, an ensemble of weak prediction trees [27], performed best in our data. XGBT algorithms can capture complex relationships in data without explicit specification of higher-order interactions and nonlinear functions [28]. Furthermore, XGBT algorithms limit overfitting through cross-validation and regularization [29].
The BCPMD model included 15 features that are easily accessible in clinical practice. Even so, data may still be missing in different regions or circumstances, which would affect the model’s performance and delay prediction. We therefore adopted missForest [30], which handles mixed-type data with missing continuous and categorical variables, so that our web predictive tool performs well when inputs are incomplete.
Notably, our model suggested that ACS was the most significant risk factor for CIAKI in diabetic individuals, consistent with current studies [31–33]. In addition to signaling pathway regulation and the harmful effects of contrast medium on renal tubular cells [34], ACS may share mechanisms of action with diabetes, leading to superimposed kidney injury. On the one hand, both affect renal perfusion. Patients with ACS often have unstable hemodynamics. With cardiac vascular stenosis, cardiac ejection function is impaired and hypotension occurs, which may result in decreased renal perfusion and kidney injury [35]. Likewise, acute myocardial ischemia can activate the renin–angiotensin–aldosterone system (RAAS). Vasopressin, catecholamines and interleukins are produced and the level of nitric oxide falls, damaging endothelial cells and decreasing renal blood flow [36, 37]. On the other hand, ACS, like diabetes, can give rise to kidney inflammation and oxidative stress damage [38, 39].
Additionally, our results revealed that hyperuricemia was a significant risk factor for CIAKI in diabetes. A recent study from China found that hyperuricemia was associated with CIAKI (OR = 2.363, 95% CI 1.653–3.377, P < 0.001) [40]. Moreover, patients with uric acid levels above 8.0 mg/dL had not only a greater risk of CIAKI but also an increased risk of hemodialysis [41]. Uric acid can promote oxidative stress and the release of a variety of proinflammatory factors, resulting in renal vasoconstriction and endothelial dysfunction [42]. At the same time, contrast agents can cause acute uricosuria [43], further aggravating kidney injury. Diuretics were another important factor in the model. This may be because diuretics can accelerate the excretion of iodine and improve urine viscosity [44]. However, in recent years a growing number of studies have identified diuretics as independent predictors of CIAKI [45, 46]. The National Kidney Foundation and the American College of Radiology recommend against using drugs that can affect renal function, including diuretics, within 48 h before and after iodinated contrast agents [47]. Considering the hypoxia and inflammatory reaction induced by diuretics, using diuretics during the perioperative period of PCI may be a potential risk factor for CIAKI [48]. Our study also confirmed the increased risk of CIAKI among patients with heart failure, worse renal function, anemia, poor blood glucose control and larger contrast volumes, underlining the need for early prevention strategies for these high-risk patients.
Of note, our web CIAKI risk calculator can be used as a guide for clinicians, unlike previous studies that stopped at model construction and therefore had limited practical value. Evidence has shown that early clinical intervention can improve the outcomes of CIAKI patients [49, 50]. The time window between evidence of increased CIAKI risk on the prediction platform and the occurrence of clinical CIAKI is an ideal period for clinical intervention. Combining the platform’s prediction with early intervention is expected to reduce the risk of CIAKI.
Our study has several strengths. The first is generalizability. We assessed the BCPMD model in multi-centre hospitals and prospectively constructed the web platform based on BCPMD. Our results also showed that differences in data distribution in the external set did not affect the model’s predictive ability, indicating that the BCPMD model is generalizable. Secondly, the features of BCPMD are readily accessible in routine clinical practice. We also found that a larger number of predictor variables did not necessarily give higher predictive ability. Therefore, we selected an optimal subset of features according to model performance with different numbers of features, making the model more efficient and straightforward. Thirdly, our model can be used in clinical practice. We developed a dynamically interpretable prediction web platform for the first time, and we built missing-value imputation into the platform. Additionally, considering the black-box nature of ML models, we used SHAP to show whether features contributed positively or negatively to predictions, which explains both how each feature affects the model’s overall prediction and how the model features affect CIAKI risk at the individual level. Our web calculator provides a tool that can identify high-risk CIAKI patients in real time and helps clinicians understand simply and intuitively how different values of a single feature affect the model's predictions; it can also serve as a reference for other disease models.
Our research also has some limitations. Firstly, 30% of our CAG + PCI patients were excluded by the exclusion criteria. Although most characteristics did not differ between excluded and included patients, characteristics that we did not examine might still introduce bias. Therefore, verification in a larger sample is needed in the future. Secondly, even though our model’s AUC in the prospective validation set was good, we observed that not all risk thresholds were beneficial for patients. In the prospective validation set, a risk threshold lower than 0.30 showed no benefit. However, such a threshold can identify more lower-risk CIAKI patients, who can be given routine interventions such as close monitoring of serum creatinine. At risk thresholds of 0.55 to 0.70, patients with CIAKI can benefit and be identified more accurately. More comprehensive interventions, such as adequate hydration, should be given to these high-risk patients. However, using a higher risk threshold means that some CIAKI patients may not be identified ahead of time. The threshold therefore needs to be set according to the patient characteristics of each institution. Thirdly, we still used serum creatinine to define CIAKI. More early diagnostic markers and clinical features could be added in the future to improve the prediction of CIAKI.