Introduction
Methods
Patient data collection
Procedures
Data preprocessing and feature selection
Machine learning models
Development and validation of the clinic-machine learning nomogram
Statistical analysis
Results
Characteristics of patients
Features (mean ± SD) | Training set | Test set | p-value |
---|---|---|---|
Patients, n (%) | 157 (80%) | 40 (20%) | |
Age (years) | 66.24 ± 7.85 | 64.65 ± 7.22 | 0.2506 |
PSA | 96.88 ± 230.79 | 100.68 ± 220.24 | 0.9257 |
Neutrophil percentage (%) | 61.70 ± 10.42 | 58.53 ± 10.81 | 0.0914 |
Neutrophils (×10⁹/L) | 3.94 ± 1.81 | 3.70 ± 1.41 | 0.3583 |
Lymphocyte percentage (%) | 26.31 ± 8.86 | 28.82 ± 9.01 | 0.1130 |
Lymphocytes (×10⁹/L) | 1.55 ± 0.53 | 1.73 ± 0.65 | 0.0770 |
Hemoglobin (/L) | 132.43 ± 15.32 | 136.15 ± 14.17 | 0.1673 |
ALT (U/L) | 19.77 ± 13.50 | 21.77 ± 15.55 | 0.4202 |
Alkaline phosphatase (U/L) | 143.60 ± 382.69 | 117.65 ± 234.60 | 0.5937 |
Lactate dehydrogenase | 180.68 ± 93.13 | 164.40 ± 26.21 | 0.0587 |
Serum creatinine (μmol/L) | 87.45 ± 45.94 | 100.85 ± 85.84 | 0.1834 |
T cells (CD3+CD19−) (%) | 67.70 ± 9.47 | 69.24 ± 8.08 | 0.3470 |
T cells (CD3+CD19−) (/μl) | 1040.21 ± 314.91 | 1173.60 ± 462.04 | 0.0334* |
B cells (CD3−CD19+) (%) | 12.14 ± 5.54 | 13.07 ± 6.36 | 0.3641 |
B cells (CD3−CD19+) (/μl) | 190.44 ± 120.22 | 240.05 ± 193.13 | 0.0451* |
Th cells (CD3+CD4+) (%) | 43.41 ± 8.56 | 46.49 ± 7.72 | 0.0406* |
Th cells (CD3+CD4+) (/μl) | 666.55 ± 222.74 | 801.12 ± 364.37 | 0.0038* |
Ts cells (CD3+CD8+) (%) | 20.70 ± 6.59 | 19.48 ± 6.53 | 0.3002 |
Ts cells (CD3+CD8+) (/μl) | 318.72 ± 134.72 | 315.95 ± 124.61 | 0.9068 |
NK cells (CD3−/CD16+CD56+) (%) | 19.37 ± 9.65 | 16.86 ± 7.90 | 0.1318 |
NK cells (CD3−/CD16+CD56+) (/μl) | 307.17 ± 204.40 | 267.38 ± 132.43 | 0.1409 |
T cells + B cells + NK cells (%) | 99.22 ± 0.64 | 99.18 ± 0.98 | 0.7586 |
T cells + B cells + NK cells (/μl) | 1537.82 ± 463.60 | 1681.03 ± 629.11 | 0.1104 |
Th/Ts | 2.36 ± 1.02 | 2.71 ± 1.14 | 0.0612 |
Th cells + CD28+(CD3+CD4+CD28+) (/Th) | 94.53 ± 7.10 | 93.30 ± 9.02 | 0.3579 |
Ts cells + CD28+(CD3+CD8+CD28+) (/Ts) | 58.58 ± 17.65 | 59.85 ± 16.75 | 0.6843 |
Activated T cells (CD3+HLA−DR+) (/μl) | 17.90 ± 6.33 | 17.31 ± 7.30 | 0.6113 |
Activated Ts cells (CD3+CD8+HLA−DR+)/Ts (%) | 44.46 ± 13.06 | 41.33 ± 10.24 | 0.1109 |
Naïve Th cells (CD3+CD4+CD45RA+)/Th (%) | 32.45 ± 13.49 | 35.65 ± 14.50 | 0.1910 |
Memory Th cells (CD3+CD4+CD45RO+)/Th (%) | 67.61 ± 13.57 | 64.35 ± 14.50 | 0.1843 |
Regulatory T cells (CD3+CD4+CD25+CD127low+) (/μl) | 3.82 ± 1.22 | 4.00 ± 1.34 | 0.4085 |
Naïve regulatory T cells (CD45RA+CD3+CD4+CD25+CD127low+) (/μl) | 0.76 ± 0.47 | 0.79 ± 0.50 | 0.6688 |
Induced regulatory T cells (CD45RO+CD3+CD4+CD25+CD127low+) (/μl) | 3.06 ± 0.93 | 3.21 ± 1.10 | 0.3944 |
IFN-γ+CD4+T cells /Th (%) | 21.47 ± 8.21 | 19.48 ± 6.75 | 0.1610 |
IFN-γ+CD8+T cells /Ts (%) | 62.31 ± 15.37 | 59.91 ± 13.51 | 0.3690 |
IFN-γ+NK cells/NK (%) | 74.80 ± 14.78 | 73.58 ± 13.25 | 0.6383 |
Interleukin-1β (pg/mL) | 7.38 ± 6.18 | 6.63 ± 4.87 | 0.4237 |
Interleukin-2R (U/mL) | 498.19 ± 320.20 | 533.62 ± 509.28 | 0.5878 |
Interleukin-6 (pg/mL) | 6.87 ± 12.45 | 6.78 ± 9.36 | 0.9612 |
Interleukin-8 (pg/mL) | 27.15 ± 36.89 | 36.77 ± 55.40 | 0.1927 |
Tumor necrosis factor-α (pg/mL) | 19.58 ± 31.55 | 23.62 ± 28.55 | 0.4645 |
Features (mean ± SD) | Training set: Low | Training set: Intermediate | Training set: High | Training set: p-value | Test set: Low | Test set: Intermediate | Test set: High | Test set: p-value |
---|---|---|---|---|---|---|---|---|
Patients, n (%) | 47 (29.94%) | 38 (24.20%) | 72 (45.86%) | | 12 (30.00%) | 10 (25.00%) | 18 (45.00%) | |
Age (years) | 63.23 ± 8.18 | 65.53 ± 7.10 | 68.57 ± 7.42 | 0.0002* | 62.67 ± 7.35 | 65.80 ± 6.53 | 65.33 ± 7.81 | 0.3669 |
PSA | 7.56 ± 7.24 | 12.67 ± 9.50 | 199.62 ± 312.92 | 0.0000* | 7.07 ± 4.85 | 9.60 ± 4.90 | 213.68 ± 299.19 | 0.0069* |
Neutrophil percentage (%) | 65.96 ± 11.34 | 58.69 ± 10.44 | 60.51 ± 9.04 | 0.0103* | 60.68 ± 9.56 | 52.95 ± 11.26 | 60.20 ± 11.14 | 0.9428 |
Neutrophils (×10⁹/L) | 4.60 ± 2.05 | 3.55 ± 1.80 | 3.72 ± 1.55 | 0.0164* | 4.03 ± 1.65 | 2.89 ± 0.90 | 3.93 ± 1.40 | 0.9830 |
Lymphocyte percentage (%) | 23.33 ± 9.44 | 29.32 ± 9.25 | 26.65 ± 7.77 | 0.0841 | 27.87 ± 8.85 | 34.57 ± 7.91 | 26.27 ± 8.96 | 0.4912 |
Lymphocytes (×10⁹/L) | 1.52 ± 0.65 | 1.64 ± 0.57 | 1.53 ± 0.42 | 0.9605 | 1.79 ± 0.84 | 1.88 ± 0.72 | 1.62 ± 0.50 | 0.4476 |
Hemoglobin (/L) | 134.13 ± 16.14 | 137.16 ± 11.84 | 128.82 ± 15.81 | 0.0389* | 141.75 ± 8.43 | 136.30 ± 5.79 | 132.33 ± 19.26 | 0.0789 |
ALT (U/L) | 21.72 ± 13.18 | 19.13 ± 12.07 | 18.83 ± 14.52 | 0.2742 | 18.25 ± 9.33 | 16.10 ± 7.65 | 27.28 ± 20.53 | 0.0965 |
Alkaline phosphatase (U/L) | 73.62 ± 29.75 | 67.87 ± 15.89 | 229.25 ± 556.22 | 0.0212* | 78.08 ± 23.07 | 64.40 ± 14.37 | 173.61 ± 350.66 | 0.2527 |
Lactate dehydrogenase | 168.89 ± 37.15 | 157.05 ± 34.14 | 200.83 ± 129.90 | 0.0450* | 157.58 ± 19.69 | 158.10 ± 23.03 | 172.44 ± 30.99 | 0.1147 |
Serum creatinine (μmol/L) | 91.04 ± 64.35 | 83.82 ± 12.77 | 87.03 ± 43.40 | 0.6881 | 78.92 ± 10.77 | 85.10 ± 14.89 | 124.22 ± 126.82 | 0.1470 |
T cells (CD3+CD19−) (%) | 67.57 ± 8.16 | 67.44 ± 9.43 | 67.92 ± 10.43 | 0.8305 | 69.01 ± 9.44 | 69.69 ± 5.06 | 69.16 ± 9.06 | 0.9763 |
T cells (CD3+CD19−) (/μl) | 1059.34 ± 370.24 | 1100.29 ± 325.67 | 996.01 ± 266.54 | 0.2295 | 1225.58 ± 563.22 | 1361.70 ± 501.53 | 1034.44 ± 347.35 | 0.2196 |
B cells (CD3−CD19+) (%) | 12.74 ± 5.69 | 12.55 ± 6.07 | 11.54 ± 5.20 | 0.2335 | 12.74 ± 5.44 | 11.66 ± 7.49 | 14.07 ± 6.63 | 0.5322 |
B cells (CD3−CD19+) (/μl) | 211.09 ± 152.39 | 208.21 ± 136.04 | 167.58 ± 79.30 | 0.0428* | 252.58 ± 208.95 | 265.40 ± 285.55 | 217.61 ± 123.77 | 0.6058 |
Th cells (CD3+CD4+) (%) | 42.65 ± 7.65 | 42.38 ± 8.95 | 44.44 ± 8.96 | 0.2327 | 45.93 ± 8.34 | 48.17 ± 8.43 | 45.93 ± 7.43 | 0.9376 |
Th cells (CD3+CD4+) (/μl) | 677.06 ± 276.78 | 686.00 ± 213.15 | 649.42 ± 189.27 | 0.4727 | 840.33 ± 470.67 | 948.60 ± 394.25 | 693.06 ± 245.99 | 0.2295 |
Ts cells (CD3+CD8+) (%) | 22.14 ± 7.31 | 20.36 ± 5.49 | 19.94 ± 6.61 | 0.0845 | 20.41 ± 8.38 | 17.79 ± 6.34 | 19.81 ± 5.57 | 0.8858 |
Ts cells (CD3+CD8+) (/μl) | 339.81 ± 132.66 | 336.53 ± 151.08 | 295.56 ± 125.80 | 0.0664 | 342.50 ± 146.68 | 330.10 ± 111.75 | 290.39 ± 121.23 | 0.2579 |
NK cells (CD3−/CD16+CD56+) (%) | 18.86 ± 8.17 | 19.27 ± 9.26 | 19.76 ± 10.84 | 0.6167 | 17.12 ± 9.10 | 17.93 ± 8.35 | 16.10 ± 7.41 | 0.7029 |
NK cells (CD3−/CD16+CD56+) (/μl) | 297.21 ± 165.47 | 317.00 ± 186.65 | 308.47 ± 237.42 | 0.7993 | 281.17 ± 142.86 | 313.80 ± 138.71 | 232.39 ± 123.17 | 0.2773 |
T cells + B cells + NK cells (%) | 99.17 ± 0.74 | 99.26 ± 0.58 | 99.23 ± 0.61 | 0.6742 | 98.86 ± 1.71 | 99.28 ± 0.47 | 99.33 ± 0.36 | 0.2191 |
T cells + B cells + NK cells (/μl) | 1567.64 ± 557.32 | 1625.50 ± 441.12 | 1472.07 ± 404.37 | 0.2198 | 1759.33 ± 757.08 | 1940.90 ± 713.72 | 1484.44 ± 453.55 | 0.1964 |
Th/Ts | 2.17 ± 0.88 | 2.24 ± 0.79 | 2.55 ± 1.19 | 0.0362* | 2.65 ± 1.15 | 3.09 ± 1.44 | 2.55 ± 0.99 | 0.7307 |
Th cells + CD28+(CD3+CD4+CD28+) (/Th) | 94.44 ± 7.14 | 93.92 ± 7.51 | 94.91 ± 6.98 | 0.6804 | 92.40 ± 11.83 | 91.95 ± 11.49 | 94.65 ± 5.22 | 0.4833 |
Ts cells + CD28+(CD3+CD8+CD28+) (/Ts) | 59.28 ± 20.46 | 56.62 ± 16.45 | 59.17 ± 16.55 | 0.9609 | 63.03 ± 23.60 | 54.84 ± 17.78 | 60.51 ± 10.48 | 0.7789 |
Activated T cells (CD3+HLA−DR+) (/μl) | 17.16 ± 6.34 | 18.68 ± 6.24 | 17.98 ± 6.45 | 0.5603 | 17.71 ± 10.66 | 17.14 ± 6.55 | 17.14 ± 5.38 | 0.8456 |
Activated Ts cells (CD3+CD8+HLA−DR+)/Ts (%) | 40.53 ± 13.36 | 46.01 ± 11.88 | 46.21 ± 13.16 | 0.0265* | 42.12 ± 12.23 | 42.45 ± 12.10 | 40.17 ± 8.35 | 0.5956 |
Naïve Th cells (CD3+CD4+CD45RA+)/Th (%) | 32.64 ± 13.23 | 32.58 ± 10.86 | 32.27 ± 15.08 | 0.8808 | 32.98 ± 16.39 | 43.66 ± 17.85 | 32.99 ± 10.12 | 0.8436 |
Memory Th cells (CD3+CD4+CD45RO+)/Th (%) | 67.36 ± 13.23 | 67.42 ± 10.86 | 67.87 ± 15.23 | 0.8358 | 67.02 ± 16.39 | 56.35 ± 17.85 | 67.01 ± 10.12 | 0.8435 |
Regulatory T cells (CD3+CD4+CD25+CD127low+) (/μl) | 3.52 ± 1.15 | 4.17 ± 1.47 | 3.83 ± 1.08 | 0.2530 | 3.26 ± 1.30 | 4.10 ± 1.00 | 4.44 ± 1.41 | 0.0193* |
Naïve regulatory T cells (CD45RA+CD3+CD4+CD25+CD127low+) (/μl) | 0.69 ± 0.39 | 0.87 ± 0.66 | 0.74 ± 0.38 | 0.6436 | 0.57 ± 0.32 | 1.01 ± 0.62 | 0.82 ± 0.49 | 0.2496 |
Induced regulatory T cells (CD45RO+CD3+CD4+CD25+CD127low+) (/μl) | 2.83 ± 0.90 | 3.30 ± 1.03 | 3.09 ± 0.88 | 0.2073 | 2.69 ± 1.03 | 3.09 ± 0.75 | 3.62 ± 1.22 | 0.0208* |
IFN-γ+CD4+T cells/Th (%) | 21.89 ± 7.82 | 22.33 ± 8.92 | 20.73 ± 8.17 | 0.4090 | 21.62 ± 8.81 | 17.81 ± 6.96 | 18.98 ± 5.12 | 0.3543 |
IFN-γ+CD8+T cells/Ts (%) | 60.66 ± 17.63 | 61.97 ± 13.45 | 63.57 ± 14.93 | 0.3101 | 61.64 ± 14.76 | 61.32 ± 13.10 | 57.97 ± 13.79 | 0.4574 |
IFN-γ+NK cells/NK (%) | 76.37 ± 14.31 | 73.49 ± 14.85 | 74.46 ± 15.25 | 0.5405 | 71.08 ± 12.17 | 77.28 ± 8.53 | 73.20 ± 16.31 | 0.7531 |
Interleukin-1β (pg/mL) | 6.55 ± 4.06 | 5.92 ± 3.05 | 8.69 ± 8.10 | 0.0446* | 6.48 ± 3.46 | 8.29 ± 9.12 | 5.82 ± 1.33 | 0.6391 |
Interleukin-2R (U/mL) | 427.45 ± 189.32 | 444.39 ± 149.37 | 572.76 ± 425.87 | 0.0112* | 381.67 ± 94.99 | 403.60 ± 88.02 | 707.17 ± 736.35 | 0.0750 |
Interleukin-6 (pg/mL) | 3.79 ± 4.86 | 4.99 ± 13.06 | 9.87 ± 14.87 | 0.0067* | 4.33 ± 5.37 | 2.37 ± 1.38 | 10.87 ± 12.37 | 0.0423* |
Interleukin-8 (pg/mL) | 18.28 ± 21.49 | 31.45 ± 41.75 | 30.67 ± 41.55 | 0.0912 | 37.36 ± 63.98 | 15.81 ± 12.54 | 48.02 ± 64.01 | 0.5254 |
Tumor necrosis factor-α (pg/mL) | 16.88 ± 20.70 | 19.12 ± 28.35 | 21.59 ± 38.68 | 0.4259 | 22.05 ± 26.87 | 18.59 ± 26.37 | 27.47 ± 32.40 | 0.5790 |
Selection of clinic features for ML models and the clinic nomogram
Performance assessment of ML algorithms
Development and performance assessment of the clinic-ML nomogram
ML models | Univariate OR (95% CL) | Univariate p-value | Multivariate OR (95% CL) | Multivariate p-value |
---|---|---|---|---|
AdaBoost | 2.535 (2.358–2.726) | 0.000* | 1.154 (1.090–1.222) | 0.000* |
Decision Tree | 2.667 (2.563–2.774) | 0.000* | 1.554 (1.438–1.680) | 0.000* |
Random Forest | 2.449 (2.286–2.622) | 0.000* | 1.150 (1.088–1.214) | 0.000* |
SVM | 1.906 (1.681–2.162) | 0.000* | 1.014 (0.980–1.050) | 0.419 |
XGBoost | 2.577 (2.462–2.696) | 0.000* | 1.354 (1.260–1.455) | 0.000* |
Clinic features | Univariate OR (95% CL) | Univariate p-value | Multivariate OR (95% CL) | Multivariate p-value |
---|---|---|---|---|
Age | 1.286 (1.129–1.465) | 0.000* | 1.137 (1.008–1.284) | 0.037* |
Alkaline phosphatase | 1.171 (1.024–1.339) | 0.021* | 1.121 (0.998–1.260) | 0.054 |
B cells (CD3−CD19+) | 0.870 (0.761–0.995) | 0.043* | 0.844 (0.746–0.956) | 0.008* |
Interleukin-1β | 1.148 (1.003–1.313) | 0.045* | 1.095 (0.972–1.234) | 0.133 |
Interleukin-2R | 1.189 (1.041–1.359) | 0.011* | 1.120 (0.996–1.260) | 0.059 |
Lactate dehydrogenase | 1.147 (1.003–1.313) | 0.045* | 1.110 (0.987–1.248) | 0.082 |
Neutrophil percentage | 0.839 (0.734–0.959) | 0.010* | 0.799 (0.708–0.902) | 0.000* |
PSA | 1.379 (1.215–1.564) | 0.000* | 1.228 (1.084–1.391) | 0.001* |
Th/Ts | 1.155 (1.009–1.320) | 0.036* | 1.200 (1.069–1.346) | 0.002* |
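The odds ratios, confidence limits, and p-values in the logistic regression tables can be cross-checked against one another. The sketch below assumes the limits are Wald limits on the log-odds scale and that the p-values come from the matching two-sided Wald z-test. That is an assumption about how the values were produced, not a method stated in the text. The function name `wald_p_from_or` is hypothetical.

```python
import math
from statistics import NormalDist


def wald_p_from_or(odds_ratio, cl_low, cl_high, level=0.95):
    """Recover the Wald z statistic and two-sided p-value from an OR and its limits.

    Assumes (not stated in the source) that the limits are exp(beta +/- z_crit * SE).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    beta = math.log(odds_ratio)                      # log-odds coefficient
    # The width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(cl_high) - math.log(cl_low)) / (2 * z_crit)
    z = beta / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Univariate row for alkaline phosphatase: OR 1.171 (1.024-1.339), reported p = 0.021
z, p = wald_p_from_or(1.171, 1.024, 1.339)
print(f"z = {z:.2f}, p = {p:.3f}")
```

For the alkaline phosphatase row this gives a p-value of about 0.021, which agrees with the tabulated value. Because the tabulated limits are rounded to three decimals, small discrepancies in the third decimal place are expected for other rows.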
Models | Sensitivity (95% CL) | Specificity (95% CL) | F1 (95% CL) | AUC (95% CL) |
---|---|---|---|---|
XGBoost | 0.924 (0.883–0.965); 0.680 (0.535–0.825) | 0.963 (0.933–0.993); 0.853 (0.743–0.963) | 0.927 (0.886–0.968); 0.664 (0.518–0.810) | 0.989 (0.980–0.998); 0.842 (0.764–0.919) |
Clinic nomogram | 0.704 (0.633–0.775); 0.609 (0.458–0.760) | 0.870 (0.817–0.923); 0.822 (0.703–0.941) | 0.700 (0.628–0.772); 0.585 (0.432–0.738) | 0.897 (0.867–0.926); 0.837 (0.764–0.910) |
Clinic-ML nomogram | 0.983 (0.963–1.000); 0.713 (0.573–0.853) | 0.994 (0.982–1.000); 0.869 (0.764–0.974) | 0.985 (0.966–1.000); 0.699 (0.557–0.841) | 0.998 (0.996–1.000); 0.864 (0.794–0.935) |
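Each cell in the performance table carries two estimates. The reported limits for sensitivity, specificity, and F1 are numerically consistent with a normal-approximation (Wald) interval that uses n = 157 for the first value and n = 40 for the second, which suggests the first value refers to the training set and the second to the test set. The sketch below illustrates that check. It rests on that assumption rather than on a documented procedure, and the function name `wald_interval` is hypothetical.

```python
from statistics import NormalDist


def wald_interval(estimate, n, level=0.95):
    """Normal-approximation interval for a proportion-like metric, clipped to [0, 1].

    Assumes (not stated in the source) that the full set size is used as n.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z_crit * (estimate * (1 - estimate) / n) ** 0.5
    return max(0.0, estimate - half_width), min(1.0, estimate + half_width)


# XGBoost sensitivity: 0.924 (0.883-0.965) and 0.680 (0.535-0.825)
for label, estimate, n in [("n = 157", 0.924, 157), ("n = 40", 0.680, 40)]:
    low, high = wald_interval(estimate, n)
    print(f"{label}: {estimate:.3f} ({low:.3f}-{high:.3f})")
```

Clipping at 1.0 also reproduces the upper limit of 1.000 reported for the clinic-ML nomogram. The AUC limits do not follow this formula and were presumably obtained by a different method.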