Introduction
Aim of the work
Material and methods
Included patients
Statistical analysis
Conceptual overlap
Selection of the optimum model structure
Model building
Model class | Hyperparameter | Values | Number of values | Model structures (n)ᵃ | Model trials (n)ᵇ
---|---|---|---|---|---
Cumulative model (CM) | Parallelism | TRUE or FALSE | 2 | 10 | 40
 | Link | Logit, Probit, Cauchit, Cloglog, or Logc | 5 | |
Penalized regression | Alpha | Ridge or Lasso | 2 | 16 | 64
 | Criteria | AIC or BIC | 2 | |
 | Link | Logit, Probit, Cauchit, or Cloglog | 4 | |
Ordinal CART | CP | 20 randomly selected values | 20 | 80 | 320
 | Split | Misclassification cost in absolute or quadratic terms | 2 | |
 | Prune | Misclassification rate or cost | 2 | |
Ordinal forest | Nsets | 50, 100, or 150 | 3 | 27 | 108
 | Ntreeperdiv | 50, 100, or 150 | 3 | |
 | Ntreefinal | 200, 400, or 600 | 3 | |
Total | | | | 133 | 532
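The structure and trial counts in the table follow from simple combinatorics: each model class contributes the Cartesian product of its hyperparameter values, and each structure is tried with each of the four predictor-preparation strategies. A quick sketch (the grid entries and abbreviated value labels such as `"abs"` and `"mr"` are illustrative placeholders, not the paper's code):

```python
from itertools import product

# Hyperparameter grids as listed in the table above. The 20 CP values
# were drawn randomly in the paper; placeholders stand in for them here.
grids = {
    "cumulative_model": {
        "parallel": [True, False],
        "link": ["logit", "probit", "cauchit", "cloglog", "logc"],
    },
    "penalized_regression": {
        "alpha": ["ridge", "lasso"],
        "criteria": ["aic", "bic"],
        "link": ["logit", "probit", "cauchit", "cloglog"],
    },
    "ordinal_cart": {
        "cp": list(range(20)),              # 20 randomly selected values
        "split": ["abs", "quad"],
        "prune": ["mr", "mc"],
    },
    "ordinal_forest": {
        "nsets": [50, 100, 150],
        "ntreeperdiv": [50, 100, 150],
        "ntreefinal": [200, 400, 600],
    },
}

N_PREPARATIONS = 4  # four predictor-preparation strategies per structure

def n_structures(grid):
    """Distinct hyperparameter combinations = size of the Cartesian product."""
    return len(list(product(*grid.values())))

structures = {name: n_structures(g) for name, g in grids.items()}
total_structures = sum(structures.values())
total_trials = total_structures * N_PREPARATIONS
print(structures, total_structures, total_trials)  # 133 structures, 532 trials
```

This reproduces the per-class structure counts (10, 16, 80, 27) and the totals in the last row of the table.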
Model evaluation
Results
General characteristics
The estimation sample
Question/Score | Whole estimation sample (n = 456) | | TKR: No, not indicated (n = 285) | | TKR: No, indicated (n = 110) | | TKR: Yes (n = 61) | |
---|---|---|---|---|---|---|---|---
 | n | % | n | % | n | % | n | %
Mobility | ||||||||
L1 | 121 | 26.5% | 102 | 35.8% | 2 | 1.8% | 17 | 27.9% |
L2 | 111 | 24.3% | 93 | 32.6% | 3 | 2.7% | 15 | 24.6% |
L3 | 103 | 22.6% | 80 | 28.1% | 15 | 13.6% | 8 | 13.1% |
L4 | 94 | 20.6% | 10 | 3.5% | 66 | 60% | 18 | 29.5% |
L5 | 27 | 5.9% | – | 0% | 24 | 21.8% | 3 | 4.9% |
Self-care | ||||||||
L1 | 271 | 59.4% | 219 | 76.8% | 16 | 14.5% | 36 | 59% |
L2 | 61 | 13.4% | 44 | 15.4% | 10 | 9.1% | 7 | 11.5% |
L3 | 62 | 13.6% | 21 | 7.4% | 34 | 30.9% | 7 | 11.5% |
L4 | 38 | 8.3% | – | 0% | 31 | 28.2% | 7 | 11.5% |
L5 | 24 | 5.3% | 1 | 0.4% | 19 | 17.3% | 4 | 6.6% |
Usual activities | ||||||||
L1 | 112 | 24.6% | 97 | 34% | 1 | 0.9% | 14 | 23% |
L2 | 116 | 25.4% | 97 | 34% | 4 | 3.6% | 15 | 24.6% |
L3 | 113 | 24.8% | 80 | 28.1% | 20 | 18.2% | 13 | 21.3% |
L4 | 68 | 14.9% | 10 | 3.5% | 47 | 42.7% | 11 | 18% |
L5 | 47 | 10.3% | 1 | 0.4% | 38 | 34.5% | 8 | 13.1% |
Pain/discomfort | ||||||||
L1 | 38 | 8.3% | 30 | 10.5% | 1 | 0.9% | 7 | 11.5% |
L2 | 140 | 30.7% | 120 | 42.1% | 2 | 1.8% | 18 | 29.5% |
L3 | 140 | 30.7% | 111 | 38.9% | 15 | 13.6% | 14 | 23% |
L4 | 77 | 16.9% | 20 | 7.0% | 45 | 40.9% | 12 | 19.7% |
L5 | 61 | 13.4% | 4 | 1.4% | 47 | 42.7% | 10 | 16.4% |
Anxiety/depression | ||||||||
L1 | 118 | 25.9% | 86 | 30.2% | 10 | 9.1% | 22 | 36.1% |
L2 | 114 | 25% | 89 | 31.2% | 11 | 10% | 14 | 23% |
L3 | 138 | 30.3% | 87 | 30.5% | 40 | 36.4% | 11 | 18% |
L4 | 48 | 10.5% | 16 | 5.6% | 23 | 20.9% | 9 | 14.8% |
L5 | 38 | 8.3% | 7 | 2.5% | 26 | 23.6% | 5 | 8.2% |
EQ-5D VAS | 61.2 ± 24.7 | 69.8 ± 17.3 | 38.5 ± 25.8 | 59.6 ± 27.5 | | | |
Egypt utility | 0.38 ± 0.53 | 0.65 ± 0.26 | − 0.30 ± 0.40 | 0.35 ± 0.56 | ||||
Total OKS | 27.3 ± 13.2 | 34.3 ± 7.46 | 10.1 ± 5.75 | 25.3 ± 14.5 | ||||
Usual level of pain | 1.11 ± 1.11 | 1.44 ± 1.04 | 0.20 ± 0.57 | 1.18 ± 1.26 | ||||
Trouble with washing and drying | 3.06 ± 1.28 | 3.66 ± 0.72 | 1.56 ± 1.10 | 2.95 ± 1.40 | ||||
Trouble with transport | 2.39 ± 1.33 | 3.01 ± 0.96 | 0.91 ± 0.80 | 2.20 ± 1.47 | ||||
Walking time before severe pain | 2.70 ± 1.23 | 3.29 ± 0.79 | 1.36 ± 1.00 | 2.34 ± 1.29 | ||||
Pain on standing up from sitting | 2.16 ± 1.19 | 2.61 ± 0.98 | 1.00 ± 0.77 | 2.13 ± 1.26 | ||||
Limping | 2.38 ± 1.47 | 3.06 ± 0.98 | 0.76 ± 0.98 | 2.10 ± 1.71 | ||||
Difficulty kneeling | 2.11 ± 1.42 | 2.59 ± 1.23 | 0.94 ± 1.1 | 1.95 ± 1.53 | ||||
Pain at night | 2.16 ± 1.45 | 2.80 ± 1.15 | 0.60 ± 0.74 | 2.03 ± 1.51 | ||||
Pain interferes with work | 2.12 ± 1.36 | 2.75 ± 0.98 | 0.53 ± 0.63 | 2.03 ± 1.47 | ||||
Sense of knee instability | 2.44 ± 1.43 | 3.14 ± 0.89 | 0.77 ± 0.93 | 2.20 ± 1.63 | ||||
Can do household shopping alone | 2.66 ± 1.53 | 3.48 ± 0.88 | 0.76 ± 0.95 | 2.26 ± 1.60 | ||||
Trouble walking downstairs | 1.98 ± 1.25 | 2.49 ± 1.00 | 0.66 ± 0.63 | 1.95 ± 1.41 |
The external validation sample
Question/Score | Whole external validation sample (n = 115) | | TKR: No, not indicated (n = 66) | | TKR: No, indicated (n = 21) | | TKR: Yes (n = 28) | |
---|---|---|---|---|---|---|---|---
 | n | % | n | % | n | % | n | %
Mobility | ||||||||
L1 | 45 | 39.1% | 31 | 47% | – | 0% | 14 | 50% |
L2 | 18 | 15.7% | 14 | 21.2% | – | 0% | 4 | 14.3% |
L3 | 30 | 26.1% | 18 | 27.3% | 4 | 19.0% | 8 | 28.6% |
L4 | 17 | 14.8% | 3 | 4.5% | 12 | 57.1% | 2 | 7.1% |
L5 | 5 | 4.3% | – | 0% | 5 | 23.8% | – | 0% |
Self-care | ||||||||
L1 | 66 | 57.4% | 49 | 74.2% | – | 0% | 17 | 60.7% |
L2 | 23 | 20% | 12 | 18.2% | 5 | 23.8% | 6 | 21.4% |
L3 | 12 | 10.4% | 4 | 6.1% | 5 | 23.8% | 3 | 10.7% |
L4 | 11 | 9.6% | 1 | 1.5% | 8 | 38.1% | 2 | 7.1% |
L5 | 3 | 2.6% | – | 0% | 3 | 14.3% | – | 0% |
Usual activities | ||||||||
L1 | 35 | 30.4% | 24 | 36.4% | – | 0% | 11 | 39.3% |
L2 | 31 | 27% | 23 | 34.8% | – | 0% | 8 | 28.6% |
L3 | 28 | 24.3% | 17 | 25.8% | 5 | 23.8% | 6 | 21.4% |
L4 | 14 | 12.2% | 2 | 3.0% | 10 | 47.6% | 2 | 7.1% |
L5 | 7 | 6.1% | – | 0% | 6 | 28.6% | 1 | 3.6% |
Pain/discomfort | ||||||||
L1 | 19 | 16.5% | 9 | 13.6% | – | 0% | 10 | 35.7% |
L2 | 37 | 32.2% | 29 | 43.9% | – | 0% | 8 | 28.6% |
L3 | 38 | 33% | 24 | 36.4% | 5 | 23.8% | 9 | 32.1% |
L4 | 11 | 9.6% | 2 | 3% | 9 | 42.9% | – | 0% |
L5 | 10 | 8.7% | 2 | 3% | 7 | 33.3% | 1 | 3.6% |
Anxiety/depression | ||||||||
L1 | 44 | 38.3% | 24 | 36.4% | 3 | 14.3% | 17 | 60.7% |
L2 | 35 | 30.4% | 22 | 33.3% | 5 | 23.8% | 8 | 28.6% |
L3 | 28 | 24.3% | 18 | 27.3% | 9 | 42.9% | 1 | 3.6% |
L4 | 6 | 5.2% | 1 | 1.5% | 3 | 14.3% | 2 | 7.1% |
L5 | 2 | 1.7% | 1 | 1.5% | 1 | 4.8% | – | 0% |
EQ-5D VAS | 69.3 ± 21.3 | 74.3 ± 17.8 | 47.4 ± 19.9 | 73.9 ± 19.9 | ||||
Egypt utility | 0.52 ± 0.47 | 0.68 ± 0.24 | − 0.22 ± 0.37 | 0.69 ± 0.41 | ||||
Total OKS | 30.6 ± 12.4 | 34.8 ± 7.83 | 10.8 ± 5.63 | 35.5 ± 10.3 | ||||
Usual level of pain | 1.50 ± 1.40 | 1.45 ± 1.17 | 0.095 ± 0.3 | 2.68 ± 1.39 | ||||
Trouble with washing and drying | 3.19 ± 1.19 | 3.67 ± 0.69 | 1.33 ± 0.97 | 3.46 ± 0.92 | ||||
Trouble with transport | 2.56 ± 1.18 | 2.92 ± 0.90 | 1.10 ± 0.94 | 2.79 ± 1.10 | ||||
Walking time before severe pain | 2.93 ± 1.13 | 3.52 ± 0.75 | 1.43 ± 0.75 | 2.68 ± 0.98 | ||||
Pain on standing up from sitting | 2.48 ± 1.19 | 2.64 ± 1.00 | 1.00 ± 0.71 | 3.21 ± 0.96 | ||||
Limping | 2.83 ± 1.25 | 3.18 ± 0.89 | 1.19 ± 0.99 | 3.21 ± 1.20 | ||||
Difficulty kneeling | 2.08 ± 1.49 | 2.38 ± 1.37 | 0.57 ± 0.87 | 2.50 ± 1.45 | ||||
Pain at night | 2.73 ± 1.33 | 3.06 ± 1.16 | 1.00 ± 0.89 | 3.25 ± 0.84 | ||||
Pain interferes with work | 2.39 ± 1.30 | 2.68 ± 1.13 | 0.72 ± 0.46 | 2.96 ± 1.07 | ||||
Sense of knee instability | 2.80 ± 1.26 | 3.30 ± 0.82 | 0.95 ± 0.86 | 3.00 ± 1.12 | ||||
Can do household shopping alone | 2.94 ± 1.37 | 3.55 ± 0.89 | 0.95 ± 1.20 | 3.00 ± 0.99 | ||||
Trouble walking downstairs | 2.16 ± 1.28 | 2.45 ± 1.04 | 0.48 ± 0.75 | 2.71 ± 1.08 |
Exploratory data analysis
Conceptual overlap
Important questions as determined by recursive feature elimination (RFE)
Model building on the estimation sample
Domain | Model type | Preparation of predictors | Tuned hyperparameters | Baseline¹ accuracy (95% CI) | Crude² accuracy, estimation sample (95% CI) | CV³ accuracy, estimation sample (SD) | Crude² accuracy, external validation sample (95% CI)
---|---|---|---|---|---|---|---
MO | Penalized regression | Pre-processed | alpha = 1; criteria = aic; link = cauchit | 26.5% (22.6, 30.9) | 0.658 (0.612, 0.701) | 0.656 (0.037) | 0.687 (0.593, 0.770)
SC | Random forest | RFE | nsets = 150; ntreeperdiv = 150; ntreefinal = 600 | 59.4% (54.7, 63.9) | 0.840 (0.803, 0.872) | 0.724 (0.039) | 0.669 (0.575, 0.754)
UA | Random forest | All predictors | nsets = 50; ntreeperdiv = 100; ntreefinal = 200 | 25.4% (21.5, 29.7) | 0.882 (0.848, 0.910) | 0.604 (0.044) | 0.687 (0.593, 0.770)
PD | Cumulative probability model | Pre-processed | parallel = TRUE; link = cauchit | 30.7% (26.5, 35.2) | 0.686 (0.642, 0.729) | 0.671 (0.039) | 0.678 (0.584, 0.762)
AD | CART | RFE | cp = 0.00645; split = abs; prune = mc | 30.3% (26.1, 34.7) | 0.452 (0.405, 0.499) | 0.435 (0.038) | 0.357 (0.269, 0.451)
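The baseline accuracy column is the proportion of the modal response level, i.e., the accuracy of always predicting the most frequent level. A minimal sketch reproducing the MO baseline from the estimation-sample counts; the Wilson score interval is shown as one common choice, since the paper's exact CI method is not stated here:

```python
import math

def baseline_accuracy(level_counts):
    """Accuracy of always predicting the modal (most frequent) level."""
    return max(level_counts) / sum(level_counts)

def wilson_ci(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion k/n."""
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Mobility (MO), estimation sample: counts for levels L1..L5 from the
# estimation-sample table above (n = 456, modal level is L1).
mo_counts = [121, 111, 103, 94, 27]
p = baseline_accuracy(mo_counts)             # 121/456 -> 26.5%
lo, hi = wilson_ci(max(mo_counts), sum(mo_counts))
print(round(p, 3), round(lo, 3), round(hi, 3))
```

The point estimate matches the 26.5% baseline reported for MO, and the interval is close to the tabulated (22.6, 30.9).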
Model evaluation on the external validation sample
Country | Type of value set | MAE, whole external validation sample (n = 115, 100%) | MAE, utility ≥ median (n = 58, 50.43%) | MAE, utility < median (n = 57, 49.57%) | MSE, whole external validation sample (n = 115, 100%)
---|---|---|---|---|---
Canada | VT | 0.076 (0.072–0.085) | 0.043 (0.029–0.049) | 0.109 (0.101–0.128) | 0.011 (0.010–0.013) |
China | VT | 0.097 (0.086–0.098) | 0.072 (0.058–0.091) | 0.123 (0.109–0.134) | 0.017 (0.014–0.017) |
Denmark | VT | 0.127 (0.127–0.148) | 0.076 (0.077–0.078) | 0.178 (0.165–0.179) | 0.035 (0.036–0.045) |
Egypt | VT | 0.134 (0.118–0.157) | 0.089 (0.072–0.098) | 0.180 (0.166–0.173) | 0.033 (0.027–0.044) |
England | VT | 0.092 (0.091–0.096) | 0.064 (0.066–0.075) | 0.121 (0.097–0.122) | 0.016 (0.015–0.018) |
Ethiopia | VT | 0.096 (0.093–0.099) | 0.042 (0.037–0.049) | 0.152 (0.156–0.182) | 0.024 (0.022–0.026) |
France | VT | 0.087 (0.084–0.091) | 0.031 (0.032–0.035) | 0.143 (0.151–0.162) | 0.020 (0.018–0.024) |
Germany | VT | 0.101 (0.095–0.111) | 0.043 (0.036–0.043) | 0.160 (0.155–0.178) | 0.026 (0.023–0.033) |
HongKong | VT | 0.128 (0.126–0.128) | 0.094 (0.079–0.098) | 0.163 (0.164–0.172) | 0.027 (0.028–0.028) |
Hungary | VT | 0.107 (0.109–0.119) | 0.049 (0.046–0.051) | 0.166 (0.165–0.170) | 0.025 (0.025–0.027) |
Indonesia | VT | 0.125 (0.110–0.125) | 0.097 (0.097–0.100) | 0.153 (0.136–0.172) | 0.025 (0.022–0.025) |
Ireland | VT | 0.142 (0.139–0.186) | 0.088 (0.081–0.106) | 0.197 (0.161–0.216) | 0.042 (0.041–0.068) |
Japan | VT | 0.077 (0.066–0.092) | 0.078 (0.071–0.083) | 0.076 (0.069–0.077) | 0.010 (0.008–0.013) |
Malaysia | VT | 0.095 (0.078–0.099) | 0.069 (0.059–0.073) | 0.122 (0.103–0.117) | 0.016 (0.012–0.017) |
Netherlands | VT | 0.110 (0.108–0.123) | 0.073 (0.065–0.078) | 0.148 (0.142–0.160) | 0.024 (0.022–0.030) |
Peru_cTTO | VT | 0.137 (0.137–0.146) | 0.110 (0.092–0.117) | 0.165 (0.140–0.182) | 0.031 (0.035–0.036) |
Peru_DCE | VT | 0.072 (0.073–0.075) | 0.047 (0.040–0.049) | 0.098 (0.087–0.113) | 0.010 (0.010–0.011) |
Poland | VT | 0.081 (0.064–0.113) | 0.024 (0.016–0.025) | 0.140 (0.118–0.156) | 0.023 (0.015–0.042) |
Portugal | VT | 0.092 (0.093–0.104) | 0.051 (0.043–0.054) | 0.134 (0.123–0.147) | 0.018 (0.018–0.023) |
SouthKorea | VT | 0.070 (0.061–0.071) | 0.050 (0.044–0.050) | 0.090 (0.083–0.101) | 0.009 (0.007–0.009) |
Spain | VT | 0.099 (0.091–0.102) | 0.076 (0.072–0.094) | 0.122 (0.104–0.127) | 0.018 (0.016–0.018) |
Sweden | VT | 0.063 (0.062–0.082) | 0.045 (0.033–0.048) | 0.082 (0.075–0.084) | 0.008 (0.007–0.012) |
Taiwan | VT | 0.140 (0.118–0.158) | 0.105 (0.093–0.105) | 0.176 (0.159–0.205) | 0.035 (0.025–0.039) |
Thailand | VT | 0.089 (0.074–0.087) | 0.060 (0.056–0.076) | 0.119 (0.112–0.144) | 0.015 (0.011–0.015) |
Uruguay | VT | 0.066 (0.060–0.072) | 0.029 (0.024–0.030) | 0.104 (0.101–0.122) | 0.011 (0.008–0.012) |
USA | VT | 0.114 (0.110–0.125) | 0.077 (0.067–0.083) | 0.152 (0.135–0.148) | 0.024 (0.022–0.026) |
Vietnam | VT | 0.095 (0.090–0.110) | 0.071 (0.058–0.069) | 0.121 (0.122–0.133) | 0.017 (0.016–0.021) |
Denmark | CW | 0.094 (0.090–0.102) | 0.061 (0.062–0.069) | 0.127 (0.112–0.144) | 0.021 (0.018–0.025) |
France | CW | 0.121 (0.102–0.130) | 0.092 (0.093–0.097) | 0.150 (0.130–0.156) | 0.023 (0.020–0.025) |
Germany | CW | 0.082 (0.080–0.091) | 0.033 (0.038–0.045) | 0.131 (0.130–0.154) | 0.018 (0.015–0.021) |
Japan | CW | 0.072 (0.061–0.072) | 0.072 (0.067–0.078) | 0.073 (0.060–0.075) | 0.012 (0.008–0.012) |
Netherlands | CW | 0.098 (0.085–0.114) | 0.068 (0.060–0.085) | 0.129 (0.113–0.135) | 0.020 (0.016–0.028) |
Russia | CW | 0.080 (0.072–0.096) | 0.038 (0.037–0.048) | 0.123 (0.103–0.130) | 0.019 (0.014–0.031) |
Spain | CW | 0.110 (0.093–0.112) | 0.062 (0.051–0.066) | 0.159 (0.156–0.159) | 0.025 (0.020–0.027) |
Thailand | CW | 0.099 (0.088–0.109) | 0.081 (0.081–0.086) | 0.118 (0.097–0.132) | 0.019 (0.016–0.022) |
UK | CW | 0.106 (0.096–0.117) | 0.064 (0.047–0.079) | 0.148 (0.120–0.157) | 0.024 (0.020–0.025) |
USA | CW | 0.076 (0.074–0.079) | 0.048 (0.040–0.064) | 0.104 (0.098–0.128) | 0.012 (0.012–0.013) |
Zimbabwe | CW | 0.064 (0.059–0.067) | 0.037 (0.033–0.040) | 0.092 (0.073–0.090) | 0.009 (0.007–0.010) |
Mean | | 0.098 | 0.063 | 0.133 | 0.020
SD | | 0.022 | 0.022 | 0.030 | 0.008
Min | | 0.063 | 0.024 | 0.073 | 0.008
Max | | 0.142 | 0.110 | 0.197 | 0.042
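The error metrics in the table can be illustrated with a short sketch; the utility values below are hypothetical stand-ins, not study data. MAE averages absolute prediction errors, MSE averages squared errors, and the subgroup columns split patients at the median of the observed utility:

```python
from statistics import median

# Hypothetical observed vs predicted utilities (illustration only).
observed  = [0.85, 0.61, -0.10, 0.43, 0.95, 0.12]
predicted = [0.80, 0.70, -0.02, 0.40, 0.90, 0.25]

def mae(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def mse(obs, pred):
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)

med = median(observed)  # split patients on observed utility
upper = [(o, p) for o, p in zip(observed, predicted) if o >= med]
lower = [(o, p) for o, p in zip(observed, predicted) if o < med]

overall_mae = mae(observed, predicted)
upper_mae = mae(*zip(*upper))   # "utility >= median" column
lower_mae = mae(*zip(*lower))   # "utility < median" column
overall_mse = mse(observed, predicted)
print(round(overall_mae, 3), round(upper_mae, 3),
      round(lower_mae, 3), round(overall_mse, 4))
```

As in the table, errors tend to be larger in the below-median (worse-health) subgroup when predictions are less accurate at the severe end of the scale.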
1. Cumulative model (CM). It predicts the cumulative probability of an observation being at or below a given level of the outcome, assuming that the ratings originate from the categorization of a latent continuous variable. We varied the structure of the CM by modifying the following hyperparameters:

   a. Parallelism. With parallel curves, predictors have the same coefficients across the different levels of the outcome; with non-parallel curves, predictors are allowed to have different coefficients at each level.

   b. Link function. Five link functions were tried to transform the cumulative probability (p) to a continuous, unbounded scale suitable for regression modeling: \(logit(p)=\mathrm{log}(\frac{p}{1-p})\); \(probit(p)={\Phi }^{-1}(p)\); \(cauchit\left(p\right)=\mathrm{tan}(\pi \left(p-0.5\right))\); \(cloglog(p)=\mathrm{ln}(-\mathrm{ln}\left(1-p\right))\); and \(logc(p)=-\mathrm{ln}(1-p)\)

2. Penalized regression. It fits a CM that is penalized for having too many variables in the model. Imposing a penalty shrinks the coefficient values, so the less contributive predictors receive coefficients close or equal to zero. We varied the structure of the penalized regression model by modifying the following hyperparameters:

   a. Penalty term (\(\alpha\)). We set \(\alpha =0\) when the penalty was applied to the sum of squared coefficients (ridge penalized regression) and \(\alpha =1\) when it was applied to the sum of absolute coefficients (LASSO penalized regression).

   b. Criterion used to select the magnitude of the penalty: AIC or BIC.

   c. Link function. Four link functions were used: logit, probit, cauchit, and cloglog, as defined above.

3. Ordinal CART. CART [18] produces a decision tree to predict both continuous and nominal outcomes. It is built by splitting and pruning. With splitting, the data are partitioned into smaller subsets so as to minimize impurity in the new subsets, as measured by Gini's index. Splitting continues until the final subsets are homogeneous; however, these may consist of only a few similar data points. At that stage the model predicts the estimation data perfectly but may predict new data points poorly (overfitting). To avoid this, the tree is pruned back to the point of least cross-validated overall misclassification. We used a modified CART in which a score is assigned to the ordered categories of the outcome [22]. This allows a misclassification cost to be assigned: the larger the distance between the actual and predicted levels, the higher the weight given to the misclassification. We varied the structure of the produced tree by modifying the following hyperparameters:

   a. Split. The misclassification cost in the generalized Gini index was calculated in absolute or quadratic terms.

   b. Complexity parameter (CP), the minimum improvement required to split at each node; if a split does not yield at least that much benefit (the value of CP), it does not take place. We tried 20 randomly selected values of CP.

   c. Prune. The cross-validated overall misclassification used to decide pruning was measured as either the misclassification error rate (all misclassifications given the same weight) or the misclassification cost rate (different weights given to different misclassifications).

4. Ordinal forest (OF). Random forest (RF) [17] is a flexible machine-learning algorithm for predicting continuous and nominal outcomes. It builds multiple decision trees, each on a random subset of participants and predictors, and merges them to produce an accurate and stable prediction. We used a modified version of RF [22, 32] that translates the ordinal levels into scores; instead of using a fixed score set, it optimizes them, trying different score sets and building a small forest for each to estimate its expected predictive performance. The optimum score set (the one achieving the best predictions in the small forests) is then used to build the final OF. We varied the structure of the OF by modifying the following hyperparameters:

   a. Nsets, the number of score sets tried before approximating the optimal score set: 50, 100, or 150.

   b. Ntreeperdiv, the number of trees in each small forest: 50, 100, or 150.

   c. Ntreefinal, the number of trees in the final OF built with the optimized score set: 200, 400, or 600.
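The cumulative model of footnote 1 can be sketched as follows: the link transforms each cumulative probability, and inverting the link at each threshold yields per-level probabilities. This is a minimal illustration with made-up thresholds and linear predictor, not the paper's code; probit is omitted because it requires the inverse normal CDF:

```python
import math

# Link functions from footnote 1 and their inverses.
def logit(p):       return math.log(p / (1 - p))
def inv_logit(x):   return 1 / (1 + math.exp(-x))
def cauchit(p):     return math.tan(math.pi * (p - 0.5))
def inv_cauchit(x): return math.atan(x) / math.pi + 0.5
def cloglog(p):     return math.log(-math.log(1 - p))
def inv_cloglog(x): return 1 - math.exp(-math.exp(x))

def cm_level_probs(eta, thresholds, inv_link=inv_logit):
    """Cumulative model: P(Y <= k) = g^{-1}(theta_k - eta); successive
    differences give the probability of each of the 5 response levels."""
    cum = [inv_link(t - eta) for t in thresholds] + [1.0]
    probs, prev = [], 0.0
    for c in cum:
        probs.append(c - prev)
        prev = c
    return probs

# Four ascending thresholds separate the 5 EQ-5D levels; eta is an
# assumed linear predictor value for one patient.
probs = cm_level_probs(eta=0.3, thresholds=[-2.0, -0.5, 1.0, 2.5])
print([round(p, 3) for p in probs])  # five probabilities summing to 1
```

With parallel curves (footnote 1a), only the thresholds differ between levels while `eta` is shared; a non-parallel model would let the coefficients behind `eta` vary by level.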
1. All predictors: all 12 OKS questions were used as predictors.

2. RFE-based important predictors: a subset of the OKS questions determined by recursive feature elimination (RFE). RFE fits a random forest model with \(5\times 5\)-fold cross-validation to recursively eliminate predictors that are not required to build an accurate model [11].

3. Model-based important predictors: the subset of OKS questions most relevant to prediction, as determined by a built-in algorithm within each model class.

4. Pre-processed predictors: the 12 OKS questions were scaled and centered; principal components explaining 90% of the variance in the OKS questions were then extracted using principal component analysis (PCA).
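The pre-processing in footnote 4 can be sketched as follows, with random stand-in responses in place of the real OKS data (the retained component count `k` depends on the data, so it is illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the 12 OKS item scores (0-4) of the 456 estimation-sample
# patients; the study used the actual questionnaire responses.
X = rng.integers(0, 5, size=(456, 12)).astype(float)

# Scale and center, then keep the principal components that together
# explain at least 90% of the variance.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
order = np.argsort(eigvals)[::-1]                 # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(explained, 0.90)) + 1     # components needed
scores = Xs @ eigvecs[:, :k]                      # new predictor matrix
print(k, scores.shape)
```

The component scores, rather than the raw item responses, then serve as the model's predictors.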