Introduction
Materials and methods
Network architecture design
Backbone
Classifier for EGFR genotyping
Decoder for GTV segmentation
Multi-task loss function
Experiment and analysis
Datasets
Characteristic | Training group (n = 155) | Internal testing set (n = 33) | P-value
---|---|---|---
Age | | | 0.425
Median (range) | 57 (46–78) | 56 (48–72) |
Sex | | | 0.276
Male | 90 (58.06%) | 20 (60.61%) |
Female | 65 (41.94%) | 13 (39.39%) |
Smoking | | | 0.135
Yes | 61 (39.35%) | 10 (30.30%) |
No | 94 (60.65%) | 23 (69.70%) |
EGFR status | | | 0.149
Mutant | 83 (53.55%) | 19 (57.58%) |
Wild | 72 (46.45%) | 14 (42.42%) |
Characteristic | Training group (n = 155) | External testing set\(^a\) (n = 22) | External testing set\(^b\) (n = 16)
---|---|---|---
Age | | |
Median (range) | 57 (46–78) | 56 (43–77) | 58 (48–81)
Sex | | |
Male | 90 (58.06%) | 10 (45.45%) | 8 (50.00%)
Female | 65 (41.94%) | 12 (54.55%) | 8 (50.00%)
EGFR status | | |
Mutant | 83 (53.55%) | 12 (54.55%) | 9 (56.25%)
Wild | 72 (46.45%) | 10 (45.45%) | 7 (43.75%)
Experimental design
Implementation details
Evaluation metrics
Statistical analysis
Results
Comparison with the existing algorithms
Model | GTV Dice | GTV \(HD_{95}\) (mm) | GTV Precision | GTV Recall | EGFR Accuracy | EGFR Precision | EGFR Recall | EGFR F1-score
---|---|---|---|---|---|---|---|---
RA-Unet [43] | 0.8729 (0.85, 0.90) | 3.67 | 0.9128 (0.90, 0.93) | 0.8697 (0.85, 0.89) | – | – | – | –
Swin Unet [40] | 0.6395 (0.61, 0.67) | 6.51 | 0.7143 (0.69, 0.74) | 0.6574 (0.62, 0.69) | – | – | – | –
TransUnet [41] | 0.8565 (0.84, 0.88) | 3.78 | 0.8794 (0.84, 0.89) | 0.8631 (0.85, 0.88) | – | – | – | –
Unet [38] | 0.8888 (0.86, 0.90) | 3.63 | 0.9031 (0.87, 0.91) | 0.9011 (0.89, 0.92) | – | – | – | –
DeSeg [23] | 0.8566 (0.83, 0.87) | 4.13 | 0.8890 (0.86, 0.91) | 0.8789 (0.85, 0.89) | – | – | – | –
Unet3+ [34] | 0.7946 (0.77, 0.82) | 4.63 | 0.7205 (0.70, 0.74) | 0.9412 (0.92, 0.96) | – | – | – | –
ResNet-50 [16] | – | – | – | – | 0.7879 (0.65, 0.93) | 0.8333 (0.71, 0.96) | 0.7895 (0.65, 0.93) | 0.8108
Radiomics Model [14] | – | – | – | – | 0.6061 (0.44, 0.77) | 0.6667 (0.51, 0.83) | 0.6316 (0.47, 0.80) | 0.6486
RN-GAP [35] | – | – | – | – | 0.7273 (0.58, 0.88) | 0.7778 (0.64, 0.92) | 0.7368 (0.59, 0.89) | 0.7568
SE-Net [37] | – | – | – | – | 0.8182 (0.69, 0.95) | 0.8824 (0.77, 0.99) | 0.7895 (0.65, 0.93) | 0.8333
DenseNet [36] | – | – | – | – | 0.7879 (0.65, 0.93) | 0.8000 (0.66, 0.94) | 0.8421 (0.72, 0.97) | 0.8205
MTSA-Net (ours) | 0.8914 (0.88, 0.91) | 3.58 | 0.9063 (0.88, 0.92) | 0.9035 (0.89, 0.92) | 0.8788 (0.77, 0.99) | 0.9412 (0.86, 1.00) | 0.8421 (0.72, 0.97) | 0.8889
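
The F1-scores in the EGFR genotyping columns above can be cross-checked against the reported precision and recall. The following is a minimal sketch, assuming F1 is the usual harmonic mean of precision and recall; the function name `f1_score` and the dictionary `reported` are illustrative and not taken from the source.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the EGFR genotyping columns.
reported = {
    "ResNet-50": (0.8333, 0.7895, 0.8108),
    "Radiomics Model": (0.6667, 0.6316, 0.6486),
    "RN-GAP": (0.7778, 0.7368, 0.7568),
    "SE-Net": (0.8824, 0.7895, 0.8333),
    "DenseNet": (0.8000, 0.8421, 0.8205),
    "MTSA-Net": (0.9412, 0.8421, 0.8889),
}

# Largest gap between recomputed and reported F1; small gaps are expected
# because the table values are rounded to four decimals.
max_gap = max(abs(f1_score(p, r) - f1) for p, r, f1 in reported.values())
print(f"largest |recomputed - reported| F1 gap: {max_gap:.4f}")
```

Under that assumption, every recomputed value agrees with the table to within rounding.
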
Model performance achieved on the external testing set
Ablation experiment
Effectiveness of the multi-task loss function
Model | GTV Dice | GTV \(HD_{95}\) (mm) | GTV Precision | GTV Recall | EGFR Accuracy | EGFR Precision | EGFR Recall
---|---|---|---|---|---|---|---
\(\gamma = 2, \alpha = 0.7\) | 0.8773 | 3.71 | 0.8911 | 0.8932 | 0.8485 | 0.8889 | 0.8421
\(\gamma = 2, \alpha = 0.8\) | 0.8914 | 3.58 | 0.9063 | 0.9035 | 0.8788 | 0.9412 | 0.8421
\(\gamma = 2, \alpha = 0.9\) | 0.8827 | 3.66 | 0.9047 | 0.8922 | 0.8788 | 0.8947 | 0.8947
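
The table above varies \(\alpha\) at a fixed \(\gamma = 2\). The sketch below shows how two such parameters enter a binary focal loss, on the assumption (not confirmed by the text shown here) that \(\gamma\) and \(\alpha\) are the focusing and class-balancing parameters of a standard focal loss. The function name `binary_focal_loss` and the sample probabilities are hypothetical.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.8, eps=1e-7):
    """Mean binary focal loss over predicted positive-class probabilities.

    Assumed form: FL = -alpha_t * (1 - p_t)**gamma * log(p_t), where p_t is
    the probability assigned to the true class and alpha_t is alpha for
    positives and (1 - alpha) for negatives.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)       # avoid log(0)
        p_t = p if y == 1 else 1.0 - p        # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# A confidently correct positive contributes far less than a borderline one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.6], [1])
```

With \(\gamma = 0\) and \(\alpha = 0.5\) this reduces to half the ordinary binary cross-entropy, so \(\gamma\) controls how strongly easy examples are down-weighted and \(\alpha\) shifts emphasis between the two classes.
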
Model | GTV Dice | GTV \(HD_{95}\) (mm) | GTV Precision | GTV Recall | EGFR Accuracy | EGFR Precision | EGFR Recall
---|---|---|---|---|---|---|---
Equal weights\(^1\) | 0.8656 | 3.83 | 0.9190 | 0.8626 | 0.8485 | 0.8500 | 0.8947
Uncertain weights (Ours) | 0.8914 | 3.58 | 0.9063 | 0.9035 | 0.8788 | 0.9412 | 0.8421
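
The table above contrasts equal task weights with weights tied to uncertainty. The following is a minimal sketch of one common way to do this, homoscedastic-uncertainty weighting with one learnable log-variance per task. It is an assumption that the second row uses this scheme, since the formula is not given in the text shown here. The function names and the loss values are hypothetical.

```python
import math


def equal_weighted_loss(task_losses):
    """Baseline: plain sum of the per-task losses."""
    return sum(task_losses)


def uncertainty_weighted_loss(task_losses, log_vars):
    """Sum of exp(-s_i) * L_i + s_i, with s_i a per-task log-variance.

    In training, each s_i would be a learnable parameter; here the values
    are passed in directly to show how the weighting behaves.
    """
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


# Hypothetical segmentation and classification losses.
losses = [0.40, 0.90]

baseline = equal_weighted_loss(losses)
# With all log-variances at zero, the weighted form equals the plain sum.
at_zero = uncertainty_weighted_loss(losses, [0.0, 0.0])
# For a fixed loss L, the term exp(-s) * L + s is minimised at s = log(L).
at_optimum = uncertainty_weighted_loss(losses, [math.log(l) for l in losses])
```

The `+ s_i` term acts as a regulariser: without it, the weights `exp(-s_i)` could shrink every task loss to zero by letting each `s_i` grow without bound.
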