Multi-task logistic regression (MTLR)
The MTLR model presents a novel approach to survival analysis, extending traditional methods by directly modeling the survival function across multiple time intervals. The model accommodates the time-varying effects of covariates, allowing for a more nuanced understanding of risk factors. MTLR captures these dynamics, offering several advantages over Cox’s proportional hazards and Aalen’s additive models, which are traditional staples of survival analysis. Unlike these models, MTLR can handle non-proportional hazards and non-linear effects, resulting in improved predictive accuracy and flexibility.
A simplified view of the mathematical formulation behind MTLR is shown below:
Assume we divide the survival time into N discrete intervals, \([t_0,t_1),[t_1,t_2),\ldots,[t_{N-1},t_N)\), where \(t_0=0\) and \(t_N=\infty\) (or some maximum follow-up time). For each interval i, MTLR models the probability that the event of interest (e.g., death) occurs within that interval, i.e., that it has not occurred before \(t_{i-1}\) but occurs before \(t_i\).
The probability of the event occurring in the ith interval, given covariates X, is modeled as:
$$P(T \in [t_{i-1},t_i) \mid X)=\frac{\exp (\beta_i^{T}X)}{\sum_{j=1}^{N} \exp (\beta_j^{T}X)}$$
where:
- T is the survival time,
- X represents the covariates (or features) of the patient,
- \(\beta_i\) is the coefficient vector for interval i,
- N is the number of intervals.
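To make the formulation above concrete, the per-interval probabilities can be computed as a softmax over the interval scores \(\beta_i^T X\). The sketch below is a toy illustration with made-up coefficient and covariate matrices, not the actual MTLR implementation used in this study:

```python
import numpy as np

def mtlr_interval_probs(X, betas):
    """Softmax over the per-interval linear scores beta_i^T x.

    X     : (n_samples, n_features) covariate matrix
    betas : (n_intervals, n_features) one coefficient vector per interval
    Returns (n_samples, n_intervals) probabilities P(T in [t_{i-1}, t_i) | x).
    """
    scores = X @ betas.T                         # (n_samples, n_intervals)
    scores -= scores.max(axis=1, keepdims=True)  # subtract max for numerical stability
    exps = np.exp(scores)
    return exps / exps.sum(axis=1, keepdims=True)

# toy example: 2 patients, 3 covariates, 4 time intervals (all values random)
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 3))
betas = rng.normal(size=(4, 3))
probs = mtlr_interval_probs(X, betas)  # each row sums to 1
```

Because the probabilities for one patient are normalized across all intervals, each row of `probs` sums to 1, which is what lets MTLR read off a full discrete survival distribution per patient.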
Random Survival Forests (RSF) represent another innovative extension of Random Forests specifically adapted for survival analysis. RSF does not assume a specific form for the underlying hazard function, making it adaptable for various datasets. By aggregating predictions from multiple decision trees built on various subsamples of the data, RSF enhances prediction accuracy and robustness.
For a new observation, the survival function is estimated by aggregating predictions from all the trees in the forest.
In mathematical terms, if we denote \(S_i(t)\) as the survival function estimated by the ith tree in the forest for time t, and N as the total number of trees, the overall survival function S(t) for an observation is given by:
$$S(t)=\frac{1}{N}\sum_{i=1}^{N} S_i(t)$$
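The forest-level estimate above is simply a pointwise average of the per-tree survival curves on a common time grid. A minimal sketch, with hand-made per-tree curves standing in for real tree predictions:

```python
import numpy as np

def ensemble_survival(tree_curves):
    """Average per-tree survival curves S_i(t) into the forest estimate S(t).

    tree_curves : (n_trees, n_times) array; row i holds S_i evaluated
                  on a shared grid of time points.
    """
    return tree_curves.mean(axis=0)

# toy example: 3 trees, survival evaluated at 5 shared time points
trees = np.array([
    [1.0, 0.9, 0.7, 0.5, 0.3],
    [1.0, 0.8, 0.6, 0.4, 0.2],
    [1.0, 0.9, 0.8, 0.6, 0.4],
])
S = ensemble_survival(trees)  # -> [1.0, 0.8667, 0.7, 0.5, 0.3] (approx.)
```

Averaging monotone non-increasing curves yields a monotone non-increasing curve, so the ensemble estimate remains a valid survival function.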
The DeepSurv model in PySurvival is a deep learning-based approach to survival analysis, renowned for its ability to capture complex, non-linear relationships in the data. DeepSurv extends traditional survival analysis models by using neural networks, allowing for more flexible and potentially more accurate modeling of survival data, especially when dealing with high-dimensional and complex datasets.
DeepSurv utilizes a neural network to model the hazard function. The hazard function in the context of DeepSurv can be expressed as:
\(h(t \mid X) = h_0(t)\exp(g(X,\theta))\)
where:
- \(h(t \mid X)\) is the hazard function at time t given covariates X.
- \(h_0(t)\) is the baseline hazard function, which is typically left unspecified.
- \(\exp(g(X,\theta))\) is a non-linear function represented by the neural network, with X being the input covariates and θ representing the network's parameters.
- The neural network g(X,θ) learns complex relationships between the input covariates and the log-risk of the event occurring.
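As an illustration of the DeepSurv idea, the relative hazard \(h(t \mid X)/h_0(t) = \exp(g(X,\theta))\) can be sketched with a hypothetical one-hidden-layer network standing in for g; this toy (random weights, no training) is not the actual architecture used in PySurvival or in this study:

```python
import numpy as np

def g(X, W1, b1, W2, b2):
    """A minimal one-hidden-layer network for the log-risk g(X, theta)."""
    h = np.maximum(X @ W1 + b1, 0.0)  # ReLU hidden layer
    return h @ W2 + b2                # one log-risk value per patient

# relative hazard h(t|X)/h0(t) = exp(g(X, theta))
# toy example: 4 patients, 5 covariates, 8 hidden units (random weights)
rng = np.random.default_rng(1)
X = rng.normal(size=(4, 5))
W1, b1 = rng.normal(size=(5, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
relative_hazard = np.exp(g(X, W1, b1, W2, b2)).ravel()
```

Because the baseline hazard \(h_0(t)\) factors out, the network only needs to rank patients by log-risk, exactly as in a Cox model but with a learned non-linear risk score.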
Our decision to employ DL, MTLR, and RSF in our study, alongside a comparison with the traditional TNM staging system, stemmed from our aim to explore a spectrum of machine learning approaches that address the unique challenges posed by survival data. Each method was selected based on its specific strengths in addressing different aspects of survival analysis:
- DL was chosen for its unparalleled capability in modeling complex, non-linear relationships within high-dimensional datasets.
- MTLR was employed for its innovative approach to capturing time-varying effects of covariates across multiple time intervals.
- RSF was utilized for its effectiveness in handling censored data and reducing overfitting, leveraging the ensemble strength of Random Forests tailored to survival analysis.
Various ML algorithms were employed in this study. We implemented a two-stage validation process for our survival analysis models. Initially, we conducted internal validation using the SEER database, where the data were randomly split into two portions: 60% for model training and 40% for validation. This partitioning enabled us to develop and subsequently evaluate the models within the same dataset. We employed a grid search approach combined with the C-index to select parameters for our survival analysis models. This method entails exploring a predefined set of parameter combinations, training the model on the training dataset, and then evaluating its performance on the internal validation dataset using the C-index. This process systematically identifies the optimal parameters that enhance each model's predictive accuracy by ranking survival times effectively. The results of the grid search are provided in the supplementary materials. For external validation, we utilized an independent dataset from China, allowing us to assess the models' performance in a different patient population. This comprehensive approach ensured a thorough evaluation of our models across diverse clinical contexts [17]. The ML algorithms tested in this study encompassed DL, MTLR, and RSF. The accuracy of these ML models was compared with that of TNM stage. To evaluate the performance of the models, various metrics were computed, including the area under the receiver operating characteristic curve [18]. The Area Under the Curve (AUC) is a performance measure that remains unaffected by specific thresholds and provides a comprehensive evaluation of the model's performance.
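The grid search described above scores each parameter combination by the validation C-index. The sketch below illustrates this selection loop; Harrell's C-index computation is standard, but the ridge-regression "survival learner", the parameter grid, and the simulated data are placeholders for illustration only, not the models or data of this study:

```python
import numpy as np

def c_index(time, event, risk):
    """Harrell's concordance index: among comparable pairs (the earlier
    time is an observed event), the fraction where the higher risk score
    matches the shorter survival; ties in risk count 0.5."""
    num, den = 0.0, 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if time[i] < time[j] and event[i] == 1:  # comparable pair
                den += 1
                if risk[i] > risk[j]:
                    num += 1.0
                elif risk[i] == risk[j]:
                    num += 0.5
    return num / den

def fit_ridge_risk(X, y, alpha):
    # ridge regression of -log(time) on covariates as a crude risk score
    XtX = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ y)

# simulated uncensored data: shorter times for higher X @ beta_true
rng = np.random.default_rng(2)
X_tr, X_va = rng.normal(size=(50, 3)), rng.normal(size=(20, 3))
beta_true = np.array([1.0, -0.5, 0.25])
t_tr = np.exp(-X_tr @ beta_true + rng.normal(scale=0.5, size=50))
t_va = np.exp(-X_va @ beta_true + rng.normal(scale=0.5, size=20))
e_va = np.ones(20, dtype=int)

# grid search: keep the penalty with the best validation C-index
grid = [0.01, 0.1, 1.0, 10.0]
best = max(
    (c_index(t_va, e_va, X_va @ fit_ridge_risk(X_tr, -np.log(t_tr), a)), a)
    for a in grid
)  # best = (validation C-index, chosen alpha)
```

The same loop applies unchanged to any survival learner: only the fit/predict step differs per model, while the C-index provides a threshold-free ranking criterion for the grid search.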
The AUC values range from 0.5 to 1.0, where 0.5 represents random chance and 1.0 represents perfect classification. Additionally, the calibration of the models, which compares predicted outcomes to observed outcomes, was evaluated through visual examination of calibration plots. Decision curve analysis was performed to calculate the clinical net benefit of each prediction model. The net benefit measures the advantage gained by using a model's predictions to guide decision-making, and was compared with that of prognosis-based intervention strategies, in which intervention is offered when a patient's predicted risk exceeds a specific threshold.
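At a given risk threshold \(p_t\), the net benefit in decision curve analysis follows the standard formula \( \mathrm{NB} = \mathrm{TP}/n - (\mathrm{FP}/n)\cdot p_t/(1-p_t) \). A minimal sketch with made-up outcomes and predicted risks, purely to illustrate the computation:

```python
import numpy as np

def net_benefit(y_true, risk_pred, threshold):
    """Clinical net benefit at risk threshold p_t:
    NB = TP/n - (FP/n) * p_t / (1 - p_t),
    treating every patient whose predicted risk >= threshold."""
    treat = risk_pred >= threshold
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))  # treated patients with the event
    fp = np.sum(treat & (y_true == 0))  # treated patients without the event
    return tp / n - fp / n * (threshold / (1 - threshold))

# toy example: 6 patients, outcomes y and predicted risks p
y = np.array([1, 1, 0, 0, 1, 0])
p = np.array([0.9, 0.4, 0.3, 0.1, 0.05, 0.6])
nb = net_benefit(y, p, 0.2)  # -> 2/6 - (2/6) * 0.25 = 0.25
```

Sweeping the threshold and plotting net benefit against \(p_t\) for each model, and for the treat-all and treat-none strategies, yields the decision curves reported in the analysis.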