Background
Timing of kidney replacement therapy (KRT) initiation in critically ill patients with severe acute kidney injury (AKI) is controversial and has been the focus of several recent randomized trials [1–4]. These trials have been driven by the premise that earlier KRT can facilitate more rapid correction of metabolic, acid–base, and fluid balance derangements, prevent AKI-related complications, and improve clinical outcomes [5–7]. At the same time, KRT is also recognized as an invasive and resource-intensive intervention associated with risks, such as placement of a large central venous catheter, exposure to an extracorporeal circulation, and therapy-related complications, in particular episodes of hemodynamic instability, which may modify the probability of kidney recovery and independence from KRT [2, 3, 8].
The Standard versus Accelerated Initiation of Renal-Replacement Therapy in Acute Kidney Injury (STARRT-AKI) trial found no important difference in the primary endpoint of 90-day all-cause mortality when comparing the accelerated with the more conservative strategy for starting KRT in critically ill patients with severe AKI; however, the accelerated strategy conferred greater risk for KRT dependence at 90 days among hospital survivors [3]. The STARRT-AKI trial was designed as a frequentist trial and was interpreted using a traditional framework of null hypothesis testing with a dichotomous interpretation of p values under a Neyman–Pearson concept [9]. The reinterpretation of the STARRT-AKI trial through a Bayesian framework may align more naturally with clinician decision-making and provide a more straightforward context, including the provision of direct probabilities of benefit or harm, probabilities of the effect size being within a range of relevant effect sizes, and estimates of equivalence [10–12].
Accordingly, we performed a secondary post hoc analysis of the STARRT-AKI trial data under a Bayesian framework, focusing on assessing the effect of accelerated compared with standard KRT initiation on 90-day all-cause mortality and, secondly, on key kidney-specific outcomes.
Methods
Aim, Design and Setting
We performed a post hoc secondary analysis of the STARRT-AKI trial (Data Creation Plan available at: https://www.ualberta.ca/critical-care/research/current-research/starrtaki/documents.html) [3, 13, 14]. In brief, the STARRT-AKI trial randomized critically ill patients older than 18 years with kidney dysfunction (serum creatinine level ≥ 1.13 mg per deciliter [100 μmol/l] in women and ≥ 1.47 mg per deciliter [130 μmol/l] in men) and severe AKI to two strategies for KRT initiation. Those allocated to the accelerated strategy were to commence KRT within 12 h of meeting eligibility criteria; the standard strategy entailed deferral of KRT unless a conventional indication for KRT or persistent AKI arose. Details of the protocol, analysis, and findings have been previously reported [3, 13, 14].
Patients
We included all patients from the modified intention-to-treat analysis (n = 2927).
Endpoints
The primary endpoint was 90-day all-cause mortality. Key secondary endpoints included: (1) number of days alive and free of KRT and (2) days alive and free of hospitalization, both through 90 days. Additional secondary endpoints included: (3) composite of death/KRT at 90 days; (4) KRT dependence at 90 days among survivors; and (5) rehospitalization within 90 days.
Statistical Analysis
We defined a priori that the model would be a Bayesian hierarchical model adjusted for the presence of sepsis (Yes/No), type of ICU admission (surgical vs. medical) and baseline chronic kidney disease (CKD) status, defined as premorbid estimated glomerular filtration rate (eGFR) < 60 mL/min/1.73 m² (Yes/No), with study site added as a random intercept [see: data creation plan (DCP) at https://www.ualberta.ca/critical-care/research/current-research/starrtaki/documents.html].
We considered neutral, optimistic, and pessimistic priors. The priors were defined on a log scale for the odds ratio (OR) and assumed a normal distribution. The neutral prior was defined so that 0.95 of the probability mass lay between odds ratios of 0.5 and 2.0; that is, in the notation N(mean, standard deviation), it follows \(N\left(0,0.355\right)\). The optimistic and pessimistic priors were mirrored around the effect size that the STARRT-AKI trial was designed to detect (a 6% absolute risk reduction in 90-day all-cause mortality from 40 to 34%, representing an OR = 0.77 [log[OR] = −0.257]). The standard deviation was set to allow a 0.15 probability of harm for the optimistic prior and a 0.15 probability of benefit for the pessimistic prior; that is, the optimistic prior was centered at a possible benefit (log[OR] = −0.257; OR ~ 0.77), while acknowledging the possibility of harm, and the pessimistic prior was centered at a possible harm (log[OR] = 0.257; OR ~ 1.30), while considering a 0.15 probability of benefit [9]. Under these assumptions, the optimistic prior was \(N\left(-0.257,0.249\right)\) and the pessimistic prior was \(N\left(0.257,0.249\right)\). Priors for other predictors were set as \(N\left(0,1\right)\) for regularization. Default priors for random intercepts in the brms R package were used [15].
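The prior standard deviations quoted above can be approximately recovered from the stated probability statements. The following Python sketch is illustrative only and is not the authors' code (the analysis itself was run in R); the function names are hypothetical, and the results agree with the reported 0.355 and 0.249 to roughly two decimal places.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def risk_to_log_or(p_control, p_treated):
    """Log odds ratio for a change in event risk from p_control to p_treated."""
    odds_control = p_control / (1 - p_control)
    odds_treated = p_treated / (1 - p_treated)
    return math.log(odds_treated / odds_control)


def sd_for_central_mass(upper_or, mass=0.95):
    """SD of a zero-centered normal on log(OR) that places `mass`
    of its probability between 1/upper_or and upper_or."""
    z = STD_NORMAL.inv_cdf(0.5 + mass / 2)
    return math.log(upper_or) / z


def sd_for_tail_probability(mean_log_or, tail_prob):
    """SD such that a normal centered at mean_log_or leaves `tail_prob`
    of its probability on the opposite side of zero."""
    z = STD_NORMAL.inv_cdf(1 - tail_prob)
    return abs(mean_log_or) / z


# Neutral prior: 0.95 of the mass between OR 0.5 and OR 2.0
neutral_sd = sd_for_central_mass(2.0)

# Effect size the trial was designed to detect: mortality 40% -> 34%
design_log_or = risk_to_log_or(0.40, 0.34)

# Optimistic/pessimistic priors: 0.15 probability on the opposite side of zero
informative_sd = sd_for_tail_probability(design_log_or, 0.15)

print(f"neutral SD: {neutral_sd:.3f}")
print(f"design log(OR): {design_log_or:.3f}")
print(f"optimistic/pessimistic SD: {informative_sd:.3f}")
```

The optimistic prior uses `design_log_or` as its mean and the pessimistic prior uses the same value with the opposite sign, both with `informative_sd`.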
We report the following metrics for the intervention (accelerated strategy) on the primary endpoint: (1) median of the posterior distribution; (2) posterior distribution 95% highest density interval (HDI); (3) probability of direction (PD; the probability that the effect size is on the side of the point estimate); (4) probability of “significance” based on a region of practical equivalence defined using traditional criteria; (5) probability that the effect size is at least equal to or greater than what was considered as a minimal clinically important difference (MCID) in favor of the intervention, as defined by a survey of the STARRT-AKI international steering committee members (see Additional file 1); and (6) probability that the effect size is at least 1.5 times higher than the one defined as MCID (which we considered as a “large” effect). The thresholds beyond which the effect was considered as “significant” were based on a difference in log(OR) that is equivalent to a standardized mean difference of 0.1 on Cohen’s d scale [equivalent to a log(OR) difference of 0.18; to convert a standardized mean difference on Cohen’s d scale to a log(OR) difference, multiply d by \(\pi /\sqrt{3}\)], which would translate to an odds ratio between 0.83 and 1.19 [16, 17]. These parameters were used to define the region of practical equivalence (ROPE) for this analysis; these values, albeit somewhat arbitrary, are considered as reasonable for equivalence testing [16, 17]. We defined percentage inside ROPE as the proportion of the whole posterior distribution that lies within the ROPE. Convergence and stability of the Bayesian sampling were assessed using R-hat, which should be below 1.01 [13], and effective sample size (ESS), which should be greater than 1000. Models were run using the R packages brms [15] and emmeans [18]. All analyses were run in R version 4.2.0.
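As a hedged illustration of how the metrics above can be computed from posterior draws of the log(OR), the Python sketch below derives the ROPE limits from Cohen's d and summarizes a small set of draws. The function names and the toy draws are hypothetical and are not trial results; in practice, the draws would come from the fitted model.

```python
import math
from statistics import median


def rope_from_cohens_d(d=0.1):
    """ROPE limits on the log(OR) scale: d multiplied by pi / sqrt(3)."""
    half_width = d * math.pi / math.sqrt(3)
    return -half_width, half_width


def probability_of_direction(draws):
    """Share of draws lying on the same side of zero as the posterior median."""
    if median(draws) >= 0:
        same_side = sum(x > 0 for x in draws)
    else:
        same_side = sum(x < 0 for x in draws)
    return same_side / len(draws)


def share_in_rope(draws, low, high):
    """Proportion of the whole posterior lying inside [low, high]."""
    return sum(low <= x <= high for x in draws) / len(draws)


def probability_at_or_below(draws, threshold):
    """Share of draws at or below a log(OR) threshold (lower values mean benefit)."""
    return sum(x <= threshold for x in draws) / len(draws)


def highest_density_interval(draws, mass=0.95):
    """Shortest interval containing at least `mass` of the sorted draws."""
    ordered = sorted(draws)
    n = len(ordered)
    k = max(1, math.ceil(mass * n))  # number of draws the interval must cover
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


rope_low, rope_high = rope_from_cohens_d(0.1)

# Toy draws for illustration only (hypothetical, not from the trial)
toy_draws = [-0.30, -0.10, 0.05, 0.10, 0.25]

pd_value = probability_of_direction(toy_draws)
in_rope = share_in_rope(toy_draws, rope_low, rope_high)
p_mcid = probability_at_or_below(toy_draws, -0.175)
hdi_low, hdi_high = highest_density_interval(toy_draws)

print(f"ROPE as OR: {math.exp(rope_low):.2f} to {math.exp(rope_high):.2f}")
print(f"PD: {pd_value:.2f}, in ROPE: {in_rope:.2f}, P(MCID or better): {p_mcid:.2f}")
```

Exponentiating the two ROPE limits gives odds ratios close to the 0.83 and 1.19 quoted in the text.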
Further, we also evaluated a secondary set of priors based on observations from earlier trials for the primary outcome, including: (1) the STARRT-AKI pilot trial [19]; (2) the AKIKI and ELAIN trials (given divergent results) [2, 4]; and (3) the individual patient data meta-analysis (IPDMA) (which included all prior trials except the main STARRT-AKI trial) [20].
Secondary endpoints (days alive and KRT-free and days alive and hospital-free) were assessed using zero–one inflated beta regression models and reported as the absolute difference in days between the accelerated and standard strategies (with 95% credible intervals [CrI] expressed as HDI) [21, 22]. We also report the conditional probability of the difference in days alive and KRT-free and hospital-free favoring the accelerated strategy and the probability that the difference lies within an interval from one day fewer to one day more or, secondarily, is higher than the consensus MCID. Other secondary binary endpoints were assessed using a hierarchical Bayesian logistic model similar to that used for the primary endpoint. Secondary outcomes were assessed using only neutral priors (\(N\left(0,0.355\right)\) for the intervention for the binary component and \(N\left(0,1\right)\) for all other variables in the model; see ESM for details), and results are presented as median difference in proportions (with 95% HDI), as well as median OR (with 95% HDI) and the probability of benefit. We report missing values for all outcomes; a complete case analysis was used for all endpoints.
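To make the zero–one inflated beta model more concrete, the Python sketch below writes out its density and expected value for an outcome rescaled to the unit interval (days divided by 90). It assumes the parameterization used by the `zero_one_inflated_beta` family in brms (`mu`, `phi`, `zoi`, `coi`); the source does not state the exact model specification, and all parameter values shown are hypothetical rather than estimates from the trial.

```python
import math


def zoib_density(y, mu, phi, zoi, coi):
    """Zero-one inflated beta density for y in [0, 1].

    zoi: probability that y is exactly 0 or 1
    coi: probability that y is 1, given that it is 0 or 1
    mu, phi: mean and precision of the beta part for 0 < y < 1
    """
    if y == 0:
        return zoi * (1 - coi)
    if y == 1:
        return zoi * coi
    a, b = mu * phi, (1 - mu) * phi  # beta shape parameters
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_pdf = (a - 1) * math.log(y) + (b - 1) * math.log(1 - y) - log_beta_fn
    return (1 - zoi) * math.exp(log_pdf)


def zoib_mean(mu, zoi, coi):
    """Expected value of a zero-one inflated beta outcome."""
    return zoi * coi + (1 - zoi) * mu


def expected_days(mu, zoi, coi, horizon=90):
    """Expected number of days when the outcome is days / horizon."""
    return horizon * zoib_mean(mu, zoi, coi)


# Hypothetical parameter values for two arms (illustration only)
days_arm_a = expected_days(mu=0.60, zoi=0.50, coi=0.20)
days_arm_b = expected_days(mu=0.62, zoi=0.50, coi=0.20)
difference_in_days = days_arm_a - days_arm_b

print(f"expected days, arm A: {days_arm_a:.1f}")
print(f"expected days, arm B: {days_arm_b:.1f}")
print(f"absolute difference in days: {difference_in_days:.1f}")
```

Applying `expected_days` to each posterior draw of the parameters, rather than to single values, would give a posterior distribution for the absolute difference in days, from which a median and HDI can be taken.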
Consensus for minimal clinically important difference
We surveyed the 24 members of the international steering committee of the STARRT-AKI to generate consensus on a MCID for the primary and secondary endpoints (see Additional file 1). An absolute difference of 0.04 over the baseline event rate of 0.40 for the primary endpoint, all-cause mortality at 90 days, was considered as the MCID (which results in an odds ratio of approximately 0.84; log(OR) = −0.175) (see ESM). The margin for a large effect was therefore set as \(1.5\times \left(-0.175\right) \approx -0.26\) on the log(OR) scale, which, applied symmetrically, translates to a margin of large effects set as an odds ratio below 0.77 or above 1.30. A margin of 3 days was considered as equivalent for the key secondary endpoints.
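The arithmetic behind these margins can be checked with a short Python sketch. It is illustrative only; the helper names are hypothetical. Note that the unrounded log(OR) for a change from 0.40 to 0.36 is slightly smaller in magnitude than the reported −0.175, which is close to the logarithm of the rounded odds ratio of 0.84.

```python
import math


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)


def odds_ratio_for_risk_change(baseline_risk, absolute_reduction):
    """Odds ratio implied by lowering baseline_risk by absolute_reduction."""
    return odds(baseline_risk - absolute_reduction) / odds(baseline_risk)


# MCID: 0.04 absolute reduction from a baseline 90-day mortality of 0.40
mcid_or = odds_ratio_for_risk_change(0.40, 0.04)
mcid_log_or_unrounded = math.log(mcid_or)

# Value reported in the text, used for the large-effect margin
mcid_log_or_reported = -0.175
large_log_or = 1.5 * mcid_log_or_reported

# Symmetric margins on the odds ratio scale
large_or_benefit = math.exp(large_log_or)
large_or_harm = math.exp(-large_log_or)

print(f"MCID OR: {mcid_or:.3f} (log OR {mcid_log_or_unrounded:.3f})")
print(f"large-effect margin: OR below {large_or_benefit:.2f} or above {large_or_harm:.2f}")
```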
Discussion
In this post hoc Bayesian reanalysis of STARRT-AKI, the largest international randomized trial of acute KRT, we found that the probability that an accelerated strategy was associated with a clinically important or large treatment effect on 90-day all-cause mortality is very low. These findings were consistent across a spectrum of priors used to inform our Bayesian models, including the results from prior trials with conflicting results [1, 2, 4]. In addition, we found high probabilities that the accelerated strategy resulted in fewer KRT-free days, as well as a higher risk of KRT dependence and rehospitalization at 90 days (all probabilities exceeding 0.90) compared with the standard strategy. These findings greatly extend the main frequentist analysis of the STARRT-AKI trial previously reported, by drawing emphasis on the exceedingly low likelihood of any meaningful benefit with a strategy of accelerated KRT initiation [3]. While trials have utilized varying definitions of “accelerated” or “early” and “standard” or “delayed” to define the timing of KRT initiation, the findings of this analysis should strongly reinforce the adoption of a “watch and wait” strategy, where clinician decision-making on when to start KRT for critically ill patients with AKI should be prompted by development of conventional indications, medically refractory complications and/or persistent AKI [3, 21].
The use of Bayesian reanalysis provides a unique opportunity to reappraise, augment, and expand the main results of large, randomized trials using an alternative framework [10]. A Bayesian approach, integrating the concepts of probabilities of benefit or harm for a given intervention, may better mimic how clinicians integrate information to make clinical decisions at the bedside. This may have greater relevance for resource-intensive interventions with known risk profiles, such as KRT [22]. In this reanalysis, we “stressed” the STARRT-AKI trial data with seven different priors for the primary endpoint of 90-day all-cause mortality (with only minor deviations in results). We further provided probabilistic interpretations of the primary and secondary outcomes based not only on thresholds for treatment effect sizes [16, 17], but also by defining a minimal clinically important difference (MCID) from a consensus of the STARRT-AKI trial’s lead investigators.
Establishing a MCID can be challenging. This can often be based on cost-effectiveness analyses or quality-adjusted life years [23] and is increasingly being adopted across disciplines and in clinical trial design [24]. Despite this, there is surprisingly little guidance on how to best define MCID in critical care [25, 26]. We used a very simple consensus analysis based on the expert opinion of the international steering committee of the STARRT-AKI trial [3]. Though imperfect, this approach enabled a global perspective from clinicians who are deeply involved in critical care nephrology. First, there was consensus that 4% absolute difference in the primary endpoint of all-cause mortality at 90 days could be considered as a MCID. In the main STARRT-AKI analysis, we reported a relative risk of 1.00 (95% CI 0.93–1.09) [3], that is, an absolute difference of 0, with the data being compatible under the null hypothesis to values in the range of a 7% reduction or 9% increase in 90-day all-cause mortality. Therefore, the main analysis was not able to theoretically rule out what could be considered a MCID, as defined by consensus for this analysis, since the 4% absolute reduction was within range of the reported treatment effect size under the frequentist paradigm. The findings of this Bayesian reanalysis can virtually eliminate the possibility that a 4% absolute reduction in the primary endpoint was compatible with the trial data, regardless of the variation in priors used to inform the analysis. Likewise, we were able to conclude with high probability that the accelerated strategy conferred greater KRT dependence, rehospitalization, and fewer KRT-free days when compared to a standard strategy for KRT initiation.
There are limitations to our analysis that warrant consideration. First, this secondary analysis was post hoc; however, we prespecified the analytic plan before data analysis. Second, we recognize that priors used in Bayesian analysis are subjective. To address this, we used a range of priors, including those derived from prior trial data and consensus. Third, we did not impute missing data. Fourth, we did not adjust for multiplicity of testing, though the concern for type I error may be reduced with Bayesian analysis compared with a frequentist analysis, and our findings were coherent with the main STARRT-AKI trial [3, 11]. Fifth, we used margins for equivalence and for defining large effect sizes that may be questionable; however, we also present results based on the consensus definition of MCID, which supported a consistent interpretation.