Objectives Alteplase is commonly recommended for acute ischaemic stroke within 4.5 hours after stroke onset. The Third European Cooperative Acute Stroke Study (ECASS III) is the only trial reporting statistically significant efficacy for clinical outcomes for alteplase use 3–4.5 hours after stroke onset. However, baseline imbalances in history of prior stroke and stroke severity score may confound this apparent finding of efficacy. We reanalysed the ECASS III trial data adjusting for baseline imbalances to determine the robustness or sensitivity of the efficacy estimates.
Design Reanalysis of randomised placebo-controlled trial. We obtained access to the ECASS III trial data and replicated the previously reported analyses to confirm our understanding of the data. We adjusted for baseline imbalances using multivariable analyses and stratified analyses and performed sensitivity analysis for missing data.
Setting Emergency care.
Participants 821 adults with acute ischaemic stroke who could be treated 3–4.5 hours after symptom onset.
Interventions Intravenous alteplase (0.9 mg/kg of body weight) or placebo.
Main outcome measures The original primary efficacy outcome was modified Rankin Scale (mRS) score 0 or 1 (ie, being alive without any disability) and the original secondary efficacy outcome was a global outcome based on a composite of functional end points, both at 90 days. Adjusted analyses were only reported for the primary efficacy outcome and the original study protocol did not specify methods for adjusted analyses. Our adjusted reanalysis included these outcomes, symptom-free status (mRS 0), dependence-free status (mRS 0–2), mortality (mRS 6) and change across the mRS 0–6 spectrum at 90 days; and mortality and symptomatic intracranial haemorrhage at 7 days.
Results We replicated previously reported unadjusted analyses but discovered they were based on a modified interpretation of the National Institutes of Health Stroke Scale (NIHSS) score. The secondary efficacy outcome was no longer significant using the original NIHSS score. Previously reported adjusted analyses could only be replicated with significant effects for the primary efficacy outcome by using statistical approaches not reported in the trial protocol or statistical analysis plan. In analyses adjusting for baseline imbalances, all efficacy outcomes were not significant, but increases in symptomatic intracranial haemorrhage remained significant.
Conclusions Reanalysis of the ECASS III trial data with multiple approaches adjusting for baseline imbalances does not support any significant benefits and continues to support harms for the use of alteplase 3–4.5 hours after stroke onset. Clinicians, patients and policymakers should reconsider interpretations and decisions regarding management of acute ischaemic stroke that were based on ECASS III results.
Trial registration number NCT00153036.
- emergency medicine
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
What is already known about this subject?
Thrombolysis with alteplase at 3–4.5 hours after onset of acute ischaemic stroke is widely reported to be effective for improving functional outcome.
The evidence basis for efficacy is primarily derived from the Third European Cooperative Acute Stroke Study (ECASS III) trial which had substantial baseline imbalances in stroke severity and history of prior stroke.
The influence of baseline differences on effect estimates from the ECASS III trial has not been thoroughly analysed.
What are the new findings?
Adjusting for baseline imbalances using multiple methods fails to find significant benefits and continues to find significant risks with alteplase 3–4.5 hours after stroke.
Originally reported unadjusted analyses suggesting efficacy in functional end points other than the primary outcome were not reproduced when using the original National Institutes of Health Stroke Scale score.
How might it impact on clinical practice in the foreseeable future?
Any prior decisions or interpretations based on ECASS III results warrant reconsideration.
Estimation of efficacy for alteplase at 3–4.5 hours after stroke should be reassessed with independent access to original trial data from all trials contributing to the assessment.
Until thorough reassessment is done, clinicians should consider any estimates of efficacy for alteplase beyond 3 hours after stroke onset to have very low certainty.
The Third European Cooperative Acute Stroke Study (ECASS III) was a randomised controlled trial comparing alteplase (a thrombolytic medication) with placebo between 3 and 4 hours and 30 min (3–4.5 hours) after stroke onset in patients with acute ischaemic hemispheric stroke.1 In 2008, the ECASS III trial results were reported with a conclusion that alteplase increased the chance of being alive with minimal symptoms 3 months after stroke.1 Primarily based on or heavily influenced by the ECASS III trial, most clinical practice guidelines for the management of acute ischaemic stroke currently recommend extending the use of alteplase up to 4.5 hours after stroke onset.2–12 To date, ECASS III is the only trial to have reported benefit from use of alteplase 3–4.5 hours after stroke onset.
The ECASS III trial had substantial baseline imbalances in two variables that are prognostic for efficacy outcomes: National Institutes of Health Stroke Scale (NIHSS) score and history of prior stroke.1 13–15 The NIHSS score at baseline was lower (ie, better) in the alteplase group (mean 10.7, median 9) than in the placebo group (mean 11.6, median 10) (p=0.03).1 Prior stroke was reported in 7.7% of alteplase patients and 14.1% of placebo patients (p=0.003).1
The primary efficacy outcome in ECASS III (being alive with minimal symptoms 3 months after stroke) was reported to be significant in some analyses adjusting for baseline imbalances. In the original report, an analysis adjusted for NIHSS score, time to treatment, smoking and hypertension (but not history of stroke) reported an OR 1.42 (95% CI 1.02 to 1.98) based on analysis of 785 patients.1 A subsequently reported ‘full model’ analysis which adjusted for NIHSS score, history of prior stroke and other prognostic variables that did not have baseline imbalances reported an OR 1.43 (95% CI 1.02 to 2.00) based on analysis of 784 patients.16 However, an analysis limited to 732 of 821 patients (89%) in ECASS III who did not have a prior stroke reported no significant difference in the primary efficacy outcome.13 16 Aside from the primary outcome, the other clinically relevant outcomes have not been reported with analyses adjusted for the baseline imbalances.13 16
It is unclear how much of the alteplase-attributed effect in the ECASS III trial is related to the effect of the drug (a true benefit) and how much is related to differences between the drug group and the placebo group that were occurring at the time of trial entry (such that baseline imbalances lead to a false signal of benefit). We reanalysed the ECASS III trial data to provide a view of clinically relevant patient-important outcomes adjusted for baseline imbalances, including sensitivity analyses and different analytical approaches to assess robustness of reported findings. Consistency across the results of the primary analysis, reanalysis and sensitivity analyses would provide reassurance about the credibility of the primary findings.17–19
We obtained access to the original data from the ECASS III trial with a study protocol posted at ClinicalStudyDataRequest.com (proposal number 1619) and included in online supplementary appendix.
We first attempted to reproduce results from table 2 through table 5 from the original publication1 to ensure that we had the same database and were using the same variables as those used for the publication by the original study team. We also attempted to reproduce the adjusted analyses for the primary outcome reported in the original publication1 and in a subsequently reported ‘full model’.16
We specified seven outcomes for our reanalysis:
Symptom-free status (modified Rankin Scale (mRS) score 0) at 90 days.
Disability-free status (mRS 0–1) at 90 days.
Dependence-free status (mRS 0–2) at 90 days.
Mortality at 7 days.
Mortality at 90 days.
Symptomatic intracranial haemorrhage (by ECASS III and National Institute of Neurological Disorders and Stroke (NINDS) study definitions) at 7 days.
Change across mRS 0–6 spectrum at 90 days (ordinal shift analysis).
The first six outcomes are dichotomous and the seventh outcome takes discrete values from 0 to 6.
For dichotomous outcomes we estimated the effect of alteplase on the probability of having the outcomes. For the change across mRS 0–6 spectrum at 90 days (ordinal shift analysis), we estimated the effect of alteplase on the aggregated probability of having an mRS score less than k where k varies between 1 and 6.
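As a minimal sketch of the quantity described above, the following Python snippet computes, for each cut-point k from 1 to 6, the odds ratio for having an mRS score less than k in one group compared with the other. The function name and all counts are hypothetical illustrations and are not ECASS III data; the reanalysis itself was planned in SAS.

```python
def cumulative_odds_ratios(treated, control):
    """Odds ratio for P(mRS < k), k = 1..6, from counts per mRS score 0..6.

    `treated` and `control` are lists of seven counts (mRS 0, 1, ..., 6).
    """
    if len(treated) != 7 or len(control) != 7:
        raise ValueError("expected seven counts, one per mRS score 0-6")
    n_t, n_c = sum(treated), sum(control)
    results = {}
    for k in range(1, 7):
        below_t = sum(treated[:k])  # patients with mRS 0..k-1
        below_c = sum(control[:k])
        odds_t = below_t / (n_t - below_t)
        odds_c = below_c / (n_c - below_c)
        results[k] = odds_t / odds_c
    return results


# Hypothetical counts for mRS 0, 1, ..., 6 (illustration only).
treated = [30, 25, 15, 10, 10, 5, 5]
control = [20, 25, 15, 15, 10, 5, 10]
ors = cumulative_odds_ratios(treated, control)
```

If the six cut-point odds ratios are broadly similar, a single common odds ratio is a reasonable summary; widely differing values suggest that one common odds ratio may not describe the data well.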
For all seven outcomes, the descriptive statistics were reported using the sample proportions. Inferential statistics were reported using relative risks and absolute risk differences for the six dichotomous outcomes and ORs for the ordinal shift analysis.
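The two measures used for the dichotomous outcomes can be illustrated with a short Python sketch. The counts are hypothetical (not trial data), the function name is our own, and the Wald-type intervals shown are one common choice rather than the method confirmed for this reanalysis.

```python
import math


def risk_summary(events_t, n_t, events_c, n_c, z=1.96):
    """Unadjusted relative risk and absolute risk difference with Wald intervals."""
    p_t = events_t / n_t
    p_c = events_c / n_c
    rr = p_t / p_c
    rd = p_t - p_c
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    rr_ci = (math.exp(math.log(rr) - z * se_log_rr),
             math.exp(math.log(rr) + z * se_log_rr))
    # Standard error of the difference of two independent proportions.
    se_rd = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    rd_ci = (rd - z * se_rd, rd + z * se_rd)
    return {"rr": rr, "rr_ci": rr_ci, "rd": rd, "rd_ci": rd_ci}


# Hypothetical example: 50/100 events with treatment vs 40/100 with control.
summary = risk_summary(50, 100, 40, 100)
```

In this hypothetical example the relative risk is 1.25 and the absolute risk difference is 0.10, and both intervals include the null value, so the difference would not be statistically significant at alpha=0.05.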
We assessed for baseline imbalances between the two treatment groups and identified baseline variables that differed significantly between the groups (alpha=0.05). The statistically significant baseline imbalances identified in the original publication were NIHSS score and history of prior stroke.1
To test the robustness of results to the choice of statistical analysis, we planned to adjust for potential confounders using three valid analytical approaches for the dichotomous outcomes (multivariable modelling, matching and stratified analysis) and multivariable modelling for the ordinal shift analysis.
For the multivariable modelling, the independent variables were treatment assignment, NIHSS score, history of prior stroke and possibly other covariates if identified as significant baseline imbalances. For the dichotomous outcomes, we used log link, and binomial or Poisson distribution to obtain the estimated adjusted relative risk for alteplase compared with placebo.20–22 For the ordinal shift analysis, we did not categorise the mRS score into categories, but rather left the mRS score as 0–6 and analysed the data using ordinal logistic regression under the assumption of proportional odds. If the proportional odds assumption did not hold well, we would use multinomial logistic regression.
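The two model forms described above can be written out as follows. The notation is ours and is offered only as a sketch: for patient i, T is treatment assignment (1 for alteplase, 0 for placebo), N is the baseline NIHSS score, S indicates history of prior stroke, Y is a dichotomous outcome and M is the mRS score at 90 days.

```latex
% Log-link model for a dichotomous outcome (adjusted relative risk)
\log \Pr(Y_i = 1) = \beta_0 + \beta_T T_i + \beta_N N_i + \beta_S S_i,
\qquad \mathrm{RR}_{\text{adjusted}} = e^{\beta_T}

% Proportional odds model for the ordinal shift analysis
\operatorname{logit} \Pr(M_i < k) = \alpha_k + \gamma_T T_i + \gamma_N N_i + \gamma_S S_i,
\qquad k = 1, \dots, 6,
\qquad \mathrm{OR}_{\text{common}} = e^{\gamma_T}
```

The proportional odds assumption is that the treatment coefficient is the same for every cut-point k, with only the intercepts varying. A multinomial logistic model relaxes this by allowing a separate set of coefficients for each mRS category relative to a reference category.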
For matching, we planned to use the optimal matching procedure23 to obtain a 1 to 1 match. The match factors would have been age, sex, NIHSS score, history of prior stroke, time from stroke onset to treatment initiation and possibly other covariates if identified as significant baseline imbalances or potential confounders.
For the stratified analysis, we stratified the sample by NIHSS score and history of prior stroke. NIHSS score was trichotomised into lower (0–9), intermediate (10–19) and higher (20–42) groups reflecting three strata of stroke severity. We used the Cochran-Mantel-Haenszel test to obtain weighted relative risk from the pooled data of the six strata.
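A stratified estimate of this kind can be sketched in Python as below. We assume here that the weighted relative risk corresponds to the standard Mantel-Haenszel pooled estimator; the function names and the counts in the six strata are hypothetical and are not trial data.

```python
def nihss_stratum(score):
    """Map a 0-42 NIHSS score to the three severity strata described above."""
    if 0 <= score <= 9:
        return "lower"
    if 10 <= score <= 19:
        return "intermediate"
    if 20 <= score <= 42:
        return "higher"
    raise ValueError("NIHSS score outside the 0-42 range")


def mantel_haenszel_rr(strata):
    """Pooled relative risk from (events_t, n_t, events_c, n_c) per stratum."""
    numerator = 0.0
    denominator = 0.0
    for events_t, n_t, events_c, n_c in strata:
        total = n_t + n_c
        if total == 0:
            continue  # an empty stratum contributes nothing
        numerator += events_t * n_c / total
        denominator += events_c * n_t / total
    return numerator / denominator


# Hypothetical counts: three NIHSS strata x prior stroke (no, yes).
six_strata = [
    (60, 100, 50, 100), (5, 10, 6, 15),    # lower NIHSS
    (30, 100, 28, 100), (2, 10, 3, 15),    # intermediate NIHSS
    (3, 40, 2, 40), (0, 5, 1, 10),         # higher NIHSS
]
pooled_rr = mantel_haenszel_rr(six_strata)
```

Because each stratum is weighted by its size, small strata (for example, patients with severe stroke and a prior stroke) have little influence on the pooled estimate.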
The reported finding of the treatment effect by the study authors assumed no interaction effects between randomised assignment and other covariates such as history of prior stroke.1 We assessed if the treatment effect varied differently across levels of covariates. We tested for the significance of the interaction term between randomised assignment and history of prior stroke and between randomised assignment and NIHSS score. If significant interactions were found, we would report the treatment effects across the levels of these covariates, but we did not find such interactions.
Where the missing data for outcomes or covariates were less than 2%, we did not attempt imputation of missing data and treated missing data as missing at random. If missing data occurred for more than 2% and less than 20%, we produced intention-to-treat analyses using best-case and worst-case assumptions for missing data. If missing data occurred for more than 20%, we did not use the variables.
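The decision rule and the extreme-case imputation can be sketched as follows. Two points are assumptions on our part and are flagged in the comments: how values of exactly 2% or 20% are handled, and the conventional reading of best case (missing outcomes favour alteplase) and worst case (missing outcomes favour placebo). All names are hypothetical.

```python
def missing_data_strategy(n_missing, n_total):
    """Choose the handling of a variable from its fraction of missing values."""
    fraction = n_missing / n_total
    if fraction < 0.02:
        return "analyse without imputation"
    if fraction <= 0.20:  # assumption: the boundary values fall in this band
        return "best-case and worst-case sensitivity analyses"
    return "do not use variable"


def impute_extreme(outcomes, groups, scenario):
    """Fill missing favourable-outcome indicators (1 = favourable, 0 = not).

    Assumption: 'best' assigns a favourable outcome to missing alteplase
    patients and an unfavourable one to missing placebo patients;
    'worst' does the reverse.
    """
    if scenario not in ("best", "worst"):
        raise ValueError("scenario must be 'best' or 'worst'")
    favours_alteplase = scenario == "best"
    filled = []
    for outcome, group in zip(outcomes, groups):
        if outcome is None:
            favourable = (group == "alteplase") == favours_alteplase
            outcome = 1 if favourable else 0
        filled.append(outcome)
    return filled


# Hypothetical data: two patients per group, one missing outcome in each.
outcomes = [1, None, None, 0]
groups = ["alteplase", "alteplase", "placebo", "placebo"]
best = impute_extreme(outcomes, groups, "best")
worst = impute_extreme(outcomes, groups, "worst")
```

Results that hold under both scenarios are robust to the missing data; results that are significant only under the best-case scenario depend on an optimistic assumption about the patients with missing outcomes.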
We planned to use the SAS software V.9.424 for all data management and statistical analyses.
We reproduced tables 2–5 from the original publication1 with only minor discrepancies. The baseline characteristics data matched exactly except for age in the alteplase group (which was reported as mean of 64.9 with SD of 12.2 in the original report and was calculated as mean of 64.7 with SD of 12.1 in our analysis) (online supplementary appendix table S1). This discrepancy was explained by anonymisation rules mandated by the European Union General Data Protection Regulation.
The unadjusted analyses for efficacy outcomes matched initially with the exception of finding 209 patients in the alteplase group (rather than 210 patients) having the NIHSS score of 0 or 1 (online supplementary appendix table S2). This discrepancy resulted in the outcomes at 90 days of NIHSS score of 0 or 1 and the global outcome becoming no longer statistically significant. We identified the patient with discrepant data and clarified with the study sponsor. The NIHSS total score at day 90 (possible range 0–46 based on total of 17 individual item scores) was reported as 2 in the original data rather than calculated as the sum of 17 individual item scores. The 17 individual item scores were each recorded as 0. Using a calculated NIHSS total score of 0 for this patient replicated the originally reported analysis.
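A discrepancy of this kind can be found with a simple consistency check that compares each recorded total with the sum of the individual item scores. The sketch below uses hypothetical record identifiers and field names, not the structure of the trial database.

```python
def total_mismatches(records):
    """Return ids of records whose recorded total differs from the item sum."""
    return [
        record["id"]
        for record in records
        if record["recorded_total"] != sum(record["items"])
    ]


# Hypothetical 90-day records, each with 17 item scores.
records = [
    {"id": "A", "recorded_total": 0, "items": [0] * 17},
    {"id": "B", "recorded_total": 2, "items": [0] * 17},  # total disagrees with items
    {"id": "C", "recorded_total": 3, "items": [1, 2] + [0] * 15},
]
flagged = total_mismatches(records)
```

Here only record B is flagged, mirroring the situation described above in which a recorded total of 2 accompanied 17 item scores of 0.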
We reproduced the results in unadjusted analyses reported for symptomatic intracranial haemorrhage by ECASS II and NINDS definitions but did not find data reporting symptomatic intracranial haemorrhage by ECASS III definition (online supplementary appendix table S2). The ECASS III definition was the same as the ECASS II definition plus the requirement that ‘the hemorrhage must have been identified as the predominant cause of the neurologic deterioration’. The only other difference from the published results was that we found 11 cases of vascular serious adverse events in the placebo group rather than 10 cases originally reported.
All analyses reported for ECASS III were based on ‘NIHSS (inclusive of distal motor function left/right)’, resulting in an NIHSS total score with a range of 0–46 calculated from 17 component scores (the 15 components of the NIHSS score plus two additional distal motor function components). This approach using ‘NIHSS (inclusive of distal motor function)’ had been prespecified in the ECASS III protocol and statistical analysis plan. However, the original publication1 erroneously stated that NIHSS scores ranged from 0 to 42, a range that corresponds to the standard calculation of the NIHSS score from 15 components.25 Our reanalysis protocol specified analysis of NIHSS scores based on the standard 0–42 range. In our reanalysis we found all non-primary efficacy outcomes were not significant when using the NIHSS score as reported in the original publication (table 1).
Replication of previously reported adjusted analyses
We attempted to reproduce the adjusted analysis in the original publication1 and a subsequently reported ‘full model’ analysis.16 Our initial attempts failed to replicate the previously reported adjusted analyses. We clarified with the study sponsor and were able to reproduce both the previously reported adjusted analyses with three conditions:
Exclusion of patients who had incomplete or missing baseline NIHSS scores.
Treatment of the baseline NIHSS score as a categorical variable with five categories (0–5, 6–10, 11–15, 16–20, >20).
Treatment of the time from symptom onset to treatment as a categorical variable with seven categories (15 min windows).
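The first two conditions can be illustrated with a short Python sketch: patients without a usable baseline NIHSS score are dropped, and the remaining scores are grouped into the five categories listed. The function and variable names are hypothetical. The boundaries of the seven 15 min time windows are not stated here, so that step is not sketched.

```python
def baseline_nihss_category(score):
    """Five-category grouping of the baseline NIHSS score; None if missing."""
    if score is None:
        return None
    if score < 0:
        raise ValueError("NIHSS score cannot be negative")
    if score <= 5:
        return "0-5"
    if score <= 10:
        return "6-10"
    if score <= 15:
        return "11-15"
    if score <= 20:
        return "16-20"
    return ">20"


# Hypothetical baseline scores; None marks an incomplete or missing score.
baseline_scores = [4, 9, None, 14, 18, 25]
categories = [baseline_nihss_category(s) for s in baseline_scores]
# Complete-case analysis: patients with a missing category are excluded.
complete_cases = [c for c in categories if c is not None]
```

Grouping a score in this way discards the ordering information within each category, which is one reason a categorical and a continuous adjustment for the same covariate can give different results.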
The ECASS III protocol and statistical analysis plan provided detailed prespecification of unadjusted analyses for all end points but only predefined the role of adjusted analyses as for exploratory purpose and to identify predictive/confounding variables. The three conditions necessary to replicate the adjusted analyses were thus not prespecified. They were subsequently identified in the clinical trial report or in communication with the study sponsor (table 2). The previously reported adjusted analyses were statistically significant only under all three of these conditions (table 3).
Analyses adjusted for baseline imbalances
To follow our originally intended protocol we calculated NIHSS scores based on the 15 component scores in the original NIHSS score definition. If a baseline NIHSS component score had a missing value, it was imputed with a 0 and if a 90-day NIHSS component score had a missing value it was imputed with the maximum value for the item. The baseline and 90-day NIHSS scores were then computed.
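The scoring rule above can be sketched in Python as follows. The item maxima are those of the standard 15-item NIHSS (total range 0–42); the item labels, function name and example values are our own and hypothetical, and the two additional distal motor function items are not modelled.

```python
# Maximum score for each of the 15 standard NIHSS items (sum = 42).
NIHSS_ITEM_MAX = {
    "1a_level_of_consciousness": 3,
    "1b_loc_questions": 2,
    "1c_loc_commands": 2,
    "2_gaze": 2,
    "3_visual_fields": 3,
    "4_facial_palsy": 3,
    "5a_motor_arm_left": 4,
    "5b_motor_arm_right": 4,
    "6a_motor_leg_left": 4,
    "6b_motor_leg_right": 4,
    "7_limb_ataxia": 2,
    "8_sensory": 2,
    "9_language": 3,
    "10_dysarthria": 2,
    "11_extinction_inattention": 2,
}


def nihss_total(items, visit):
    """Sum the 15 item scores, imputing missing items by visit.

    Missing baseline items are set to 0; missing 90-day items are set to
    the maximum score for that item.
    """
    if visit not in ("baseline", "day90"):
        raise ValueError("visit must be 'baseline' or 'day90'")
    total = 0
    for name, max_score in NIHSS_ITEM_MAX.items():
        value = items.get(name)
        if value is None:
            value = 0 if visit == "baseline" else max_score
        total += value
    return total


# Hypothetical patient with one recorded item and the rest missing.
partial = {"9_language": 1}
baseline_total = nihss_total(partial, "baseline")
day90_total = nihss_total(partial, "day90")
```

Both imputations are conservative with respect to a treatment benefit only if missingness is similar between groups: a missing baseline item makes the patient appear less severely affected at entry, and a missing 90-day item makes the patient appear more severely affected at follow-up.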
The only two variables that were statistically significant as baseline imbalances were the NIHSS score and history of prior stroke. We were unable to conduct the matching analysis because the necessary tools were not present in the SAS software version available through the portal used for study data access.
For all efficacy outcomes at 90 days, multivariable analyses and stratified analyses adjusting for baseline imbalances found no significant treatment effect (table 4). This occurred for every dichotomous classification of mRS, for all other dichotomous efficacy outcome measures and for the ordinal shift analysis. For the ordinal shift analysis, the assumption of proportional odds was violated for our multivariable modelling (p=0.0004). Therefore, we used a multinomial logistic regression model and found no significant difference for this outcome (p=0.440). Increases in symptomatic intracranial haemorrhage with alteplase remained statistically significant in five of six adjusted analyses (table 4).
Descriptive statistics for our seven prespecified outcomes of interest are reported in online supplementary appendix table S2. There were no missing data for four safety outcomes. For the three prespecified efficacy outcomes, a best-case sensitivity analysis found significant effects for the outcome of mRS 0 or 1 across most methods of analysis, and variable results for mRS 0 and for mRS 0–2 based on method of analysis (online supplementary appendix table S3). In the worst-case sensitivity analysis (using either baseline NIHSS score in 0–42 range or in 0–46 range), none of the efficacy end points were significant in any unadjusted analysis or any analysis adjusted for baseline imbalances (online supplementary appendix table S4).
Reanalysis of the ECASS III trial data with multiple approaches adjusting for baseline imbalances does not support any statistically significant benefits that were previously reported and continues to support statistically significant harms for the use of alteplase 3–4.5 hours after stroke onset.
Strengths of this reanalysis include prespecified outcomes of clinical importance and prespecified analytical methods to avoid selective analysis and selective reporting. Limitations of this reanalysis include restrictions on trial data access, which meant that matching analysis could not be performed. However, it is implausible for a matching analysis to find statistically significant results because the high likelihood of mismatched patients would be expected to reduce the sample sizes and resulting statistical power. Another limitation of reanalysis, or of any method for adjusting for non-randomised factors influencing the effect estimates from a randomised trial, is that such analyses cannot confidently produce new conclusions (neither a claim of efficacy nor a claim of absence of efficacy). The role of reanalysis and adjusted analyses is limited to increasing or decreasing the certainty in the unadjusted analysis of the randomised trial. In this case the reanalysis does not negate the original findings, but it greatly reduces the certainty for those findings.
In addition to the adjusted analyses conducted as our primary reanalysis per our predefined protocol, we discovered two unique differences between our expectations and the published findings when attempting to reproduce the originally reported results. First, the original authors used a modified NIHSS score, consistent with their prespecified protocol but inconsistent with the results reported in the trial publication. Analysis using the original NIHSS score (without additional scores from distal motor function) results in loss of significance in non-primary efficacy outcomes previously reported as significant in unadjusted analysis. Second, the original authors reported adjusted analyses which supported their original findings, but the precise methods and assumptions used for these adjusted analyses were not prespecified. Previously reported adjusted analyses suggesting efficacy show statistical significance only under multiple conditions that do not represent the most informative use of available data. Seven other ‘less selective’ approaches to these adjusted analyses fail to replicate significant effects.
We had a priori concerns for risk of bias due to known baseline imbalances in prognostic variables. Our reanalysis (intended to evaluate the robustness of ECASS III findings to multiple standard analyses adjusted for these baseline imbalances) resulted in numerous reasons to reduce our certainty in the primary results. First, all our prespecified adjusted analyses were inconsistent with the primary analysis, so we have substantial uncertainty for efficacy estimates. Second, the previously reported adjusted analysis appears to be selective analysis and reporting (not necessarily intentional and potentially just an oversight in reporting scientific findings with a lot of complexity). Rather than viewing consistent findings in one highly selected adjusted analysis we find inconsistent findings in seven of eight adjusted analyses for this method of adjustment, further reducing our certainty. Finally, the absence of significant effects in any non-primary efficacy outcome (with use of the original NIHSS score for unadjusted analyses) adds reasons for less certainty in the primary outcome unrelated to the risk of bias from baseline imbalances. All other methods of assessing functional outcomes being inconsistent with the primary outcome make it less likely that the difference in the primary outcome represents a true effect.
Overall decision-making should not be determined based on a single trial but rather careful synthesis of a body of evidence. We previously reported that the most current comprehensive meta-analyses supported the use of alteplase 3–4.5 hours after stroke despite the evidence directly comparing alteplase to no alteplase suggesting a 2% absolute increase in mortality and no clear benefit.26 The individual patient data meta-analysis reported an improvement in functional outcome with alteplase 3–4.5 hours after stroke (adjusted OR 1.26, 95% CI 1.05 to 1.51)27 despite combining data from a meta-analysis of the same trials except the Third International Stroke Trial (IST-3) (adjusted OR 1.34 with 1620 patients)28 and the IST-3 (adjusted OR 0.73 with 1177 patients).29 It is possible that differences might be explained by selective outcome reporting or analytical approach but the data are not easily available for independent and comprehensive analysis. Our findings with the ECASS III trial data provide further evidence of a need for comprehensive independent reanalysis using all available data informing use of alteplase for acute ischaemic stroke.
Concerns for certainty in reported analyses are not limited to use of alteplase 3–4.5 hours after stroke. Among the multiple randomised trials assessing alteplase for thrombolysis after acute stroke only two have been reported to support significant claims of efficacy for primary functional outcomes: ECASS III and NINDS.1 30 The NINDS trial has also been criticised with attention to baseline imbalances (including stroke severity),31 32 and the full influence of baseline confounding may not have been thoroughly and independently analysed and reported.33
This reanalysis confirms reports of concern that the baseline imbalances introduce such a risk of bias that conclusions of efficacy based on ECASS III data cannot be considered reliable.13 26 34 Clinicians, patients, policymakers, systematic review authors, clinical practice guideline developers and drug regulators should reconsider interpretations and decisions regarding management of acute ischaemic stroke that were based on ECASS III results.
The authors thank Boehringer Ingelheim for access to the study data, for multiple communications to clarify the data handling and protocols used, and for the commitment to support transparency and free exchange of scientific discussion.
Contributors BSA, MMM and EM were part of the original team which published a call for independent analysis of the trial data for alteplase. BSA, LT, ARG, MMM and EM contributed to the protocol for ECASS III trial reanalysis. BSA negotiated the Data Sharing Agreement. GF conducted the statistical analysis. LT supervised the statistical analysis. All authors edited the manuscript. BSA is the guarantor of the study.
Funding GF and LT were supported in part by a charitable contribution from EBSCO Information Services.
Disclaimer EBSCO Information Services had no role in the design, conduct, reporting or decision to publish for this research.
Patient consent for publication Not required.
Provenance and peer review Not commissioned; externally peer reviewed.
Data availability statement Data may be obtained from a third party and are not publicly available. The authors are unable to share the data as it was obtained through a Data Sharing Agreement with the study sponsor through ClinicalStudyDataRequest.com.