Abstract
Objectives Individual participant data (IPD) from randomised controlled trials (RCTs) can be used in network meta-analysis (NMA) to underpin patient care and are the best analyses to support the development of guidelines about the use of healthcare interventions for a specific condition. However, barriers to IPD retrieval pose a major threat. The aim of this study was to present barriers we encountered during retrieval of IPD from RCTs in two published systematic reviews with IPD-NMA.
Methods We evaluated retrieval of IPD from RCTs for IPD-NMA in Alzheimer’s dementia and type 1 diabetes. We requested IPD from authors, industry sponsors and data repositories, and recorded IPD retrieval, reasons for IPD unavailability, and retrieval challenges.
Results In total, we identified 108 RCTs: 78 industry sponsored, 11 publicly sponsored and 19 with no funding information. After failing to obtain IPD from any trial authors, we requested it from industry sponsors. Seven of the 17 industry sponsors shared IPD for 12 950 participants (59%) from 26 RCTs (33%) through proprietary-specific data sharing platforms. We found that lack of RCT identifiers (eg, National Clinical Trial number) and unclear data ownership were major challenges in IPD retrieval. Incomplete information in retrieved datasets was another important problem that led to exclusion of RCTs from the NMA. There were also practical challenges in obtaining IPD from platforms or analysing it within them, and additional costs were incurred in accessing IPD this way.
Conclusions We found no clear evidence of retrieval bias (where IPD availability was linked to trial findings) in either IPD-NMA, but because retrieval bias could impact NMA findings, subsequent decision-making and guideline development, this should be considered when assessing risk of bias in IPD syntheses.
- methods
- systematic reviews as topic
Data availability statement
All data relevant to the study are included in the article or uploaded as supplementary information.
WHAT IS ALREADY KNOWN ON THIS TOPIC
Well-conducted individual participant data (IPD) meta-analyses are considered a ‘gold standard’ in evidence synthesis and can be used to inform patient care. Analysis of IPD can facilitate more tailored decision making and the development of clinical guidelines.
Failure to retrieve IPD could bias IPD-network meta-analysis (NMA) findings and decision making, if availability is linked to trial results, and can occur for an entire study dataset or a part of it (eg, missing information on a treatment group or a specific outcome or a type of participant).
WHAT THIS STUDY ADDS
IPD availability in Alzheimer’s dementia and type 1 diabetes remained low; despite efforts to obtain IPD by contacting original authors, IPD were available only from study sponsors or data sharing platforms.
Lack of randomised controlled trial (RCT) identifiers (eg, National Clinical Trial number) in published papers and unclear data ownership, particularly when an RCT involves multiple sponsors, are considerable challenges in IPD retrieval.
In our examples, we found no clear evidence of IPD retrieval bias.
There were practical and analytical challenges in analysing IPD from data sharing platforms, including additional costs.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
IPD were predominantly available for trials published after 2005, the year the International Committee of Medical Journal Editors mandated the registration of clinical trials. Researchers planning to undertake an IPD-NMA should consider the feasibility of accessing IPD from older trials and whether restricting to newer trials would cause any detriment to, or possibly improve, their IPD-NMA.
Access to IPD can facilitate tailored decision making but retrieval bias could impact IPD-NMA findings, associated decision making and guideline development and should be considered when assessing risk of bias in IPD syntheses.
Introduction
Synthesis of individual participant data (IPD) from randomised controlled trials (RCTs) can provide best evidence for decision making, especially if brought together in a network meta-analysis (NMA). The use of IPD in NMA can strengthen evidence used for decision making, and is considered a ‘gold standard’ approach to evidence synthesis.1 2 Retrieval of IPD usually requires contacting RCT authors; if authors cannot be located, trial sponsors or data repositories might be contacted instead. NMAs bring together evidence from multiple intervention comparisons for a specific condition, allowing interventions to be ranked according to their efficacy, and are increasingly common,3 4 with some very large examples.5
Empirical evidence suggests that approximately 50% of RCTs in IPD meta-analyses are publicly sponsored and 10% are non-sponsored,6 and that the extent to which IPD can be retrieved may depend on RCT characteristics, such as funding source, study size, study quality and treatment effect.7 There is some evidence that IPD are more likely to be retrieved from large, multicentre, industry-sponsored RCTs compared with small, single-centre, publicly sponsored RCTs.8 9
Reluctance to share IPD is a barrier,10 which can result in evidence syntheses having to be abandoned due to lack of access to IPD.11 Only 25% of the IPD meta-analyses published up to the year 2015 retrieved 100% of the eligible IPD, and 43% retrieved data on no more than 80% of participants.2 6 IPD retrieval has been found to be greatest in systematic reviews with meta-analyses that have funding support, assess non-pharmacological interventions, include fewer studies and have researchers who were coauthors of at least one eligible RCT.12
A scoping review showed that 67% of IPD-NMAs published up to 2014 retrieved IPD through the establishment of a collaborative group that included researchers who contributed their study data.13 IPD-NMAs are more frequently industry sponsored compared with NMAs with aggregate data only or a combination of aggregate data and IPD. For example, in previous scoping reviews of IPD-NMAs and of NMAs irrespective of type of data, industry-sponsored studies were identified in 17 (46%) of the 37 IPD-NMAs and in 151 (13%) of the total 1144 NMAs with aggregate data only or in combination with IPD.13 14
Well-conducted IPD-NMAs are considered the optimal evidence synthesis since they can produce more reliable results through high-quality statistical analysis.1 15 16 However, results from IPD-NMAs may be subject to retrieval bias (also known as availability bias). If IPD availability is linked to study results or characteristics, then IPD-NMA findings may be biased and not be representative of the whole evidence base.17 For example, if IPD were provided only for those RCTs that showed benefit of newer interventions, but no IPD were provided for those that favoured standard care or older interventions, or those showing no clear difference between the two groups, then the IPD-NMA may overestimate the effectiveness of the newer interventions. In a network of interventions, retrieval bias from a single study can impact both direct and indirect intervention comparisons, since information flows via the network paths of evidence. Contribution of each study and direct comparison to the network estimate should be considered when retrieval bias is apparent.18 Greater understanding of these limitations and their potential impact on IPD-NMA findings (and thereby clinical decision making) is required, and we have evaluated retrieval of IPD from industry-sponsored RCTs eligible for inclusion in two IPD-NMAs.
Methods
We sought IPD from authors and industry sponsors of the RCTs included in two recent systematic reviews with IPD-NMAs. Two authors (ACT and LS) contacted original authors and another two authors (AAV and LS) contacted industry sponsors of the included RCTs through email to request IPD. We contacted all industry sponsors reported in each RCT publication, some of whom indicated that their data were available through data sharing platforms, such as Vivli (https://vivli.org/), Clinical Study Data Request (CSDR)19 and Yale University Open Data Access (YODA).20 These consortium platforms provide an independent system for accessing anonymised IPD from multiple sponsors. At the time of our initial contact with the sponsors (January 2017),7 thirteen pharmaceutical companies were part of the CSDR consortium and YODA was a partnership between Yale University and the Medtronic, Johnson & Johnson, and SI-BONE companies. The Vivli platform was not contacted at that time since it was launched in 2018.21
For the IPD requisition process, we followed Dillman’s methods,22 23 where we sent (1) an email to the corresponding author, followed by the next-in-order author as listed in each eligible RCT if the email address did not work, to obtain IPD, (2) four email reminders to the authors (ie, at 2, 6, 10 and 14 weeks after the first email), (3) a reminder letter in week 7 and (4) a telephone reminder in week 15. We offered coauthorship on our systematic review if authors provided their anonymised IPD and met the International Committee of Medical Journal Editors (ICMJE) authorship criteria for the underlying systematic review.24 We recorded funding as reported in each RCT, and confirmed it with the original author. We contacted industry funders after navigating the data sharing process through their websites, online portals, emails or phone inquiries. We sent two follow-up reminders to the funders if there was no initial response.
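The contact schedule described above can be laid out as dates. The following is only a sketch: it assumes that the week numbers for the letter and the telephone call are counted from the first email, and the start date used in the example is hypothetical.

```python
from datetime import date, timedelta

# Week offsets after the first email, as described in the text.
# Assumption: "week 7" and "week 15" are also counted from the first email.
REMINDER_PLAN = [
    (2, "email reminder 1"),
    (6, "email reminder 2"),
    (7, "reminder letter"),
    (10, "email reminder 3"),
    (14, "email reminder 4"),
    (15, "telephone reminder"),
]


def reminder_schedule(first_email: date):
    """Return (due date, action) pairs in chronological order."""
    return [(first_email + timedelta(weeks=w), action) for w, action in REMINDER_PLAN]


if __name__ == "__main__":
    # Hypothetical date of the first email, used only for illustration.
    for due, action in reminder_schedule(date(2017, 1, 9)):
        print(due.isoformat(), action)
```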
For each eligible study in each IPD-NMA, we extracted the year of publication, treatment effect (ie, mean difference (MD)) along with corresponding standard errors per treatment comparison for each primary outcome, study size and funding type (industry/mixed-sponsored, publicly sponsored, non-sponsored RCTs). For industry-sponsored RCTs, we also captured IPD availability. If IPD were unavailable, we recorded the sponsors’ reasoning for this (if provided), and summarised our challenges with retrieving IPD. We also summarised challenges once IPD were obtained from the sponsor or IPD repository using communication trackers and task logs.
We captured the overall IPD retrieval rate and explored the distribution of IPD availability per year and per absolute z-score of each study-specific treatment effect. We used the absolute z-score values across all treatment effects, irrespective of treatment comparison, to capture the magnitude of the treatment effect in relation to its precision. We generated forest plots of the average treatment effects and their 95% confidence intervals according to IPD availability for each treatment comparison, separately. We used the random effects model and the metagen function of the meta package in R25 to present the adjusted IPD (using all requested (and provided) covariates) against the aggregate data as reported in the publications. More details about the provided covariates and the IPD analyses are presented in the relevant systematic reviews.9 26 We assessed IPD acquisition over publication years, using the Cox and Stuart trend test and the trend library in R.27 We summarised the total number of RCTs with and without IPD for each treatment comparison. We compared the study-specific MD and associated SE between RCTs with available IPD and RCTs without IPD, as well as IPD retrieval for trials published before and after 2005.28 We compared the study size across different RCT funding types in box plots and assessed whether study size was associated with IPD availability.
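To make two of these quantities concrete, the sketch below computes an absolute z-score (MD divided by its SE) and a Cox and Stuart trend test. It is an illustration in Python rather than the R code used for the analyses: the test is written here as an exact two-sided sign test on paired halves of the series, which is an assumption and may not match the output of the trend library, and the yearly values in the example are hypothetical.

```python
from math import comb


def abs_z_score(md: float, se: float) -> float:
    """Absolute z-score: size of the mean difference relative to its standard error."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return abs(md / se)


def cox_stuart_p(series) -> float:
    """Two-sided Cox and Stuart trend test p-value for a time-ordered series.

    Each value in the first half is paired with the matching value in the
    second half; the signs of the differences are then tested against a
    binomial distribution with probability 0.5 (exact sign test).
    """
    n = len(series)
    half = n // 2
    first = series[:half]
    second = series[n - half:]  # drops the middle value when n is odd
    diffs = [b - a for a, b in zip(first, second) if b != a]  # ties are dropped
    m = len(diffs)
    if m == 0:
        return 1.0
    positives = sum(d > 0 for d in diffs)
    k = min(positives, m - positives)
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical yearly proportions of RCTs with IPD, for illustration only.
    yearly_share = [0.0, 0.5, 0.2, 0.4, 0.3, 0.6, 0.1, 0.5]
    print(abs_z_score(-1.5, 0.5))
    print(cox_stuart_p(yearly_share))
```

A large p-value from such a test would indicate no clear monotonic trend in IPD availability over publication years.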
Results
We explored IPD retrieval in two clinical fields: Alzheimer’s dementia and type 1 diabetes (see network plots in online supplemental appendix 1). We requested IPD from the authors of 125 RCT publications (96 for Alzheimer’s dementia and 29 for type 1 diabetes mellitus, including unique RCTs and identified companion reports), but none of the authors shared their IPD. We ended pursuit for a given study if its author did not respond after a fourth reminder. The time taken for authors to respond to our enquiry was between 0 and 117 days. Of the 125 RCT publications, 17 were companion reports and authors of both the main and companion reports were contacted to increase chances of IPD retrieval. Hence, we identified 108 unique RCTs; of these, 78 were funded by at least one industry sponsor, 11 were publicly sponsored and 19 did not report funding information. We then contacted industry sponsors only, as we were unable to locate contact information for public sponsors (eg, government grant and university research funders). Seventeen industry sponsors funded these 78 RCTs. Five sponsors had partnered with three data sharing platforms (CSDR, Vivli and YODA). Seven (41%) industry sponsors agreed to share their IPD through proprietary sponsor-specific platforms (ie, five sponsors through CSDR and YODA, and two sponsors through their specific platforms; one of the sponsors switched to the Vivli platform while we were navigating IPD). Overall, the 7 sponsors shared IPD for 26 of the 78 RCTs (33%) through proprietary-specific platforms.
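The counts and percentages in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the helper name pct is ours.

```python
def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Counts as reported in the text.
publications, companion_reports = 125, 17
unique_rcts = publications - companion_reports
industry, public, no_funding_info = 78, 11, 19
sponsors_total, sponsors_sharing = 17, 7
rcts_with_ipd = 26

# The funding categories should account for every unique RCT.
assert industry + public + no_funding_info == unique_rcts

print(unique_rcts)                            # unique RCTs
print(pct(sponsors_sharing, sponsors_total))  # % of sponsors sharing IPD
print(pct(rcts_with_ipd, industry))           # % of industry-sponsored RCTs with IPD
```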
Alzheimer’s dementia IPD-NMA
Overall, 80 trials (21 138 participants) were eligible for inclusion in the Alzheimer’s dementia NMA. Of these, 55 were industry sponsored (40 single industry sponsor, 15 multiple industry sponsors), 9 were publicly sponsored and 16 did not report funding information. We requested data from the 15 industry sponsors of the 55 RCTs. Six sponsors shared IPD for 14 RCTs (8007 participants; figure 1), facilitated by three data sharing platforms (CSDR for 4 RCTs, YODA for 5 RCTs, Vivli for 2 RCTs (the latter 2 RCTs were initially provided through the AbbVie platform)) and 1 industry sponsor (Lundbeck) for 3 RCTs. Of the remaining nine sponsors, five did not respond to any of our multiple emails, so we stopped pursuing them, and four declined to provide data.
Type 1 diabetes IPD-NMA
Overall, 28 RCTs (7428 participants) were eligible for inclusion in the type 1 diabetes mellitus NMA, of which 23 were industry sponsored, 2 were publicly sponsored and 3 did not report funding information. Of the 23 industry-sponsored RCTs, 22 were funded by a single industry sponsor and one was jointly funded by two industry sponsors. One sponsor shared IPD for 12 RCTs (4943 participants; figure 1).
Challenges with IPD and impact on analysis
In total, we retrieved IPD for 26 RCTs through the following data repositories: CSDR (Eisai (1 RCT), GlaxoSmithKline (1 RCT), Novartis (2 RCTs)),19 YODA (Janssen (5 RCTs)),20 Lundbeck (3 RCTs), Novo Nordisk (12 RCTs) and Vivli (AbbVie (2 RCTs)). We failed to retrieve IPD from 52 industry-sponsored RCTs (67%): 41 RCTs in Alzheimer’s dementia and 11 RCTs in type 1 diabetes (online supplemental appendix 2). The reasons for not providing IPD varied. The most frequent reason was difficulty with study identification (21 RCTs, 40%) because sponsors could not locate eligible studies without an RCT identifier (eg, National Clinical Trial (NCT) number (https://www.clinicaltrials.gov/); figure 2). Locating IPD was particularly challenging for RCTs published before the ICMJE requirement for prospective clinical trial registrations in 2005, since many of these did not have an NCT number or an International Standard Randomised Controlled Trial Number (https://www.isrctn.com/). For RCTs with multiple sponsors, the sponsors were unable to confirm which of them had full ownership of the IPD (11 RCTs, 21%). Other reasons for not sharing IPD included: age of RCT and IPD being no longer available (seven RCTs, 13%), publication language of RCT (one RCT, 2%) and lack of response from a study’s principal investigator who held provision rights (one RCT, 2%). No further explanation for not sharing IPD was given for nine RCTs (17%), and sponsors did not respond for three RCTs (6%).
We encountered several challenges after IPD were identified through the data repositories (figure 2). First, data sharing agreements were a prerequisite for all sponsors and data repositories providing access to IPD. Initial communication clarifying data sharing arrangements between all parties took between 0 and 24 days, and final approval of data sharing agreement ranged from 154 to 474 days after submitting the research proposal. A timeline for IPD retrieval across the sponsor process is presented in figure 3.7 Apart from the regular project costs associated with IPD acquisition, including administration, legal work, library staff and research staff, there were additional costs associated with accessing data through repositories. Some sponsors and data repositories required a license to access certain coding dictionaries. For example, some costs were associated with the WHO Drug Dictionary license to obtain access to the history of medications used for each participant. Two sponsors required this license as a prerequisite for IPD provision; the approximate cost for this was just under US$9000 per sponsor. The sponsors had initially used the dictionary to code their data; as such, we would have required the WHO Drug Dictionary to read and understand the IPD coding for the relevant studies. Second, some sponsors and data repositories permitted limited-time access to the IPD through their remote-access platforms (ranging from one to two years, on average). When this time frame passed, a cost was associated with retaining access to IPD in some cases (eg, US$25 per day for the AbbVie studies through Vivli). In other cases, we were able to extend access without a charge and data sharing agreement renewal. 
Third, shared IPD did not always include the full information collected in the RCT and reported in the associated publication (eg, only the placebo arm data were available, only baseline data were available with final outcome data missing, or date of follow-up was coded, making it impossible to determine first and last time-points). For example, we were able to include only 12 (6906 participants) of the 14 RCTs that provided data for our Alzheimer’s dementia IPD-NMA due to incomplete outcome data.9 Furthermore, at least one covariate that had been collected in the RCTs was missing from provided IPD in 10 RCTs (ie, six of 14 RCTs in Alzheimer’s dementia and four of 12 RCTs in type 1 diabetes). In such cases, we used aggregate data in our NMAs and performed two-stage IPD-NMA models to avoid excluding studies not providing or with incomplete IPD. Fourth, IPD were only available through proprietary sponsor-specific platforms per sponsor or data repository, which did not allow combination of all retrieved data in a single space to enable development of a single model (eg, for a one-stage NMA). Fifth, relevant software was not available in all platforms, or older versions were provided, or we were not allowed to instal any new software or routines within these platforms (eg, the R package mice29 was missing or an older version of the lme430 R package was provided). Sixth, some sponsors (ie, AbbVie and Janssen through YODA) switched platforms while we were navigating the data, which caused additional delays in completing our IPD-NMA.
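The two-stage idea mentioned above, which allows IPD and aggregate data to be combined when IPD sit on separate platforms, can be sketched as follows. This is a simplified illustration and not the models used in our IPD-NMAs: it handles a single pairwise comparison rather than a network, stage 1 is an unadjusted difference in means, the DerSimonian-Laird estimator for the between-study variance is an assumption, and all values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def stage_one(treated, control):
    """Stage 1: reduce one trial's IPD to a mean difference and its standard error."""
    md = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return md, se


def stage_two(estimates):
    """Stage 2: random-effects pooling of (md, se) pairs (DerSimonian-Laird)."""
    y = [md for md, _ in estimates]
    w = [1 / se ** 2 for _, se in estimates]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    w_re = [1 / (se ** 2 + tau2) for _, se in estimates]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    return pooled, sqrt(1 / sum(w_re)), tau2


if __name__ == "__main__":
    # Hypothetical trial with IPD, analysed inside its own platform.
    ipd_estimate = stage_one(treated=[2.0, 3.0, 4.0], control=[1.0, 2.0, 3.0])
    # Hypothetical trial without IPD: published mean difference and SE.
    aggregate_estimate = (1.0, 0.5)
    print(stage_two([ipd_estimate, aggregate_estimate]))
```

Because only the study-level estimates leave each platform, stage 2 can be run in one place even when the IPD themselves cannot be combined.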
Characteristics of retrieved IPD
There was no clear trend in the success of retrieving IPD by year of publication (p=0.68; figure 4A). In the Alzheimer’s dementia dataset, IPD were only available for trials published after 2005, when ICMJE mandated the registration of clinical trials (online supplemental appendix 3). Availability was irrespective of the ratio of treatment effect over its precision (ie, z-score) or statistical significance (online supplemental appendix 4). We visually explored the possibility of retrieval bias across treatment effects, and found no clear evidence of important differences in comparisons with and without IPD in the two diseases (online supplemental appendix 5).
Funding source did not play an important role in the treatment effect statistical significance, but industry-sponsored RCTs enrolled a higher number of participants compared with publicly sponsored RCTs or those with unclear funding (online supplemental appendices 6 and 7). Sponsors of larger studies were more likely to share their IPD compared with smaller studies (figure 4B, online supplemental appendix 8). In Alzheimer’s dementia, 16 RCTs included more than 250 participants, 11 of which shared their IPD and were published after 2005. Similarly, in type 1 diabetes, 14 studies included more than 250 participants, 11 of which shared their IPD, with 7/11 RCTs published after 2005. Larger studies with a small treatment effect were more likely to provide IPD (online supplemental appendix 9).
There was a tendency to obtain IPD in treatment comparisons informed by at least two RCTs (online supplemental appendix 10). Also, of the 13 treatment comparisons for which we retrieved IPD, 11 (85%) treatment comparisons were predominantly informed (>50%) by RCTs published after 2005.
Discussion
Access to IPD allowed us to overcome potential reporting bias in our two IPD-NMAs.9 26 In particular, we were able to include four RCTs for which data or entire outcomes were missing from the publications, and to assess potential effect modifiers that were not reported in the original publications (eg, comorbidities, history of medications) by investigating treatment-by-covariate interactions at the patient level. However, a large number of RCTs and a high proportion of IPD were not retrieved, although availability did not appear to be related to the individual study findings.
IPD can be unavailable for part of a study dataset (eg, missing information on an entire treatment group or a specific outcome) or for an entire study, and this can lead to the exclusion of the RCT from the analysis. Retrieval bias is potentially a major threat for any IPD-NMA, and could undermine its findings and use in decision making. Hence, we recommend that retrieval bias should be explored in future IPD-NMAs, particularly when IPD-NMA results are used to inform guidelines which will impact healthcare and research.16 Similarly, systematically missing covariates in retrieved IPD (with some covariates available in some RCTs but missing from the others, even if planned to be collected in the RCT protocol)31 should be considered in IPD-NMA. Retrieval bias can impact the network geometry, and hence network estimates, treatment ranking and validity of assumptions in NMA. The possibility of reporting bias in RCT publications included in NMAs should also be explored and there is an important role for trial registries in reducing the risk of reporting bias.32
Despite efforts to obtain IPD through contacting original authors, study sponsors or data sharing platforms, data availability for RCTs in Alzheimer’s dementia and type 1 diabetes was low. A limitation for both our IPD-NMAs is that we waited up to 117 days for authors to reply to our request for IPD, and up to 474 days for a data sharing agreement to be finalised after submitting the research proposal. These waiting times show that slow communication from sponsors and data repositories after the initial inquiry can mean that retrieving and accessing IPD for an IPD-NMA takes 1 or 2 years, which needs to be considered when planning such a review. Another limitation in our efforts to obtain IPD was that we were restricted to contacting only industry sponsors, due to the challenge of locating up-to-date contact information for public sponsors (eg, government grant and university research funders). Hence, our findings mainly represent IPD retrieval from industry-sponsored RCTs, which may be different in publicly sponsored studies. Also, small-study effects were evident in the RCTs for Alzheimer’s dementia, and this may have affected IPD availability.9
Locating the relevant RCTs and data ownership were major challenges in IPD retrieval. Importantly, for our Alzheimer’s dementia NMA, IPD were only available for trials published after 2005, the year of the ICMJE mandate, which increased the likelihood of an RCT being registered in a publicly accessible database. In total, we retrieved IPD for five RCTs (only for the type 1 diabetes NMA) published before 2005 and 21 RCTs published after 2005 (for both IPD-NMAs). Other factors may also play an important role in IPD retrieval, such as study size. Researchers planning to undertake an IPD-NMA should consider the feasibility of accessing IPD from older trials, especially when they are small in size, and whether restricting to the most recent trials would harm or benefit their IPD-NMA (eg, in applicability to current practice or in reducing the chance of retrieval (or availability) bias). Any decision to restrict the IPD-NMA to more recent trials should be made a priori, at the protocol stage, for transparency.
Our study highlights the problems of data repositories for sharing trial data for evidence syntheses. Accessing IPD via a repository is time-consuming and can take more than a year after an initial inquiry with a sponsor or data repository.7 More challenges can arise if IPD from different RCTs are not all available through the same repository/platform, and if they are available at different time points. Additional costs may be incurred in retaining access to IPD beyond an initial ‘free of charge’ window, in order to enable access to all trials across all platforms at the same time. Obtaining access to data coding dictionaries and IPD may also incur a cost. The use of different proprietary-specific platforms, multiple costs of doing so, inability to analyse all data in the same space (and model), and lack of software in data repositories should be considered when planning an IPD-NMA. Researchers should weigh the benefits and challenges of retrieving and using IPD at the project design/protocol stage, before embarking on an IPD meta-analysis, including NMA.
Ethics statements
Patient consent for publication
Ethics approval
Not applicable.
Acknowledgments
We thank the following sponsors for sharing the data with us for our two IPD-NMAs: “This publication is based on research using data from data contributors, AbbVie, Inc, that has been made available through Vivli, Inc. Vivli has not contributed to or approved, and is not in any way responsible for, the contents of this publication.” “This study, carried out under YODA Project #2017-1671, used data obtained from the Yale University Open Data Access Project, which has an agreement with JANSSEN RESEARCH & DEVELOPMENT, L.L.C. The interpretation and reporting of research using this data are solely the responsibility of the authors and does not necessarily represent the official views of the Yale University Open Data Access Project or JANSSEN RESEARCH & DEVELOPMENT, L.L.C.” This publication used data obtained from Eisai, GlaxoSmithKline, and Novartis carried under www.ClinicalStudyDataRequest.com. This publication used data obtained from Novo Nordisk through their online platform, as well as from Lundbeck.
Footnotes
Twitter @AVeroniki
Contributors AAV, ACT and SES conceived and designed the study. AAV abstracted data, contacted sponsors, analysed data, interpreted results, and wrote a draft manuscript. AAV is the guarantor of the study. SPCL contacted authors, sponsors and data sharing repositories, and edited the manuscript. LAS, MC, ACT, and SES provided input into the design, interpreted results, and edited the manuscript. All authors read and approved the final manuscript.
Funding SES is funded by a Tier 1 Canada Research Chair in Knowledge Translation. ACT is funded by a Tier 2 Canada Research Chair in Knowledge Synthesis.
Competing interests AAV is on the editorial board for the journal but was not involved with the peer review process or decision to publish.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.