A study protocol of qualitative data sharing practices in clinical trials in the UK and Ireland: towards the production of good practice guidance

Background: Data sharing enables researchers to conduct novel research with previously collected data sets, thus maximising scientific findings and cost effectiveness, and reducing research waste. The value of sharing anonymised data from clinical trials is well recognised, with a moderated access approach recommended. While substantial challenges to data sharing remain, there are additional challenges for qualitative data. Qualitative data, including videos, interviews, and observations, are often more readily identifiable than quantitative data. Existing guidance from the UK Economic and Social Research Council applies to sharing qualitative data but does not address the additional challenges related to sharing qualitative data collected within trials, including the need to incorporate the necessary information and consent into already complex recruitment processes, with the additional sensitive nature of health-related data.
Methods: Work package 1 will involve separate focus group interviews with members of each stakeholder group: trial managers, clinical trialists, qualitative researchers, members of research funding bodies and trial participants who have been involved in qualitative research. Data will be analysed using thematic analysis and managed within QSR NVivo to enhance transparency. Work package 2 will involve a documentary analysis of current consent procedures for qualitative data collected as part of the conduct of clinical trials. We will include documents such as participant information leaflets and consent forms for the qualitative components in trials. We will extract data such as whether specific clauses for data sharing are included in the consent form. Content analysis will be used to analyse whether and how consent is being obtained for qualitative data sharing.
Conclusions: This study will provide insight into the existing practice of sharing of qualitative data in clinical trials and the current issues and opportunities, to help shape future research and development of guidance to encourage maximum learning to be gained from this valuable data.


Introduction
Data sharing enables researchers to conduct novel research with previously collected data sets, thus maximising scientific findings and cost effectiveness, and reducing research waste (DuBois et al., 2018). The value of sharing data in clinical trials is well recognised, with a moderated (controlled) access approach recommended (Sydes et al., 2015), and guidelines exist for how this should be done (Keerie et al., 2018; Institute of Medicine, 2015; Ohmann et al., 2017). While substantial challenges to data sharing remain, there are additional challenges for qualitative data (NASEM, 2020). This is particularly evident in the context of the General Data Protection Regulation (GDPR) in Europe and its implementation in the Data Protection Act (2018), where an individual's right to the privacy of their personal data is paramount. There are specific challenges in sharing qualitative data, including videos, interviews, and observations, which are often more readily identifiable than quantitative data. Protecting privacy therefore becomes challenging, specifically with regard to pseudonymisation, which has been identified as a major barrier to data sharing (Aitken et al., 2016; Ruggiano & Perry, 2019). Pseudonymisation can be described as a technique that replaces or removes information in a data set that identifies an individual (Mourby et al., 2018). Furthermore, while the public may assume or expect their quantitative data to be shared, they may not be clear about or comfortable with sharing their qualitative data (Aitken et al., 2016).
It is well recognised that qualitative components of trials are valuable for developing further research hypotheses, gathering complementary information to help answer research questions in depth, and helping to explain findings (Rapport et al., 2013). In addition, qualitative research in trials can be particularly helpful in developing and evaluating complex interventions as it provides valuable insights into issues experienced by potential participants (Rapport et al., 2013). However, there is little guidance on how to approach sharing this type of data. Guidance from the UK Economic and Social Research Council applies to sharing qualitative data generally, but qualitative research in trials faces additional challenges, including the need to incorporate the necessary information and consent into already complex recruitment processes, with the additional sensitive nature of health-related data. Consequently, investigators lack proper guidance on how to comply with data sharing guidelines in a way that provides adequate anonymity protections (Tsai et al., 2016).
To this end, the aim of the project is to explore whether and how trial teams share qualitative data collected as part of the design, conduct or delivery of clinical trials. This project will provide the foundation for further methodological work and the future production of guidance by exploring potential challenges and opportunities when considering qualitative data sharing in trials.

Methods
Work package 1
Study design. This study will employ a qualitative descriptive approach using thematic analysis of data (Braun & Clarke, 2006). This approach explores general beliefs and views that expose the experiences described by target populations (Al Dandan et al., 2019). The perspectives and beliefs of participants will be gathered using semi-structured focus group interviews via Zoom. The interview guide and all participant documents can be found as extended data (Houghton et al., 2021).

Research team roles and prior experience. The research team was established through the Health Research Board-Trials Methodology Research Network (HRB-TMRN) and MRC-NIHR Trials Methodology Research Partnership (TMRP) with researchers who share a common interest in qualitative data sharing in trials. Participants will be informed that the research is funded by the HRB-TMRN in collaboration with the MRC-NIHR TMRP and the purpose of the study is clearly stated in the participant information leaflets (Houghton et al., 2021). All focus groups will be moderated and analysed by CH, LB and MD with assistance from MMC. CH, LB and MD have extensive experience in qualitative methodologies, qualitative evidence synthesis and qualitative data analysis. KG, NR and JW will provide methodological expertise and advise on how to recruit qualitative researchers in trials in the UK. CG will advise on existing guidance for data sharing in clinical trials and implications for qualitative data. ET has expertise in health research transparency and trial methodology and will provide input on all aspects of the project. KMS will provide guidance and support in the scoping activities to identify existing practices in trials. MS will provide advice on how best to access the UK trial community and will assist with the dissemination of the project findings. VB will advise on the development of the focus group interview guide for trial participants and will advise on recruitment for the focus group interview.

Amendments from Version 1
The protocol has been revised and improved following the constructive feedback from both peer reviewers. We have explained why participants should no longer be enrolled in a clinical trial to be eligible to participate in our focus groups. We have explained in more detail in the rigour section about how this project will control for bias. We have made other minor corrections and clarifications to the manuscript, in particular with regard to the coding and data analysis process. A more detailed breakdown of these revisions can be found in the responses to reviewers.
Any further responses from the reviewers can be found at the end of the article.

Sampling and recruitment. We will use a maximum variation approach to sampling to capture the views and experiences of different stakeholder groups across the UK and Ireland (Patton et al., 2008). We will conduct four focus groups of 6-8 key stakeholders each from the UK and Ireland to explore their perspectives of qualitative data sharing in clinical trials. This will equal approximately 32 interviewees. The sample size for this study will follow guidance from Vasileiou et al. (2018) regarding 'data adequacy', whereby we aim to have sufficient data for meaningful analysis capturing the perspectives of those involved in qualitative research in clinical trials. In the context of this study, "data adequacy" appears more suitable than "data saturation" in the process of decision-making regarding sample sizes, given that the concept of data saturation is often poorly defined within qualitative studies (Sebele-Mpofu, 2020). Separate focus group interviews will be conducted with members of each stakeholder group: trial managers, clinical trialists, qualitative researchers, members of research funding bodies and trial participants who have been involved in qualitative research.
The aim is to recruit participants who have had experience of qualitative data in clinical trials either as researchers, participants in trials or as funders reviewing grant applications for clinical trials. We will recruit participants by contacting the UK Clinical Research Collaboration (UKCRC) and Irish Clinical Research Facilities (CRF), and through the Health Research Board Trials Methodology Research Network (HRB TMRN) and MRC-NIHR TMRP networks. We will ask them to circulate a recruitment email through their respective mailing lists and social media channels.
Once prospective participants express an interest in engaging with the study, they will be sent the participant information leaflet (see extended data) and will be required to complete an online consent form prior to their focus group (see extended data). Participants will be informed of the interview procedures and the recordings at least one week in advance of the research study. Participants will be provided with contact information for the research assistant (MMC) if they have any questions in advance, and it will be emphasised that participation in the research study is completely voluntary. Informed consent will be obtained prior to any data collection. Participants will also be required to consent to the use of recordings. Participation in focus groups will only commence once informed consent from all participants is received. Due to the minimally invasive nature of this study, the participants' familiarity with research ethics and the research process, and the fact that participants can withdraw at any stage of the focus group, it is not anticipated that the study will cause any discomfort or distress. However, if participants do become distressed during the course of the interviews, a distress protocol (see extended data) will be implemented by the interviewers. While participants can withdraw at any point during the focus group, upon focus group completion it will not be possible to withdraw an individual's data due to the group format of the recording.

Inclusion criteria:
• Aged 18 years or over
• Any of the following:
  • Trial managers, clinical trialists, and qualitative researchers who have experience of qualitative research in clinical trials.
  • People working with trial funding agencies who have experience of reviewing grant applications for clinical trials with qualitative components.
  • People who have participated in a completed clinical trial where data has been collected using qualitative methods.

In comparison to individual interviews, which aim to explore individual attitudes, beliefs and feelings, focus groups elicit a multiplicity of views and use the process of interactions between participants to generate additional ideas and clarifications (George, 2013). Focus groups enable researchers to obtain a large amount of information within a short period of time.
This study will employ online methods of data collection as face-to-face contact is not possible due to current coronavirus disease 2019 (COVID-19) restrictions, and virtual focus groups offer the opportunity to bring people together who are not geographically co-located. Focus group interviews will be conducted virtually using a secure Zoom video conferencing account and will be audio-visually recorded. In addition, field notes will be taken during the focus groups as they maintain contextual details and non-verbal expressions for data analysis and interpretation (Houghton et al., 2013;Tong et al., 2007). Focus groups are expected to be one-off, and approximately one hour in duration. Participants will be made aware that focus group data cannot be withdrawn once the interview is finished but they do not have to answer particular questions if not comfortable doing so.
We have developed a semi-structured interview guide (see extended data) that explores perspectives of sharing qualitative data, the potential benefits and challenges of doing so, and recommendations for what guidance is needed to support those involved in sharing qualitative data. A semi-structured topic guide consists of open-ended questions and will help the researcher to remain flexible by adapting questions and elaborating on ideas (Willig, 2013). This method will help generate rich data as it allows participants to build on one another's statements and comments (Guest et al., 2013; Tong et al., 2007).

Data protection. As focus group data will be collected through online methods, procedures for data collection and processing will follow the six principles of the European General Data Protection Regulation (GDPR) and the Irish Data Protection Act, 2018. For focus groups, participants are asked to maintain confidentiality within the group and will sign consent to this prior to commencing the group discussion. Online focus groups will be conducted via a secure Zoom account, moderated by a member of the research team, and recorded.
When transcription is completed, audio visual recordings will be destroyed and only the anonymised transcripts with the pseudonyms will be retained. In accordance with NUIG Policy, transcripts, field notes and documents will be retained for a minimum of seven years. In accordance with GDPR 2018 and NUIG Personal Data Security Schedule (PDSS), electronic records will be held on the NUI Galway OneDrive server accessed through a password protected, encrypted laptop belonging to the lead researcher. All responses will remain confidential and individual names will not be directly linked to individual responses at any time during or after the study. All interviewee responses will be pseudonymised prior to reporting of the results. Pseudonymised findings may be shared with members of the wider project research team to assist with and/or inform data analysis; however, all identifiable information will be removed and individual responses will not be reported.
Only members of the research team based in NUIG will have access to the raw data collected. The raw data will not be shared with anyone else, with the exception of a professional transcription company, which will be employed under strict data confidentiality agreements to transcribe the interviews.
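As an illustration only, the pseudonymisation step described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the study's procedure: the function name `pseudonymise`, the `P1`, `P2` label scheme and the example names are all hypothetical. Simple name replacement also does not remove indirect identifiers (places, job titles, rare events), which are part of what makes qualitative data readily identifiable.

```python
from __future__ import annotations

import re


def pseudonymise(transcript: str, names: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each known participant name with a stable label (P1, P2, ...).

    Hypothetical helper for illustration. Returns the redacted text and the
    name-to-label key, which would need to be stored separately and securely.
    """
    key = {name: f"P{i}" for i, name in enumerate(names, start=1)}
    redacted = transcript
    # Replace longer names first so a short name cannot partially match a longer one.
    for name in sorted(names, key=len, reverse=True):
        pattern = re.compile(rf"\b{re.escape(name)}\b", flags=re.IGNORECASE)
        redacted = pattern.sub(key[name], redacted)
    return redacted, key


# Invented example text; not study data.
text = "Aoife: I agree with Liam. LIAM: Thanks, Aoife."
redacted, key = pseudonymise(text, ["Aoife", "Liam"])
print(redacted)
```

Keeping the key separate from the redacted transcript is what distinguishes pseudonymised data from anonymised data: re-identification remains possible for anyone who holds the key.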
Topic guides. The interview topic guide and questions were developed by members of the research team by reviewing existing literature regarding the sharing of qualitative data in clinical trials. The interview questions were based on the principles of developing semi-structured interviews in qualitative research and are therefore intended as a broad guide. CH, LB and MD, who will conduct the focus groups, are experienced qualitative researchers and are comfortable with participant-driven conversations. Additional probes will be used to explore certain areas of interest in more depth. The interview guides for each stakeholder group are available as extended data.
Data analysis. Data from focus groups will be analysed using thematic analysis (Braun & Clarke, 2006). Thematic analysis is an inductive approach to analysis, going beyond description into interpretation and linked to telling a coherent story about what is going on in the data (Clarke & Braun, 2018, p106). Thematic analysis will be carried out in line with the six key steps outlined by Braun and Clarke.
Step one: Familiarisation. The first step involves the researcher(s) becoming familiar with and engaging with the data by reading and re-reading transcripts.
Step two: Coding. This step begins once the researcher is familiar with the data and involves generating initial codes across the data set.
Step three: Generating themes. The search for themes will begin, along with visualising how different codes may combine to form an overarching theme.
Step four: Reviewing themes. Step four will involve reviewing potential themes, questioning the boundaries of each theme and judging whether there is sufficient data to support it.
Step five: Defining and naming themes. Clear definitions and names will be established for each theme in this stage.
Step six: Writing up. This step involves producing the final report which is achieved by weaving together the themes in a logical and meaningful manner (Braun & Clarke, 2006).
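The bookkeeping behind steps two to four can be sketched in code, although the analysis itself is interpretive and will be managed in NVivo. The sketch below is purely illustrative: the codes, theme names, excerpts and the `min_excerpts` threshold are hypothetical and are not taken from the study. It groups coded excerpts under candidate themes and flags themes with little supporting data for review.

```python
from collections import defaultdict

# Hypothetical coded excerpts: (participant label, code, excerpt text).
coded_excerpts = [
    ("P1", "identifiability", "A video can't really be anonymised."),
    ("P2", "identifiability", "People could recognise my story."),
    ("P3", "consent wording", "The form never mentioned sharing."),
    ("P1", "reuse value", "Others could learn from these interviews."),
]

# Hypothetical mapping from codes to candidate themes (step three).
code_to_theme = {
    "identifiability": "Privacy concerns",
    "consent wording": "Consent processes",
    "reuse value": "Perceived benefits",
}


def group_by_theme(excerpts, mapping):
    """Collect excerpts under the candidate theme that their code maps to."""
    themes = defaultdict(list)
    for participant, code, text in excerpts:
        themes[mapping[code]].append((participant, text))
    return dict(themes)


def thinly_supported(themes, min_excerpts=2):
    """Return themes with fewer than min_excerpts excerpts (a prompt for step four)."""
    return sorted(t for t, items in themes.items() if len(items) < min_excerpts)


themes = group_by_theme(coded_excerpts, code_to_theme)
print(thinly_supported(themes))
```

A count of this kind is only a prompt for discussion; whether a theme is adequately supported remains a judgement for the research team.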

Work package 2
Study design. This study will also employ a documentary analysis of current (i.e. past 5 years) consent procedures for qualitative data collected as part of the conduct of clinical trials.

Data collection. We will contact trial managers and individual researchers involved in using qualitative data in trials, through the UKCRC, Irish Clinical Research Facilities (CRF), and the HRB TMRN and MRC-NIHR TMRP networks, to explore current consent procedures for qualitative data collected as part of the conduct of trials. We will develop a tailored data extraction form to extract data such as whether specific clauses for data sharing are included in the consent form. CH and MMC will review documents including participant information leaflets and consent forms for the qualitative components in trials. We aim to capture consent procedures for approximately forty qualitative studies in trials.

Data analysis. We will use a tailored extraction form to extract data such as whether specific clauses for data sharing are included in the consent form. We will use content analysis to analyse whether and how consent is being obtained for qualitative data sharing (Elo & Kyngäs, 2008). Content analysis is a valuable method for analysing qualitative material and seeks to analyse data in view of the meanings someone attributes to them (Krippendorff, 2018). This will provide baseline information on the prevalence of qualitative data sharing as well as the strategies being employed to do so. We will also analyse the purpose for which sharing of qualitative data is being requested, for example, within a group of trials, or for broader open science purposes.
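To make the extraction and prevalence steps concrete, the following is a minimal sketch of how extracted consent data might be recorded and summarised. It is an assumption for illustration rather than the study's extraction form: the `ConsentRecord` fields, the `summarise` function and the four example records are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConsentRecord:
    """One row of a hypothetical extraction form for a trial's qualitative component."""
    study_id: str
    has_sharing_clause: bool
    sharing_purpose: Optional[str] = None  # None when no sharing clause is present


def summarise(records):
    """Return the proportion of studies with a sharing clause and a tally of purposes."""
    with_clause = [r for r in records if r.has_sharing_clause]
    proportion = len(with_clause) / len(records) if records else 0.0
    purposes = Counter(r.sharing_purpose for r in with_clause)
    return proportion, purposes


# Invented records for illustration; not study results.
records = [
    ConsentRecord("S01", True, "within a group of trials"),
    ConsentRecord("S02", False),
    ConsentRecord("S03", True, "open science"),
    ConsentRecord("S04", True, "open science"),
]
proportion, purposes = summarise(records)
print(f"{proportion:.0%} of studies include a sharing clause")
print(purposes.most_common())
```

Recording the purpose as free text keeps the form open to categories that only emerge during content analysis; a fixed list of purposes could be substituted once the categories are settled.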
The findings from both the focus group interviews and the documentary analysis of current consent procedures will provide an insight into what is happening currently, the challenges and opportunities, and what is needed in terms of best practice guideline development for sharing qualitative data in trials.

Rigour. The research team will agree the coding and theme development from the qualitative phase to ensure that the data is represented sufficiently in the developed themes, which will minimise researcher bias. The analysis will be conducted by CH, supported by MD and LB, and managed within QSR NVivo version 12 to provide a transparent audit trail of the decisions made through the analysis (Houghton et al., 2013). The coding will be an iterative process, and CH along with other members of the research team including MD and LB will move between the transcripts and the nodes created. MD and LB will also offer feedback on interpretations of the data and encourage reflexivity. We will utilise the query tools in NVivo to enable us to ask 'questions' of the analysis to minimise any potential researcher bias (Houghton et al., 2013). A codebook will be created within QSR NVivo to exhibit the reliability and credibility of our findings. It is important to note that the raw transcript materials generated during the study will be confidential. Only members of the research team will have access to the raw transcript materials.
Dissemination. The findings of both work packages will be presented to the TMRP Trial Conduct and Health Informatics working groups in a format of their preference, for example as part of the MRC-NIHR TMRP and the HRB-TMRN webinar series. We will also use the study findings to inform a consensus building workshop, following project completion, to identify model data sharing documents to act as an interim source of good practice guidance pending development of comprehensive guidance, and to make these available for the trials community via the MRC-NIHR TMRP Trial Conduct Qualitative Research in Trials (QRiT) target group. The consensus process will involve forming an advisory panel from the QRiT target group, thus providing an opportunity for members of this group to help drive this agenda and contribute their expertise in this area. This will also highlight the MRC-NIHR TMRP/HRB-TMRN work in this area, facilitating networking for subsequent activity. We will use the findings to inform further grant applications to develop more cohesive best practice guidance on sharing qualitative data collected in clinical trials. We will develop a plain language summary and disseminate the findings through social media and websites such as MRC-NIHR TMRP, HRB-TMRN and Qualitative Research in Trials Centre (QUESTS).

Conclusion
This study will provide insight into the existing practice of sharing of qualitative data in clinical trials and the current issues and opportunities, to help shape future research and development of guidance to encourage maximum learning to be gained from this valuable data.

Data availability
Underlying data
No data is associated with this article.

Spencer Phillips Hey
Center for Bioethics, Harvard Medical School, Boston, MA, USA
Data sharing is an increasingly important issue for trials, particularly as the volume of trials (and therefore the volume of trial data potentially available for secondary use) continues to grow at such an incredible pace. While I have encountered several studies examining the logistical and ethical challenges associated with sharing patient-level, quantitative data, this protocol is the first I've seen describing a plan to explore issues around sharing of qualitative data. I thus found this protocol paper to be a welcome development, and the proposed research strikes me as a valuable contribution aimed at addressing an important gap in knowledge.
However, similar to the previous reviewer, I do think there are some sections in this paper that need greater discussion and clarification.
Foremost among these (for me) is the subsection on Rigour and the thematic analysis: I think much more should be said about how this project will control for bias, and in particular, the possibility that the themes identified will reflect (or be strongly biased towards) the interest, concerns, or assumptions of the research team. For example, I note that the focus group interview guide (Appendix 5) includes prompts about potential benefits (question 4), challenges (question 5), and recommendations (question 6) for sharing qualitative data, but it does not include a question about potential harms. Perhaps "challenges" is intended to encompass the potential for harm, but it is not clear to me that it does and I would not interpret the word in that way. I'd thus strongly suggest adding a dedicated prompt about potential harms (e.g., to privacy or dignity) and being clear to distinguish this from the prompt about logistical/technical/social "challenges".
The one sentence in the protocol on this potential for bias currently reads: "The research team will agree [to?] the coding and theme development from the qualitative phase to ensure the data is represented sufficiently in the developed themes which will minimise researcher bias." I suspect there may be a grammatical error here, which I've tried to correct with my bracketed addition. But even the corrected sentence doesn't really tell me much about what will be done to avoid bias. In fact, since my concern is that the results may be biased toward the research team's interest, this sentence would only seem to confirm that this bias is "baked in" to the study plan.
Following on this, I worry also about the potential for bias arising from the sample size and sampling strategy. 32 interviewees strikes me as a rather small number of participants, particularly if this 32 will be a heterogeneous mix of five different stakeholders (trial managers, clinical trialists, qualitative researchers, members of research funding bodies, and trial participants who have been involved in qualitative research). I am not an expert in qualitative research, so this concern may be misplaced, but even looking up the methodological references provided in the protocol suggests that 6 interviews per group is on the low side to achieve "data adequacy" (e.g., Vasileiou et al. 2018). (Although I confess I found that Vasileiou paper somewhat difficult to decipher as a methodological guidance document, since it is itself a thematic analysis of qualitative research methods...) Similarly for the recruitment strategy: Leveraging research networks in the UK and Ireland makes complete sense from a pragmatic perspective. However, this strategy seems likely to miss individuals who may have had negative experiences with qualitative data sharing (e.g., because they may have specifically tried to disconnect themselves from established research institutions), and I think it would be critical to try and capture at least some of this perspective.
© 2021 Howe N. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Nicola Howe
Newcastle Clinical Trials Unit, Newcastle University, Newcastle upon Tyne, UK
Overall I found this protocol paper interesting and well thought out. The rationale and methods are clearly stated. The research (and guidance) gap around sharing of qualitative trial data and the contribution that this research will make, and how the results will be disseminated are all available in this protocol. However, there are several areas of clarification that would be useful for readers that the authors may like to expand upon in order that this research could be truly reproducible. I have listed these below for the authors' consideration.

Work package 1:
It might be useful to expand the point about the development of the interview topic guide. The authors mention that they reviewed existing literature regarding sharing of qualitative data in clinical trials, but did they model any of their questions on those in the existing literature, with modifications? If so, which existing literature was used? The authors could refer briefly to the papers here.
I understand the exclusion criteria, but it might be useful for the authors to state explicitly why participants should no longer be taking part in a clinical trial.
Can the authors expand upon/confirm that the coding in NVivo was data driven or theory driven? The authors don't mention using a framework to code, but can they confirm that they did not and that the coding was instinctive?
It might be useful for the authors to state which researcher will conduct the coding, or whether more than one researcher will do this. The authors do make clear that there will be team consensus on codes and themes afterwards to reduce bias but will there be consensus during coding?

Work package 2:
Could the tailored extraction form be provided in extended data?
Is the rationale for, and objectives of, the study clearly described? Yes
Are sufficient details of the methods provided to allow replication by others? Partly
Are the datasets clearly presented in a useable and accessible format? Not applicable
Competing Interests: Like Matthew Sydes, I am a member of the UKCRC Registered Trial Network participant data sharing task and finish group, although we have not collaborated on a paper.
Reviewer Expertise: clinical trials, data sharing, data management, participants' attitudes towards data sharing.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.