A systematic literature review of researchers’ and healthcare professionals’ attitudes towards the secondary use and sharing of health administrative and clinical trial data

Abstract
A systematic literature review of researchers’ and healthcare professionals’ attitudes towards the secondary use and sharing of health administrative and clinical trial data was conducted using electronic data searching. Eligible articles included those reporting qualitative or quantitative original research and published in English. No restrictions were placed on publication dates, study design, or disease setting. Two authors were involved in all stages of the review process; conflicts were resolved by consensus. Data were extracted independently using a pre-piloted data extraction template. Quality and bias were assessed using the QualSyst criteria for qualitative studies. Eighteen eligible articles were identified, and articles were categorised into four key themes: barriers, facilitators, access, and ownership; 14 subthemes were identified. While respondents were generally supportive of data sharing, concerns were expressed about access to data, data storage infrastructure, and consent. Perceptions of data ownership and acknowledgement, trust, and policy frameworks influenced sharing practice, as did age, discipline, professional focus, and world region. Younger researchers were less willing to share data, but were willing to share in circumstances where they were acknowledged. While there is a general consensus that increased data sharing in health is beneficial to the wider scientific community, substantial barriers remain.
Systematic review registration: PROSPERO CRD42018110559

advancing new therapeutics, and developing improved supportive care interventions. However, clinical trials are expensive and can take several years to complete; a frequently quoted figure is that it takes 17 years for 14% of clinical research to benefit the patient [2,3].
Those who argue for increased data sharing in healthcare suggest that it may lead to improved treatment decisions based on all available information [4,5], improved identification of causes and clinical manifestations of disease [6], and increased research transparency [7]. In rare diseases, secondary data analysis may greatly accelerate the medical community's understanding of the disease's pathology and influence treatment.
Internationally, there are signs of movement towards greater transparency, particularly with regard to clinical research data. This change has been driven by governments [8], peak bodies [9], and clinician-led initiatives [5]. One initiative led by the International Committee of Medical Journal Editors (ICMJE) now requires a data sharing plan for all clinical research submitted for publication in a member scientific journal [9]. Further, international examples of data sharing can be seen in projects such as The Cancer Genome Atlas (TCGA) [10] dataset and the Surveillance, Epidemiology, and End Results (SEER) [11] database, which have been used extensively for cancer research.
However, consent, data ownership, privacy, intellectual property rights, and potential for misinterpretation of data [12] remain areas of concern to individuals who are more circumspect about changing the data sharing norm. To date, there has been no published synthesis of views on data sharing from the perspectives of diverse professional stakeholders. Thus, we conducted a systematic review of the literature on the views of researchers and healthcare professionals regarding the sharing of health data.

Methods
This systematic literature review was part of a larger review of articles addressing data sharing, undertaken in accordance with the PRISMA statement for systematic reviews and meta-analyses [13]. The protocol was prospectively registered on PROSPERO (www.crd.york.ac.uk/PROSPERO, CRD42018110559).
The following databases were searched: EMBASE/MEDLINE, Cochrane Library, PubMed, CINAHL, Informit Health Collection, PROSPERO Database of Systematic Reviews, PsycINFO, and ProQuest. The final search was conducted on 21 October 2018. No date restrictions were placed on the search; key search terms are listed in Table 1. Papers were considered eligible if they: were published in English; were published in a peer-reviewed journal; reported original research, either qualitative or quantitative with any study design, related to data sharing in any disease setting; and included subjects over 18 years of age. Systematic literature reviews were included in the wider search but were not included in the results. Reference list and hand searching were undertaken to identify additional papers. Papers were considered ineligible if they focused on electronic health records, biobanking, or personal health records or were review articles, opinion pieces/articles/letters, editorials, or theses from masters or doctoral research. Duplicates were removed, and title and abstract screening and full-text screening were undertaken using the Cochrane systematic literature review program Covidence [14]. Two authors were involved in all stages of the review process; conflicts were resolved by consensus.
Quality and bias were assessed at a study level using the QualSyst system for quantitative and qualitative studies as described by Kmet et al. [15]. A maximum score of 20 is assigned to articles of high quality and low bias; the final QualSyst score is a proportion of the total, with a possible score ranging from 0.0 to 1.0 [15].
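The proportion described above can be illustrated with a short sketch. It assumes, for illustration only and not as a restatement of the QualSyst manual, that each checklist item is scored 2 (yes), 1 (partial), or 0 (no), and that items treated as not applicable are left out of the denominator. The function name `qualsyst_summary_score` and the example scores are hypothetical.

```python
from typing import Optional, Sequence


def qualsyst_summary_score(item_scores: Sequence[Optional[int]]) -> float:
    """Return a summary score as a proportion between 0.0 and 1.0.

    Assumption: each item is scored 0, 1, or 2; None marks an item
    treated as not applicable and excluded from the denominator.
    """
    applicable = [s for s in item_scores if s is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    if any(s not in (0, 1, 2) for s in applicable):
        raise ValueError("item scores must be 0, 1, or 2")
    # Each applicable item contributes at most 2 points to the total.
    return sum(applicable) / (2 * len(applicable))


# Hypothetical ten-item checklist (maximum total of 20); not data from the review.
example = [2, 2, 2, 2, 1, 1, 1, 1, 1, 1]
print(qualsyst_summary_score(example))  # 14 / 20
```

With ten applicable items the denominator is 20, which matches the maximum score quoted above.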
Data extraction was undertaken using a pre-piloted form in Microsoft Office Excel. Data points included author, country and year of study, study design and methodology, health setting, and key themes and results. Where available, detailed information on research participants was extracted including age, sex, clinical/academic employment setting, publication and grant history, career stage, and world region.
Quantitative data were summarised using descriptive statistics. Synthesis of qualitative findings used a meta-ethnographic approach, in accordance with guidelines from Lockwood et al. [16]. The main themes of each qualitative study were first identified and then combined into categories of commonality. Using a constant comparative approach, higher order themes and subthemes were developed. Quantitative data relevant to each theme were then incorporated. Using a framework analysis approach as described by Gale et al. [17], the perspectives of different professional groups (researchers, healthcare professionals, data custodians, and ethics committees) towards data sharing were identified. Where differences occurred, they are highlighted in the results. Similarly, where systematic differences occurred according to other characteristics (such as age or years of experience), these are highlighted.

Table 1 Key search criteria
(data sharing) OR (data link*) OR (secondary data analysis) OR (data reuse) OR (data mining)
AND
(real world data) OR (clinical trial) (medical record*) OR (patient record*) OR (routine data) OR (administrative data)
AND
attitud* OR view* OR opinion* OR perspective* OR satisfaction
AND
(breast cancer) OR (breast neoplasm) OR (breast tumo*) OR (Carcinoma, breast)
AND/OR
patient* OR consumer*
AND/OR
doctor* OR clinician OR oncologist OR specialist
AND/OR
Researcher* OR scientist* OR 'data custodian'
*Search includes 'wildcards' or truncation if relevant

Results
This search identified 4019 articles, of which 241 underwent full-text screening; 73 articles met the inclusion criteria for the larger review. Five systematic literature reviews were excluded as was one article which presented duplicate results; this left a total of 67 articles eligible for review. See Fig. 1 for the PRISMA diagram describing study screening.
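The screening counts reported above can be cross-checked with simple arithmetic. The sketch below is illustrative: the function name `screening_flow` and the dictionary keys are hypothetical, and the two intermediate counts it derives are calculated from the reported totals rather than taken from the text.

```python
def screening_flow(identified, full_text, met_inclusion, excluded_after_inclusion):
    """Derive counts at each screening stage from the reported totals."""
    if not identified >= full_text >= met_inclusion >= 0:
        raise ValueError("counts must not increase between stages")
    eligible = met_inclusion - sum(excluded_after_inclusion.values())
    if eligible < 0:
        raise ValueError("exclusions exceed the number of included articles")
    return {
        # Derived: records that did not progress to full-text screening.
        "not_progressed_to_full_text": identified - full_text,
        # Derived: records excluded at full-text screening.
        "excluded_at_full_text": full_text - met_inclusion,
        # Reported in the text as the number eligible for review.
        "eligible": eligible,
    }


flow = screening_flow(
    identified=4019,
    full_text=241,
    met_inclusion=73,
    excluded_after_inclusion={"systematic_reviews": 5, "duplicate_results": 1},
)
print(flow["eligible"])  # 73 - 5 - 1
```

The final value agrees with the 67 articles stated as eligible for review.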
This systematic literature review was originally developed to identify attitudes towards secondary use and sharing of health administrative and clinical trial data in breast cancer. However, as there was a paucity of material identified specifically related to this group, we present the multidisciplinary results of this search, and where possible highlight results specific to breast cancer, and cancer more generally. We believe that the material identified in this search is relevant and reflective of the wider attitudes towards data sharing within the scientific and medical communities and can be used to inform data sharing strategies in breast cancer.

Study quality
Results of the quality assessment are provided in Table 2. QualSyst [15] scores ranged from 0.7 to 1.0 (possible range 0.0 to 1.0). While none were blinded studies, most provided clear information on respondent selection, data analysis methods, and justifiable study design and methodology.

Themes
Four key themes (barriers, facilitators, access, and ownership) and 14 subthemes were identified. A graphical representation of article themes is presented in Fig. 2. Two articles reflect the perspective of research ethics committees [23] and data custodians [27]; concerns noted by these groups are similar to those highlighted by researchers and healthcare professionals.

Barriers and facilitators
Reasons for not sharing
Eleven articles identified barriers to data sharing [20,22,24,25,27,29-34]. Concerns cited by respondents included other researchers taking their results [24,25], having data misinterpreted or misattributed [24,27,31,32], loss of opportunities to maximise intellectual property [24,25,27], and loss of publication opportunities [24,25] or funding [25]. Results of a qualitative study showed respondents emphasised the competitive value of research data and its capacity to advance an individual's career [20] and the potential for competitive disadvantage with data sharing [22]. Systemic issues related to increased data sharing were noted in several articles where it was suggested the barriers are 'deeply rooted in the practices and culture of the research process as well as the researchers themselves' [33] (p. 1), and that scientific competition and a lack of incentive in academia to share data remain barriers to increased sharing [30].
Insufficient time, lack of funding, limited storage infrastructure, and lack of procedural standards were also noted as barriers [33]. Quantitative results indicated that some researchers did not have the right to make the data public or were not required to share by the study sponsor [33]. Maintaining the balance between investigator and funder interests and the protection of research subjects [31] were also cited as barriers. Concerns about privacy were noted in four articles [25,27,29,30]; one study indicated that clinical researchers were significantly more concerned with issues of privacy compared to scientific researchers [25]. The results of one qualitative study indicated that clinicians were more cautious than patients regarding the inclusion of personal information in a disease specific registry; the authors suggest this may be a result of potential for legal challenges in the setting of a lack of explicit consent and consistent guidelines [19]. Researchers, particularly clinical staff, indicated that they did not see sharing data in a repository as relevant to their work [29]. Lack of trust was also identified as a barrier to greater data sharing [32]. Rathi et al. identified that researchers were likely to withhold data if they mistrusted the intent of the researcher requesting the information [32]. Ethical, moral, and legal issues were other potential barriers cited [19,22]. In one quantitative study, 74% of respondents (N = 317) indicated that ensuring appropriate data use was a concern; other concerns included data not being appropriate for the requested purpose [32]. Concerns about data quality were also cited as a barrier to data reuse; some respondents suggested that there was a perceived negative association of data reuse among health scientists [30].

Reasons for sharing
Eleven articles [19-22, 24, 25, 29-33] discussed the reasons identified by researchers and healthcare professionals for sharing health data; broadly, the principle of data sharing was seen as a desirable norm [25,31]. Cited benefits included improvements to the delivery of care [19], contributing to the advancement of science [20,24,29], validating scientific outputs, reducing duplication of scientific effort and minimising research costs [20], and promoting open science [31,32]. Professional reasons for sharing data included academic benefit and recognition, networking and collaborative opportunities [20,24,29,31], and contributing to the visibility of their research [24]. Several articles noted the potential of shared data for enabling faster access to a wider pool of patients [21] for research, improved access to population data for longitudinal studies [22], and increased responsiveness to public health needs [20]. In one study, a small percentage of respondents indicated that there were no benefits from sharing their data [24]. Analysis of quantitative survey data indicated that the perceived usefulness of data was most strongly associated with reuse intention [30]. The lack of access to data generated by other researchers or institutions was seen as a major impediment to progress in science [33]. In a second study, quantitative data showed no significant differences in reasons for sharing by clinical trialists' academic productivity, geographic location, trial funding source or size, or the journal in which the results were published [32]. Attitudes towards sharing in order to receive academic benefits or recognition differed significantly based on the respondent's geographic location; those from Western Europe were more willing to share compared to respondents in the USA or Canada, and the rest of the world [32].

Summary of study findings (table fragment)

birth, and postcode of mother etc.); 47% concerned about external access to the billing or the health records; 50%: external access to identifiable records was either the reason for requiring consent or an important factor. Several noted the fact that researchers would be going through the record itself, which, by nature, is identifying.

Sites that stated it depends (n = 3, 10%), reasons for: Whether or not consent would be required hinged entirely on the potential for indirectly identifying individuals from the combination of full postcode with ethnicity or date of birth. If this information was essential, then consent would be required. If not, or if truncated postcode or age category were used, consent would not be required. No respondents were concerned about external access to records.

Sites not requiring (n = 10, 38%), reasons for: 70% cited minimal risk, nature of the research, as the rationale for not requiring consent. Deemed minimal risk because of either lack of direct contact with individuals or anonymity of the data being extracted from the health record. 40% had a policy not to require consent for research involving retrospective chart review; 20% indicated that their provincial body specifically permitted release of personal information without consent for research purposes if they believe that the researcher will protect the patient's identity.

Quantitative studies

5 (100%) clinical compared to 2 (13%) of the scientific researchers indicated privacy was a concern.

Repositories: 27% and 24% rated uploading to data repositories as 'very highly' or 'highly' relevant to their work, respectively, but experience levels were low.

Scientific staff: Relevance of sharing data in a repository more highly ranked than their expertise in doing so. More likely to consider sharing data in a repository relevant to their work. The odds of having HIGH relevance in the scientific group are 5.75 times larger than in the clinical group. The odds of having HIGH expertise in this task in the scientific group are also greater than in the clinical group.

Clinical staff: Relevance of sharing data in a repository more highly ranked than their expertise in

Predictors of sharing and norms: The effects of social norm (β = 0.339; p < 0.01) and attitude (β = 0.331; p < 0.01) were relatively higher than other factors. Perceived usefulness and perceived concern were found to have indirect effects on intention of data reuse through attitude. A positive social norm towards data reuse positively supports researchers' data reuse intention.

Reasons for sharing: The perceived usefulness is found to be the strongest indicator that is indirectly influential to reuse intention.

Reasons for not sharing: Negative association with data reuse practice among health scientists. Legal issues relating to privacy, cultural barriers, and technical challenges were cited. Must comply with laws, regulations, and

No significant differences in reasons for withholding data between respondents categorised by trialists' academic productivity and geographic location, trial funding source and size, and the journal in which it was published.

Reasons for sharing: 78%, promotion of open science. No significant differences in reasons for sharing data between respondents by trialists' academic productivity and geographic location, trial funding source and size, and the journal in which it was published. An exception was whether respondents have or would share data from their published study in order to receive academic benefits or recognition, which differed by geographic location (p < 0.001): 58% of respondents in Western Europe responded affirmatively, compared to 31% in the US or Canada and 43% in the rest of the world (ROW).

Reasons for not sharing: Rates of overall concern ranged between 67 and 84%. No significant differences in overall concern about sharing data through repositories between respondents by trialists' academic productivity and geographic location, trial funding source and size, and the journal in which it was published. 74% identified ensuring appropriate data use (65%) as a reason for withholding data from their published study. Concerns included data not appropriate for the requested purpose, and the potential for misinterpretation and misleading secondary analyses. 69% indicated that paying for the costs of data does not include the right to use that data or that they do not believe that data users should be required to pay data creators.

Curation: 59.8% (agree strongly or somewhat) are satisfied with cataloguing or describing their data. 45% and 73% are satisfied with the process of storing data beyond the life of the project compared to short term, respectively. 35% of the respondents stated that they are dissatisfied with the long-term storage process. 46% do not make their data electronically available to others. <6% of scientists make 'all' of their data available via some mechanism, which tends to reinforce the lack of data sharing within the communities surveyed.

Differences by age, discipline, professional focus and world region: Not all scientists share data equally or have the same perceptions of data sharing and

0.8

Younger more likely to think lack of access to data is a major impediment to progress in science and has restricted their ability to answer scientific questions.

Discipline: Majority shared data with others, but respondents from medical fields and social sciences were less likely to make their data electronically available.

Professional focus: 74% and 79% of research-intensive respondents and teaching-intensive respondents showed willingness to place some data into a central data repository with no restrictions, and willingness to share across a broad group of researchers who use data in different ways, 77% and 83% respectively.

World region: Non-N. American/non-European respondents more likely to think that lack of access to data is a major impediment to progress in science (Other = 79%, Europe = 72%, and N. America = 64%) and has restricted their ability to answer scientific questions (Other = 63%, Europe = 55%, and N. America = 47%). 'Other' parts of the world are most willing to place all of their data into a central data repository with no restrictions (53%); more likely to make their data available if they could place conditions on access (73%); and the most satisfied with their ability to integrate data from disparate sources to address research questions (58%).

Experience of sharing: Nearly one third of the respondents chose not to answer whether they make their data

No significant differences across subject disciplines when it came to perceived risks associated with data sharing.

World region: Asia: felt more strongly about data access as an important part of their own scientific pursuits; however, agreed more strongly than those from other geographic regions that permission was needed to access data. N. American: more wary of possible misuse of shared data. Were also less likely than Asian respondents to agree that conditions for use of their data were fair.

Views on sharing: Increased acceptance of and willingness to engage in data sharing. More agreement and willingness among scientists to share at least some or all of their data across broader groups with no limitations. Education and medicine/health science were more inclined to agree that they do not have the right to make their data available in the first place.

Views on sharing
Seven articles [19-21, 29, 31, 33, 34] discussed researchers' and healthcare professionals' views relating to sharing data, with a broad range of views noted. Two articles, both qualitative, discussed the role of national registries [21], and data repositories [31]. Generally, there was clear support for national research registers and acceptance of their rationale [21], and some respondents believed that sharing de-identified data through data repositories should be required and that when requested, investigators should share data [31]. Sharing de-identified data for reasons beyond academic and public health benefit was cited as a concern [20]. Two quantitative studies noted a proportion of researchers who believed that data should not be made available [33,34]. Researchers also expressed differences in how shared data should be managed; the requirement for data to be 'gate-kept' was preferred by some, while others were happy to relinquish control of their data once curated or on release [20]. Quantitative results indicated that scientists were significantly more likely to rank data reuse as highly relevant to their work than clinicians [29], but not all scientists shared data equally or had the same views about data sharing or reuse [33]. Some respondents argued that not all data were equal and therefore should only be shared in certain circumstances. This was in direct contrast to other respondents who suggested that all data should be shared, all of the time [20].

Table 3 Studies by country
Country study undertaken (in alphabetical order): Reference
Australia: [22]
Canada: [23]
England and Northern Ireland: [19]
Germany: [28]
Japan: [18]
Multiple: [26,27,31-34]
Scotland: [21]
Sub-Saharan Africa: [20,24]
United States of America (USA): [25, 29, 30,
Differences by age, background, discipline, professional focus, and world region
Differences in attitudes towards shared data were noted by age, professional focus, and world region [25,27,33,34]. Younger researchers, aged between 20-39 and 40-49 years, were less likely to share their data with others (39% and 38% respectively) compared to other age groups; respondents over 50 years of age were more willing (46%) to share [33]. Interestingly, while less willing to share, younger researchers also believed that the lack of access to data was a major impediment to science and their research [33]. Where younger researchers were able to place conditions on access to their data, rates of willingness to share were increased [33].
Respondents from the disciplines of education, medicine/health science, and psychology were more inclined than others to agree that their data should not be available for others to use in the first place [34]. Results from another study indicated that researchers from the medical field and social sciences were less likely to share compared to other disciplines [33]. For example, compared to biologists, who reported sharing 85% of their data, respondents from the medical and social sciences reported sharing their data 65% and 58% of the time, respectively [33].
One of the primary reasons for controlling access to data, identified in a study of data custodians, was a desire to avoid data misuse; this was cited as a factor for all surveyed data repositories except those of an interdisciplinary nature [27]. Limiting access to certain types of research and ensuring attribution were not listed as concerns for sociology, humanities, or interdisciplinary data collections [27]. Issues pertaining to privacy and sensitive data were only cited as concerns for data collections related to humanities, social sciences, and biology, ecology, and chemistry; concerns regarding intellectual property were also noted [27]. The disciplines of biology, ecology, and chemistry and social sciences had the most policy restrictions on the use of data held in their repositories [27].
Differences in data sharing practices were also noted by world region. Respondents from outside North America and Europe were more willing to place their data on a central repository; however, they were also more likely to place conditions on the reuse of their data [33,34].

Experience of data sharing
The experience of data sharing among researchers was discussed in nine articles [20,24-26,28-33]. Data sharing arrangements were highly individual and ranged from ad hoc and informal processes to formal procedures enforced by institutional policies in the form of contractual agreements, with respondents indicating data sharing behaviour ranging from sharing no data to sharing all data [20,26,31]. Quantitative data from one study showed that researchers were more inclined to share data prior to publication with people that they knew compared to those they did not; post publication, these figures were similar between groups [24]. While many researchers were prepared to share data, results of a survey identified a preference of researchers to collect data themselves, followed by their team, or by close colleagues [26].
Differences between the stated rate of data sharing and the actual rate of sharing [25] were noted. In a large quantitative study (N = 1329), nearly one third of respondents chose not to answer whether they make their data available to others; of those who responded to the question, 46% reported they do not make their data electronically available to others [33]. By discipline, the rate of refusal to share was higher in chemistry compared to non-science disciplines such as sociology [25]. Respondents who were more academically productive (> 25 articles over the past 3 years) reported that they have or would withhold data to protect research subjects less frequently than those who were less academically productive or received industry funding [32].
Attitudes to sharing de-identified data via data repositories were discussed in two articles [29,31]. A majority of respondents in one study indicated that de-identified data should be shared via a repository and that it should be shared when requested. A lack of experience in uploading data to repositories was noted as a barrier [29]. When data were shared, most researchers included additional materials to support their data, such as metadata or a protocol description [29].
Two articles [28,30] focused on processes and variables associated with sharing. Factors such as norms, data infrastructure/organisational support, and research communities were identified as important factors in a researcher's attitude towards data sharing [28,30]. A moderate correlation between data reuse and data sharing suggests that these two variables are not strongly linked. Furthermore, sharing data and self-reported data reuse were also only moderately associated (Pearson's correlation of 0.25, p ≤ 0.001) [26].
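To put the reported Pearson correlation of 0.25 in context, the squared coefficient gives the proportion of variance that two variables share under a simple linear relationship. The following is a minimal sketch of that calculation; the function name `shared_variance` is hypothetical, and the interpretation is a general statistical one rather than an analysis reported in [26].

```python
def shared_variance(r: float) -> float:
    """Return r squared, the proportion of variance shared by two variables."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


r_reported = 0.25  # Pearson's correlation quoted in the text
print(f"{shared_variance(r_reported):.2%}")  # prints 6.25%
```

On this reading, self-reported sharing accounts for only a small share of the variation in self-reported reuse, even though the association is statistically significant.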

Predictors of data sharing and norms
Two articles [26,30] discussed the role of social norms and an individual's willingness to share health data. Perceived efficacy and efficiency of data reuse were strong predictors of data sharing [26] and the development of a 'positive social norm towards data sharing support(s)[ed] researcher data reuse intention' [30] (p. 400).

Policy framework
The establishment of clear policies and procedures to support data sharing was highlighted in two articles [22,28]. The presence of ambiguous data sharing policies was noted as a major limitation, particularly in primary care and with the increased adoption of health informatics systems [22]. Policies that support an efficient exchange system allowing for the maximum amount of data sharing are preferred and may include incentives such as formal recognition and financial reimbursement; a framework for this is proposed in Fecher et al. [28].

Research funding
The requirement to share data funded by public monies was discussed in one article [25]. Some cases were reported of researchers refusing to share data funded by tax-payer funds; reasons for refusal included a potential reduction in future funding or publishing opportunities [25].

Access and ownership
Articles relating to access and ownership were grouped together and seven subthemes were identified.

Access, information systems, and metadata
Ten articles [19-22, 26, 27, 29, 33-35] discussed the themes of access, information systems, and the use of metadata. Ensuring privacy protections in a prospective manner was seen as important for data held in registries [19]. In the setting of mental health, researchers indicated that patients should have more choices for controlling access to shared registry data [35]. The use of guardianship committees [19] or gate-keepers [20] was seen by some respondents as important in ensuring the security of, and access to, data held in registries; however, many suggested that a researcher should relinquish control of the data collection once curated or released, unless embargoed [20]. Reasons for maintaining control over registry data included ensuring attribution, restricting commercial research, protecting sensitive (non-personal) information, and limiting certain types of research [27]. Concerns about security and confidentiality were noted as important and assurances about these needed to be provided; accountability and transparency mechanisms also need to be included [21]. Many respondents did not consider access to the registry data by pharmaceutical companies and marketing agencies to be appropriate [19].
Respondents to a survey from medicine and social sciences were less likely to agree to have all data included on a central repository with no restrictions [33]; notably, this was also reflected in the results of qualitative research which indicated that health professionals were more cautious than patients about the inclusion of personal data within a disease specific register [19].
While many researchers stated that they commonly shared data directly with other researchers, most did not have experience with uploading data to repositories [29]. Results from a survey indicated that younger respondents reported more data access restrictions and were significantly more likely than older respondents to think that their data were easy to access [34]. In the primary care setting, concerns were noted about the potential for practitioners to block patient involvement in a registry by refusing access to a patient's personal data or by not giving permission for the data to be extracted from their clinical system [21]. There was also resistance in primary care towards health data amalgamation undertaken for an unspecified purpose [22]; respondents were not in favour of systems which included unwanted functionality (do not want/need), inadequate attributes (capability and receptivity) of the practice, or undesirable impact on the role of the general practitioner (autonomy, status, control, and workflow) [22].
Access to 'comprehensive metadata (is needed) to support the correct interpretation of the data' [26] (p. 4) at a later stage. When additional materials were shared, most researchers shared contextualising information or a description of the experimental protocol [29]. The use of metadata standards was not universal, with some respondents using their own [33].

Curation
Several articles highlighted the impact of data curation on researchers' time [20-22, 29, 33] or finances [24,28,29,33,34]; these were seen as potential barriers to increased registry adoption [21]. Tasks required for curation included preparing data for dissemination in a usable format and uploading data to repositories. The importance of ensuring that the data is accurately preserved for future reuse was highlighted; it must be presented in a retrievable and auditable manner [20]. The amount of time required to curate data ranged from 'no additional time' to 'greater than ten hours' [29]. In one study, no clinical respondent had their data in a sharable format [29]. In the primary care setting, health information systems which promote sharing were not seen as being beneficial if they required standardisation of processes and/or sharing of clinical notes [22]. Further, spending time on non-medical issues in a time-poor environment [22] was identified as a barrier. Six articles described the provision of funding or technical support to ensure data storage, maintenance, and the ability to provide access to data when requested. All noted a lack of funding and time as a barrier to increased data sharing [20,24,28,29,33,34].

Consent
Results of qualitative research indicated a range of views regarding consent mechanisms for future data use [18-20, 23, 35]. Consenting for future research can be complex given that the exact nature of the study will be unknown, and therefore some respondents suggested that a broad statement on future data uses be included [19,20] during the consent process. In contrast, other participants indicated that the current consent processes were too broad and did not reflect patient preferences sufficiently [35]. The importance of respecting the original consent in all future research was noted [20]. It was suggested that seeking additional consent for future data use may discourage participation in the original study [20]. Differences in views regarding the provision of detailed information about sharing individual-level data were noted, suggesting that the researchers wanted to exert some control over data they had collected [20]. An opt-out consent process was considered appropriate in some situations [18] but not all; some respondents suggested that consent to use a patient's medical records was not required [18]. Some researchers supported providing patients with the option to 'opt-in' to different levels of involvement in a registry setting [19]. Providing patients with more granular choices when controlling access to their medical data [35] was seen as important.
The attitudes of ethics and review boards (N = 30) towards the use of medical records for research were discussed in one article [23]. While 38% indicated that no further consent would be required, 47% required participant consent, and 10% said that the requirement for consent would depend on how the potentially identifying variables would be managed [23]. External researcher access to medical record data was associated with a requirement for consent [23].

Acknowledgement
The importance of establishing mechanisms which acknowledge the use of shared data was discussed in four articles [27,29,33,34]. A significant proportion of respondents to a survey believed it was fair to use other researchers' data if they acknowledged the originator and the funding body in all disseminated work or as a formal citation in published works [33]. Other mechanisms for acknowledging the data originator included opportunities to collaborate on the project, reciprocal data sharing agreements, allowing the originator to review or comment on results, but not approve derivative works, or the provision of a list of products making use of the data and co-authorship [33,34]. In the setting of controlled data collections, survey results indicated that ensuring attribution was a motivator for controlled access [27]. Over half of respondents in one survey believed it was fair to disseminate results based in whole or in part on shared data without the data provider's approval [33]. No significant differences in mechanisms for acknowledgement were noted between clinical and scientific participants; mechanisms included co-authorship, recognition in the acknowledgement section of publications, and citation in the bibliography [29]. No universally agreed method for acknowledging shared data reuse was identified [29].

Ownership
Data ownership was identified as a potential barrier to increased data sharing in academic research [28]. In the setting of controlled data collections, survey respondents indicated that they wanted to maintain some control over the dataset, which suggests that researchers perceive ownership of their research data [28]. Examples of researchers asserting ownership over their data include the right to publish first and the control of access to datasets [28]. Fecher et al. noted that the idea of data ownership by the researcher is not a position always supported legally; 'the ownership and rights of use, privacy, contractual consent and copyright' are subsumed [28] (p. 15). Rather, data sharing is restricted by privacy law, which is applied to datasets containing data from individuals. The legal uncertainty about data ownership and the complexity of law can deter data sharing [28].

Promotion/professional criteria
The role of data sharing and its relation to promotion and professional criteria was discussed in two articles [24,28]. The requirement to share data is rarely a promotion or professional criterion; rather, the systems are based on grants and publication history [24,28]. One study noted that while the traditional link between publication history and promotion remains, it is 'likely that funders will continue to get sub-optimal returns on their investments, and that data will continue to be inefficiently utilised and disseminated' [24] (p. 49).

Discussion
This systematic literature review highlights the ongoing complexity associated with increasing data sharing across the sciences. No additional literature meeting the inclusion criteria was identified in the period between the data search and the submission of this manuscript. Data gaps identified include a paucity of information specifically related to the attitudes of breast cancer researchers and health professionals towards the secondary use and sharing of health administrative and clinical trial data.
While the majority of respondents believed the principles of data sharing were sound, significant barriers remain: issues of consent, privacy, information security, and ownership were key themes throughout the literature. Data ownership and acknowledgement, trust, and policy frameworks influenced sharing practice, as did age, discipline, professional focus, and world region.
Addressing concerns of privacy, trust, and information security in a technologically changing and challenging landscape is complex. Ensuring the balance between privacy and sharing data for the greater good will require the formation of policies and procedures that promote both these ideals.
Establishing clear consent mechanisms would provide greater clarity for all parties involved in the data sharing debate. Ensuring that appropriate consent for future research, including secondary data analysis and sharing and linking of datasets, is gained at the point of data collection, would continue to promote research transparency and provide healthcare professionals and researchers with knowledge that an individual is aware that their data may be used for other research purposes. The establishment of policy which supports and promotes the secondary use of data and data sharing will assist in the normalisation of this type of health research. With the increased promotion of data sharing and secondary data analysis as an established tool in health research, over time barriers to its use, including perceptions of ownership and concerns regarding privacy and consent, will decrease.
The importance of establishing clear and formal processes associated with acknowledging the use of shared data has been underscored in the results presented. Initiatives such as the Bioresource Research Impact Factor/Framework (BRIF) [36] and the Citation of BioResources in journal Articles (CoBRA) [37] have sought to formalise the process. However, increased academic recognition of sharing data for secondary analysis requires further development and the allocation of funding to ensure that collected data is in a usable, searchable, and retrievable format. Further, there needs to be a shift away from the traditional criteria of academic promotion, which include research outputs, to criteria which are inclusive of a researcher's data sharing history and the availability of their research dataset for secondary analysis.
The capacity to identify and use already collected data was identified as a barrier. Moves to make data findable, accessible, interoperable, and reusable (FAIR) have been promoted as a means to encourage greater accessibility to data in a systematic way [38]. The FAIR principles focus on data characteristics and should be interpreted alongside the collective benefit, authority to control, responsibility, and ethics (CARE) principles established by the Global Indigenous Data Alliance (GIDA), which are people and purpose orientated [39].

Limitations
The papers included in this study were limited to those indexed on major databases. Some literature on this topic may have been excluded if it was not identified during the grey literature and hand searching phases.

Implications
Results of this systematic literature review indicate that while there is broad agreement with the principles of data sharing in medical research, there remain disagreements about the infrastructure and procedures associated with the data sharing process. Additional work is therefore required on areas such as acknowledgement, curation, and data ownership.

Conclusion
While the literature confirms that there is overall support for data sharing in medical and scientific research, there remain significant barriers to its uptake. These include concerns about privacy, consent, information security, and data ownership.