A systematic literature review of health consumer attitudes towards secondary use and sharing of health administrative and clinical trial data: a focus on privacy, trust, and transparency
Systematic Reviews volume 9, Article number: 235 (2020)
We aimed to synthesise data on issues related to stakeholder perceptions of privacy, trust, and transparency in use of secondary data. A systematic literature review of healthcare consumer attitudes towards the secondary use and sharing of health administrative and clinical trial data was conducted. EMBASE/MEDLINE, Cochrane Library, PubMed, CINAHL, Informit Health Collection, PROSPERO Database of Systematic Reviews, PsycINFO, and ProQuest databases were searched. Eligible articles included those reporting qualitative or quantitative original research and published in English. No restrictions were placed on publication dates, study design or disease setting. One author screened articles for eligibility, and two authors were involved in the full text review process. Data was extracted using a pre-piloted data extraction template by one author and checked by another. Conflicts were resolved by consensus. Quality and bias were assessed using the QualSyst criteria for qualitative and quantitative studies. This paper focuses on a subset of 35 articles identified from the wider search which focus on issues of privacy, trust, and transparency. Studies included a total of 56,365 respondents. Results of this systematic literature review indicate that while respondents identified advantages in sharing health data, concerns relating to trust, transparency, and privacy remain. Organisations collecting health data and those who seek to share data or undertake secondary data analysis should continue to develop trust, transparency, and privacy with healthcare consumers through open dialogue and education. Consideration should be given to these issues at all stages of data collection including the conception, design, and implementation phases. While individuals understand the benefits of health data sharing for research purposes, ensuring a balance between public benefit and individual privacy is essential. 
Researchers and those undertaking secondary data analysis need to be cognisant of these key issues at all stages of their research. Systematic review registration: PROSPERO registration number CRD42018110559 (update June 2020).
Healthcare provides an opportune setting for increased data sharing and secondary data analysis. Secondary analysis of existing data originally collected for other purposes can provide insights into real-world clinical practice and generate new clinical evidence. Many forms of data are collected during an individual’s interactions with health services, including administrative and clinical trial data, which are the focus of this review. Administrative data are originally collected for administrative and billing purposes, but can also be used to identify systemic issues and service gaps and to inform improved health resourcing. Clinical trials are expensive and take approximately 17 years to complete, and less than 14% of the evidence is translated into practice. Given these low rates of translation into practice, the secondary use of these data arguably takes on greater importance. The secondary analysis of clinical trial data can further advance the medical community’s understanding of diseases and potentially limit the expenditure of funds on already tested hypotheses.
Increased access to data for secondary use is complex and continues to attract strong debate within the health and scientific communities as well as the general public. While researchers are now being encouraged to increase data accessibility for secondary research [6, 7], a range of stakeholder-perceived barriers and concerns remain, including issues such as trust, transparency, and privacy [8, 9]. Despite the impact of these issues on willingness to share data, there is a lack of synthesis of stakeholder views to guide policy and practice.
This paper presents the results of a subset of articles identified in our systematic literature review and focuses on healthcare consumer concerns relating to privacy, trust, and transparency in the setting of health administrative and clinical trial data reuse.
This systematic literature review presents the results of a subset of articles identified in a larger review of articles addressing data sharing and was undertaken in accordance with the PRISMA statement for systematic reviews and meta-analyses. The protocol was prospectively registered on PROSPERO (www.crd.york.ac.uk/PROSPERO, CRD42018110559; updated June 2020).
The following databases were searched: EMBASE/MEDLINE, Cochrane Library, PubMed, CINAHL, Informit Health Collection, PROSPERO Database of Systematic Reviews, PsycINFO, and ProQuest. The search was conducted on 24 June 2020. No date restrictions were placed on the search; key search terms are listed in Table 1.
Our original goal was to focus on attitudes towards data reuse by breast cancer patients. However, due to a paucity of studies targeting this group, we re-ran the search without this limitation; we present the results for all disease settings and note specific cases where breast cancer or any cancers were included. Breast cancer is a disease that impacts older individuals; therefore, respondents under the age of 18 years were excluded from this analysis, as were attitudes towards biobanking and genetic research.
We noted that the delineation between data collected for administrative purposes and other forms of electronic documentation, such as electronic health records (EHRs; or other terms for these), is becoming increasingly unclear. These records can contain both administrative and clinical data. Where possible, EHRs were excluded from this literature review; however, we acknowledge that the lack of separation has made this a grey area.
Papers were considered eligible if they were published in English in a peer-reviewed journal; reported original research, either qualitative or quantitative with any study design, related to data sharing in any disease setting; and included subjects over 18 years of age. Reference list checking and hand searching were undertaken to identify additional papers. Systematic literature reviews were included in the wider search but were not included in the results. Papers were considered ineligible if they focused on electronic health records (including other terms for these), health information exchanges, biobanking, or genetics, or if they were review articles, opinion pieces, articles, letters, editorials, or non-peer-reviewed theses from masters and doctoral research. Duplicates were removed, and title and abstract screening and full text screening were undertaken using the Cochrane systematic literature review programme ‘Covidence’. One author screened articles for eligibility and two authors were involved in the full text review process; conflicts were resolved by consensus.
Quality and bias were assessed at a study level using the QualSyst system described by Kmet et al.; this is a validated tool that can be used to assess both qualitative and quantitative studies. No modifications were made to the QualSyst criteria prior to use. Quality and bias assessment was undertaken independently by two authors; conflicts were resolved by consensus. A maximum score of 20 is assigned to articles of high quality and low bias; the final QualSyst score is a proportion of the total, with a possible score ranging from 0.0 to 1.0.
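The scoring arithmetic described above can be illustrated with a minimal sketch. It assumes each checklist item is scored 2 (criterion met), 1 (partially met), or 0 (not met), so that ten items give the maximum of 20 noted in the text. The function name and the exclusion of not-applicable items from the denominator are assumptions made for illustration, not details taken from this review.

```python
from typing import Optional, Sequence


def qualsyst_summary_score(item_scores: Sequence[Optional[int]]) -> float:
    """Return a summary score as a proportion between 0.0 and 1.0.

    Each item is 2 (yes), 1 (partial), or 0 (no). None marks an item
    treated as not applicable; it is dropped from both the total and the
    maximum (an assumption for this sketch).
    """
    applicable = [score for score in item_scores if score is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    if any(score not in (0, 1, 2) for score in applicable):
        raise ValueError("item scores must be 0, 1, or 2")
    # Maximum achievable is 2 points per applicable item.
    return sum(applicable) / (2 * len(applicable))


# Ten items totalling 15 of a possible 20 points.
example = [2, 2, 1, 1, 2, 0, 2, 2, 1, 2]
print(qualsyst_summary_score(example))  # 0.75
```

Expressing the score as a proportion allows studies assessed on checklists of different lengths to be compared on the same 0.0 to 1.0 scale.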
Data extraction was undertaken by one author using a pre-piloted form in Microsoft Office Excel; a second author confirmed the data extraction. Conflicts were resolved by consensus. Data points included author, country and year of study, study design and methodology, health setting, and key themes and results. Where available, detailed information on research participants was extracted including age, sex, employment status, highest level of education, and health status.
Quantitative data were summarised using descriptive statistics. Synthesis of qualitative findings used a meta-aggregative approach, in accordance with guidelines from Lockwood et al. The main themes of each qualitative study were first identified and then combined, if relevant, into categories of commonality. Using a constant comparative approach, higher-order themes and subthemes were developed. Quantitative data relevant to each theme were then incorporated. Using a framework analysis approach as described by Gale et al., the perspectives of different groups towards data sharing were identified. Where differences occurred, they are highlighted in the results. Similarly, where systematic differences according to other characteristics (such as age or sex) occurred, these are highlighted.
This search identified 10,499 articles, of which 323 underwent full text screening; 75 articles met the inclusion criteria for the larger review. The PRISMA diagram is presented in Fig. 1. This article presents a subset of the results of the wider search which explores attitudes of health consumers towards privacy, trust, and transparency. The results relating to attitudes towards data sharing and reuse by researchers and healthcare professionals, and attitudes towards consent in the context of data sharing and reuse by healthcare consumers are presented in subsequent publications.
A subset of 35 [15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49] of the 75 articles addressed issues relating to privacy [15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49], trust [16,17,18, 21, 23, 24, 26, 28,29,30, 32,33,34,35,36,37, 39,40,41,42, 44,45,46, 48], and/or transparency [15,16,17, 26, 30, 32, 33, 37, 40, 42,43,44, 48] and are included in this analysis (Fig. 1 and Table 2). A total of 56,365 respondents were included in the studies.
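As a quick check on the screening figures reported above, the proportion of articles retained at each stage can be computed directly. This is only an arithmetic sketch: the counts come from the text, while the stage labels and the function name are illustrative.

```python
# Counts reported in the text; the stage labels are illustrative.
SCREENING_STAGES = [
    ("records identified", 10_499),
    ("full-text screened", 323),
    ("included in wider review", 75),
    ("privacy/trust/transparency subset", 35),
]


def stage_retention(stages):
    """Return (stage label, proportion retained from the previous stage)."""
    return [
        (label, count / previous_count)
        for (_, previous_count), (label, count) in zip(stages, stages[1:])
    ]


for label, proportion in stage_retention(SCREENING_STAGES):
    print(f"{label}: {proportion:.1%}")
```

On these figures, roughly 3% of identified records reached full text screening, about 23% of those met the inclusion criteria for the wider review, and just under half of the included articles form the subset analysed here.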
Study design, location, clinical focus, and study populations
Qualitative research methodologies included face-to-face interviews and/or focus groups [32,33,34, 36,37,38, 49]. Other designs included surveys [16,17,18,19,20,21, 23,24,25,26,27,28,29, 35, 39, 41, 44] and combinations of deliberative sessions with surveys [15, 40, 45, 46] and focus groups and interviews . Two studies used a citizens’ jury model [47, 48] and another was a nested cohort within a randomised controlled trial . Studies were conducted in several countries; a breakdown by country is presented in Table 3.
Most articles focused on the general public’s attitudes towards secondary data usage, particularly in general medicine [18, 22, 25,26,27,28,29,30, 34, 37,38,39, 43,44,45, 48, 49], but also national cancer databases [31, 41], clinical trials [21, 32], fertility , pharmaco-epidemiological , and epidemiological  research. Other studies focused on health consumers’ attitudes to secondary data usage in individuals: attending US Veterans Affairs (VA) facilities  or recently discharged from tertiary care , or with arthritis and other chronic conditions . Others were in the setting of human immunodeficiency virus (HIV) , breast cancer (BC), colon cancer (CC) , or heterogeneous cancers [19, 20], acquired immune deficiency syndrome (AIDS), or multiple sclerosis (MS), or mental health concerns , presenting with rare diseases [16, 17], in adults or parents of children with cystic fibrosis (CF), sickle cell disease (SCD), or diabetes mellitus (DM) , or in adults with potentially stigmatising health conditions (DM, hypertension, chronic depression, alcoholism, HIV, BC, or lung cancer) .
The majority of articles discussed general attitudes towards health data linkage and secondary use [16, 22, 27, 30, 37, 39, 43, 46], linking health administrative data to clinical trial data  or clinical trial data reuse [21, 32], linking administrative data to survey data , access to medical records [15, 19, 23, 25, 26, 28, 29, 34, 35, 40, 45, 47, 48], statistical databases , research registries [17, 18, 31, 33, 36, 41], and health data for epidemiological research . Privacy as sociotechnical capital  and commercial access to health data  were considered in one article each.
Results of the quality assessment are provided in Table 2. QualSyst  scores ranged from 0.5 to 1.0 (possible range 0.0 to 1.0). While none were blinded studies, most provided clear information on respondent selection and data analysis methods and used justifiable study designs and methodologies. No key themes stood out for studies which received poorer judgements. No data were from randomised studies, with the highest level of evidence from a nested cohort study. Other data were obtained from lower-quality studies such as surveys and interviews.
A total of 12,794 respondents provided a view on trust; results were from surveys, questionnaires, focus groups, and interviews. One study was a nested cohort in a randomised controlled trial and two used a citizens’ jury model. Participants emphasised that organisations must develop, maintain, and promote high levels of patient trust [24, 26, 42]. This trust can be developed by maintaining confidential records and by providing information on how the individual’s information is used and by whom. The importance of trust in health organisations, clinicians, and university researchers was also noted [18, 26, 29, 40, 42], although generally respondents trusted that organisations would keep their data private and confidential and that this would not be intentionally violated. If a personal connection with the research team is established, then it is easier for individuals to form a trusting relationship. The highest levels of trust were placed in the doctor [16, 26, 46], the National Health Service (NHS), and hospitals [16, 46], while the lowest trust was in commercial organisations, pharmaceutical companies and insurance companies, or for-profit organisations. An individual’s trust in an organisation was a determinant of the level of control they preferred over their data and of their willingness to participate in research, with trust overcoming concerns about privacy and confidentiality. Where an organisation showed clear and relevant connections between its research and the information contained in the records, respondents trusted that the organisation would maintain the data appropriately. Ensuring researchers act in the patient’s best interest and clearly and transparently disclosing the research being undertaken also built trust. Respondents were generally trusting of the original research team and trusted that it would use their data appropriately.
In a study about the use of fertility data, many respondents believed that registry data was already used for research purposes thus showing an established trust in the clinic, hospital, and wider health institutions .
The ability to maintain data security, privacy, confidentiality, and accurate records, change or delete incorrect data, and ensure that data would not be used to discriminate against an individual, all contributed to levels of trust . Granting access to a small number of named individuals was not seen as a solution to resolving privacy concerns, as these individuals themselves may not be trustworthy . Any research undertaken using secondary data analysis must not undermine or compromise an individual’s trust in medical research . The level of respondents’ education influenced their view of trust, with a higher level of education being more trusting of their government and research institutions compared to those who finished their education earlier . In the setting of fertility, most respondents were willing to share their data, suggesting trust in the organisation and registry .
In contrast, the theme of distrust was noted in several articles representing a total of 6830 respondents and included data from questionnaires, surveys, focus groups, and a citizens’ jury. A general distrust in the health system, research, and sharing of health information [30, 36] was noted, with some respondents not trusting any organisation with their data  or the organisational capacity to maintain records appropriately . While there was a desire to support the use of anonymised health data for research purposes, concerns regarding trust in the systems and data security remained [34, 40]. The provision of information on the source of research funding [40, 42] and data management systems  can increase transparency and trust, but providing more information on data use does not necessarily increase public trust . In a study from the UK, some individuals with a ‘pessimistic dystopian’ mindset had limited trust in commercial organisations accessing health data, believing it would create new harms , with some suggesting that organisations may use the data inappropriately (exploit or manipulate individuals or populations or might manipulate the data to support their own agenda) . Access to information by pharmaceutical companies and insurance agencies had lower levels of support, suggesting a distrust in these organisations. Older respondents (≥ 65 years of age) showed less trust in these organisations compared to younger respondents (≤ 25 years of age) . Respondents who believed that data sharing had more negative than positive effects were more likely to have a college education . Generally, these respondents believe that people could not be trusted and were concerned about data reidentification and information theft . These low levels of trust were associated with a decreased willingness to share data with both for-profit and non-profit organisations alike . 
Sharing data with an ‘unknown’ researcher was also associated with distrust; further, some believed that the increased digitisation of healthcare would lead to a decrease in the traditional provision of care . In the setting of fertility, respondents’ levels of trust decreased given some respondents saw them as a business; it is essential that the information provided to people clearly state the purpose of data reuse and should note that it would not be used for purposes such as marketing . Lucero et al. noted that there is a psychological component to uncertainty and mistrust. This leads to a distrust in volunteering for research and the need for organisations and ethics review boards to engage with communities to build trust . To decrease distrust, respondents wanted to have face-to-face contact with researchers during a study’s recruitment process .
Privacy and confidentiality: differences according to demographic and health characteristics
A total of 44,366 respondents provide a view on privacy and confidentiality. Responses were obtained through surveys, deliberative workshops, dialogues and interviews, and questionnaires.
General concerns about privacy
Concerns about privacy and confidentiality were one reason for not sharing health data [16, 28, 29, 36, 37, 40, 44]. One study noted that the respondents’ concerns about privacy had increased over the past 5 years . Concerns about the sense of ‘big brother’ and the potential for data to be used to discriminate  were expressed, with some consumers expressing a belief in the natural right (not dependent on law or custom) to privacy . Where safeguards were in place to protect the data, most respondents in one study were willing to share their data, irrespective of the proposed data use [21, 32], except in the setting of litigation .
The inclusion of an individual’s postcode, name or address, and receiving a letter inviting them to participate in research from the cancer registry was not considered to be a breach of privacy . In a study of UK respondents, no substantial differences in privacy concerns were found according to sex or age; however, small but significant variations were noted by factors such as education, ethnicity, socioeconomic status, and an experience of cancer in the immediate family .
In other studies, the relationship between age and sex and concerns regarding trust and privacy were contradictory. Younger respondents expressed higher levels of trust in researchers and were more willing to let their data be used for research, but they also had high levels of privacy concerns . Conversely, other studies noted that older respondents were more likely to agree to data linkage , while respondents aged 18 to 19 years and over 60 years had lower levels of privacy concerns compared to other groups .
Levels of concern about privacy were also influenced by the respondent’s level of education and employment. Those with commercial or technical qualifications had more concerns regarding privacy compared to all other education groups, and those with a post-graduate degree had the fewest privacy concerns. One article considered privacy in the context of sociotechnical capital, composed of awareness of privacy, attitudes towards the importance of privacy and data sharing, and confidence in the ability to maintain privacy. Individuals with higher levels of education and income had higher rates of health privacy capital. In a second study, respondents who were employed in manual, routine, or intermediate work were more likely to share their data compared to those in professional roles. Respondents employed by a government organisation were more concerned about privacy. Respondents who did not respond to finance questions had lower rates of consent for data linkage. Other influences on privacy included social networks. In one study, rates of privacy concern differed between those who answered the survey online and those who answered via telephone; those who answered by telephone were less concerned about privacy. Some differences in privacy concerns were noted by country. A study of European respondents found that those based in Sweden, Slovenia, and Denmark were less concerned about privacy than respondents from Lithuania.
Health status also impacted privacy concerns. Respondents in good health were more likely to agree to the use of data in healthcare registries compared to those in poor health . Nevertheless, in the setting of no additional digital security measures (restricted access, etc.) being applied, individuals with poor health were less concerned about privacy compared to those in good health .
Sensitivity of data
A total of 3347 respondents provided a view on sensitivity of data. Individuals may consider some forms of medical data to be more sensitive than others. Data related to sexually transmitted diseases, family medical history including genetic disorders, drug and alcohol use , and mental illness [43, 49] raised the most privacy concerns, particularly the possibility of inappropriate data access [23, 49]. A UK report noted that ‘new ways of collecting and sharing data, under new circumstances, can give rise to conflicting expectations around data privacy’  and that different types of data came with different privacy expectations . In one study, respondents believed that data was a similar resource as tissue samples, suggesting that data is equally as sensitive as biospecimens . A study of respondents who were seeking fertility services found that while there was a willingness to share data, they were concerned about the potential for the data to cause harm (potential for stigma), not only to them but also their children . Further, some respondents were concerned that the data collected on them, while required by legislation, was not collected on fertile couples .
Control of data
A total of 6859 respondents provided a view on control of data; results were obtained from surveys, a nested cohort, focus groups, and a citizens’ jury. Individuals’ desire to maintain some control over their health data was evident across studies, with many seeing this as key to transparency [40, 48]. Respondents were selective about those with whom they were willing to share their data. Respondents to two UK studies preferred their data to stay within the NHS [34, 43], with some believing that once data left the organisation control would be lost . Health data access by private/commercial organisations [21, 26, 37, 41,42,43,44], pharmaceutical companies [21, 42, 43], and health insurance companies [21, 43, 44], all were seen as inappropriate. Respondents were concerned that insurance companies would use health data to adjust premiums which was considered inappropriate and without clear public benefit . Not allowing third parties to access their data was based on a distrust of these organisations [42, 44], perceived lack of transparency from research conducted by pharmaceutical companies , concern about the companies’ motivations (e.g. profit, marketing) [40, 42, 44], doubts about data security, distrust in their capacity to put society before profit, and a belief that a commercial organisation may on-sell their data . In one study, some respondents indicated they would prefer that research not be undertaken if it required allowing commercial access to health data; however, most respondents wanted third-party access to health data if disallowing this resulted in research not being undertaken . In contrast, other studies found respondents were happy to allow pharmaceutical company access to their registry data if it is undertaken in a transparent manner , and where consent is sought, respondents in a second study accepted the principle of commercial access to health data . 
Respondents to one survey suggested that increased sharing may be used for marketing purposes, stolen, used for profit, or to discriminate against an individual . Further, some respondents wanted to be informed when their data was being used . Digitisation of health data was seen by some as a mechanism to increase control and transparency over their data, and increased participation in research . Some believed that consent was required to access records for research, or to identify potential research participants, without which it was a violation of their privacy .
Benefit to society
The importance of research and its benefit to society was noted as important in several studies with a total of 7006 respondents. It was noted that society’s views on privacy may be changing, creating conflicting values between privacy protection and public benefit . Generally, respondents were positive about sharing their data for research [18, 20, 49]. In some circumstances, societal benefit may outweigh concerns regarding privacy [29, 44, 47]; further, research using health data should have a societal impact and not be undertaken just for academic reasons . Ensuring transparency about the public benefit of research and sharing of results and analysis at a study’s conclusion  were important. Where data was used for public benefit, such as improved medical care and treatments, improved public health, or management of public funds, and organisations made a clear and compelling case for access to the data, access should be granted as it could potentially benefit both the individual and the health service . Public benefit was seen as a justification for access to health data and an individual’s right to privacy should not prevent research that could benefit the general public . Altruism was also noted as reason to support health research using existing data, with some wanting their data to be used to ‘maximum potential’ . In the setting of fertility data, sharing for the greater good was important to some ; however, this was not universal as some believed that it increased the risk of harm (fraud, identity theft, targeted marketing)  and that the premise that public benefit outweighs privacy concerns was not supported . In one survey, some respondents valued maintaining individual control over their data more than societal benefit and respondents expressed a lack of willingness to trade loss of privacy for public good . 
This was echoed in a second study where the majority of respondents believed that the right to privacy should be respected over all else; however, respondents also believed that if the data were made anonymous and privacy was maintained, data should be used for research that benefits society .
Views about specific data sharing scenarios
Digitisation of health records
A total of 700 respondents provided a view on health data linkage. An Australian study identified concerns relating to confidentiality during the data linkage process, specifically the possibility that the individual making the linkage may know the person and find out confidential information about them, although this was not universal. The use of de-identified data was not seen as a breach of privacy, and current data linkage best practice was considered to provide sufficient privacy protection. Transparency about process and data usage was an important factor in individuals’ decisions to allow data linkage. Ensuring privacy when linking clinical trial data to health administrative data was seen as important. A survey of UK respondents found that they were not concerned about health record linkage as long as the data was used to increase health knowledge, consistency between health services, and administrative efficiency.
Registries and patient-provided data
A total of 25,814 respondents provide a view on registries and patient-provided data. Studies indicated that use of health information exchanges and digital health platforms can improve care . While these positives were noted, views on use of these technologies varied widely .
While respondents agreed with the principles of electronic data sharing, they desired transparency and a mechanism for independent scrutiny of data access and use . Some respondents indicated a preference for more electronic health data sharing .
There was a high level of trust in using data from disease registries  and a willingness to share data; in a cancer setting, only a small number of respondents were opposed to data collection . The most common concern about registry-based research was the protection of privacy . One mechanism suggested to ensure maximum transparency in registry research was to involve patient organisations in the development of clinical trials and registries .
Strategies to address privacy and trust concerns
A total of 25,052 respondents provided a view on data security. Many health organisations already have well-established protocols to ensure patient privacy. Transparent information about data security and protection measures is important to maintain trust [26, 32], with some suggesting that systems to protect confidentiality should be more secure for shared data than those used in usual medical practice. Some respondents were particularly concerned about unauthorised access to their data, especially the chance for data to be lost, stolen, or ‘hacked’, or shared without consent, although trust in researchers and health care providers to maintain data security remained. Some respondents in one study were happy to ‘trade-off’ any potential risk to privacy and security for benefits, like improved treatment and services. Breaches in data security significantly reduced the levels of trust in an organisation to keep the individual’s health data private [33, 46]. Interestingly, in one study, respondents were more willing to allow access to their electronic records than to paper-based records.
The role of legal and ethics bodies in protecting privacy
A total of 4219 respondents provided a view on the role legal and ethical bodies have in protecting privacy. The use of laws, regulations, and policies to protect an individual’s privacy in the UK, the USA, and Australia was noted. King et al. note that, without developing an understanding of individual privacy concerns and perceptions of privacy, it will be ‘impossible to provide adequate law as well as effective technical solutions for protecting privacy’. In the UK, laws allow for data use if the risk to privacy is proportionate; the NHS Code of Practice on Confidentiality establishes rules for the protection of privacy. In relation to new laws to improve data collection, one study noted that 81% of respondents (N = 2335) would support a law making a cancer registry statutory in the UK. In the USA, mechanisms such as the National Institutes of Health (NIH) requirement for manuscripts from NIH-funded research to be made publicly available were considered beneficial in fostering public accountability and trust. Further, the US Health Insurance Portability and Accountability Act of 1996 (HIPAA) establishes a national standard for the protection of health information. However, some believe that concerns about privacy are not fully addressed by HIPAA, which treats all health data, except psychotherapy, the same. In one study, some respondents were unaware that under some circumstances their medical data could be used without their permission. Respondents in two studies advocated for clear and consistently applied penalties for individuals who breach privacy, such as job termination, paying fines, and/or going to jail; measures such as these may increase perceptions of trust and accountability. The role of ethics and institutional review boards in protecting privacy was noted in two articles [17, 40]. Respondents supported the role of ethics committees in managing access to health data and trusted their decisions.
It is important that health consumers recognise the role of these bodies in regulating access to data for research and in protecting patient rights. Finally, the development of clear policies and procedures will allow for more support for the secondary use of data, while increasing transparency for the healthcare consumer.
Data anonymisation
A total of 5302 respondents provided a view on data anonymisation. Data anonymisation was central to an individual’s decision to share health data for research or health and service improvement programmes [26, 28, 40, 49]. Respondents did not always understand the difference between the terms ‘anonymisation’ and ‘identifiable data’. In one study, many respondents were in favour of anonymous databases for research, noting that they were beneficial and would advance medical research without impacting on their privacy. Parkin and Paul note that, in the setting of appropriate privacy and confidentiality frameworks and ethical oversight, an informed public is more likely to be receptive to research using potentially identifiable health information.
Some respondents remained concerned even when extra security measures and data anonymisation were in place, and did not see these as safeguards. Some respondents believed that even if data had identifying features removed it was not completely de-identified, and were concerned about sharing de-identified data with non-healthcare professionals. Respondents were asked about their preferences for either a computer system or human programmer to anonymise (extract and link) data; some expressed concern about the potential for identification of individuals and noted a need for trust in the people undertaking these tasks. While respondents recognised the capacity of computers to undertake the anonymisation process, they suggested they would not trust a completely computerised system, citing concerns about data infrastructure and data accuracy.
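The concern that data with identifying features removed may not be completely de-identified can be illustrated with a minimal sketch. It is not drawn from any of the reviewed studies: the records, field names, and the choice of postcode, birth year, and sex as indirect ("quasi") identifiers are all hypothetical assumptions made for illustration. The sketch removes the direct identifier and then counts how many records remain unique on the indirect identifiers, which is one simple way such records could, in principle, be singled out.

```python
from collections import Counter

# Hypothetical records; every field name and value is invented for illustration.
RECORDS = [
    {"name": "A. Smith", "postcode": "2000", "birth_year": 1961, "sex": "F", "diagnosis": "breast cancer"},
    {"name": "B. Jones", "postcode": "2000", "birth_year": 1961, "sex": "F", "diagnosis": "asthma"},
    {"name": "C. Lee", "postcode": "2010", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"name": "D. Patel", "postcode": "2031", "birth_year": 1988, "sex": "F", "diagnosis": "arthritis"},
]

# Assumed split between direct and indirect identifiers (an illustrative choice).
DIRECT_IDENTIFIERS = {"name"}
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def strip_direct_identifiers(records):
    """Return copies of the records without directly identifying fields."""
    return [{k: v for k, v in r.items() if k not in DIRECT_IDENTIFIERS} for r in records]


def quasi_key(record):
    """Combine the quasi-identifier values of one record into a hashable key."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def unique_on_quasi_identifiers(records):
    """Return records whose quasi-identifier combination occurs exactly once."""
    counts = Counter(quasi_key(r) for r in records)
    return [r for r in records if counts[quasi_key(r)] == 1]


deidentified = strip_direct_identifiers(RECORDS)
at_risk = unique_on_quasi_identifiers(deidentified)
print(f"{len(at_risk)} of {len(deidentified)} de-identified records are still unique")
```

In this toy data set, two records share the same postcode, birth year, and sex, while the other two do not, so removing names alone leaves some records distinguishable. Real anonymisation processes involve considerably more than this check.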
Communication and education
A total of 8511 respondents provided a view on the importance of communication and the role of education in promoting data sharing. Providing more information about data use, and about research more generally, allowed individuals to feel that their privacy was being maintained while they contributed to health research with societal benefits. Providing education was also seen as a mechanism to improve transparency [15, 16, 33] and trust [23, 26]. Specific information on how and when the data will be used [15, 45], and on how and where their contact details were sourced, was important to individuals. Information on the data aggregation and anonymisation processes, and on the systems used to protect data, should be provided. In a UK cancer registry study, cancer patients opposed to current data collection processes were more concerned about the lack of information about the registry and consent processes than about privacy. Information and education about database governance, including data storage, length of data accessibility, and use of data, should be clear at the time of consent, particularly for data held in disease registries.
Most of the articles included in this subset discussed the connection between issues of privacy, trust, and transparency, and consent. Specific issues of consent in relation to secondary data use and sharing are discussed in a separate publication. Broadly, seeking consent for the secondary analysis of health data, either anonymised or potentially identifiable, was seen as a way to build trust, respect, and transparency [26, 30] and address an individual’s privacy concerns [23, 27, 39].
Discussion
This systematic literature review highlights the ongoing complexity associated with secondary data analysis and linking health data. Data gaps identified included a paucity of information specifically related to our primary area of interest: the attitudes of breast cancer patients towards the secondary use and sharing of health administrative and clinical trial data. Interestingly, given the high rate of cancer more generally in society, this population was underrepresented in the results.
While respondents believed that the principles of data sharing were sound, significant concerns regarding privacy, information security, trust, and transparency remain. Further, the diversity of attitudes towards privacy suggests that there is little clarity on what predicts an individual’s attitudes towards privacy, highlighting an area for future study. Many respondents supported the use of health data for social benefit; however, this was not universal. The literature underscores the importance of communication between those who collect data or act as data custodians and health consumers. Health consumers should be provided with clear information on how their data privacy will be maintained, how the data will be secured, and how access to their data will be regulated. Providing increased information to health consumers about how, when, and where their health data may be used, and with whom it may be shared, is essential in the development and maintenance of transparent data sharing systems and policies. Concerns relating to privacy and the misuse of data may be, in part, mitigated by increased education of health consumers regarding their national privacy laws and regulations. Providing information on penalties for breaches of privacy and how an individual’s health data can and cannot be used is important. This may reduce some specific concerns regarding inappropriate use of data, ‘big brother’ sentiments, and any perceptions of discrimination based on data. While not specifically discussed in the articles, it is important to note that as the use of artificial intelligence increases in healthcare, ensuring penalties for discrimination based on data analysis will become more complex. Examples of discriminatory algorithms, in society and in healthcare, have been highlighted by researchers [50, 51, 52], and these need to be closely examined and tested as our reliance on data-driven healthcare increases.
Further, health consumers need to be provided with information about how any research is undertaken, including anonymisation and aggregation processes, and the requirement for ethics committee oversight.
Our results suggest that trust is an important component in the discussion regarding the secondary analysis of health data: trust in the organisations, clinicians, and infrastructures used to maintain data. Onora O’Neill has written extensively on issues of trust in modern society and in health, and argues that, despite the sentiment expressed by some that trust more generally in society has decreased, it has not; rather, the culture of suspicion has increased. Therefore, it is essential that organisations wishing to undertake secondary analysis on their datasets develop trust between themselves and health consumers.
The papers included in this study were limited to those indexed on major databases; some literature on this topic may have been excluded if it was not identified during the ‘grey’ literature and hand searching phases. As the search was restricted to English language publications, some relevant literature may have been excluded from the search. Because the initial focus of this research was attitudes towards data sharing and reuse in breast cancer, individuals under 18 years of age were excluded from the analysis. A final limitation of this research is that much of the data came from research methods (surveys, interviews) that are not considered to be level 1 evidence; however, a randomised controlled trial methodology is not necessarily appropriate to this research subject.
Results of this systematic literature review indicate that while respondents identified advantages in health information data sharing, including post-market medication surveillance and the potential to decrease medical errors, concerns relating to trust, transparency, and the protection of privacy remain. Additional work is therefore required within these areas during the conception, design, and implementation phases of any health data sharing programmes to ensure the balance between public benefit and individual privacy is maintained.
The literature confirms that while consumers understand the benefits of health data sharing for research purposes, issues of trust, transparency, and privacy remain central to acceptance of health data sharing policies and programmes in the general community. Researchers and those undertaking secondary data analysis should work with consumer organisations to ensure consumer concerns are addressed.
Availability of data and materials
All data generated or analysed during this study are included in this published article.
References
Glaser BG. Retreading research materials: the use of secondary analysis by the independent researcher. Am Behav Sci. 1963;6(10):11–4.
Carrato A, Falcone A, Ducreux M, Valle JW, Parnaby A, Djazouli K, et al. A systematic review of the burden of pancreatic cancer in Europe: real-world impact on survival, quality of life and costs. J Gastrointestinal Cancer. 2015;46(3):201–11.
Khozin S, Blumenthal GM, Pazdur R. Real-world data for clinical evidence generation in oncology. JNCI J Nat Cancer Inst. 2017;109(11).
Cadarette SM, Wong L. An introduction to health care administrative data. Can J Hosp Pharm. 2015;68(3):232–7.
Balas EA, Boren SA. Managing clinical knowledge for health care improvement. Yearbook of Medical Informatics. 2000.
National Health and Medical Research Council (NHMRC). Open access policy. Canberra: Australian Government; 2018 [updated November 2018]. Available from: https://www.nhmrc.gov.au/about-us/resources/open-access-policy.
National Health and Medical Research Council (NHMRC). National Statement on Ethical Conduct in Human Research (2007) - updated 2018. Canberra: Australian Government; 2018. Available from: https://www.nhmrc.gov.au/about-us/publications/national-statement-ethical-conduct-human-research-2007-updated-2018#block-views-block-file-attachments-content-block-1.
Kostkova P, Brewer H, de Lusignan S, Fottrell E, Goldacre B, Hart G, et al. Who owns the data? Open data for healthcare. Front Public Health. 2016;4.
Esmaeilzadeh P. The impacts of the perceived transparency of privacy policies and trust in providers for building trust in health information exchange: empirical study. JMIR Med Inform. 2019;7(4):e14050.
Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.
Veritas Health Innovation. Covidence systematic review software. Melbourne: Cochrane Collaboration; 2018.
Kmet LM, Cook LS, Lee RC. Standard quality assessment criteria for evaluating primary research papers from a variety of fields. Edmonton: Alberta Heritage Foundation for Medical Research (AHFMR); 2004.
Lockwood C, Munn Z, Porritt K. Qualitative research synthesis: methodological guidance for systematic reviewers utilizing meta-aggregation. Int J Evid Based Healthcare. 2015;13(3):179–87.
Gale NK, Heath G, Cameron E, Rashid S, Redwood S. Using the framework method for the analysis of qualitative data in multi-disciplinary health research. BMC Med Res Methodol. 2013;13(1):117.
Campbell B, Thomson H, Slater J, Coward C, Wyatt K, Sweeney K. Extracting information from hospital records: what patients think about consent. BMJ Qual Saf. 2007;16(6):404–8.
Courbier S, Dimond R, Bros-Facer V. Share and protect our health data: an evidence based approach to rare disease patients’ perspectives on data sharing and data protection-quantitative survey and recommendations. Orphanet J Rare Dis. 2019;14(1):175.
Darquy S, Moutel G, Lapointe A-S, D'Audiffret D, Champagnat J, Guerroui S, et al. Patient/family views on data sharing in rare diseases: study in the European LeukoTreat project. Eur J Hum Genet. 2016;24(3):338.
Eloranta K, Auvinen A. Population attitudes towards research use of health care registries: a population-based survey in Finland. BMC Med Ethics. 2015;16:48.
Hamajima N, Tajima K. Patients' views on reference to clinical data. J Epidemiol. 1997;7(1):17–9.
Hay AE, Leung YW, Pater JL, Brown MC, Bell E, Howell D, et al. Linkage of clinical trial and administrative data: a survey of cancer patient preferences. Curr Oncol. 2017;24(3):161.
Mello MM, Lieou V, Goodman SN. Clinical trial participants’ views of the risks and benefits of data sharing. N Engl J Med. 2018;378(23):2202–11.
Ni MY, Li TK, Hui RWH, McDowell I, Leung GM. Requesting a unique personal identifier or providing a souvenir incentive did not affect overall consent to health record linkage: evidence from an RCT nested within a cohort. J Clin Epidemiol. 2017;84:142–9.
Page SA, Mitchell I. Patients' opinions on privacy, consent and the disclosure of health information for medical research. Chronic Dis Can. 2006;27(2):60–7.
Park YJ, Chung JE. Health privacy as sociotechnical capital. Comput Hum Behav. 2017;76:227–36.
Patil S, Lu H, Saunders CL, Potoglou D, Robinson N. Public preferences for electronic health data storage, access, and sharing—evidence from a pan-European survey. J Am Med Inform Assoc. 2016;23(6):1096–106.
Robinson G, Dolk H, Given J, Karnell K, Gorman EN. Public attitudes to data sharing in Northern Ireland. Northern Ireland: Administrative Data Research Centre; 2016.
Sala E, Burton J, Knies G. Correlates of obtaining informed consent to data linkage: respondent, interview, and interviewer characteristics. Sociol Methods Res. 2012;41(3):414–39.
Whiddett R, Hunter I, Engelbrecht J, Handy J. Patients’ attitudes towards sharing their health information. Int J Med Inform. 2006;75(7):530–41.
Willison DJ, Schwartz L, Abelson J, Charles C, Swinton M, Northrup D, et al. Alternatives to project-specific consent for access to personal information for health research: what is the opinion of the Canadian public? J Am Med Inform Assoc. 2007;14(6):706–12.
Audrey S, Brown L, Campbell R, Boyd A, Macleod J. Young people’s views about consenting to data linkage: findings from the PEARL qualitative study. BMC Med Res Methodol. 2016;16:34.
Barrett G, Cassell JA, Peacock JL, Coleman MP. National survey of British public’s views on use of identifiable medical data by the National Cancer Registry. Br Med J. 2006;332(7549):1068–72.
Broes S, Verbaanderd C, Casteels M, Lacombe D, Huys I. Sharing of clinical trial data and samples: the cancer patient perspective. Front Med. 2020;7:33.
Carson C, Hinton L, Kurinczuk J, Quigley M. ‘I haven’t met them, I don’t have any trust in them. It just feels like a big unknown’: A qualitative study exploring the determinants of consent to use Human Fertilisation and Embryology Authority registry data in research. BMJ Open. 2019;9(5):e026469.
Haddow G, Bruce A, Sathanandam S, Wyatt JC. ‘Nothing is really safe’: a focus group study on the processes of anonymizing and sharing of health data for research purposes. J Eval Clin Pract. 2011;17(6):1140–6.
Kass NE, Natowicz MR, Hull SC, Faden RR, Plantinga L, Gostin LO, et al. The use of medical records in research: what do patients want? J Law Med Ethics. 2003;31(3):429–33.
Lee SB, Zak A, Iversen MD, Polletta VL, Shadick NA, Solomon DH. Participation in clinical research registries: a focus group study examining views from patients with arthritis and other chronic illnesses. Arthritis Care Res. 2016;68(7):974–80.
Lucero RJ, Kearney J, Cortes Y, Arcia A, Appelbaum P, Fernandez RL, et al. Benefits and risks in secondary use of digitized clinical data: views of community members living in a predominantly ethnic minority urban neighborhood. AJOB Empirical Bioethics. 2015;6(2):12–22.
Sakshaug JW, Couper MP, Ofstedal MB, Weir DR. Linking survey and administrative records: mechanisms of consent. Sociol Methods Res. 2012;41(4):535–69.
Xafis V. The acceptability of conducting data linkage research without obtaining consent: lay people’s views and justifications. BMC Med Ethics. 2015;16(1):79.
Damschroder LJ, Pritts JL, Neblo MA, Kalarickal RJ, Creswell JW, Hayward RA. Patients, privacy and trust: Patients’ willingness to allow researchers to access their medical records. Soc Sci Med. 2007;64(1):223–35.
Macmillan Cancer Support, Cancer Research UK, Ipsos MORI. Perceptions of the cancer registry: attitudes towards and awareness of cancer data collection. London: Cancer Research UK; 2016.
Slegers C, Zion D, Glass D, Kelsall H, Fritschi L, Brown N, et al. Why do people participate in epidemiological research? J Bioethical Inquiry. 2015;12(2):227–37.
Wellcome Trust, CM Insight. Summary report of qualitative research into public attitudes to personal data and linking personal data. London: Wellcome Trust; 2013.
Wellcome Trust, Ipsos MORI. The one-way mirror: public attitudes to commercial access to health data. London: Wellcome Trust; 2016.
Willison DJ, Swinton M, Schwartz L, Abelson J, Charles C, Northrup D, et al. Alternatives to project-specific consent for access to personal information for health research: insights from a public dialogue. BMC Med Ethics. 2008;9(1):18.
Willison DJ, Steeves V, Charles C, Schwartz L, Ranford J, Agarwal G, et al. Consent for use of personal information for health research: do people with potentially stigmatizing health conditions and the general public differ in their opinions? BMC Med Ethics. 2009;10:10.
Parkin L, Paul C. Public good, personal privacy: a citizens’ deliberation about using medical information for pharmacoepidemiological research. J Epidemiol Community Health. 2011;65(2):150–6.
Tully MP, Bozentko K, Clement S, Hunn A, Hassan L, Norris R, et al. Investigating the extent to which patients should control access to patient records for research: a deliberative process using citizens’ juries. J Med Internet Res. 2018;20(3).
King T, Brankovic L, Gillard P. Perspectives of Australian adults about protecting the privacy of their health information in statistical databases. Int J Med Inform. 2012;81(4):279–89.
Hao K. Facebook’s ad-serving algorithm discriminates by gender and race. MIT Technology Review [Internet]. 2019. Available from: https://www.technologyreview.com/2019/04/05/1175/facebook-algorithm-discriminates-ai-bias/.
O'Neil C. Weapons of math destruction: how big data increases inequality and threatens democracy. London: Allen Lane; 2016.
Parikh RB, Teeple S, Navathe AS. Addressing bias in artificial intelligence in health care. JAMA. 2019;322(24):2377–8.
O'Neill O. A question of trust: the BBC Reith Lectures 2002. Cambridge: Cambridge University Press; 2002.
Acknowledgements
The authors would like to thank Ms Ngaire Pettit-Young, Information First, Sydney, NSW, Australia, for her assistance in developing the search strategy.
Funding
This project was supported by the Sydney Vital, Translational Cancer Research, through a Cancer Institute NSW competitive grant. The views expressed herein are those of the authors and are not necessarily those of the Cancer Institute NSW. FB is supported in her academic role by the Friends of the Mater Foundation.
Ethics approval and consent to participate
Consent for publication
Competing interests
EH, ML, PB, and FB declare that they have no competing interests.
Hutchings, E., Loomes, M., Butow, P. et al. A systematic literature review of health consumer attitudes towards secondary use and sharing of health administrative and clinical trial data: a focus on privacy, trust, and transparency. Syst Rev 9, 235 (2020). https://doi.org/10.1186/s13643-020-01481-9