Secure data environment for NHS health and social care data - policy guidelines
Updated 23 December 2022
Applies to England
Introduction
In Data saves lives: reshaping health and social care with data, we committed to implementing secure data environments as the default way to access NHS health and social care data for research and analysis. The strategy also sets out our intentions for the use of secure data environments to access NHS health and social care data through 12 clear guidelines. This publication provides additional background and detail for how we have developed those guidelines and their intended outcome.
Value of health and social care data for research and analysis
NHS health and social care data has immense value beyond the direct care of patients. It accelerates the discovery of new treatments from industry and academia, and helps the NHS to plan better services.
Improving the use of health data for research and analysis was a core theme of Better, broader, safer: using health data for research and analysis (the ‘Goldacre review’), which stated:
Data can drive research. It can be used to discover which treatments work best, in which patients, and which have side effects. It can be used to help monitor and improve the quality, safety and efficiency of health services. It can be used to drive innovation across the life sciences sector.
If we are to unlock the full potential of data, we must make sure that the public has confidence in how their data is used and protected. We believe this will only be possible by moving from the current system that relies on data sharing, to one that is built on data access. Secure data environments will be key to achieving this ambition.
What a secure data environment is
Secure data environments are data storage and access platforms, which uphold the highest standards of privacy and security of NHS health and social care data when used for research and analysis. They allow approved users to access and analyse data without the data leaving the environment.
In summary, secure data environments allow organisations to control:
- who can become a user to access the data
- the data that users can access
- what users can do with the data in the environment
- the information users can remove
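As a purely illustrative sketch, and not part of the guidelines, the four controls above could be modelled as a set of simple checks. Every class, field and function name below is hypothetical, and a real environment would rely on its own identity, governance and output-checking systems.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical record of an approved project inside an environment."""
    name: str
    approved_users: set[str]
    approved_datasets: set[str]
    allowed_actions: set[str] = field(
        default_factory=lambda: {"query", "aggregate"}
    )


def can_access(project: Project, user: str, dataset: str, action: str) -> bool:
    """Controls 1 to 3: who the user is, which data, and what they can do."""
    return (
        user in project.approved_users
        and dataset in project.approved_datasets
        and action in project.allowed_actions
    )


def can_remove(output_reviewed: bool, contains_identifiers: bool) -> bool:
    """Control 4: information leaves only after review, never if identifiable."""
    return output_reviewed and not contains_identifiers


# Hypothetical example values, for illustration only.
project = Project(
    name="example-study",
    approved_users={"analyst_a"},
    approved_datasets={"admissions_deidentified"},
)
```

In this sketch a request is refused unless the user, the dataset and the action have all been approved for the project, which mirrors the idea that access is granted to data in place rather than by sharing copies.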
A range of different users will benefit from improved access to NHS health and social care data. These users have different data requirements and skill sets and need to access data to produce different outputs.
For example, access to data in secure data environments will be used for planning and population health management, such as through the NHS COVID-19 data store and the planned federated data platform. The primary use is internal planning and management, for instance by integrated care groups and analysts.
Access to data in secure data environments will also be used for research and wider analysis, such as through the platforms created by NHS Digital and OpenSAFELY. The primary use is to support medical research and development. Primary users include academic and industry researchers, as well as policy analysts, with a specific research question.
How secure data environments will be delivered
Our secure data environment policy aims to simplify a complex, rapidly developing landscape. To get implementation right, we are making a number of key investments to make sure that our policy works in practice.
All of the examples that follow will inform, and must apply, secure data environment policy for any use of NHS health and social care data for analysis and research. They will not apply to the use of data for direct patient care, where there need to be fewer barriers in place to make sure that patients receive the care they need.
NHS Digital’s (NHSD) national secure data environment
NHS Digital is the current safe haven for health and care data. NHS Digital is currently piloting a national secure data environment, which provides approved researchers from trusted organisations with timely and secure access to NHS health and social care data.
This national environment currently supports the work of over 100 users from across the NHS, academia, industry and charity sectors. For example:
- the British Heart Foundation is researching the impact and effects of the COVID-19 pandemic on cardiovascular diseases
- DATA-CAN - an innovation hub funded by Health Data Research UK - is working to understand the impact of COVID-19 on people affected by cancer
With support from the NHS Transformation Directorate’s Data for Research & Development programme, enhancements to this environment are planned throughout 2022 and beyond. This will include expanding the pilot to accommodate users, with the aim that all data held nationally is managed through a secure data environment when used for research and planning purposes.
Sub-national secure data environments
During 2022, as a result of a competitive process, investment has been made in 4 localities to scope and define how NHS-owned secure data environments might work best at a regional level.
The ambition is to provide researchers and analysts with access to NHS health and social care data at a significant regional scale, maintain patient confidentiality, and ensure connectivity to local communities and NHS care teams.
The network of sub-national secure data environments currently includes London, Wessex, Greater Manchester and the Thames Valley. The explorative work on secure data environments being carried out will help create a community of practice from which we can learn. This will directly inform the development of secure data environment policy.
Federated data platform
In the Data saves lives strategy we also committed to ensuring that we maintain the valuable data connectivity developed over the past few years using the COVID-19 data platform.
NHS England intends to procure a federated data platform (FDP), which is an ecosystem of technologies and services to be implemented across the NHS in England. This will be an essential enabler for transformational improvements across the NHS.
The FDP will enable, and must apply, secure data environment policy for any use of NHS health and social care data beyond direct patient care, for example when using data to support population health management and operational planning. This procurement will also support integrated care systems to implement secure data environment policy.
The purpose of these guidelines
These guidelines set out our expectations for how secure data environments will be used to access NHS health and social care data. They have been developed in collaboration with leading experts in the field.
These guidelines have been developed to:
- strengthen public confidence and trust in the transition to using secure data environments to access NHS health and social care data
- provide additional information about the use of secure data environments, as outlined in the Data saves lives strategy
- describe the foundations on which the NHS Transformation Directorate will further develop secure data environment policy, in collaboration with the public and expert stakeholders
- communicate the direction of travel for secure data environment policy, signalling areas that require further development
- communicate the fundamental principles which secure data environments must adhere to
Some choices have already been made to ensure successful implementation, for example that all NHS health and social care data will be accessed through a secure data environment and that any exceptions will be strictly limited. We also commit to establishing an accreditation process and an organisation that will ensure compliance, which in turn will standardise and limit the number of platforms that can provide access to NHS data. We recognise that these guidelines do not contain the full details to support a transition that will require significant changes in behaviour and process. The next phase of work will involve translating these high-level ambitions into workable practice, which will be supported by broad engagement.
The Five Safes framework
These guidelines are arranged according to the Five Safes framework developed by the Office for National Statistics (ONS).
The Five Safes are widely regarded as representing best practice in data protection. They are:
- safe settings - the environment prevents inappropriate access, or misuse
- safe data - information is protected and is treated to protect confidentiality
- safe people - individuals accessing the data are trained, and authorised, to use it appropriately
- safe projects - research projects are approved by data owners for the public good
- safe outputs - summarised data taken away is checked to make sure it protects privacy
Secure data environment guidelines
Safe settings
The principle of ‘safe settings’ is about preventing inappropriate access to, or misuse of, data.
The safe settings principle will be upheld by secure data environments because data security is integral to their design.
1. Secure data environments will be the default way to access NHS health and social care data for research and analysis
Secure data environments must be adopted by organisations hosting NHS health and social care data for research and analysis. These environments have features that improve data privacy and security, which will help build public trust in the use of their data.
Instances of analysing or disseminating data outside of a secure data environment will be extremely limited. Any exceptions will require significant justification, such as where explicit consent from clinical trial participants has been obtained. Further guidance about exceptions to the secure data environment standards will be provided in the coming months.
2. Secure data environments providing access to NHS health and social care data must meet defined criteria
The design, implementation and management of secure data environments must meet minimum requirements. This will include technical, behavioural, governance, and training specifications. Owners of these environments must be able to continue to demonstrate that they fulfil defined criteria in order to be categorised as an ‘NHS accredited secure data environment’. All environments will be held to the same requirements and oversight.
In the coming months we will publish additional technical guidance and information governance requirements, and information about how secure data environments will be accredited. We will also communicate details about the plans, approach and timescales for this transition.
This will make sure that we can provide assurance that all NHS accredited secure data environments uphold the same privacy and security standards. It will also help to build public trust in how their data is used.
3. Secure data environments must maintain the highest level of cyber security to prevent unauthorised access to data
Secure data environments must adhere to the principle of ‘security by design’. All aspects of cyber security must be integrated into the design and implementation of these environments. This includes information governance, data encryption, and data access management standards.
Security by design will make sure that secure data environments comply with the UK General Data Protection Regulation (UK GDPR) requirement of data protection by design and by default. They will uphold data protection legislation and safeguard individual rights.
4. Secure data environment owners must be transparent about how data is used within their environment
Owners of secure data environments must be open about the way data is used within their secure data environment. They must be able to detail who is accessing the data and for what purpose. This may be achieved, for example, by organisations ensuring that clear and accessible reporting is in place for their secure data environment.
The Office for Statistics Regulation’s recent report on lessons learned from the COVID-19 pandemic demonstrated that public trust in the use of their data increased when they were able to see how it is used. Being transparent about how NHS health and social care data is used in secure data environments can help to build public understanding and trust.
Transparency about how data is used also increases the accountability of data controllers and data users.
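One way to picture the reporting described in this guideline is a simple access register that records who accessed which data and for what purpose, and that can be summarised for publication. This is a minimal sketch under assumptions: the record fields, function names and example organisations are all hypothetical and are not drawn from any NHS system.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AccessRecord:
    """Hypothetical entry in a data access register."""
    organisation: str
    project: str
    purpose: str
    dataset: str
    accessed_on: date


def summarise_by_purpose(records: list[AccessRecord]) -> dict[str, int]:
    """Count access events per stated purpose, for a public-facing report."""
    return dict(Counter(r.purpose for r in records))


def organisations_using(records: list[AccessRecord], dataset: str) -> list[str]:
    """List, in alphabetical order, the organisations that accessed a dataset."""
    return sorted({r.organisation for r in records if r.dataset == dataset})


# Hypothetical example entries, for illustration only.
register = [
    AccessRecord("University X", "study-1", "research", "dataset_a", date(2022, 6, 1)),
    AccessRecord("Charity Y", "study-2", "research", "dataset_a", date(2022, 6, 2)),
    AccessRecord("Care Board Z", "plan-1", "planning", "dataset_b", date(2022, 6, 3)),
]
```

A published summary built from such a register would show the public the purposes data is used for without exposing any patient-level information.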
Safe people
The principle of ‘safe people’ is about ensuring that individuals accessing data are trained and authorised to use it appropriately.
The safe people principle will be upheld by secure data environments by making sure that users are verified before access is granted and are able to access appropriate data only. Patients and the public will also be engaged in decisions about who can access their data.
5. The secure data environment may only be accessed by appropriate, verified users
Access to NHS health and social care data within a secure data environment must be carefully controlled. Only authorised users will be granted access to data for approved purposes. Owners of secure data environments must have robust technical and governance processes in place to accurately verify the identity of users, and to manage their access to data within the environment.
This will enable a variety of users - with sufficient levels of training, qualifications, and expertise - to analyse NHS health and social care data. Allowing appropriate access to this data will facilitate data-driven planning, research, and innovation in the NHS.
6. Secure data environments must make sure that patients and the public are actively involved in the decision making processes to build trust in how their data is used
Owners of secure data environments must make sure that the public are properly informed and meaningfully involved in ongoing decisions about who can access their data and how their data is used. For example, by ensuring that relevant technical information is presented in an accessible way (that is, through publishing privacy notices and data protection impact assessments).
This will make sure that secure data environments comply with the UK GDPR, which requires that data controllers provide individuals with information about how their data is used.
Secure data environment owners must also be able to demonstrate that they have undertaken, or plan to undertake, active patient and public involvement activities. Patient and public involvement and engagement (PPIE) activities must follow the NHS Research Authority’s principles.
This guideline supports the commitments made in the Data saves lives strategy to build and maintain public trust in the use of NHS health and social care data through active PPIE. It will make sure that all perspectives are taken into consideration in the design and implementation of secure data environments and help build public trust in how NHS health and social care data is stored and used.
Safe data
The principle of ‘safe data’ is about making sure that information is protected and is treated to protect confidentiality.
The safe data principle will be upheld by secure data environments by their design and function, which prevents the dissemination of identifiable data.
7. Data made available for analysis in a secure data environment must protect patient confidentiality
Data must be treated in a secure data environment to protect confidentiality using techniques such as data minimisation and de-identification. De-identification practices mean that personal identifiers are removed from datasets to protect patient confidentiality. This includes techniques such as aggregation, anonymisation, and pseudonymisation. The level of de-identification applied to data may vary based on user roles and requirements for accessing the data.
Data minimisation practices help make sure that access to data is relevant and limited to what is necessary in relation to the purposes for which it is processed. This is in line with Information Commissioner’s Office (ICO) guidance. Applying data minimisation and de-identification practices enables approved individuals to access data for high quality analysis intended for the public good whilst also maintaining patient confidentiality.
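To make these two practices concrete, the sketch below first applies data minimisation (keeping only the fields a project needs) and then pseudonymisation (replacing the patient identifier with a keyed hash). It is an illustration under assumptions rather than a prescribed method: the field names, the choice of HMAC-SHA256 and the key handling are all hypothetical. Note also that pseudonymised data generally remains personal data, so this step alone does not make a dataset anonymous.

```python
import hashlib
import hmac

# Hypothetical field names for direct identifiers that should not reach analysts.
DIRECT_IDENTIFIERS = {"patient_id", "name", "address"}


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_record(record: dict, needed_fields: set[str], secret_key: bytes) -> dict:
    """Apply data minimisation, then pseudonymise the patient identifier."""
    # Data minimisation: keep only the fields the approved project needs,
    # and never pass direct identifiers through, even if requested.
    kept = {
        key: value
        for key, value in record.items()
        if key in needed_fields and key not in DIRECT_IDENTIFIERS
    }
    # Pseudonymisation: the same patient always maps to the same pseudonym,
    # so records can still be linked without exposing the identifier.
    kept["patient_pseudonym"] = pseudonymise(record["patient_id"], secret_key)
    return kept


# Hypothetical example values; a real key would be generated and stored securely.
demo_key = b"demo-key-not-for-real-use"
raw = {
    "patient_id": "P001",
    "name": "Jane Example",
    "address": "1 Example Street",
    "age_band": "40-49",
    "diagnosis_code": "I21",
}
prepared = prepare_record(raw, {"age_band", "diagnosis_code", "name"}, demo_key)
```

Here `prepared` contains only `age_band`, `diagnosis_code` and `patient_pseudonym`; the request for `name` is ignored because it is a direct identifier.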
Data protection law will continue to apply. This means there must always be a valid lawful basis for the collection and processing of personal information (including special category information) within secure data environments, as defined under data protection legislation. Where the data being accessed is confidential patient information, the requirements of the common law duty of confidentiality must also be met. More information on this can be found in the Transformation Directorate’s guidance on confidential patient information.
We will provide further information about the application of these practices in due course, when we publish additional guidance for secure data environments.
8. Inputs to a secure data environment must be assessed and approved
Owners of secure data environments must have robust processes in place for checking external inputs before they are approved to enter the environment. This includes data, code, tools, and any other inputs.
Owners of secure data environments must have processes in place to make sure that the linking of NHS health and social care data with other datasets is carried out within the environment itself. They must also make sure that only approved and appropriately qualified individuals conduct dataset linking. This must be upheld unless there is significant justification for not doing so (in line with guideline 1).
There must also be processes in place to assure the quality of external datasets before they are imported into the secure data environment.
Linking NHS health and social care data to data from other sources has the potential to greatly enhance the quality of analysis and research findings. Secure data environments will facilitate data linkage, whilst also maintaining data protection.
Safe projects
The principle of ‘safe projects’ is about making sure that research projects are approved by data owners for the public good.
The safe projects principle will be upheld by secure data environments by:
- a) supporting open working practices that deliver efficiencies and improve the quality of analysis and findings
- b) making data available for a range of uses intended for the public good
9. Secure data environments must adhere to a policy of open-working and support code-sharing
Secure data environments must support open working, ensuring that code developed in these environments is reusable. Examples of how this could be achieved include:
- applying the principles of the NHS Open Source Code policy
- using the Reproducible Analytical Pipelines (RAP) strategy
Code developed in secure data environments must be published in the open, for example by making it available in open repositories, unless there is a specific rationale for not doing so. We will engage further on these exceptions, and publish guidance in due course.
Working in the open will allow researchers to view, reuse and adapt existing code and enhance shared understanding of how the datasets in these environments are used. This will enable users to easily reproduce previous analysis, which will save time and improve the consistency and accuracy of analytical findings. This will lead to better outcomes for patients, the public, and the NHS.
10. Secure data environments must be able to support flexible and high-quality analysis for a diverse range of uses
Owners of secure data environments must engage with their intended users to make sure that they provide the necessary functionality and tools required for analysis. A range of users with different requirements and skill sets will need to access data within these environments. They will need to analyse different data to produce different outputs.
This will make sure that a variety of users will benefit from improved access to NHS health and social care data in secure data environments, which will enable data-driven planning, research, and innovation across the NHS.
11. All uses of data within secure data environments must be for the public good
The use of NHS health and social care data must be ethical, for the public good, and comply with all existing law. It must also be intended for health purposes or the promotion of health. Data access must never be provided for marketing or insurance purposes.
Owners of secure data environments must make sure there are processes in place to assess the reasons for accessing NHS health and social care data in a secure data environment. These processes must fulfil minimum national standards, which we will set out.
This will make sure that appropriate access is given to NHS health and care data, which will support the delivery of improved outcomes across the health and care system. It will also help build public confidence in why their data is accessed and how it is used.
Safe outputs
The principle of ‘safe outputs’ is about making sure that any summarised data taken away is checked to protect privacy.
The safe outputs principle will be upheld by secure data environments by making sure that the results of analysis contain only aggregated, non-identifiable results that match the approvals of users and their projects.
12. Outputs from a secure data environment must be assessed and approved and must not identify individuals
All information must be checked before it leaves a secure data environment, including data, code, tools, and any other outputs.
There must be robust processes in place to maintain patient confidentiality and to make sure that outputs align with the intentions of individual projects.
This supports guideline 8, which states that any linking of NHS health and social care data with other datasets must be conducted within an NHS accredited secure data environment. Together these guidelines will make sure that secure data environments assist high quality analysis (for example, through data linking), whilst also maintaining data protection and patient confidentiality.
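One common element of output checking is suppressing small counts in aggregated results, so that groups containing very few people cannot be singled out. The sketch below shows that idea only. The threshold of 5 is an assumption chosen for illustration (these guidelines do not specify one), the function names are hypothetical, and an automated check of this kind would complement rather than replace review by trained output checkers.

```python
from collections import Counter
from typing import Optional

# Assumed threshold for illustration only; the guidelines do not specify one.
MIN_CELL_COUNT = 5


def aggregate_counts(rows: list[dict], group_field: str) -> dict[str, int]:
    """Summarise row-level data into counts per group."""
    return dict(Counter(row[group_field] for row in rows))


def suppress_small_counts(
    counts: dict[str, int], threshold: int = MIN_CELL_COUNT
) -> dict[str, Optional[int]]:
    """Replace counts below the threshold with None so small groups are not disclosed."""
    return {
        group: (count if count >= threshold else None)
        for group, count in counts.items()
    }


# Hypothetical example data: 7 records in region A and 2 in region B.
rows = [{"region": "A"}] * 7 + [{"region": "B"}] * 2
safe_output = suppress_small_counts(aggregate_counts(rows, "region"))
```

In this example the count for region A is released, while the count for region B is withheld because it falls below the assumed threshold.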
Next steps for secure data environment policy
We have now published the latest iteration of the secure data environment guidelines, expanding on the commitments made in the Data saves lives strategy. We have also published a simple explainer of secure data environment policy, which provides an outline of the policy in plain English.
The guidelines set out our ambition for secure data environment policy, which we will continue to develop in the coming months with key stakeholder groups. Below is a summary of some further planned work.
Public and patient engagement
Engaging with patients and the public on how we store and use their data is key to getting secure data environment policy right. We have started this process, having engaged public and patient groups on the contents of a simple explainer for secure data environment policy.
We plan to align engagement on data access policy (and the implementation of secure data environments) with broader engagements on data use within the NHS Transformation Directorate. We currently expect engagement to be initiated in spring 2023.
Delivery and implementation
We continue to consult with a wide range of stakeholders to ensure the successful implementation of data access policy (and secure data environments). This has informed our early definition of the minimum technical capabilities that every secure data environment hosting NHS data will need to have to ensure it upholds the highest standards of privacy and security.
It has also informed our thinking on the oversight process that will need to be in place for secure data environments hosting NHS data, which we know is key to ensuring the public have confidence in how their data is stored and used.
We are taking the time to get this right. Before we publish further guidance about the implementation of data access policy, we will be moving to a period of engagement and testing with the sector and public. More information will be provided in 2023.
The transition to secure data environments for access to NHS health and care data is a positive step forward. However, it is a complex and rapidly developing field and careful thought must be given to ensure successful implementation. For example, we intend to provide greater clarity on the below in the next phase of this work:
- what the target ecosystem of NHS accredited secure data environments should look like and what needs to change to achieve the desired end-state
- the requirements of an accreditation process, our overall approach to ensuring compliance, and the capabilities of an accreditation body
- exemptions to the use of secure data environments, the justifications required, and how this may change over time as technology develops and platforms improve
- a realistic transition timeline for adoption of the policy that is both ambitious and achievable
We will be working with a wide range of stakeholders to develop and publish information about these plans and timescales for transition and welcome all views. This process will also be informed by the NHS’s continued investment in a number of flagship programmes:
- the federated data platform (FDP)
- NHS Digital’s national secure data environment
- sub-national secure data environments
For more information, or to get involved, please email [email protected].