Accredited official statistics

Quality and methodology information (QMI) for healthcare-associated infections (HCAI) reports

Published 18 December 2024

About this report

This report outlines the quality and methodology information (QMI) relevant to the healthcare-associated infections (HCAI) statistics which are published by the UK Health Security Agency (UKHSA). HCAI statistics are published monthly, quarterly and annually and include:

Accredited official statistics:

Official statistics in development:

This QMI report supports users in understanding the strengths and limitations of these statistics, ensuring UKHSA is compliant with the quality standards stated in the Code of Practice for Statistics. The report covers:

  1. The strengths and limitations of the data used to produce the statistics.
  2. The methods used to produce the statistics.
  3. The quality of the statistical outputs.

About the statistics

These statistics present trends in the counts and rates of the 6 infections which comprise the mandatory surveillance of bacteraemia and Clostridioides difficile (C. difficile) infections (CDI): Escherichia coli (E. coli) bacteraemia, Pseudomonas aeruginosa (P. aeruginosa) bacteraemia, Klebsiella species (Klebsiella spp.) bacteraemia, meticillin-resistant Staphylococcus aureus (MRSA) bacteraemia, meticillin-susceptible Staphylococcus aureus (MSSA) bacteraemia and CDI. The data is broken down by various key epidemiological and clinical characteristics.

Geographical coverage: England

Publication frequency: Monthly (monthly data tables), quarterly (quarterly epidemiological commentary), annual (annual epidemiological commentary, annual data tables, independent sector report)

Change log

18 December 2024: QMI report first published.

Contact

To contact the team responsible for producing these statistics, please email [email protected]

Suitable data sources

Statistics should be based on the most appropriate data to meet intended uses.

This section describes the data used to produce the statistics.

Data sources

The primary data source for the numerator data is the mandatory surveillance of bacteraemia and CDI, which is collected through UKHSA’s HCAI data capture system (DCS) and covers 6 data collections. Mandatory surveillance began in response to increasing rates of MRSA bacteraemia across NHS trusts and was subsequently rolled out for other HCAI when concern arose. Independent sector providers were also mandated to submit data from April 2009. A complete timeline of changes to the mandatory surveillance is available online.

The inclusion criteria for reporting each bacteraemia or infection to the surveillance system are:

  • for MRSA bacteraemia, positive blood cultures caused by S. aureus resistant to meticillin, oxacillin, cefoxitin or flucloxacillin
  • for MSSA bacteraemia, positive blood cultures caused by S. aureus which are susceptible to meticillin, oxacillin, cefoxitin, or flucloxacillin, and not subjected to MRSA reporting
  • for E. coli bacteraemia, all laboratory-confirmed positive blood culture cases of E. coli bacteraemia
  • for CDI (patients aged 2 years and over):
    • diarrhoea stools (Bristol Stool types 5 to 7) where the specimen is C. difficile toxin positive
    • toxic megacolon or ileostomy where the specimen is C. difficile toxin positive
    • pseudomembranous colitis revealed by lower gastro-intestinal endoscopy or computed tomography
    • colonic histopathology characteristic of CDI (with or without diarrhoea or toxin detection) on a specimen obtained during endoscopy or colectomy
    • faecal specimens collected post-mortem where the specimen is C. difficile toxin positive or tissue specimens collected post-mortem where pseudomembranous colitis is revealed, or colonic histopathology is characteristic of C. difficile infection

For the annual epidemiological commentary, the mandatory surveillance data is linked to the data sources below to obtain key epidemiological characteristics.

Additional patient characteristics such as patient postcode, GP code, fact of death and date of death, are obtained from linkage with the NHS Spine Summary Care Records data set.

NHS acute trust-level population data does not currently exist in England as NHS acute trusts do not treat patients within defined geographical boundaries. Therefore, a suitable proxy for population is required to calculate hospital-onset and hospital-onset-healthcare-associated (HOHA) rates. The occupied overnight bed-days from the national KH03 data set provide the daily average overnight bed occupancy for a specific time period: annually from financial year 2007 to 2008 to financial year 2009 to 2010, and quarterly from financial year 2010 to 2011 onwards. This data set is an open access return published by NHS England and provides a measure of clinical activity in each trust, which is used as a proxy measure of the patient population. The latest published KH03 data can include revisions to previously published data.

The high-level ethnic group for each case is identified by using the Office for Health Improvement and Disparities (OHID) COVID-19 Health Inequalities Monitoring for England (CHIME) tool, which relies on NHS England Hospital Episode Statistics (HES) records. This data is linked to our surveillance records using NHS number and date of birth.

The Office for National Statistics (ONS) mid-year population estimates up to and including 2022 (the latest at the time of query) are used at national and integrated care board (ICB)-level for the calculation of total, community-onset and community-onset-community-associated incidence rates. For 2023 and 2024, the population is assumed to be the same as in 2022, which is used as a proxy year.

National population estimates stratified by age are used for rates by Index of Multiple Deprivation (IMD) or ethnicity. The ONS and the former Department for Levelling Up, Housing and Communities contributed to the IMD estimates (up to and including 2022, with 2022 serving as a proxy for 2023 and 2024), while the 2021 Census provides the ethnicity estimates (with 2021 serving as a proxy for 2018 to 2020 and 2022 to 2024, assuming the ethnicity distribution remained constant during this period).

Data quality

The data that we use to produce statistics must be fit for purpose. Poor quality data can cause errors and can hinder effective decision making.

We have assessed the quality of the source data against the data quality dimensions in the Government Data Quality Framework.

This assessment covers the quality of the data that was used to produce the statistics, not the quality of the final statistical outputs. The quality summary assesses the quality of the final statistical outputs.

Strengths and limitations of the mandatory surveillance data

The strengths of the data are:

  1. The surveillance is at patient-level and in real-time, and includes risk factor data as well as the date of positive specimen, date of inpatient admission and date of recent discharge, which allow onset location and prior trust exposure to be ascertained. These enhanced data provide a platform to identify potential interventions, which could not be garnered from other surveillance schemes in England.

  2. In addition, the surveillance scheme is a census of all microbiologically confirmed episodes of bacteraemias and CDI, which provides up to 2% greater ascertainment than comparative voluntary surveillance schemes (excluding CDI cases, due to issues with voluntary surveillance described in Routine SGSS-DCS audit with voluntary laboratory surveillance data). Of note, the financial penalty structure on trusts for incomplete data has not been a major factor in the reduction of CDI in England (Gerver and others, 2015).

  3. Well-completed patient identifiers allow for direct linkage with other data sources which make fuller data sets and reduce data entry burden for trusts. For example, data can be linked from the mandatory surveillance scheme with data from the voluntary laboratory reports to access antimicrobial susceptibility information, HES for comorbidity information and prior healthcare interactions.

  4. Reporting from the live mandatory surveillance database, HCAI DCS, for registered users such as healthcare professionals provides real-time statistics and other tabulations or graphical representations of their data.

The limitations of the data are:

  1. Despite the ability to link the mandatory surveillance data with other data sets, the completion of the data return takes time which leads to variable field completion for the non-mandatory fields and restricts the data’s utility.

  2. There is a potential conflict between the use of this data for epidemiological purposes by UKHSA and performance management or audit by others.

  3. While the effect on data validity is not currently of great concern, as discussed in Mandatory HCAI Surveillance Data in NHS performance management, the emphasis on performance management surrounding reductions in MRSA bacteraemia and CDI could lead to an emphasis on the infection prevention and control of these infections over others which have not had similar attention.

In summary, the mandatory data completion requirements, combined with long-term surveillance and enriched data, make the mandatory surveillance data the most reliable and suitable source for these statistics.

Accuracy

Accuracy is about the degree to which the data reflects the real world. This can refer to correct names, addresses or represent factual and up-to-date data.

The accuracy of the case-level data submitted to the mandatory surveillance of healthcare-associated infections scheme is assured by the chief executive officer (CEO) of each reporting acute trust via the monthly sign-off process, which was mandated by the Chief Medical Officer (CMO) from October 2005. To add or delete cases after sign-off, the CEO of the reporting organisation must request an unlock from the mandatory surveillance team through a formal process described in Accuracy and reliability.

Completeness

Completeness describes the degree to which records are present.

For a data set to be complete, all records are included, and the most important data is present in those records. This means that the data set contains all the records that it should and all essential values in a record are populated.

Completeness is not the same as accuracy as a full data set may still have incorrect values.

Routine SGSS-DCS audit

We undertake routine comparison and quality assurance of HCAI DCS data with voluntary laboratory surveillance data and the Second-Generation Surveillance System (SGSS), which is used by laboratories to report cases of microbial infection from various samples like blood, urine and faeces. Information on antibiotic and antifungal susceptibility is also submitted where relevant. Although primarily an internal system used by healthcare professionals, the data reported via this system is routinely compared to the mandatory data collected via HCAI DCS. This routine comparison between surveillance systems provides a data quality check of case ascertainment on the HCAI DCS.

It is not currently possible to include C. difficile data in the routine HCAI DCS and SGSS comparison as information on these cases is not comparable due to data quality and reporting issues in the SGSS. C. difficile testing is a two-stage process where the second stage identifies the C. difficile toxin. As only C. difficile toxin-positive cases are reportable to the mandatory surveillance system, it is not currently possible to differentiate reported C. difficile cases which have tested positive for C. difficile toxins from those which have not with an acceptable degree of accuracy from the SGSS.

In general, more cases are captured via the HCAI DCS than the SGSS. Meticillin resistance in the mandatory surveillance is reported by NHS acute trusts after susceptibility testing but meticillin resistance in the SGSS is determined by selecting the most severe susceptibility results from patients’ blood cultures within a 14-day period. This difference explains some of the apparent over-ascertainment of the voluntary MRSA reports in some financial years, and therefore this should be considered when comparing the case numbers for MRSA and MSSA bacteraemia for SGSS versus HCAI DCS.

Not all cases in the SGSS would be reported to the HCAI DCS. SGSS cases for each bacteraemia and infection reported to the HCAI DCS are defined as:

  • for MRSA bacteraemia, the earliest S. aureus blood isolate per patient within a 14-day period with a resistant or indeterminate result to meticillin, oxacillin, cefoxitin or flucloxacillin within the 14-day period
  • for MSSA bacteraemia, the earliest S. aureus blood isolate per patient within a 14-day period with a susceptible result to meticillin, oxacillin, cefoxitin or flucloxacillin within the 14-day period
  • for E. coli bacteraemia, the earliest E. coli blood isolate per patient within a 14-day period
  • for Klebsiella spp. bacteraemia, the earliest Klebsiella spp. (including Enterobacter aerogenes) blood isolate per patient within a 14-day period
  • for P. aeruginosa bacteraemia, the earliest P. aeruginosa blood isolate per patient within a 14-day period

Ordered matching between cases from the HCAI DCS and the SGSS shows that only about 5% of infections captured by the HCAI DCS cannot be found in the SGSS.

As part of routine laboratory data checks, laboratories with cases reported to the SGSS but not identified in the HCAI DCS are contacted for feedback on the discrepancy. The cases are closed if:

  • the unmatched case is subsequently identified in the HCAI DCS
  • the unmatched case is added to the HCAI DCS as a new record
  • there is a legitimate reason for not reporting it to the HCAI DCS

Accounting for the open cases identified in the SGSS, the HCAI DCS captures an estimated 89%, 97%, 94% and 95% of S. aureus, E. coli, Klebsiella spp. and P. aeruginosa bacteraemia cases, respectively, that are eligible for mandatory reporting. This suggests the HCAI DCS provides an accurate national picture of the overall burden of infection under mandatory surveillance in England.

Uniqueness

Uniqueness describes the degree to which there is no duplication in records. This means that the data contains only one record for each entity it represents, and each value is stored once.

Some fields, such as National Insurance number, should be unique. Some data is less likely to be unique, for example geographical data such as town of birth.

The HCAI DCS has a de-duplication algorithm in which cases reported by the same reporting organisation with matching NHS number and date of birth are flagged to the reporting trust, which determines whether the case is a true duplicate. The CEO of the reporting organisation is required to sign off data monthly, which provides an additional verification of the uniqueness of the data.
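The flagging step described above can be illustrated with a minimal sketch. It is not the HCAI DCS implementation: the field names (case_id, org_code, nhs_number, date_of_birth) and the example records are hypothetical, and the sketch only groups cases on the 3 matching fields named in this report before handing them back for review.

```python
from collections import defaultdict
from datetime import date


def flag_possible_duplicates(cases):
    """Return lists of case IDs that share organisation, NHS number and date of birth."""
    groups = defaultdict(list)
    for case in cases:
        key = (case["org_code"], case["nhs_number"], case["date_of_birth"])
        groups[key].append(case["case_id"])
    # Only groups with more than one case need review by the reporting trust.
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical records: cases 1 and 2 match on all 3 fields, case 3 is from another organisation.
example_cases = [
    {"case_id": 1, "org_code": "ORG_A", "nhs_number": "0000000001", "date_of_birth": date(1950, 1, 1)},
    {"case_id": 2, "org_code": "ORG_A", "nhs_number": "0000000001", "date_of_birth": date(1950, 1, 1)},
    {"case_id": 3, "org_code": "ORG_B", "nhs_number": "0000000001", "date_of_birth": date(1950, 1, 1)},
]
flagged = flag_possible_duplicates(example_cases)
```

A flagged group is only a candidate: as described above, the reporting trust decides whether it is a true duplicate.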

Consistency

Consistency describes the degree to which values in a data set do not contradict other values representing the same entity. For example, a mother’s date of birth should be before her child’s.

Data is consistent if it doesn’t contradict data in another data set. For example, if the date of birth recorded for the same person in 2 different data sets is the same.

The HCAI DCS includes various validation rules which prevent the entry of invalid dates such as disallowing a specimen date preceding a patient’s date of birth. In such cases, a meaningful validation error message will be displayed to the data-entry user to correct the input before proceeding.
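As an illustration only, a date consistency rule of this kind might look like the sketch below. The function name, parameters and message wording are assumptions made for the example and are not taken from the HCAI DCS.

```python
from datetime import date


def validate_dates(specimen_date, date_of_birth, today):
    """Return a list of validation messages; an empty list means the dates are consistent."""
    errors = []
    if date_of_birth > today:
        errors.append("Date of birth cannot be in the future.")
    if specimen_date < date_of_birth:
        errors.append("Specimen date cannot be before the patient's date of birth.")
    return errors


# Hypothetical entry in which the specimen date precedes the date of birth.
messages = validate_dates(
    specimen_date=date(2024, 1, 1),
    date_of_birth=date(2024, 6, 1),
    today=date(2024, 12, 1),
)
```

Returning messages rather than raising an exception mirrors the behaviour described above, where the user is shown the error and corrects the input before proceeding.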

Timeliness

Timeliness describes the degree to which the data is an accurate reflection of the period that it represents, and that the data and its values are up to date.

Some data, such as date of birth, may stay the same whereas some, such as income, may not.

Data is timely if the time lag between collection and availability is appropriate for the intended use.

Cases entered in the HCAI DCS require sign-off by the CEO of the reporting organisation on the 15th of each month for the previous month’s data. Hence there is minimal delay between the data collection and availability.

Validity

Validity describes the degree to which the data is in the range and format expected. For example, date of birth does not exceed the present day and is within a reasonable range.

Valid data is stored in a data set in the appropriate format for that type of data. For example, a date of birth is stored in a date format rather than in plain text.

The HCAI DCS enforces data validation for many fields to prevent users from entering incorrect information. For example, the “specimen date” must be in the correct date format; otherwise, an error message is displayed which must be resolved before the user can proceed. Wherever possible, drop-down lists are used to minimise data entry errors.

Sound methods

Statistical outputs should be made using the best available methods and recognised standards.

This section describes how the statistics were produced and quality assured.

Data set production

All cases of bacteraemia infection and CDI originate in the hospital (hospital-onset or HO) or community (community-onset or CO).

A case of bacteraemia is classified as hospital-onset if it meets all the following criteria:

  1. The patient is an in-patient, day-patient, emergency assessment patient or unknown location.
  2. The specimen was taken at an acute trust or at an unknown location.
  3. The specimen was taken on or after day 3 of the admission (admission date is considered day 1).

Cases that do not meet all the above criteria are categorised as community onset.

A case of CDI is classified as hospital-onset if it meets all the following criteria:

  1. The patient is an in-patient, day-patient, emergency assessment patient or unknown location.
  2. The specimen was taken at an acute trust or at an unknown location.
  3. The specimen was taken on or after day 4 of the admission (admission date is considered day 1).

Cases that do not meet all the above criteria are categorised as community onset.
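The two sets of criteria above differ only in the day threshold (day 3 for bacteraemia, day 4 for CDI). The sketch below shows one way to express them. The category labels and function name are hypothetical, and treating a missing admission date as community-onset is an assumption of this example rather than a stated rule.

```python
from datetime import date

# Hypothetical labels for the patient categories and specimen locations listed above.
HO_PATIENT_CATEGORIES = {"in-patient", "day-patient", "emergency assessment", "unknown"}
HO_SPECIMEN_LOCATIONS = {"acute trust", "unknown"}
ONSET_DAY_THRESHOLD = {"bacteraemia": 3, "cdi": 4}


def classify_onset(infection, patient_category, specimen_location, admission_date, specimen_date):
    """Return 'HO' if all 3 hospital-onset criteria are met, otherwise 'CO'."""
    if admission_date is None:
        # Assumption: without an admission date, criterion 3 cannot be met.
        return "CO"
    # The admission date counts as day 1.
    day_of_admission = (specimen_date - admission_date).days + 1
    is_hospital_onset = (
        patient_category in HO_PATIENT_CATEGORIES
        and specimen_location in HO_SPECIMEN_LOCATIONS
        and day_of_admission >= ONSET_DAY_THRESHOLD[infection]
    )
    return "HO" if is_hospital_onset else "CO"


# Admitted 1 April, specimen taken 3 April: day 3 of the admission.
bacteraemia_onset = classify_onset("bacteraemia", "in-patient", "acute trust", date(2024, 4, 1), date(2024, 4, 3))
cdi_onset = classify_onset("cdi", "in-patient", "acute trust", date(2024, 4, 1), date(2024, 4, 3))
```

In this example the same dates give hospital-onset for a bacteraemia but community-onset for CDI, because the CDI threshold is one day later.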

It is not possible for UKHSA to change the onset status of a case as it is determined by the above criteria based on the data provided by the reporting organisation. A case may change from one category to another only if the relevant case details are incorrect and require amendment by the trust. Reports published before September 2017 used the term ‘trust-apportioned’ for hospital-onset cases and ‘not trust-apportioned’ for community-onset cases; this was simply a change in terminology.

All cases are also attributed to an ICB. ICBs, which cover a specific geographical area, are NHS organisations responsible for planning health services for their local population.

A sub-ICB for each case is attributed in the following order:

  1. If the patient’s GP practice code is available (and is based in England), the case will be attributed to the ICB at which the patient’s GP is listed, or
  2. If the patient’s GP practice code is unavailable but the patient is known to reside in England, the case is attributed to the ICB catchment area in which the patient resides, or
  3. If both the patient’s GP practice code and patient postcode are unavailable or if a patient has been identified as residing outside England, then the case is attributed to an ICB based on the postcode of the headquarters of the acute trust that reported the case.
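The order of attribution above is a simple fallback chain, sketched below. The function and parameter names are hypothetical, and the sketch assumes the lookups from GP practice code, residential postcode and trust headquarters postcode to an ICB have already been done, with None meaning that no ICB could be derived at that step.

```python
def attribute_icb(gp_icb, residence_icb, trust_hq_icb, resides_in_england):
    """Return (icb, basis) following the order of attribution described above.

    gp_icb: ICB of the patient's GP practice, or None if the practice code is
        unavailable or not based in England.
    residence_icb: ICB of the patient's residential postcode, or None if unavailable.
    trust_hq_icb: ICB of the reporting acute trust's headquarters postcode.
    """
    if gp_icb is not None:
        return gp_icb, "gp_practice"
    if resides_in_england and residence_icb is not None:
        return residence_icb, "residence"
    return trust_hq_icb, "reporting_trust"


# Hypothetical ICB labels.
by_gp = attribute_icb("ICB_A", "ICB_B", "ICB_C", resides_in_england=True)
by_residence = attribute_icb(None, "ICB_B", "ICB_C", resides_in_england=True)
by_trust = attribute_icb(None, None, "ICB_C", resides_in_england=False)
```

Returning the basis alongside the ICB is a design choice of the example: it makes it possible to see how many cases fell through to each step.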

All cases of bacteraemia and CDI are attributed to an ICB regardless of onset. UKHSA’s HCAI DCS does not currently ask NHS organisations to record the patient details needed for ICB attribution (such as patient GP registration details and patient residential postcode) for any bacteraemia or CDI case. To obtain this data, an extract comprising patient NHS number, date of birth, patient forename, patient surname and sex is submitted daily to NHS Digital via the Demographics Batch Services tracing service and matched in a two-stage algorithm using a combination of the provided patient details.

Cases are categorised into one of the following 6 groups for CDI:

  1. Hospital-onset-healthcare-associated (HOHA): date of onset is greater than 2 days after admission (where day of admission is day 1).
  2. Community-onset-healthcare-associated (COHA): is not categorised HOHA and the patient was most recently discharged from the same reporting trust in the 28 days prior to the specimen date (where day 1 is the specimen date).
  3. Community-onset-indeterminate-association (COIA): is not categorised HOHA and the patient was most recently discharged from the same reporting trust between 29 and 84 days prior to the specimen date (where day 1 is the specimen date).
  4. Community-onset-community-associated (COCA): is not categorised HOHA and the patient has not been discharged from the same reporting organisation in the 84 days prior to the specimen date (where day 1 is the specimen date).
  5. Unknown: the reporting trust answered ‘Don’t know’ to the question regarding previous discharge in the 3 months prior to CDI case.
  6. No Information: the reporting trust did not provide any answer for questions on prior admission.

Cases are categorised into one of the following 5 groups for each bacteraemia:

  1. HOHA: date of onset is greater than 2 days after admission (where day of admission is day 1).
  2. COHA: is not categorised HOHA and the patient was most recently discharged from the same reporting trust in the 28 days prior to the specimen date (where day 1 is the specimen date).
  3. COCA: is not categorised HOHA and the patient has not been discharged from the same reporting organisation in the 28 days prior to the specimen date (where day 1 is the specimen date).
  4. Unknown: the reporting trust answered ‘Don’t know’ to the question regarding previous discharge in the month prior to the current episode.
  5. No Information: the reporting trust did not provide any answer for questions on prior admission.
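The two categorisations above can be sketched as a single function, shown below as an illustration under stated assumptions. The function name, the answer codes ('yes', 'no', 'dont_know', None) and the parameter names are hypothetical. The sketch takes the HOHA decision and the number of days since discharge as inputs, already calculated using the conventions described above, and raises an error where a prior discharge is reported without a day count.

```python
def prior_trust_exposure(infection, is_hoha, discharge_answer, days_since_discharge=None):
    """Return the prior trust exposure category for a case.

    infection: 'cdi' or 'bacteraemia'.
    discharge_answer: 'yes', 'no', 'dont_know', or None if the question was not answered.
    days_since_discharge: days from the most recent discharge from the same
        reporting trust to the specimen date (required when the answer is 'yes').
    """
    if is_hoha:
        return "HOHA"
    if discharge_answer is None:
        return "No information"
    if discharge_answer == "dont_know":
        return "Unknown"
    if discharge_answer == "no":
        return "COCA"
    if discharge_answer != "yes":
        raise ValueError(f"unexpected answer: {discharge_answer!r}")
    if days_since_discharge is None:
        raise ValueError("days_since_discharge is required when a prior discharge is reported")
    if days_since_discharge <= 28:
        return "COHA"
    # The indeterminate category (29 to 84 days) applies to CDI only.
    if infection == "cdi" and days_since_discharge <= 84:
        return "COIA"
    return "COCA"


cdi_example = prior_trust_exposure("cdi", False, "yes", days_since_discharge=40)
bacteraemia_example = prior_trust_exposure("bacteraemia", False, "yes", days_since_discharge=40)
```

In this example a discharge 40 days before the specimen date gives COIA for CDI but COCA for a bacteraemia, which reflects the extra category in the CDI scheme.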

Prior trust categories use the following denominator data when calculating rates:

  • HOHA: the infection occurred within hospital and is healthcare associated. Hospital overnight bed-days are used as a denominator as the patient has already been admitted to hospital.
  • COHA: the infection occurred within the community but is healthcare associated. Hospital overnight bed-days and hospital day-only are used as a denominator. The addition of ‘day only’ accounts for community cases who have not been admitted and may initially present as day-only.
  • COCA: the infection occurred within the community and is community associated. Population data is used in the rate calculation.

Monthly data tables

These data tables include a monthly count of total reported cases for each data collection as well as a breakdown by prior trust exposure for the last 13 months. The counts are reported at national (England), ICB, NHS acute trust, UKHSA centre and NHS region-level. The data tables also include information on whether the data was signed off.

Quarterly Epidemiological Commentary (QEC)

The incidence rate of total and CO cases is calculated using their quarterly count and the mid-year population for England. It is converted to an annualised incidence rate to allow comparisons with annual incidence.

Its calculation is: the count of reported episodes in England in a given quarter divided by the mid-year population of England in that year, multiplied by the number of days in that year, divided by the number of days in that quarter and multiplied by 100,000.

The incidence rate of HO cases is calculated using their quarterly count and the KH03 average bed-day activity for England.

Its calculation is: the count of reported episodes in a given quarter in England divided by the daily average number of occupied overnight beds in that quarter in England, then divided by the number of days in the same quarter and multiplied by 100,000.
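The two quarterly calculations described above can be written out as follows. This is a sketch of the arithmetic only: the function names and the input figures are illustrative and are not published values.

```python
def annualised_incidence_rate(count, mid_year_population, days_in_quarter, days_in_year):
    """Annualised rate per 100,000 population from a quarterly count."""
    return count / mid_year_population * days_in_year / days_in_quarter * 100_000


def hospital_onset_rate(count, avg_daily_occupied_beds, days_in_quarter):
    """Hospital-onset rate per 100,000 bed-days from a quarterly count.

    avg_daily_occupied_beds is the KH03 daily average of occupied overnight beds;
    multiplying it by the days in the quarter gives the quarter's bed-days.
    """
    return count / avg_daily_occupied_beds / days_in_quarter * 100_000


# Illustrative inputs for a 91-day quarter in a 365-day year.
total_rate = annualised_incidence_rate(1_000, 50_000_000, days_in_quarter=91, days_in_year=365)
ho_rate = hospital_onset_rate(500, 100_000, days_in_quarter=91)
```

With these inputs the annualised rate is about 8.0 per 100,000 population and the hospital-onset rate is about 5.5 per 100,000 bed-days.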

Percentage changes in rates are calculated using raw rate numbers while those presented in the commentary have been rounded to one decimal place. Similarly, graphs included in this report use raw rate numbers. The raw rate numbers are included in the Quarterly Epidemiological Commentary’s accompanying data.

Annual Epidemiological Commentary (AEC)

To calculate time-to-onset of an episode (bacteraemia or CDI) among inpatients, the number of days between the date of admission to an NHS acute trust and the date of positive specimen is used. This is calculated only for patients who were admitted to an acute trust and whose specimen was taken at an NHS acute trust on or after the date of admission. The number of days between the date of admission and the date of specimen is then grouped into meaningful categories.

The ICB rate (per 100,000 population) is calculated as the number of new cases attributed to the ICB divided by the total ICB population for the financial year, then multiplied by 100,000.

The Office for National Statistics (ONS) mid-calendar year population estimates are used to calculate the financial year population. For instance, for financial year 2023 to 2024 mandatory surveillance data, we use the mid-calendar year 2022 population estimate. For the current year (for example, 2024) the mid-2024 population was unavailable, so the most recent (mid-2023) population estimate was used. For acute trust rates, hospital-onset cases are used as the numerator.

Bed occupancy data (KH03) from NHS England is used as an indicator of the total activity in each trust during the relevant periods and forms the denominator of acute trust rates. KH03 has been published quarterly since April 2010.

The denominator of acute trust rates for all cases, or for cases that are hospital-onset healthcare-associated (HOHA), hospital-onset (HO) or community-onset (CO), uses the total overnight beds KH03 metric. However, for community-onset healthcare-associated (COHA) cases, the denominator is the total ‘overnight beds plus day-only beds’ KH03 metric. These denominators are obtained by multiplying the raw average daily KH03 metrics by the number of days in the relevant period.

The acute trust rate is then the number of new cases reported by the trust, divided by the relevant denominator and multiplied by 100,000. The rates of all, HOHA, HO or CO cases are expressed as ‘per 100,000 bed-days’, while for COHA the rate is expressed as ‘per 100,000 bed-days and day admissions’.
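The acute trust denominators and rates described above can be sketched as below. The function names and input figures are illustrative only. The sketch assumes the KH03 inputs are daily averages for the period, as described above.

```python
def kh03_denominator(avg_daily_overnight_beds, days_in_period, avg_daily_day_only_beds=0.0):
    """Convert KH03 daily averages into a period total.

    Leave avg_daily_day_only_beds at 0 for all, HOHA, HO and CO rates;
    supply it for COHA rates (overnight beds plus day-only beds).
    """
    return (avg_daily_overnight_beds + avg_daily_day_only_beds) * days_in_period


def acute_trust_rate(cases, denominator):
    """Rate per 100,000 units of the supplied denominator."""
    return cases / denominator * 100_000


# Illustrative trust with 800 occupied overnight beds and 200 day-only beds on average.
overnight_bed_days = kh03_denominator(800, days_in_period=365)
coha_denominator = kh03_denominator(800, days_in_period=365, avg_daily_day_only_beds=200)
ho_trust_rate = acute_trust_rate(73, overnight_bed_days)
coha_trust_rate = acute_trust_rate(73, coha_denominator)
```

The same count gives a lower COHA rate than HO rate here because the COHA denominator also includes day-only activity.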

Rates at the England, integrated care board (ICB) or sub-ICB level are calculated by number of new cases that fall into that area, divided by population and multiplied by 100,000, expressed as ‘per 100,000 population’.

Prior to trust apportioning, the rates for all cases were calculated per acute trust. Therefore, to retain the historical time series, an all-cases rate per acute trust is also calculated. ‘All reported cases’ refers to all bacteraemias or C. difficile infections that are detected by the acute trust that processed the specimen. It does not necessarily imply the infection was acquired there.

To calculate the denominator of rates by IMD and age, ONS mid-year populations by the lower layer super output area (LSOA) total during years 2018 to 2022 are used and linked to IMD deciles. The IMD for each case is identified by using the postcode of residence at the time of infection and linked to the LSOA of residence and its 2019 IMD decile. IMD deciles are converted into quintiles. Populations are then converted to financial year-level.

To calculate the denominator of rates by ethnicity and age, 2021 Census populations are used. As 2018 to 2020 and 2022 to 2024 calendar year populations by ethnic group are unavailable, the Census populations by ethnic group and age in 2021 are used as a proxy for the years before and after: the proportion of each age and ethnic group stratum observed in 2021 is applied to the respective populations in 2018 to 2020 and 2022 to 2024. This assumes that the age and ethnic distribution in England has not changed substantially since 2018. The populations are then converted from calendar to financial year.

As C. difficile infection numerators only include people aged 2 years and over, IMD and ethnicity populations produced are restricted to this age group.

The observed incidence rates were calculated for each organism and financial year as the number of infections in a financial year in a given ethnic group or IMD quintile divided by the population in given ethnic or IMD quintile, then multiplied by 100,000.

Age-standardised rates are estimated using direct standardisation with the 2013 European Standard Population, with Byar’s method and the Dobson method adjustment, using the PHEindicatormethods R package. Rates are not calculated for counts of less than 10, where the method becomes unreliable.
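For readers unfamiliar with direct standardisation, the sketch below shows the point estimate only. It is not the PHEindicatormethods implementation and omits the confidence intervals (Byar’s method with the Dobson adjustment). The function name is hypothetical, the age bands and weights in the example are toy values rather than the 2013 European Standard Population, and applying the threshold of 10 to the total count across age bands is an assumption of this example.

```python
def directly_standardised_rate(counts, populations, standard_weights, per=100_000, min_count=10):
    """Directly age-standardised rate, or None when the total count is below min_count.

    counts, populations and standard_weights are sequences with one entry per age band.
    """
    if not (len(counts) == len(populations) == len(standard_weights)):
        raise ValueError("inputs must have one entry per age band")
    if sum(counts) < min_count:
        return None
    # Weight each age-specific rate by the standard population for that band.
    weighted_sum = sum(w * c / n for c, n, w in zip(counts, populations, standard_weights))
    return weighted_sum / sum(standard_weights) * per


# Toy example with 2 age bands.
dsr = directly_standardised_rate(counts=[10, 30], populations=[10_000, 10_000], standard_weights=[1, 3])
suppressed = directly_standardised_rate(counts=[1, 2], populations=[10_000, 10_000], standard_weights=[1, 3])
```

In the toy example the age-specific rates are 100 and 300 per 100,000, and the standard weights of 1 and 3 give a standardised rate of 250 per 100,000.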

Cases without a known IMD or ethnicity value (respectively 1.5% and 4.4% of all cases) are excluded from the calculation of IMD and ethnicity rates. This means that rates stratified by IMD or by ethnicity are slight underestimates.

A missing IMD value may be because:

  • the patient’s residence was not in England
  • the patient’s residence was in an area that has not been assigned an IMD value yet
  • the patient was homeless

A missing ethnicity value may be because:

  • the patient had opted not to state their ethnic group on admission to hospital
  • the trust did not record a valid NHS number or date of birth on the DCS
  • HES did not contain an ethnicity value

Mortality rate is used for assessing risk of death and is calculated by dividing the number of deaths by the population at risk. This reflects the incidence of all-cause deaths following these infections in the population.

Case fatality rate is a measure for comparing survivability of different infections and is expressed as the number of deaths as a percentage of all reported cases.
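The difference between the two measures above is the denominator, as the short sketch below shows. The function names and figures are illustrative, and expressing the mortality rate per 100,000 population is an assumption of the example, as no multiplier is stated for it.

```python
def mortality_rate(deaths, population_at_risk, per=100_000):
    """Deaths divided by the population at risk, scaled by `per` (assumed here to be 100,000)."""
    return deaths / population_at_risk * per


def case_fatality_rate(deaths, reported_cases):
    """Deaths as a percentage of all reported cases."""
    return deaths / reported_cases * 100


# Illustrative figures: 150 all-cause deaths among 1,000 cases in a population of 1,000,000.
example_mortality = mortality_rate(150, 1_000_000)
example_cfr = case_fatality_rate(150, 1_000)
```

With these figures the mortality rate is 15 per 100,000 population, while the case fatality rate is 15%.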

Data is presented on all-cause mortality, and therefore includes deaths that may not be directly attributable to the infections.

Two Quarterly Mandatory Laboratory Returns (QMLR) indicators measure blood culture sets examined and stool specimens tested for diagnosis of CDI. The blood culture sets (per 1,000 bed-days) is calculated as number of blood culture sets examined divided by the product of the average overnight bed occupancy and number of days in the period, then multiplied by 1,000. The C. difficile toxin test rate (per 1,000 bed-days) is calculated as the number of stool specimens tested for diagnosis of CDI divided by the product of average overnight bed occupancy and number of days in the period, then multiplied by 1,000.

Independent sector (IS) report

Counts and rates (per 100,000 bed-days and discharges) of MRSA, MSSA, E. coli, Klebsiella spp., P. aeruginosa bacteraemia and CDI are presented by IS organisation for the latest 12-month period with comparison of rates to the previous year.

An IS organisation can comprise a group of hospitals owned by one company or a single hospital. It is possible to identify a group versus a hospital using the ‘number of hospitals in organisation’ field in the HCAI DCS.

The modified inpatient bed-days (bed-days plus discharges) are provided for the most recent financial year available as an indication of the size of each facility.

The hospital types are listed for the hospitals within a group: large hospital (50 beds or more), small hospital (fewer than 50 beds), and NHS treatment centre and diagnostic centre seeing mainly day case patients. All types are listed where a group comprises more than one hospital type. IS organisations are requested to submit their bed-day plus discharge denominators. The bed-day plus discharge denominator for shorter stay hospitals is the sum of the number of bed-days in a year and the number of discharges in a year.

Instead of counting the number of midnights the patient was resident for, this counts the number of different days on which they were in the hospital. A day case will count as 1, a one-night stay in the year will count as 2.

Bed-days in the financial year April 2023 to March 2024 is the sum of the number of beds occupied each midnight during the year. For example, the sum of the number of bed occupants at midnight for the day ending 1 April 2023 is added to the number of bed occupants at midnight for each subsequent day up to and including 31 March 2024. 

Alternatively, if bed-days are being derived from admission dates and discharge dates, the calculation for each patient is the discharge date or 1 April 2024 (whichever is earlier) minus the admission date or 1 April 2023 (whichever is later).

Only patients who are admitted to hospital before 1 April 2024 and discharged on or after 1 April 2023 are counted towards a bed-day in that financial year. That is, the latest date they could have been admitted was 31 March 2024 and the earliest date they could have been discharged was 1 April 2023. If the patient is still in hospital and does not yet have a discharge date, then 1 April 2024 should be used as the discharge date. The sum of the days for all the patients then provides the total number of bed-days.
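The date-based derivation above can be sketched as follows. This is an illustration under stated assumptions: the function name, the use of None for a missing discharge date, and the example stays are all hypothetical.

```python
from datetime import date

FY_START = date(2023, 4, 1)  # first day of the financial year
FY_END = date(2024, 4, 1)    # day after the financial year ends


def bed_days_in_year(admission, discharge=None, fy_start=FY_START, fy_end=FY_END):
    """Bed-days one stay contributes to the financial year.

    A stay with no discharge date yet is treated as discharged on fy_end.
    """
    if discharge is None:
        discharge = fy_end
    # Only stays admitted before fy_end and discharged on or after fy_start count.
    if admission >= fy_end or discharge < fy_start:
        return 0
    return (min(discharge, fy_end) - max(admission, fy_start)).days


# Hypothetical stays as (admission date, discharge date or None).
stays = [
    (date(2023, 3, 30), date(2023, 4, 3)),  # spans the start of the year
    (date(2023, 6, 1), date(2023, 6, 4)),   # wholly within the year
    (date(2024, 3, 30), None),              # still in hospital at year end
    (date(2023, 3, 1), date(2023, 3, 31)),  # discharged before the year began
    (date(2023, 7, 1), date(2023, 7, 1)),   # day case, no overnight stay
]

total_bed_days = sum(bed_days_in_year(a, d) for a, d in stays)
```

Under this sketch a day case contributes no bed-days, so it is only reflected in the denominator once discharges are added.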

Discharges in the financial year April 2023 to March 2024 are the number of patients with a discharge date between 1 April 2023 and 31 March 2024. This is the sum of the number of patients discharged on 1 April 2023 and the number discharged on each subsequent day up to and including 31 March 2024. It should include any day cases that took place during the year.

Figures provided are aggregated for each organisation (which could own more than one hospital or facility) or for the individual hospital if an organisation comprises one hospital or facility.

Quality assurance

All statistical processing is performed independently by 2 scientists and the final data is cross-checked to verify that it is correct. In addition, when rates are calculated for our quarterly commentaries and annual data tables and commentary, we also independently process the data used for denominators (occupied overnight bed days (KH03 return) from NHS England and population data from the Office for National Statistics).

Confidentiality and disclosure control

Personal and confidential data is collected, processed, and used in accordance with the UKHSA privacy notice. All UKHSA staff with access to personal or confidential information must complete mandatory information governance training, which must be refreshed every year. Information is stored on computer systems that are kept up-to-date and regularly tested to make sure they are secure and protected from viruses and hacking. UKHSA staff do not store data on their own laptops or computers. Instead, data is stored centrally on UKHSA servers.

No personally identifiable information is included in the published data. The structure of the published tables prevents them from being broken down in ways that could compromise individual privacy through cross-referencing. Additionally, when small numbers are reported in the data, a careful assessment is conducted to balance the need for detailed reporting with the potential risk of secondary disclosure, ensuring privacy is maintained without compromising the usefulness of the data.

Geography

Mandatory surveillance includes data from all NHS trusts in England. Each report contains data for overall counts and rates (except for monthly tables which include counts only) at different geographic levels. Monthly tables are published at national (England), ICB, UKHSA centre, NHS region, and NHS trust levels. The quarterly epidemiological commentary is published at national (England) level. The annual epidemiological commentary is published at national (England) and ICB level.

Quality summary

The Code of Practice for Statistics defines quality in statistics as:

  • fitting their intended uses
  • based on appropriate data and methods
  • not materially misleading

Quality requires skilled professional judgement about collecting, preparing, analysing, and publishing statistics and data in ways that meet the needs of people who want to use the statistics.

This section assesses the statistics against the European Statistical System dimensions of quality.

Relevance

Relevance is the degree to which the statistics meet user needs in both coverage and content.

These mandatory surveillance outputs are critical to tracking progress towards controlling key healthcare-associated infections. In particular, the National Action Plan for AMR 2024 to 2029, and the 2019 to 2024 plan before it, set out an ambition to control Gram-negative bacteraemia including 3 infections covered by this surveillance (E. coli, Klebsiella spp. and P. aeruginosa).

The data also allows for NHS acute trusts to monitor their infection rates, and benchmark against peers and nationally.

The different statistics published are used in several ways including the following:

  • mandatory HCAI surveillance outputs are used to monitor progress on controlling key healthcare-associated infections and for providing epidemiological evidence to inform action to reduce them
  • mandatory surveillance outputs are routinely used to appraise local or regional NHS management of infection levels within their area
  • data provides unique case level information
  • data is used to support the NHS objective of improving the quality and safety of health services and promoting patient choice by providing access to information on NHS performance
  • data is used nationally for benchmarking purposes and for the performance management of MRSA bacteraemia and CDI objectives set by NHS Improvement
  • data or outputs are routinely used to answer relevant Parliamentary Questions
  • data is used to inform patient choice via the NHS Choices website
  • NHS acute trusts and sub-ICB locations use this data to monitor progress against these objectives and to help inform action to reduce these infections locally
  • the E. coli, Klebsiella spp. and P. aeruginosa bacteraemia surveillance outputs are an integral part of NHS Improvement’s strategy to prevent any increase in Gram-negative bloodstream infections by 2029 compared to the 2019 to 2020 financial year baseline, as part of the UK National Action Plan for AMR 2024 to 2029 which superseded the previous UK National Action Plan for AMR 2019 to 2024

We have continued to make changes to the publications to meet user needs. From the 2022 to 2023 annual epidemiological commentary, we have added a section on age-standardised incidence rates by IMD and ethnicity. From the 2023 to 2024 annual report, we have also included analysis of 2 QMLR indicators: total blood culture sets and total CDI toxin tests.

Accuracy and reliability

Accuracy is the proximity between an estimate and the unknown true value. Reliability is the closeness of early estimates to subsequent estimated values.

Infection cases are reported by NHS acute trusts. As part of the verification process, the CEO of the acute trust signs off infection data reported each month by the 15th of the following month. This sign-off process provides formal assurance that the data is accurate and complete. Published statistics, therefore, include details of all cases for the reported period.

On occasion, however, a notification is received that an amendment is required. This may occur when sign-off is required before full laboratory results are available, resulting in additional cases being added following laboratory confirmation. Alternatively, deletions may be required where an acute trust has entered case information incorrectly. In that situation, the CEO must request that the incorrect information is deleted and replaced with the correct information.

NHS acute trusts or external agencies like the Care Quality Commission may also perform audits of local infection data. This can result in requests to add infection episodes that had not previously been entered. Finally, an NHS trust may ask to delete a case if it is a duplicate of a case reported from another trust.

NHS acute trusts may request to alter their data to improve the sub-ICB location (SICBL) attribution of a given infection record. This process is undertaken via an ‘unlock’ of the HCAI DCS. A log of the number of unlocked cases by data collection and unlock reason is maintained.

A total of 66 (62%) acute trusts requested an unlock of at least 1 case across all organisms affecting data in the financial year 2022 to 2023, which totalled 297 unlocked cases. Of those unlocks, 45.1% were additions, 33.7% were amendments and 21.2% were deletions to a locked period. In financial year 2022 to 2023, compared to the previous financial year 2021 to 2022, there was an increase of 3.1% in the number of trusts that requested unlocks to change their data but a 35.2% decrease in the total number of unlocks. The number of unlock requests to add a new case declined by 58.4%, while the number of requests to delete or amend a case increased by 43.2% (44 to 63 requests) and 8.7% (92 to 100 requests), respectively.
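The year-on-year figures in this paragraph are relative changes between 2 financial years. As a minimal sketch (the function name is an assumption), the change implied by each pair of request counts quoted above can be recomputed like this:

```python
def percentage_change(previous, current):
    """Relative change from previous to current, expressed as a percentage."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100


# The two pairs of request counts quoted in the paragraph above.
change_44_to_63 = round(percentage_change(44, 63), 1)
change_92_to_100 = round(percentage_change(92, 100), 1)
```

A rise from 44 to 63 requests corresponds to 43.2%, and a rise from 92 to 100 requests corresponds to 8.7%.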

The HCAI DCS includes facilities to assist NHS acute trusts to identify duplicate infection episodes within their organisation. A pop-up for potential duplicates at case entry is available to determine that no duplicates have been entered for a designated period. Following sign off, as the CEO of an acute trust has verified their data as being accurate, data used for statistical publications is not altered by the UKHSA mandatory HCAI surveillance team to remove potential duplicate records. This may result in multiple listings of the same infection episode in the data set.

Although there should not be an over-coverage as the mandatory surveillance of healthcare associated infections data set is a national-level data collection, there is a possibility that some cases may not be reported to the HCAI DCS, resulting in under-coverage. To ascertain the level and to rectify this, a consistency study is performed comparing voluntary reported laboratory information for England with the mandatory surveillance scheme data set.

Data changes between releases are highlighted in each publication, so that users are made aware of any changes to historical data between publications. Further information on this process is available on the caveats page of each routine publication.

Not all IS organisations have signed off their data or submitted data for the reporting period, potentially leading to unfinalised and inaccurate data.

Measurement error

All mandatory HCAI surveillance data is collected via the HCAI DCS. The appendices of the mandatory HCAI surveillance protocol detail definitions and guidance on each field in the data collection. Therefore, there should be little concern over the interpretation of the questions by different users, although it should be noted that some questions are subjective in nature such as asking the clinical opinion of the treating physicians.

There is a low item non-response error as the bulk of data used to produce the mandatory HCAI surveillance outputs is from mandatory questions in the HCAI DCS. This means that a response is required to save the infection episode. The exceptions are in the data collected on risk factors for bacteraemias presented in the AEC because the risk factor or source of bacteraemia questions are not mandatory fields. However, there are accompanying statements in the relevant sections of the AEC on the level of response for this data.

However, unit non-response exists where individual NHS acute trusts have not entered data and/or signed off data. All trust-level outputs highlight such non-responders. Consistent non-responders are further referred to NHS England for follow-up.

Processing error

Processing errors may occur during the data entry stage. The data collected via the HCAI DCS is either entered by hand or partially uploaded (key responses to questions required to save an infection episode) using the HCAI DCS data upload wizard. Data entry errors may occur because the source data at the acute trust is incorrect or missing or in the transcription process.

While it is not possible to provide a level or direction of bias through processing errors for the entire data collection, it is possible to estimate the collective level of processing errors for 2 key variables (date of birth and NHS number), which can be used as an indicator for the full data collection. Assessing the percentage of all cases which could not be attributed via a match with the NHS Spine provides an indication of data entry errors.

There is the potential for bias in the statistics as organisations aim to meet performance targets. Therefore, there is a conflict between the use of statistics for both epidemiology/public health and for performance management.

A separate data set (the Quarterly Mandatory Laboratory Returns), which includes the numbers of C. difficile toxin tests performed by laboratories in England between 2008 and 2013, was queried to ascertain whether there were any changes in the testing of C. difficile toxin over a 6-year period in England. While there has been an overall decline in the count and rate of C. difficile toxin testing in England over this time period, there has been a much greater decline in the count and rate of CDI, with a much higher ratio of toxin tests performed per case of CDI identified in 2013 than in 2008. This provides little evidence of large-scale changes in testing practices over time and suggests that ‘gaming’ by NHS acute trusts to avoid exceeding CDI objectives and incurring financial penalties has not been a major factor in the reduction of CDI in England.

Timeliness and punctuality

Timeliness refers to the time gap between publication and the reference period. Punctuality refers to the gap between planned and actual publication dates.

Mandatory HCAI surveillance data is published in as timely a manner as possible. Data is signed off by acute trusts’ chief executives 15 days after the end of each month, meaning that sign off for each month is required by the 15th of the following month. Data is published on a monthly, quarterly and annual basis, and publication dates are pre-announced at least 28 days in advance, in line with the Code of Practice for Statistics.

The UKHSA official statistics publication calendar, which includes mandatory HCAI surveillance-specific announcements, is available online.

Monthly data tables 

Monthly data is processed and analysed before being published on the first Wednesday of the following month. This occurs between 2 and 6 weeks following the end of a given month, depending on how the month falls. For example, January 2017 data was signed off on 15 February 2017 and then published on 1 March 2017. This is 2 weeks from sign-off to publication.
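The monthly schedule can be sketched in code. This is an illustration only: it assumes, based on the January 2017 example above, that publication falls on the first Wednesday of the month after the sign-off month, and the function names are hypothetical.

```python
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday() numbers Monday as 0


def first_weekday_of_month(year, month, weekday):
    """Return the first date in the given month that falls on the given weekday."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset)


def monthly_publication_date(sign_off):
    """First Wednesday of the month after the sign-off month (assumed rule)."""
    if sign_off.month == 12:
        year, month = sign_off.year + 1, 1
    else:
        year, month = sign_off.year, sign_off.month + 1
    return first_weekday_of_month(year, month, WEDNESDAY)


# January 2017 data, signed off on 15 February 2017.
sign_off = date(2017, 2, 15)
publication = monthly_publication_date(sign_off)
days_from_sign_off = (publication - sign_off).days
```

For the January 2017 example this gives 1 March 2017, which is 14 days after sign-off.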

QEC 

The QEC is published approximately 2 months following sign-off of the last full month of data for inclusion. For the April 2019 to March 2020 publications, this was increased to 4 months. The increase is to allow for the inclusion of the most recent hospital admissions data which would otherwise be unavailable at the time of the QEC’s production. This change is relevant due to the lower than usual levels of hospital admissions in April 2019 to March 2020 due to the COVID-19 pandemic. Publication of this report occurs on the first Thursday of the fourth month after the quarter covered in the report. For example, data up to and including December 2021 were signed off on 15 January 2022 and published on 7 April 2022.

Annual data tables and AEC

Annual data tables and the accompanying AEC are usually published in early July each year. For the April 2019 to March 2020 publication, this was delayed to September. This delay was to allow for the inclusion of the most recent hospital admissions data which would otherwise be unavailable at the time of the AEC’s production. Similarly, this change was due to the lower than usual levels of hospital admissions in April 2019 to March 2020 due to the COVID-19 pandemic.

The annual data tables include counts and rates for both acute trusts and clinical commissioning groups (CCGs). The AEC represents the most substantial HCAI mandatory surveillance output produced or published each financial year. The lead time necessary for analysis and compilation of data should not be underestimated. Decreasing the amount of time between sign off and publication of these reports has been considered. However, doing so would not allow enough time to undertake relevant data quality checks on either the data used for preparing the report or the report itself. Hence the benefit of using the current publication schedule far outweighs any minor benefit that might be achieved in reducing the lead time for the QEC publication.

Furthermore, the changes to the publication schedule for 2020 to 2021 were due to those periods having atypical levels of hospital admission, which made it necessary to wait for and use published admission data.

Accessibility and clarity

Accessibility is the ease with which users can access the data, also reflecting the format in which the data is available and the availability of supporting information. Clarity refers to the quality and sufficiency of the metadata, illustrations and accompanying advice.

All HCAI outputs have been reviewed for accessibility requirements, with several changes made to ensure they are accessible. Since 2022, the QEC (containing data up to January to March 2022) and AEC (containing data up to April 2021 to March 2022) have been published in HTML format which provides the accessibility features mentioned in the GOV.UK accessibility statement. This format enhances accessibility by supporting screen readers and allowing easy navigation using a keyboard, ensuring the content is accessible to a wider audience. Additionally, HTML allows for text resizing and media alternatives such as alt text which help to improve the overall user experience. The publications have also been reviewed for clarity, incorporating plain English language, main messages and data visualisations.

The reports include data visualisations which help users to understand the data. These have been reviewed and updated to ensure colours used provide sufficient contrast to be distinguished and are colour-blind friendly. The accompanying data tables are published in ODS format and follow accessibility guidelines. Each sheet contains only one table and no nested tables. Each sheet contains a header which provides a description of the corresponding table.

Coherence and comparability

Coherence is the degree to which data that are derived from different sources or methods, but refer to the same topic, are similar. Comparability is the degree to which data can be compared over time and domain.

The mandatory HCAI surveillance scheme aligns closely with surveillance processes and definitions of the European Centre for Disease Prevention and Control (Europe) and the Centers for Disease Control and Prevention (USA), to allow comparability where possible.

There are, however, some differences between the English mandatory HCAI surveillance scheme and the surveillance undertaken by others, including the UK devolved administrations and internationally. These include some case definitions and protocols for diagnosing the infections, definitions regarding inpatient episode versus trust apportioned or assigned episodes, age groups included in the surveillance schemes and the way in which data is presented by periods. As the population sizes of the other devolved administrations are different to England, crude counts of infections cannot be compared amongst countries in the UK. Furthermore, as the population demographics amongst the devolved administrations differ, the denominators used to calculate any infection rates are also not directly comparable. Therefore, the data provided in the published reports from Public Health Agency Northern Ireland, Public Health Wales and Health Protection Scotland is not directly comparable with the data published by UKHSA.

Uses and users

Users of statistics and data should be at the centre of statistical production, and statistics should meet user needs.

This section explains how the statistics are used, and how we understand user needs.

Appropriate use of the statistics

These statistics present information on cases reported to mandatory surveillance since the start of surveillance for each infection. Data is presented for transparency and to allow tracking of these mandated infections across multiple settings.

The onset algorithm, and later the prior trust exposure algorithm, introduced since 2017 to align more closely with the European Centre for Disease Prevention and Control (Europe) and the Centers for Disease Control and Prevention (USA), sought to attribute cases to either a hospital or community setting, and to determine whether a case was healthcare-associated, to identify where the infection may have occurred. Prior healthcare exposure definitions require trusts to enter exposure to the reporting trust alone, which may lead to an underestimation of healthcare association as patients may have had contact with other healthcare settings that is not captured. However, this definition has been consistently applied across all surveillance years. Also, when comparing data with other UK nations or countries, it is important to consider any differences in infection definitions, onset and prior trust exposure algorithms, and the deduplication window used.

The IS report does not provide a basis:

  • for comparisons between different IS organisations due to their variable size and range (case mix) of patients seen
  • for reliable comparison of these infections between the NHS and IS organisations

Known uses

We are aware that the statistics have been used for:

  • monitoring progress on controlling key healthcare associated infections and for providing epidemiological evidence to inform action to reduce them
  • education and training
  • strategy and resource allocation
  • benchmarking purposes and for the performance management of MRSA bacteraemia and CDI objectives
  • research
  • informing patient choice

Known users

National users

UKHSA uses the data to:

  • undertake epidemiological analyses at national, regional and local level
  • provide, on request, relevant response to Parliamentary Questions

The Department of Health and Social Care (DHSC) uses the data to:

  • routinely brief ministers on national and regional incidence of MRSA, MSSA, E. coli, Klebsiella spp. and P. aeruginosa bacteraemia and CDI
  • inform and identify national level targets for interventions or reduction strategies

NHS England and NHS Improvement use the data to:

  • identify and establish performance management
  • set national and local level performance management targets
  • assess performance against objectives

Regional or local users

ICBs use the data to assess NHS trust or SICBL performance against targets and objectives at a local level.

UKHSA Field Service and UKHSA regions use the data to:

  • assist in outbreak investigation as necessary
  • inform public health initiatives at a local level

NHS acute trusts use the data to:

  • inform trust boards of the current organisational position in terms of key HCAIs (MRSA, MSSA and E. coli bacteraemia and CDI)
  • monitor progress against performance management objectives

SICBL use the data to:

  • monitor progress against performance management objectives
  • assist in the commissioning of services from relevant acute level providers

User engagement

A routine ‘Stakeholder Engagement Forum’ is held every 6 months. This meeting includes representation from a wide range of national and local level stakeholders such as SICBLs and acute trusts.

Standing items on the meeting’s agenda include recent publications, experiences, improvements and future developments.

Following the meeting, a summary of the discussion is produced and is available on the HCAI DCS website.

Meeting feedback is used to improve ongoing engagement. It is also used to inform future development and to ensure that data users remain central to the process.

Most health protection functions in the UK are devolved to the other UK nations’ public health agencies. National Services Scotland publishes HCAI reports, Public Health Wales publishes HCAI data on its dashboard while Public Health Agency of Northern Ireland reports on data annually.

The European Centre for Disease Prevention and Control also reports on HCAI.