
An international comparison of longitudinal health data collected on long COVID in nine high income countries: a qualitative data analysis

Abstract

Background

Long coronavirus disease (COVID) presents a significant health challenge. Long-term monitoring is critical to support understanding of the condition, service planning and evaluation. We sought to identify and examine longitudinal health data collected on long COVID to inform potential decisions in England regarding the rationale for data collection, the data collected, the sources from which data were collected and the methods used for collection.

Methods

We included datasets in high-income countries that experienced similar coronavirus disease 2019 (COVID-19) waves to England pre-vaccine rollout. Relevant datasets were identified through literature searches, the authors’ networks and participants’ recommendations. We undertook semi-structured interviews with individuals involved in the development and running of the datasets. We held a focus group discussion with representatives of three long COVID patient organisations to capture the perspective of those with long COVID. Emergent findings were tested in a workshop with country interviewees.

Results

We analysed 17 datasets from nine countries (Belgium, Canada, Germany, Italy, the Netherlands, New Zealand, Sweden, Switzerland and the United Kingdom). Datasets sampled different populations, used different data collection tools and measured different outcomes, reflecting different priorities. Most data collection was funded through research rather than by the health care system and was time-limited. For datasets linked to specialist services, there was uncertainty surrounding how long these would continue. Definitions of long COVID varied. Patient representatives favoured self-identification, given challenges in accessing care and receiving a diagnosis; New Zealand’s long COVID registry was the only example identified using this approach. Post-exertion malaise, identified by patients as a critical outcome, was absent from all datasets. The lack of patient-reported outcome measures (PROMs) was highlighted as a limitation of datasets reliant on routine health data, although some had developed mechanisms to extend data collection using patient surveys.

Conclusions

Addressing research questions related to the management of long COVID requires diverse data sources that capture different populations with long COVID over the long term. No country examined has developed a comprehensive long-term data system for long COVID, and, in many settings, data collection is ending, leaving a gap. There is no obvious model for England or other countries to follow, assuming there remains sufficient policy interest in establishing a long-term long COVID patient registry.


Background

Long coronavirus disease (COVID) is a major legacy issue of the coronavirus disease 2019 (COVID-19) pandemic [1,2,3,4,5,6,7,8]. The World Health Organization estimates that 17 million people in Europe alone were experiencing long COVID symptoms in 2022 [7]. The term long COVID was created by patients early on in the pandemic in the absence of an internationally agreed definition of the condition [9,10,11]. Long COVID is also commonly referred to as post COVID-19 syndrome, post COVID-19 condition, long-haul COVID-19, chronic COVID-19 and post-acute sequelae of SARS-CoV-2 (PASC) in the literature [12]. The National Institute for Health and Care Excellence (NICE) in the United Kingdom identifies long COVID as a multi-system condition with a range of debilitating symptoms that continue or develop after acute COVID-19 and are not explained by an alternative diagnosis [13]. It includes both ongoing symptomatic COVID-19 (from 4 to 12 weeks) and post-COVID-19 syndrome (12 weeks or more). Uncertainty persists around the proportion who will fully recover [14].

Research has identified more than 200 symptoms with impacts on multiple organ systems [15]. Long COVID is having a significant long-term impact on the health, wellbeing, daily activities and ability to work of those experiencing prolonged symptoms, as well as wider effects on their families and society [6, 14, 16,17,18,19,20,21,22,23,24]. Beyond the devastating personal impacts, the prevalence and complex nature of the condition have considerable implications for health and care systems and for the economy [18, 20, 22, 23]. In the United Kingdom, a survey by the Trades Union Congress found that 20% of people with long COVID were on sick leave and 16% were working reduced hours [25], while another study estimated that 0.3% of the total working population have left employment owing to long COVID [26].

Long-term monitoring and follow-up of people with long COVID are therefore critical to support our understanding of the condition, including its prevalence, natural history and long-term outcomes, and to understand which treatments, interventions and care models work best and for whom. Data are also needed to support service improvement, by clarifying demand for services, equity of access and potential areas of unmet need, and to support service planning, delivery and evaluation [3, 27]. To support this endeavour, there is a need to develop new data systems or adapt existing ones to identify and follow individuals with long COVID over time. Recommendations have been made in a number of countries to establish a long COVID disease register. For example, the Bevan Commission in Wales recommended that the Welsh Government take steps to consolidate existing databases and resources held within different organisations and recruit a self-referred population of people with long COVID [28, 29].

In this study, we sought to identify and examine longitudinal health data collected on long COVID across high-income countries to inform potential decisions on developing a long COVID dataset for England. Specifically, we sought to identify the rationale for data collection, the data collected, the sources from which data were collected and the methods used for collection.

Methods

Selection of long COVID datasets

We selected higher-income countries, with well-developed health and social care data systems, that experienced similar COVID waves to England pre-vaccine rollout. These countries were Austria, Belgium, Canada, England, France, Germany, Italy, the Netherlands, Spain, Sweden and Switzerland. To identify potentially relevant long COVID datasets, we searched PubMed for research outputs and commentaries, Google for news articles, and the websites of relevant government departments and public health bodies within each country. We also drew on the authors’ professional networks and the recommendations of country experts and individuals approached to participate. Searches varied between platforms. We used broad search terms such as “long COVID” and “data collection” to capture any ongoing research about long COVID. We ran the searches in the languages of the countries in which we were seeking datasets, but reviewed only English-language publications. We reviewed any sources that referenced collecting or utilizing existing data on long COVID, to determine how data were collected and from which datasets. We undertook follow-up searches of the identified long COVID data collection activities to obtain more information and manually reviewed reference lists of identified journal papers and reports to find other relevant sources. Relevant long COVID datasets included any dataset that captured individual-level data related to long COVID and reported repeat information on the included participants. We attempted to obtain the contact details of staff or authors to invite for interview.

On advice from experts, we extended our list of countries to include New Zealand, despite it not having experienced similar COVID waves pre-vaccination, on the grounds that it was distinctive in having established a national registry.

Data collection

We conducted semi-structured interviews with informants involved in the development, management, collection and analysis of the identified datasets. They were purposively selected on the basis of their close involvement in the development and/or running of these datasets and had sufficient familiarity with the datasets to be able to respond to our questions. Further participants were identified through the authors’ networks and on the basis of recommendations from individuals who had already been approached to participate.

Potential informants were invited by email and provided with background information on the purpose of the study. Interviews were conducted on the video conferencing platform Zoom between 10 October 2022 and 9 February 2023. All interviews were conducted in English and, with consent, were audio-recorded, lasting between 1 and 1.5 h. One interview was not audio-recorded; instead, notes were taken.

The research team designed the topic guide (Appendix 1). Questions covered the objectives of data collection; how long COVID is defined and how participants are sampled and recruited or identified within existing data; how (and for how long) individuals are followed-up; the measures collected across the different levels of the health and social care system, demographic data, as well as service use and the outcomes captured; management and governance of the data; funding sources; and perception of data quality.

NHS England commissioned our work as a rapid scoping exercise, so we did not have time to consult patient groups about topic guide questions. However, to capture the patient perspective on the utility of the existing data and implications for future data collection in England, we undertook a focus group discussion with representatives of three long COVID patient groups: one United Kingdom-based (Long COVID Support United Kingdom), one German (Long COVID Deutschland) and one pan-European (Long COVID Europe). A summary of early findings was shared in advance of the discussion. The discussion focused on three questions: (1) what should be the aims of collecting data related to long COVID; (2) who should be included in a long COVID dataset; and (3) which measures should be collected? The discussion took place over Zoom.

Analysis

Transcripts and audio recordings of the interviews with country informants were reviewed by two team members, J.E. and E.S. Data were extracted by the same team members into a common extraction table to capture information on the history and development of each dataset, its objectives and accomplishments, sampling and recruitment methods, the measures collected, management and governance of data, funding, support from other organisations and general views from informants on the quality of the data. Information from literature and any additional documentation shared by informants was also added to the extraction table. J.E. and E.S. each independently populated the extraction tables for half of the interviews and reviewed each other’s entries against interview transcripts, highlighting points of disagreement or missing information. The populated coding frame was shared with interviewees via email for their review and to clarify any points of uncertainty that had arisen during the coding. Extraction tables and subsequent drafts of the analysis were shared with senior team members, N.M. and R.W., for input and feedback.

Developing recommendations

All participants in the study, including patient group members, were invited to attend an online workshop on 23 May 2023. The purpose of the workshop was to share the key study findings and reflect on the future direction of data collection related to long COVID. In breakout rooms, participants were asked to consider why data should be collected on long COVID, what data should (or should not) be collected and how such data should be collected. Time was also allocated for all attendees to share and discuss each breakout room’s insights. The online workshop was recorded with the permission of attendees and notes were taken by the research team to capture the breakout discussions. The workshop both corroborated the research team’s initial findings and recommendations, and generated additional insights that informed the development of the study recommendations.

Ethics

The project was approved by the London School of Hygiene & Tropical Medicine’s research ethics committee, approval number: 28096. All informants gave written consent to take part in the study.

Results

We conducted interviews with individuals representing 17 longitudinal datasets from nine countries (Table 1 and Appendix 2). In Spain, individuals from the Spanish Network for Research on Long COVID (REiCOP) and the Spanish Society of General and Family Doctors (SEMG) informed us they were in the preliminary stages of developing a register (REGICOVID-AP Clinical Registry), but within the timeframe of this study we were not able to collect further details [30]. We were not able to identify any examples of longitudinal data collection in Austria or France.

Table 1 Overview of the aims of data collection, when data collection started and ended

The included datasets sampled different populations, used different data collection tools and measured different outcomes. We grouped the datasets into five types on the basis of the population sampled and the data collection tools used: (1) population surveys (n = 3); (2) surveys of individuals with a positive COVID-19 test (n = 4); (3) individuals with a COVID-19 test result identified in routine health data (n = 4); (4) datasets of individuals who had sought health care specifically for long COVID (n = 5) and (5) individuals who self-reported or self-identified as having long COVID (n = 1). Datasets were managed by a mixture of local and national governmental departments, academic institutions and independent nonprofit foundations (working with the healthcare sector).
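As a quick consistency check on the typology above, the following minimal sketch tallies the per-type counts reported in the text and confirms that they sum to the 17 datasets analysed. The dictionary labels are shortened paraphrases of the five types, and the variable names are illustrative only.

```python
# Counts per dataset type, taken from the text above.
# Labels are shortened paraphrases; variable names are illustrative.
datasets_by_type = {
    "1: population surveys": 3,
    "2: surveys of individuals with a positive COVID-19 test": 4,
    "3: COVID-19 test result identified in routine health data": 4,
    "4: sought health care specifically for long COVID": 5,
    "5: self-reported or self-identified long COVID": 1,
}

total = sum(datasets_by_type.values())

for label, n in datasets_by_type.items():
    # Share of all datasets that fall into this type.
    print(f"Type {label}: n = {n} ({n / total:.0%})")

print(f"Total datasets: {total}")
```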

Most of the datasets examined in this study were time-limited. Dataset types 1, 2 and 3 (see Table 1) had either ended or were winding down. Likewise, specialist long COVID services in all countries we examined were, at the time of the study, seeing patients only for a limited period: patients in British Columbia attended post-COVID-19 recovery clinics for up to 18 months, after which care reverted to their GPs, while the Belgian care pathway covered care for up to 12 months. In all countries, there was uncertainty over how long these specialist services would continue to be funded. The Aotearoa New Zealand long COVID register was conducting surveys for 6 months post-recruitment, beyond which participants were to be followed up in electronic health records (EHRs).

Why collect data on long COVID?

The purpose of data collection varied across the different types of datasets, providing insight into different research and policy priorities (Table 1). Type 1 and 2 datasets aimed to establish the population prevalence of COVID-19 and long COVID, the risk factors for developing long COVID, disease development and prognosis and the impacts of long COVID on individuals. The Belgian COVIMPACT, Dutch RIVM Long COVID, the German NAPKON, the Swiss Immunitas Research Program and the United Kingdom ONS-CIS were set up before long COVID had been formally characterized, and, as such, long COVID was not the primary research focus these surveys sought to address [33, 35]. In contrast, the Canadian CAHS was established specifically to determine the prevalence of persistent COVID-19 symptoms, contact with and use of the health system and the relationship with pre-existing conditions [31]. Like the Canadian CAHS, the Aotearoa New Zealand long COVID registry (type 5) was developed specifically to address research questions related to long COVID, to estimate the clinical, quality of life and economic impacts of long COVID in New Zealand, as well as continually monitoring health outcomes and inequities [71].

Like type 2 datasets, those in type 3 also examined the long-term impacts of having had a COVID-19 infection. Both the Italian Long-CoViD CCM and the Swedish SCIFI-PEARL examined the impact of COVID-19 infection on the health care system, while the Dutch NIVEL combined primary care dataset (PCD) and persistent complaints aimed to map out the so-called care pathways of individuals with long COVID [47, 48, 50, 52, 53, 72].

Type 4 datasets were specifically focused on individuals accessing specialist services to understand the care they received and (in some cases) inform the development of care models [48, 60, 61]. Finally, several datasets aimed to examine the relationship between COVID-19 and pre-existing conditions, including the Canadian CAHS, the United Kingdom CVD-COVID-UK and the German ABC19 study [31, 56, 58, 64].

Interviewees across countries discussed the need for longer-term patient monitoring and follow-up to improve the understanding of long COVID. Interviewees and workshop participants commented that the heterogeneity in purpose across datasets reflected the limited understanding of the condition and its impacts. They identified multiple data needs including to support understanding of the prevalence, risk factors and progression of the condition, patients’ journeys and use of health care, treatment effectiveness and the study of the personal impact on those living with long COVID.

Who is data collected from?

An overview of the population included, and the recruitment methods, is presented in Table 2.

Table 2 Overview of who is included in data collection

Defining long COVID

Interviewees identified the emergent nature of the condition and the lack of a clear definition as a key challenge. The definition used varied across datasets and was based on symptoms, diagnosis, clinical assessment or self-identification (see Table 2). Most commonly, long COVID was defined in line with the WHO’s definition, on the basis of individuals’ self-reported experience of symptoms beyond the acute stage of illness, although the time point post-infection varied. Datasets that drew on routine data relied on diagnostic codes. However, several interviewees discussed concerns about the reliability and validity of diagnostic codes, in particular, doubts about clinicians’ familiarity with the codes and whether they were using them routinely and applying them uniformly. Patient representatives also expressed concerns that diagnostic codes are not accurately capturing individuals with long COVID.

“Set definition is UO-8 code. But it was a new code, and nobody knew how to use it, so I think there’s been both overuse and underuse in different areas and in different parts of the health care system” (university researcher, Sweden, SCIFI-PEARL).

An explicit aim of the NIVEL-PCD in the Netherlands and the Swedish Covid-19 Investigation for Future Insights – a Population Epidemiology Approach using Register Linkage (SCIFI-PEARL) was to characterize long COVID using different definitions [49, 52, 73].

Population captured

The population captured also varied across datasets. Type 1 datasets recruited a representative sample of the general population irrespective of participants’ COVID-19 or long COVID status. Type 2 datasets sampled individuals who had tested positive for COVID-19; in general, individuals were recruited when they received their test results from their national or regional testing programme. In most contexts, research access to testing data had been made available under special legislative powers to tackle COVID-19. A Swiss interviewee commented that:

“[T]his setup is far from common in Switzerland, so it’s really rare to establish such a collaboration between the research group and a governmental body because of all the privacy issues, etcetera” (university researcher and physician, Switzerland, Zurich Coronavirus Cohort).

Type 3 datasets tracked all individuals with a positive COVID-19 test or a long COVID diagnosis in national or regional EHRs. As with type 2 datasets, the respondent from the Dutch NIVEL-PCD reported that access to testing data in medical records was only made available under special legislation, and access was granted for a period of about one year. The Italian Long COVID dataset intended to collect data from all individuals with a positive test result in regional administrative data, but, during acute phases of the pandemic, the Department of Prevention “encountered considerable difficulties in monitoring all the positives over time” (senior director, National Institute of Health).

The five type 4 datasets captured individuals who had sought health care specifically for long COVID. The examples from Canada, England and Italy captured individuals who had been referred to specialist services, while those in Belgium and Germany included individuals receiving care in primary care settings. Patient representatives voiced concerns that only examining individuals accessing services (types 3 and 4) would fail to capture a representative sample of individuals living with long COVID, given the challenges that individuals with long COVID have experienced in accessing care and receiving a diagnosis. Workshop participants also highlighted the inherent biases in collecting data only from individuals accessing specialist clinics and the impact this is likely to have on understanding of the epidemiology. They told us that data collected from individuals accessing specialist services are not likely to be generalizable beyond those settings. For example, the Canadian PC-ICCN interviewee commented on the inequitable access to the clinic, noting the “higher rates of hospitalization with COVID among non-Caucasian individuals, but those in post-recovery clinics are mainly Caucasian” (manager, Provincial Health Research Services Authority).

Patient representatives’ preferred method of recruitment was self-referral, as used by the Aotearoa New Zealand long COVID registry (type 5), which allows any individual self-identifying with long COVID to sign up online. However, some interviewees voiced concerns related to the representativeness of data where individuals self-refer. Only 8.4% of participants in the Aotearoa New Zealand registry are Māori (compared with 19.6% of the general population), which has prompted active recruitment in Māori and Pasifika communities to increase registration among these groups [74]. The lack of representation of minoritized ethnic groups and those from lower socioeconomic groups was also noted among the other dataset types. For example, the German NAPKON study received almost no non-German consent forms, suggesting that the non-German-speaking population is underrepresented, while the Dutch Long COVID study only captured “5 to 6 per cent of people with migration background whereas population-wide this proportion is very different” (interview with two non-university researchers, National Institute for Public Health).
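One simple way to express the representativeness concern is as a ratio of a group's share in the dataset to its share in the reference population. The sketch below is illustrative only and is not a method used by any of the datasets examined; the function name is hypothetical, and the only inputs are the two percentages reported above for Māori participants.

```python
def representation_ratio(share_in_dataset: float, share_in_population: float) -> float:
    """Return a group's share in a dataset divided by its share in the population.

    A value of 1.0 indicates proportional representation; values below 1.0
    indicate under-representation. Both shares must use the same unit
    (for example, both as percentages).
    """
    if share_in_population <= 0:
        raise ValueError("share_in_population must be positive")
    return share_in_dataset / share_in_population


# Percentages reported in the text for the Aotearoa New Zealand registry.
maori_ratio = representation_ratio(8.4, 19.6)
print(f"Representation ratio: {maori_ratio:.2f}")
```

On the figures reported, the ratio is roughly 0.43, meaning Māori participants appear in the registry at well under half the rate expected from their share of the general population.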

Control group

To characterize the condition and identify risk factors associated with developing long COVID, the interviewees highlighted the importance of including a comparison group given that many of the long COVID symptoms reported are nonspecific and prevalent in the general population. Only type 1, 2 and 3 datasets included a control group. The interviewee from New Zealand told us that whilst there is no control group “linkage with the IDI [Integrated Data Infrastructure, a large research database that holds data about life events, such as education, income, benefits, migration, justice and health] will allow for us to undertake some matching”. Control groups differed by data source and comprised individuals with and without self-reported symptoms, matched controls from those who had never tested positive for COVID-19 or who had other respiratory illnesses and random samples of the general population. Datasets using EHRs were also able to compare against pre-pandemic trends.

Which data were collected and how?

Survey versus use of routine health records

Type 1, 2 and 5 datasets collected primary data using self-reported surveys. In addition, the Dutch NIVEL-PCD (type 3) and the Canadian PC-ICCN (type 4) undertook patient surveys for a sub-set of individuals and type 4 datasets included surveys completed with a clinician. A reported advantage of surveys over EHRs was the flexibility to add questions and the ability to capture the fluctuating and episodic nature of the condition. These attributes were cited as particularly valuable in capturing long COVID considering that the condition and its impacts have been poorly described.

“[W]hen we set up the study, we did not know that long COVID is going to be a thing. We only wanted to track health status over time. [...] The studies did prove to be a very flexible tool in a way, and we were able to adapt questionnaires to emerging questions, add questions that may have been important” (university researcher and physician, Switzerland, Corona Immunitas Research Programme).

Another reported advantage of surveys over EHRs was the ability to collect patient-reported outcome measures (PROMs).

“When it comes to policy making these impact on your life questions, if you can work, what is the quality of your life, you don’t really get from a registry, an electronic health record, these are all very important” (interview with two university researchers, Netherlands, Long COVID study).

“I think the best data is coming from the patient side. Maybe this is, we should build it, shift the focus more on the patient side, so that the workload is a little bit more on the patient’s side and less on the physician side, and we try to simplify our CRF” (manager, Germany, ABC19).

Outcome measures

The most important outcome to assess, according to the patient representatives, was post-exertion malaise (PEM). This is one of 12 core outcomes that researchers have recommended should be evaluated in all research studies and in clinical care for people with long COVID [79,80,81]. PEM was not included in any of the datasets examined. Table 3 presents the included measures mapped against the PC-COS (an international consensus study developing a standardized set of outcomes for people with long COVID) and demonstrates the heterogeneity between datasets in terms of both the outcomes measured and the measurement instruments used. The most frequently collected outcomes relate to the impact of symptoms on daily life, health-related quality of life, respiratory functioning and mental health.

Table 3 Included studies mapped against the core outcome set for adults with post-COVID-19 condition [79, 81]

The lack of PROMs was a notable gap in those datasets that rely solely on EHRs (type 3 datasets, the Belgium Post-COVID care pathway and the NHS England Long COVID registry). The Dutch NIVEL-PCD (type 3) has captured PROMs by making use of software previously developed to automatically flag patients with a COVID-19 diagnosis (see Panel 1) [51, 75]. The dataset includes EHRs for everyone attending primary care and extensive PROMs on a subset of individuals who tested positive for COVID-19, including services not captured in the EHR such as mental health care received, self-care and data on quality of life, lifestyle and employment. At present, PROMs are not routinely collected across services by the English NHS Long COVID registry, although there is ongoing work to enable the reporting of EQ-5D-5L [68, 76]. Additionally, a digital platform completed on a smartphone web application has been developed to collect PROMs in 40 post-COVID clinics across the country, although the PROMs collected vary between clinics [77, 78].

Burden of data collection

While surveys were reported to offer flexibility, issues were raised in relation to the burden that they can place on both patients and clinicians. For example, the survey used by the Italian National surveillance system was developed from the WHO’s Global COVID-19 Clinical Platform Case Report Form (CRF) for Post-COVID Condition [82], but it was considered overly burdensome for clinicians to complete. The process of refining the survey was reported by the interviewee to be challenging given the lack of consensus on many aspects of long COVID.

“It was very debated what to collect because clinicians had very different opinions [...] some wanted a lot of data collected, others would focus on some core data. In the end, we collect three kinds of data. We collect symptoms as defined by the patients. Then we collect new diagnoses, which is what the physicians believe the patient had or have. And then we collect some, but not too many data on tests [...] we decided that, diving into laboratory tests was very complicated and we decided to simplify and not collect, for example, blood tests. We put more interest in collecting data on quality of life, on anxiety and depression and on diagnoses” (senior director, Italy, National surveillance).

The German NAPKON study (type 2) captured the largest volume of primary data of any of the datasets examined, with over 3000 data items collected [40], which the interviewee considered had contributed to dropout. Likewise, the Belgium post-COVID care pathway and the German ABC-19 interviewees reported that there was a lot of pressure on primary care and GPs during the pandemic, and it was challenging for them to engage with data collection on top of their already heavy workload.

“When the registry was ready to go the second wave came in, and nobody had time to think about research. And the next thing was, when the wave was gone, we started to vaccinate people, the same doctors that they are focused on COVID had to vaccinate all people. So we had a lot of obstacles to get our registry running” (manager, Germany, ABC-19).

Benefits of data linkage

Large-scale data systems that link data from different parts of the health and care system have been developed in several countries (type 3). For example, the CVD-COVID-UK and the Swedish SCIFI-PEARL datasets are highly comprehensive datasets that use individual identity numbers to link a diverse range of routine health data, including primary and secondary care data and existing disease registers. The Swedish SCIFI-PEARL links 20 different registers/EHRs to identify patients diagnosed with COVID-19 across Sweden. The database includes all individuals with a positive polymerase chain reaction (PCR) test identified in SmiNet (the national register of notifiable communicable diseases managed by the Public Health Agency of Sweden), as well as patients with relevant COVID-19 ICD-10 and procedure codes in routine health records and individuals whose cause of death was recorded as being due to COVID-19. Health care contacts for COVID-19 in primary care are only captured in two regions, which cover roughly 40% of the Swedish population [52]. The National Register of the Total Population from Statistics Sweden, a representative sample of the general population, has been used to construct different comparison groups as required for different statistical analyses. The CVD-COVID-UK/COVID-IMPACT Consortium includes 57 million patient records across England, 5.5 million across Scotland and 3.2 million across Wales. CVD-COVID-UK links primary care data, hospital episodes (covering inpatient, outpatient, emergency department and critical care episodes), registered deaths (including causes of death), COVID-19 laboratory tests, community dispensed medicines, specialist intensive care, cardiovascular audit, hospital electronic prescribing and COVID-19 vaccination data [54].
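The general idea of linking routine data sources through a shared individual identifier can be sketched as follows. This is a toy illustration and not the implementation used by SCIFI-PEARL or CVD-COVID-UK: all records are synthetic, and the `person_id` key, the field names and the `link_records` helper are hypothetical stand-ins for a national identity number and the real register structures.

```python
# Entirely synthetic toy records. "person_id" is a hypothetical stand-in
# for a national individual identity number; field names are illustrative.
positive_tests = [
    {"person_id": "A1", "test_date": "2021-01-10"},
    {"person_id": "B2", "test_date": "2021-02-03"},
]
primary_care = [
    {"person_id": "A1", "visit_date": "2021-04-15", "reason": "persistent symptoms"},
    {"person_id": "C3", "visit_date": "2021-05-01", "reason": "asthma review"},
]
hospital_episodes = [
    {"person_id": "B2", "admit_date": "2021-02-05"},
]


def link_records(cohort, **sources):
    """Attach records from each named source to cohort members with the same identifier.

    Assumes one index record per person in the cohort. Records belonging to
    people outside the cohort are ignored.
    """
    linked = {
        row["person_id"]: {"index": row, **{name: [] for name in sources}}
        for row in cohort
    }
    for name, rows in sources.items():
        for row in rows:
            person = linked.get(row["person_id"])
            if person is not None:  # keep only people in the cohort
                person[name].append(row)
    return linked


linked = link_records(
    positive_tests,
    primary_care=primary_care,
    hospital_episodes=hospital_episodes,
)

for person_id, record in linked.items():
    print(person_id, len(record["primary_care"]), len(record["hospital_episodes"]))
```

Starting from the cohort of people with a positive test mirrors the way type 3 datasets define their population; in practice, linkage at national scale also involves pseudonymization, governance approvals and handling of identifier errors, none of which is shown here.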

Access to data compiled in the CVD-COVID-UK/COVID-IMPACT dataset had been enabled under time-limited control of patient information (COPI) notices, issued by the Secretary of State for Health and Social Care to require organisations to share confidential patient information with approved users for COVID-19 research purposes without requiring patients’ consent. A challenge raised by the interviewee was that “until such time as there are equivalent datasets available under different provision notices, if we broaden our scope at the moment, we would lose access to certain datasets including primary care” (project manager/university researcher, HDR UK).

Likewise, some type 4 datasets have made use of linking to reduce the burden of primary data collection. For example, the Canadian PC-ICCN originally captured a larger number of diagnostic tests, such as computed tomography, echocardiogram, pulmonary function tests and 6-min walking tests, as well as extensive bloodwork, but like the Italian example, these were dropped following an evaluation of their utility for clinical decision-making to reduce the burden of testing on patients and the health care system. Instead, patients were invited to participate in the provincial biobank network, which collects blood samples that can be linked to the registry. A total of four other datasets collected biosamples (two type 1 datasets [Canadian CCAHS, UK ONS-CIS] and two type 2 datasets [German NAPKON, Zurich Coronavirus Cohort]). The Dutch NIVEL-PCD (type 3) had planned to collect biosamples, but because of resource constraints this was not possible.

“We only administered questionnaire data. Probably it would be very useful to also have some measurements like immunological parameters for stuff you could measure in the laboratory. We didn’t do that, also because of pragmatic reasons especially at the scale we are including people. There was really an infrastructure problem setting up a study during a pandemic where everyone is stretched to their limits, so this was the most pragmatic solution” (non-university researcher, Netherlands, NIVEL-PCD).

Length of patient follow-up versus data completeness

Length of follow-up varied across datasets (see Appendix 2). As with recruitment, interviewees noted higher loss to follow-up among some population groups. For example, the Belgian COVIMPACT study found that men and individuals with lower education levels were more likely to be lost to follow-up. Some datasets, such as the Aotearoa New Zealand long COVID registry, have requested participants’ consent for their data to be linked to Statistics NZ’s Integrated Data Infrastructure using their National Health Index number, to enable follow-up in routine data after surveys end. However, patient representatives expressed concerns that datasets that rely on routine health system data for long-term follow-up could miss the episodic nature of long COVID, potentially mischaracterizing individuals who are self-managing as having recovered.

“You’ve got to be really careful to ensure that people aren’t knocked off the register because nothing’s been logged for a matter of maybe weeks” (patient representative).

Interviewees from the Italian National Surveillance and the German ABC-19 studies echoed these concerns. Follow-up was reliant on patients returning to the clinic (only relevant to the GP follow-up component of the study), regardless of whether they were still experiencing symptoms; they voiced concerns that this might have led to higher loss to follow-up among individuals who recover, feel able to self-manage or no longer wish to attend the service.

Discussion

Long COVID is having a long-lasting impact on the health, wellbeing, daily activities and livelihoods of those experiencing prolonged symptoms, as well as their families [83]. Even on the basis of conservative estimates, the burden of illness represents a challenge to the health and care system and the economy [5, 84]. Many gaps in the understanding of long COVID exist, and interviewees in this current study identified multiple outstanding research questions [13]. To address these gaps requires long-term monitoring of individuals with long COVID using data from different sources. In this study we examined 17 examples of longitudinal long COVID health datasets established in nine countries. The examples identified ranged from population surveys which captured individuals’ symptoms and experience of long COVID to a national register of individuals self-identifying as having long COVID.

The long COVID datasets examined in this study highlight the heterogeneity of approaches taken between countries to data collection, using different definitions of long COVID, populations and controls, outcomes and outcome measures [4, 12, 85]. The heterogeneity between datasets likely reflects the emergent nature of long COVID, the diverse interests of researchers and funders, differences in aims and the data sources available, as well as the different time points at which they were developed. For example, several of the patient surveys were established before long COVID had been characterized and were adapted to include questions related to long COVID once the research need had emerged, while others were designed to specifically examine long COVID.

The patient surveys in type 1 and 2 datasets, and those associated with the Dutch NIVEL-PCD (type 3), had either ended or were about to end, leaving a gap in many data systems. Patient surveys were reported to be key to answering questions related to the prevalence, risk factors and evolution of the condition, as well as to examining the impacts of long COVID on individuals and their families. Patient surveys were seen to be particularly important for measuring outcomes not captured in routine data, in particular PROMs, and for characterizing the fluctuating and episodic nature of long COVID. The challenge is how to sustain patient surveys in the long term given the short-term nature of most research funding. A recent study of disease registers in the United Kingdom identified lack of long-term funding as the key threat to their sustainability [86]. For these disease registers, charities associated with the disease play a central role in providing some continuity of funding and running registers.

Patient representatives raised concerns, echoed by interviewees, that datasets that included only individuals with a positive COVID-19 test result recorded in routine data (type 2 and 3) or captured only individuals in EHRs (type 3) or only individuals accessing specialist services (type 4) will not provide a representative sample of individuals with long COVID. Not all individuals with long COVID had been tested for SARS-CoV-2, particularly at the start of the pandemic when many countries stopped or reduced testing in the community or, as highlighted by one of the Italian interviewees, when surveillance systems failed to record all test results as they became overwhelmed during peaks in infection. Further, in many countries, testing data were only made available to researchers for a limited time.

Patients in several countries have faced challenges in receiving a diagnosis and accessing care [87,88,89,90]; 36–42% of individuals included in the Aotearoa New Zealand long COVID registry had not received a clinical diagnosis [74]. The lack of a standardized definition of long COVID, the absence of diagnostic tests and low awareness of the existence of ICD codes for long COVID were reported to have resulted in heterogeneity in the use of ICD codes and underreporting of the condition. An analysis of OpenSAFELY data in England and ten United Kingdom longitudinal studies found the use of diagnostic codes to be low compared with survey data based on self-reported long COVID [91, 92]. Questions therefore remain about the reliance on ICD codes for long COVID, which is likely to limit what can be done using EHRs at present.

For conditions that are poorly characterized and/or where patients do not always receive a clinical diagnosis, it can be hard to accurately capture and track the affected population. Registries for conditions other than long COVID have faced similar challenges. For example, the lack of a consistent approach to diagnosis and misclassification of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) has led to the under-reporting of cases and insufficient research, medical care and treatment [93]. Similarly, for complex regional pain syndrome (CRPS), there is currently no clinically recognized diagnostic test. The CRPS Network has developed a broad definition and collects information that allows participants to be divided into further subgroups on the basis of different definitions of the condition [94]. Such an approach would enable researchers to continue to examine the accuracy of diagnostic coding, as with the Swedish SCIFI-PEARL study and the Dutch NIVEL-PCD (see Table 2), and it reflects the preference of long COVID patient representatives whom we consulted. They questioned the accuracy of coding in patients’ health records given the complex presentation of symptoms and the reported challenges experienced in receiving a diagnosis [9, 95].

Self-referral was the patient representatives’ preferred method of recruitment, to ensure data are captured from individuals accessing all types of health services, plus those who have not accessed any health services. Similarly, the Bevan Commission in Wales and the Inquiry into Long COVID and Repeated COVID Infections in Australia recommended that any future long-COVID registry should include a self-referred population along with routine data sources to adjust for recruitment bias and promote equitable access [28, 29, 96]. The Aotearoa New Zealand registry is the only example included in this study that is recruiting participants in this way. Researchers at Martin Luther University Halle-Wittenberg, TU Munich and Otto von Guericke University Magdeburg in Germany have established a long COVID register, which anyone over 18 years of age with self-reported symptoms can join [97]. Participants complete a questionnaire every 6 months and, at present, the funding is open-ended.

The pandemic was reported to have served as a catalyst for data sharing. Special legislation was introduced in several countries to provide time-limited access to data and to expedite consent and data linkage processes to support research on COVID-19. This was reported by several interviewees to have facilitated research that would not otherwise have been possible or would have taken much longer to set up. There is a need to review existing data access, including understanding the benefits and whether there have been any serious data breaches or patient harm as a result of providing so-called emergency access, as these arrangements raise questions around whether data should be made more readily available routinely for research purposes, including for long COVID.

Strengths and limitations

The strengths of this study are that we conducted thorough searches for long COVID datasets in high-income countries, conducted interviews with informants involved in the development or running of these datasets, held a focus group discussion with long COVID patient representatives on the emerging findings and held an online workshop with the study participants to test the draft recommendations.

Undertaking a thorough search of longitudinal studies within countries was challenging, as many of the datasets had not started producing published outputs that could be identified through bibliographic database searching. We relied heavily on snowball sampling, recruiting initial interviewees through the authors’ own networks and then via the recommendations of individuals approached to participate in the study. Given these challenges, we are likely to have missed examples from countries within the scope of this study. A recent mapping of long COVID surveillance systems across the EU identified many of the same datasets as this study, but additionally identified an example in Germany we did not pick up and the Spanish REGICOVID-AP Registry. In the time available for data collection, we were not able to obtain sufficient information to confidently characterize these two datasets for inclusion in the study [98]. Further, the search was deliberately limited to countries with a similar healthcare system to England and which experienced similar COVID waves prior to the vaccine rollout; New Zealand was added once we found out about its distinctive approach to long COVID data collection, even though its pandemic experience was very different from England’s. There may have been equally interesting examples from countries other than New Zealand that we were unaware of.

Not all of the datasets examined have published protocols or made their data collection tools publicly available. As a result, the amount of information available on each dataset varied. Interviews aimed to provide additional insight and fill the gaps in information, giving a more comprehensive overview of the datasets examined, but interviewees varied in the level of detail they could provide.

Recommendations

This study was commissioned to inform the further development of data systems for long COVID in England. The recommendations are informed by the findings of our interviews and workshop and, while they relate to the English context, they are likely to be generalizable to other settings.

First, there is a need to decide which questions the dataset should address. Determining the research priorities and the outcomes to be measured will require consultation with health authorities, patient organisations, clinicians and the research community, and it should be codesigned with people with lived experience of long COVID to ensure data are relevant and useful to those who will be using the data or may be affected by the findings. For example, the James Lind Alliance is an expert in helping patients, carers and clinicians work together to prioritize evidence needs through its Priority Setting Partnerships (https://www.jla.nihr.ac.uk/about-priority-setting-partnerships).

Given the definitional issues and the fact that many individuals with long COVID are not (routinely) in contact with health care services, datasets should take an inclusive approach to capture a broad population using different definitions such as self-report, positive COVID test, recorded diagnosis and others, with a variable to indicate the basis of the individual’s inclusion. Such an approach would provide a deeper understanding of how long COVID is being experienced and enable data users to select subpopulations to examine particular questions. As part of efforts to improve the validity and completeness of data being collected, it will be important to keep clinicians, especially GPs, up to date on the clinical diagnostic coding of long COVID.
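To illustrate what a variable indicating the basis of inclusion might look like in practice, the following is a minimal Python sketch. The category names, the `Participant` structure and the `select` helper are assumptions made for illustration only; they are not drawn from any of the datasets examined, and a real register would need a category list agreed with stakeholders.

```python
from dataclasses import dataclass
from enum import Flag, auto


class InclusionBasis(Flag):
    """Hypothetical codes for why an individual is included in the dataset."""
    SELF_REPORT = auto()
    POSITIVE_TEST = auto()
    RECORDED_DIAGNOSIS = auto()


@dataclass(frozen=True)
class Participant:
    participant_id: str
    basis: InclusionBasis  # one or more bases, combined with |


def select(participants, required):
    """Return participants whose inclusion basis contains every flag in `required`."""
    return [p for p in participants if (p.basis & required) == required]


# Invented example records.
cohort = [
    Participant("p1", InclusionBasis.SELF_REPORT),
    Participant("p2", InclusionBasis.POSITIVE_TEST | InclusionBasis.RECORDED_DIAGNOSIS),
    Participant("p3", InclusionBasis.SELF_REPORT | InclusionBasis.POSITIVE_TEST),
]

# Subpopulation with a positive test, whatever else applies.
tested = select(cohort, InclusionBasis.POSITIVE_TEST)

# Subpopulation included on self-report alone.
self_report_only = [p for p in cohort if p.basis == InclusionBasis.SELF_REPORT]
```

Recording the basis as a combinable flag, rather than a single category, lets one individual meet several definitions at once, so data users can compare subpopulations defined in different ways without excluding anyone at entry.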

Different objectives will require different measures to be collected. For example, if the goal of research is to assess the impact of long COVID on people’s lives, then collecting employment and income data will likely be extremely useful alongside quality-of-life measures; this could be done either by linking data to employment records (if possible) or by collecting measures directly from people with long COVID about how their work has been affected. Alternatively, if establishing the risk factors and comorbidities for long COVID were the primary objective, then longitudinal diagnostic, symptom and healthcare usage data would need to be extracted from patient records, augmented, if possible, from patient surveys.

Datasets should include a range of outcome measures, in particular PROMs, to shed light on quality of life and ability to function day-to-day. Since there is evidence that long COVID has significantly impacted individuals’ ability to work, data systems should aim to look beyond the clinical and health impacts to include labour market outcomes. There is a need to appraise the most effective and efficient way to collect the data, maintain the dataset and make data available to health care providers, researchers, patients and others with an interest.

There is a strong case to build a population-based cohort study to follow individuals over longer time periods than many of the datasets identified here. For any data collection to achieve its objectives, there is a need to ensure that it has adequate funding for an extended period [86]. As COVID-19 and long COVID become lower priorities for governments, alternative funders are likely to be needed. Exploring the possibility and suitability of greater collaboration with ME/CFS organisations, as has been done in Australia, which expanded its ME/CFS Registry to include individuals with long COVID in October 2023, could be one route to maximize what can be achieved [99, 100].

Conclusions

Long COVID affects the health and quality of life of millions of people and represents a significant long-term health challenge. There is a demand for data to support a greater understanding of the natural history of the condition, the long-term effects on individuals with long COVID and the effectiveness of the range of treatments and services to support those living with long COVID. Addressing these needs will require a mix of data sources that capture different populations with long COVID over the longer term. None of the countries examined have implemented a comprehensive dataset for long COVID. Many of the datasets examined have only been funded in the short term. As a result, there is no obvious model for England or other countries to follow, assuming there remains sufficient policy interest in establishing a long-term long COVID patient registry. Reliance on routine health care data alone would leave a gap in data important for understanding long COVID. The development of a longitudinal health dataset on long COVID should be based on careful consideration of the priority questions to be addressed, the views of stakeholders, including people with lived experience, and the sustainability of data collection and management.

Availability of data and materials

The data analysed during the current study are not publicly available, as they contain potentially sensitive participant information and consent was sought only to share anonymous quotations specifically for the purposes of this study. The data are available from the corresponding author on reasonable request.

Abbreviations

ABC-19:

Outpatient treatment of Covid-19 infections

BPI-SF:

Brief Pain Inventory-Short Form

C19-YRS:

COVID-19 Yorkshire Rehabilitation Scale

CCAHS:

Canadian COVID-19 Antibody and Health Survey

CFQ:

Cognitive Failures Questionnaire

CIS:

Checklist Individual Strength

CRF:

Clinical Platform Case Report Form

CRPS:

Complex regional pain syndrome

DASS21:

Depression Anxiety Stress Scale-21

EHRs:

Electronic health records

FACIT-F:

Functional Assessment of Chronic Illness Therapy – Fatigue

FAS:

Fatigue Assessment Scale

GAD-7:

Generalized Anxiety Disorder 7

ICD:

International Classification of Diseases

Long-CoViD CCM:

Analysis and Response Strategies for the Long-Term Effects of COVID-19 Infection

LCSS:

Long COVID Stigma Scale

ME/CFS:

Myalgic encephalomyelitis/chronic fatigue syndrome

mMRC:

Modified Medical Research Council Dyspnoea Scale

MoCA:

Montreal Cognitive Assessment

NAPKON:

German National Pandemic Cohort Network

NICE:

National Institute for Health and Care Excellence

ONS-CIS:

Office for National Statistics COVID-19 Infection survey

PCC:

Post-COVID Condition

PCD:

Combined primary care dataset

PC-COS:

Post-COVID Condition Core Outcomes

PC-ICCN:

Post COVID-19 Interdisciplinary Clinical Care Network

PEM:

Post-exertion malaise

PHQ-2:

Patient Health Questionnaire-2

PROMIS:

Patient-Reported Outcomes Measurement Information System

PROMs:

Patient-reported outcome measures

PCL-5:

PTSD Checklist for DSM-5

REiCOP:

Spanish Network for Research on Long COVID

SCIFI-PEARL:

Swedish Covid-19 Investigation for Future Insights – a Population Epidemiology Approach using Register Linkage

SEMG:

Spanish Society of General and Family Doctors

SF-12:

Short Form 12

UCSD-SOBQ:

University of California, San Diego Shortness of Breath Questionnaire

References

  1. Sudre CH, Murray B, Varsavsky T, Graham MS, Penfold RS, Bowyer RC, et al. Attributes and predictors of long COVID. Nat Med. 2021;27(4):626–31.

  2. Dennis A, Cuthbertson DJ, Wootton D, Crooks M, Gabbay M, Eichert N, et al. Multi-organ impairment and long COVID: a 1-year prospective, longitudinal cohort study. J R Soc Med. 2023;116(3):97–112.

  3. Greenhalgh T, Sivan M, Perlowski A, Nikolich J. Long COVID: a clinical update. Lancet. 2024;404(10453):707–24.

  4. Michelen M, Manoharan L, Elkheir N, Cheng V, Dagens A, Hastie C, et al. Characterising long COVID: a living systematic review. BMJ Glob Health. 2021;6(9): e005427.

  5. O’Mahoney LL, Routen A, Gillies C, Ekezie W, Welford A, Zhang A, et al. The prevalence and long-term health effects of long COVID among hospitalised and non-hospitalised populations: a systematic review and meta-analysis. EClinicalMedicine. 2023;55: 101762.

  6. Rajan S, Khunti K, Alwan N, Steves C, MacDermott N, Morsella A, et al. In the wake of the pandemic: preparing for long COVID. Health Systems and Policy Analysis Policy Brief 39 2021;WHO Regional Office for Europe: Copenhagen.

  7. WHO. Webpage “Post COVID-19 condition (Long COVID)” [https://www.who.int/europe/news-room/fact-sheets/item/post-covid-19-condition]. 2022.

  8. Wulf Hanson S, Abbafati C, Aerts JG, Al-Aly Z, Ashbaugh C, Ballouz T, et al. Estimated global proportions of individuals with persistent fatigue, cognitive, and respiratory symptom clusters following symptomatic COVID-19 in 2020 and 2021. JAMA. 2022;328(16):1604–15.

  9. Callard F, Perego E. How and why patients made long COVID. Soc Sci Med. 2021;268: 113426.

  10. Long COVID Support. Webpage “What is long COVID?” [https://www.longcovid.org/awareness/what-is-long-covid].

  11. Perego E, Callard F, Stras L, Melville-Jóhannesson B, Pope R, Alwan N. Why the patient-made term ‘long COVID’ is needed [version 1; peer review: 1 approved with reservations, 1 not approved]. Wellcome Open Res. 2020;5:224.

  12. Munblit D, O’Hara ME, Akrami A, Perego E, Olliaro P, Needham DM. Long COVID: aiming for a consensus. Lancet Respir Med. 2022;10(7):632–4.

  13. NICE, the Scottish Intercollegiate Guidelines Network (SIGN), the Royal College of General Practitioners (RCGP). COVID-19 rapid guideline: managing the long-term effects of COVID-19. NICE guideline [NG188]. 2020;online: https://www.nice.org.uk/guidance/ng188. Accessed on 19 Dec 2023.

  14. Malik P, Patel K, Pinto C, Jaiswal R, Tirupathi R, Pillai S, Patel U. Post-acute COVID-19 syndrome (PCS) and health-related quality of life (HRQoL) – a systematic review and meta-analysis. J Med Virol. 2022;94(1):253–62.

  15. Davis HE, Assaf GS, McCorkell L, Wei H, Low RJ, Re’em Y, et al. Characterizing long COVID in an international cohort: 7 months of symptoms and their impact. EClinicalMedicine. 2021;38: 101019.

  16. Ayoubkhani D, Zaccardi F, Pouwels KB, Walker AS, Houston D, Alwan NA, et al. Employment outcomes of people with long COVID symptoms: community-based cohort study. Eur J Public Health. 2024;3:5.

  17. Carlile O, Briggs A, Henderson AD, Butler-Cole BFC, Tazare J, Tomlinson LA, et al. Impact of long COVID on health-related quality-of-life: an OpenSAFELY population cohort study using patient-reported outcome measures (OpenPROMPT). Lancet Reg Health Eur. 2024;40: 100908.

  18. Gandjour A. Long COVID: Costs for the German economy and health care and pension system. BMC Health Serv Res. 2023;23(1):641.

  19. Jamoulle M, Kazeneza-Mugisha G, Zayane A. Follow-up of a cohort of patients with post-acute COVID-19 syndrome in a Belgian Family Practice. Viruses. 2022;14(9):2000.

  20. Koumpias AM, Schwartzman D, Fleming O. Long-haul COVID: healthcare utilization and medical expenditures 6 months post-diagnosis. BMC Health Serv Res. 2022;22(1):1010.

  21. Kwon J, Milne R, Rayner C, Rocha Lawrence R, Mullard J, Mir G, et al. Impact of long COVID on productivity and informal caregiving. Eur J Health Econ. 2023.

  22. Menges D, Ballouz T, Anagnostopoulos A, Aschmann HE, Domenghino A, Fehr JS, Puhan MA. Burden of post-COVID-19 syndrome and implications for healthcare service planning: a population-based cohort study. PLoS ONE. 2021;16(7): e0254523.

  23. Office for National Statistics (ONS). Self-reported long COVID and labour market outcomes, UK: 2022. ONS Statistical Bulletin. 2022;Online: https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/selfreportedlongcovidandlabourmarketoutcomesuk2022/selfreportedlongcovidandlabourmarketoutcomesuk2022 . Accessed on 17 Nov 2023.

  24. Walker S, Goodfellow H, Pookarnjanamorakot P, Murray E, Bindman J, Blandford A, et al. Impact of fatigue as the primary determinant of functional limitations among patients with post-COVID-19 syndrome: a cross-sectional observational study. BMJ Open. 2023;13(6): e069217.

  25. TUC & Long COVID Support. Workers’ experience of Long Covid. Joint report by the TUC and Long Covid Support. The Trades Union Congress (TUC) and the Long Covid Support Employment Group (LCSEG). 2023.

  26. Reuschke D, Houston D. The impact of long COVID on the UK workforce. Appl Econ Lett. 2023;30(18):2510–4.

  27. NHS England. National commissioning guidance for post COVID services. 2022;online: https://www.england.nhs.uk/wp-content/uploads/2022/07/C1670_National-commissioning-guidance-for-post-COVID-services_V3_July-2022-1.pdf. Accessed on 19 Dec 2023.

  28. Bevan Commission. Establishing a long COVID registry for Wales. 2021.

  29. Davies F, Finlay I, Howson H, Rich N. Recommendations for a voluntary long COVID registry. J R Soc Med. 2022;115(8):322–4.

  30. Sociedad Española de Médicos Generales y de Familia (SEMG). Webpage "The SEMG initiates new projects to generate and disseminate more knowledge about long COVID". Accessed on 31 July 2024. 2022.

  31. Statistics Canada. Webpage "Canadian COVID-19 Antibody and Health Survey (CCAHS). Detailed information for April 2022 to August 2022". https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=1480441. Accessed on 30 May 2024. 2023.

  32. Statistics Canada. Webpage “Canadian COVID-19 antibody and health survey – follow-up questionnaire, 2023” [https://www.statcan.gc.ca/en/survey/household/5339/follow-up] Accessed on 30 May 2024. 2023.

  33. Corona Immunitas. Webpage “E9 Zurich Coronavirus Vaccine Cohort (ZVAC)”. https://www.corona-immunitas.ch/en/program/studies/e9-zurich-coronavirus-vaccine-cohort-zvac-study/. Accessed on 30 May 2024.

  34. Office for National Statistics (ONS). Webpage “Coronavirus (COVID-19) Infection Survey: methods and further information” https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/methodologies/covid19infectionsurveypilotmethodsandfurtherinformation. Accessed on 30 May 2024. 2023.

  35. Office for National Statistics (ONS). Webpage “About the study” https://www.ons.gov.uk/surveys/informationforhouseholdsandindividuals/householdandindividualsurveys/covid19infectionsurveycis. Accessed on 30 May 2024.

  36. Walker S, Diamond I, Rourke E, Farrar J, Bell J, Newton J. Incidence of COVID-19 (SARS-CoV-2) infection and prevalence of immunity to COVID-19 (SARS-CoV-2) in the UK general population as assessed through repeated cross-sectional household surveys with additional serial sampling and longitudinal follow-up – an Office for National Statistics Survey. University of Oxford. 2022; online: https://www.ndm.ox.ac.uk/covid-19/covid-19-infection-survey/protocol-and-information-sheets. Accessed on 30 May 2024.

  37. Smith P, Proesmans K, Van Cauteren D, Demarest S, Drieskens S, De Pauw R, et al. Post COVID-19 condition and its physical, mental and social implications: protocol of a 2-year longitudinal cohort study in the Belgian adult population. Arch Public Health. 2022;80(1):151.

  38. Sciensano. Webpage “COVIMPACT - Long COVID and its physical, mental and social implications” https://www.sciensano.be/en/projects/long-covid-and-its-physical-mental-and-social-implications. Accessed on 30 May 2024.

  39. NAPKON. Webpage “NAPKON Nationales Pandemie Kohorten Netz”. https://napkon.de/.

  40. Schons M, Pilgram L, Reese J-P, Stecher M, Anton G, Appel KS, et al. The German National Pandemic Cohort Network (NAPKON): rationale, study design and baseline characteristics. Eur J Epidemiol. 2022;37(8):849–70.

  41. Vehreschild JJ. Intersectoral Platform (SÜP) of the National Pandemic Cohort Network (NAPKON) (SUEP-NAPKON). ClinicalTrialsgov ID NCT04768998. 2021;online: https://clinicaltrials.gov/study/NCT04768998. Accessed on 30 May 2024.

  42. Witzenrath M. Analysis of the Pathophysiology and Pathology of Coronavirus Disease 2019 (COVID-19), Including Chronic Morbidity. ClinicalTrialsgov ID NCT04747366. 2021;online: https://clinicaltrials.gov/study/NCT04747366. Accessed on 30 May 2024.

  43. Mutubuki EN, van der Maaden T, Leung KY, Wong A, Tulen AD, de Bruijn S, et al. Prevalence and determinants of persistent symptoms after infection with SARS-CoV-2: protocol for an observational cohort study (LongCOVID-study). BMJ Open. 2022;12(7): e062439.

  44. National Institute for Public Health and the Environment, Ministry of Health, Welfare and Sport. Webpage “Research on Long COVID” https://www.rivm.nl/en/coronavirus-covid-19/research/long-covid.

  45. Corona Immunitas. Webpage “E7 Zurich Coronavirus Cohort Study”. https://www.corona-immunitas.ch/en/program/studies/e7-zurich-coronavirus-cohort-study/. Accessed on 30 May 2024.

  46. Puhan M. Zurich Coronavirus Cohort: an observational study to determine long-term clinical outcomes and immune responses after coronavirus infection (COVID-19), assess the influence of virus genetics, and examine the spread of the coronavirus in the population of the Canton of Zurich, Switzerland. ISRCTNregistry 2020;ISRCTN14990068.

  47. Istituto Superiore di Sanità. Webpage “Long-CoViD CCM Project: Rationale” https://www.iss.it/en/long-covid-razionale. 2022.

  48. Istituto Superiore di Sanità. Webpage “Long-CoViD CCM Project: Aims”. https://www.iss.it/en/long-covid-obiettivi-progetto. 2023.

  49. Bosman L, Hoek R, van den Waarden W, Knottnerus B, Hek K, Berends M, Chu C, Homburg M, van Berger DL, Bij MS, van der olde Hartman T, Muris J, Peters L, Verheij R, Bos I. Post-COVID syndrome: how do we define it and how common is it? Utrecht: Nivel. 2022;online: https://www.nivel.nl/nl/publicatie/het-post-covid-syndroom-hoe-definieren-we-het-en-hoe-vaak-komt-het-voor. Accessed on 30 May 2024.

  50. NIVEL. Webpage “Post-COVID syndrome: persistent complaints after COVID-19 and patients’ ‘care pathways’: a mixed-method approach, 2021–2022”. https://www.nivel.nl/nl/project/post-covid-syndroom-aanhoudende-klachten-na-covid-19-en-de-zorgpaden-van-patienten-een. 2021.

  51. Veldkamp R, Hek K, van den Hoek R, Schackmann L, van Puijenbroek E, van Dijk L. Nivel Corona Cohort: a description of the cohort and methodology used for combining general practice electronic records with patient reported outcomes to study impact of a COVID-19 infection. PLoS ONE. 2023;18(8): e0288715.

  52. Nyberg F, Franzén S, Lindh M, Vanfleteren L, Hammar N, Wettermark B, et al. Swedish Covid-19 investigation for future insights – a population epidemiology approach using register linkage (SCIFI-PEARL). Clin Epidemiol. 2021;13:649–59.

  53. University of Gothenburg. Webpage “Swedish Covid-19 Investigation for Future Insights – a population epidemiology approach using register linkage (SCIFI-PEARL)”. https://www.gu.se/en/research/scifi-pearl. 2020.

  54. British Heart Foundation Data Science Centre. Webpage “CVD-COVID-UK/COVID-IMPACT”. https://bhfdatasciencecentre.org/areas/cvd-covid-uk-covid-impact/. Accessed on 30 May 2024.

  55. British Heart Foundation Data Science Centre. CVD-COVID-UK/COVID-IMPACT TRE Dataset Provisioning Dashboard: 13/07/23. 2023;online: https://bhfdatasciencecentre.org/wp-content/uploads/2023/07/230713-CVD-COVID-UK-COVID-IMPACT-TRE-Dataset-Provisioning-Dashboard-1.pdf. Accessed on 19 December 2023.

  56. HDRUK. Webpage “CVD-COVID-UK / COVID-IMPACT”. https://www.hdruk.ac.uk/projects/cvd-covid-uk-project/. Accessed on 30 May 2024.

  57. Health Data Research Innovation Gateway. Webpage “Trusted Research Environments for CVD-COVID-UK / COVID-IMPACT”. https://web.www.healthdatagateway.org/dataset/7e5f0247-f033-4f98-aed3-3d7422b9dc6d.

  58. Wood A, Denholm R, Hollings S, Cooper J, Ip S, Walker V, et al. Linked electronic health records for research on a nationwide cohort of more than 54 million people in England: data resource. BMJ. 2021;373: n826.

  59. Rijksinstituut voor ziekte- en invaliditeitsverzekering (RIZIV). Webpage “Long COVID: Reimbursement of care in the event of persistent COVID-19 symptoms”. https://www.riziv.fgov.be/nl/themas/kost-terugbetaling/ziekten/Paginas/post-covid-tegemoetkoming-kosten-eerstelijnszorg-aanhoudende-symptomen.aspx. 2023.

  60. Levin A, Malbeuf M, Hoens AM, Carlsten C, Ryerson CJ, Cau A, et al. Creating a provincial post COVID-19 interdisciplinary clinical care network as a learning health system during the pandemic: integrating clinical care and research. Learn Health Syst. 2023;7(1): e10316.

  61. Naik H, Malbeuf M, Shao S, Wong AW, Tran KC, Russell JA, et al. A learning health system for long covid care and research in British Columbia. NEJM Catalyst. 2023;4(9):CAT.23.0120.

  62. Providence Health Care. Post-COVID-19 Interdisciplinary Clinical Care Network (PC-ICCN) Referral. Online: http://www.phsa.ca/our-services-site/Documents/8565%20(BCHA.0186)%20Post-COVID-19%20ID%20Clinical%20Care%20Network%20(PC-ICCN)%20Referral%20Fillable.pdf. Accessed on 30 May 2024.

  63. Provincial Health Services Authority. Webpage “Post-COVID-19 Interdisciplinary Clinical Care Network” http://www.phsa.ca/our-services/programs-services/post-covid-19-care-network.

  64. IGES. Webpage “The IGES ABC-19 Registry” https://www.iges.com/abc19/.

  65. NHS England. Long COVID: the NHS plan for 2021/22. 2021;online: https://www.england.nhs.uk/coronavirus/wp-content/uploads/sites/52/2021/06/C1312-long-covid-plan-june-2021.pdf. Accessed on 19 December 2023.

  66. NHS England. Webpage “COVID-19 Post-Covid Assessment Service” https://www.england.nhs.uk/statistics/statistical-work-areas/covid-19-post-covid-assessment-service/. Accessed on 30 May 2024.

  67. NHS England. Commissioning guidance for Post COVID services for adults, children, and young people. 2023;online: https://www.england.nhs.uk/wp-content/uploads/2022/07/PRN00488i-commissioning-guidance-for-post-covid-services-v4.pdf. Accessed on 19 December 2023.

  68. NHS England. The NHS plan for improving long COVID services. 2022;online: https://www.england.nhs.uk/wp-content/uploads/2022/07/C1607_The-NHS-plan-for-improving-long-COVID-services_July-2022.pdf. Accessed on 19 December 2023.

  69. University of Auckland. Webpage “Researchers launch long-COVID registry” https://www.auckland.ac.nz/en/news/2023/07/12/researchers-launch-long-covid-registry.html. Accessed on 30 May 2024. 2023.

  70. University of Auckland. Webpage “Welcome to the Long COVID Registry Aotearoa New Zealand” https://www.lcregistry.auckland.ac.nz/. Accessed on 30 May 2024. 2023.

  71. Daalder M. Newsroom article “‘This isn’t a life’: the crushing burden of long COVID” https://newsroom.co.nz/2023/11/17/this-isnt-a-life-the-crushing-burden-of-long-covid/. 2023.

  72. Floridia M, Grassi T, Giuliano M, Tiple D, Pricci F, Villa M, et al. Characteristics of Long-COVID care centers in Italy: a national survey of 124 clinical sites. Front Public Health. 2022;10: 975527.

  73. Bygdell M, Leach S, Lundberg L, Gyll D, Martikainen J, Santosa A, et al. A comprehensive characterization of patients diagnosed with post-COVID-19 condition in Sweden 16 months after the introduction of the International Classification of Diseases Tenth Revision diagnosis code (U09.9): a population-based cohort study. Int J Infect Dis. 2023;126:104–13.

  74. Lorgelly P, Crossan J, Exeter D, McCullough A. The Burden of Long COVID in Aotearoa New Zealand: Establishing a Registry. Final Report. University of Auckland & Long Covid Support Aotearoa. 2024;online: https://lcregistry.auckland.ac.nz/files/2024/06/report_to_MoH.pdf.

  75. Hek K, Rolfes L, van Puijenbroek EP, Flinterman LE, Vorstenbosch S, van Dijk L, Verheij RA. Electronic health record-triggered research infrastructure combining real-world electronic health record data and patient-reported outcomes to detect benefits, risks, and impact of medication: development study. JMIR Med Inform. 2022;10(3): e33250.

  76. NHS Digital. Webpage “Community Services Data Set (CSDS)” https://digital.nhs.uk/data-and-information/data-collections-and-data-sets/data-sets/community-services-data-set. Accessed on 30 May 2024.

  77. Sivan M, Rocha Lawrence R, O’Brien P. Digital patient reported outcome measures platform for post-COVID-19 condition and other long-term conditions: user-centered development and technical description. JMIR Hum Factors. 2023;10: e48632.

  78. Sivan M, Greenwood D, Smith A, Rocha R, Osborne T, Goodwin M. A National Evaluation of Outcomes in long COVID Services using Digital PROM Data from the ELAROS Platform. 2023;online: https://locomotion.leeds.ac.uk/wp-content/uploads/sites/74/2023/10/National-Evaluation-of-LC-Service-Outcomes-using-ELAROS-Data-09-10-23.pdf. Accessed on 17 November 2023.

  79. Munblit D, Nicholson T, Akrami A, Apfelbacher C, Chen J, De Groote W, et al. A core outcome set for post-COVID-19 condition in adults for use in clinical practice and research: an international Delphi consensus study. Lancet Respir Med. 2022;10(7):715–24.

  80. Munblit D, Nicholson TR, Needham DM, Seylanova N, Parr C, Chen J, et al. Studying the post-COVID-19 condition: research challenges, strategies, and importance of Core Outcome Set development. BMC Med. 2022;20(1):50.

  81. PC-COS Project. Webpage “COS for adults. Results” https://www.pc-cos.org/pc-cos_results. Accessed on 30 May 2024.

  82. WHO. Global COVID-19 Clinical Platform Case Report Form (CRF) for Post COVID condition (Post COVID-19 CRF). 2021;online: https://www.who.int/publications/i/item/global-covid-19-clinical-platform-case-report-form-(crf)-for-post-covid-conditions-(post-covid-19-crf-). Accessed on 30 May 2024.

  83. Yang C, Tebbutt SJ. Long COVID: the next public health crisis is already on its way. Lancet Reg Health Eur. 2023;28: 100612.

  84. Woodrow M, Carey C, Ziauddeen N, Thomas R, Akrami A, Lutje V, et al. Systematic review of the prevalence of long COVID. Open Forum Infect Dis. 2023;10(7):233.

  85. Rando HM, Bennett TD, Byrd JB, Bramante C, Callahan TJ, Chute CG, et al. Challenges in defining long COVID: striking differences across literature, electronic health records, and patient-reported information. medRxiv. 2021.

  86. Stubbs E, Exley J, Wittenberg R, Mays N. How to establish and sustain a disease register: Insights from a qualitative study of six registers in the UK. 2024;PREPRINT (Version 1) available at Research Square.

  87. Au L, Capotescu C, Eyal G, Finestone G. Long COVID and medical gaslighting: dismissal, delayed diagnosis, and deferred treatment. SSM Qual Res Health. 2022;2: 100167.

  88. Baz SA, Fang C, Carpentieri JD, Sheard L. “I don’t know what to do or where to go”. Experiences of accessing healthcare support from the perspectives of people living with long COVID and healthcare professionals: a qualitative study in Bradford, UK. Health Expect. 2023;26(1):542–54.

  89. Ladds E, Rushforth A, Wieringa S, Taylor S, Rayner C, Husain L, Greenhalgh T. Persistent symptoms after Covid-19: qualitative study of 114 “long COVID” patients and draft quality principles for services. BMC Health Serv Res. 2020;20(1):1144.

  90. Kingstone T, Taylor AK, O’Donnell CA, Atherton H, Blane DN, Chew-Graham CA. Finding the ‘right’ GP: a qualitative study of the experiences of people with long-COVID. BJGP Open. 2020;4(5).

  91. Walker AJ, MacKenna B, Inglesby P, Tomlinson L, Rentsch CT, Curtis HJ, et al. Clinical coding of long COVID in English primary care: a federated analysis of 58 million patient records in situ using OpenSAFELY. Br J Gen Pract. 2021;71(712):e806–14.

  92. Thompson EJ, Williams DM, Walker AJ, Mitchell RE, Niedzwiedz CL, Yang TC, et al. Long COVID burden and risk factors in 10 UK longitudinal studies and electronic health records. Nat Commun. 2022;13(1):3528.

  93. Friedman KJ. Advances in ME/CFS: past, present, and future. Front Pediatr. 2019;7:131.

  94. Shenker N, Goebel A, Rockett M, Batchelor J, Jones GT, Parker R, et al. Establishing the characteristics for patients with chronic Complex Regional Pain Syndrome: the value of the CRPS-UK Registry. Br J Pain. 2015;9(2):122–8.

  95. Roth PH, Gadebusch-Bondio M. The contested meaning of “long COVID” – patients, doctors, and the politics of subjective evidence. Soc Sci Med. 2022;292: 114619.

  96. House of Representatives Standing Committee on Health, Aged Care and Sport. Sick and tired: casting a long shadow. Inquiry into long COVID and repeated COVID infections. Parliament of Australia: Canberra. 2023.

  97. Martin Luther University Halle-Wittenberg. Webpage “Long COVID Register” https://webszh.uk-halle.de/longcovid-register/. Accessed on 31 July 2024.

  98. van der Heide I, Lambert M, Hansen J. Mapping long COVID across the EU—definitions, guidelines and surveillance systems in EU Member States – Final report. Publications Office of the European Union. 2024.

  99. Emerge. Webpage “Press Release: Official Launch Of The AusME Registry And Biobank” https://www.emerge.org.au/news/pres/. Accessed on 29 July 2024. 2023.

  100. Emerge. Webpage “AusME Registry & Biobank” https://www.emerge.org.au/ausme/. Accessed on 29 July 2024.

Acknowledgements

We thank all the participants who so generously gave their time to take part in our study. In particular, we thank members of the long COVID patient organisations.

Funding

This study is funded by the NIHR Policy Research Programme through its core support to the Policy Innovation and Evaluation Research Unit (project no.: PR-PRU-1217-20602). The views expressed are those of the authors and are not necessarily those of the NIHR or the Department of Health and Social Care.

Author information

Contributions

All authors contributed to the concept and design of the study. J.E. and E.S. conducted the interviews. J.E. led the analysis and drafted the manuscript. All authors reviewed and edited the manuscript and approved the final version.

Corresponding author

Correspondence to Nicholas Mays.

Ethics declarations

Ethics approval and consent to participate

Ethical approval for this study was granted by the Research Ethics Committee at the London School of Hygiene and Tropical Medicine (ref. 28096). Participation in the study was entirely voluntary and participants were free to withdraw at any time without having to give a reason. We used a two-stage consent process. Potential participants were emailed an information sheet as part of the recruitment process and provided written informed consent for the interviews to be recorded and transcribed verbatim and for quotations to be used in any publications stemming from the study before taking part in an interview. Verbal consent was also sought at the start of the interview to confirm they were happy for the interview to be recorded and to answer any remaining questions.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Exley, J., Stubbs, E., Wittenberg, R. et al. An international comparison of longitudinal health data collected on long COVID in nine high income countries: a qualitative data analysis. Health Res Policy Sys 23, 37 (2025). https://doi.org/10.1186/s12961-025-01298-9

  • DOI: https://doi.org/10.1186/s12961-025-01298-9

Keywords