Development of a primary care research network focused on chronic disease: a feasibility study for both practices and research networks [version 1; peer review: 1 approved with reservations]

Background: High quality data should be a key resource for research and planning of healthcare, but low quality general practice data has been documented internationally. This study assessed the feasibility of collecting reliable chronic disease data in Irish general practice, using a program of training and feedback to improve the quality of coding for chronic conditions in practice information systems. Methods: Training in chronic disease coding and reporting was provided to a purposive sample of general practices in Ireland. From July to December 2020, practices reported the number of patients receiving free medical care, and the number of patients coded with each of eight chronic conditions: type 2 diabetes mellitus (T2DM), asthma, chronic obstructive pulmonary disease (COPD), ischaemic heart disease (IHD), heart failure (HF), atrial fibrillation (ATF), transient ischaemic attack (TIA) and cerebrovascular accident/stroke (CVA). Calculated prevalences were compared with national and international estimates. Results: We recruited and trained 16 practices with 65.5 full-time equivalent GPs and a study-eligible patient population of 36,327. There was a large degree of variation across practices for all conditions. For example, in July, reported prevalence of IHD ranged from 0.3% to 10.2% (a 34-fold difference), and reported prevalence of HF ranged from 0.2% to 4.0% (a 20-fold difference). No single practice had high or low prevalence rates across all conditions.


Introduction
General practice in Ireland delivers an estimated 29.1 million consultations a year (Collins & Homeniuk, 2021). Each citizen receives, on average, over four consultations each year (Healthy Ireland, 2019), and more than 95% of these consultations are recorded on electronic medical record systems (O'Kelly et al., 2016). Irish general practitioners (GPs) work in a mixed public and private system, with a majority of consultations delivered to those with public healthcare entitlements (Behan et al., 2013). Those entitled to free medical care at the point of delivery are known as 'eligible patients', and make up approximately 37% of the Irish population (Central Statistics Office, 2016). For the remainder, a fee is paid on a per consultation basis.
Until recently, there was little incentive in Ireland to code a clinical condition. For many years, in a context of high GP workloads (Crosbie et al., 2020), coding was regarded as another time-consuming task, with no incentivisation or resourcing for electronic medical record-keeping. Where coding was done, significant concerns regarding comprehensiveness and accuracy were reported (Collins & Janssens, 2012). Issues with general practice coding are not uniquely Irish, having been reported in countries as diverse as Norway (Botsis et al., 2010) and Australia (Pearce et al., 2019; Springate et al., 2014).
In 2015, the Irish Health Service Executive (HSE) introduced the first national chronic disease care pathway for type 2 diabetes (T2DM) (Health Service Executive, 2015), followed in 2020 by the Chronic Disease Management (CDM) programme (Health Service Executive, 2020a). These schemes cover the portion of the population eligible for free GP care. The CDM has resourced GPs to code and manage eight chronic conditions in eligible patients, namely T2DM, asthma, chronic obstructive pulmonary disease (COPD), ischaemic heart disease (IHD), heart failure (HF), atrial fibrillation (ATF), transient ischaemic attack (TIA), and cerebrovascular accident (CVA). The CDM provides an incentive to GPs to code their patients living with these conditions, thereby also establishing a coding ethos in everyday practice. The CDM also incorporates a function whereby, if a patient is identified in the CDM software as having one of these conditions, the condition is back-coded into the patient's record if not already recorded there.
The limitations of Irish general practice coding have affected the ability to conduct trials. For example, in the DECIDE trial (Murphy et al., 2018), it was necessary to fund the development of a software tool to identify individuals with poorly controlled T2DM. Even so, difficulties arose in many practices due to lack of standardisation of data entry for chronic disease (Collins & Janssens, 2012). Similarly, in the 2015 SIMPLe trial on the management of urinary tract infections in general practice, 3,314 patients were enrolled and followed up over a nine-month period, but individual consultations were found to be coded only 40% of the time (Galvin et al., 2015). Again, these issues are not unique to Irish general practice, and a lack of tools to explore a narrative record that contains no coded or structured data has been documented internationally (de Lusignan, 2005; de Lusignan & van Weel, 2006).

The current study
High quality primary care data should be a key resource for research and for effective planning of healthcare in Ireland and elsewhere (Mant, 2006; Pearce et al., 2019). Additionally, the availability of a cohort of practices with wide geographic distribution, varied levels of staffing, experience of reliably coding data, and affiliation with a research centre enhances the ability to run clinical trials (Fogel, 2018).
In 2015, the Health Research Board (HRB) established the Primary Care Clinical Trials Network Ireland (HRB Primary Care CTNI), which aims to improve individual patient health and health care by conducting high quality, internationally recognised, randomised trials which address important and common problems. The HRB Primary Care CTNI designed the current study to assess the feasibility of collecting reliable chronic disease data and to establish the effectiveness of a program of training and feedback to improve the quality of coding for chronic conditions in general practice information systems. Specific objectives were to assess:
1. How feasible is it to deliver this programme and collect the data?
2. What is the impact of this programme?
3. How reliable is the data generated by this activity?

Study design
This was a longitudinal study of reporting and feedback of coding quality in a purposively selected sample of Irish general practices.

Ethics statement
Ethical approval was secured from the research ethics committee of the Irish College of General Practitioners (ICGP) in December 2019 (20/12/2019). Formal written consent to participation was obtained, with signatures from lead GPs and practice data protection officers. Consent to publication of the data was later obtained via email on the 26th of May 2021. Patient consent was not required for this study, as only aggregated practice-level data was collected.

Irish GP software systems and morbidity coding
This study focused on the two most commonly used GP software systems in Ireland, 'Socrates' and 'Health One', both owned by Clanwilliam Health. 'Health One' has a very active user group which is independent of Clanwilliam Health; its Technical Director is Dr Rory O'Driscoll (a practising GP and a co-author on this paper). In both software systems, morbidity coding may utilise the International Classification of Primary Care (ICPC) (World Organization of Family Doctors (WONCA), 2020), or the International Classification of Diseases (10th edition, ICD-10) (World Health Organisation, 2020). 'Health One' also has a system of coded phrases in its dictionary of terms which are linked with ICD-10 and ICPC.
Practices were encouraged to continue morbidity coding in whatever system they were using at the time of enrolment.

GP practice selection
The selection criterion for practices was current use of clinical coding. In November and December 2019, the study team identified and sent invitation letters to 17 practices across Ireland, including a mixture of small (1-2 GPs), medium (3 GPs), and large (4+ GPs) practices. On this basis, the study team deemed the sample to be representative of the national profile of GP practices.

Baseline practice visits
All 17 contacted practices responded positively to the invitation letters and, in January and February 2020, were visited by research team members ROC and PJM. At these visits, relevant practice personnel (usually at least one GP and a practice administrator) were provided with additional information about the study. Formal written consent to participation was obtained, with signatures from lead GPs and practice data protection officers. Practices received between €2,500 and €5,000 for participation, depending on practice size. Consent to publication of the data was later obtained by email on the 26th of May 2021.
Training on morbidity coding was provided, tailored to the practice's software system, 'Socrates' or 'Health One' (see Files 1 and 2 in Extended data). Face-to-face training was also provided on how to complete monthly study reports. Monthly study reports captured only aggregated practice-level data, i.e. the total number of 'eligible patients' on the practice list, and the total number of these patients coded for each of the eight chronic conditions included in the CDM programme (see File 3 in Extended data for the report template). As above, 'eligible patients' were those entitled to free medical care at the point of delivery. Private (fee-paying) patients could not be included due to the inaccuracy of registration in Ireland.

Monthly reporting and feedback
Each calendar month (July to December 2020), practice administrators completed the study reports and submitted them to the study team via a secure clinical email service, HealthMail. Of note, the process of completing study reports differed significantly according to software system. 'Socrates' required detailed manual searches, manual calculations, and manual data entry (see File 2 in Extended data). 'Health One' allowed for a much more automated process (see File 1 in Extended data), which over the course of the study became almost fully automated with the help of the user group Technical Director (see File 4 in Extended data). For the duration of the study, bespoke training and support in report completion were provided to practices by author ROC, where requested by a practice or where identified as necessary by ROC.
Monthly feedback reports were sent to practices, allowing anonymised comparison of condition prevalences across practices (see Figure 1, which provides an example for one of the practices for T2DM). Condition prevalences were calculated, using Microsoft Excel, as the percentage of eligible patients coded with each condition. Due to the paucity of accurate published chronic disease prevalence data for Ireland, comparison to national benchmark data was restricted to the final analysis.
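The prevalence calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the study used Microsoft Excel, and the function name, the report layout, and the patient counts below are hypothetical values chosen for the example, not study data.

```python
def prevalence_percent(coded_patients: int, eligible_patients: int) -> float:
    """Percentage of eligible patients coded with a given condition."""
    if eligible_patients <= 0:
        raise ValueError("eligible_patients must be positive")
    if not 0 <= coded_patients <= eligible_patients:
        raise ValueError("coded_patients must lie between 0 and eligible_patients")
    return 100.0 * coded_patients / eligible_patients


# Hypothetical monthly report for one practice (illustrative counts only).
report = {"eligible_patients": 2000, "T2DM": 150, "asthma": 120, "COPD": 60}

# One prevalence per condition, rounded to one decimal place as in the text.
prevalences = {
    condition: round(prevalence_percent(count, report["eligible_patients"]), 1)
    for condition, count in report.items()
    if condition != "eligible_patients"
}
```

The denominator here is the practice's list of eligible patients only, mirroring the aggregated practice-level figures captured in the monthly reports.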

Practice characteristics
A total of 17 practices were recruited to the study; one practice withdrew due to administrative difficulties, leaving 16 reporting practices. Five practices were characterised by the research team as rural, 10 as urban, and one as mixed. Five practices were small (1-2 GPs), three were medium (3 GPs), and eight were large (4+ GPs). This is broadly similar to the national Irish profile (O'Kelly et al., 2016). The 16 practices

Reporting and feedback process
Initially, approximately half of the practices needed support to complete the monthly reports, mainly via telephone. Within three months, such support was rarely required, and most practices submitted reports on request.
Some practices using 'Socrates' had unexplained variations in the prevalence of conditions reported. For example, in one calendar month Practice G reported a rate of heart failure that was twice the rate reported in the previous months, and Practice L reported a rate of heart failure that was half the rate reported in the previous months. Considered more likely to represent error rather than true variation in prevalence, these figures were brought to the attention of the practices, confirmed as erroneous, and corrected. Of note, these errors were only apparent for practices using the more manual 'Socrates' process, and not for practices using the automated 'Health One' process.
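A check of this kind could, in principle, be automated. The following Python sketch flags any month in which a practice's reported prevalence is at least double, or at most half, that of the previous month. It is an assumption-laden illustration rather than the procedure used in the study: the function name, the threshold of 2.0, and the example series are all hypothetical.

```python
def flag_suspect_months(monthly_prevalence, ratio_threshold=2.0):
    """Return indices of months whose value changed by at least
    `ratio_threshold`-fold (up or down) relative to the previous month."""
    flagged = []
    for i in range(1, len(monthly_prevalence)):
        previous, current = monthly_prevalence[i - 1], monthly_prevalence[i]
        if previous <= 0 or current <= 0:
            # A ratio is undefined at zero; flag any change to or from zero.
            if previous != current:
                flagged.append(i)
            continue
        ratio = current / previous
        if ratio >= ratio_threshold or ratio <= 1 / ratio_threshold:
            flagged.append(i)
    return flagged


# Hypothetical heart failure prevalences (%) for one practice, July to December.
series = [1.0, 1.0, 2.0, 1.0, 1.1, 1.1]
suspect = flag_suspect_months(series)
```

In this example both the spike (index 2) and the return to the usual level (index 3) are flagged, so a flagged month is a prompt for manual review with the practice, not proof of error.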
Practice size was not an apparent determinant of the level of engagement with the project, or the degree to which errors occurred in the monthly figures submitted.
Monthly feedback reports were manually generated by author ROC using Microsoft Excel (see Figure 1) and emailed to practices. However, when it became apparent that not all practices had colour printers and so were unable to share their monthly feedback reports with their practice staff, author ROC also printed and posted copies of reports to practices.

Reported prevalence rates
Table 1 shows the minimum, maximum, and average prevalence of eight chronic diseases across all practices for the first monthly report (July 2020) and the final monthly report (December 2020), and the lowest change and the highest change in any practice between July and December. It also shows the average change in the prevalence of each condition between July and December. The full dataset of practice reporting is available in Underlying data.
There was a large degree of variation across practices in reported prevalences for all conditions. In the July report, this variation was lowest for T2DM and ATF (3.6-fold and 5.4-fold differences respectively) and highest for IHD and heart failure (34.0-fold and 20.0-fold respectively).
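The fold differences quoted here are the ratio of the highest to the lowest reported practice prevalence for a condition. A minimal Python sketch, using the July ranges reported in this paper for IHD (0.3% to 10.2%) and heart failure (0.2% to 4.0%), is given below; the function name is an assumption made for illustration.

```python
def fold_difference(lowest: float, highest: float) -> float:
    """Ratio of the highest to the lowest reported prevalence."""
    if lowest <= 0:
        raise ValueError("lowest prevalence must be positive to form a ratio")
    return highest / lowest


# July 2020 ranges across practices, as reported in this paper (percent).
ihd_fold = round(fold_difference(0.3, 10.2), 1)
hf_fold = round(fold_difference(0.2, 4.0), 1)
```

Because the ratio depends only on the two most extreme practices, a single outlying report can dominate it, which is one reason the erroneous monthly figures noted earlier needed correcting before comparison.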
No single practice had high or low prevalence rates across all conditions. Even those with the highest prevalence in one condition had amongst the lowest in others. For example, Practice N reported a prevalence of 16.6% for asthma and a prevalence of 0.5% for TIA in July. The changes over time across all practices were minimal, averaging between 0.1 and 0.3% for all conditions reported. By the final report in December, a large degree of variation across practices in reported prevalences for all conditions still remained.

Comparison to published Irish and English prevalence rates
Currently, Ireland does not have a chronic disease register, and therefore no gold standard for chronic disease prevalence. In consequence, a comparison was made between the average prevalences in the current study (as reported in December 2020), and the best available published estimates for Ireland (Table 2 and Table 2A). Sources for the published estimates were: heart failure (Jennings, 2014); IHD, stroke, and ATF (Wilkins et al., 2017); and Whicher et al. (2020). No published estimate was available for TIA.
For the Irish comparative data, the current study's average prevalences for heart failure and asthma showed the least discrepancy from these published estimates, a 1.1-fold and 1.3-fold discrepancy respectively. The current study's average prevalences for COPD and stroke showed the greatest discrepancy, at 4.1-fold and 3.2-fold respectively. Across all conditions, the current study's prevalences were higher than the previously published estimates.
The current study's prevalences tended to be closer to those published for England, rather than those previously published for Ireland. Again however, across all conditions the current prevalences were higher than the English estimates, if less markedly so.

Discussion
To our knowledge, this is the first such data quality study to be undertaken in Ireland where a general practice information programme of training, regular data gathering, and feedback to each practice was implemented.

Feasibility of programme delivery and data collection
Collecting data remotely from practices was feasible in this study, as such activity was adequately resourced by direct funding. Practices also received reimbursement from the HSE by carrying out appropriate monitoring of "eligible" patients with the named chronic diseases in accordance with the Irish CDM programme (Health Service Executive, 2020a). Therefore, there was a financial incentive for practices to code patients as having these eight conditions, which increased the feasibility of the current study.
Over the six months or so of the project's duration, only one single-handed practice withdrew. This was despite an unprecedented strain on all practices due to the COVID-19 pandemic.
The process was greatly assisted by having a nominated person in the practice who was responsible for carrying out the search in protected time. This, and calculating the number of suitable patients available in the practice, have been shown to be essential elements of establishing a feasible research study (Arain et al., 2010). This is important because recruitment of patients to clinical trials has been shown to be a problem (Treweek et al., 2018; Walters et al., 2017). The most effective recruitment methods in primary care research require practitioner involvement (Ngune et al., 2012). Study site selection is also an important element of the clinical trial process, as is having experienced and involved staff and an enthusiastic investigator (Fogel, 2018). During this project, practice staff gained experience of finding suitable patients and making regular data returns, while developing a relationship with the investigating team. This will enhance the HRB Primary Care CTNI's ability to find suitable patients for future clinical trials.
Monthly feedback reports provided to the practices enabled them to review their performance against that of their peers. This potentially enables data driven improvement, an essential element of audit, which leads to improved quality (Foy et al., 2020). A qualitative study of the participating practices is planned with a view to helping design future interventions to improve data quality.
Automation of the search process in 'Health One' led to considerable advantages. Monthly reporting could be completed very quickly (less than five minutes), and without error. An automated process also meant less reliance on individual staff members who had received training, thereby avoiding delays when those staff members were unavailable due to holiday or sick leave. Conversely, the manual process in 'Socrates' was time consuming (~ 30 minutes), error-prone, and dependent on the availability of (usually) a single trained individual. Long-term feasibility of programmes such as ours is therefore dependent on automation.

Data reliability
Concerns about the quality of coding in general practice have been expressed in Ireland (Collins & Janssens, 2012), and internationally (Barnett et al., 2012; Botsis et al., 2010; Pearce et al., 2019; Springate et al., 2014). While the current study did not formally assess coding quality, it is noteworthy that the study practices showed such enormous variation in reported prevalences for the eight chronic conditions.
One may speculate as to the underlying reason for this variation. For IHD and heart failure, some practices may have coded patients based on clinical suspicion only (e.g. dyspnoea or chest pain) without necessarily having definitive objective testing, while other practices may only have coded where the diagnosis had been proven by use of echocardiography and/or use of biochemical measures such as brain natriuretic peptide (BNP). Some variation could also be explained by the tendency of certain conditions such as asthma to remit spontaneously as children become adults (de Nijs et al., 2013; Trivedi & Denton, 2019), and a consequent failure by some GPs to un-code people whose disease had gone into remission. For some conditions, such as COPD, variation may signify patchy adherence to currently accepted diagnostic criteria (O'Halloran et al., 2020).
Variation in coding quality across practices is likely a cause of the discrepancy between our reported prevalences, and the best data for Ireland and England (Table 2). It should be noted however that English data represent the entire population, while ours only include those eligible for free care. Since this latter group is likely to be older, poorer, and sicker than those who are not eligible (Health Service Executive, 2021), some discrepancy between English figures and ours is to be expected. There is also a longer tradition of coding in England under the Quality and Outcomes Framework (QOF) (NHS Digital, 2021), which goes back to 2004.
In the round, in attempting to account for the variation in reported prevalences, it is difficult to avoid the conclusion that there was an underlying variation in coding quality. Validation of coding in Irish general practice is needed, using proven methods. For example, appointing a Clinical Data Manager to a large medical centre to implement a data management strategy which involves coding consultations, tailored support and ongoing individualised feedback to clinicians has been shown to be effective (Sweeney & Perry, 2014).

Limitations
Prevalences reported by practices were not confirmed by audit. There was no assessment of practice characteristics that would have an effect on the prevalence of chronic conditions including age, gender and socioeconomic deprivation. In Ireland over 60% of the population are not eligible for free GP care (Central Statistics Office, 2016). This means that they are not registered with any single GP and can move from practice to practice for their medical care. Studying the care they receive and resultant outcomes is therefore challenging. As a result, we opted not to include this group in our current study. This is not ideal. Addressing this limitation in future studies will require the introduction of a unique patient identifier, as has been recognised nationally for some time (Department of Health and Children, 2004; Health Information and Quality Authority, 2010). The selection criterion for GP practices was current use of clinical coding. The study sample therefore may not be representative of Irish GPs. This study was conducted during the COVID-19 pandemic. It is possible that this had an effect on the coding prevalence as patients were less likely to attend in person. Also, preoccupation with COVID-19 symptoms and less emphasis on routine chronic disease management may have reduced the likelihood and accuracy of coding.

Conclusion
Although hampered by the COVID-19 pandemic, it was feasible to implement this programme of training and feedback to report on chronic disease data recorded in general practice.
Coding quality in Irish general practice is highly varied, and improvement would require a greater degree of intervention, including audit.
This project contains the following extended data:
• Extended Data File 1 - Health One Instructions.docx (Health One training on morbidity coding).
• Extended Data File 4 - Health One Automation.wmv (instruction video of Health One automated process).
Data are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).