Keywords
Systematic review, Primary care, Electronic health records, Medications, Prescribing, Lung cancer, Early detection of cancer, Cancer screening
Lung cancer is the second most common cancer and the leading cause of cancer death worldwide. A significant reason for its high mortality is delayed diagnosis, with lung cancer typically diagnosed at an advanced stage. Previous research has shown that prescribing rates of certain medications increase in the 24 months preceding a cancer diagnosis. This suggests a potential opportunity for early diagnosis of lung cancer by identifying high-risk patients based on the prescribing of medications associated with a subsequent lung cancer diagnosis. Our aim is to identify all prescribing events associated with an increased incidence of primary lung cancer in the subsequent 24 months.
We will conduct a systematic review, and, where possible, a meta-analysis, reporting the findings in accordance with the PRISMA reporting guideline. All peer-reviewed studies in the English language that quantitatively describe an association between prescribing data and lung cancer diagnosis using a control group will be eligible. Details regarding the prescribing rate in the lung cancer group versus the control group will be extracted along with study characteristics. Risk of bias will be assessed using the ROBINS-E tool. For each drug studied, we will report prescribing rate ratios (PRRs) with 95% confidence intervals (CIs). A meta-analysis using a pooled estimate of PRRs, from either fixed-effect or random-effects models, will be performed if possible.
This systematic review will summarise the evidence on drugs that, when prescribed, suggest the possibility of an as-yet-undiagnosed lung cancer. This research has the potential to impact clinical practice by informing targeted screening strategies and refining early detection protocols for this harmful disease. If achieved, this could increase the numbers of lung cancers diagnosed at an earlier stage, with consequent improvements to patients in terms of survival, treatment tolerability and quality of life.
With 2.2 million new cases and 1.8 million deaths globally per year, lung cancer is the second most common cancer and the leading cause of cancer death worldwide1. A significant reason for its high mortality is delayed diagnosis, with lung cancer commonly diagnosed at an advanced stage2,3. This results in poorer prognosis—the 1-year survival rate for lung cancer in Britain is 85% for Stage I disease versus just 25% for Stage IV4.
There are multiple reasons why lung cancer may be diagnosed at a late stage, including the fact that early disease may be asymptomatic, so by the time symptoms do arise disease may be advanced5,6. Patient delay may occur if the patient is unaware of the potential significance of their symptoms or misinterprets them. Additionally, symptoms possibly indicative of lung cancer are common in primary care and overlap with those of common benign and self-limiting illnesses, presenting a diagnostic challenge for primary care physicians2,6.
One promising approach to achieving earlier diagnosis involves analysing electronic patient records for subtle variations in health utilisation data that signal the presence of an as-yet-undiagnosed lung cancer7. Existing research demonstrates the potential of personalised risk prediction models as decision aids for general practitioners (GPs), enabling them to identify patients with increased risk of cancer, thereby potentially facilitating earlier diagnoses7–9.
Previous studies have reported that prescribing rates of specific medications increase in the 24 months preceding a cancer diagnosis8,10–13. For example, prescribing rates of steroids increase before a diagnosis of Hodgkin’s Lymphoma14, laxatives before colorectal cancer15, and proton pump inhibitors (PPIs) before gastric cancer16. The obvious interpretation of these results is that patients with cancer symptoms are sometimes treated for common, non-malignant conditions, which also cause the same symptoms. In theory, if these patients could be identified, the primary care interval17, and therefore diagnostic delay, could be reduced.
However, it is important that we do not limit our consideration to pre-diagnostic prescribing patterns that are explained by overlapping clinical features between a cancer and a non-malignant condition. Rather, any prescribing event that is reliably associated with an increased probability of a subsequent cancer diagnosis, even where this association does not appeal to clinical intuition, can theoretically be used to identify high-risk individuals in need of further investigation18. This could enable identification of at-risk patients and facilitate earlier cancer diagnoses, ultimately improving patient outcomes.
The aim of this study is to systematically review the existing literature concerning primary care prescribing events that may signal the presence of an as-yet-undiagnosed lung cancer. For clarity, we are not trying to identify drugs or drug classes that increase or decrease the risk of developing lung cancer in the future.
This aim will be realised by the following specific objectives:
(1) to identify any studies that explore the association between prescribing events and the incidence of lung cancer in the subsequent 24 months;
(2) to report or calculate, for each prescribing event and time window examined, the Prescribing Rate (PR) for both the cancer and control groups, the PR ratio (PRR) and Odds Ratio (OR) of incident lung cancer;
(3) to calculate the Positive Predictive Value (PPV) and Relative Risk (RR) of each prescribing event and time window examined in cohort studies, where the prevalence can be empirically determined;
(4) to estimate the implied PPV and implied RR in studies where this cannot be directly determined from the reported data, by imputing the missing data from similar studies.
We will conduct a systematic review, and, where possible, a meta-analysis, reporting the findings in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines19. The protocol will be prospectively registered with PROSPERO after peer review comments have been addressed20. This protocol is reported in line with the PRISMA-P guidelines21.
We will include all published, peer-reviewed studies in the English language that report either:
[1] prescribing rates (i.e., the rate of a prescribing event) for a “cancer group” and a control group during specified time windows in the 24 months prior to a primary lung cancer diagnosis (or control event in the case of the control group); or
[2] lung cancer incidence in a “prescription group” and a control group for specified time windows not extending beyond 24 months of the prescribing event (or control event, in the case of the control group).
Population. This review will consider all studies that involve human subjects represented in primary care prescribing datasets.
Exposure. This review will focus on the prescribing of drugs in the 24 months preceding a primary lung cancer diagnosis.
Outcomes. The primary outcome of interest is the prescribing rate ratio (PRR) of a drug or drug class during specified time windows prior to lung cancer diagnosis—this must be reported or deducible by calculation or imputation.
Study types. This review will be limited to observational studies. Since cross-sectional studies examine only a single timepoint, they cannot capture pre-diagnostic prescribing, leaving only case-control and cohort studies. Systematic reviews and other literature reviews will not be eligible for inclusion, but the papers which they reference will be considered. By definition, case reports are not eligible since they do not report comparisons. Clinical trials are also ineligible, as they are investigator-controlled and do not report natural primary care prescribing patterns.
The exclusion criteria are as follows: (1) the study does not report on quantitative research; (2) the study does not focus on primary diagnosis of lung cancer; (3) the study does not focus on, or separately report, prescribing data prior to a lung cancer diagnosis; (4) the study does not compare a lung cancer group and a control group; (5) the study does not examine primary care prescribing data; (6) the study exclusively examines time windows extending beyond 24 months prior to lung cancer diagnosis.
Electronic databases will be systematically searched, with an end-date of 31st May 2023, using a combination of subject heading terms (e.g., MeSH terms) and keywords pertaining to three search themes: (1) lung cancer, (2) pre-diagnosis, and (3) prescribing. An additional search will replace the “pre-diagnosis” theme with that of “risk” or “association”, so as to capture epidemiological studies that view drug exposure as the cause of later lung cancer incidence, rather than as a signal of as-yet-undiagnosed lung cancer. An example of the search strategy is provided as Extended data21. An information specialist was consulted to optimise the development of the search strategy. The search strategy will be tailored to the specific requirements of each database, and validated filters will be employed to retrieve primary studies as necessary. We will search the following: Cochrane Library; Cumulative Index to Nursing and Allied Health Literature (CINAHL); Embase; MEDLINE; ProQuest Dissertation and Thesis Database; Scopus; Web of Science.
Two authors will independently screen the title and abstract of all papers for eligibility. Discrepancies will be discussed and resolved through consensus. Reviewers will then independently assess the full text of potentially relevant studies to determine whether they fulfil the inclusion criteria. In cases of uncertainty or disagreement, a third reviewer's opinion will be sought.
Data will be extracted from included studies by one reviewer and confirmed by a second reviewer using a pro forma specifically designed for the purpose. After deduplication, file metadata (title and abstract) will be imported into the web-based systematic review tool, Rayyan, where screening and data extraction will occur22.
The first category of data extracted relates to study characteristics: (1) publication year; (2) country/region of the study population; (3) study design (case-control or cohort); (4) the number of lung cancer cases; (5) the number of controls. The second category of data relates to prescribing data: (6) the drug or drug class being examined; (7) the number or proportion of patients in the lung cancer group prescribed the drug; (8) the number or proportion of patients in the control group prescribed the drug. The final datapoint to be extracted, (9) lung cancer prevalence in the study population, will likely only be reported or deducible from cohort studies.
For each prescribing event and specified time window, we will report or calculate the Prescribing Rate Ratio (PRR)—i.e., the PR in the cancer group (PRc) divided by the PR in the control group (PRn). We will also report the Odds Ratio (OR) of cancer for prescribing, along with 95% confidence intervals (CIs) for both metrics. For cohort studies, where the prevalence of cancer (Prev) can be empirically determined, we will report or calculate the Positive Predictive Value (PPV) and Relative Risk (RR) of cancer. The PRR was selected as the primary outcome as this statistic was commonly reported in case-control studies (in our pilot search) and can be calculated from cohort studies. This maximises the ability to compare between studies.
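The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function name, argument names and example counts are hypothetical, and the confidence intervals use the common log-scale (Wald-type) approximation, which is an assumption because the protocol does not state which interval method will be used.

```python
import math


def prescribing_metrics(cases_rx, cases_total, controls_rx, controls_total, z=1.96):
    """Return PRc, PRn, the PRR and the OR, with log-scale Wald intervals.

    All names are illustrative. `z=1.96` gives approximate 95% intervals.
    """
    a = cases_rx                       # cancer group, prescribed
    b = cases_total - cases_rx         # cancer group, not prescribed
    c = controls_rx                    # control group, prescribed
    d = controls_total - controls_rx   # control group, not prescribed
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for log-scale intervals")

    pr_c = a / cases_total             # prescribing rate in the cancer group
    pr_n = c / controls_total          # prescribing rate in the control group

    prr = pr_c / pr_n
    se_log_prr = math.sqrt(1 / a - 1 / cases_total + 1 / c - 1 / controls_total)

    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    def interval(estimate, se_log):
        # Back-transform a symmetric interval built on the log scale.
        return (estimate * math.exp(-z * se_log), estimate * math.exp(z * se_log))

    return {
        "pr_c": pr_c,
        "pr_n": pr_n,
        "prr": prr,
        "prr_ci": interval(prr, se_log_prr),
        "or": odds_ratio,
        "or_ci": interval(odds_ratio, se_log_or),
    }


# Hypothetical counts: 200 of 1000 cases and 400 of 4000 controls prescribed the drug.
example = prescribing_metrics(200, 1000, 400, 4000)
print(example["prr"], example["or"])  # PRR = 2.0, OR = 2.25
```

With these made-up counts the OR is larger than the PRR, which is the usual pattern when the prescribing event is not rare in either group.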
After eligibility has been determined, the ROBINS-E (“Risk Of Bias In Non-randomized Studies of Exposures”) tool will be used to assess the risk of bias within studies23. Two reviewers will assess risk of bias, and any discrepancies will be resolved by discussion with a third reviewer. Where concerns exist for any study, this will be reported.
We will report summary statistics for all of the data items outlined above. Where summary statistics can be calculated from the raw data, we will recalculate each statistic to ensure consistency in comparison between studies. We will report where this cannot be achieved.
For studies where the prevalence of lung cancer (prescribing events) cannot be empirically determined, we will calculate the PPV and RR implied by a range of credible prevalence figures. These will be selected from prevalence figures reported in similar studies or established national data, so as to aid comparison. Table 1 provides explanations of how relevant epidemiological metrics will be calculated using the extracted data; however, this list may not be exhaustive, and further mathematical transformations may be required depending on how data are reported. Any post hoc variations in our approach will be clearly outlined in the final paper.
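As an illustration of how an implied PPV and implied RR might be obtained from case-control data, the sketch below applies Bayes' theorem to the two prescribing rates (PRc and PRn) across a set of candidate prevalence values. The function name and every numeric value are hypothetical and chosen only for demonstration; the protocol does not specify which prevalence figures will be used.

```python
def implied_ppv_rr(pr_c, pr_n, prevalence):
    """Return (PPV, 1 - NPV, RR) implied by an assumed cancer prevalence.

    PPV     = P(cancer | prescribed)
    1 - NPV = P(cancer | not prescribed)
    RR      = PPV / (1 - NPV)
    """
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    if not (0 < pr_c < 1 and 0 < pr_n < 1):
        raise ValueError("prescribing rates must lie strictly between 0 and 1")

    ppv = (pr_c * prevalence) / (pr_c * prevalence + pr_n * (1 - prevalence))
    one_minus_npv = ((1 - pr_c) * prevalence) / (
        (1 - pr_c) * prevalence + (1 - pr_n) * (1 - prevalence)
    )
    return ppv, one_minus_npv, ppv / one_minus_npv


# Hypothetical inputs: PRc = 0.20, PRn = 0.10, and three made-up prevalence figures.
for prev in (0.001, 0.01, 0.05):
    ppv, _, rr = implied_ppv_rr(0.20, 0.10, prev)
    print(f"prevalence={prev:.3f}  implied PPV={ppv:.4f}  implied RR={rr:.3f}")
```

Under these expressions the implied RR approaches the OR as the assumed prevalence approaches zero, and shrinks towards 1 as prevalence rises, which is why reporting results over a range of prevalence figures is informative.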
PRR, prescribing rate ratios; OR, Odds Ratio; PPV, Positive Predictive Value; NPV, Negative Predictive Value; RR, Relative Risk.
| Epidemiological/diagnostic metric | Formula as a function of extracted data | 
|---|---|
| PRR (not a function of prevalence) | |
| OR (not a function of prevalence) | |
| PPV | |
| 1-NPV (for calculation purposes only) | |
| RR | |
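The formula cells of Table 1 are empty in this copy. As a hedged sketch only, and assuming the symbols used in the text (PRc and PRn for the prescribing rates in the cancer and control groups, Prev for the prevalence of cancer), the standard expressions that follow from Bayes' theorem would be as below. These are textbook identities and may not match the authors' notation exactly.

```latex
\begin{align*}
\mathrm{PRR} &= \frac{PR_c}{PR_n} \\[4pt]
\mathrm{OR} &= \frac{PR_c\,(1 - PR_n)}{PR_n\,(1 - PR_c)} \\[4pt]
\mathrm{PPV} &= \frac{PR_c \cdot Prev}{PR_c \cdot Prev + PR_n\,(1 - Prev)} \\[4pt]
1 - \mathrm{NPV} &= \frac{(1 - PR_c)\,Prev}{(1 - PR_c)\,Prev + (1 - PR_n)\,(1 - Prev)} \\[4pt]
\mathrm{RR} &= \frac{\mathrm{PPV}}{1 - \mathrm{NPV}}
\end{align*}
```

Only the last three expressions depend on Prev, consistent with the table's note that the PRR and OR are not functions of prevalence.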
We will report pooled estimates for the above statistics if multiple studies examine the association between incident lung cancer and a comparable prescribing event for a comparable time window in comparable populations. In this scenario, the GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach will be used to provide a rating of the strength of the body of evidence24.
Forest plots will be used to display the results and a pooled estimate with a prediction interval. The proportion of total variability due to between-study heterogeneity will be assessed statistically using Higgins & Thompson’s I² values. The tau-squared statistic and Cochran’s Q test for heterogeneity will also be reported. Where significant heterogeneity exists, this will be explored qualitatively between studies. If the heterogeneity is large and unexplained, meta-analysis may not be possible.
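To make the heterogeneity statistics concrete, the following Python sketch pools log-scale effect estimates (for example, log PRRs) and reports Cochran's Q, tau-squared and I². It is an illustration of the arithmetic only: the function name is hypothetical, and the DerSimonian-Laird estimator of tau-squared is an assumption, since the protocol does not name an estimator. The prediction interval is omitted because it requires a t-distribution quantile that the Python standard library does not provide.

```python
import math


def random_effects_pool(log_effects, std_errors, z=1.96):
    """Pool log-scale estimates with a DerSimonian-Laird random-effects model."""
    k = len(log_effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching standard errors")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = k - 1

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Higgins and Thompson's I-squared, as a percentage.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau-squared to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    return {
        "pooled_log": pooled,
        "pooled": math.exp(pooled),
        "ci": (math.exp(pooled - z * se_pooled), math.exp(pooled + z * se_pooled)),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical log PRRs and standard errors from three studies.
result = random_effects_pool([math.log(1.8), math.log(2.4), math.log(1.5)], [0.10, 0.15, 0.20])
print(result["pooled"], result["tau2"], result["i2"])
```

When Q does not exceed its degrees of freedom, tau-squared is truncated to zero and the random-effects result coincides with the fixed-effect one.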
For each analysis of more than 10 studies, a funnel plot and Egger test will be used to assess small-study publication bias. A sub-group analysis will be used to explore the effect of clinical factors (e.g., sex or age-stratification) and study design (i.e., case-control versus cohort) should these be available from multiple studies reporting the same drug-cancer association for a comparable time point or period. Where post hoc adjustments (e.g., imputation of unavailable data) are employed in the production of pooled estimates, a sensitivity analysis will also be performed.
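The Egger test mentioned above is, in its classical form, an ordinary least-squares regression of each study's standardised effect (effect divided by its standard error) on its precision (one divided by the standard error); an intercept far from zero suggests funnel-plot asymmetry. The sketch below shows that regression under stated assumptions: the function name is hypothetical, effects are taken to be on the log scale, and the p-value is left out because it needs a t-distribution that the Python standard library lacks.

```python
import math


def egger_regression(log_effects, std_errors):
    """Classical Egger regression: standardised effect on precision."""
    n = len(log_effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least three studies with matching standard errors")

    x = [1.0 / se for se in std_errors]                    # precision
    y = [e / se for e, se in zip(log_effects, std_errors)] # standardised effect

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx                    # estimate of the underlying effect
    intercept = y_bar - slope * x_bar    # asymmetry (bias) term

    # Residual variance and the standard error of the intercept.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan

    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "t": t_stat,
        "df": n - 2,
    }
```

The returned t statistic would be compared against a t-distribution with n - 2 degrees of freedom; the protocol's threshold of more than 10 studies reflects the low power of this test in small meta-analyses.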
Meta-analysis will be conducted in R version 4.2.3, using the “metafor” package25. If a meta-analysis is not possible, alternative synthesis will be conducted using the Synthesis without Meta-Analysis (SWiM) guideline.
We will conduct a systematic review identifying studies that explore the association between prescribing and an impending lung cancer diagnosis. Using our eligibility criteria, studies will be screened, and data items and summary measures extracted. Using this information, we will quantify the association between certain prescribing events and subsequent lung cancer diagnosis. A meta-analysis using a pooled risk estimate will be performed if possible.
The strength of this review is that it will facilitate the production of a systematic overview of prescribing events that are suggestive of an as-yet-undiagnosed lung cancer. There are a number of potential limitations of this review. Primary care prescribing data stored centrally in repositories (which are used for pharmacoepidemiological research) may differ from primary care prescribing data stored locally, thereby undermining the assumption that the results could be applied directly in the general practice context. Reduced rates of GP consultations due to the COVID-19 pandemic may also affect results, owing to reduced prescribing and diagnosis during this period26.
A noteworthy limitation of this work is that precise universal conclusions cannot be drawn from a synthesis of studies conducted in diverse populations. Firstly, the background rates of lung cancer will differ according to geography and sociodemographics. Secondly, clinical practice, prescribing guidelines and over-the-counter medication use will vary significantly between regions. However, this globally oriented review will serve to illuminate pre-diagnostic prescribing patterns with broad universal relevance, as well as critical knowledge gaps specific to individual regions which require future research.
A previous systematic review, by White et al., summarised reported diagnostic windows and inflection points for a range of cancers based on variations in healthcare utilisation8. This study, which examined pre-diagnostic changes in consultation rate, prescriptions, and diagnostic tests, included four studies that reported prescribing rates of certain medications in the 12 months before a cancer diagnosis10,11,13,27, one of which focused on lung cancer10. The present systematic review, however, will include all studies that enable deduction of prescribing rates in a lung cancer group and control group, not just those from which an inflection point could be determined. The effect of this broader scope is demonstrated by our pilot search, which covered the MEDLINE database only and retrieved all four of the studies from White et al., as well as four others.
There are many opportunities for future research in this area. While this review will explore pre-diagnostic prescribing patterns in the wider primary care population, further insights could be gleaned from stratifying the population according to other clinical information such as major co-morbidities (e.g., asthma and Chronic Obstructive Pulmonary Disease) and smoking status. This review will also focus on primary lung cancer diagnoses and further research could be done on cancer recurrence or outcomes. It is not likely that any of the identified studies, nor any meta-analysis thereof, will lead directly to changes in policy or clinical practice. However, this systematic review will directly inform attempts to identify individuals at an elevated risk of lung cancer who might benefit from screening or rapid-access referral pathways.
Many clinical prediction tools have been developed to aid GPs in estimating a patient’s risk of various cancers28. A potential advancement could involve automatic analysis of prescribing records, alerting GPs (during a consultation) to increased cancer risk in certain patients. This information could serve as a decision aid to assist GPs in detecting lung cancer at an earlier stage. We hope that this systematic review will usefully summarise the existing literature for researchers working in this area, provoking further innovation through the cross-pollination of ideas. Findings may also be of interest to stakeholders beyond the research community, including GPs, community pharmacists, other primary healthcare professionals, and policymakers.
Open Science Framework: Primary care prescribing prior to lung cancer diagnosis (PPP-Lung): protocol for a systematic review. https://doi.org/10.17605/OSF.IO/78GHF (reference 21).
This project contains the following extended data:
Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).
We would like to acknowledge Paul J Murphy (Information Specialist, RCSI Library), for his help designing the search strategy, Seamus Cotter (Patient and public involvement contributor, Irish Lung Cancer Community), for his input to the protocol, and Sarah Jacob (PRiCAN Research Group, RCSI) for facilitating the open-access availability of supplemental files via the Open Science Framework.
Is the rationale for, and objectives of, the study clearly described?
Yes
Is the study design appropriate for the research question?
Yes
Are sufficient details of the methods provided to allow replication by others?
Yes
Are the datasets clearly presented in a useable and accessible format?
Not applicable
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: pathology, lung cancer
Invited reviewers: 1. Version 1: 24 Apr 24.