Iranian Journal of Medical Sciences

Document Type : Review Article

Authors

1 Faculty of Medicine, University of Padjadjaran, Bandung, Indonesia

2 Division of Gastroenterology, Pancreatobiliary and Digestive Endoscopy, Department of Internal Medicine, Faculty of Medicine, University of Indonesia Dr. Cipto Mangunkusumo National General Hospital, Jakarta, Indonesia

3 Human Cancer Research Center, Indonesian Medical Education and Research Institute, Faculty of Medicine, University of Indonesia, Jakarta, Indonesia

Abstract

Background: Colorectal cancer (CRC) screening is essential to reduce incidence and mortality rates. However, participation in screening remains suboptimal. ColonFlag, a machine learning algorithm using complete blood count (CBC), identifies individuals at high CRC risk using routinely performed tests. This study aims to review the existing literature assessing the efficacy of ColonFlag across diverse populations in multiple countries.
Methods: This systematic review was reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Searches were conducted on PubMed, Cochrane, ScienceDirect, and Google Scholar for English articles, using keywords related to CBC, machine learning, ColonFlag, and CRC, covering the period from the first development study in 2016 to August 2023. The Prediction Model Risk of Bias Assessment Tool (PROBAST) was used to assess the risk of bias.
Results: A total of 949 articles were identified during the literature search. Ten studies were found to be eligible. ColonFlag yielded Area Under the Curve (AUC) values ranging from 0.736 to 0.82. The sensitivity and specificity ranged from 3.91% to 35.4% and 82.73% to 94%, respectively. The positive predictive values ranged between 2.6% and 9.1%, while the negative predictive values ranged from 97.6% to 99.9%. ColonFlag performed better in shorter time windows, tumors located more proximally, in advanced stages, and in cases of CRC compared to adenoma.
Conclusion: While ColonFlag exhibits low sensitivity compared to established screening methods such as the fecal immunochemical test (FIT) or colonoscopy, its potential to detect CRC before clinical diagnosis suggests an opportunity for identifying more cases than regular screening alone. 

Keywords

What’s Known

Detecting asymptomatic individuals for colorectal cancer (CRC) screening remains a challenging task. ColonFlag is a machine learning algorithm, incorporating age, gender, and 20 complete blood count (CBC) parameters from routine lab data. Machine learning techniques offer a valuable supplementary avenue, yet their feasibility for translation into clinical practice remains uncertain.

What’s New

ColonFlag demonstrated the ability to detect CRC in asymptomatic patients, yet it exhibited variability in performance across diverse populations. While ColonFlag is not intended to replace traditional screening programs, its potential to identify CRC before clinical diagnosis suggests an opportunity to detect more cases than regular screening alone.

Introduction

Colorectal cancer (CRC) is the world’s third most common cancer, with over 1.9 million new cases and 930,000 deaths in 2020 alone. 1 - 3 In developed countries, 25-30% of CRC cases are diagnosed at stage IV with distant metastases. 4 Effective screening is crucial to lower CRC incidence and mortality. 5 , 6 Current options include colonoscopy every 10 years or an annual fecal immunochemical test (FIT). 7 Despite recognized benefits, participation in CRC screening remains suboptimal. 8 , 9

Israel’s cost-effective approach uses a machine learning algorithm called ColonFlag to scan routine lab tests for high-risk indicators. 10 Anemia, identified with a 9.7% positive predictive value, can signal high-risk CRC. 11 In individuals lacking apparent anemia, colorectal neoplasms can still induce subtle changes in lab profiles due to minor blood loss. 12 , 13 The ColonFlag algorithm integrates demographic data and complete blood counts (CBC), predicting asymptomatic CRC, and has been validated in several countries. 6 , 14 - 19 This study aims to review the existing literature assessing the efficacy of ColonFlag across diverse populations in multiple countries.

Materials and Methods

Data Sources and Search Strategy

We adhered to PRISMA guidelines for our systematic review, registered on PROSPERO (ID: CRD42023454992). Searching on databases and gateways such as PubMed, Cochrane, ScienceDirect, and Google Scholar from 2016 to August 2023, we focused on English articles using specific keywords related to CBC, machine learning, ColonFlag, and CRC (table 1). We specifically chose articles from 2016 as it marks the first development study of ColonFlag. The objective of this study was to specifically evaluate ColonFlag as one of the existing machine learning algorithms. Titles and abstracts were independently assessed by RDP and SAS, with disagreements resolved through discussion with TAS.

Database No Query Results
PubMed 1. (Blood count* OR “full blood count*” OR “complete blood count*” OR “blood work”) 408,201
2. (((ColonFlag OR “machine learning” OR “Models, Statistical”[Mesh] OR “ROC Curve”[MESH] OR “predict* tool*”[tw] OR nomogram*[tw] OR “predict* model*”[tw] OR decision*[tw] OR scor*[tw] OR algorithm*[tw] OR “risk scor*”[tw] OR “sensitivity and specificity*”[tw] OR sensitivity[tw] OR specificity[tw] OR “predictive value of tests”[tw] OR prediction*[tw] OR “receiver operating characteristic curve*”[tw] OR “ROC curve*”[tw] OR “area under curve*”[tw] OR “area under curve”[tw] OR “area under the curve*”[tw] OR AUC[tw] OR “C statistic*”[tw] OR discriminat*[tw] OR classif*[tw] OR “absolute risk*”[tw] OR indices[tw] OR stratify*[tw] OR “c-statistic”[tw] OR “C statistic”[tw] OR “statistical learning”[tw] OR “statistical-learning”[tw] OR “positive predictive value*”[tw] OR “negative predictive value*”))) 6,181,583
3. ((“Colorectal Neoplasms”[Mesh] OR ((colorectal[tw] OR colorect*[tw]) AND (tumo*[tw] OR cancer[tw] OR carcinom*[tw] OR neoplas*[tw] OR malignan*[tw]))) OR (“Colonic Neoplasms”[Mesh] OR ((colon[tw] OR bowel[tw] OR colon*[tw]) AND (neoplas*[tw] OR tumo*[tw] OR cancer[tw] OR carcinom*[tw] OR malignan*[tw]))))) 449,967
4. #1 AND #2 AND #3 2,039
5. #4 NOT (“case reports”[Publication Type] OR “comment”[Publication Type] OR “editorial”[Publication Type] OR “guideline”[Publication Type] OR “introductory journal article”[Publication Type] OR “meta analysis”[Publication Type] OR “news”[Publication Type] OR “retracted publication”[Publication Type] OR “review”[Publication Type] OR “systematic review”[Publication Type]) 1,829
6. #5; filter English, Adult 19+ years 1,089
7. #6; filter 2016-2023 467
Cochrane 1. colorectal cancer OR colon cancer OR colorectal neoplasm* OR colon neoplasm* 24,087
2. “Full blood count” OR “complete blood count” 2,452
3. ColonFlag OR machine learning OR predict* model OR algorithm 39,149
4. #1 AND #2 AND #3 2
ScienceDirect 1. “colorectal cancer” OR “colorectal neoplasm” OR “colon cancer” OR “colon neoplasm” 262,063
2. ColonFlag OR machine learning 264,400
3. “Complete blood count” OR “full blood count” 80,689
4. #1 AND #2 AND #3 137
5. #4; filter 2016-2023 133
Google Scholar 1. “colorectal cancer” OR “colorectal neoplasm*” OR “colon cancer” OR “colon neoplasm*” 18,100
2. ColonFlag OR “machine learning” 18,600
3. “Full blood count” OR “complete blood count” 17,200
4. #1 AND #2 AND #3 823
5. #4; filter 2016-2023 632
6. #5 NOT “systematic review*” 347
Table 1. Detailed description of the search strategy used for systematic review

Inclusion and Exclusion Criteria

English-language primary research articles evaluating ColonFlag’s performance in CRC risk detection were included. Abstracts, conference proceedings, previously published systematic reviews, correspondence, and case studies were excluded.

Data Extraction

Three reviewers (RDP, SAS, NNH) independently assessed study eligibility and collected data using tailored extraction forms. Validation occurred through subsequent discussions, resolving disagreements until consensus. Extracted data included publication year, design, location, patient details (setting, type, population), sample size, data source, baseline patient characteristics, and model performance measures: Area Under the ROC Curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and odds ratio (OR).
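For readers less familiar with the extracted accuracy measures, the following is a minimal sketch of how sensitivity, specificity, PPV, NPV, and OR follow from a 2x2 table of test result against final diagnosis. The class name, function name, and counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 table of index-test result versus reference diagnosis."""
    tp: int  # flagged by the test, CRC confirmed
    fp: int  # flagged by the test, no CRC
    fn: int  # not flagged, CRC confirmed
    tn: int  # not flagged, no CRC


def diagnostic_measures(c: ConfusionCounts) -> dict:
    """Return the accuracy measures listed in the data extraction form."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "odds_ratio": (c.tp * c.tn) / (c.fp * c.fn),
    }


# Hypothetical counts chosen only for illustration.
example = ConfusionCounts(tp=30, fp=970, fn=70, tn=98930)
measures = diagnostic_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

AUC is not included here because it is computed from the full range of scores rather than from a single 2x2 table.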

Risk of Bias

The PROBAST was used to assess bias in studies developing or validating prediction models. PROBAST includes signaling questions in four domains: 1) Participants: How well the study population represents the target group, how missing data is managed, and how participants are chosen for model development or validation. 2) Predictors: The selection and measurement of variables used in the model, including how missing data, categorization, and interactions are handled. 3) Outcome: How the outcome (what the model predicts) is measured and managed, considering blinding, completeness of data, and appropriate outcome definitions. 4) Analysis: Evaluation of model development aspects, the type of selected model, management of missing data, and methods used for validation. 20 Three reviewers independently performed the risk of bias evaluation, which was confirmed by subsequent discussion. Any discrepancies that arose were discussed for resolution.

Results

Study Selection

Of the 949 initially identified articles, 591 underwent screening after duplicates were removed. Figure 1 outlines the selection process following PRISMA guidelines. Initially, 14 articles were eligible based on titles and abstracts. During the full-text assessment, four articles were excluded as they did not use ColonFlag as the intended index test. Two studies did not use artificial intelligence (AI); instead, they compared blood count parameters in two groups (n=1) and assessed the enhancement of FIT with blood test values (n=1). The other two studies employed a deep neural network using various parameters such as tumor markers and blood chemistry, not merely blood counts (n=1), and evaluated AI models based on colonoscopy images and diverse datasets (n=1).

Figure 1. The flow diagram shows the study selection process following the PRISMA 2020 statement. We identified 949 records via online databases, of which 591 underwent screening based on title and abstract. Subsequently, 14 studies were evaluated for eligibility, with four studies excluded for not employing machine learning (n=2) or not utilizing ColonFlag (n=2). Finally, 10 studies were deemed suitable for inclusion in this systematic review.
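The counts in the selection flow can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the text and in Figure 1; the variable names are ours, and the number of duplicates is derived on the assumption that deduplication was the only reason records were removed before screening.

```python
# Counts as reported in the text and in Figure 1.
identified = 949
screened_after_deduplication = 591
assessed_for_eligibility = 14
excluded_at_full_text = {"did not use machine learning": 2, "did not use ColonFlag": 2}

# Derived quantities (not stated explicitly in the source).
duplicates_removed = identified - screened_after_deduplication
excluded_on_title_abstract = screened_after_deduplication - assessed_for_eligibility
included = assessed_for_eligibility - sum(excluded_at_full_text.values())

print("duplicates removed:", duplicates_removed)
print("excluded on title/abstract:", excluded_on_title_abstract)
print("included studies:", included)
```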

Study Characteristics

This review included 10 studies outlined in table 2, providing details on the studies and subject characteristics. One study introduced ColonFlag as a novel algorithm, 10 seven studies 6 , 14 - 19 validated it across diverse populations, and two studies 21 , 22 compared ColonFlag’s performance with FIT. Sample sizes varied from 17,000 to 2.5 million individuals, drawn from asymptomatic subjects, electronic medical records (EMR), or primary care databases. Ayling and others focused on symptomatic individuals in a prospective study with approximately 500 subjects. 21 , 22 Goshen and others conducted a 14-month prospective study using ColonFlag to detect asymptomatic CRCs in a population at risk. 17 The remaining seven studies collected data retrospectively, and the majority of them additionally conducted a case-control analysis. Data, primarily from general practice records, was collected nationwide, with some studies including hospital records. Kinar and others expanded their dataset by incorporating records from Israel and the United Kingdom. 10

Study Study Design Patient setting Patient type Patient population Geographic location Timing of data collection Source of data Number of subjects Time window (months) Mean Age (years) Gender
Total Cases Control Male (%) Female (%)
Kinar, 2016 10 Retrospective cohort+case-control Primary care Anyone Inclusion: Age 50-75. Israel and UK Israel: January 2003-June 2011 Maccabi Healthcare Services (MHS) and the UK Health Improvement Network (THIN) Israel: 606,403 Israel: 3,315 UK: 25,613 3-6 Israel: 58.7 Israel: 46.4 Israel: 53.6
Exclusion: Diagnosed with cancers other than CRC. UK: January 2003-May 2012 UK: 30,674 UK: 5,061 UK: 67.4 UK: 49.2 UK: 50.8
Kinar, 2017 14 Retrospective cohort Primary care Anyone Inclusion: Aged 50-75 on January 1, 2008, with≥1 FBC in the MHS during the six-month testing period. Israel July 2007-December 2007 MHS and Israeli Cancer Registry 112,584 133 97 6 60.9 44 56
Exclusion: Cancer diagnosis before January 1, 2008, or no blood test during the testing period.
Birks, 2017 16 Case-control+retrospective cohort Primary care Anyone Inclusion: Patients >40 years old with≥1 FBC in their record. UK January 2000-April 2015 Clinical Practice Research Datalink (CPRD) 2,550,119 25,430 NA 18-24 No CRC=60.5±14.0 NA NA
Exclusion: <12 months registered, <2 years follow-up, prior CRC or precursors, hemoglobin gene defects. CRC=72.7±10.5
Hornbrook, 2017 15 Retrospective case-control Unclear Asymptomatic Inclusion: Eligible CRC cases with CBC before diagnosis. United States 1998-2013 Kaiser Permanente Northwest 17,095 900 16,195 0-6 and 6-12 58.0±11.8 44.1 55.9
Region’s Tumor Registry
Ayling, 2018 21 Prospective (PGC) and Retrospective cohort (RLH) Secondary care Symptomatic Inclusion: IDA patients referred to Plymouth Gastroenterology Clinic for FIT evaluation; IDA patients referred to Royal London Hospital for colonoscopy. UK March 2014-March 2017 Gastroenterology Clinic in Derriford Hospital, Plymouth, and Royal London Hospital medical records 592 NA NA NA Plymouth 48.14 51.86
Male: 70.9
Female: 69.1
London
Male: 66
Female: 60
Exclusion: Patients with non-anemia causes.
Goshen, 2018 17 Prospective cohort Secondary care Anyone Inclusion: Ages 50-75 in MHS with CBC recorded between October 2015 and December 2016, without a colonoscopy in the past 10 years, and no FIT in the 18 months before the index CBC. Israel October 2015-December 2016 MHS EMR and Israel Cancer Registry 79,671 NA NA 1-6 and 7-12 NA NA NA
Exclusion: Referred for FIT in the last 3 months but not completed, and prior cancer diagnosis.
Hilsden, 2018 6 Retrospective cohort and case-control Secondary care Asymptomatic Inclusion: Asymptomatic individuals, aged 50-75, who had a screening colonoscopy from January 2013 to June 2015, with a CBC within a year, average CRC risk, and personal/family history of polyps/CRC. Canada January 2013-June 2015 Alberta Health Services Forzani and MacPhail Colon Cancer Screening 17,676 NA NA NA NA 46.6 53.4
Exclusion: Positive FOBT, prior CRC, genetic predisposition, or no CBC within a year before colonoscopy.
Schneider, 2020 18 Retrospective cohort+case-control Unclear Anyone Inclusion: KPNC Health Plan members (1996-2015), aged ≥37, with ≥1 outpatient CBC. Cases: Ages 50-75, CBC, no prior/current CRC diagnosis, later diagnosed with CRC. Controls: Ages 50-75, randomly selected CBC, no CRC diagnosis. Both require a 6-month health plan membership and CBC before colonoscopy. United States January 199-December 2015 Kaiser Permanente Northern California Health Plan 308,721 6,019 302,702 0-6 and 6-12 58.5±7.7 48 52
Ayling, 2021 22 Prospective cohort Secondary care Symptomatic Inclusion: Patients over 40, on urgent pathway for suspected CRC on May 1, 2020. UK May-October 2020 Barts Health NHS Trust 532 NA NA 6 63 50.81 49.82
Exclusion: No final diagnosis, declined investigations, inaccessible, overseas, unable to attend, awaiting definitive investigations, and invalid FIT.
Holt, 2023 19 Case-control+retrospective cohort Primary care Anyone Inclusion: Individuals >40 years, with one FBC in CPRD record (01/2000-28/04/2015) and associated ColonFlag score. UK January 2000-April 2015 CPRD and National Cancer Registry 1,893,641 18,130 270,750 18-24 NA NA NA
Exclusion: <2 years follow-up, <12 months registered, or hemoglobin gene defect.
Table 2. Characteristics of the studies included in the systematic review

ColonFlag Performance Test

Most studies focused on the AUC as the primary outcome, with secondary outcomes including sensitivity, specificity, PPV, NPV, and OR. AUC values across diverse populations ranged from 0.736 to 0.82. 10 , 15 , 16 , 18 , 19 Excluding Ayling’s prospective studies, 21 , 22 the sensitivity and specificity ranged from 3.91% to 35.4% and 82.73% to 94%, respectively. Ayling’s studies had higher sensitivity (52.4% and 88.24%) and lower specificity (71.3% and 71.07%). PPV varied between 2.6% and 9.1%, and NPV ranged from 97.6% to 99.9%. Table 3 shows the outcome of the included studies. Among the studies providing ColonFlag scores, 6 , 14 , 16 , 19 two 16 , 19 indicated higher mean scores in CRC-diagnosed individuals (mean=79-83.8) than in those without a diagnosis (mean=51.5-56.3). In the development study, 10 an AUC of 0.826±0.01 was achieved, further validated on an external THIN database in the UK with an AUC of 0.81, OR of 40, and specificity of 94%. Figure 2 compares the reported AUC values across studies.

Study Mean ColonFlag score AUC (95% CI) Sensitivity (%, 95% CI) Specificity (%, 95% CI) PPV (%) NPV (%) OR (95% CI)
Kinar, 2016 10 (Israel) 0.82±0.01* 88±2* 26±5*
Kinar, 2016 10 (UK) 0.81 94±1 40±6
Kinar, 2017 14 Female=59.3 17.3 21.8 (13.8, 34.2)
Male=46.8
Birks, 2017 16 No CRC=51.5±29.0 0.776 (0.771, 0.781) 3.91 (3.40, 4.48) 82.73 (82.68, 82.78) 8.8 99.6 26.5 (23.3, 30.2)
CRC=79.1±19.5
Hornbrook, 2017 15 0.8 (0.79, 0.82) 34.7 (28.9, 40.4)
Ayling, 2018 21 52.4 71.3 6.3 97.6
Goshen, 2018 17 21.7 33.3 (22.6, 49.1)
Hilsden, 2018 6 56.8±18.5 8.1 (6.4, 9.8) 5.1 (2.3, 8.9)
Schneider, 2020 18 0.78 (0.77, 0.78) 35.4 (33.8, 36.7) 17.7 (16.5, 18.7)
Ayling, 2021 22 88.24 (63.56, 98.54) 71.07 (66.94, 74.94) 9.1 (95% CI, 7.47, 11.15) 99.45 (95% CI, 98.03, 99.85)
Holt, 2023 19 No CRC=56.3 0.736 (0.715, 0.759) 10 2.6 99.9 1.05 (1.047, 1.053)**
CRC=83.8
AUC: Area Under the Curve; PPV: Positive predictive value; NPV: Negative predictive value; OR: Odds ratio; *Standard Deviation (SD value); **OR for a ColonFlag/unit increase
Table 3. Overall performance test of ColonFlag across the studies included in the systematic review

Figure 2. The AUC reported by five studies in six populations. AUC values ranged from 0.736 to 0.82 (blue diamonds) with their respective 95% confidence intervals (black horizontal lines).
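The combination of low PPV and high NPV reported above is expected for any test applied to a condition with low prevalence. As a minimal sketch, Bayes' theorem gives the predictive values from sensitivity, specificity, and prevalence. The function name and all input values below are hypothetical and serve only to show the dependence on prevalence; they are not estimates from the included studies.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a binary test using Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics evaluated at three assumed prevalences.
for prevalence in (0.005, 0.01, 0.05):
    ppv, npv = predictive_values(sensitivity=0.30, specificity=0.95, prevalence=prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, PPV rises and NPV falls as prevalence increases, which is one reason predictive values are hard to compare between screening and symptomatic populations.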

An age-only detection algorithm achieved an AUC of 0.73. 15 In a case-control sensitivity analysis with age matching, the resulting AUC dropped to 0.583. 16 Notably, a comprehensive model with an AUC of 0.78 outperformed the AUC of 0.65 from an age-only model. Gender-specific age-alone models yielded AUCs of 0.61 for men and 0.60 for women, considerably lower than the comprehensive model’s AUC of 0.78. 18 Another study, initially showing an AUC of 0.736, dropped to 0.536 when age was excluded in case-control matching. Substituting any symptom for the ColonFlag score resulted in an AUC of 0.725. 19
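To interpret these values, it helps to recall that the AUC equals the probability that a randomly chosen case receives a higher score than a randomly chosen control, so 0.5 corresponds to chance and 1.0 to perfect separation. The sketch below computes it by direct pairwise comparison; the function name and the scores are hypothetical, and the included studies may have used other estimation methods.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the share of case-control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores for illustration only.
cases = [92, 85, 78, 60]
controls = [88, 70, 55, 40, 35, 20]
print(f"AUC = {auc_from_scores(cases, controls):.3f}")
```

On this reading, AUCs of 0.536 to 0.583 after age matching indicate discrimination only slightly better than chance.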

ColonFlag Score Cut-off and Odds Ratio

Birks and others used a ColonFlag risk score cutoff of 99.84, yielding an OR of 26.5 for CRC diagnosis. 16 Kinar and others reported a similar value (99.38, top one percentile), resulting in an OR of 21.8. 14 Goshen and others used a cutoff score of 99.6, yielding an OR of 33.3. 17 Schneider and others assessed ColonFlag with a cutoff score of ≥96, corresponding to a specificity of 97%, resulting in an OR of 17.7. 18 Holt and others demonstrated a PPV of 10% at a ColonFlag score cutoff >99.8. 19
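The ORs above come from dichotomizing the score at a cutoff. The following sketch shows one way to build the 2x2 table at a given cutoff and to attach a 95% confidence interval using the standard log (Woolf) method. The function name, the inclusive threshold, the interval method, and the data are our assumptions; the included studies do not all state how their intervals were derived.

```python
import math


def odds_ratio_at_cutoff(scores, has_crc, cutoff, z=1.96):
    """Return (OR, lower, upper) for CRC among scores >= cutoff versus below it."""
    a = b = c = d = 0
    for score, crc in zip(scores, has_crc):
        flagged = score >= cutoff  # inclusive threshold is an assumption
        if flagged and crc:
            a += 1  # flagged, CRC
        elif flagged:
            b += 1  # flagged, no CRC
        elif crc:
            c += 1  # not flagged, CRC
        else:
            d += 1  # not flagged, no CRC
    if 0 in (a, b, c, d):
        raise ValueError("odds ratio is undefined when any cell is zero")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical scores and outcomes for illustration only.
scores = [99.9, 99.7, 99.5, 98.0, 90.0, 75.0, 60.0, 40.0, 20.0, 10.0]
has_crc = [True, True, False, True, False, False, False, False, False, False]
or_value, ci_low, ci_high = odds_ratio_at_cutoff(scores, has_crc, cutoff=99.0)
print(f"OR = {or_value:.1f} (95% CI {ci_low:.2f}, {ci_high:.1f})")
```

With so few observations the interval is very wide, which illustrates why ORs at extreme cutoffs need large cohorts to be estimated with any precision.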

ColonFlag Performance Test Based on Various Subgroups Analysis

The studies analyzed various aspects, consistently reporting four: time window to CRC diagnosis (n=5), tumor location (n=4), CRC stage (n=3), and histopathological findings (n=4). ColonFlag performs better in shorter time windows, proximal tumor locations, advanced stages, and CRC compared to adenoma.

Time Window: Sensitivity during the initial 6 months surpassed the subsequent period for both the top one and three percentile groups across age groups. 14 - 16 , 18 Birks and others focused on the 18-24 month period in their primary analysis, with secondary analyses at intervals of 3-6, 6-12, 12-18, and 24-36 months before diagnosis, revealing declining AUC, sensitivity, and specificity with extended time windows. 16 Holt and others identified the ‘pre-symptomatic’ phase, indicating ColonFlag scores began rising around 3-4 years before diagnosis. Effective discrimination occurred in the 18-24 months preceding CRC diagnosis 19 (table 4).

Study AUCa Sensitivityb Others
Kinar, 2017 14 0-6 months
1% percentile=25%
3% percentile=29%
6-12 months
1% percentile=9.5%
3% percentile=20%
Birks, 2017 16 3-6 months=0.844 3-6 months=14.2% Specificityb
6-12 months=0.813 6-12 months=9.3% 3-6 months=92.50%
12-24 months=0.791 12-24 months=6.2% 6-12 months=86.98%
18-24 months=0.776 18-24 months=3.91% 12-24 months=84.98%
24-36 months=0.751 24-36 months=2.5% 18-24 months=82.73%
24-36 months=79.41%
Hornbrook, 2017 15 0-180 days:
50-75 age group=34.5%
40-89 age group=39.9%
181-360 days:
50-75 age group=18.8%
40-89 age group=27.4%
Schneider, 2020 18 0-182 days=40.5% ORc
183-365 days=25.0% 0-182 days=12.9
183-365 days=6.3
Holt, 2023 19 Males
0-6 months=0.624
6-12 months=0.605
12-18 months=0.557
18-24 months=0.536
Females
0-6 months=0.624
6-12 months=0.624
12-18 months=0.567
18-24 months=0.536
AUC: Area Under the Curve; OR: Odds Ratio; aComputed by plotting a Receiver Operating Characteristics (ROC) curve based on model predictions and true labels, then calculating the area under this curve. bUsing the predicted outcomes from a binary classification model and comparing them to the true outcomes of the instances. cCalculated by comparing the odds of the event in the exposed group to the odds of the event in the unexposed group using data.
Table 4. ColonFlag performance based on different time windows or time intervals from the blood count examination to the time of diagnosis

Tumor Location: Three studies revealed ColonFlag’s capacity to detect CRC throughout the entire colon, especially excelling in proximal sites (table 5). 10 , 15 , 18 Its efficacy peaked in identifying cecal and ascending colon tumors, diminished in the transverse colon, and reached its lowest in the sigmoid colon and rectum. The OR in table 5 is the OR of the ColonFlag model for detecting tumors at various locations in the colon. At a specificity of 99%, the OR for detecting cecal tumors was 93.4, significantly higher than the OR of 10.2 for detecting rectal tumors. 15

Study Sensitivitya ORb Others
Kinar, 2016 10 Specificitya
Rectum=85.9%
Left colon=87.4%
Transverse colon=93.6%
Right colon=96.1%
Hornbrook, 2017 15 Cecum=93.4
Ascending=90.0
Transverse=51.1
Sigmoid=13.8
Rectum=10.2
Hilsden, 2018 6 Ascending/cecum=10.8% Ascending/cecum=2.6
Other=13.2% Other=3
Schneider, 2020 18 Distal=27.3% Distal=12.1 AUCc
Proximal=51.8% Proximal=34.7 Distal=0.74
Proximal=0.86
AUC: Area Under the Curve; OR: Odds Ratio; aUsing the predicted outcomes from a binary classification model and comparing them to the true outcomes of the instances. bCalculated by comparing the odds of the event in the exposed group to the odds of the event in the unexposed group using data. cComputed by plotting a Receiver Operating Characteristics (ROC) curve based on model predictions and true labels, then calculating the area under this curve.
Table 5. ColonFlag performance based on different tumor locations across the colon and rectum

Stage: ColonFlag demonstrated higher sensitivity and OR in detecting advanced-stage CRC compared to early-stage cases (table 6). 6 , 15 , 18 The performance difference between the early-stage (0, 1, 2) and advanced-stage (SEER 3, 4, 7) groups was statistically significant. 18

Study Sensitivitya ORb AUCc
Hornbrook, 2017 15 In situ=12.1
I=16.7
II=54.1
III=57.3
IV=40.4
Hilsden, 2018 6 I/II=10.7% I/II=2.3%
III/IV=18.3% III/IV=4.6%
Schneider, 2020 18 Early stage (0, I, II)=28.8% Early stage (0, I, II)=0.75
Advanced stage (III, IV, VII)=43.4% Advanced stage (III, IV, VII)=0.82
aUsing the predicted outcomes from a binary classification model and comparing them to the true outcomes of the instances. bCalculated by comparing the odds of the event in the exposed group to the odds of the event in the unexposed group using data. cComputed by plotting a Receiver Operating Characteristics (ROC) curve based on model predictions and true labels, then calculating the area under this curve.
Table 6. ColonFlag performance based on CRC stage

Histopathological Findings: ColonFlag excelled in detecting CRC compared to its performance in identifying both CRC and high-risk adenomas. 21 , 22 Two studies demonstrated its ability to identify high-risk precancerous conditions, including advanced adenomatous polyps (table 7). However, ColonFlag exhibited lower performance in identifying any adenomatous polyps than its CRC detection performance. 6 , 18

Study Sensitivitya Specificitya PPVb NPVb ORc AUCd
Ayling, 2018 CRC=52.4% CRC=71.3% CRC=6.3% CRC=97.6%
CRC+HRA=46.9% CRC+HRA=72% CRC+HRA=13.1% CRC+HRA=93.8%
Hilsden, 2018 CRC=5.1
Advanced adenoma/SSP=2.0
Non-advanced adenoma/SSP=1.7
Non-neoplastic polyp=1.2
Schneider, 2020 CRC=35.4% CRC=17.7% CRC=0.78
Adenoma=3.8% Adenoma=1.3% Adenoma=0.57
Ayling, 2021 CRC=81.8% CRC=73.5% CRC=8.3% CRC=99.3%
CRC+HRA=42.8% CRC+HRA =73.4% CRC+HRA =13.7% CRC+HRA=92.8%
HRA: High-risk adenoma; OR: Odds ratio; SSP: Sessile serrated polyp; aUsing the predicted outcomes from a binary classification model and comparing them to the true outcomes of the instances. bUsing the predicted outcomes from ColonFlag and comparing them to the true outcomes of the instances. cCalculated by comparing the odds of the event in the exposed group to the odds of the event in the unexposed group using data. dComputed by plotting a Receiver Operating Characteristics (ROC) curve based on model predictions and true labels, then calculating the area under this curve.
Table 7. ColonFlag performance based on histopathology findings

Risk of Bias: Four studies were deemed high-risk, and one had unclear bias (table 8). Three studies inadequately addressed missing data, omitting many participants due to incomplete datasets. 6 , 10 , 14 Another study lacked information on handling missing data appropriately. 14 Most studies used retrospective cohort and case-control designs, with only two using a prospective cohort approach with a limited number of subjects. 21 , 22

No Study Risk of Bias (ROB) Applicability Overall
Participants Predictors Outcome Analysis Participants Predictors Outcome ROB Applicability
1 Kinar, 2016 10 Low Low Low High Low Low Low High Low
2 Kinar, 2017 14 Low Low Low Unclear Low Low Low Unclear Low
3 Birks, 2017 16 Low Low Low Low Low Low Low Low Low
4 Hornbrook, 2017 15 Low Low Low Low Low Low Low Low Low
5 Ayling, 2018 21 Low Low Low Low Low Low Low Low Low
6 Goshen, 2018 17 Low Low Low High Low Low Low High Low
7 Hilsden, 2018 6 Low Low Low High Low Low Low High Low
8 Schneider, 2020 18 Low Low Low Low Low Low Low Low Low
9 Ayling, 2021 22 Unclear Low Low High Low Low Low High Low
10 Holt, 2023 19 Low Low Low Low Low Low Low Low Low
Table 8. Risk of bias assessment
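The overall ROB column in table 8 is consistent with the usual PROBAST convention, as we understand it: a study is rated high if any domain is high, unclear if at least one domain is unclear and none is high, and low only if all domains are low. The sketch below applies that rule to four rows transcribed from table 8. The function name and data structure are ours, and the review itself does not spell out the rule.

```python
def overall_rob(domains: dict) -> str:
    """Combine domain-level ratings into an overall risk-of-bias rating."""
    ratings = [rating.lower() for rating in domains.values()]
    if "high" in ratings:
        return "High"
    if "unclear" in ratings:
        return "Unclear"
    return "Low"


# Domain ratings transcribed from table 8 (risk-of-bias columns only).
table8_rows = {
    "Kinar, 2016": {"participants": "Low", "predictors": "Low", "outcome": "Low", "analysis": "High"},
    "Kinar, 2017": {"participants": "Low", "predictors": "Low", "outcome": "Low", "analysis": "Unclear"},
    "Ayling, 2021": {"participants": "Unclear", "predictors": "Low", "outcome": "Low", "analysis": "High"},
    "Holt, 2023": {"participants": "Low", "predictors": "Low", "outcome": "Low", "analysis": "Low"},
}

for study, domains in table8_rows.items():
    print(f"{study}: {overall_rob(domains)}")
```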

Discussion

ColonFlag utilizes a machine learning algorithm, employing a random forest model with decision trees and cross-validation, incorporating age, gender, and 20 CBC parameters. 10 It generates scores on a 1 to 100 scale, reflecting CRC risk based on fluctuations in the CBC levels. 14 The algorithm identified red blood cell (RBC) and hemoglobin (Hb)-related factors as crucial for case identification, with platelet-related factors also significant, and white blood cell-related factors having a smaller impact. 23 ColonFlag was able to identify CRC in asymptomatic patients, even without anemia. 24 However, the reported sensitivity of ColonFlag exhibits considerable variation, spanning from 3.91% to 35.4%. This broad range, especially when considering the lower limit, suggests a significant risk of overlooking individuals at a high risk of CRC. This low sensitivity is a concern, as it markedly reduces the tool’s practical efficacy in clinical settings. Most of the included studies were also limited by a retrospective design, the absence of comparable diagnostic data (e.g., colonoscopy) for all cancer controls, and an inability to discern the specific reasons for blood testing.

Age was the primary predictive factor, as shown by the decrease in AUC when age was matched in a case-control sensitivity analysis. 15 , 16 , 18 Despite the value of age in assessing CRC risk, combining the ColonFlag score or symptoms with age and gender did not significantly enhance predictive capability compared to using age and gender alone. This implies that ColonFlag’s discriminative performance relies heavily on age rather than on CBC changes. 19 Many studies use a cutoff of >99 for a positive ColonFlag test, yielding notable ORs for CRC detection and supporting further evaluation of individuals whose scores exceed this threshold. 25 Implementing one percentile cutoffs semiannually or three percentile cutoffs annually could offer comparable benefits. 14

The included studies span across various countries and populations, revealing variations in ColonFlag’s performance across these diverse demographic groups. The studies exhibit diverse study designs, ranging from retrospective, prospective cohort to case-control studies. They involved populations with different eligibility criteria and characteristics, some with limitations related to the quality and completeness of data, comparable diagnostic data, and potential inaccuracies in datasets. These diversities may introduce methodological variations and affect the synthesis of results.

The predictive performance of ColonFlag improves with a shorter time interval between CBC and diagnosis. It effectively discriminates between CRC patients and controls 18-24 months before diagnosis, without evident symptoms except for rectal bleeding. 19 This highlights the importance of investigating rectal bleeding for swift referral. The ColonFlag score shows an upward trend, diverging 3-4 years before diagnosis, within the pre-symptomatic phase. One-third of individuals with thrombocytosis and cancer had no documented cancer-related symptoms. 26 Early CRC detection is emphasized by monitoring CBC indices before symptoms appear. 23

ColonFlag identifies CRC across the entire colon and excels in proximal areas, which could complement noninvasive screening tools such as FOBT or FIT in detecting right-sided colon cancer. 27 The varying specificity in different colonic regions aligns with reduced anemia prevalence toward the rectum, underscoring the clinical significance of ColonFlag, especially for right-sided CRC detection. 28 , 29 Lower Hb levels correlate with tumors closer to the colon’s proximal region. 30 Studies noted a significant Hb decrease in patients with proximal colon tumors compared to distal colon and rectum tumors. 30 - 32 Disparities between proximal and distal CRC may be due to bleeding mechanisms, but other factors such as immunological processes should also be considered. 30

Blood loss leading to iron deficiency is a primary cause of anemia in CRC patients. 33 Anemia in CRC often presents as microcytic, especially in advanced stages. 32 ColonFlag showed better performance in CRC cases than adenoma cases. Evaluating pre-cancerous lesions, the highest test performance was seen in advanced adenoma, while non-neoplastic polyps had the least robust performance. Iron deficiency and ferritin significantly decreased in CRC, 34 reinforcing the link between CRC and anemia. Prior studies found notable differences in 16 out of 23 blood cell parameters for CRC compared to adenoma and polyp, 35 consistent with a meta-analysis of CBC tests in CRC detection. 23 All eight indicators related to RBC displayed significant distinctions between CRC, adenoma, and polyp cases. 35 These outcomes align with a recent study where Hb, MCV, and serum ferritin levels decreased before a CRC diagnosis. 36

Inflammation plays a crucial role in carcinogenesis, 37, 38 with chronic inflammation influencing every phase of tumor development. Studies demonstrate the diagnostic potential of the neutrophil-lymphocyte ratio (NLR), platelet-lymphocyte ratio (PLR), and mean platelet volume (MPV), 39-42 with one study reporting an AUC of 0.904. 43 These parameters could potentially enhance the ColonFlag algorithm’s performance, enabling it to identify subtle patterns, correlations, and trends that might otherwise go unnoticed.
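For reference, NLR and PLR are simple ratios of absolute counts reported in a CBC with differential, while MPV is reported directly by the analyzer. The following is a minimal Python sketch of the two ratios, assuming counts expressed in 10^9 cells/L. The class name, function names, and sample values are illustrative assumptions and are not part of ColonFlag or of any cited study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DifferentialCounts:
    """Absolute counts from a CBC with differential (assumed unit: 10^9 cells/L)."""
    neutrophils: float
    lymphocytes: float
    platelets: float


def neutrophil_lymphocyte_ratio(counts: DifferentialCounts) -> float:
    """NLR = absolute neutrophil count / absolute lymphocyte count."""
    if counts.lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return counts.neutrophils / counts.lymphocytes


def platelet_lymphocyte_ratio(counts: DifferentialCounts) -> float:
    """PLR = platelet count / absolute lymphocyte count."""
    if counts.lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return counts.platelets / counts.lymphocytes


# Made-up values for illustration only, not patient data.
sample = DifferentialCounts(neutrophils=4.0, lymphocytes=2.0, platelets=250.0)
print(neutrophil_lymphocyte_ratio(sample))  # 2.0
print(platelet_lymphocyte_ratio(sample))    # 125.0
```

Both ratios are dimensionless as long as the numerator and denominator share the same unit, which is why they can be derived from routine CBC output without additional testing.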

To the best of our knowledge, this systematic review is the first to evaluate ColonFlag’s efficacy comprehensively. One limitation of this study was its reliance on published data, which could introduce bias due to unreported outcomes. Another was the exclusion of articles in languages other than English. Because the study was not a meta-analysis and did not pool results, no statistical analysis was undertaken to evaluate publication bias.

Conclusion

While ColonFlag exhibits low sensitivity compared to established screening methods such as FIT or colonoscopy, its potential to detect CRC before clinical diagnosis suggests an opportunity to identify more cases than regular screening alone. ColonFlag is not a substitute for traditional screening programs. Further prospective evaluation is warranted to assess the algorithm’s feasibility, efficiency, and accuracy across diverse clinical settings. Moreover, studies are needed to evaluate how additional medical records or routine laboratory data influence test performance.

Acknowledgment

This study was supported by Hibah Kolaborasi Riset Internasional University of Indonesia 2019 through the Directorate of Research and Community Engagement University of Indonesia.

Authors’ Contribution

RD.P: Study concept, study design, project administration, data curation, drafting, and reviewing the manuscript; SA.S: Study concept, study design, project administration, data curation, drafting, and reviewing the manuscript; NN.H: Study concept, study design, data curation, drafting, and reviewing the manuscript; TA.S: Study concept, study design, data curation, drafting, and reviewing the manuscript; M.A: Study concept, study design, data curation, drafting, and reviewing the manuscript. All authors have read and approved the final manuscript and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Conflict of Interest

None declared.

References

  1. Rawla P, Sunkara T, Barsouk A. Epidemiology of colorectal cancer: incidence, mortality, survival, and risk factors. Prz Gastroenterol. 2019; 14:89-103.
  2. Siegel RL, Wagle NS, Cercek A, Smith RA, Jemal A. Colorectal cancer statistics, 2023. CA Cancer J Clin. 2023; 73:233-54.
  3. Morgan E, Arnold M, Gini A, Lorenzoni V, Cabasag CJ, Laversanne M, et al. Global burden of colorectal cancer in 2020 and 2040: incidence and mortality estimates from GLOBOCAN. Gut. 2023; 72:338-44.
  4. Hultcrantz R. Aspects of colorectal cancer screening, methods, age and gender. J Intern Med. 2021; 289:493-507.
  5. Zheng S, Schrijvers JJA, Greuter MJW, Kats-Ugurlu G, Lu W, de Bock GH. Effectiveness of Colorectal Cancer (CRC) Screening on All-Cause and CRC-Specific Mortality Reduction: A Systematic Review and Meta-Analysis. Cancers (Basel). 2023; 15.
  6. Hilsden RJ, Heitman SJ, Mizrahi B, Narod SA, Goshen R. Prediction of findings at screening colonoscopy using a machine learning algorithm based on complete blood counts (ColonFlag). PLoS One. 2018; 13:e0207848.
  7. Helsingen LM, Kalager M. Colorectal Cancer Screening - Approach, Evidence, and Future Directions. NEJM Evid. 2022; 1.
  8. Young GP, Rabeneck L, Winawer SJ. The Global Paradigm Shift in Screening for Colorectal Cancer. Gastroenterology. 2019; 156:843-51.
  9. Wools A, Dapper EA, de Leeuw JR. Colorectal cancer screening participation: a systematic review. Eur J Public Health. 2016; 26:158-68.
  10. Kinar Y, Kalkstein N, Akiva P, Levin B, Half EE, Goldshtein I, et al. Development and validation of a predictive model for detection of colorectal cancer in primary care by analysis of complete blood counts: a binational retrospective study. J Am Med Inform Assoc. 2016; 23:879-90.
  11. Astin M, Griffin T, Neal RD, Rose P, Hamilton W. The diagnostic value of symptoms for colorectal cancer in primary care: a systematic review. Br J Gen Pract. 2011; 61:e231-43.
  12. Spell DW, Jones DV, Harper WF, David Bessman J. The value of a complete blood count in predicting cancer of the colon. Cancer Detect Prev. 2004; 28:37-42.
  13. Ay S, Eryilmaz MA, Aksoy N, Okus A, Unlu Y, Sevinc B. Is early detection of colon cancer possible with red blood cell distribution width? Asian Pac J Cancer Prev. 2015; 16:753-6.
  14. Kinar Y, Akiva P, Choman E, Kariv R, Shalev V, Levin B, et al. Performance analysis of a machine learning flagging system used to identify a group of individuals at a high risk for colorectal cancer. PLoS One. 2017; 12:e0171759.
  15. Hornbrook MC, Goshen R, Choman E, O’Keeffe-Rosetti M, Kinar Y, Liles EG, et al. Early Colorectal Cancer Detected by Machine Learning Model Using Gender, Age, and Complete Blood Count Data. Dig Dis Sci. 2017; 62:2719-27.
  16. Birks J, Bankhead C, Holt TA, Fuller A, Patnick J. Evaluation of a prediction model for colorectal cancer: retrospective analysis of 2.5 million patient records. Cancer Med. 2017; 6:2453-60.
  17. Goshen R, Choman E, Ran A, Muller E, Kariv R, Chodick G, et al. Computer-Assisted Flagging of Individuals at High Risk of Colorectal Cancer in a Large Health Maintenance Organization Using the ColonFlag Test. JCO Clin Cancer Inform. 2018; 2:1-8.
  18. Schneider JL, Layefsky E, Udaltsova N, Levin TR, Corley DA. Validation of an Algorithm to Identify Patients at Risk for Colorectal Cancer Based on Laboratory Test and Demographic Data in Diverse, Community-Based Population. Clin Gastroenterol Hepatol. 2020; 18:2734-41.e6.
  19. Holt TA, Virdee PS, Bankhead C, Patnick J, Nicholson BD, Fuller A, et al. Early detection of colorectal cancer using symptoms and the ColonFlag: case-control and cohort studies. NIHR Open Research. 2023; 3.
  20. Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies. Ann Intern Med. 2019; 170:51-8.
  21. Ayling RM, Lewis SJ, Cotter F. Potential roles of artificial intelligence learning and faecal immunochemical testing for prioritisation of colonoscopy in anaemia. Br J Haematol. 2019; 185:311-6.
  22. Ayling RM, Wong A, Cotter F. Use of ColonFlag score for prioritisation of endoscopy in colorectal cancer. BMJ Open Gastroenterol. 2021; 8.
  23. Virdee PS, Marian IR, Mansouri A, Elhussein L, Kirtley S, Holt T, et al. The Full Blood Count Blood Test for Colorectal Cancer Detection: A Systematic Review, Meta-Analysis, and Critical Appraisal. Cancers (Basel). 2020; 12.
  24. Hippisley-Cox J, Coupland C. Identifying patients with suspected colorectal cancer in primary care: derivation and validation of an algorithm. Br J Gen Pract. 2012; 62:e29-37.
  25. National Institute for Health and Care Excellence. Suspected cancer recognition and referral overview. NICE Guideline. 2015;1-30.
  26. Bailey SE, Ukoumunne OC, Shephard EA, Hamilton W. Clinical relevance of thrombocytosis in primary care: a prospective cohort study of cancer incidence using English electronic medical records and cancer registry data. Br J Gen Pract. 2017; 67:e405-e13.
  27. Edna TH, Karlsen V, Jullumstro E, Lydersen S. Prevalence of anaemia at diagnosis of colorectal cancer: assessment of associated risk factors. Hepatogastroenterology. 2012; 59:713-6.
  28. Haug U, Kuntz KM, Knudsen AB, Hundt S, Brenner H. Sensitivity of immunochemical faecal occult blood testing for detecting left- vs right-sided colorectal neoplasia. Br J Cancer. 2011; 104:1779-85.
  29. Hirai HW, Tsoi KK, Chan JY, Wong SH, Ching JY, Wong MC, et al. Systematic review with meta-analysis: faecal occult blood tests show lower colorectal cancer detection rates in the proximal colon in colonoscopy-verified diagnostic studies. Aliment Pharmacol Ther. 2016; 43:755-64.
  30. Vayrynen JP, Tuomisto A, Vayrynen SA, Klintrup K, Karhu T, Makela J, et al. Preoperative anemia in colorectal cancer: relationships with tumor characteristics, systemic inflammation, and survival. Sci Rep. 2018; 8:1126.
  31. Sadahiro S, Suzuki T, Tokunaga N, Mukai M, Tajima T, Makuuchi H, et al. Anemia in patients with colorectal cancer. J Gastroenterol. 1998; 33:488-94.
  32. Dunne JR, Gannon CJ, Osborn TM, Taylor MD, Malone DL, Napolitano LM. Preoperative anemia in colon cancer: assessment of risk factors. Am Surg. 2002; 68:582-7.
  33. Mandel JS, Church TR, Bond JH, Ederer F, Geisser MS, Mongin SJ, et al. The effect of fecal occult-blood screening on the incidence of colorectal cancer. N Engl J Med. 2000; 343:1603-7.
  34. Kishida T, Sato J, Fujimori S, Minami S, Yamakado S, Tamagawa Y, et al. Clinical significance of serum iron and ferritin in patients with colorectal cancer. J Gastroenterol. 1994; 29:19-23.
  35. Gan X, Chen Z-Y, Li Z-H, Zhou J-M, Sun Y, Cai D, et al. Conventional Laboratory Blood Indicators Are Valuable for Early Diagnosis of Colorectal Cancer. 2021.
  36. Schneider C, Bodmer M, Jick SS, Meier CR. Colorectal cancer and markers of anemia. Eur J Cancer Prev. 2018; 27:530-8.
  37. Elinav E, Nowarski R, Thaiss CA, Hu B, Jin C, Flavell RA. Inflammation-induced cancer: crosstalk between tumours, immune cells and microorganisms. Nat Rev Cancer. 2013; 13:759-71.
  38. Itzkowitz SH, Yio X. Inflammation and cancer IV. Colorectal cancer in inflammatory bowel disease: the role of inflammation. Am J Physiol Gastrointest Liver Physiol. 2004; 287:G7-17.
  39. Kwon HC, Kim SH, Oh SY, Lee S, Lee JH, Choi HJ, et al. Clinical significance of preoperative neutrophil-lymphocyte versus platelet-lymphocyte ratio in patients with operable colorectal cancer. Biomarkers. 2012; 17:216-22.
  40. Li MX, Liu XM, Zhang XF, Zhang JF, Wang WL, Zhu Y, et al. Prognostic role of neutrophil-to-lymphocyte ratio in colorectal cancer: a systematic review and meta-analysis. Int J Cancer. 2014; 134:2403-13.
  41. Kilincalp S, Coban S, Akinci H, Hamamci M, Karaahmet F, Coskun Y, et al. Neutrophil/lymphocyte ratio, platelet/lymphocyte ratio, and mean platelet volume as potential biomarkers for early detection and monitoring of colorectal adenocarcinoma. Eur J Cancer Prev. 2015; 24:328-33.
  42. Peng HX, Yang L, He BS, Pan YQ, Ying HQ, Sun HL, et al. Combination of preoperative NLR, PLR and CEA could increase the diagnostic efficacy for I-III stage CRC. J Clin Lab Anal. 2017; 31.
  43. Stojkovic Lalosevic M, Pavlovic Markovic A, Stankovic S, Stojkovic M, Dimitrijevic I, Radoman Vujacic I, et al. Combined Diagnostic Efficacy of Neutrophil-to-Lymphocyte Ratio (NLR), Platelet-to-Lymphocyte Ratio (PLR), and Mean Platelet Volume (MPV) as Biomarkers of Systemic Inflammation in the Diagnosis of Colorectal Cancer. Dis Markers. 2019; 2019:6036979.