Iranian Journal of Medical Sciences

Document Type : Original Article(s)

Authors

1 Department of Physiology and Medical Physics, School of Medicine, Baqiyatallah University of Medical Sciences, Tehran, Iran

2 Department of Surgery, School of Medicine, Baqiyatallah University of Medical Sciences, Tehran, Iran

3 Health Research Center, Life Style Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran

4 Student Research Committee, Baqiyatallah University of Medical Sciences, Tehran, Iran

5 Radiation Sciences Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran

Abstract

Background: Pancreatic adenocarcinoma is one of the most aggressive and lethal cancers, with a poor prognosis primarily due to late-stage diagnosis. Improving the accuracy of pancreatic cancer diagnosis is crucial for enhancing survival outcomes, yet the sensitivity of conventional diagnostic methods remains a significant challenge. This study aims to evaluate the effectiveness of radiomics features extracted from Computed Tomography (CT) imaging, combined with machine learning models, for the detection of pancreatic adenocarcinoma.
Methods: A retrospective dataset from Baqiyatallah Hospital, Tehran, Iran (2024) of 100 participants (50 with pancreatic adenocarcinoma (primarily stages II-III) and 50 healthy controls) was used. CT images were acquired with a three-phase protocol, and radiomics features were extracted using 3D Slicer software. Three classifiers—Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF)—were employed, with feature selection methods including Recursive Feature Elimination (RFE), Mutual Information (MI), and Least Absolute Shrinkage and Selection Operator (LASSO). Model performance was assessed using accuracy, precision, sensitivity, F1 score, and area under the curve (AUC).
Results: The SVM classifier with LASSO feature selection achieved the highest performance, with an accuracy of 0.83 and an AUC of 0.89. LR and RF also demonstrated strong results, with LASSO providing the best feature selection for both classifiers. SHAP analysis revealed that textural features such as gray-level-non-uniformity and run-length-non-uniformity were the most important drivers for distinguishing pancreatic cancer from normal tissue. 
Conclusion: Radiomics-based machine learning models show promise for improving the diagnosis of pancreatic adenocarcinoma. The combination of LASSO and powerful classifiers such as SVM, LR, and RF offers a robust framework for non-invasive, accurate diagnostic tools.

Highlights

Amin Talebi (Google Scholar)
Zeinab Shankayi (Google Scholar)

Keywords

What’s Known

Radiomics analysis of CT images has shown promise in detecting pancreatic cancer by capturing tumor heterogeneity beyond human perception. Machine learning models have been used to classify pancreatic lesions, but most of them rely on limited sample sizes or handcrafted imaging features.

What’s New

This study integrates radiomics features with machine learning algorithms on a well-balanced dataset of pancreatic adenocarcinoma and normal cases. It demonstrates a reproducible pipeline from segmentation to classification, providing a robust framework for noninvasive diagnosis of pancreatic cancer based on CT imaging.

Introduction

Pancreatic adenocarcinoma is among the most aggressive malignancies, characterized by high mortality rates and limited treatment options. Globally, it ranks as one of the leading causes of cancer-related deaths, with a 5-year survival rate of less than 10%. 1 This dismal prognosis is largely attributed to the asymptomatic nature of the disease in its early stages, often leading to late-stage diagnosis. 2

Conventional diagnostic methods for pancreatic cancer include imaging techniques such as computed tomography (CT), magnetic resonance imaging (MRI), and endoscopic ultrasonography. Blood-based biomarkers such as Carbohydrate Antigen (CA) 19-9 are also commonly used. However, these approaches are often limited by low sensitivity and specificity. 3 Imaging-based detection heavily relies on the radiologists’ expertise, leading to variability in interpretation. Biomarkers, while helpful, are often elevated only in advanced stages, further emphasizing the need for more robust and precise diagnostic tools. 4

Radiomics has emerged as a transformative approach in medical imaging, enabling the extraction of quantitative features from imaging modalities such as CT and MRI. 5 These features capture tumor heterogeneity, shape, texture, and intensity, which are often imperceptible to the human eye. By leveraging these data, radiomics facilitates a deeper understanding of disease characteristics. 6 When combined with machine learning models, these features have demonstrated significant potential in improving diagnostic accuracy, prognostication, and treatment response prediction across a variety of cancers. 7 In pancreatic cancer, radiomics may overcome the limitations of conventional methods by providing objective, reproducible, and non-invasive insights into tumor biology. 8

While the potential of radiomics in pancreatic cancer is established, a systematic comparison of analytical pipelines is lacking. Therefore, this study aims to develop and validate an interpretable machine learning framework for the CT-based diagnosis of pancreatic adenocarcinoma. To achieve this, we conducted a rigorous head-to-head comparison of three feature selection methods and three machine learning classifiers, and integrated SHAP analysis to elucidate the key radiomic features, thereby identifying the most effective and clinically interpretable strategy.

Materials and Methods

Patients’ Population and Inclusion Criteria

This study was conducted in accordance with the ethical principles outlined in the Declaration of Helsinki. The study protocol was approved by the institutional review board (IRB) of Baqiyatallah University of Medical Science (BMSU, Approval Code: IR.BMSU.BAQ.REC.1403.058). The requirement for written informed consent was waived by the IRB due to the retrospective analysis of anonymized pre-existing data. This study, conducted at Baqiyatallah Hospital, Tehran, Iran in 2024, included a total of 100 participants, consisting of 50 individuals diagnosed with pancreatic adenocarcinoma and 50 with normal pancreas. Based on clinical and radiological assessment (American Joint Committee on Cancer 8th edition staging system), 9 the cancer patient cohort was primarily composed of stage II (locally advanced) and stage III (locally advanced/unresectable) diseases, reflecting the typical clinical presentation where CT imaging plays a central diagnostic role. Participants with a history of other malignancies, prior cancer treatments (such as chemotherapy or radiotherapy), or poor-quality imaging were excluded from the study to ensure reliable analysis.

Image Acquisition

All participants underwent CT imaging using a GE Optima 16-slice CT scanner (GE Healthcare, Chicago, IL, USA). A three-phase contrast-enhanced protocol was employed, involving the administration of iodine-based contrast media at a standard dose of 1 mL per kg of body weight followed by arterial, portal venous, and delayed phase imaging. The imaging parameters were set at 120 kVp and 200 mA. The raw data from the helical acquisition was reconstructed with a slice thickness of 2.5 mm and a 2.5 mm interval to provide the high-resolution images required for precise segmentation and radiomics feature extraction. All CT images were acquired from Digital Imaging and Communications in Medicine (DICOM) format from the Picture Archiving and Communication System (PACS) for further analysis.

Image Segmentation and Features Extraction

Segmentation of the pancreas was performed manually on portal venous phase CT images using 3D Slicer software (version 5.6.2; https://www.slicer.org) and its Segment Editor module. The segmentation was conducted by an experienced radiologist with over 5 years of expertise in abdominal imaging.

The segmentation protocol was as follows. The entire pancreas, including the head, uncinate process, neck, body, and tail, was delineated slice-by-slice on axial images. The segmentation boundaries were defined anatomically. The duodenal border of the pancreatic head was carefully distinguished from the duodenal wall, and the splenic border of the tail was differentiated from the splenic hilum. The main pancreatic duct was included within the volume of interest (VOI) if visible. Peripancreatic vessels (such as the superior mesenteric artery and vein and splenic artery and vein) and peripancreatic fat were meticulously excluded from the segmentation. For patients in the pancreatic adenocarcinoma group, the tumor volume was included within the pancreatic segmentation, aiming to capture the entire tumor burden as well as the surrounding parenchymal tissue to reflect tumor heterogeneity.

The manual segmentation was performed using the “Paint” and “Erase” tools within the Segment Editor module, allowing for precise voxel-level adjustments. The interactive nature of these tools enabled the radiologist to leverage the multi-planar reconstruction (MPR) views to verify the segmentation accuracy in the coronal and sagittal planes. The exact parameters for the paint brush were adjusted interactively by the operator based on the specific anatomy and were not recorded as a fixed setting.

Following segmentation, radiomics features were extracted from the entire segmented pancreatic volume using the PyRadiomics module (version 3.0.1; https://pyradiomics.readthedocs.io) integrated within 3D Slicer. The extracted features included first-order statistical features, shape-based (3D) features, and second-order texture features (Gray Level Co-occurrence Matrix - GLCM, Gray Level Run Length Matrix - GLRLM, Gray Level Size Zone Matrix - GLSZM, Gray Level Dependence Matrix - GLDM, and Neighboring Gray Tone Difference Matrix - NGTDM).

Machine Learning Modeling and Feature Selection

For machine learning modeling, three classifiers were employed to distinguish between pancreatic adenocarcinoma and normal cases:

  • 1. Random Forest (RF): An ensemble method that combines multiple decision trees, where each tree votes on the classification, and the final prediction is based on the majority vote, making it robust to overfitting.
  • 2. Support Vector Machine (SVM): A classifier that finds the optimal hyperplane to maximally separate the cancerous and normal cases in a high-dimensional space.
  • 3. Logistic Regression (LR): A linear model that estimates the probability of a case being cancerous by fitting a linear decision
  • 4. Boundary to the feature space.

The selection of RF, SVM, and LR classifiers was based on their established performance, interpretability, and widespread use in radiomics research. In this study, which aimed to provide a fair and baseline comparison of classifier and feature-selector combinations, we employed the default hyperparameters for all models as implemented in the scikit-learn library (version 1.2 https://scikit-learn.org). The classifiers were used with their default scikit-learn parameters, which included 100 decision trees and the Gini impurity criterion for Random Forest; a linear kernel with default regularization (C=1.0) for SVM; and L2 regularization (penalty=’l2’) and the ‘lbfgs’ solver for Logistic Regression.

All other parameters were kept at their library defaults.

To identify the most predictive radiomics features and mitigate the risk of overfitting, we employed three distinct feature selection techniques, each based on a different statistical principle. All feature selection was performed on the standardized feature matrix. The number of features to select (k=5) was pre-defined for all methods to allow for direct comparison. First, we applied Mutual Information (MI), a filter method that evaluates the non-linear statistical dependency between each feature and the diagnostic outcome.

Second, we utilized the Least Absolute Shrinkage and Selection Operator (LASSO), an embedded method that integrates feature selection within the model training process itself.

Finally, we implemented Recursive Feature Elimination (RFE), a wrapper method that adopts an iterative, model-driven approach, using an RF classifier as its core estimator.

Performance Evaluation

The performance of each combination of classifier and feature selection method was evaluated in three groups for each classifier and by using several metrics: accuracy, precision, sensitivity (recall), F1 score, positive predictive value (PPV), negative predictive value (NPV), and area under the curve (AUC). A stratified 5-fold cross-validation approach was applied for model evaluation, ensuring that the proportion of cancerous and normal cases was maintained in each fold. These metrics were calculated to provide a comprehensive assessment of the diagnostic performance of the models and to determine the most effective combination for detecting pancreatic adenocarcinoma. To interpret the predictions of our machine learning models and understand the contribution of individual radiomic features, we utilized SHapley Additive exPlanations (SHAP). The SHAP framework connects optimal credit allocation with local explanations using the classic Shapley values from game theory. In accordance with the model-specific recommendations from the SHAP library, we employed the following explainers:

Tree SHAP, a fast, exact algorithm for computing SHAP values for tree-based models.

Kernel SHAP, a model-agnostic method that approximates SHAP values by using a specially weighted linear regression on a background dataset, was used for the SMV and LR models. For this study, we used 100 randomly selected samples from the training set as the background dataset.

SHAP summary plots were then generated for each classifier-feature selection combination to visualize the impact of the top features on the model output.

Statistical Analysis

Descriptive statistics were used to summarize the demographic and clinical characteristics of the study population. The mean age and body mass index (BMI) were compared between the cancerous and normal groups using the independent samples t test, given the continuous nature of these variables. For categorical variables, such as the presence of diabetes and a family history of pancreatic cancer, comparisons between groups were performed using the Chi square test. P value <0.05 was considered statistically significant for all analyses. All statistical analyses and metric calculations were conducted using Python version 3.14 (Python Software Foundation, https://www.python.org) 10 in the Spyder IDE (version 5.0; https://www.spyder-ide.org).

Results

The baseline characteristics of the pancreatic adenocarcinoma and control groups are summarized in table 1. The two groups were well-matched, with no statistically significant differences (P>0.05) in age, gender, body mass index (BMI), diabetes prevalence, smoking history, or family history of cancer. This demonstrates the cohorts were comparable, allowing for a robust analysis of the radiomics models.

Characteristic Pancreatic Adenocarcinoma (n=50) Normal (n=50) P value*
Age (Years) 49±8.0 52±10.0 0.400
Sex Male 28 26 0.360
Female 22 24
BMI (Kg/m2) 26.2±4.1 25.3±3.2 0.790
Diabetes (yes) 10 6 0.400
Smoking history 8 11 0.700
Family history of cancer 6 5 >0.999
*Data are presented as mean±standard deviation or number (%); Continuous variables were compared using the independent samples t test; Categorical variables were compared using the Chi square test; P<0.05 was considered statistically significant.
Table 1.Comparison of baseline characteristics between pancreatic adenocarcinoma patients and healthy controls (Normal)

Figure 1 shows the radiomics workflow and describes the process of acquiring medical images, segmenting the pancreas, extracting quantitative radiomics features, and using these features to train and evaluate machine learning models for the diagnosis of pancreatic cancer.

Figure 1. Radiomics workflow for pancreatic cancer diagnosis is shown. A) Abdominal CT images were acquired, and the pancreas was segmented to define the region of interest. B) A large set of quantitative radiomic features were extracted and the most predictive ones were selected. C) The selected features were then used to build a machine learning model, with performance validated by ROC curves. Finally, the model’s predictions were interpreted using SHAP plots to identify the most impactful features.

Feature Selection Results

The three feature selection methods identified distinct but partially overlapping sets of radiomic features, reflecting their different selection criteria. The specific features chosen by each method are presented in table 2.

Feature selection technique Selected features
MI DNU, GLNU, GLNU.1, LAE, Coarseness
LASSO TE, MP, RLNU, GLNU.2, Coarseness
RFE DNU, GLNU, GLNU.1, RLNU, Coarseness
MI: Mutual Information; LASSO: Least Absolute Shrinkage and Selection Operator; RFE: Recursive Feature Elimination; DNU: Dependence Non-Uniformity; GLNU: Gray Level Non-Uniformity; LAE: Large Area Emphasis; TE: Total Energy, MP: Maximum Probability; RLNU: Run Length Non-Uniformity
Table 2.Features selected by each feature selection method

The feature “coarseness” was the only common feature selected by all three methods, suggesting its fundamental importance in characterizing pancreatic adenocarcinoma in our cohort.

Diagnostic Performance of the Support Vector Machine Classifier

Table 3 presents the results of the SVM classifier, evaluated with three different feature selection techniques: RFE, MI, and LASSO.

Feature selection technique Accuracy Precision Sensitivity F1 score Negative predictive value Positive predictive value
RFE 0.83±0.04 0.80±0.01 0.85±0.13 0.82±0.11 0.80±0.05 0.86±v0.01
MI 0.8±0.05 0.75±0.01 0.87± 0.16 0.80±0.13 0.75±0.06 0.87±0.01
LASSO 0.83±0.03 0.79±0.01 0.87±0.10 0.83±0.08 0.79±0.04 0.88±0.01
SVM: Support Vector Machine; RFE: Recursive Feature Elimination; MI: Mutual Information; LASSO: Least Absolute Shrinkage and Selection Operator
Table 3.Performance metrics of the SVM classifier using different feature selection techniques for pancreatic cancer diagnosis

The table shows the performance metrics, including accuracy, precision, sensitivity, F1-score, negative predictive value, and positive predictive value, for each of the feature selection methods.

The SVM classifier demonstrated robust and consistent performance across all feature selection methods. Notably, both LASSO and RFE yielded the highest accuracy (0.83), while MI provided the highest sensitivity (0.87). The LASSO technique achieved the best balance between precision and recall, as reflected by the highest F1-score (0.83) and PPV/NPV values, suggesting it was the most effective feature selector for the SVM model. Figure 2 displays the receiver operating characteristic (ROC) curves demonstrating the diagnostic performance of the SVM classifier when leveraging the three feature selection techniques -RFE, MI, and LASSO- for pancreatic cancer detection.

Figure 2. Receiver operating characteristic (ROC) curves compare the diagnostic performance of the SVM classifier using recursive feature elimination (RFE), mutual information (MI), and LASSO feature selection techniques for pancreatic cancer detection.

The SVM classifier achieved strong overall performance (AUC=0.87), with the LASSO feature selection method yielding the highest discriminative ability (AUC=0.89) as shown in figure 2.

Additionally, figure 3 presents the SHAP summary plot for the SVM classifier, illustrating the features with the greatest contribution to its predictive performance.

Figure 3. SHAP summary plots illustrates the most important radiomics features for the Support Vector Machine (SVM) classifier with different feature selection methods. Each point represents a patient. The feature’s impact on the model output is shown on the x-axis (SHAP value), where values >0 increase the prediction probability for cancer. Color indicates the feature value from low (blue) to high (red). A) SHAP summary plot for RFE+SVM, B) SHAP summary plot for mutual information+SVM, C) SHAP summary plot for LASSO+SVM

Analysis of the SHAP summary plots (figure 3) revealed that textural heterogeneity features, particularly variants of gray-level non-uniformity and run-length non-uniformity, were consistently among the most influential predictors across all SVM models. These features had high positive SHAP values, indicating that increased heterogeneity was strongly associated with a prediction of pancreatic cancer. While the top features for each model varied slightly depending on the feature selection technique, this core group of non-uniformity metrics reliably drove the model’s decisions.

Diagnostic Performance of the Logistic Regression Classifier

In the second group, pancreatic cancer diagnosis was performed using a LR classifier, in conjunction with three feature selection techniques: RFE, MI, and LASSO.

Table 4 presents the performance metrics for this LR classification group. Additionally, figure 4 displays the receiver operating characteristic (ROC) curves comparing the diagnostic performance of the LR classifier when applying the different feature selection methods.

Feature selection technique Accuracy Precision Sensitivity F1 score Negative predictive value Positive predictive value
RFE 0.79±0.05 0.76±0.06 0.81±0.13 0.78±0.01 0.76±0.05 0.82±0.06
MI 0.76±0.05 0.74±0.07 0.74±0.15 0.74±0.10 0.74±0.07 0.77±0.07
LASSO 0.79±0.06 0.77±0.08 0.79±0.18 0.78±0.12 0.77±0.07 0.81±0.08
LR: Logistic regression; RFE: Recursive feature elimination; MI: Mutual information; LASSO: Least absolute shrinkage and selection operator
Table 4.Performance metrics of the LR classifier using different feature selection techniques for pancreatic cancer diagnosis

Figure 4. Receiver operating characteristic (ROC) curves compare the diagnostic performance of the LR classifier using recursive feature elimination (RFE), mutual information (MI), and LASSO feature selection techniques for pancreatic cancer detection.

The LR classifier showed solid performance, with RFE and LASSO emerging as the superior feature selection methods, both achieving an accuracy of 0.79. Performance declined notably with MI, which resulted in the lowest scores across all metrics, including accuracy (0.76) and sensitivity (0.74). RFE provided the highest sensitivity (0.81) for the LR model, making it the best choice for minimizing false negatives. Additionally, figure 4 displays the receiver operating characteristic (ROC) curves comparing the diagnostic performance of the LR classifier when applying the different feature selection methods.

The LR classifier also demonstrated high performance (AUC=0.88), achieving its best result with LASSO feature selection (AUC=0.91), as shown in figure 4.

Furthermore, figure 5 presents the SHAP plot of this group.

The SHAP analysis for the LR models (figure 5) reinforced the importance of texture-based features. Gray-level-non-uniformity and coarseness were consistently top contributors across all feature selection methods. The specific top features varied, with RFE and MI highlighting run-length and dependence non-uniformity, while LASSO selected features representing intensity and energy (maximum-probability, total-energy). In all cases, higher values of these features increased the model’s prediction of cancer.

Figure 5. SHAP summary plots illustrate the most important radiomics features for the logistic regression (LR) classifier with different feature selection methods. Each point represents a patient. The feature’s impact on the model output is shown on the x-axis (SHAP value), where values >0 increase the prediction probability for cancer. Color indicates the feature value from low (blue) to high (red). A) SHAP summary plot for RFE+LR, B) SHAP summary plot for mutual information+LR, C) SHAP summary plot for LASSO+LR

Diagnostic Performance of the Random Forest Classifier

The final group employed a RF classifier, in combination with three feature selection methods: RFE, MI, and LASSO.

Table 5 summarizes the performance metrics for this RF classification group.

Feature selection technique Accuracy Precision Sensitivity F1 score Negative predictive value Positive predictive value
RFE 0.95±0.06 0.94±0.08 0.96±0.18 0.95±0.12 0.94±0.07 0.96±0.08
MI 0.96±0.05 0.94±0.08 0.98±0.13 0.96±0.10 0.94±0.05 0.98±0.08
LASSO 0.88±0.05 0.86±0.07 0.89±0.15 0.88±0.10 0.86±0.05 0.90±0.07
RF: Random forest; RFE: Recursive feature elimination; MI: Mutual information; LASSO: Least absolute shrinkage and selection operator
Table 5.Performance metrics of the RF classifier using different feature selection techniques for pancreatic cancer diagnosis

The RF classifier demonstrated exceptional performance, substantially outperforming all other models (table 4). MI feature selection yielded the best results, achieving peak accuracy (0.96) and remarkable sensitivity (0.98), indicating a near-perfect ability to identify true positive cases. RFE also performed excellently with comparable results. In stark contrast, LASSO was a significantly less effective feature selector for RF, with a notable drop in accuracy to 0.88. Furthermore, figure 6 presents the ROC curves, illustrating the diagnostic capabilities of the RF classifier when leveraging the different feature selection techniques.

Figure 6. Receiver operating characteristic (ROC) curves compare the diagnostic performance of the RF classifier using recursive feature elimination (RFE), mutual information (MI), and LASSO feature selection techniques for pancreatic cancer detection.

Additionally, figure 7 presents the SHAP plot of this group.

Figure 7. This figure shows SHAP summary plots illustrating the most important radiomics features for the random forest (RF) classifier with different feature selection methods. Each point represents a patient. The feature’s impact on the model output is shown on the x-axis (SHAP value), where values>0 increase the prediction probability for cancer. Color indicates the feature value from low (blue) to high (red). A) SHAP summary plot for RFE+RF, B) SHAP Summary Plot for Mutual Information+RF, C) SHAP summary plot for LASSO+RF

The RF classifier demonstrated consistent performance across feature selection methods, with LASSO yielding the highest AUC (0.87), slightly outperforming both RFE and MI (AUC=0.86 each), as shown in figure 6.

SHAP analysis for the RF classifier (figure 7) confirmed coarseness as a universally important feature across all models. The top predictors were predominantly texture-based, with RFE and MI emphasizing non-uniformity metrics (gray-level-non-uniformity, dependence-non-uniformity), while LASSO’s selection was more diverse, also incorporating intensity-based features such as total-energy and maximum-probability.

Discussion

The results of this study demonstrate the varying performance of three classifiers—SVM, LR, and RF—when combined with different feature selection techniques, namely RFE, MI, and LASSO. A critical step underpinning these results was the application of feature selection, which served a dual purpose; it enhanced model performance by reducing dimensionality and mitigating the risk of overfitting, and it significantly improved clinical interpretability by distilling a large set of 93 radiomics features down to a concise, biologically relevant subset. Consequently, each combination of classifier and feature selection method highlighted distinct strengths. This demonstrates the potential of this radiomics pipeline to serve as a diagnostic aid for clinically presenting pancreatic adenocarcinoma, which in our cohort primarily comprised stages II and III disease.

All three classifiers demonstrated strong potential for diagnosing pancreatic adenocarcinoma, with each exhibiting a distinct optimal feature selection partner. The SVM and LR models performed best with the LASSO algorithm, achieving their highest AUCs of 0.89 and 0.91, respectively. This aligns with existing literature that highlights LASSO’s efficacy in selecting discriminative radiomic features. 11 , 12 In contrast, the RF classifier achieved its peak performance with MI, yielding exceptional accuracy (0.96) and sensitivity (0.98). This finding is consistent with the known strength of ensemble methods such as RF in handling complex, high-dimensional data, as supported by prior radiomics research. 13 SHAP analysis revealed that textural heterogeneity was the dominant factor in model predictions across all classifiers. Features quantifying non-uniformity, such as gray-level-non-uniformity and run-length-non-uniformity, were consistently among the most influential, with higher values strongly increasing the probability of a cancer prediction. This aligns with the established understanding that pancreatic adenocarcinoma exhibits greater textural heterogeneity than normal tissue, a finding supported by previous radiomics studies. 14 , 15 The robustness of these texture features was underscored by their importance in both linear (LR) and non-linear (SVM, RF) models. While the specific top features varied slightly—for instance, RF also emphasized morphology-based features such as large-area-emphasis with certain selectors—the consensus on heterogeneity highlights its fundamental role in distinguishing pancreatic cancer.

The consistent primacy of texture-based features, particularly gray-level-non-uniformity and run-length-non-uniformity, across all models underscores that tumor heterogeneity is a fundamental imaging biomarker for pancreatic adenocarcinoma. This finding is robust, as it was consistently selected by different algorithms and is strongly supported by the existing radiomics literature. 16 - 18 The contribution of other features such as coarseness, dependence-non-uniformity, and total-energy (a measure of the overall magnitude of voxel intensities, potentially reflecting lesion density or cellularity) further suggests that a combination of heterogeneity, texture scale, and intensity distribution provides a comprehensive quantitative profile of the tumor. This validates our feature selection pipeline and confirms that these radiomic signatures capture critical aspects of tumor biology, offering a non-invasive means to enhance diagnostic precision. The interpretation of our findings should be considered in the context of the study’s design. Its single-center, retrospective nature and modest sample size (n=100) may limit the generalizability of our findings. Furthermore, despite being performed by an expert radiologist, manual segmentation introduces potential variability, 19 and the reliance on CT imaging alone does not capture the full spectrum of tumor biology offered by multi-modal approaches such as Positron Emission Tomography/Magnetic Resonance Imaging (PET/MRI). 20

Future research should focus on validating these results in larger, multi-center prospective cohorts. Subsequent studies would also benefit greatly from incorporating multi-parametric MRI or PET imaging to provide a more comprehensive tumor characterization, 20 and from exploring deep learning methods for automated feature extraction. Ultimately, integrating these radiomic profiles with genomic and molecular data could unlock powerful, multi-omics predictive models for personalized medicine in pancreatic cancer.

Conclusion

In conclusion, this research lays the groundwork for developing advanced, machine learning-based diagnostic tools that could significantly impact the diagnosis of pancreatic adenocarcinoma, potentially improving patient outcomes through earlier intervention and more personalized treatment strategies. Furthermore, this study utilized classifiers with their default hyperparameters to establish a baseline comparison. While this approach ensured a straightforward and reproducible analysis, it is recognized that a formal hyperparameter optimization process (e.g., via grid or random search) could potentially enhance model performance and will be an important focus of future work with larger datasets. It is important to note that our study focuses on diagnosing visible tumors and does not address the challenge of ‘early detection’ in a pre-symptomatic or screening population, which would require a fundamentally different study design with longitudinal data.

Acknowledgment

This research was supported by the Baqiyatallah University of Medical Sciences (Grant Code: F1R3P2S4). We confirm that this study is not derived from a thesis. We have not used any AI tools or technologies to prepare this article.

Authors’ Contribution

Z.S: Conceived and supervised the research, provided critical intellectual input during manuscript drafting and revision; A.T: Designed the methodology, acquired and curated the data, performed image segmentation and machine learning modeling, analyzed the results, and drafted the initial manuscript; J.A.M: Provided clinical expertise, performed and validated image segmentation, and contributed to data interpretation and manuscript revision; M.S: Contributed to the statistical and machine learning methodology and assisted in model validation and manuscript revision, T.C.M and A.R assisted in data acquisition, labeling, and preprocessing, and contributed to manuscript preparation. All authors critically revised the manuscript for important intellectual content. All authors have read and approved the final manuscript and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

AI Declaration

The authors declare that they have not used any AI tools or technologies to prepare this manuscript.

Conflict of Interest

None declared.

References

  1. McGuigan A, Kelly P, Turkington RC, Jones C, Coleman HG, McCain RS. Pancreatic cancer: A review of clinical diagnosis, epidemiology, treatment and outcomes. World J Gastroenterol. 2018; 24:4846-61. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  2. Rahib L, Smith BD, Aizenberg R, Rosenzweig AB, Fleshman JM, Matrisian LM. Projecting cancer incidence and deaths to 2030: the unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Res. 2014; 74:2913-21. DOI | PubMed
  3. Ye C, Sadula A, Ren S, Guo X, Yuan M, Yuan C, et al. The prognostic value of CA19-9 response after neoadjuvant therapy in patients with pancreatic cancer: a systematic review and pooled analysis. Cancer Chemother Pharmacol. 2020; 86:731-40. DOI | PubMed
  4. Kaur S, Baine MJ, Jain M, Sasson AR, Batra SK. Early diagnosis of pancreatic cancer: challenges and new developments. Biomark Med. 2012; 6:597-612. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  5. Lambin P, Rios-Velazquez E, Leijenaar R, Carvalho S, van Stiphout RG, Granton P, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer. 2012; 48:441-6. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  6. Aerts HJ, Velazquez ER, Leijenaar RT, Parmar C, Grossmann P, Carvalho S, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nat Commun. 2014; 5:4006. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  7. Anijdan S, Reiazi R, Tafti HF, Moslemi D, Moghadamnia A, Paydar R. Application of radiomics in radiotherapy: challenges and future prospects. Caspian J Pediatr. 2022; 8:127-34. DOI
  8. Preuss K, Thach N, Liang X, Baine M, Chen J, Zhang C, et al. Using Quantitative Imaging for Personalized Medicine in Pancreatic Cancer: A Review of Radiomics and Deep Learning Applications. Cancers (Basel). 2022; 14Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  9. Amin MB, Greene FL, Edge SB, Compton CC, Gershenwald JE, Brookland RK, et al. The Eighth Edition AJCC Cancer Staging Manual: Continuing to build a bridge from a population-based to a more “personalized” approach to cancer staging. CA Cancer J Clin. 2017; 67:93-9. DOI | PubMed
  10. Python Software Foundation. Python 3.14 documentation. Beaverton (OR): Python Software Foundation; 2024 [cited 2025]. Available from: https://docs.python.org/3.14/
  11. Mukherjee S, Patra A, Khasawneh H, Korfiatis P, Rajamohan N, Suman G, et al. Radiomics-based Machine-learning Models Can Detect Pancreatic Cancer on Prediagnostic Computed Tomography Scans at a Substantial Lead Time Before Clinical Diagnosis. Gastroenterology. 2022; 163:1435-46. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  12. Huang Y, Zhang H, Ding Q, Chen D, Zhang X, Weng S, et al. Comparison of multiple machine learning models for predicting prognosis of pancreatic ductal adenocarcinoma based on contrast-enhanced CT radiomics and clinical features. Front Oncol. 2024; 14:1419297. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  13. Toyama Y, Hotta M, Motoi F, Takanami K, Minamimoto R, Takase K. Prognostic value of FDG-PET radiomics with machine learning in pancreatic cancer. Sci Rep. 2020; 10:17024. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  14. Kim HS, Kim YJ, Kim KG, Park JS. Preoperative CT texture features predict prognosis after curative resection in pancreatic cancer. Sci Rep. 2019; 9:17389. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  15. Qiu JJ, Yin J, Qian W, Liu JH, Huang ZX, Yu HP, et al. A Novel Multiresolution-Statistical Texture Analysis Architecture: Radiomics-Aided Diagnosis of PDAC Based on Plain CT Images. IEEE Trans Med Imaging. 2021; 40:12-25. DOI | PubMed
  16. Chen PT, Chang D, Yen H, Liu KL, Huang SY, Roth H, et al. Radiomic Features at CT Can Distinguish Pancreatic Cancer from Noncancerous Pancreas. Radiol Imaging Cancer. 2021; 3:e210010. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  17. Guo C, Zhuge X, Wang Q, Xiao W, Wang Z, Wang Z, et al. The differentiation of pancreatic neuroendocrine carcinoma from pancreatic ductal adenocarcinoma: the values of CT imaging features and texture analysis. Cancer Imaging. 2018; 18:37. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  18. Nasief H, Zheng C, Schott D, Hall W, Tsai S, Erickson B, et al. A machine learning based delta-radiomics process for early prediction of treatment response of pancreatic cancer. NPJ Precis Oncol. 2019; 3:25. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  19. Zhang C, Peng J, Wang L, Wang Y, Chen W, Sun MW, et al. A deep learning-powered diagnostic model for acute pancreatitis. BMC Med Imaging. 2024; 24:154. Publisher Full Text | DOI | PubMed [ PMC Free Article ]
  20. Li J, Fu C, Zhao S, Pu Y, Yang F, Zeng S, et al. The progress of PET/MRI in clinical management of patients with pancreatic malignant lesions. Front Oncol. 2023; 13:920896. Publisher Full Text | DOI | PubMed [ PMC Free Article ]