Artificial intelligence models, pretrained on vast datasets, significantly outperformed a standard baseline model in identifying nonmelanoma skin cancers (NMSC) from digital images of tissue samples, according to research presented at the 2025 American Association for Cancer Research (AACR) Annual Meeting in Chicago, IL.
These pretrained models could extend machine learning-based cancer diagnosis to resource-limited settings, researcher Steven Song, an MD/PhD candidate in the Medical Scientist Training Program at Pritzker School of Medicine and the Department of Computer Science at the University of Chicago in Chicago, IL, explains in a news release.
“In resource-limited settings, however, the lack of expert pathologists limits the ability to provide timely and widespread review and diagnosis of NMSC,” Song says. “Artificial intelligence and machine learning have long promised to fill resource gaps, but the development and deployment of bespoke machine learning models require significant resources that may not be available in many places—namely computational experts, specialized computational hardware, and large amounts of curated data to train each model.”
Effective ‘Off-the-Shelf’ Tools
Song and colleagues reasoned that machine learning models that have been previously trained on vast amounts of data (often called “foundation models”) in resource-rich environments might be effective “off-the-shelf” tools to guide NMSC diagnosis. This could allow machine learning to be used in settings with limited access to large datasets or the specialized equipment or experts needed for developing models from scratch, Song notes.
In this study, the researchers tested the accuracy of three contemporary foundation models—PRISM, UNI, and Prov-GigaPath—in identifying NMSC from digital pathology images of suspected cancerous skin lesions. All three foundation models work by converting a high-resolution digital image of a tissue pathology slide into small image tiles, extracting meaningful features from the tiles, and analyzing these features to compute the probability that the tissue contains NMSC.
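The tile-then-aggregate workflow described above can be illustrated with a minimal Python sketch. It is not the implementation of PRISM, UNI, or Prov-GigaPath. The feature extractor here is a toy stand-in (mean and spread of pixel intensity) for a pretrained encoder, mean pooling is only one possible way to combine tile features, and the weights passed to the classifier are hypothetical placeholders.

```python
import math
from statistics import mean, pstdev


def tile_image(image, tile_size):
    """Split a 2D grid of pixel intensities into non-overlapping square tiles."""
    tiles = []
    for top in range(0, len(image) - tile_size + 1, tile_size):
        for left in range(0, len(image[0]) - tile_size + 1, tile_size):
            tiles.append([row[left:left + tile_size]
                          for row in image[top:top + tile_size]])
    return tiles


def extract_features(tile):
    """Toy stand-in for a pretrained encoder (assumption): intensity mean and spread."""
    pixels = [p for row in tile for p in row]
    return [mean(pixels), pstdev(pixels)]


def pool_features(tile_features):
    """Average the tile-level features into a single slide-level vector."""
    return [mean(column) for column in zip(*tile_features)]


def predict_probability(slide_features, weights, bias):
    """Logistic head that maps slide-level features to a probability."""
    score = bias + sum(w * x for w, x in zip(weights, slide_features))
    return 1.0 / (1.0 + math.exp(-score))


def slide_probability(image, tile_size, weights, bias):
    """Run the full sketch: tile, extract features, pool, classify."""
    tiles = tile_image(image, tile_size)
    features = [extract_features(t) for t in tiles]
    return predict_probability(pool_features(features), weights, bias)
```

In practice, the encoder would be a large pretrained network and the classifier would be fitted on labeled slides. The sketch only shows how the three stages connect.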
The models’ accuracy in diagnosing NMSC was evaluated on 2,130 tissue slide images representing 553 biopsy samples from Bangladeshi individuals enrolled in the Bangladesh Vitamin E and Selenium Trial. High levels of exposure to arsenic through contaminated drinking water increase the risk for NMSC in this population, providing a relevant real-world context for the study, Song explains. Of the 2,130 total images, 706 were of normal tissue, and 1,424 were of confirmed NMSC (638 cases of Bowen’s disease, 575 cases of basal cell carcinoma, and 211 cases of invasive squamous cell carcinoma).
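As a quick consistency check, the image counts reported above can be tallied in a few lines of Python. The sketch also computes a majority-class baseline, meaning the accuracy obtained by labeling every image as NMSC. That baseline is a reference point added here for illustration, not a figure from the study, and it assumes accuracy is measured over all 2,130 images, which the source does not state.

```python
# Image counts as reported in the article.
normal = 706
nmsc_by_type = {
    "Bowen's disease": 638,
    "basal cell carcinoma": 575,
    "invasive squamous cell carcinoma": 211,
}

nmsc_total = sum(nmsc_by_type.values())
total = normal + nmsc_total

# Accuracy of always predicting the larger class (illustrative reference only).
majority_baseline = max(normal, nmsc_total) / total

print(f"NMSC images: {nmsc_total}, all images: {total}")
print(f"Majority-class baseline: {majority_baseline:.1%}")
```

The subtype counts sum to the reported 1,424 NMSC images, and the two classes sum to 2,130. Under the stated assumption, always predicting NMSC would be correct for about 66.9% of images.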
Accuracy of the three foundation models was compared with that of ResNet18, an established but older architecture for image recognition. “ResNet architectures have been used as a starting point for training vision models for nearly a decade and serve as a meaningful baseline comparison for evaluating the performance gains of newer pretrained foundation models,” Song notes.
Outperformed ResNet18
Each of the three newer foundation models significantly outperformed ResNet18, correctly distinguishing NMSC from normal tissue in 92.5% (PRISM), 91.3% (UNI), and 90.8% (Prov-GigaPath) of cases, compared with 80.5% for ResNet18, a gain of 10.3 to 12.0 percentage points.
To make the foundation models more amenable to use in resource-limited settings, Song and colleagues developed and tested simplified versions of each model. The simplified models, which require less extensive analysis of pathology image data, still significantly outperformed ResNet18, with accuracies of 88.2% (PRISM), 86.5% (UNI), and 85.5% (Prov-GigaPath), demonstrating robustness even with reduced complexity, according to the researchers.
In addition, Song and colleagues developed and applied an annotation framework designed to highlight cancerous regions on tissue slides identified by these foundation models. The framework does not require training on large datasets and instead leverages example images of cancerous tissue from a small number of biopsies. It then compares pathology image tiles against these examples to identify and annotate cancerous regions. Song explains that annotation could help guide the attention of a user towards regions of interest on each slide.
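The idea of comparing tiles against a handful of examples can be sketched in Python as follows. This is an illustration under stated assumptions, not the researchers' framework. It assumes each tile and each example is already represented as a feature vector, uses cosine similarity as the comparison (the article does not name the measure), and flags tiles with a hypothetical threshold of 0.9.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def annotate_tiles(tile_embeddings, example_embeddings, threshold=0.9):
    """Return (tile index, best similarity) for tiles that resemble any example.

    The threshold is a hypothetical value chosen for illustration.
    """
    if not example_embeddings:
        return []
    flagged = []
    for index, embedding in enumerate(tile_embeddings):
        best = max(cosine_similarity(embedding, ex) for ex in example_embeddings)
        if best >= threshold:
            flagged.append((index, best))
    return flagged


# Hypothetical two-dimensional embeddings, for demonstration only.
examples = [[1.0, 0.0]]
tiles = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
flagged = annotate_tiles(tiles, examples)
```

Only the first tile points in the same direction as the example, so it is the only one flagged. A viewer could then highlight the flagged tiles on the slide to direct a user's attention.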
Machine Learning Has Potential to Aid Diagnosis
“Overall, our results demonstrate that pretrained machine learning models have the potential to aid diagnosis of NMSC, which might be particularly beneficial in resource-limited settings,” Song says. “Our study also provides insights that may advance the development and adaptation of foundation models for various clinical applications.”
A limitation of the study is that the models were evaluated on a single cohort of patients from Bangladesh, which may limit the generalizability of the findings to other populations. Another limitation is that, while the study approached its analyses from the perspective of resource-limited settings, it did not examine the practical details of deploying the pretrained machine learning models in such settings.
“While our study suggests foundation models as resource-efficient tools for aiding NMSC diagnosis, we acknowledge that we are still far from having a direct impact on patient care and that further work is needed to address practical considerations, such as the availability of digital pathology infrastructure, internet connectivity, integration into clinical workflows, and user training,” Song says.