Abstract
Background
Recent advances in immunotherapy, particularly pembrolizumab, have shown promising results in treating metastatic colorectal cancer (CRC) and triple-negative breast cancer (TNBC). Accurate detection of predictive biomarkers, such as microsatellite instability (MSI)/mismatch repair deficiency (MMRd) and programmed death-ligand 1 (PD-L1), is key to the efficacy of these treatments. Traditional methods such as immunohistochemistry (IHC) and next-generation sequencing are effective but labor-intensive and require subjective interpretation.
Methods
We developed a dual-modality transformer-based model to predict MSI/MMRd and PD-L1 status from hematoxylin and eosin (H&E)- and IHC-stained whole-slide images. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). Time-on-treatment (TOT) and overall survival (OS) were derived from insurance claims and analyzed using the Kaplan–Meier method. Hazard ratios (HRs) were estimated using the Cox proportional hazards model.
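As a rough illustration of two of the statistics named above, the sketch below computes an AUROC from pairwise score comparisons and a Kaplan–Meier survival curve from right-censored durations. It is not the study's pipeline: the function names `auroc` and `kaplan_meier` and all input values are hypothetical toy choices, and the Cox proportional hazards fit is omitted.

```python
from collections import Counter


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def kaplan_meier(durations, events):
    """Kaplan-Meier estimate as a list of (time, survival_probability),
    one entry per distinct time with an observed event.
    events[i] is 1 if the event was observed, 0 if the case was censored."""
    observed = Counter(t for t, e in zip(durations, events) if e == 1)
    leaving = Counter(durations)  # everyone leaves the risk set at their time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy inputs (hypothetical, not study data).
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
toy_curve = kaplan_meier([5, 8, 8, 12], [1, 1, 0, 1])
```

The pairwise AUROC is quadratic in the number of cases, which is acceptable for a sketch; a rank-based formulation or an established statistics library would be the usual choice at cohort scale.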
Results
Our AI framework achieved clinical-grade performance, with AUROCs exceeding 0.97 for MSI/MMRd prediction in CRC and 0.96 for PD-L1 prediction in breast cancer. Patients with biomarker-positive model predictions demonstrated prolonged TOT and OS when treated with pembrolizumab. For breast cancer patients, the model’s predictions were superior to PD-L1 IHC in identifying patients with improved outcomes on pembrolizumab, suggesting that existing PD-L1 status thresholds may warrant reevaluation.
Conclusions
This study supports the integration of advanced AI tools into clinical pathology to enhance the precision and efficiency of cancer biomarker evaluation, and it offers a customizable framework for varied clinical scenarios. By integrating features from both staining methods, our model improves predictive accuracy and exhibits superior prognostic precision compared with current biomarker assessments.

