EACR25-0321
Polymerase proofreading deficiency (PPD), caused by mutations in the POLE and POLD1 genes, leads to extremely high tumor mutation burdens and distinct mutational signatures. While PPD tumors show promising responses to immunotherapy, identifying the specific variants responsible remains challenging. Current variant classification efforts are incomplete: many variants are still classified as being of unknown significance, including well-known drivers such as POLE P286R and V411L.
We analyzed 235,161 sequenced tumors from three major databases (GENIE, TCGA, and CPC) using an iterative SVM approach that incorporated both mutation burden and signature analysis. For structural characterization, we leveraged AlphaFold2 to generate 160 models for each variant, developing a novel metric based on pLDDT scores to assess structural impact. We further employed AlphaFold3 to analyze magnesium ion binding in specific variants. Clinical correlations were examined across different driver clusters.
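The abstract does not define the pLDDT-based metric, so the following is only an illustrative sketch of one way such a score could be built. It assumes (hypothetically) that each model is a list of per-residue pLDDT values, and that structural impact is summarized as the mean absolute shift in per-residue pLDDT between variant and wild-type model ensembles. The function names and the toy numbers are invented for illustration and are not the authors' method.

```python
from statistics import mean


def mean_plddt_per_residue(models):
    """Average the pLDDT at each residue position across a set of models.

    `models` is a list of equal-length lists, one per predicted model,
    holding per-residue pLDDT scores (0-100).
    """
    return [mean(position_scores) for position_scores in zip(*models)]


def plddt_impact(wild_type_models, variant_models):
    """Hypothetical impact score: mean absolute per-residue pLDDT shift.

    Returns (score, deltas), where deltas[i] is the variant-minus-wild-type
    change in ensemble-averaged pLDDT at residue i.
    """
    wt = mean_plddt_per_residue(wild_type_models)
    var = mean_plddt_per_residue(variant_models)
    if len(wt) != len(var):
        raise ValueError("wild-type and variant models differ in length")
    deltas = [v - w for w, v in zip(wt, var)]
    score = mean(abs(d) for d in deltas)
    return score, deltas


# Toy data: two models per ensemble, three residues each (not real values).
wild_type = [[90.0, 85.0, 80.0], [92.0, 87.0, 78.0]]
variant = [[88.0, 70.0, 79.0], [90.0, 72.0, 81.0]]

score, deltas = plddt_impact(wild_type, variant)
print(score, deltas)
```

In practice the ensemble would hold the 160 models per variant mentioned above, and the per-residue deltas, rather than the single summary score, would be what localizes the affected structural region.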
We identified 567 PPD tumors across 13 cancer types, with 515 harboring driver variants. Our analysis revealed novel POLE drivers outside the exonuclease domain, challenging traditional assumptions about variant pathogenicity. Structural analysis achieved an AUC of 0.85 in distinguishing drivers from passengers, outperforming existing tools such as AlphaMissense (AUC=0.73) and MutPred2 (AUC=0.63). We identified three distinct structural regions affected by PPD variants and classified drivers into six clusters based on their structural impact. These clusters showed significant associations with MMR status, mutation burden, and signature profiles. Clinical analysis revealed age-related correlations specific to male colorectal cancer patients and endometrial cancer cases. Our integrated approach provides novel insights into PPD variant classification and the functional impact of these variants. The characterization of distinct structural clusters suggests multiple mechanisms through which PPD variants affect polymerase function. The differential associations of clusters with clinical features and molecular characteristics indicate potential prognostic and therapeutic implications. Saturation analysis suggests that while additional drivers may be discovered with larger datasets, the current list captures most clinically relevant variants.
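For readers unfamiliar with the AUC values used above to compare the structural metric with AlphaMissense and MutPred2, the sketch below shows how an AUC can be computed from any per-variant score and a driver/passenger label. It uses the standard pairwise (Mann-Whitney) formulation: the fraction of driver-passenger pairs in which the driver scores higher, with ties counted as half. The scores and labels are invented toy values, not data from this study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` uses 1 for drivers and 0 for passengers; a higher score is
    taken to mean "more driver-like".
    """
    drivers = [s for s, y in zip(scores, labels) if y == 1]
    passengers = [s for s, y in zip(scores, labels) if y == 0]
    if not drivers or not passengers:
        raise ValueError("need at least one driver and one passenger")

    wins = 0.0
    for d in drivers:
        for p in passengers:
            if d > p:
                wins += 1.0   # driver correctly ranked above passenger
            elif d == p:
                wins += 0.5   # ties count as half
    return wins / (len(drivers) * len(passengers))


# Toy example: three drivers and three passengers (not real data).
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which 0.85, 0.73, and 0.63 should be read.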
This comprehensive analysis of PPD drivers combines genomic evidence with structural insights to advance our understanding of polymerase proofreading deficiency. Our findings provide a framework for variant classification and suggest that different PPD drivers may operate through distinct mechanisms, potentially influencing clinical outcomes. The established relationships between structural impacts and molecular features offer new perspectives for personalized therapeutic approaches in PPD tumors.