IdentifiHR predicts homologous recombination deficiency in high-grade serous ovarian carcinoma using gene expression
Journal Title
Communications Medicine
Publication Type
Jan 14
Abstract
BACKGROUND: Approximately half of all high-grade serous ovarian carcinomas (HGSCs) have a therapeutically targetable defect in homologous recombination (HR) DNA repair. While there are genomic and transcriptomic methods, developed for other cancers, to identify HR deficient (HRD) samples, there are no gene expression-based tools to predict HR status in HGSC specifically. We have built an HGSC-specific model to predict HR status using gene expression.
METHODS: We separated The Cancer Genome Atlas (TCGA) cohort of HGSCs into training (n = 288) and testing (n = 73) sets and labelled each case as HRD or HR proficient (HRP) based on the clinical standard for classification. Using the training set, we performed differential gene expression analysis between HRD and HRP cases. The 2604 significantly differentially expressed genes were used to train a penalised logistic regression model.
RESULTS: IdentifiHR uses the expression of 209 genes to predict HR status in HGSC. These genes preserve the genomic damage signal, capturing known regions of HR-specific copy number alteration which impact gene expression. IdentifiHR is 85% accurate in the TCGA test set and 86% accurate in an independent cohort of 99 samples, taken from primary tumours, ascites and normal fallopian tubes. Further, IdentifiHR is 84% accurate in pseudobulked single-cell HGSC sequencing from 37 patients and outperforms existing expression-based methods to predict HR status, namely BRCAness, MultiscaleHRD and expHRD.
CONCLUSIONS: IdentifiHR is an accurate model to predict HR status in HGSC. It is available as an open-source R package, empowering researchers to robustly classify HR status when only transcriptomic sequencing data is available.

High-grade serous ovarian cancer (HGSC) is a type of ovarian cancer with very poor outcomes. However, half of HGSCs have faulty DNA repair that can be targeted for treatment if it is identified. Existing methods look at changes in DNA that arise when repair is faulty, but do not consider which genes are actively being used, or are “expressed”, by the cancer. We developed IdentifiHR, a machine learning method to predict DNA repair status using the expression of 209 genes. We tested IdentifiHR on 209 patient samples and found it correctly predicts repair status in about 85–86% of cases, performing better than existing tools on the same patient data. IdentifiHR is released as a software package for public use.
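
The abstract describes training a penalised logistic regression model on differentially expressed genes, which reduces 2604 candidate genes to 209 predictors. The sketch below illustrates how a penalty can perform that kind of gene selection. It is not the IdentifiHR code (the package itself is written in R). It assumes an L1 (lasso) penalty, which the abstract does not specify, fitted by proximal gradient descent on synthetic data. All function names, parameter values and the toy data generator are hypothetical.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value towards zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def score(row, weights, bias):
    """Predicted probability of the positive class (here: HRD)."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def fit_l1_logistic(X, y, l1=0.05, step=0.1, n_iter=300):
    """Fit L1-penalised logistic regression by proximal gradient descent."""
    n, p = len(X), len(X[0])
    weights, bias = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            err = score(row, weights, bias) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        bias -= step * grad_b  # the intercept is not penalised
        weights = [soft_threshold(w - step * g, step * l1)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


def make_toy_data(n_samples, n_genes, n_informative, rng):
    """Synthetic expression: the first n_informative genes track the label."""
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)  # 1 = HRD, 0 = HRP (toy labels)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for j in range(n_informative):
            row[j] += 2.0 if label else -2.0
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_toy_data(150, 12, 3, rng)
X_test, y_test = make_toy_data(60, 12, 3, rng)

weights, bias = fit_l1_logistic(X_train, y_train)
predictions = [1 if score(row, weights, bias) >= 0.5 else 0 for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)

selected = [j for j, w in enumerate(weights) if w != 0.0]
print("genes with non-zero weight:", selected)
print("held-out accuracy:", round(accuracy, 3))
```

The soft-thresholding step sets the weights of uninformative genes to exactly zero, so the fitted model keeps only a subset of its inputs. This is one plausible way a penalised model can end up using far fewer genes than it was given.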
Publisher
Springer Nature
Research Division(s)
Blood Cells and Blood Cancer; Bioinformatics and Computational Biology
PubMed ID
41535393
Open Access at Publisher's Site
https://doi.org/10.1038/s43856-026-01387-y
Terms of Use/Rights Notice
Refer to copyright notice on published article.


Creation Date: 2026-01-29 02:00:47
Last Modified: 2026-01-29 02:01:03