AI Tool ‘Sees’ Cancer Gene Signatures in Biopsy Images
Posted on 18 Nov 2024
To assess the type and severity of cancer, pathologists typically examine thin slices of a tumor biopsy under a microscope. However, to understand the genomic alterations driving the tumor's growth, scientists must perform genetic sequencing on RNA extracted from the tumor. Increasingly, clinicians are using not only the tumor’s location to guide treatment decisions but also the specific genes that fuel its progression. The activation or deactivation of certain genes can make a tumor more aggressive, more likely to spread, or more or less responsive to various treatments, such as chemotherapy, immunotherapy, and hormone therapies. However, accessing this critical genetic information often requires expensive and time-consuming sequencing. Now, researchers have developed an artificial intelligence (AI)-powered computational tool that can predict the activity of thousands of genes within tumor cells, using only standard microscopy images from the biopsy. This tool, named SEQUOIA (Slide-based Expression Quantification Using Linearized Attention), was created with data from over 7,000 diverse tumor samples. It has shown the ability to predict genetic variations in breast cancers and patient outcomes, all based on routine biopsy images.
The research team at Stanford Medicine (Stanford, CA, USA) was aware that gene activity within individual cells can change their appearance in ways that are often invisible to the naked eye. To uncover these patterns, they turned to AI. Their study used 7,584 cancer biopsies from 16 different cancer types. Each biopsy was sliced into thin sections and stained using hematoxylin and eosin, a standard method for visualizing cancer cell morphology. Data on the transcriptomes of these cancers—showing which genes were being actively expressed—was also available. By integrating these biopsies with other datasets, including images from thousands of healthy cells and transcriptomic data, the AI program, as described in Nature Communications, was able to predict the expression patterns of more than 15,000 genes from the stained biopsy images.
In certain cancer types, the AI’s predictions of gene activity had more than an 80% correlation with the actual gene activity data. The accuracy of the model generally improved when more samples from a specific cancer type were included in the training data. According to the researchers, clinicians rarely focus on individual genes when making decisions but instead consider gene signatures composed of hundreds of genes. For instance, many cancer cells activate extensive groups of genes related to inflammation or cell growth. SEQUOIA was even more accurate at predicting whether such large genomic programs were activated than it was at predicting individual gene expression. To make the results more accessible, the researchers programmed SEQUOIA to display genetic findings as a visual map of the tumor biopsy, allowing clinicians and researchers to see how gene activity differs across areas of the tumor.
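For readers who want a concrete sense of what a correlation between predicted and measured gene activity means, the following is a minimal sketch in Python. It is not the study's code or data: the numbers are synthetic, and the function names `per_gene_correlation` and `signature_score` are hypothetical. It assumes, purely for illustration, that the prediction errors of individual genes are independent, in which case averaging many genes into one signature score cancels much of the noise. That is one plausible reason a signature can be easier to predict than a single gene, not an explanation confirmed by the researchers.

```python
import numpy as np


def per_gene_correlation(predicted, measured):
    """Pearson correlation for each gene (column) across samples (rows)."""
    p = predicted - predicted.mean(axis=0)
    m = measured - measured.mean(axis=0)
    denom = np.sqrt((p ** 2).sum(axis=0) * (m ** 2).sum(axis=0))
    return (p * m).sum(axis=0) / denom


def signature_score(expression, gene_indices):
    """Average expression over the genes in a signature, one score per sample."""
    return expression[:, gene_indices].mean(axis=1)


# Synthetic data (assumption): 200 samples, 50 genes driven by one shared program.
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
program = rng.normal(size=(n_samples, 1))
measured = program + rng.normal(scale=1.0, size=(n_samples, n_genes))
# Toy "predictions": the measured values plus independent per-gene error.
predicted = measured + rng.normal(scale=1.0, size=(n_samples, n_genes))

gene_r = per_gene_correlation(predicted, measured)

signature = np.arange(n_genes)  # hypothetical signature: all 50 genes
sig_r = np.corrcoef(
    signature_score(predicted, signature),
    signature_score(measured, signature),
)[0, 1]

print(f"median per-gene correlation: {np.median(gene_r):.2f}")
print(f"signature-level correlation: {sig_r:.2f}")
```

In this toy setup the signature-level correlation comes out higher than the typical per-gene correlation, because the independent errors average out across the 50 genes.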
To test the clinical utility of SEQUOIA, the team focused on breast cancer genes that are already used in commercial tests. For example, the FDA-approved MammaPrint test evaluates 70 breast-cancer-related genes to generate a risk score for cancer recurrence. The researchers demonstrated that SEQUOIA could generate the same risk score as MammaPrint using only the stained tumor biopsy images. The results were confirmed in multiple cohorts of breast cancer patients, and in each case, patients classified as high risk by SEQUOIA experienced worse outcomes, including higher recurrence rates and shorter times to recurrence. Although SEQUOIA is not yet ready for clinical use (it still requires validation through clinical trials and FDA approval), the researchers are continuing to refine the algorithm and explore its potential. In the future, SEQUOIA could reduce the need for expensive gene expression tests.
“This kind of software could be used to quickly identify gene signatures in patients’ tumors, speeding up clinical decision-making and saving the health care system thousands of dollars,” said Olivier Gevaert, PhD, a professor of biomedical data science and the senior author of the paper. “We’ve shown how useful this could be for breast cancer, and we can now use it for all cancers and look at any gene signature that is out there. It’s a whole new source of data that we didn’t have before.”