AI Tool Combines Data from Medical Images with Text to Predict Cancer Prognoses
Posted on 10 Jan 2025
The integration of visual data (such as microscopic and X-ray images, CT and MRI scans) with textual information (like exam notes and communications between doctors of different specialties) is a crucial aspect of cancer care. While artificial intelligence (AI) tools have been increasingly employed in clinical settings, their primary application has been in diagnostics rather than prognosis. AI aids doctors in reviewing images and detecting disease-related anomalies, such as abnormally shaped cells, but developing computerized models that can combine various types of data has been a challenge. One of the difficulties is the need to train these models with large amounts of labeled and paired data, like a microscope slide showing a cancerous tumor alongside the clinical notes of the patient from whom the tumor was obtained. However, curated and annotated datasets are often scarce.
Researchers have now developed an AI model capable of integrating both visual and textual data. After training on 50 million medical images of standard pathology slides and more than 1 billion pathology-related texts, the model surpassed traditional methods in its ability to predict the prognoses of thousands of cancer patients, identify individuals with lung or gastroesophageal cancers likely to benefit from immunotherapy, and pinpoint melanoma patients most at risk of experiencing a recurrence.
The model, named MUSK (multimodal transformer with unified mask modeling), was developed by researchers at Stanford Medicine (Stanford, CA, USA). MUSK marks a significant departure from the typical use of AI in clinical settings, and the researchers believe it has the potential to transform how AI can guide patient care. In AI terminology, MUSK is considered a foundation model. Foundation models, which are pretrained on large datasets, can be further fine-tuned with additional training to handle specific tasks. Since MUSK was designed to utilize unpaired multimodal data that does not meet the traditional requirements for training AI, it can leverage a much larger pool of data for its initial learning phase. As a result, subsequent training only requires smaller, more specialized datasets. Essentially, MUSK is a ready-to-use tool that doctors can customize to answer specific clinical questions.
To develop MUSK, the researchers gathered microscopic tissue slides, pathology reports, and follow-up data (including patient outcomes) from The Cancer Genome Atlas, a national database, for individuals with 16 major cancer types, such as breast, lung, colorectal, pancreatic, kidney, bladder, and head and neck cancers. This data was used to train MUSK to predict disease-specific survival, that is, the percentage of patients who have not died from a specific disease within a given time frame. According to the study, published in Nature, MUSK accurately predicted disease-specific survival for all cancer types 75% of the time. In comparison, traditional predictions based on a person’s cancer stage and other clinical risk factors were correct 64% of the time.
In another example, MUSK was trained to predict which patients with lung cancer or cancers of the gastric and esophageal tracts are most likely to benefit from immunotherapy. For non-small cell lung cancer, MUSK identified patients who responded well to immunotherapy approximately 77% of the time. In contrast, the conventional method of predicting immunotherapy response based on PD-L1 expression was correct only 61% of the time. Similarly, when the researchers trained MUSK to identify melanoma patients at high risk of relapse within five years after initial treatment, the model was accurate about 83% of the time, which is roughly 12% more accurate than other foundation models.
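The article does not say which metric lies behind figures such as "75% of the time" for survival prediction. One common way to score survival models in these terms is the concordance index: among pairs of patients whose outcomes can be compared, the fraction in which the patient who died earlier was given the higher predicted risk. The sketch below is an illustration under that assumption, not the study's evaluation code; the function name and the toy patient data are made up for this example.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs ranked correctly by predicted risk.

    times  -- follow-up time for each patient
    events -- 1 if the patient died of the disease, 0 if censored
    risks  -- predicted risk scores (higher means worse expected outcome)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had the event
            # strictly before patient j's last known follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier death had the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]

print(concordance_index(toy_times, toy_events, toy_risks))  # prints 0.8
```

On this measure, 0.5 corresponds to random guessing and 1.0 to a perfect ranking, so a move from 0.64 to 0.75 would mean a noticeably larger share of patient pairs ranked in the right order.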
“MUSK can accurately predict the prognoses of people with many different kinds and stages of cancer,” said Ruijiang Li, MD, an associate professor of radiation oncology. “We designed MUSK because, in clinical practice, physicians never rely on just one type of data to make clinical decisions. We wanted to leverage multiple types of data to gain more insight and get more precise predictions about patient outcomes.”
“What’s unique about MUSK is the ability to incorporate unpaired multimodal data into pretraining, which substantially increases the scale of data compared with paired data required by other models,” added Li.