AI Predicts Colorectal Cancer Survival Using Clinical and Molecular Features
Posted on 30 Dec 2025
Colorectal cancer is one of the most common and deadly cancers worldwide, and accurately predicting patient survival remains a major clinical challenge. Traditional prognostic tools often rely on either clinical factors or molecular data alone, limiting their ability to identify high-risk patients and guide treatment decisions. A new machine learning approach now shows that combining clinical and biological features can significantly improve survival prediction accuracy in colorectal cancer patients.
The machine learning-based model, developed by researchers from the University of Brasília (Brasília, Brazil), in collaboration with the University of California San Diego (La Jolla, CA, USA), integrates standard clinical variables with molecular biomarkers to predict patient survival more precisely. Clinical inputs included age, cancer stage, lymph node involvement, chemotherapy status, and other routine factors, while biological features consisted of gene expression and microRNA data. The researchers evaluated multiple machine learning techniques and patient data scenarios to determine which approach delivered the most reliable and consistent predictions.

Data from more than 500 colorectal cancer patients were analyzed, with models trained and tested across three different patient group scenarios. Among the methods evaluated, an adaptive boosting model achieved the highest performance, reaching an accuracy of 89.58%. The findings, published in Oncotarget, showed that combining clinical and biological data consistently outperformed models based on a single data type. Key molecular contributors included the gene E2F8, which was influential across all patient groups, along with WDR77 and hsa-miR-495-3p.
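To make the comparison concrete, the sketch below trains a small adaptive boosting (AdaBoost) ensemble of decision stumps on clinical-only, molecular-only, and combined feature sets, then reports held-out accuracy for each. It is an illustration only, not the researchers' code: the data are randomly generated stand-ins, and the function names, feature counts, sample sizes, and number of boosting rounds are all assumptions chosen for brevity.

```python
import math
import random


def make_patients(n, seed=0):
    """Generate synthetic stand-in data (NOT the study's cohort)."""
    rng = random.Random(seed)
    clinical, molecular, labels = [], [], []
    for _ in range(n):
        c = [rng.gauss(0, 1) for _ in range(3)]  # hypothetical clinical variables
        m = [rng.gauss(0, 1) for _ in range(3)]  # hypothetical expression levels
        # Assumption: the outcome depends on one clinical and one molecular feature.
        signal = c[0] + m[0] + 0.5 * rng.gauss(0, 1)
        clinical.append(c)
        molecular.append(m)
        labels.append(1 if signal > 0 else -1)
    return clinical, molecular, labels


def stump_predict(row, feature, threshold, polarity):
    """A decision stump: one threshold test on one feature."""
    return polarity if row[feature] > threshold else -polarity


def best_stump(X, y, w):
    """Find the stump with the lowest weighted error (weights sum to 1)."""
    best = None
    for j in range(len(X[0])):
        thresholds = sorted({row[j] for row in X})[::10]  # coarse grid for speed
        for t in thresholds:
            err = sum(wi for row, yi, wi in zip(X, y, w)
                      if stump_predict(row, j, t, 1) != yi)
            # Flipping the polarity flips every prediction, so its error is 1 - err.
            for polarity, e in ((1, err), (-1, 1.0 - err)):
                if best is None or e < best[0]:
                    best = (e, j, t, polarity)
    return best


def fit_adaboost(X, y, rounds=15):
    """Discrete AdaBoost: reweight samples so later stumps focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, j, t, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, t, pol))
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted vote of all stumps."""
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in ensemble)
    return 1 if score > 0 else -1


def accuracy(ensemble, X, y):
    return sum(predict(ensemble, r) == yi for r, yi in zip(X, y)) / len(y)


clinical, molecular, y = make_patients(300)
combined = [c + m for c, m in zip(clinical, molecular)]  # concatenate feature vectors
split = 200  # first 200 patients for training, remainder held out

scores = {}
for name, X in (("clinical", clinical), ("molecular", molecular), ("combined", combined)):
    model = fit_adaboost(X[:split], y[:split])
    scores[name] = accuracy(model, X[split:], y[split:])

print(scores)
```

Because the synthetic outcome is built from one clinical and one molecular feature, the combined model has access to information that neither single-source model sees. The numbers it prints say nothing about real patients; the 89.58% figure reported above comes from the study itself.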
The results suggest that integrated machine learning models could help clinicians better stratify patients by risk, tailor treatment strategies, and improve long-term outcome predictions. The study also highlights the value of ensemble learning methods, which provided stable results across diverse patient groups. Future research will focus on incorporating additional clinical variables, such as lifestyle and environmental factors, and further exploring biomarkers like E2F8 as potential targets for monitoring disease progression or developing targeted therapies.
Related Links:
University of Brasília
UC San Diego