PARIS – A team of medical researchers and engineers from the Gustave Roussy Institute, in Villejuif, France, and Paris-Sud University recently developed an artificial intelligence system called Resolved2, designed to assess prospective cancer drugs. As Loïc Verlingue, lead cancer specialist on the data science team at the Gustave Roussy Institute, explained to BioWorld MedTech, “this AI is intended to predict efficiently whether a cancer treatment molecule will achieve authorization or not within six years of pharmacological data and phase I clinical trials.”
Only 5% of candidate drugs pass phase I clinical trials
Drug development takes a long time (15 years), is expensive ($700 million to $900 million) and requires the involvement of several thousand patients. According to Pharmaceutical Research and Manufacturers of America, more than 1,000 anti-neoplastic agents were being investigated last year. Cancer treatment had the highest overall attrition rate for U.S. FDA approval from phase I (95%), phase II (92%) and phase III (67%) trials. There has been a shrinking number of truly innovative new medicines approved by the FDA and other major regulatory bodies around the world over the past five years. The PAREXEL Biopharmaceutical R&D Statistical Sourcebook notes that 50% fewer new molecular entities were approved compared with the previous five years.
Given the recent low success rate in developing anti-neoplastic agents, improving early go/no-go decisions following a phase I clinical trial is a timely challenge. “A tool for predicting FDA approval for new compounds individually on the basis of early clinical data is still lacking,” said Verlingue. That gap prompted his team to design a machine learning system that could improve drug development.
How the machine learning model from Gustave Roussy works
This work was conceived at the Gustave Roussy Institute. The French cancer center set up a department for therapeutic innovation and early trials (DITEP) in 2013, which has since become the largest center for early phase I clinical trials in Europe. “150 multidisciplinary professionals work in this center and more than 450 patients take part each year in phase I trials,” said Christophe Massard, medical oncologist and Chair of DITEP. The drug development department includes six oncologists, economists and engineers from the grandes écoles Télécom Paris and CentraleSupélec, who are developing an open-source machine learning model based on pharmacologic characteristics.
This machine learning model, Resolved2, uses the earliest phase I PubMed abstracts and simple pharmacologic characteristics extracted from the Canadian DrugBank 5.0 database. “A multi-variable Cox model with Lasso penalization (Least Absolute Shrinkage and Selection Operator) reduces the number of variables in our automated decision model from 1140 to 20,” said Verlingue. The French team treated time to FDA approval as a right-censored variable, so that the model could estimate the still-unknown probability of future approval for drugs under follow-up.
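To make the idea of a Lasso-penalized Cox model more concrete, here is a minimal sketch in plain Python. It is not the Resolved2 code: the toy data, the function names (`soft_threshold`, `cox_gradient`, `fit_lasso_cox`, `make_toy_data`), the penalty values and the optimizer (simple proximal gradient descent) are all assumptions made for illustration. It only shows how an L1 penalty can push the coefficients of uninformative variables to exactly zero while censored observations still contribute to the fit.

```python
import math
import random


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def cox_gradient(X, times, events, beta):
    """Gradient of the average negative log partial likelihood.

    Assumes no tied times (true for the continuous toy data below).
    """
    n, p = len(X), len(beta)
    risk = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
    grad = [0.0] * p
    s0, s1 = 0.0, [0.0] * p
    # Visit drugs from longest to shortest follow-up so the running sums
    # always cover the current risk set (everyone still under observation).
    for i in sorted(range(n), key=lambda k: -times[k]):
        s0 += risk[i]
        for j in range(p):
            s1[j] += risk[i] * X[i][j]
        if events[i]:  # censored drugs only contribute to risk sets
            for j in range(p):
                grad[j] -= X[i][j] - s1[j] / s0
    return [g / n for g in grad]


def fit_lasso_cox(X, times, events, lam, step=0.2, n_iter=300):
    """Proximal gradient descent on the Lasso-penalized Cox objective."""
    beta = [0.0] * len(X[0])
    for _ in range(n_iter):
        grad = cox_gradient(X, times, events, beta)
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta


def make_toy_data(n=200, p=5, seed=0):
    """Synthetic drugs (hypothetical): only feature 0 speeds up 'approval'."""
    rng = random.Random(seed)
    X, times, events = [], [], []
    for _ in range(n):
        row = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        approval = rng.expovariate(0.1 * math.exp(row[0]))
        last_news = rng.expovariate(0.05)  # end of follow-up (censoring)
        X.append(row)
        times.append(min(approval, last_news))
        events.append(approval <= last_news)
    return X, times, events


X, times, events = make_toy_data()
for lam in (0.02, 5.0):
    beta = fit_lasso_cox(X, times, events, lam)
    kept = [j for j, b in enumerate(beta) if b != 0.0]
    print(f"lambda={lam}: variables kept = {kept}")
```

A larger penalty keeps fewer variables. In practice the penalty strength would be chosen by cross-validation, and a dedicated survival analysis library would replace this hand-written optimizer.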
Drugs are 16 times more likely to be approved when success is predicted by AI
Using a model for censored data to predict drug approval is new. It accounts for heterogeneity in follow-up and maximizes the amount of data used, including recent examples. Resolved2 was developed using pharmacological data on 462 cancer treatment molecules in phase I development. The machine learning model selected 28 of the most relevant variables out of 1,411 parameters describing cancer treatment molecules that were or were not authorized by the U.S. FDA between 1972 and 2017.
Median follow-up was 134 months. At three and six years of follow-up, 13% and 20% of drugs, respectively, had been approved. The data science team from the Gustave Roussy Institute observed that, overall, 131 of the 462 drugs achieved FDA approval. “Resolved2 model predictions were closely related to the observed FDA approval-free survival for anti-neoplastic agents included in the previously unseen test data,” said Verlingue. FDA approval-free survival is defined as the time from publication of the first early phase I clinical trial to the date of FDA approval, censored at the date of last news.
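The definition of FDA approval-free survival can be illustrated with a short sketch. The dates below are hypothetical, the helper names (`approval_free_survival`, `cumulative_approval`) are invented for this example, and the Kaplan-Meier estimator is used here as a standard way of handling censored follow-up; the article does not state which estimator the team used for its three- and six-year figures.

```python
from datetime import date


def approval_free_survival(first_phase1_pub, approval, last_news):
    """Return (months, approved) for one drug.

    The clock starts at the first phase I publication and stops at FDA
    approval; drugs not yet approved are censored at the date of last news.
    """
    end = approval if approval is not None else last_news
    months = ((end.year - first_phase1_pub.year) * 12
              + (end.month - first_phase1_pub.month))
    return months, approval is not None


def cumulative_approval(records, horizon_months):
    """Kaplan-Meier estimate of the share of drugs approved by the horizon."""
    still_unapproved = 1.0
    approval_times = sorted(
        {t for t, approved in records if approved and t <= horizon_months}
    )
    for t in approval_times:
        at_risk = sum(1 for d, _ in records if d >= t)
        approved_now = sum(1 for d, a in records if a and d == t)
        still_unapproved *= 1.0 - approved_now / at_risk
    return 1.0 - still_unapproved


# Hypothetical drugs: (first phase I publication, FDA approval or None, last news)
drugs = [
    (date(2005, 3, 1), date(2008, 3, 1), date(2017, 1, 1)),
    (date(2010, 6, 1), None, date(2012, 6, 1)),
    (date(2008, 1, 1), date(2013, 1, 1), date(2017, 1, 1)),
    (date(2009, 9, 1), None, date(2017, 9, 1)),
]
records = [approval_free_survival(*d) for d in drugs]
for years in (3, 6):
    print(years, "years:", cumulative_approval(records, 12 * years))
```

In this toy set, the drug censored after 24 months drops out of the risk set before the first approval, so the estimate at three years is one in three rather than the naive one in four. This is the sense in which a censored-data model accounts for uneven follow-up.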
Where the Resolved2 model predicted that a molecule would not obtain marketing authorization, the prediction was correct in 92% of cases. Where it predicted that a molecule would obtain authorization, 73% of treatments did so within six years and 90% within 10 years. A Resolved2 “predicted approved” drug is 16 times more likely to obtain approval than a “predicted non-approved” drug.
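The two percentages above correspond to what statisticians call negative and positive predictive values. As a rough illustration, the sketch below computes both from a simple two-by-two table. The counts are hypothetical, chosen only so that the output matches the reported 73% and 92%; they are not the study's data, and the function name `predictive_values` is an assumption of this example. The sketch also ignores censoring, which the study's own estimates had to handle.

```python
def predictive_values(tp, fp, tn, fn):
    """Positive and negative predictive values from a 2x2 table.

    tp: predicted approved, and approved
    fp: predicted approved, but not approved
    tn: predicted non-approved, and not approved
    fn: predicted non-approved, but approved
    """
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Hypothetical counts, chosen only to reproduce the reported percentages.
ppv, npv = predictive_values(tp=73, fp=27, tn=92, fn=8)
print(f"PPV={ppv:.0%} NPV={npv:.0%}")
```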
Soon to be in French horizon scanning program
This open-source AI tool is proving useful to manufacturers developing new drugs, of course, but also to academics – the lead investigator physicians – deciding which molecules to offer their patients in phase II or phase III trials. “The next step is to adapt our model for Europe because the molecules approved are not strictly identical, and the evaluation criteria can also vary,” said Verlingue. His team is already preparing a second version of Resolved2 that will integrate natural language processing (NLP) into its model. Verlingue is actively seeking public funding as part of the French horizon scanning program launched by the National Cancer Institute (INCa). INCa wishes to identify new or emerging anti-cancer drugs in development, and associated biomarkers, at an early stage. The aim is to evaluate the products' therapeutic effect, their impact on day-to-day care and their financial implications ahead of marketing authorization. In addition, two big pharmaceutical companies, which wish to remain anonymous, have initiated partnership talks with the data science team at the Gustave Roussy Institute to develop specific applications using the Resolved2 tool.