A survey of 149 companies by the Tufts Center for the Study of Drug Development found more than two-thirds of clinical trial sponsors were using or piloting at least four different data sources in their clinical trials.
"I think it's the very beginning of a very steep hockey curve," Richard Young, vice president of Veeva Systems Inc.'s Vault EDC, told Bioworld. "I think four is going to become a double-digit number in the next five years or so."
Non-case report form data is the most common data source, with 86.4% of small companies and 87.9% of medium and large companies using or piloting it. "But it also is the most frequent cited cause for database lock delays," the Tufts report points out.
More medium and large companies than small companies are using direct data capture, 83.8% vs. 66%. Larger companies are also more likely to use electronic health records (EHR) or electronic medical records (EMR), with 51.5% of the medium and large group using EHR/EMR, compared to just 25.6% of small companies.
More data, more challenges
"There's such a wealth of data available to us today; the challenge is how quickly can we use it," Young said. He said he thinks companies are good at understanding what they're going to do with the data once they have it, but the ingestion of the data is the hard part, especially in "making sure the time from receiving it to the time you can actually use it is as short as possible."
When there are only a few data sources, the task isn't particularly daunting, Young noted, but when you get to the point where there are 15 data sources, "it's really not going to scale."
Getting EHR/EMR data out of systems and into a usable format, for instance, is particularly challenging because there are so many different EHR/EMR systems with different formats.
Tufts found that small companies in particular considered transforming and mapping data a challenge, with more than twice as many small sponsors as large sponsors reporting the task was "extremely time consuming."
Cleaning the data is another difficult step that will challenge companies as they increase the number of data sources they use. "I think that's going to become a wholesale change as we start to think about how one data point can interrelate to others in ways that we have not previously considered or had to think about," Young said.
Electronic data capture (EDC) was designed to convert doctors' notes into an electronic format so the data could be cleaned, but that cleaning was meant to be done one page at a time. With thousands or even millions of data points per patient for some data sources, "there's no way that those systems are going to manage that volume of data in the same way, so we need a whole new tool set, for sure," Young said.
As the amount of data increases, companies may need to accept that the cleaning of the data won't be perfect. With a small number of data sources, "perfection was the only acceptable standard when you talked about cleaning data. And now, I'm openly talking about how not all data is born equal and maybe good enough is good enough," Young said.
When data sources create millions of data points, from devices or apps for instance, companies may end up summarizing the data to make the analysis easier. But Young noted that companies need to be choosy about which data are summarized because summarizing may result in missing an important individual event.
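One way to picture that trade-off is a summary step that collapses a stream of readings into a few statistics but still keeps any out-of-range reading as an individual event. The following is only a minimal sketch under stated assumptions: the function name, the heart-rate readings and the 50-to-120 range are all hypothetical illustrations, not anything described by Young or the Tufts report.

```python
from statistics import mean


def summarize_readings(readings, low, high):
    """Summarize (timestamp, value) readings while preserving outliers.

    `low` and `high` are a hypothetical acceptable range; any reading
    outside it is kept as an individual event instead of being averaged away.
    """
    values = [value for _, value in readings]
    summary = {
        "count": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }
    # Keep out-of-range readings verbatim so a single important event survives.
    events = [(ts, value) for ts, value in readings if value < low or value > high]
    return summary, events


# Hypothetical wearable data: (minute, heart rate in beats per minute).
readings = [(0, 72), (1, 74), (2, 71), (3, 148), (4, 73)]
summary, events = summarize_readings(readings, low=50, high=120)
print(summary)
print(events)
```

In this made-up series, the average alone would hide the single spike at minute 3; keeping the flagged readings alongside the summary is one way to avoid losing it.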
Of the data tasks they have to perform, companies told Tufts that initiating relationships with data vendors was the most time consuming. Young said he thinks that companies are putting too much pressure on data providers to deliver data in a submission-ready format. He believes it makes more sense for companies to receive the data in its native format, use it to make clinical decisions and then transform it into a format for regulators.
Getting a strategy
Most of the companies surveyed by Tufts thought they needed a formal data strategy. "The vast majority (87%-93%) of sponsor companies perceive benefits of a formal data strategy, and those which have implemented a data strategy report faster time from last patient, last visit to database lock, compared to those with no formal strategy (36.1 days vs. 41.8 days)," the report notes.
But despite the perceived need, only about one-third of companies have implemented a data strategy; medium and large companies were more likely to have one than smaller companies.
"The companies that have impressed me the most, without naming names, are the ones that have said, 'Our data management process fundamentally hasn't changed in the last 30 years. We need to start with a blank sheet of paper and build up what's really important,'" Young said.
Of the companies with a formal data governance process, about 40% of medium and large companies use a hybrid approach where the clinical operations and data management teams share the responsibility for data governance. In smaller companies, the governance is more likely to be housed specifically in the data management group, with 45.8% of small companies reporting that their data management groups oversee the governance.
Young said he thinks the best strategy for data governance is for companies to have a new group with members from clinical operations, data management, statistics and information technology working together to figure out what data sources they need and how they'll transform and use the data after they are captured.