Big Data and the Pharmaceutical Industry

Stephen DeAngelis

February 17, 2014

“Big data has grown to prominence in the pharmaceutical market over recent years,” writes Lucy Hill, “turning into a key technology for businesses.” [“Big data becomes more prominent in pharma industry,” KL Discovery, 27 December 2013] Big Data analytics should help the pharmaceutical industry take more considerations into account as it develops drugs. That’s a good thing, according to Skip Show, Forrester’s Chief Information Officer. He told Hill that he believes “pharma should move away from its focus on molecules in order to take a holistic view of disease.” He stated:

“Pharma needs to understand prescribing behaviour in the formulary and in the physician’s office better in order to influence it and thus drive sales. As per a senior marketing manager from a meeting recently: ‘In the old world, we just sprayed and prayed,’ meaning that the marketing campaigns aimed at the physician did not discriminate as to who that physician was.”

I don’t entirely agree with Show’s position (i.e., that the pharmaceutical industry should move away from a focus on molecules). After all, the pharmaceutical sector is built on a foundation of research and development even though it is sustained by sales. In both the R&D and sales arenas, Big Data can play a significant role. One company that is capitalizing on pharma’s embrace of Big Data analytics is IMS Health, which “says it pulled in nearly $2 billion in the first nine months of 2013, much of it from sweeping up data from pharmacies and selling it to pharmaceutical and biotech companies.” [“,” by Charles Ornstein, ProPublica, 10 January 2014] Ornstein reports that “IMS and its competitors are known as prescription drug information intermediaries.” They provide the data that allows drug companies to do exactly what Show recommends they do (i.e., understand a doctor’s office better in order to influence sales). Ornstein reports, “Drug company sales representatives, using data these companies supply, can know before entering a doctor’s office if he or she favors their products or those of a competitor.” He adds, “The industry is controversial, with some doctors and patient groups saying it threatens the privacy of private medical information.”

As a purveyor of data, IMS is at the blunt end of the Big Data business. The real magic happens at the pointy end where analytics become more prominent. IMS and its competitors have been collecting massive amounts of prescription data for at least 15 years, but they are having a hard time developing Big Data analytics. They do a lot of traditional market and claims analysis research, but they don’t conduct the kind of analytics that will move drug development forward. The use of Big Data for research purposes is somewhat less controversial than its use in marketing. One reason is that such data can be used to advance personalized medicine. An article in the U.S. News & World Report explains:

“Personalized medicine is a young but rapidly advancing field of healthcare that is informed by each person’s unique clinical, genetic, genomic, and environmental information. Because these factors are different for every person, the nature of diseases — including their onset, their course, and how they might respond to drugs or other interventions — is as individual as the people who have them. Personalized medicine is about making the treatment as individualized as the disease. It involves identifying genetic, genomic, and clinical information that allows accurate predictions to be made about a person’s susceptibility of developing disease, the course of disease, and its response to treatment.” [“Personalized Medicine,” 20 January 2011]

Show told Hill “that genomic-based drugs are driving changes via the amounts and types of data managed by the industry.” Nsikan Akpan believes that Big Data must play a larger role in the pharma sector’s R&D efforts. [“,” Genetic Engineering and Biotechnology News, 1 January 2014] He explains:

“The heroic age of pharmaceutical development may be drawing to a close. Once, clinical boons such as the polio vaccine and angiotensin-converting enzyme inhibitors were the work of identifiable geniuses. But future advances may depend on vast data troves and machine-driven calculations as impersonal as they are tortuous. The last three decades have shown that the paramount medical challenges of our time, such as Alzheimer’s disease or cancer, are rife with intricacies that require comprehensive, multimodal studies. In facing up to these challenges, pharmaceutical development is passing into the big data era, a time in which data collection occurs on a scale beyond human comprehension. Yet it is becoming apparent that merely collecting large volumes of data will not be enough. No big data approach is complete without a way to manage the vast amounts of information regarding products, patients, customers, prescriber behaviors, and clinical trials across hundreds of different types of compounds.”

As noted above, companies are collecting vast amounts of data but struggling to leverage it. Analysts repeatedly note that data without analysis are worthless. That’s why John Reynders, Chief Information Officer at Moderna Therapeutics, told Akpan that massive data collection “is the easiest part of a bioinformatics approach.” Big Data will only play a significant role in the pharma R&D process, he told Akpan, if a company can assemble “the right team and tools for data mining and machine learning.” I agree wholeheartedly with Reynders. At Enterra Solutions®, we believe that our Cognitive Reasoning Platform™ (CRP), a learning system that marries artificial intelligence and a common sense ontology, could play a significant role in helping to discover new drugs. Context matters, even in the drug discovery process, which is why we believe the AI/ontology approach is a good one. As Reynders told Akpan, “People will buy hardware-accelerated approaches for relationship searching, which is great if you have sequel-based [i.e., SQL-based] queries, but maybe the problem actually required a more dynamic navigation of very non-obvious relationships in graphs or triple stores. Rather than bring the data to your application, it is better to bring the application to your data.”
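
To make Reynders’ contrast concrete, here is a minimal sketch of what navigating non-obvious relationships in a triple store can look like. It is an illustration only, not Enterra’s platform or any vendor’s product: the triples, the entity names, and the `find_path` helper are all hypothetical, and a plain breadth-first search stands in for whatever a real graph engine would do.

```python
from collections import deque

# Hypothetical (subject, predicate, object) triples; not real biomedical data.
TRIPLES = [
    ("compound_A", "inhibits", "protein_X"),
    ("protein_X", "regulates", "pathway_Y"),
    ("pathway_Y", "implicated_in", "disease_Z"),
    ("compound_B", "binds", "protein_W"),
]


def find_path(triples, start, goal):
    """Breadth-first search for a chain of triples linking start to goal."""
    # Index outgoing edges by subject so each hop is a dictionary lookup.
    edges = {}
    for subject, predicate, obj in triples:
        edges.setdefault(subject, []).append((predicate, obj))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, obj in edges.get(node, []):
            if obj not in seen:
                seen.add(obj)
                queue.append((obj, path + [(node, predicate, obj)]))
    return None  # No chain of relationships connects start to goal.


path = find_path(TRIPLES, "compound_A", "disease_Z")
for subject, predicate, obj in path:
    print(subject, predicate, obj)
```

The point of the sketch is that the number of hops between a compound and a disease is not known in advance. A fixed relational query has to spell out each join, whereas a graph traversal simply follows relationships until it finds a connection or runs out of them.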

Before Big Data makes a dent in the drug discovery arena, Steve Dickman, Chief Executive Officer at CBT Advisors, says that the industry needs a big change of heart. [“Big Data in Drug Discovery and Healthcare: What is the Tipping Point?” Boston Biotech Watch, 23 January 2014] He explains:

“What good is big data for drug discovery? Not much, if you ask the pharmaceutical industry. The world’s drugmakers have other challenges right now and, with a few notable exceptions like PatientsLikeMe, neither consumer-driven nor patient-driven ‘big data’ seems to be part of the solution. Even in the apparently more data-driven field of healthcare services, big data keeps bumping up against regulatory and practical barriers.”

Dickman asserts that the required change of heart may be coming. He reports, “A recent panel of experts argued that trends in big data will drive up its relevance and provide a navigable path toward greater utility both in pharma and in healthcare.” Among those trends are new algorithms that could foster better drug discovery. Eliza Grinnell reports, “Computer scientists at the Harvard School of Engineering and Applied Sciences (SEAS) and the Wyss Institute for Biologically Inspired Engineering at Harvard University have joined forces to put powerful probabilistic reasoning algorithms in the hands of bioengineers. In a new paper presented at the Neural Information Processing Systems conference on December 7, Ryan P. Adams and Nils Napp have shown that an important class of artificial intelligence algorithms could be implemented using chemical reactions.” [“Drugs with Artificial Intelligence Possible,” Ideas, Inventions and Innovations, 15 December 2013] Grinnell continues:

“Adams’ and Napp’s work demonstrates that some aspects of artificial intelligence (AI) could be implemented at microscopic scales using molecules. In the long term, the researchers say, such theoretical developments could open the door for ‘smart drugs’ that can automatically detect, diagnose, and treat a variety of diseases using a cocktail of chemicals that can perform AI-type reasoning. … The field of machine learning is revolutionizing many areas of science and engineering. The ability to extract useful insights from vast amounts of weak and incomplete information is not only fueling the current interest in ‘big data,’ but has also enabled rapid progress in more traditional disciplines such as computer vision, estimation, and robotics, where data are available but difficult to interpret. Bioengineers often face similar challenges, as many molecular pathways are still poorly characterized and available data are corrupted by random noise. Using machine learning, these challenges can now be overcome by modeling the dependencies between random variables and using them to extract and accumulate the small amounts of information each random event provides.”
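
The idea of accumulating “the small amounts of information each random event provides” can be illustrated with a few lines of Bayes’ rule. This is a minimal sketch under stated assumptions, not the algorithm Adams and Napp describe: the `update` function and the probabilities are hypothetical, chosen so that each observation is only weakly informative (60% likely if the hypothesis is true, 40% likely if it is false).

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """Apply Bayes' rule once and return the posterior probability."""
    numerator = likelihood_if_true * prior
    return numerator / (numerator + likelihood_if_false * (1.0 - prior))


# Start undecided, then fold in ten weak, noisy observations one at a time.
belief = 0.5
for _ in range(10):
    belief = update(belief, likelihood_if_true=0.6, likelihood_if_false=0.4)

print(round(belief, 3))
```

No single observation settles the question, since one update only moves the belief from 0.5 to 0.6. Ten of them together push it above 0.98, which is the sense in which weak and incomplete data become useful once the dependencies between variables are modeled.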

Looking to the future, the trends all indicate that Big Data analytics are going to play a much larger role in the pharmaceutical industry than they do today. A McKinsey & Company report recently concluded, “The big data revolution is in its early days and most of the potential for value creation is still unclaimed.” [“Interest in healthcare ‘big data’ grows,” by Andrew Ward, Financial Times, 30 January 2014] Undoubtedly, Big Data will play a role in how drug makers market their wares, but the most beneficial impact for society will come when Big Data analytics are used to help advance drug discovery.