Big Data and Chemistry

Stephen DeAngelis

July 28, 2014

“Despite the cost and relative scarcity of precious metals — iridium, platinum, rhodium — we rely on them to manufacture products from denim to beer, pharmaceuticals to fuel cells,” reports Hillary Rosner (@hillaryrosner). “For instance, a solution containing platinum is used to make silicone emulsifiers, compounds that in turn feed products like makeup, cookware and glue. The elements are used as catalysts, substances that kick off or enable chemical reactions. … Tiny amounts of the expensive metal are scattered in all these things; your jeans, for instance, contain unrecoverable particles of platinum.” [“A Chemist Comes Very Close to a Midas Touch,” The New York Times, 15 October 2012] The modifier “precious” also indicates “expensive.” Imagine being able to substitute common metals for precious metals in industrial processes and consumer products. That is the goal of Dr. Paul Chirik (@pchirik), a professor of chemistry at Princeton University. Although Dr. Chirik hasn’t managed to turn lead into gold (the historical pursuit of alchemists), the good professor “has learned how to make iron function like platinum, in chemical reactions that are crucial to manufacturing scores of basic materials.” Matthew Hartings (@sciencegeist), a chemist at American University in Washington, DC, told Rosner, “We’re not about to run out of platinum, but (the jean manufacturing) process spends that platinum in a nonsustainable way.” Sustainability is becoming a big issue in almost every economic segment, but it is especially important in manufacturing.

Discovering ways to make plentiful elements more useful is critical if technological advances are to continue. Chirik told Rosner, “No chemist would think lithium was in short supply, but what happens if you put a lithium battery in every car? This is why chemistry needs to be ahead of the curve. We need to have adaptable solutions.” Undoubtedly, big data analytics will play a role in discovering adaptable solutions. Rosner continues:

“Dr. Chirik’s chemistry essentially wraps an iron molecule in another, organic molecule called a ligand. The ligand alters the number of electrons available to form bonds. It also serves as a scaffold, giving the molecule shape. ‘Geometry is really important in chemistry,’ Dr. Hartings said. Dr. Chirik’s ‘ligands help the iron to be in the right geometry to help these reactions along.’ In addition to iron, Dr. Chirik’s lab also works with cobalt, which sits beside iron on the periodic table. Using cobalt, Dr. Chirik said, the scientists have generated ‘a whole new reaction that no one has ever seen before.’ It produces new types of plastics using very inexpensive starting materials. But the price of cobalt has shot up since the lab first began its research, thanks to the element’s use in the flat batteries that power gadgets like iPads and iPhones. ‘The iPad has completely changed the price of cobalt,’ Dr. Chirik said, ‘so something that once was garbage is now valuable.’ While the rising cost may undermine the economic incentive to use Dr. Chirik’s cobalt-fueled materials, it seems to perfectly underscore his basic point about the need for flexibility.”

In addition to helping find more sustainable ways to use abundant elements, big data will undoubtedly help chemical companies discover new compounds that can replace more toxic substances. Because chemicals can be toxic, the public has displayed a healthy skepticism about chemical production over the years. The greatest tragedy involving chemical production occurred in 1984 in Bhopal, India. Nearly 4,000 people died in the aftermath of a toxic gas release and an additional 8,000 people died of gas-related ailments over the following years. Tens of thousands of other people were injured as a result of the gas release. The tragedy cost Union Carbide dearly in both money and reputation. Jeff Reinke (@JeffReinkeABM) notes, “Chemical production is often described, understated as it might be, as a ‘harsh’ operating environment.” [“Processing Trends, Chemical Industry Knowledge Lead To Custom Approach,” 29 January 2014] Don Mahoney (@DonMahoneySAP), global head of SAP’s Chemicals Industry Business Solutions, told Reinke, “The chemical processing industry is kind of like heating and air conditioning – nobody notices until something goes wrong. Almost every product starts with chemicals of some sort, but there’s never any coverage of the amount that’s produced without incidents. When it comes to safety, what companies need to do is what they’re already doing – focus on responsible care, sustainability programs, non-regulatory SIN (substitute it now) lists and programs for subbing in less volatile chemicals, and getting more involved in the communities in which they operate.” Big data can play a role in most of those efforts. It can also help chemical plants run more efficiently. Mahoney told Reinke, “Chemical companies have vast amounts of current and historical data from labs and production centers. More and more we see chemical companies needing a way to bring all that data into one place and then establish standards for reporting and metrics to help increase margins through operational excellence.”

In a letter to the editor of Chemical & Engineering News, Meeuwis van Arkel wrote, “Chemists need information from a multitude of different sources, each with its own origins. But there’s a huge gap between volume and relevance that needs to be bridged.” [“The Deal With Big Data,” 13 January 2014] Van Arkel continued:

“For example, many chemists will use freely available search engines at the beginning of a project to look for connections or inspiration. However, a single search can return a huge amount of information, relatively little of which will be relevant or useful. But this is not always the case, particularly in highly complex areas of science. Big data tools must be able not only to crunch the numbers and information into carefully filtered and analyzed results, they must also present the user with the exact information required. Even in that first handful of pages of results there may be only specific nuggets that are truly valuable. As a result, any tool must be able to contextualize and classify information to ensure that researchers are working with the most relevant results. Big data must be focused on breaking huge blocks of information down to the smallest particles. Only when we can ensure that our tools enable confident decision making at every stage of chemical research will we realize big data’s value rather than feel as if we are drowning in the chaos of too much.”

I agree with van Arkel, and I believe the only way to achieve what he suggests is to use a cognitive computing system like Enterra Solutions’ Cognitive Reasoning Platform™ (CRP), which uses both reasoning and computation to achieve insights. The Enterra Solutions® approach uses the world’s largest common-sense ontology to provide the context that van Arkel sees missing in most current approaches. Another company, Nutonian, which advertises itself as the Robotic Data Scientist™, “providing answers to questions you never knew to ask, automatically,” agrees that artificial intelligence needs to play a significant role in the chemical industry’s future. It states, “For chemistry researchers already struggling with hundreds of millions of known molecules and sequences, new machine learning techniques can remove data processing limitations and increase the rate of discovery.”

From discovery to safety, I believe that big data analytics has a role to play in the chemical industry. As we state on our website, “The chemical industry operates in one of the most complex and high risk supply chains in the world. It is characterized by volatile industrial markets, inherently hazardous and closely regulated products, feed stocks that often originate in unstable conflict regions and inter-dependent logistics networks of rail, road, ocean, and pipelines that must safely transport vast quantities of raw materials and finished product to producers and markets. These factors all converge to increase risk and add to the unpredictability of chemical production, transport and marketing. Big Data Analytics provide an obvious path to streamline and safeguard chemical supply chain management. The Enterra Cognitive Reasoning Platform allows chemical manufacturers to capture, curate and analyze vast amounts of data and information generated at every step along the chemical supply chain.”