Using high-performance computing (HPC), machine learning (ML), deep learning and AI, Argonne researchers, in a broader collaboration as part of the National Virtual Biotechnology Laboratory, have created an ecosystem of open-source AI/ML tools and conventional physics-based simulations that can accelerate the response to pandemics such as COVID-19.

The outputs from the physics-based models are used iteratively to improve the prediction capabilities of Argonne's AI/ML approaches, successively improving the overall yield of drug candidates that can be refined further using biochemical and biological assays. That is why it is important for Argonne researchers to harness the power of an AI system built for the end-to-end machine learning workflow, from data analytics to training to inference, and one that provides a giant performance leap so machine learning engineers can stay ahead of the exponentially growing size of AI models and data.

"When the COVID-19 pandemic started, one of the concerns was how can we find drugs that can really attack the virus, and how can we stop it in its tracks," said Arvind Ramanathan, a computational biologist in the Data Science and Learning Division at Argonne National Laboratory and a senior scientist at the University of Chicago Consortium for Advanced Science and Engineering (CASE).

Argonne researchers were already working on AI and ML models for cancer drug research projects sponsored by the National Cancer Institute. They were able to pivot quickly and build out that platform for COVID drug discovery. Researchers also tapped into the resources of the Advanced Photon Source (APS), a U.S. Department of Energy (DOE) Office of Science user facility at Argonne and one of the world's most productive X-ray light source facilities. The APS is like a minute microscope that visualizes proteins from the virus, according to Ramanathan. The ultra-bright X-rays it generates have helped scientists determine more than 160 structures of the proteins that make up SARS-CoV-2. Many of the crystal structures for the virus were solved at Argonne.

AI, machine learning in action

"We were right at the source of data and could get crystal structures and use the same protocols we had built to look at small molecules coming out the APS," said Ramanathan. It is not likely that you will discover a medicine or drug for the virus every time, he said. ML tools were used to sift through vast chemical spaces to find 1,000 to 2,000 molecules that can target the viral proteins.

"The second piece of the story is to understand how the virus attacks the human body," Ramanathan said. Using the spike-like protein on its surface, the COVID-19 virus binds to the angiotensin-converting enzyme 2 (ACE2) receptor. The ACE2 receptor is a protein on the surface of human cells that is critical to regulating processes such as blood pressure, wound healing and inflammation.

Researchers wanted a better understanding of how the virus binds to the ACE2 receptor. They had a targeted way to look at molecular dynamics simulations, but these simulations generate massive amounts of data, over 100 terabytes in some cases, which makes it hard to extract insights. Here again, ML helped analyze the data to determine what is important and to guide the search for chemicals that can target the virus proteins.

The third piece of the puzzle also involves large-scale molecular simulation. In developing simulations, it is hard to figure out what landscapes are important, Ramanathan noted. "Not all encounters between the virus and the ACE2 receptor will lead to a binding. But if we want to understand the basic mechanisms, we have to make that happen," he said.

Consequently, the researchers employed AI/ML techniques to actualize how events might occur inside a cell. "We could kind of study them in real-time to see what is happening. So that is the big piece of the puzzle that all came together. Machine learning played a role in every one of these steps," Ramanathan said.

Argonne researchers discovered nearly 60 molecules that are effective against the COVID-19 virus. Many questions remain, and the research has been passed to researchers at university hospitals working in collaboration with Argonne, such as the University of Chicago and the University of Tennessee.

Industry partners are crucial to success

Argonne researchers have been fortunate to have an AI/ML platform that lets them exploit information and conduct exciting research to help fight the COVID-19 virus, Ramanathan added, citing the AI computing leadership of NVIDIA.

"We worked closely with the team at Argonne to get them to the point where they could run models at scale on their NVIDIA DGX SuperPOD™ system — and to build out a wide tool set," said Tony Paikeday, Senior Director of AI Systems at NVIDIA. 

WWT is an award-winning, authorized NVIDIA Partner Network Elite-level partner that collaborates with NVIDIA on many AI/ML and virtualization solutions for organizations around the world, including this research at Argonne. WWT ensured that the SuperPOD system was ready to support Argonne's mission by performing a series of validation tests during installation.

"Argonne's COVID research is a perfect showcase of the value that NVIDIA's technology, together with WWT's Advanced Technology Center, can provide to customers," said Bryan Thomas, senior vice president, public sector sales, World Wide Technology. "It really showcases what WWT is capable of and the comprehensive level of integration we provide our customers."  

Going forward: Impeccable Pipeline

Ramanathan's team, in collaboration with researchers at Rutgers University/Brookhaven National Laboratory (BNL) led by Shantenu Jha, Chair of BNL's Computation and Data-Driven Discovery, Computational Science Initiative, has come up with a strategy to build an AI/ML platform to identify leads on new discoveries, an "Impeccable Pipeline" which President Biden called for in his 2022 budget request. It would be "something that allows us to build artificial intelligence and machine learning workflows throughout the ecosystem of things we need to do," Ramanathan said. That would include experiments, simulations and combining large data sets to reveal novel insights into the fundamental mechanisms of how viruses work, and applying those insights on the translational side, where researchers can discover new molecules.

"That is the place where we are pushing the envelope of why we are doing this research," Ramanathan said.
