Discussion with Janssen, Pharmaceutical companies of Johnson and Johnson

The Capgemini Research Institute spoke to Dr. Anastasia Christianson, Vice President, R&D Business Technology, Janssen, Pharmaceutical companies of Johnson and Johnson, about the data-driven transformation of pharma R&D and the value of data-sharing ecosystems.

Pharmaceutical R&D is powered by data

Can you tell us how you use data in the drug discovery and development process?

We use data in a number of ways at each step of the drug discovery and development process. The process can take up to ten years to bring a drug to patients, so anything we can do along the way to save time helps get the therapy to patients faster. Making the best use of the data we generate along the way can definitely accelerate the process. In a typical drug discovery experiment, cells representing a specific disease (such as lung cancer) are exposed to a variety of compounds, and data are collected to evaluate how the cells respond to each compound. For example, a microscopy snapshot might be taken of each reaction that follows the initial exposure of cells to a compound. One such experiment might generate half a million snapshots. Artificial Intelligence (AI) technologies help us to sort the data from such experiments with the goal of finding a compound that could create the desired reaction for the disease we are studying. Further, machine learning algorithms help us to predict how other types of cells are likely to react to the same compound(s), giving us a leg up when starting a new study. This AI-based method can be 250 times more efficient than the traditional method of drug discovery.

In clinical trials, data and AI are being used to design better trials. For instance, researchers today have access to hundreds of millions of real-world data points from a large number of diverse sources that provide insights into how or when people use treatments. These data points can, in some cases, be as useful to researchers as trial data. They could be used, for example, to select patients who are most likely to respond to a particular therapy, thus accelerating trials. Digital health technologies and digital biomarkers are also accelerating clinical trials by enabling faster collection of data during a trial for patient selection and for monitoring response and outcome.

While there are huge benefits from making better use of our data across the entire R&D process, right now, my team and I are focusing on accelerating drug discovery and drug development programs. 

Strengthening the data-sharing ecosystems

Where do you see the benefit of data-sharing ecosystems?

We partner with a number of groups, such as the Innovative Medicines Initiative (IMI) in Europe, to ensure that we’re sharing data in both directions. The major driver behind these data-sharing practices is to make informed, data-driven decisions. The more data we have, the better the questions we can ask, and thus the more accurate the answers. While we have a lot of experts, we can’t be the experts on everything. When it comes to specific diseases, for example, others may have more data or additional knowledge that complements our own. We partner with experts as needed to ensure that we benefit from their experience and data.

Conversely, we also aim to share our data that might be beneficial elsewhere. For example, a couple of years ago, we signed a first-of-its-kind agreement with Yale School of Medicine’s Open Data Access Project (YODA) to facilitate the sharing of clinical trial data – aiming to enhance public health and advance science and medicine. The project has been hailed as representing a new standard for responsible, independent clinical data sharing.  

Approach to data-driven R&D

How can organizations advance towards data-driven R&D?

Being a data-driven organization means you have to be a learning organization and ensure that your employees are data-focused, utilizing advanced data analytics capabilities like machine learning and AI, and employing decision frameworks and decision memory capabilities. Your data need to be Findable, Accessible, Interoperable, and Reusable (following FAIR principles), and your scientists and leaders must be well versed in the tools and capabilities that allow them to make the best use of data. This includes all the steps from data generation to decision-making.

Your data provide important information, knowledge, and insights; the next step is to use those insights, with all the information and knowledge around them, to make the best-informed decision, track decisions, and learn from them. Deciding on the right target, drug candidate, disease indication, or biomarker(s) to choose or measure requires a certain level of risk-taking that depends on the available information. You might have the option to measure one, two, or three biomarkers in a given trial to monitor trial efficacy and/or outcome. Choosing how many and which biomarkers to use requires balancing scientific and medical knowledge with trial feasibility, cost, and speed. You make the best decision based on the data available, and you want to track the decision and its outcomes so that you can apply what you learned the next time you face a similar scenario. Right now, this process of documenting learnings from each trial or project is very manual and is not always easy to track, but we are looking at ways to digitally capture decisions and associated information for learning and reuse.

Future of data-driven R&D

What changes do you see in data-driven R&D in the future?

There will be more use of technology in R&D and healthcare, such as 5G connectivity to help underserved areas. Real-time monitoring through sensing technologies, digital biomarkers, and edge computing, as well as the use of AI at scale, will lead to more and earlier predictions in discovery and in the clinic. Quantum computing is another emerging area, where qubits can be useful in genomic analysis, protein structure prediction, or accelerated diagnosis, to name a few. For example, current computing capabilities may require several weeks to simulate the formation of protein complexes or protein-to-protein or protein-to-ligand interactions. Quantum computing is expected to significantly shorten this time.

If we look at discovery specifically, by understanding disease at the molecular, phenotypic, and patient levels, and by feeding these data into our “discovery engine”, we can enable the best-informed drug design decisions. At the same time, through robotics, intelligent automation, and sensors for monitoring, we can run more experiments and collect more data in parallel in the labs.