Data Science & Artificial Intelligence:

Unlocking new science insights

At AstraZeneca we harness data and technology to maximise time for the discovery and delivery of potential new medicines. Right now, we are embedding data science and Artificial Intelligence (AI) across our R&D to enable our scientists to push the boundaries of science to deliver life-changing medicines.

Data science and AI have the potential to transform the way we discover and develop new medicines – turning yesterday’s science fiction into today’s reality with the aim of enabling the translation of innovative science into life-changing medicines.

Jim Weatherall, Vice President, Data Science & AI, R&D

Today we are generating and have access to more data than ever before. In fact, more data has been created in the past two years than in the entire previous history of the human race. But the value of this data can only be realised if we are able to analyse, interpret and apply it. Right across our R&D, we are using AI to help us decipher this wealth of information with the aim of:

•  Gaining a better understanding of the diseases we want to treat

•  Identifying new targets for novel medicines

•  Recruiting for and designing better clinical trials

•  Driving personalised medicine strategies

•  Speeding up the way we design, develop and make new drugs

Our scientists are using AI to help redefine medical science in the quest for new and better ways to discover, test and accelerate the potential medicines of tomorrow. The following sections tell just some of the stories behind how data science and AI are starting to make a difference to our R&D efforts.


Turning data into knowledge

We are determined to advance our fundamental understanding of diseases such as cancer, respiratory disease and heart, kidney and metabolic diseases. By learning what causes or drives these diseases, we hope to find new ways to treat, prevent or even cure them.

Through data science and AI, we are uncovering new biological insights with the aim of increasing our R&D productivity. For example, we are using knowledge graphs – networks of contextualised scientific facts about genes, proteins, diseases and compounds, and the relationships between them – to give scientists new insights and help overcome cognitive bias. In 2021, the first two AI-generated drug targets, in chronic kidney disease and idiopathic pulmonary fibrosis, entered our portfolio through our collaboration with BenevolentAI.
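As a rough illustration of the knowledge-graph idea, the minimal Python sketch below stores facts as (subject, relation, object) triples and searches for chains of relationships linking a compound to a disease. It is illustrative only: the entity names (`GENE_A`, `PROTEIN_A`, `COMPOUND_1`, `DISEASE_X` and so on), the relation labels and the helper functions `build_graph` and `find_paths` are hypothetical placeholders, not real biological facts and not a description of any production system.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) facts. All names are placeholders,
# not real biology.
TRIPLES = [
    ("GENE_A", "encodes", "PROTEIN_A"),
    ("PROTEIN_A", "associated_with", "DISEASE_X"),
    ("COMPOUND_1", "inhibits", "PROTEIN_A"),
    ("GENE_B", "encodes", "PROTEIN_B"),
    ("PROTEIN_B", "associated_with", "DISEASE_Y"),
]


def build_graph(triples):
    """Index outgoing edges by subject node."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def find_paths(graph, start, goal, max_hops=3):
    """Return every chain of edges from start to goal using at most max_hops edges."""
    paths = []
    stack = [(start, [])]
    while stack:
        node, path = stack.pop()
        if node == goal and path:
            paths.append(path)
            continue
        if len(path) >= max_hops:
            continue
        for relation, neighbour in graph.get(node, []):
            stack.append((neighbour, path + [(node, relation, neighbour)]))
    return paths


graph = build_graph(TRIPLES)
paths = find_paths(graph, "COMPOUND_1", "DISEASE_X")
for path in paths:
    print(" -> ".join(f"{s} [{r}] {o}" for s, r, o in path))
```

In this toy graph, the only link between `COMPOUND_1` and `DISEASE_X` runs through `PROTEIN_A`. Surfacing indirect connections of this kind, across far larger and richer graphs, is one way such a structure can suggest hypotheses a scientist might not otherwise consider.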

Data science and AI can also help us reveal the secrets of disease in our genes. Our Centre for Genomics Research is working to analyse up to two million genomes by 2026. Using the gene-editing power of CRISPR to delete every gene in the genome and ask what role each gene plays in biology, AstraZeneca scientists are also peering inside our genetic make-up to help us better understand disease.

But the huge scale of the genome means these experiments produce a colossal amount of data. Data science and AI are at work helping us analyse and interpret the data more quickly and accurately.

Predicting what molecules to make next and how to make them

Through AI, we have the potential to transform medicinal chemistry, augmenting traditional drug design with sophisticated computational methods to predict what molecules to make next and how to make them.

Werngard Czechtizky, Head of Medicinal Chemistry, Research and Early Development, Respiratory & Immunology, BioPharmaceuticals R&D

We are exploring the use of AI to help us discover new medicines. We believe it has great potential to increase the quality and reduce the time it takes to discover a potential drug candidate.

This currently takes several years of detailed scientific research, synthesising and testing thousands of molecules to achieve the right drug properties.

AI is transforming this lengthy process – enabling us to rapidly generate novel ideas for molecules and to make and rank these ideas using predictions based on large data sets now available to us.

Having identified promising molecules, the next step is to synthesise them in the lab. AI is starting to help here too – the science of synthesis prediction is rapidly evolving, and we will soon be able to use AI to help us deduce the best way to make a molecule in the shortest time.

We see AI as a key component in the chemistry lab of tomorrow – not only for discovering and making new drugs but for controlling automation to speed up the repeated cycles of generating, analysing and testing high-quality compounds.

Using AI for fast, accurate image analysis

Every week, our pathologists analyse hundreds of tissue samples from our research studies. They check them for disease and for biomarkers that may indicate which patients are most likely to respond to our medicines. This work is very time-consuming, which is why we are training AI systems to help pathologists analyse samples accurately and with less effort. This has the potential to cut analysis time by over 30%.

For one of our AI systems, we implemented an approach inspired by how some self-driving cars understand their environment. We trained the AI system to score tumour cells and immune cells for a biomarker called PD-L1, which has the potential to help inform immunotherapy-based treatment decisions for bladder cancer.

Our AI system looks at thousands of images from tissue samples, methodically checking each one for PD-L1. It saves our pathologists time and is especially useful in difficult cases.

Accelerating clinical trials through data science and AI

Randomised Clinical Trials (RCTs) are currently the method of choice for pharma when it comes to assessing potential new medicines. However, published data shows they have become more expensive and complex over time.

Advances in data science can help us re-think clinical trials, enhancing current practice and finding new ways to discover and develop potential new medicines.

For example, the rapid adoption of high-quality Electronic Health Records (EHRs) has created a vast, rich and highly relevant data source with huge potential to improve clinical trial implementation.

Federated EHR technology is unlocking new opportunities to enhance clinical research and transform the way we do clinical trials. The technology has the potential to refine or replace many clinical trial processes including patient identification, selection, trial conduct, and capture of data.

We are also employing AI and machine learning tools to glean more value from clinical trial data. Historically, we have been proficient in using data from trials to analyse, interpret and report on the safety and efficacy of the trial drug. But we want to maximise the value of the data we have already collected.

Machine learning and AI are also being applied to event adjudication in clinical trials, helping us optimise the process at different stages with the aim of reducing the overall time it takes.

Data re-use can help us better design our drug development strategies and programmes. This can help us design smarter trials and strengthen our scientific discoveries and, ultimately, has the potential to help patients receive the best treatments.

Building the right data backbone

Today we are generating and have access to more data than ever before. Data and analytics have the potential to transform our business, but the true value of scientific data can only be realised if it is “FAIR” – Findable, Accessible, Interoperable and Reusable.

AstraZeneca’s R&D and IT groups are working closely together to create an industry-leading enterprise data and AI architecture. This will help us answer key business questions and enhance our ability to harness new tools and technologies, such as AI and machine learning, both now and in the future.

We are also mobilising a team of data scientists, bioinformaticians, data engineers and machine learning experts from across the company to ensure we are collecting, organising and using the right data in the best way.

AstraZeneca’s principles for ethical data and AI


Rapid developments in AI technology have brought us into uncharted territory, and companies and regulators must work together to meet the new challenges this poses. Our principles will empower us and our partners to navigate this new environment safely and effectively. By encouraging innovation and evolution while maintaining our values, they provide a long-term ethical foundation to uphold our AI governance.

During 2020, we engaged a diverse range of experts both inside and outside AstraZeneca to develop principles for ethical data and AI, aligned with our Code of ethics and values. These values work for patients and employees and enable AstraZeneca to make a positive contribution to society.

Pushing the boundaries of science through AI expertise

Our leading scientists are using AI to help redefine medical science in the quest for new and better ways to discover, test and accelerate the potential medicines of tomorrow.

Collaborating to help answer big questions in AI

We know the best science doesn’t happen in isolation, which is why we work collaboratively and open doors to fuel scientific discovery.

The Cambridge Centre for AI in Medicine, a collaboration between AstraZeneca, the University of Cambridge and GSK, combines world-class academia with real-world industrial challenges and aims to develop cutting-edge AI to transform the way we discover and develop medicines. 

We are collaborating with Mila, the Quebec Artificial Intelligence Institute, to drive forward AI innovation in healthcare. The collaboration brings together leading experts in AI to scale up ideas and push the boundaries of traditional drug discovery and development using advanced computational methods.

Our collaboration with Schrödinger uses its advanced computing platform with the aim of accelerating drug discovery. By combining physics-based modelling and machine learning, we will be able to predict the affinity of large libraries of potential drug molecules and identify the highest-affinity candidates for synthesis and biological testing.

We have joined the Machine Learning in Pharmaceutical Discovery and Synthesis (MLPDS) Consortium, an academic/industry consortium with MIT and a number of other pharmaceutical companies. The goal of the consortium is to leverage the respective expertise of its members to design and deliver software tools that predict molecular properties and synthetic routes to increase the speed and efficiency of drug discovery.

We are part of a new consortium of pharmaceutical, technology and academic partners called “MELLODDY” (Machine Learning Ledger Orchestration for Drug Discovery). The project aims to leverage the world’s largest collection of small molecules with known biochemical or cellular activity to enable more accurate predictive models and increase efficiencies in drug discovery.

We are collaborating with BenevolentAI to use machine learning and AI to discover potential new drugs for chronic kidney disease (CKD) and idiopathic pulmonary fibrosis (IPF). By combining our disease area expertise and large, diverse datasets with BenevolentAI’s leading AI and machine learning capabilities, we are transforming drug discovery to bring vital new treatments to patients. Two novel AI-generated therapeutic targets, for CKD and IPF, have entered our drug development portfolio as a result of this collaboration.

AI Sweden brings together industry, academia and the public sector in a unique partnership to accelerate applied AI research and innovation through collaboration and cross-industry sharing.

Join us

If you believe in the power of what science can do, join us in our endeavour to push the boundaries of science to deliver life-changing medicines.

Collaborate with us

We know that however innovative our science, however effective our medicines and delivery, to achieve all we want to achieve, we cannot do it alone.

Veeva ID: Z4-40720
Date of preparation: December 2021