
The fight against coronavirus will test the potential of data-driven research at a new level


28 April 2020


Micro-organisms today represent one of the greatest threats to the world's population and are on the same level as climate disasters and nuclear war. A crucial difference from those threats is that the entire globe can come together to take intensive action against a common enemy when we are hit by viruses, as has now happened with the coronavirus.

The fight against the virus provides a global and very concrete application for all the advances in artificial intelligence we have seen over the last 10 years.

The aim has changed from running a scientific race between humans and algorithms performing the same tasks to accelerating solutions to a global challenge. We are at war with a virus that looks like a mixture of flu and pneumonia - and can kill hundreds of thousands of people. It spreads widely through contact and through droplets from sneezes and coughs, people can be infectious without showing symptoms, and no medical treatment is available yet.

We can limit the spread of infection through our behaviour and by acting together - but we can only effectively stop the spread when we find a medical treatment.

A race against the virus

Common precautionary principles in the development of medical treatment prescribe a course that runs from laboratory tests to animal testing, then from a small test group of humans to a larger one - and finally to thousands of people - before a medical treatment can be approved for distribution country by country. With the cultivation of samples, tests, observations, documentation, logistics and approvals, it usually takes more than 10 years to pass these three phases before a new medical treatment can be widely offered.

Only three months passed between the discovery of the coronavirus epidemic in China and the shutdown of society in Denmark. Now Danish businesses are counting the days to each turn of the month and the new costs it brings - and to the day the government lets work and trade resume. We need to be able to identify and approve a medical treatment faster than the traditional approach allows.

Therefore, we have started a race against the coronavirus to decode and suppress it. One of our major challenges is that microbiological analyses in living organisms have an overwhelming number of cause and effect combinations. It is not humanly possible to create an overview of the volume of ongoing ideas, experiments, theories and results being studied around the globe.

We need support in the form of simulations and an overview that allow us to follow and build on each other's ideas in such a large and complex research area and community.

The largest technology providers, academic institutions and government agencies are now gathering to provide that support and make scientific progress through advanced data analysis that can complement or accelerate laboratory testing and clinical trials. The global collaboration to meet that goal is on an unprecedented scale - far greater than anything a single global pharmaceutical company could drive through its own investments in the field.


AI2 (the Allen Institute for AI), Microsoft, the Chan Zuckerberg Initiative and others have responded to a request from the White House and collected publications from around the world in one format - CORD-19. The dataset can be processed with artificial intelligence techniques such as natural language processing, which can continuously scan new and existing publications for related theories, results, experiments and so on.

Google is showcasing the CORD-19 dataset in one of its Kaggle competitions. Kaggle is an open online platform, known worldwide among data scientists, that hosts data science competitions on selected datasets. CORD-19 has grown to 45,000 articles, which AI2 has processed so that they can be accessed in a machine-readable format. The competition seeks answers to 10 pre-defined tasks with sub-questions, so there is very good potential for synergies across the analyses. The tasks cover incubation time, vaccine options, medical treatment and more. "We're providing this dataset to 4.3 million data scientists in the hope that the world's AI community can help find answers around COVID-19," Anthony Goldbloom, Kaggle's co-founder and CEO, said in a recent press release.

In just over a month - from March 15 to April 20 - the dataset has been viewed 1.32 million times and downloaded 54,200 times, and 1,143 different data scientists have produced analysis results, so far mainly from the United States and India.

Many initiatives

The first deadline in the global online data science competition to answer some of the most pressing questions about the coronavirus has already produced useful insights. These include insights into the reproduction rate of the virus, the incubation time from infection to symptoms by age group and gender, and risk factors. There are many more data science initiatives around the world in the fight against the coronavirus, and they include access to the most powerful computers in the world.

The EU has offered assistance from its three supercomputing centres in Italy, Spain and Germany. The United States has established "The COVID-19 High-Performance Computing Consortium" with Amazon, Microsoft, Google, IBM, NASA, the White House and others. Resources in the consortium include access to the world's largest supercomputer - Oak Ridge National Laboratory's 200-petaflop machine. At home, the Novo Nordisk Foundation has just included the project "Applied Artificial Intelligence for real-time risk assessment of patients with COVID-19, Rigshospitalet" among those funded from its emergency coronavirus pool.

All the online communication reflects a desire to share and to help each other towards the rapid development of vaccines and medical treatment. This means the potential for accelerating drug development through artificial intelligence and advanced data analysis will be tested at a new level. The effort against the coronavirus will show what is humanly and physically possible today.

Medical scientists would not normally have access to this level of resources, but there is a lot to learn here about the future opportunities of data-driven medical development in combating the viruses we can expect to emerge.

Christian Rehfeld is an AI expert from PA Consulting
