Why is our Health Data so coveted?

Medical data has become “black gold” for both researchers and cybercriminals. Journalist Coralie Lemke believes its use can lead to medical advances, provided it is handled properly.

Massive Theft

The Assistance publique-Hôpitaux de Paris (AP-HP) said Wednesday, Sept. 15, that the personal data of some 1.4 million patients was stolen in a computer attack over the summer.

Back in February, 500,000 medical records were hacked. More and more healthcare institutions are being targeted by cybercriminals. Why is our health data so coveted and who is interested in it? Should we worry about it becoming more accessible to both researchers and hackers?


People put these questions to Coralie Lemke, health journalist at Sciences et Avenir and author of Ma Santé, Mes données (Premier Parallèle).

Q: When we talk about “health data,” what exactly do we mean?

Coralie Lemke: In France, there is a very precise definition of health data formulated by the CNIL (Commission nationale de l'informatique et des libertés), the French data protection authority. It covers all information collected in the context of a treatment, a test or an examination, as well as all information about a person’s physiological and biomedical condition.

“In plain language, it is information about a person’s past, present or future health status.”

Information collected by connected devices (pedometers, smartwatches and connected scales, sleep monitoring apps, etc.) is considered health data only when cross-referenced with other medical information. So if I know from an app that I sleep three hours a night, that doesn’t say much about my health. However, if it is also known that I have a prescription for antidepressants, it can be inferred that I suffer from a mental illness. In that case, the CNIL considers this information to be health data in the strict sense.

Our health monitoring is increasingly carried out by computers.

How has digitization affected Health Data?

It has made patient care and monitoring much easier. Today, everything in the hospital and at the doctor’s office is stored on a computer. Our X-rays and MRIs are digitized, and every time you scan your health card, you generate health data.

Digitization has also greatly advanced research by allowing the analysis of “health data stacks” [records of several hundred or thousand patients]. It was difficult to access this information when it was on paper.

The downside is that this data is more vulnerable. It has become more accessible to healthcare providers, but also to a range of stakeholders who are interested in it.

Why is this Health Data so sought after today?

First, it is important to remember that a single piece of information is of little interest to anyone: knowing a person’s blood type is of little use. On the other hand, aggregate health data on several thousand or millions of individuals is considered real “black gold” because studying it enables advances in research.

This information is of interest to three types of players. The first is pharmaceutical companies, which must go through numerous phases and clinical trials to develop therapies. This process is very time-consuming and costly, but goes much faster when you start analyzing batches of data. To obtain these data, laboratories turn to “data brokers” who specialize in data research. These brokers are tasked with contacting and partnering with healthcare facilities to obtain anonymized data.

The second type of player is the GAFAM (Google, Apple, Facebook, Amazon, and Microsoft), which are interested in this field for commercial reasons. They offer their technological expertise to universities or research centers looking for algorithms to process this data. A study has shown that the artificial intelligence developed by Google is more accurate than radiologists at detecting breast cancer.

The last type of player is, of course, cybercriminals. Their goal is to hack into healthcare facilities to retrieve health data and then sell it on the dark web or extort a ransom. In October 2020, at least 2,000 Finnish patients received an email threatening to publish details of their psychological treatment on the Internet unless they paid several hundred euros, after data from a network of psychotherapy centers was hacked.

So our healthcare data has become a prime target for cybercriminals?

Yes, and this phenomenon has been exacerbated by the Covid-19 pandemic: between February and March 2020, there was a 475% increase in attacks on hospitals in France, according to cybersecurity company Bitdefender. Some cybercriminals had promised a truce at the start of the health crisis, but it didn’t last long: they soon realized that healthcare facilities were even more vulnerable during this time.

“In total, there were 192 cyberattacks on hospitals in France in 2020, up from 54 the year before.”

Healthcare facilities are particularly targeted by cybercriminals because their IT systems are vulnerable. The equipment is often outdated and security software is not kept up to date. As a result, they are easy targets, and the consequences can be catastrophic. In 2017, the WannaCry ransomware [malicious software that blocks access to files until a ransom is paid] crippled the UK’s National Health Service (NHS). As a result, millions of medical appointments and surgeries had to be canceled, which meant the loss of life for some patients.

Health data represents a huge financial windfall for these cybercriminals. EY estimates that the 55 million medical records of British citizens are worth £9.6 billion, or more than €11 billion. The value of a single record can rise to €5,600 if it includes the sequencing of that person’s DNA. (See the attachment at the end.)

Why is Genetic Data particularly sought after?

Not all health data is equally valuable: genetics is the holy grail. Our DNA is the key to our identity and contains crucial information about our appearance, our predisposition to certain diseases, etc. That’s why this data is so valuable.

The companies that offer genetic saliva tests to the public to find out more about their ancestry understand this. Most people don’t read the fine print that says this data can be resold. In 2018, 23andMe signed a $300 million contract with the pharmaceutical company GSK for 5 million anonymized genetic profiles. The goal of this partnership is to work on developing treatments for Parkinson’s disease, but it raises questions of security and privacy.

How are Health Data protected in France?

They are subject to the General Data Protection Regulation (GDPR), which has governed the handling of personal data in France and Europe since 2018. The explicit consent of the data subject is required for the collection and processing of health data. The GDPR also prohibits the transfer of data outside the European Union. These are protections that do not exist in other countries, such as the United States, and they prevent Google, for example, from collecting data about our medical appointments from our emails and then reselling it to third parties.

Is it possible to strengthen these protective measures?

As individuals, there’s not much we can do. You can try not to put too much personal information online, but that’s just a drop in the ocean of data. In today’s world, it’s especially complicated. For example, an estimated two-thirds of French people have an account with Doctolib, which is logical because it’s a handy tool for making medical appointments. And unless we are willing to forgo reimbursement, we have to present our “carte vitale” (and thus provide data about our health) each time we receive treatment.

“Therefore, to protect our health data, we need a system of comprehensive and robust laws governing the collection and use of that data, like the GDPR.”

But these laws must be enforced. GDPR complaints are all handled by the Irish counterpart of the CNIL, which regulates the GAFAM at the European level. However, that body receives so many complaints that 99.93% of them are not dealt with. This is extremely discouraging. This is where the protection of health data can still be improved.

Another example of the vigilance we need to show on these issues is the Health Data Hub. At the end of 2019, the French government decided to create a huge library of health data. The idea is to bring together all the data that already exists – hospital data, health insurance data – on a single platform to allow research teams to access it and find new therapeutic pathways or new treatments.

When it came to finding an approved host for health data that met certain technological and security requirements for managing this database, which is one of the largest in the world, Microsoft was chosen. The problem is that it is a company subject to U.S. law. In particular, there is a law in the United States, the CLOUD Act, which allows the transfer of data from foreign subsidiaries of a corporation in the context of legal proceedings. In short, Microsoft can theoretically retrieve French citizens’ health data and transfer it across the Atlantic, which clearly violates the GDPR. France is in the process of reworking the project, and it is likely that the hosting will be awarded to another provider by the end of 2022.

So should we be worried that our Health Data is increasingly being exploited?

Health data is quite paradoxical: it is very intimate and belongs in the private sphere, but when brought together it can serve the common good. Medicine is already being revolutionized by the use of health data. The U.S. Food and Drug Administration (FDA) has approved the use of artificial intelligence to diagnose diabetic retinopathy, a leading cause of blindness in adults. All it takes is a photo to detect it, which has been made possible by analyzing health data.

Studying this data can also help us better understand why certain cancers respond to certain treatments and others don’t, advance research into neurodegenerative diseases that are still poorly understood, such as Alzheimer’s, or even find treatments for rare diseases that affect only a few people in each country. Without digitizing and studying this medical information, it would be impossible to gather information on a few thousand patients scattered around the world. So the use of health data can be quite useful, provided it is appropriately regulated by law.


Realising the value of Health Care Data