Cancer was the second leading cause of death in the United States in 2020, after heart disease [1]. Although its mortality has been falling for decades (from 196.5 to 144.1 deaths per 100,000 population between 2001 and 2020 [2]), there is still plenty of room for improvement.

For example, lung cancer alone was responsible for two million deaths worldwide in 2019. About 75% of those who have it die within five years of diagnosis. But the situation improves drastically if the disease is found early: if tumours are still small, the mortality rate falls to as low as 33%. Prevention and screening are therefore fundamental tools for fighting this disease, and AI may help us scale up the latter and make it more efficient.

In the article that follows, we look at how lung cancer screening can be improved through AI and what role Data Lake plays in making it happen.


AI and medicine

Artificial intelligence (AI) is the simulation of human intelligence by machines; in practice, it is an advanced mathematical algorithm that learns from data. Once represented as evil robots in popular movies, AI now touches many aspects of our lives, including healthcare. Many AI algorithms are in fact already being used to support medical professionals in clinical settings and in ongoing research, and the results have been very promising so far.

Currently, the most common role for AI in medical settings is imaging analysis. AI tools are being used to analyse CT scans, X-rays, MRIs and other images to find things, like tumours, that radiologists might miss. For example, in an experiment run on mammograms to spot breast cancer at earlier stages, an AI system achieved an absolute reduction of 5.7% and 1.2% (USA and UK) in false positives and of 9.4% and 2.7% in false negatives compared with human experts [3].
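To make the terms concrete, here is a minimal sketch of how false-positive and false-negative rates, and the absolute reduction between two readers, can be computed. The counts below are hypothetical and made up purely for illustration; they are not the figures from the mammography study.

```python
def error_rates(tp, fp, tn, fn):
    """Return (false_positive_rate, false_negative_rate) as fractions.

    tp/fn count patients who really have cancer (caught / missed);
    tn/fp count healthy patients (correctly cleared / wrongly flagged).
    """
    false_positive_rate = fp / (fp + tn)
    false_negative_rate = fn / (fn + tp)
    return false_positive_rate, false_negative_rate


def absolute_reduction(baseline_rate, new_rate):
    """Absolute reduction between two rates, in percentage points."""
    return (baseline_rate - new_rate) * 100


# Hypothetical counts, for illustration only.
human_fpr, human_fnr = error_rates(tp=80, fp=100, tn=900, fn=20)
ai_fpr, ai_fnr = error_rates(tp=90, fp=50, tn=950, fn=10)

fp_reduction = absolute_reduction(human_fpr, ai_fpr)
fn_reduction = absolute_reduction(human_fnr, ai_fnr)
print(f"False positives: {fp_reduction:.1f} percentage points lower")
print(f"False negatives: {fn_reduction:.1f} percentage points lower")
```

An absolute reduction is a difference between two rates, so it should not be confused with a relative (proportional) reduction.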

Recent AI systems are based on deep learning, a subset of machine learning that allows algorithms to extract higher-level features from data. With deep learning, rather than looking for tumour features defined in advance by a programmer, the AIs figure out for themselves what a tumour is from real-world examples.

On this matter, Mozziyar Etemadi, a biomedical engineer at Northwestern University’s Feinberg School of Medicine in Chicago, Illinois, has partnered with Google, Northwestern University and other institutions to develop an AI capable of analysing CT scans to identify lung cancer.

Etemadi and his team trained the AI using a database of more than 40,000 CT scans. During this training period, the scientists told the computer which early-stage scans turned out to contain cancerous spots and which did not. Over time, the computer learnt which image properties separated malignant spots from benign ones, and it became better and better at flagging early signs of cancer.
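The idea of learning from labelled examples can be illustrated with a deliberately tiny sketch. This is not the team's method: their system is a deep neural network trained on full CT volumes, whereas the code below swaps in a simple logistic regression on two invented numeric features (a scaled nodule size and an irregularity score). All data and names here are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that a scan is malignant."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(examples, labels, epochs=2000, learning_rate=0.1):
    """Fit a logistic regression with stochastic gradient descent."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(examples, labels):
            error = predict_proba(weights, bias, features) - label
            # Nudge each parameter against the gradient of the log loss.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical toy data: [size, irregularity], both scaled to 0..1.
scans = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
         [0.7, 0.8], [0.8, 0.6], [0.9, 0.9]]
labels = [0, 0, 0, 1, 1, 1]  # 0 = benign, 1 = malignant

weights, bias = train(scans, labels)
print(predict_proba(weights, bias, [0.85, 0.85]))  # should be above 0.5
print(predict_proba(weights, bias, [0.15, 0.15]))  # should be below 0.5
```

The loop mirrors the description above: the model is shown a scan, told the true outcome, and adjusted slightly so that its next guess is closer to the truth.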

After years of research, the team achieved a result that can save human lives on the scale of millions: the computer found tumours in scans more accurately than expert radiologists.


The importance of screening

Studies at the University of California, Los Angeles, and elsewhere show that regular screening of at-risk populations can detect many cases of lung cancer much earlier, reducing mortality by 20–30%. The US Preventive Services Task Force, a volunteer group that makes recommendations for clinical preventative services, now even recommends annual CT screening in groups at high risk of lung cancer, such as smokers.

However, there are not enough radiologists to keep up with these scans. The limits of human vision also make it likely that radiologists will overlook tiny malignant lesions: up to 35% of lung nodules are missed at the initial screening [4]. Using AI systems is therefore the only feasible way to reduce the burden on radiologists and cover the increase in demand for scans, as well as to detect lung spots invisible to the naked eye.


Why Data Lake?

With deep learning, the more scans an AI analyses, the more reliably it can spot tumours. Some deep-learning systems even give clinicians an estimate of how confident they are in their judgement, which can further inform clinical decision-making.
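As a rough sketch of what such a confidence estimate can look like, the snippet below turns a model's raw output scores into probabilities with a softmax and marks low-confidence cases for human review. The scores, the 0.9 threshold and the `triage` helper are assumptions chosen for illustration, not part of any specific clinical system.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max avoids overflow
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def triage(scores, labels=("benign", "malignant"), threshold=0.9):
    """Return (label, confidence, note) for one scan's raw scores."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    note = "high confidence" if confidence >= threshold else "needs review"
    return labels[best], confidence, note


# Hypothetical raw scores for two scans.
confident_case = triage([0.0, 4.0])
uncertain_case = triage([0.2, 0.0])
print(confident_case)
print(uncertain_case)
```

Softmax outputs are not automatically well calibrated, so in practice a confidence figure like this would need to be validated against real outcomes before it informs a decision.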

Therefore, how well an AI performs depends directly on the amount of data it has been fed. In a world where access to medical data is too often made complicated or even impossible by unresponsive institutions, this is the main obstacle to having AIs capable of preventing cancer, as well as countless other diseases, and saving millions of lives every year.

Data Lake is creating a global medical data donation system on the Polygon blockchain. With Data Lake, patients only need to consent to the use of their medical data for us to be able to extract and anonymize them. Then, once there is demand for data with certain parameters, we ask the trust entity to approve the request and execute the payment, which is also registered on the blockchain to ensure transparency. Last but not least, the patient and the other stakeholders benefit from the transaction by receiving either $LAKE tokens or fiat currency. For more detailed information, check our whitepaper.



With AI taking care of the heavier burden, lung cancer screening programs can be optimized to prevent many deaths at the lowest cost thanks to increased automation.

Because deep-learning systems can work across different kinds of data sets, such as CT scans, genetic sequences and treatment histories, AI can also allow radiologists to combine screening results with genetic data to create more customized treatment plans.

However, for this (and more!) to happen, and for healthcare to take the next step, embrace new technologies and reach an unprecedented degree of efficiency, medical data have to be unlocked, and Data Lake is the main candidate to remove this obstacle.

Data Lake is close to launching its product. Join us on our social media to learn more!








