Data Science is being used extensively to stem COVID-19
The novel coronavirus pandemic has shaken, and is without doubt in the process of transforming, much of the world as we know it
May 20, 2020
With more than 3.5 million people infected and more than 250 000 deaths worldwide in four months, there is most likely not a soul on earth who has not heard of the SARS-CoV-2 virus, probably not by that name but simply as the coronavirus or something similar, or who has not been affected by it in some way, either directly or indirectly through the lockdowns and severe restrictions on daily life.
Authorities the world over have come to rely on DATA SCIENCE to find ways to stem the infection and death rates. This is not to say that medical science doesn’t play its part; in fact it is the medics who are leading the fight against the virus. They are doing so together with data scientists, because it is the latter group who are adept at applying mathematical and computational tools to ‘play’ the data and provide incisive insights into the measures that the medics have taken or may take. As the pool of data from the entire world grows each day, data scientists are discovering various ways in which the infection rate may be reduced, including ways in which individual behaviour can stem the ‘tsunami’. While face masks cannot stop the spread of the virus, DATA SCIENCE found that where they were widely used the infection rate was reduced. DATA SCIENCE found that people over 60 were generally more likely to die from the disease, and that fewer women than men perish from it. It was DATA SCIENCE that determined the close correlation between co-morbidity (underlying ailments in patients) and COVID-19 deaths. DATA SCIENCE also confirmed the effectiveness (or lack of it) of certain drugs administered; and the list of contributions by DATA SCIENCE to the fight against the pandemic goes on.
One thing, though, has become stark as the virulent disease has spread throughout the world: the massive importance of Data Science and Analytics in helping to stem the tsunami-like flow of the pandemic.
There is a fundamental premise upon which medical science determines the efficacy of medical drugs and treatments, and that premise is statistics. It takes several years for new drugs and vaccines to be approved by the responsible medical authority of a country (such as the FDA in the US). Medical drugs go through several series and layers of testing; in the final analysis the proven methodology of test and control groups is used: the test group is given the drug, the control group is given what is called a placebo, and the two are compared to determine which group’s condition improves. Only if a significant number of people in the test group respond positively, while a significant number in the control group do not get better because they received only a placebo, does a new drug make it through to the final administrative stages for approval. The word ‘significant’ here is significant: it is used in its statistical sense, so in the real final analysis it is statistics that is the last threshold for acceptance. What about the random selection of test participants? This is also based on a mathematical concept, that of randomness: participants are selected and assigned at random to avoid inadvertent bias in the results. Yet another mathematical concept employed in the genesis of medical drugs is the level of confidence for acceptance of the test, i.e. the percentage of all possible samples whose interval estimates are expected to include the true population parameter. The close connection between the medical sciences and the mathematical sciences has a strong and long history; is it any wonder, then, that during this unprecedented viral attack these two sciences are together devising the strategy to arrest the ‘corona tsunami’?
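The ideas in the paragraph above (random assignment, statistical significance and a confidence level) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the recovery counts are invented for the example and are not trial data, and a simple two-proportion z-test with a Wald interval stands in for the far more elaborate analyses that real drug trials use.

```python
import math
import random


def assign_groups(participants, seed=0):
    """Randomly split participants into a test group and a control group."""
    shuffled = list(participants)
    random.Random(seed).shuffle(shuffled)  # random order avoids selection bias
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_z_test(improved_test, n_test, improved_control, n_control):
    """Two-sided z-test for a difference between two improvement rates.

    Returns (z, p_value). The pooled rate is used for the standard error,
    which corresponds to the null hypothesis that drug and placebo perform
    equally. Assumes the pooled rate is strictly between 0 and 1.
    """
    p_test = improved_test / n_test
    p_control = improved_control / n_control
    pooled = (improved_test + improved_control) / (n_test + n_control)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    z = (p_test - p_control) / std_err
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value


def difference_interval(improved_test, n_test, improved_control, n_control,
                        z_crit=1.96):
    """Approximate confidence interval for the difference in improvement rates.

    z_crit=1.96 corresponds to a 95% confidence level (Wald interval).
    """
    p_test = improved_test / n_test
    p_control = improved_control / n_control
    std_err = math.sqrt(p_test * (1 - p_test) / n_test
                        + p_control * (1 - p_control) / n_control)
    diff = p_test - p_control
    return diff - z_crit * std_err, diff + z_crit * std_err


# Hypothetical example: 200 volunteers, split at random into two groups.
test_group, control_group = assign_groups(range(200), seed=42)

# Invented outcome counts, purely for illustration.
z, p_value = two_proportion_z_test(60, len(test_group), 40, len(control_group))
low, high = difference_interval(60, len(test_group), 40, len(control_group))

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"95% interval for the difference: ({low:.3f}, {high:.3f})")
print("significant at the 5% level" if p_value < 0.05 else "not significant")
```

With these invented counts the p-value falls below the conventional 5% threshold, which is what ‘significant’ means in the statistical sense used above; with smaller groups the same difference in rates might not pass.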
There are several reliable sources of data on COVID-19 that are updated daily and available for free, such as Johns Hopkins University in Baltimore, USA, the WHO (World Health Organisation), and national public health bodies such as the Centers for Disease Control and Prevention (CDC) in the US and the Indian Council of Medical Research (ICMR), and so forth.
As new data is added each day and DATA SCIENCE algorithms are applied to it, new factors and insights surrounding infection, disease and death emerge that inform authorities on the measures to be taken to stem the flow. It must be understood that lockdowns and other restrictive measures have been imposed by authorities after analysing the data. DATA SCIENCE techniques such as predictive and prescriptive modelling, hypothesis testing, correlation coefficient calculations, regression analyses, comparisons of sample and population means, distribution models, etc., are in use to gain greater insights into SARS-CoV-2 and COVID-19 (the former being the name of this damnable virus and the latter the name of the resultant disease). The use of operations research and quantitative techniques goes back to World War II; however, the advent of DATA SCIENCE (a recent branch of information technology) has popularised these number-crunching techniques outside of textbooks and universities. DATA SCIENCE has now become the de facto ‘home’ for many of the mathematical and statistical techniques; they are being used to transform the bottom lines of companies and organisations and the handling of large events (such as the current virus crisis), bringing about greater effectiveness and efficiency wherever DATA SCIENCE is applied.
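Two of the techniques named above, the correlation coefficient and regression analysis, can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular authority analyses its data: the helper names are hypothetical and the daily case counts are invented for the example. Fitting a straight line to the logarithm of the counts gives a rough growth rate, from which a doubling time follows.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented cumulative case counts for five consecutive days (not real data).
days = [1, 2, 3, 4, 5]
cases = [100, 141, 200, 283, 400]

# Exponential growth becomes a straight line on a logarithmic scale.
log_cases = [math.log(c) for c in cases]

r = pearson_r(days, log_cases)
growth_rate, _ = fit_line(days, log_cases)
doubling_time = math.log(2) / growth_rate

print(f"correlation between day and log(cases): {r:.4f}")
print(f"estimated daily growth rate: {growth_rate:.3f}")
print(f"estimated doubling time: {doubling_time:.1f} days")
```

For these invented numbers the estimated doubling time comes out close to two days. A strong correlation here only says that the counts follow the fitted trend closely; it says nothing by itself about what is causing the trend.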
Besides the use of algorithms and quantitative techniques, the fight against the coronavirus has also seen technologies such as AI and bots being used. It has been widely reported that during the height of the epidemic in China, face recognition software was extensively used to track the frequency of movement of people, drones were used to remotely sanitise places and broadcast warning messages, and various other innovative uses of mobile phone technologies were made (and are still being made in many countries) to stem the flow of the virus. One of the open source technologies in use with big data on the coronavirus is Nextstrain, a tool that tracks the movement and mutation tendencies of infectious agents such as viruses and bacteria. This tool was developed some years ago to help epidemiologists understand the evolution of pathogens in different conditions, countries, environments, climates, etc. In this case we would see Data Scientists who are specialists in this tool working with the epidemiologists. In other industries DATA SCIENCE boffins are usually adept with general-use tools such as RapidMiner or Yellowfin BI.
Using transparent and accessible public data, the WHO has facilitated the development of Big Data dashboards to track the spread of the virus. These allow users (governments and scientists working on the pandemic) to access real-time updates easily; the WHO dashboard is accessible through several platforms. Other similar dashboards are also available, such as one that depicts the concentration of infections graphically, which makes it easier to plan the distribution of much-needed and scarce resources and to identify the areas to which travel must be seriously minimised. AI, a technology that is closely associated with DATA SCIENCE, is also in wide use in the pandemic.
Technologies such as image recognition using machine learning algorithms are used to analyse thousands of CT scans related to COVID-19 in a matter of seconds with a high level of accuracy and precision. Also, robots are in use in many hospitals to minimise contact between medical staff and patients; they are used to deliver food and medicines to patients (AI recognises which patient requires which medicine). They are also used for various other hospital tasks such as remote measurement and logging of body temperatures, washing critical equipment, etc. Artificial Intelligence is at the core of robotics these days. Machine Learning, used heavily in DATA SCIENCE, is at the heart of AI; yes, we keep coming back to the fact that DATA SCIENCE is a key partner in several technological applications in life. Researchers from different parts of the world are collaboratively using AI to create prediction models for antiviral drugs, as shown by a team from South Korea’s Dankook University, working with researchers at Deargen and a US academic collaborator, who ran a series of tests on commercially available antivirals that may be effective against COVID-19. A prediction model is, again, a function of DATA SCIENCE. The versatility of DATA SCIENCE, its ability to take a deep dive into and adapt to highly interdisciplinary use cases, is inexorably making it the biggest thing in the world since the discovery of oil.
If you are a graduate with mathematics as part of your graduate or post-graduate qualification, you may want to consider enhancing your current CV with a course in DATA SCIENCE. There are several institutes offering Data Science courses. You may want to check out AptusLearn, though, as they use real data and experienced data scientists to guide you in the use of the specialist tools, and they offer several contact-based classes as well as online professional courses in DATA SCIENCE, including the Post Graduate Diploma in DATA SCIENCE (IIIT Bhubaneswar), to get you going. Get in ahead of the rush!