Every day (and hour) we are served a new report on the current state of the coronavirus (COVID-19). While it is useful to keep track of how things are progressing and whether it is finally time to lock ourselves in the bunker (doomsday preppers certainly want to tell everyone “I told you so”) and buy a ton of canned tuna and beans, the interpretation of the data does not always match the data itself. We looked at COVID-19 from an objective standpoint to give you “clean” information.
To one person, reading that we have 100 cases means the apocalypse has arrived, while to someone else it means something far less terrible.
When you see a visualization of the raw data, it is much easier to be objective about the whole situation, because you are looking at the facts (however flawed they are) rather than someone’s interpretation.
My name is Kristijan Saric and I’m a programmer. I’m not an expert on viruses, I’m not an epidemiologist, and I don’t consider myself an expert in anything. Then again, I’m sure I know a lot more about IT than some self-proclaimed experts. That’s the IT of today: how you present yourself matters more than what you know.
In any case, nothing I write and show here is the opinion of a coronavirus expert. It is my attempt to explain to myself (and to you) what is happening and how fast the virus is spreading.
In this first blog post, the focus will be on data visualization, and we will look at a couple of graphs that describe the official data you can see here.
This data comes from the “Johns Hopkins University Center for Systems Science and Engineering”, or “Johns Hopkins CSSE” for short. It is updated daily.
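As a rough sketch of how such daily data could be located, the snippet below builds the address of one daily report. It assumes the data is read from the public CSSEGISandData/COVID-19 repository on GitHub, which stores one CSV file per day; the helper name `daily_report_url` is made up for this example, and the post itself does not say which files were used.

```python
from datetime import date

# Assumption: the public CSSEGISandData/COVID-19 GitHub repository,
# which keeps one CSV per day named MM-DD-YYYY.csv.
BASE_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_daily_reports/"
)


def daily_report_url(day: date) -> str:
    """Build the URL of the daily report CSV for the given day."""
    return f"{BASE_URL}{day.strftime('%m-%d-%Y')}.csv"


print(daily_report_url(date(2020, 3, 18)))
```

Refreshing the data would then amount to downloading the file for the newest available day.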
As I wrote, the first post will focus on visualizing the data itself and showing how the virus is spreading around the world, with a focus on Europe, Croatia and neighboring countries.
This is a typical way of working on a machine learning (ML) project. We first retrieve (or collect) the data, process it and correct it if necessary, and then visualize it ourselves to get a clean “picture” of what the data tells us.
Processing and correcting the data is necessary because, unlike a machine, we humans can judge whether a value that diverges from the rest is real or just “noise” (a jump in air temperature from 9°C on Thursday to 100°C on Friday is a data error, not real data).
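A machine can at least point out the values worth a second look. The sketch below flags every day-to-day change larger than a chosen step, using the temperature example from above. The function name, the threshold and the sample values are all assumptions made for illustration; deciding whether a flagged value is an error still falls to a human.

```python
def flag_suspect_jumps(values, max_step):
    """Return the indices where a value differs from the previous one by more than max_step."""
    return [
        i
        for i in range(1, len(values))
        if abs(values[i] - values[i - 1]) > max_step
    ]


# Illustrative daily temperatures in °C (Wednesday to Saturday), not real measurements.
temperatures = [8.0, 9.0, 100.0, 10.0]
suspects = flag_suspect_jumps(temperatures, max_step=15.0)
print(suspects)  # [2, 3]: the jump up to 100 and the drop back down
```

Note that a single bad reading is flagged twice, once on the way up and once on the way down.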
The next phase then becomes much easier: we use that same data to try to find a model that can predict something we have historical data about (in this case, the spread of the virus).
Then again, any prediction is pure fiction. If someone tells you that he can see the future, he is crazy, stupid or lying (or wants to steal your money). What we can do is make assumptions about what the future might look like. That does not mean we can really see into the future, and people should not use someone else’s data to validate their view of the world and stir up more panic than is warranted. Even though we all do.
The data we are using was last updated on 03/19/2020 at 10:30.
US and the world
Before we focus on Croatia and its neighboring countries, we will look at the US and the rest of the world. The graph below shows all US states with more than 30 confirmed cases. The X-axis shows the number of confirmed cases (Confirmed) and the Y-axis (Province / State) lists those states.
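The selection behind such a graph can be sketched in a few lines. The example below keeps the US rows above a threshold, sorts them and prints crude text bars. The column names `Province/State`, `Country/Region` and `Confirmed` are an assumption about the file layout, the rows are made-up sample values rather than real figures, and the original notebook may well have used a plotting library instead.

```python
import csv
import io

# Made-up sample rows for illustration only; these are not real figures.
SAMPLE_CSV = """Province/State,Country/Region,Confirmed
State A,US,120
State B,US,45
State C,US,12
Region X,Elsewhere,300
"""


def states_above(rows, country, threshold):
    """Return (state, confirmed) pairs for one country above threshold, largest first."""
    selected = [
        (row["Province/State"], int(row["Confirmed"]))
        for row in rows
        if row["Country/Region"] == country and int(row["Confirmed"]) > threshold
    ]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for state, confirmed in states_above(rows, country="US", threshold=30):
    # One '#' per 10 cases gives a crude horizontal bar in plain text.
    print(f"{state:<10} {'#' * (confirmed // 10)} {confirmed}")
```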
If we look at the situation today in the US and around the world, we can see the following. This visualization was obtained through Google Maps.
There is no reason to fear the red dots on the map. Red only means that several dots overlap, and because the data we have is much more detailed for the US, more dots overlap there. Italy, for example, is a single point, so although the situation there is much worse than elsewhere, it does not show up in red.
But this does not tell us much, so let’s try to visualize what has been going on in the world over the last three months. Let’s take three dates:
- 01/22/2020
- 02/22/2020
- 03/18/2020 (latest information at time of writing)
For 01/22/2020 we have the following picture:
We see that most confirmed cases are concentrated in China.
For 02/22/2020 we have the following picture:
We see the number of cases growing, with cases now across Europe, the US and Australia.
For 03/18/2020 we have the following picture:
The virus has spread all over the world. Pay less attention to the size of the circles than to their change in color. So far, we do not see a large number of confirmed cases, although Italy now stands out, as does China.
If we look at the list of countries in the world that have more than 1000 confirmed cases, we get a picture like this:
But let’s not panic right away. Let’s look at a picture of these same countries and how many people have recovered.
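Comparing confirmed and recovered counts per country means first adding up the rows that belong to the same country. The sketch below does that and then computes the recovered share for countries above a threshold. The rows are made-up sample values, not real figures, and the function names are hypothetical.

```python
from collections import defaultdict

# Made-up (country, confirmed, recovered) rows, one per province; not real figures.
SAMPLE_ROWS = [
    ("Country A", 900, 600),
    ("Country A", 400, 300),
    ("Country B", 1500, 150),
    ("Country C", 200, 10),
]


def totals_by_country(rows):
    """Sum confirmed and recovered counts over all provinces of each country."""
    totals = defaultdict(lambda: [0, 0])
    for country, confirmed, recovered in rows:
        totals[country][0] += confirmed
        totals[country][1] += recovered
    return {country: tuple(counts) for country, counts in totals.items()}


def recovered_share(totals, min_confirmed):
    """Recovered / confirmed for countries with more than min_confirmed cases."""
    return {
        country: recovered / confirmed
        for country, (confirmed, recovered) in totals.items()
        if confirmed > min_confirmed
    }


country_totals = totals_by_country(SAMPLE_ROWS)
for country, share in recovered_share(country_totals, min_confirmed=1000).items():
    print(f"{country}: {share:.0%} recovered")
```

A share close to 100% means most confirmed cases have already recovered; a low share mostly reflects a recent outbreak.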
Now you can panic! (or maybe not, the data is not so alarming)
I think the situation shows that China has done a good job and that, when this virus is taken seriously, we can do a lot, to the point of almost eliminating it completely.
Maybe if they also took bats off the menu, that would be even better.
Italy is a little late, but the quarantine was only recently announced, so the situation should improve significantly. The same goes for other countries: if the virus is taken seriously, a lot of damage can be prevented or eliminated.
Croatia and its surroundings
So far, the data we have shows over 80 confirmed cases in Croatia. The media report a higher number, because this data lags by one day (slower but more precise). What is the situation with people who have recovered?
The situation is not so terrible, though. Although the trend seems to be rising exponentially, Croatia is only now waking up, and precautions are being taken to close public places and prevent large gatherings.
Let’s look at the situation in the countries that border Croatia.
We see that Italy dominates and that the other countries are barely visible. Let’s see what it looks like without Italy.
Slovenia has the most cases, although the overall picture is not so terrible. Apart from Italy, which far exceeds all the surrounding countries, our neighbors are not badly affected. Their case counts are not exactly cause for optimism, but they do not have a large number of patients.
If we try to make a rough estimate of the trend for Croatia, we can get something like this:
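The post does not say how the trend was estimated. One common way to get a rough trend for a series that looks exponential is a straight-line least-squares fit to the logarithm of the counts, sketched below. The function name is hypothetical, and the series is synthetic (it doubles every day) and serves only to check the fit.

```python
import math


def fit_exponential(counts):
    """Least-squares fit of log(count) = intercept + rate * day; returns (intercept, rate)."""
    days = range(len(counts))
    logs = [math.log(c) for c in counts]  # requires every count to be > 0
    n = len(counts)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    rate = sxy / sxx
    return mean_y - rate * mean_x, rate


# Synthetic series that doubles every day, used only to check the fit.
counts = [1, 2, 4, 8, 16]
intercept, rate = fit_exponential(counts)
doubling_time = math.log(2) / rate
print(f"daily growth rate: {rate:.3f}, doubling time: {doubling_time:.2f} days")
```

The doubling time is often easier to reason about than the raw growth rate. Such a fit only describes the past, so extending it forward is an assumption, not a forecast.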
Marked in red is the day the work restrictions were declared.
The situation in Italy.
In red, the day the quarantine was declared.
We can also take a closer look at the number of patients in Croatia.
And also in Italy.
Or we can look at all the neighboring countries.
And for all of Europe.
The situation is not so terrible for us right now, but we must certainly be careful so that it does not get worse.
In the near future, I will clean up the Jupyter Notebook (which was used to visualize the data) and put everything on GitHub for everyone to access. Then it will only be necessary to refresh the data (I’ll describe how) and re-generate all these graphs, so it will be possible to follow how the situation progresses every day.
The next post will also include simple machine learning so we can take a closer look at how the virus spreads and how we can (try to) predict it.
Until then, don’t panic, follow the instructions of the county and state headquarters, and things will probably get better.
P.S. Read more about us here.
P.P.S. To visualize or analyze your business, sales, or similar information, you can contact us: