Wednesday, April 29, 2020

Covid-19 deaths by county

One aspect of the dilemma about "opening up" the country is that many places don't see any problems except on TV.  The Johns Hopkins database gives me 3238 “counties” in the US. (Some are not real counties but “unassigned”, etc.)

1829 -> 0
 430 -> 1
 191 -> 2
 109 -> 3

56% have zero deaths currently.  At the top end are the usual suspects:

New York City 17515
Wayne          1622
Nassau         1620
Cook           1347
Suffolk        1102
Essex          1028
Westchester     962
Bergen          960
Los Angeles     944

Here is a histogram of the number of counties with the given number of deaths, excluding the bins for values >50 or <3.
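For the curious, here is a minimal sketch of how such a histogram can be tallied. It is not the code from my repo: the function name death_histogram, the dictionary layout, and the sample values are all made up for illustration. It is written to run under both Python 2.7 and Python 3.

```python
from collections import Counter

def death_histogram(deaths_by_county, lo=3, hi=50):
    """Map each death count to the number of counties with that count,
    keeping only the bins with lo <= count <= hi."""
    tally = Counter(deaths_by_county.values())
    return {n: tally[n] for n in sorted(tally) if lo <= n <= hi}

# Made-up values for illustration only, not real data.
sample = {"A": 0, "B": 0, "C": 1, "D": 3, "E": 3, "F": 7, "G": 120}

hist = death_histogram(sample)
for n in sorted(hist):
    # One row per bin: the death count, then one star per county.
    print("%3d %s" % (n, "*" * hist[n]))
```

The bins for 0, 1 and 120 are dropped by the lo/hi filter, which is the same trimming described above.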

The vast majority of counties in the US have experienced almost no deaths from Covid-19.

Here's another view of the data. Each county's value was put into a list. The list was then sorted, and plotted. I cut off the top end of the distribution to look more closely at the low numbers.

As I said, about 1800 counties have 0 deaths and something like 90% have fewer than 20.
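A rough sketch of the sort-and-count step, assuming the per-county values are already in a plain list. The helper names sorted_counts and fraction_below and the sample numbers are hypothetical. Missing reports are represented as None and treated as 0.

```python
def sorted_counts(raw):
    """Return the values in ascending order, treating None (no report) as 0."""
    return sorted(v if v is not None else 0 for v in raw)

def fraction_below(values, threshold):
    """Fraction of entries strictly below threshold."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if v < threshold) / float(len(values))

# Made-up values for illustration only, not real data.
raw = [5, None, 0, 1, 300, 0, 19, 20, 2, 0]

ordered = sorted_counts(raw)
print(ordered)
print("zero deaths: %.0f%%" % (100 * fraction_below(ordered, 1)))
print("fewer than 20: %.0f%%" % (100 * fraction_below(ordered, 20)))
```

Plotting the ordered list against its index gives the kind of curve described above. Cutting off the top end is just a slice of the list.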



(A surprising number of counties are simply not reporting. I've assumed those have 0 deaths.)

This is all part of a project to download and play with the data collated by the Johns Hopkins CSSE folks.  You can find a GitHub repo with my code here.

My version of the database is constructed from their database files by update.py.  It checks the csv.source directory and, if it's not up to date, downloads the missing data files from their database.
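The up-to-date check can be sketched roughly as follows. This is not the actual update.py. It assumes the CSSE daily reports are named MM-DD-YYYY.csv and live at the URL shown in BASE_URL; both of those, along with the function names, are assumptions for illustration. The sketch only works out which files are missing and what their URLs would be. It does not download anything.

```python
import os
from datetime import date, timedelta

# Assumed location of the daily report files (an assumption, check the repo).
BASE_URL = ("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/"
            "master/csse_covid_19_data/csse_covid_19_daily_reports/")

def report_name(d):
    """File name for a given date, assuming the MM-DD-YYYY.csv convention."""
    return d.strftime("%m-%d-%Y") + ".csv"

def missing_reports(directory, first, last):
    """Return names of reports for dates first..last not yet in directory."""
    have = set(os.listdir(directory)) if os.path.isdir(directory) else set()
    names = []
    d = first
    while d <= last:
        name = report_name(d)
        if name not in have:
            names.append(name)
        d += timedelta(days=1)
    return names

def download_url(name):
    """Full URL for one report file, built from the assumed BASE_URL."""
    return BASE_URL + name
```

Calling missing_reports("csv.source", date(2020, 3, 22), date(2020, 4, 28)) would list whatever still needs fetching for that range, and each name can be passed to download_url.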

It's all Python2 code, mainly because I can't be bothered to type print(s) instead of print s.

My database looks like this:

2020-03-22
2020-04-28

Autauga;Alabama;01001;US
0,0,1,4 ...
0,0,0,0 ...
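Here is a minimal sketch of reading that layout back in. It rests on several assumptions that the listing above does not spell out: the first two lines are the first and last dates, each entry is a key line followed by exactly two comma-separated series, and the first series is cases while the second is deaths. The function name parse_db is hypothetical, and the sample is shortened to four days (so its end date is invented) to keep it small.

```python
def parse_db(text):
    """Parse the database text into (first_date, last_date, records).

    records maps the FIPS code to a dict with the key fields and two series.
    Assumes: two date lines, then groups of three lines per entry.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    first, last = lines[0], lines[1]
    body = lines[2:]
    records = {}
    # Step through complete groups of three lines: key, series 1, series 2.
    for i in range(0, len(body) - 2, 3):
        county, state, fips, country = body[i].split(";")
        records[fips] = {
            "county": county,
            "state": state,
            "country": country,
            "cases": [int(x) for x in body[i + 1].split(",")],   # assumed order
            "deaths": [int(x) for x in body[i + 2].split(",")],  # assumed order
        }
    return first, last, records

# Shortened sample for illustration; the end date is made up to match 4 values.
sample = """2020-03-22
2020-03-25

Autauga;Alabama;01001;US
0,0,1,4
0,0,0,0
"""

first, last, records = parse_db(sample)
print(records["01001"]["county"], records["01001"]["cases"])
```

Keying on the FIPS code rather than the county name avoids collisions, since many states share county names.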

I also got an urge to make some analyses normalizing to population.  It took some effort to find the data and fix differences with the entries in the Covid-19 database.


The county where I live is #588 with 1.7 deaths per 100,000 population.  You can find the complete list here.
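The normalization itself is simple arithmetic: deaths divided by population, times 100,000. A small sketch, with hypothetical function names and made-up numbers, assuming both tables are dictionaries keyed the same way:

```python
def per_100k(deaths, population):
    """Deaths per 100,000 population."""
    return 100000.0 * deaths / population

def rank_by_rate(deaths, population):
    """Return (rate, key) pairs sorted from highest rate to lowest.

    Keys with no (or zero) population entry are skipped.
    """
    rows = [(per_100k(deaths[k], population[k]), k)
            for k in deaths if population.get(k)]
    return sorted(rows, reverse=True)

# Made-up values for illustration only, not real data.
deaths = {"A": 17, "B": 3, "C": 0, "D": 5}
population = {"A": 1000000, "B": 50000, "C": 20000}  # no entry for "D"

ranked = rank_by_rate(deaths, population)
for position, (rate, key) in enumerate(ranked, 1):
    print("#%d %s %.1f" % (position, key, rate))
```

Entry "D" drops out silently because it has no population figure. Mismatches like that between the two sources are the kind of difference that has to be fixed before the ranking means anything.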

I only just finished re-writing the code (this is the 5th iteration), so there aren't many projects stored in the repo yet (2020-04-28).