Epidemic Data Analysis

Epidemic data sources

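Most websites show the current total and the increase for the day, but the daily increase is sometimes only a partial figure. Looking at the data for the last 10 or 20 days makes it easier to see the trend and judge how serious the situation is.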
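As a rough illustration of the "last 10 or 20 days" idea, the sketch below derives daily increases from a cumulative series and smooths them with a 7-day average. The function name `recent_trend`, the 7-day window, and the numbers are all assumptions made up for this example; they are not taken from any of the sources listed here.

```python
import pandas as pd


def recent_trend(cumulative: pd.Series, window: int = 10) -> pd.DataFrame:
    """Return totals, daily increases and a 7-day average for the last `window` days."""
    daily = cumulative.diff()  # day-over-day increase; the first value is NaN
    smooth = daily.rolling(7, min_periods=1).mean()  # 7-day average of the increases
    out = pd.DataFrame({"total": cumulative, "new": daily, "new_7d_avg": smooth})
    return out.tail(window)


# Made-up cumulative counts, for illustration only (not real figures).
dates = pd.date_range("2020-04-01", periods=12, freq="D")
totals = pd.Series(
    [100, 110, 125, 145, 170, 200, 235, 275, 320, 370, 425, 485], index=dates
)

trend = recent_trend(totals, window=10)
print(trend)
```

A steadily rising `new_7d_avg` column suggests the outbreak is still accelerating, while a falling one suggests it is slowing, even when single-day figures are incomplete.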

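China data: the URL is included in the code. latest=1 returns the latest data, and latest=0 returns historical data.

The analysis code is as follows: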

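import pandas as pd
import requests

# latest=0 requests historical data; latest=1 requests only the latest figures
url = 'https://lab.isaaclin.cn/nCoV/api/area?latest=0'
r = requests.get(url, timeout=30)
r.raise_for_status()  # stop early if the server returns an HTTP error
data = r.json()

# Each element of data['results'] becomes one row of the DataFrame
df = pd.DataFrame.from_records(data['results'])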

The data structure is as follows:

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 253 entries, 0 to 252
Data columns (total 17 columns):
locationId               253 non-null int64
continentName            253 non-null object
continentEnglishName     253 non-null object
countryName              253 non-null object
countryEnglishName       220 non-null object
provinceName             253 non-null object
provinceEnglishName      220 non-null object
provinceShortName        253 non-null object
currentConfirmedCount    253 non-null int64
confirmedCount           253 non-null int64
suspectedCount           253 non-null int64
curedCount               253 non-null int64
deadCount                253 non-null int64
comment                  252 non-null object
cities                   34 non-null object
updateTime               253 non-null int64
dtypes: int64(7), object(10)
memory usage: 33.7+ KB
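The updateTime column is stored as int64, so it needs converting before it can be used as a date axis. The sketch below assumes the value is a Unix timestamp in milliseconds (an assumption, not confirmed by the output above) and keeps the last record of each day for one province. The helper name `province_history`, the province names "A" and "B", and the sample rows are placeholders for illustration only.

```python
import pandas as pd


def province_history(df: pd.DataFrame, province: str) -> pd.DataFrame:
    """Return one row per day for `province`, using the last update of each day."""
    out = df[df["provinceName"] == province].copy()
    # Assumption: updateTime is a Unix timestamp in milliseconds.
    out["updateTime"] = pd.to_datetime(out["updateTime"], unit="ms")
    out["date"] = out["updateTime"].dt.date
    # Several updates may arrive on the same day; keep only the most recent one.
    out = out.sort_values("updateTime").drop_duplicates("date", keep="last")
    return out.set_index("date")[["confirmedCount", "curedCount", "deadCount"]]


# Placeholder rows shaped like the columns listed above (not real figures).
sample = pd.DataFrame({
    "provinceName": ["A", "A", "A", "B"],
    "updateTime": [1585699200000, 1585742400000, 1585785600000, 1585785600000],
    "confirmedCount": [10, 12, 15, 3],
    "curedCount": [1, 1, 2, 0],
    "deadCount": [0, 0, 1, 0],
})

hist = province_history(sample, "A")
print(hist)
```

With the real DataFrame, the call would be `province_history(df, name)`, where `name` is one of the values in the provinceName column.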

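Data downloaded from the European CDC website

Download URL: https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide

Download the JSON file and save it as corid2019.json, or read it directly in the program; see the separate blog post "European CDC epidemic data analysis".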
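A minimal sketch of loading the downloaded file into pandas follows. It assumes the rows sit in a list under a top-level "records" key and falls back to treating the whole payload as the list if they do not; the field names in the demo payload (dateRep, cases, deaths, countriesAndTerritories) are likewise assumptions about the file layout, and the values are made up. The helper name `load_ecdc` is hypothetical.

```python
import io
import json

import pandas as pd


def load_ecdc(fileobj) -> pd.DataFrame:
    """Read ECDC-style JSON from an open text file object into a DataFrame."""
    payload = json.load(fileobj)
    # Assumption: rows are under a top-level "records" key; otherwise the
    # payload itself is expected to be the list of rows.
    records = payload["records"] if isinstance(payload, dict) else payload
    return pd.DataFrame.from_records(records)


# Made-up payload standing in for the downloaded file (not real figures).
demo = io.StringIO(
    '{"records": [{"dateRep": "01/04/2020", "cases": 5, "deaths": 0, '
    '"countriesAndTerritories": "X"}]}'
)

ecdc = load_ecdc(demo)
print(ecdc)
```

For the saved file, the same function can be called as `with open('corid2019.json', encoding='utf-8') as f: ecdc = load_ecdc(f)`.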

Canadian data

https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection.html#a1

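import pandas as pd

# covid19.csv is the CSV file saved locally from the page above
df = pd.read_csv('covid19.csv')
# Show province id, province name, confirmed cases, deaths and date
print(df[['pruid','prname','numconf','numdeaths','date']])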

The data structure is:

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 627 entries, 0 to 626
Data columns (total 14 columns):
pruid             627 non-null int64
prname            627 non-null object
prnameFR          627 non-null object
date              627 non-null object
numconf           627 non-null int64
numprob           627 non-null int64
numdeaths         619 non-null float64
numtotal          627 non-null int64
numtested         558 non-null float64
numrecover        30 non-null float64
percentrecover    28 non-null float64
ratetested        0 non-null float64
numtoday          612 non-null float64
percentoday       612 non-null float64
dtypes: float64(7), int64(4), object(3)
memory usage: 68.7+ KB
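Because date is stored as an object (string) column and numconf is cumulative, a little preparation is needed before looking at a trend. The sketch below parses the dates and computes the daily change in confirmed cases for one prname value. It assumes the dates are written day-first as dd-mm-yyyy and that a "Canada" row carries the national total; both are assumptions, and the helper name `daily_new_cases` and the sample rows are made up for illustration.

```python
import pandas as pd


def daily_new_cases(df: pd.DataFrame, province: str = "Canada") -> pd.DataFrame:
    """Return cumulative and daily new confirmed cases for one prname value."""
    out = df[df["prname"] == province].copy()
    # Assumption: dates are written day-first, e.g. 01-04-2020 for 1 April 2020.
    out["date"] = pd.to_datetime(out["date"], format="%d-%m-%Y")
    out = out.sort_values("date").set_index("date")
    # numconf is cumulative, so the daily figure is the difference between days.
    out["newconf"] = out["numconf"].diff()
    return out[["numconf", "newconf", "numdeaths"]]


# Made-up rows shaped like the columns listed above (not real figures).
sample = pd.DataFrame({
    "prname": ["Canada", "Ontario", "Canada", "Canada"],
    "date": ["01-04-2020", "01-04-2020", "02-04-2020", "03-04-2020"],
    "numconf": [100, 40, 130, 170],
    "numdeaths": [1.0, 0.0, 2.0, None],
})

canada = daily_new_cases(sample)
print(canada)
```

Missing numdeaths values are left as NaN rather than filled with zero, since the file does not say whether a blank means zero or not reported.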

US data

https://coronavirus.1point3acres.com/ appears to provide data, but access requires an application, and I have not received any information back.

https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6 This dashboard displays epidemic data in real time.
