Update the coronavirus Dataset

While the CRAN version of the package is updated once every month or two, the GitHub (Dev) version is updated on a daily basis. The following options let you keep your data in sync with the data available on the Dev version:

The update_dataset function

The update_dataset function lets you keep the installed version up to date with the data available on GitHub. The function compares the dataset in the installed version with the one on the Dev version:

library(coronavirus)

update_dataset()

If no new data is available on the Dev version, the function will return the following message:

No updates are available

When new data is available, the function asks whether you want to install the updates from the Dev version:

Updates are available on the coronavirus Dev version, do you want to update? n/Y

To make the new data available after the update, you will have to restart your R session.

Note: As the raw data structure may change frequently (new fields, retroactive updates in the data, etc.), the Dev version dataset may change accordingly.
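Because the structure may change between updates, it can help to check that the columns your code depends on are still present before running an analysis. The following is a minimal sketch, not part of the package: `missing_columns` is a hypothetical helper, `toy_df` is a stand-in data frame, and the expected column names are the seven that appear in the dataset output shown in this document.

```r
# Columns the analysis relies on (taken from the dataset structure
# shown in this document; adjust if the Dev version changes them)
expected_cols <- c("date", "province", "country", "lat", "long", "type", "cases")

# Hypothetical helper: returns the expected columns that are absent from df
missing_columns <- function(df, expected = expected_cols) {
  setdiff(expected, names(df))
}

# Toy data frame standing in for the real dataset
toy_df <- data.frame(date = as.Date("2020-01-22"),
                     country = "Afghanistan",
                     cases = 0L)

# Lists the columns that toy_df lacks
missing_columns(toy_df)
```

An empty result means all expected columns are present. In practice you would pass the loaded dataset instead of `toy_df`.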

Reading the data from the CSV version

Alternatively, you can read and load the data directly from the package repository, using the CSV version:

coronavirus_df <- read.csv("https://raw.githubusercontent.com/RamiKrispin/coronavirus/master/csv/coronavirus.csv",
                     stringsAsFactors = FALSE)

head(coronavirus_df)
##         date province     country      lat     long      type cases
## 1 2020-01-22          Afghanistan 33.93911 67.70995 confirmed     0
## 2 2020-01-23          Afghanistan 33.93911 67.70995 confirmed     0
## 3 2020-01-24          Afghanistan 33.93911 67.70995 confirmed     0
## 4 2020-01-25          Afghanistan 33.93911 67.70995 confirmed     0
## 5 2020-01-26          Afghanistan 33.93911 67.70995 confirmed     0
## 6 2020-01-27          Afghanistan 33.93911 67.70995 confirmed     0

The main difference between the first method (the update_dataset function) and the second method (reading the CSV version of the data) is that with the second method the date field is not formatted as a Date object. A quick conversion fixes it:

coronavirus_df$date <- as.Date(coronavirus_df$date)

str(coronavirus_df)
## 'data.frame':    283554 obs. of  7 variables:
##  $ date    : Date, format: "2020-01-22" "2020-01-23" ...
##  $ province: chr  "" "" "" "" ...
##  $ country : chr  "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
##  $ lat     : num  33.9 33.9 33.9 33.9 33.9 ...
##  $ long    : num  67.7 67.7 67.7 67.7 67.7 ...
##  $ type    : chr  "confirmed" "confirmed" "confirmed" "confirmed" ...
##  $ cases   : int  0 0 0 0 0 0 0 0 0 0 ...
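As an alternative to converting the column after loading, the date field can be parsed while the file is read, using the `colClasses` argument of `read.csv`. The sketch below runs on a small inline sample (two rows copied from the head() output above) so that it works offline; the name `sample_df` is only illustrative. Passing the same `colClasses` argument when reading from the repository URL should behave the same way, assuming the dates stay in the YYYY-MM-DD format.

```r
# Inline sample with the same columns as the CSV file
sample_csv <- "date,province,country,lat,long,type,cases
2020-01-22,,Afghanistan,33.93911,67.70995,confirmed,0
2020-01-23,,Afghanistan,33.93911,67.70995,confirmed,0"

# colClasses = c(date = "Date") asks read.csv to convert the date
# column with as.Date while reading; other columns are detected as usual
sample_df <- read.csv(text = sample_csv,
                      colClasses = c(date = "Date"),
                      stringsAsFactors = FALSE)

# Check the class of the date column
class(sample_df$date)
```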