CFtime is an R package that supports working with CF Metadata Conventions time coordinates, specifically geared to time-referencing data sets of climate projections such as those produced under the World Climate Research Programme and re-analysis data such as ERA5 from the European Centre for Medium-range Weather Forecasts (ECMWF).
The data sets include in their metadata an epoch, or origin, a point
in time from which other points in time are calculated. This epoch takes
the form of "days since 1949-12-01", with each data source
(Coupled Model Intercomparison Project (CMIP) generation, model, etc.)
having its own epoch. The data itself has a temporal dimension if a
variable in the netCDF file has an attribute units with a
string value describing an epoch. The variable, say “time”, has data
values such as 43289, which are offsets from the epoch in units of the
epoch string (“days” in this case). To convert this offset to a date,
using a specific calendar, is what this package does. Given that the
calendars supported by the CF Metadata Conventions are not compatible
with POSIXt, this conversion is not trivial because the
standard R date-time operations do not give correct results. That it is
important to account for these differences is easily demonstrated:
library(CFtime)
# POSIXt calculations on a standard calendar
as.Date("1949-12-01") + 43289
#> [1] "2068-06-08"
# CFtime calculation on a "360_day" calendar
as_timestamp(CFtime("days since 1949-12-01", "360_day", 43289))
#> [1] "2070-02-30"
That’s a difference of nearly 21 months! (And yes, 30 February is a
valid date on a 360_day calendar.)
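To see where that result comes from, the 360_day conversion can be reproduced with plain integer arithmetic. This is only a sketch in base R: the helper `offset_to_360_day()` is a hypothetical name, not part of CFtime, and it assumes a date-only origin with a non-negative year and offsets in whole days.

```r
# Convert an offset in days to a date on the 360_day calendar.
# `offset_to_360_day` is a hypothetical helper, not a CFtime function.
offset_to_360_day <- function(origin, offset) {
  ymd <- as.integer(strsplit(origin, "-", fixed = TRUE)[[1]])
  # Days elapsed since 0000-01-01 when every year has 12 months of 30 days
  days <- ymd[1] * 360L + (ymd[2] - 1L) * 30L + (ymd[3] - 1L) + as.integer(offset)
  year  <- days %/% 360L
  month <- (days %% 360L) %/% 30L + 1L
  day   <- days %% 30L + 1L
  sprintf("%04d-%02d-%02d", year, month, day)
}

offset_to_360_day("1949-12-01", 43289)
#> [1] "2070-02-30"
```

The same offset lands on a different date because 43289 days span 120 full model years plus 89 days on this calendar, whereas the standard calendar needs 365 or 366 days per year.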
All defined calendars of the CF Metadata Conventions are supported:
- standard or gregorian: This calendar is valid for the Common Era only; it starts at 0001-01-01 00:00:00, i.e. 1 January of year 1. Time periods prior to the introduction of the Gregorian calendar (1582-10-15) use the julian calendar that was in common use then. The 10-day gap between the Julian and Gregorian calendars is observed, so dates in the range 5 to 14 October 1582 are invalid.
- proleptic_gregorian: This calendar uses the Gregorian calendar for periods prior to the introduction of that calendar as well, and it extends to periods before the Common Era, e.g. year 0 and negative years.
- tai: International Atomic Time, a global standard for linear time based on multiple atomic clocks: it counts seconds since its start at 1958-01-01 00:00:00. For presentation it uses the Gregorian calendar. Timestamps prior to its start are not allowed.
- utc: Coordinated Universal Time, the standard for civil timekeeping all over the world. It is based on International Atomic Time but it uses occasional leap seconds to remain synchronous with Earth’s rotation; at the end of 2024 it is 37 seconds behind tai. It uses the Gregorian calendar with a start at 1972-01-01 00:00:00; earlier timestamps are not allowed. Future timestamps are also not allowed because the insertion of leap seconds is unpredictable. Most computer clocks use UTC but calculations of periods do not consider leap seconds.
- julian: The julian calendar has a leap year every four years, including centennial years. Otherwise it is the same as the standard calendar.
- 365_day or noleap: This is a “model time” calendar in which no leap years occur. Year 0 exists, as well as years prior to that.
- 366_day or all_leap: This is a “model time” calendar in which all years are leap years. Year 0 exists, as well as years prior to that.
- 360_day: This is a “model time” calendar in which every year has 360 days divided over 12 months of 30 days each. Year 0 exists, as well as years prior to that.
Use of custom calendars is not supported. This package is also not suitable for paleo-calendars.
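The calendar rules listed above can be summarised as the number of days in a year. The function below is a simplified base R sketch, not the package's implementation: `days_in_year()` is a hypothetical helper, and it ignores the validity ranges of each calendar (for instance the start dates of tai and utc).

```r
# Number of days in a year for each CF calendar (simplified sketch).
# `days_in_year` is a hypothetical helper, not a CFtime function.
days_in_year <- function(year, calendar) {
  greg_leap <- (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
  jul_leap  <- year %% 4 == 0
  n <- length(year)
  switch(calendar,
    "360_day" = rep(360, n),
    "365_day" = , noleap = rep(365, n),
    "366_day" = , all_leap = rep(366, n),
    julian = 365 + jul_leap,
    proleptic_gregorian = , tai = , utc = 365 + greg_leap,
    # Mixed calendar: Julian rule before 1582, 10 days dropped in 1582
    standard = , gregorian =
      ifelse(year == 1582, 355, 365 + ifelse(year < 1582, jul_leap, greg_leap)),
    stop("Unknown calendar: ", calendar))
}

days_in_year(c(1500, 1582, 1900, 2000), "standard")
#> [1] 366 355 365 366
days_in_year(1900, "julian")
#> [1] 366
```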
This package IS NOT intended to support the full date and time functionality of the CF Metadata Conventions. Instead, it facilitates use of a suite of models of climate projections that use different calendars in a consistent manner.
This package is particularly useful for working with climate projection data having a daily or higher resolution, but it will work equally well on data with a lower resolution.
Timestamps are generated using the ISO8601 standard.
Calendar-aware factors can be generated to support processing of data
using tapply() and similar functions. Merging of multiple
data sets and subsetting facilitate analysis while preserving computer
resources.
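To illustrate why the factor has to follow the calendar of the data, here is a base R sketch that builds a month factor by hand for two years of daily values on a 360_day calendar and passes it to tapply(). The data values and the start date 2041-01-01 are made up for the example; the package's own factor function covers all supported calendars, which this sketch does not attempt.

```r
# Two model years of daily data on a 360_day calendar (hypothetical values)
n_days <- 720L
day_index <- seq_len(n_days) - 1L            # days since 2041-01-01
year  <- 2041L + day_index %/% 360L
month <- (day_index %% 360L) %/% 30L + 1L

# One factor level per year-month; every month has exactly 30 days
f_month <- factor(sprintf("%04d-%02d", year, month))

pr <- rep(2, n_days)                         # 2 units of precipitation per day
monthly <- tapply(pr, f_month, sum)

length(monthly)
#> [1] 24
```

Every monthly total is 60 here, including February, which a factor derived from POSIXt dates could not represent.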
Get the latest stable version on CRAN:
install.packages("CFtime")
You can install the development version of CFtime from GitHub with:
# install.packages("devtools")
devtools::install_github("pvanlaake/CFtime")
The package contains a class, CFTime, to describe the
time coordinate system, including its calendar and origin, and which
holds the time coordinate values that are offset from the origin to
represent instants in time. This class operates on the data in the file
of interest, here a Coordinated Regional Climate Downscaling Experiment
(CORDEX) file of precipitation for the Central America domain:
library(ncdf4)
# Opening a data file that is included with the package.
# Usually you would `list.files()` on a directory of your choice.
fn <- list.files(path = system.file("extdata", package = "CFtime"), full.names = TRUE)[1]
nc <- nc_open(fn)
attrs <- ncatt_get(nc, "")
attrs$title
#> [1] "NOAA GFDL GFDL-ESM4 model output prepared for CMIP6 update of RCP4.5 based on SSP2"
attrs$license
#> [1] "CMIP6 model data produced by NOAA-GFDL is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (https://creativecommons.org/licenses/). Consult https://pcmdi.llnl.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file). The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law."
# Create the CFTime instance
time <- CFtime(nc$dim$time$units,
               nc$dim$time$calendar,
               nc$dim$time$vals)
time
#> CF calendar:
#> Origin : 1850-01-01 00:00:00
#> Units : days
#> Type : noleap
#> Time series:
#> Elements: [2015-01-01 12:00:00 .. 2099-12-31 12:00:00] (average of 1.000000 days between 31025 elements)
#> Bounds : not set
# ... work with the data ...
nc_close(nc)
Note that the information of interest
(nc$dim$time$units, etc.) is read out of the file “blindly”,
without checking for available dimensions or attributes. This can be
done because the “time” dimension and its attributes units
and calendar are required by the CF Metadata Conventions.
Should this fail, then your data set does not have a temporal dimension
or it is not compliant (note that the name “time” could be different; a
temporal dimension is defined by the “units” attribute alone). You could
still use this package if the required information is contained in your
file but using a different dimension name or different attribute
names.
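When the dimension name is not known in advance, one way to find the temporal dimension is to test each units string for the "unit since origin" pattern. The check below is a simplified base R sketch: `is_cf_time_units()` is a hypothetical helper, and it only recognises the long unit names, whereas the conventions accept further spellings and abbreviations.

```r
# Does a units attribute look like a CF time unit such as "days since ..."?
# `is_cf_time_units` is a hypothetical helper, not a CFtime function.
is_cf_time_units <- function(units) {
  pattern <- "^\\s*(second|minute|hour|day|month|year)s?\\s+since\\s+\\S"
  grepl(pattern, units, ignore.case = TRUE)
}

is_cf_time_units(c("days since 1949-12-01",
                   "degrees_north",
                   "hours since 1900-01-01 00:00:00"))
#> [1]  TRUE FALSE  TRUE
```

Applied to the units of every dimension in a file, the dimension for which this returns TRUE is the candidate to pass on as the time coordinate.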
If you are using the RNetCDF package rather than
ncdf4, creating a CFTime instance goes like this:
library(RNetCDF)
nc <- open.nc(fn)
time <- CFtime(att.get.nc(nc, "time", "units"),
               att.get.nc(nc, "time", "calendar"),
               var.get.nc(nc, "time"))
In a typical process, you would combine multiple data files into a single data set to analyze a feature of interest. To continue the previous example of precipitation in the Central America domain using CORDEX data, you can calculate the precipitation per month for the period 2041 - 2050 as follows:
# NOT RUN
library(CFtime)
library(ncdf4)
library(abind)
# Open the files - one would typically do this in a loop
nc2041 <- nc_open("~/CC/CORDEX/CAM-22/RCP2.6/pr_CAM-22_MOHC-HadGEM2-ES_rcp26_r1i1p1_GERICS-REMO2015_v1_day_20410101-20451230.nc")
nc2046 <- nc_open("~/CC/CORDEX/CAM-22/RCP2.6/pr_CAM-22_MOHC-HadGEM2-ES_rcp26_r1i1p1_GERICS-REMO2015_v1_day_20460101-20501230.nc")
# Create the time object from the first file
# All files have an identical datum as per the CORDEX specifications
time <- CFtime(nc2041$dim$time$units, nc2041$dim$time$calendar, nc2041$dim$time$vals)
# Add the time values from the remaining files
time <- time + as.vector(nc2046$dim$time$vals)
# Grab the data from the files and merge the arrays into one, in the same order
# as the time values
pr <- abind(ncvar_get(nc2041, "pr"), ncvar_get(nc2046, "pr"))
nc_close(nc2041)
nc_close(nc2046)
# Optionally - Set the time dimension to the timestamps from the time object
dimnames(pr)[[3]] <- as_timestamp(time)
# Create the month factor from the time object
f_month <- CFfactor(time, "month")
# The result from applying this factor to a data set that it describes is a new
# data set with a different "time" dimension. The function result stores this
# new time object as an attribute.
pr_month_time <- attr(f_month, "CFtime")
# Now sum the daily data to monthly data
# Dimensions 1 and 2 are longitude and latitude, the third dimension is time
pr_month <- aperm(apply(pr, 1:2, tapply, f_month, sum), c(2, 3, 1))
dimnames(pr_month)[[3]] <- as_timestamp(pr_month_time)
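The final aggregation step reorders dimensions, which is easy to get wrong. The base R sketch below runs the same apply(), tapply() and aperm() combination on a small synthetic array so that the dimension order can be inspected. The array sizes and values are made up, and a plain factor stands in for the factor created from the time object.

```r
# Synthetic array: 4 longitudes x 3 latitudes x 60 daily time steps
pr <- array(1, dim = c(4, 3, 60))

# Plain factor standing in for the month factor: two months of 30 days each
f_month <- factor(rep(c("2041-01", "2041-02"), each = 30))

# apply() places the tapply() result (one value per month) in the FIRST dimension
tmp <- apply(pr, 1:2, tapply, f_month, sum)
dim(tmp)
#> [1] 2 4 3

# aperm() moves the months back to the last position: lon x lat x month
pr_month <- aperm(tmp, c(2, 3, 1))
dim(pr_month)
#> [1] 4 3 2
```

Without the aperm() call the monthly array would have time as its first dimension, which no longer matches the layout of the daily data.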
This package has been tested with the following data sets:
The package also operates on geographical and/or temporal subsets of data sets so long as the subsetted data complies with the CF Metadata Conventions. This includes subsetting in the Climate Data Store. Subsetted data from Climate4Impact is not automatically supported because the dimension names are not compliant with the CF Metadata Conventions; use the corresponding dimension names instead.