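An R package for implementing geospatial cluster identification from time series of counts, by location. Locations can be expressed as counties, zip codes, census tracts, or other user-defined geographies. Users provide a data frame of counts by location and date, together with a distance object giving the distances between locations.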
The package provides functions to create these distance objects in either matrix or list format. These can be generated for census tracts, zip codes, or counties (FIPS), or can be constructed for custom locations by providing a data frame with columns for latitude and longitude (i.e., the centroid of each location).
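To make the custom-location case concrete, here is a minimal base R sketch of how a distance matrix could be derived from centroid coordinates. It is an illustration only and not the package's own implementation: the helper name `haversine_matrix`, the column names `location`, `latitude`, and `longitude`, the choice of the haversine formula, and the centroid values are all assumptions made for this example.

```r
# Hypothetical helper (not part of gsClusterDetect): great-circle distances
# in miles between location centroids, computed with the haversine formula.
haversine_matrix <- function(locs, radius_miles = 3958.8) {
  # Convert degrees to radians
  lat <- locs$latitude * pi / 180
  lon <- locs$longitude * pi / 180

  # Pairwise differences for every (i, j) combination of locations
  dlat <- outer(lat, lat, "-")
  dlon <- outer(lon, lon, "-")

  # Haversine formula; pmin() guards against rounding slightly above 1
  a <- sin(dlat / 2)^2 + outer(cos(lat), cos(lat)) * sin(dlon / 2)^2
  d <- 2 * radius_miles * asin(pmin(1, sqrt(a)))

  # Label rows and columns with the location identifiers
  dimnames(d) <- list(locs$location, locs$location)
  d
}

# Made-up centroids, for illustration only
centroids <- data.frame(
  location  = c("A", "B", "C"),
  latitude  = c(40.0, 40.0, 41.0),
  longitude = c(-83.0, -82.0, -83.0)
)

dm <- haversine_matrix(centroids)
round(dm, 1)
```

The result is a symmetric matrix with zeros on the diagonal and location identifiers as row and column names. Miles are used here so that a limit such as 50 miles can be compared directly against the entries.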
Install the gsClusterDetect package from CRAN as
follows:
install.packages("gsClusterDetect")

Install the development version from GitHub as follows:

devtools::install_github("lmullany/gsClusterDetect")

The example below uses a data frame of counts with location, date, and count columns.

library(gsClusterDetect)
df <- example_count_data
tail(df)
   location       date count
     <char>     <IDat> <int>
1:    39171 2025-02-04     1
2:    39171 2025-02-05     0
3:    39173 2025-02-04     6
4:    39173 2025-02-05     7
5:    39175 2025-02-04     2
6:    39175 2025-02-05     0

To build a county-level distance matrix, call county_distance_matrix() and pass the state
abbreviation:

ohio_dm <- county_distance_matrix("OH")
# This is a named list of two elements
cat("Class:", class(ohio_dm), "\nNames:", names(ohio_dm))
Class: list
Names: loc_vec distance_matrix

The detection date is given by detect_date, a parameter that must be passed to the
find_clusters function. Typically, this might be the
current (or last available) date.

detect_date <- max(df[, date])

Identify clusters with the find_clusters() function; see
?find_clusters for the full set of options. Below,
we pass the minimum required elements (cases,
distance_matrix, and detect_date) and set
distance_limit (the maximum size of the clusters) to 50
miles.

clusters <- find_clusters(
cases = df,
distance_matrix = ohio_dm[["distance_matrix"]],
detect_date = detect_date,
distance_limit = 50
)

Copyright 2026 The Johns Hopkins University Applied Physics Laboratory LLC.