Introduction to grangersearch

Overview

The grangersearch package provides tools for running Granger causality tests on pairs of time series. A Granger causality test asks whether past values of one time series improve predictions of another.

library(grangersearch)

What is Granger Causality?

A variable X is said to Granger-cause Y if past values of X contain information that helps predict Y, above and beyond the information contained in past values of Y alone. This is not true causality in the philosophical sense, but rather predictive causality based on temporal precedence.

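The test fits a Vector Autoregressive (VAR) model and uses an F-test to compare the unrestricted model, which includes lags of the candidate cause, against a restricted model that omits them. A significant F-statistic means those lags improve the prediction.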
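
As a rough illustration of the restricted-versus-unrestricted comparison, the sketch below runs a single-equation version of the test for one direction (x to y) with one lag, using only base R (lm() and anova()). It is not the package's implementation, and its statistic need not match the package's output exactly; the data frame d and the model names are illustrative assumptions.

```r
# Simulated data: y depends on the previous value of x
set.seed(123)
n <- 100
x <- cumsum(rnorm(n))
y <- c(0, x[1:(n - 1)]) + rnorm(n, sd = 0.5)

# Align current values of y with the first lags of y and x
d <- data.frame(
  y      = y[2:n],
  y_lag1 = y[1:(n - 1)],
  x_lag1 = x[1:(n - 1)]
)

# Restricted model: y explained by its own past only
restricted <- lm(y ~ y_lag1, data = d)

# Unrestricted model: the lag of x is added
unrestricted <- lm(y ~ y_lag1 + x_lag1, data = d)

# F-test of the null hypothesis that the coefficient on x_lag1 is zero
f_test <- anova(restricted, unrestricted)
f_stat <- f_test$F[2]
p_val  <- f_test$`Pr(>F)`[2]
```

A small p_val means the lag of x reduces the residual variance of y by more than chance would explain, which is what "x Granger-causes y" means in practice.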

Basic Usage

Vector Input

The simplest way to use the package is with two numeric vectors:

# Generate example time series
set.seed(123)
n <- 100

# X is a random walk
x <- cumsum(rnorm(n))

# Y depends on lagged X (so X should Granger-cause Y)
y <- c(0, x[1:(n-1)]) + rnorm(n, sd = 0.5)

# Perform the test
result <- granger_causality_test(x = x, y = y)
print(result)
#> 
#> Granger Causality Test
#> ======================
#> 
#> Observations: 100, Lag order: 1, Significance level: 0.050
#> 
#> x -> y: x Granger-causes y (p = 0.0000)
#> y -> x: y does not Granger-cause x (p = 0.9694)

Detailed Summary

Use summary() for more detailed output:

summary(result)
#> 
#> Granger Causality Test - Detailed Summary
#> ==========================================
#> 
#> Call:
#> granger_causality_test(x = x, y = y)
#> 
#> Variables: x, y
#> Number of observations: 100
#> VAR lag order: 1
#> Test type: F-test
#> Significance level (alpha): 0.050
#> 
#> Results:
#> --------
#> 
#> Test: x Granger-causes y
#>   Test statistic: 476.1367
#>   P-value: 0.000000
#>   Conclusion: REJECT null (x causes y)
#> 
#> Test: y Granger-causes x
#>   Test statistic: 0.0015
#>   P-value: 0.969439
#>   Conclusion: FAIL TO REJECT null (y does not cause x)
#> 
#> Interpretation:
#> ---------------
#> Unidirectional causality: x Granger-causes y (but not vice versa).

Tidyverse Integration

The package supports tidyverse-style syntax, making it easy to use with data frames and pipes.

Using with Data Frames

library(tibble)

# Create a tibble with time series
df <- tibble(
  price = cumsum(rnorm(100)),
  volume = c(0, cumsum(rnorm(99)))
)

# Use column names directly
result <- granger_causality_test(df, price, volume)
print(result)
#> 
#> Granger Causality Test
#> ======================
#> 
#> Observations: 100, Lag order: 1, Significance level: 0.050
#> 
#> price -> volume: price does not Granger-cause volume (p = 0.0768)
#> volume -> price: volume does not Granger-cause price (p = 0.6559)

Using Pipes

# With base R pipe
df |>
  granger_causality_test(price, volume)
#> 
#> Granger Causality Test
#> ======================
#> 
#> Observations: 100, Lag order: 1, Significance level: 0.050
#> 
#> price -> volume: price does not Granger-cause volume (p = 0.0768)
#> volume -> price: volume does not Granger-cause price (p = 0.6559)

Tidy Output

For programmatic access to results, use tidy():

result <- granger_causality_test(x = x, y = y)
tidy(result)
#> # A tibble: 2 × 6
#>   direction cause effect statistic p.value significant
#>   <chr>     <chr> <chr>      <dbl>   <dbl> <lgl>      
#> 1 x -> y    x     y      476.        0     TRUE       
#> 2 y -> x    y     x        0.00147   0.969 FALSE

Use glance() for a model-level summary:

glance(result)
#> # A tibble: 1 × 7
#>    nobs   lag alpha test  bidirectional x_causes_y[,1] y_causes_x[,1]
#>   <int> <int> <dbl> <chr> <lgl>         <lgl>          <lgl>         
#> 1   100     1  0.05 F     FALSE         TRUE           FALSE

Adjusting Parameters

Lag Order

The lag parameter controls the number of lagged values used in the VAR model:

# Using lag = 2
result_lag2 <- granger_causality_test(x = x, y = y, lag = 2)
print(result_lag2)
#> 
#> Granger Causality Test
#> ======================
#> 
#> Observations: 100, Lag order: 2, Significance level: 0.050
#> 
#> x -> y: x Granger-causes y (p = 0.0000)
#> y -> x: y does not Granger-cause x (p = 0.4979)
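
One common way to choose a lag order is to fit the same regression for several candidate lags and compare an information criterion. The sketch below does this in base R for the y equation only, using AIC. The names max_lag, aic_by_lag, and best_lag are assumptions for illustration, and this is not necessarily how the package selects or validates lags.

```r
set.seed(123)
n <- 100
x <- cumsum(rnorm(n))
y <- c(0, x[1:(n - 1)]) + rnorm(n, sd = 0.5)

max_lag <- 4

# embed() puts the current value in column 1 and lag k in column k + 1.
# Building the matrices once with max_lag keeps the estimation sample
# identical across candidate lags, so the AIC values are comparable.
ey <- embed(y, max_lag + 1)
ex <- embed(x, max_lag + 1)

aic_by_lag <- sapply(seq_len(max_lag), function(p) {
  y_now  <- ey[, 1]
  y_lags <- ey[, 2:(p + 1), drop = FALSE]
  x_lags <- ex[, 2:(p + 1), drop = FALSE]
  AIC(lm(y_now ~ y_lags + x_lags))
})
names(aic_by_lag) <- paste0("lag", seq_len(max_lag))

# Candidate lag with the lowest AIC
best_lag <- unname(which.min(aic_by_lag))
```

The value in best_lag could then be passed as the lag argument, though it is worth checking that the conclusions do not change sharply between neighbouring lag orders.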

Significance Level

Adjust the significance level with the alpha parameter:

# More conservative test with alpha = 0.01
result_strict <- granger_causality_test(x = x, y = y, alpha = 0.01)
print(result_strict)
#> 
#> Granger Causality Test
#> ======================
#> 
#> Observations: 100, Lag order: 1, Significance level: 0.010
#> 
#> x -> y: x Granger-causes y (p = 0.0000)
#> y -> x: y does not Granger-cause x (p = 0.9694)

Interpreting Results

The function tests causality in both directions:

X → Y: Does X help predict Y?
Y → X: Does Y help predict X?

Possible outcomes:

Unidirectional causality: Only one direction is significant
Bidirectional causality: Both directions are significant
No causality: Neither direction is significant
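
The three outcomes can be expressed as a small decision rule on the two p-values. The helper below, classify_granger(), is hypothetical and not part of the package; it also assumes a direction counts as significant when its p-value is strictly below alpha.

```r
# Hypothetical helper: label the outcome from the two directional p-values
classify_granger <- function(p_xy, p_yx, alpha = 0.05) {
  x_to_y <- p_xy < alpha
  y_to_x <- p_yx < alpha
  if (x_to_y && y_to_x) {
    "bidirectional"
  } else if (x_to_y) {
    "unidirectional: x -> y"
  } else if (y_to_x) {
    "unidirectional: y -> x"
  } else {
    "none"
  }
}

# P-values taken from the printed results earlier in this document
classify_granger(0, 0.9694)       # "unidirectional: x -> y"
classify_granger(0.0768, 0.6559)  # "none"
```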

Accessing Individual Results

result <- granger_causality_test(x = x, y = y)

# Logical indicators
result$x_causes_y
#>      [,1]
#> [1,] TRUE
result$y_causes_x
#>       [,1]
#> [1,] FALSE

# P-values
result$p_value_xy
#> [1] 0
result$p_value_yx
#> [1] 0.9694391

# Test statistics
result$test_statistic_xy
#> [1] 476.1367
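
As the output shows, the logical indicators are printed as 1 x 1 matrices rather than plain logical values. If plain scalars are more convenient downstream, as.vector() drops the matrix dimensions. The sketch below uses a stand-in list, res, that only mimics the fields shown above so that it runs without the package; with a real result object the same as.vector() calls should apply.

```r
# Stand-in for a result object, mimicking the fields printed above
res <- list(
  x_causes_y = matrix(TRUE, 1, 1),
  y_causes_x = matrix(FALSE, 1, 1),
  p_value_xy = 0,
  p_value_yx = 0.9694391
)

# Drop the matrix dimensions to get plain logical scalars
x_to_y <- as.vector(res$x_causes_y)
y_to_x <- as.vector(res$y_causes_x)

# Collect the directions and keep the names of the significant ones
directions <- c("x -> y" = x_to_y, "y -> x" = y_to_x)
significant_directions <- names(directions)[directions]
```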

Example: Financial Data

A common application is testing whether one financial variable predicts another:

set.seed(42)
n <- 250  # About one year of trading days

# Simulate stock returns
stock_returns <- rnorm(n, mean = 0.0005, sd = 0.02)

# Assume, for this simulation, that volume leads price movements:
# lagged volume adds a small component to next-day returns
volume <- abs(rnorm(n, mean = 1000, sd = 200))
volume_effect <- c(0, 0.001 * scale(volume[1:(n-1)]))
price_with_volume <- stock_returns + volume_effect

df <- tibble(
  returns = price_with_volume,
  volume = volume
)

# Test if volume Granger-causes returns
result <- df |> granger_causality_test(volume, returns)
print(result)
#> 
#> Granger Causality Test
#> ======================
#> 
#> Observations: 250, Lag order: 1, Significance level: 0.050
#> 
#> volume -> returns: volume does not Granger-cause returns (p = 0.6774)
#> returns -> volume: returns does not Granger-cause volume (p = 0.5871)
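
The test finds no effect here even though one was built into the simulation. A likely reason is that the built-in effect is tiny relative to the noise in returns: the volume component has a standard deviation of about 0.001, against 0.02 for the random part. The sketch below recomputes the simulated series and measures the share of the variance in returns that comes from lagged volume; signal_share is an illustrative name, not a package output.

```r
# Recreate the simulated series from the example above
set.seed(42)
n <- 250
stock_returns <- rnorm(n, mean = 0.0005, sd = 0.02)
volume <- abs(rnorm(n, mean = 1000, sd = 200))
volume_effect <- c(0, 0.001 * scale(volume[1:(n - 1)]))

# Share of the variance of returns attributable to lagged volume
returns <- stock_returns + volume_effect
signal_share <- var(volume_effect) / var(returns)
```

With so small a share, 250 observations give the test little power, so a non-significant result does not show that no relationship exists.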

Important Notes

Stationarity: Granger causality tests assume stationary time series. A random walk, such as the x simulated above, is non-stationary; consider differencing such data (for example with diff()) before testing.
Lag selection: The choice of lag order matters. Too few lags may miss dynamics; too many reduce the power of the test.
Sample size: More observations give more reliable results. The minimum is 2 * lag + 2.
Not true causality: Granger causality indicates predictive relationships, not true causal mechanisms.
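
The stationarity and sample-size notes can be turned into a short pre-check. The sketch below uses only base R: diff() for first differences and stats::PP.test() (a Phillips-Perron unit-root test) as one possible stationarity check. The choice of test and the names dx, lag, and min_obs are assumptions for illustration; the package does not necessarily run any of these checks itself.

```r
set.seed(123)
x <- cumsum(rnorm(100))  # random walk: non-stationary in levels

# First differences; the series becomes one observation shorter
dx <- diff(x)

# Phillips-Perron test (null hypothesis: the series has a unit root)
pp_levels <- PP.test(x)
pp_diff   <- PP.test(dx)

# Minimum sample size stated in the note above: 2 * lag + 2
lag <- 2
min_obs <- 2 * lag + 2
enough_data <- length(dx) >= min_obs
```

A large p-value in pp_levels together with a small one in pp_diff would suggest testing the differenced series rather than the levels.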