xiacf: Nonlinear Dependence and Lead-Lag Analysis via Chatterjee’s Xi

The xiacf package provides a robust framework for detecting complex non-linear and functional dependence in time series data. Traditional linear metrics, such as the standard Autocorrelation Function (ACF) and Cross-Correlation Function (CCF), often fail to detect symmetrical or purely non-linear relationships.

This package overcomes these limitations by utilizing Chatterjee’s Rank Correlation (\(\xi\)), offering both univariate (\(\xi\)-ACF) and multivariate (\(\xi\)-CCF) analysis tools. It features rigorous statistical hypothesis testing powered by advanced surrogate data generation algorithms (IAAFT and MIAAFT) and dynamic Family-Wise Error Rate (FWER) control, all implemented in high-performance C++ using RcppArmadillo.
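For readers unfamiliar with the statistic, the following base-R sketch computes Chatterjee's \(\xi\) for data without ties and contrasts it with Pearson correlation on a symmetric relationship. The function name `xi_coef` is hypothetical and is not part of xiacf; the package's own implementation is in C++ and may handle ties and edge cases differently.

```r
# Minimal sketch of Chatterjee's xi using the no-ties formula:
#   xi = 1 - 3 * sum(|r[i + 1] - r[i]|) / (n^2 - 1)
# where r holds the ranks of y after the pairs are sorted by x.
# `xi_coef` is an illustrative name, not a function exported by xiacf.
xi_coef <- function(x, y) {
  n <- length(x)
  # order() breaks ties in x by position; the original definition
  # breaks them at random, which only matters for tied data.
  r <- rank(y[order(x)], ties.method = "max")
  1 - 3 * sum(abs(diff(r))) / (n^2 - 1)
}

set.seed(1)
x <- runif(500, -1, 1)
y <- x^2                      # symmetric, noiseless dependence

pearson <- cor(x, y)          # linear measure: close to zero here
xi_val  <- xi_coef(x, y)      # rank-based measure: close to one here
c(pearson = pearson, xi = xi_val)
```

Because the parabola is symmetric around zero, the linear correlation nearly cancels out, while \(\xi\) still registers that `y` is a function of `x`.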

Citation

If you use xiacf in your research, please cite our latest working paper detailing the methodology:

Watanabe, Y. (2026). Differential diagnosis of nonlinearity: Integrating the BDS omnibus test with Chatterjee’s xi for local structural identification. In Social Science Research Network. https://doi.org/10.2139/ssrn.6829431

Key Features

Installation

You can install the stable version of xiacf from CRAN with:

install.packages("xiacf")

You can install the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("yetanothersu/xiacf")

Basic Usage

1. Univariate Non-linear ACF (\(\xi\)-ACF)

Detect strong non-linear auto-dependence that the standard linear ACF fails to capture.

### 1. Univariate Non-linear ACF (xi-ACF)

library(xiacf)
library(ggplot2)

set.seed(42)
n <- 300

# Generate a series with V-shaped auto-dependence (mean zero)
# Standard Pearson ACF will miss this, but Xi-ACF will detect it.
A <- numeric(n)
A[1] <- rnorm(1)
for (t in 2:n) {
  # Subtracting 0.8 keeps the series centered around 0
  A[t] <- abs(A[t - 1]) - 0.8 + rnorm(1, sd = 0.2)
}

res_acf <- xi_acf(A, max_lag = 5, n_surr = 249)
print(res_acf)
#> 
#> === Univariate Xi-Autocorrelation Function ===
#> Time series length: 300
#> Max Lag: 5
#> Surrogates (IAAFT): 249
#> Significance Level: 0.05 (FWER controlled)
#> ==============================================
#> Significant Lags:
#>  Lag        Xi Global_Threshold Xi_Excess
#>    1 0.6301542        0.5302148 0.0999394
autoplot(res_acf)
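The printed summary reports that the null distribution comes from IAAFT surrogates. As a rough illustration of the idea, the base-R sketch below implements the textbook IAAFT iteration: it alternates between imposing the original Fourier amplitudes and imposing the original value distribution. The name `iaaft_sketch` and the fixed iteration count are assumptions made for this example; the package's C++ routine may use different stopping rules and details.

```r
# Textbook IAAFT iteration in base R (illustrative only, not xiacf's code).
# `iaaft_sketch` and `n_iter` are hypothetical names chosen for this sketch.
iaaft_sketch <- function(x, n_iter = 50) {
  n <- length(x)
  amp <- Mod(fft(x))        # target Fourier amplitudes
  sorted_x <- sort(x)       # target value distribution
  s <- sample(x)            # start from a random shuffle
  for (i in seq_len(n_iter)) {
    # Step 1: keep the current phases, impose the target amplitudes
    ph <- Arg(fft(s))
    s_spec <- Re(fft(amp * exp(1i * ph), inverse = TRUE)) / n
    # Step 2: impose the target distribution by rank remapping
    s <- sorted_x[rank(s_spec, ties.method = "first")]
  }
  s
}

set.seed(42)
x <- as.numeric(arima.sim(list(ar = 0.7), n = 256))
surr <- iaaft_sketch(x)

# Lag-1 linear autocorrelation of the original and of the surrogate
acf_x    <- acf(x, lag.max = 1, plot = FALSE)$acf[2]
acf_surr <- acf(surr, lag.max = 1, plot = FALSE)$acf[2]
c(original = acf_x, surrogate = acf_surr)
```

The surrogate contains exactly the same values as the original series and approximately the same linear autocorrelation, so any \(\xi\) value that stands out against such surrogates points to structure beyond the linear one.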

2. Bivariate Non-linear CCF (\(\xi\)-CCF)

Discover hidden lead-lag (potentially causal) pathways between two time series.

### 2. Bivariate Non-linear CCF (xi-CCF)

# X is pure noise (mean 0, symmetric)
X <- rnorm(n)
Y <- numeric(n)

# Y is a purely quadratic function of X at lag 2
for (t in 3:n) {
  Y[t] <- X[t - 2]^2 + rnorm(1, sd = 0.2)
}

# Center Y so it contains both negative and positive values.
# Without this, Y is almost always positive, so abs(Y) in step 3 would be nearly linear in Y.
Y <- as.numeric(scale(Y))

res_ccf <- xi_ccf(X, Y, max_lag = 5, n_surr = 249, direction = "both")
print(res_ccf)
#> 
#> === Bivariate Xi-Cross-Correlation (CCF) ===
#> Variables: X, Y
#> Time series length: 300
#> Max Lag: 5
#> Direction: both
#> Surrogates (MIAAFT): 249
#> Significance Level: 0.05 (FWER controlled)
#> ============================================
#> Top 5 Strongest Causal Pathways:
#>  Lead_Var Lag_Var Lag        Xi      CCF Global_Threshold Xi_Excess
#>         X       Y   2 0.7584119 0.123921        0.4662698 0.2921421
autoplot(res_ccf)
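The single `Global_Threshold` column suggests that one cutoff is shared by all lags. One common way to obtain such a cutoff is the maximum-statistic approach, sketched below in base R. This is an assumption about the general idea, not a description of the package's internals: the sketch uses plain random shuffles of `x` (valid here only because `x` is white noise) instead of MIAAFT surrogates, and the helper names `xi_coef` and `xi_by_lag` are hypothetical.

```r
# Chatterjee's xi for untied data (illustrative helper, not from xiacf)
xi_coef <- function(x, y) {
  n <- length(x)
  r <- rank(y[order(x)], ties.method = "max")
  1 - 3 * sum(abs(diff(r))) / (n^2 - 1)
}

# xi between x[t - k] and y[t] for k = 0, ..., max_lag (x leads y)
xi_by_lag <- function(x, y, max_lag) {
  n <- length(x)
  sapply(0:max_lag, function(k) xi_coef(x[1:(n - k)], y[(1 + k):n]))
}

set.seed(42)
n <- 300
x <- rnorm(n)
# y depends on x^2 at lag 2; the first two values are plain noise
y <- c(rnorm(2), x[1:(n - 2)]^2 + rnorm(n - 2, sd = 0.2))

observed <- xi_by_lag(x, y, max_lag = 5)

# Null distribution of the MAXIMUM xi over all lags.
# Assumption: random shuffles stand in for the MIAAFT surrogates.
null_max <- replicate(199, max(xi_by_lag(sample(x), y, max_lag = 5)))
threshold <- unname(quantile(null_max, 0.95))

# Lags whose xi exceeds the single global threshold
significant_lags <- which(observed > threshold) - 1
significant_lags
```

Comparing every lag against the 95% quantile of the per-surrogate maximum keeps the chance of any false positive across all lags near 5%, which is what family-wise error control means in this setting.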

3. Multivariate Network Matrix

Analyze an entire system of variables at once.

### 3. Multivariate Network Matrix & Pathway Extraction

Z <- numeric(n)

# Z depends on the absolute value of Y at lag 1
# Because Y is centered, this creates a true V-shaped non-linear relationship
for (t in 2:n) {
  Z[t] <- abs(Y[t - 1]) + rnorm(1, sd = 0.2)
}
Z <- as.numeric(scale(Z))

df_system <- data.frame(X = X, Y = Y, Z = Z)

# Compute the multivariate Xi-correlogram matrix
res_matrix <- xi_matrix(df_system, max_lag = 3, n_surr = 499)

autoplot(res_matrix)


# Extract the X -> Z pair for a detailed bivariate analysis with exact FWER re-evaluation
ext_ccf <- extract_xi_ccf(res_matrix, var_x = "X", var_y = "Z", direction = "x_leads")
autoplot(ext_ccf)
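A direction argument is meaningful because Chatterjee's \(\xi\) is not symmetric: it measures how well the second variable can be written as a function of the first. The base-R sketch below illustrates this with a quadratic relationship; `xi_coef` is again a hypothetical helper and not a function from xiacf.

```r
# Chatterjee's xi for untied data (illustrative helper, not from xiacf)
xi_coef <- function(x, y) {
  n <- length(x)
  r <- rank(y[order(x)], ties.method = "max")
  1 - 3 * sum(abs(diff(r))) / (n^2 - 1)
}

set.seed(7)
x <- rnorm(400)
y <- x^2 + rnorm(400, sd = 0.1)

xi_xy <- xi_coef(x, y)  # how well y is explained as a function of x
xi_yx <- xi_coef(y, x)  # how well x is explained as a function of y

c(x_to_y = xi_xy, y_to_x = xi_yx)
```

Knowing `x` pins down `y` almost exactly, while knowing `y` leaves the sign of `x` undetermined, so the first value is markedly larger than the second. The same asymmetry is what lets a lead-lag analysis distinguish "X leads Z" from "Z leads X".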

4. Rolling Analysis for Dynamic Relationships

Extract time-varying non-linear dependencies. The output is a tidy data frame, ready for custom exploratory analysis and visualization.

### 4. Rolling Analysis for Dynamic Relationships

library(future)
plan(multisession) # or multicore on Linux/macOS

rolling_res <- run_rolling_xi_ccf(
  x = X,
  y = Y,
  window_size = 50,
  step_size = 10,
  max_lag = 3,
  n_surr = 199
)

head(rolling_res)
#>   Window_ID Lead_Var Lag_Var Lag        Xi Global_Threshold Xi_Excess
#> 1         1        x       y   0 0.2955102        0.4966531 0.0000000
#> 2         1        x       y   1 0.2882653        0.4966531 0.0000000
#> 3         1        x       y   2 0.6923759        0.4966531 0.1957228
#> 4         1        x       y   3 0.2738205        0.4966531 0.0000000
#> 5         1        y       x   0 0.2669388        0.4966531 0.0000000
#> 6         1        y       x   1 0.3205782        0.4966531 0.0000000
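To make the roles of `window_size` and `step_size` concrete, the base-R sketch below slides a window over two simulated series and computes \(\xi\) at a single lag in each window. The windowing convention (first window starts at index 1, windows that would run past the end are dropped) and the helper `xi_coef` are assumptions for illustration; they are not taken from the package source, and no surrogate testing is performed here.

```r
# Chatterjee's xi for untied data (illustrative helper, not from xiacf)
xi_coef <- function(x, y) {
  n <- length(x)
  r <- rank(y[order(x)], ties.method = "max")
  1 - 3 * sum(abs(diff(r))) / (n^2 - 1)
}

set.seed(42)
n <- 300
x <- rnorm(n)
y <- c(rnorm(2), x[1:(n - 2)]^2 + rnorm(n - 2, sd = 0.2))

window_size <- 50
step_size   <- 10
lag         <- 2

# Assumed convention: window starts at 1, 11, 21, ... while the window fits
starts <- seq(1, n - window_size + 1, by = step_size)

rolling_xi <- data.frame(
  Window_ID = seq_along(starts),
  Start     = starts,
  Xi        = sapply(starts, function(s) {
    idx <- s:(s + window_size - 1)
    xw <- x[idx]
    yw <- y[idx]
    # Pair x[t - lag] with y[t] inside the window
    xi_coef(xw[1:(window_size - lag)], yw[(1 + lag):window_size])
  })
)

head(rolling_xi)
```

With 300 observations, a window of 50 and a step of 10, this convention yields 26 overlapping windows. Smaller windows react faster to changes in the relationship but give noisier \(\xi\) estimates.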

References

The theoretical foundation and surrogate data methodologies implemented in this package are based on the following works:

License

This project is licensed under the MIT License - see the LICENSE file for details.