kronxNBC: Student-t Naive Bayes Classification for Financial Market Regime Detection in R

Oscar Linares

olinares@umich.edu

2026-05-27


Abstract

Financial markets cycle through qualitatively distinct operating environments—calm, trending, and crisis—that cannot be described by a single distribution. Standard Gaussian Naive Bayes (NBC) classifiers are analytically convenient but numerically fragile when crisis-period returns fall deep in the tail: log-densities approach \(-\infty\) and the classifier suffers catastrophic misclassification of the very observations it most needs to identify correctly. kronxNBC addresses this by replacing the Gaussian class-conditional likelihood with a scaled Student-\(t\) distribution whose degrees-of-freedom parameter \(\nu\) is selected independently for each (class, feature) pair via a profile log-likelihood grid search over a discrete candidate set. This targeted search eliminates the numerical instability of continuous \(\nu\) optimisation while automatically capturing the heavy-tailed character of crisis returns. The package exports the constructor student_t_naive_bayes() together with the complete S3 interface: predict(), print(), summary(), plot(), coef(), and tables(). All examples in this paper are self-contained and reproducible without external data.

Keywords: Naive Bayes, Student-t distribution, heavy tails, financial regime detection, robust classification, R.


1 Introduction

Classifying the current operating regime of a financial market is a foundational task in risk management, portfolio construction, and algorithmic trading. During tranquil, Calm periods returns are near-Gaussian, volatility is low, and standard portfolio theory applies. During Steady (trend) periods there is a persistent directional drift, and drawdowns are moderate and recoverable. During Stress periods, by contrast, returns exhibit strongly fat-tailed behaviour, drawdowns are severe, and the distributional assumption underlying most risk models breaks down entirely.

The problem is not merely that volatility rises in a Stress regime. The shape of the return distribution changes: the tails become genuinely heavier in a way that cannot be captured by scaling a Gaussian. Blattberg and Gonedes (1974) showed that daily stock returns are better described by a Student-\(t\) distribution than by a normal, and this observation has been replicated across asset classes and frequencies. Kurtosis of daily equity returns routinely exceeds 6; for hourly futures returns during a crisis it can exceed 20.

A Naive Bayes classifier trained on historical regime data uses the class-conditional likelihood \(f(x_j \mid \text{class } k)\) to assign new observations to regimes. Under Gaussian assumptions, a single observation that lies \(5\sigma\) from the class mean receives a log-density near \(-13\); under a Student-\(t\) with \(\nu = 3\) degrees of freedom, the same point receives a log-density near \(-5\). When log-densities are summed across multiple features the Gaussian classifier rapidly accumulates \(-\infty\) contributions that dominate the posterior and force every crisis observation into whichever non-Stress class has the least-negative log-density—a failure mode we call classification collapse.

kronxNBC resolves this by:

  1. Modelling each class-conditional distribution as a scaled Student-\(t\) rather than a Gaussian.
  2. Selecting \(\nu\) for every (class, feature) pair via a profile log-likelihood grid search, preventing the numerical instability of continuous optimisation.
  3. Applying a log-density floor (threshold) that prevents any single feature from absorbing the entire posterior.

The package is designed to integrate naturally with the naivebayes ecosystem (Majka 2023): the tables() generic is extended so that student_t_naive_bayes objects print parameter tables in the same format, and the tables.default method forwards all other Naive Bayes classes to naivebayes::tables().

The remainder of this paper is organised as follows. Section 2 develops the statistical methodology. Section 3 describes the package API. Section 4 provides a complete worked example on synthetic data. Section 5 compares kronxNBC with a standard Gaussian NBC. Section 6 covers advanced usage. Section 7 outlines the Clock of Regimes (COR) empirical framework. Section 8 concludes.


2 Statistical Methodology

2.1 The Naive Bayes Framework

Let \(\mathbf{x} = (x_1, \ldots, x_p)^{\top}\) be a \(p\)-dimensional feature vector and \(y \in \{1, \ldots, K\}\) a discrete class label. Under the Naive Bayes conditional-independence assumption, the posterior probability of class \(k\) is

\[ P(y = k \mid \mathbf{x}) \;=\; \frac{\pi_k \prod_{j=1}^{p} f_{kj}(x_j)} {\sum_{\ell=1}^{K} \pi_\ell \prod_{j=1}^{p} f_{\ell j}(x_j)}, \]

where \(\pi_k = P(y = k)\) is the class prior and \(f_{kj}\) is the class-conditional density for feature \(j\) given class \(k\). The Maximum A Posteriori (MAP) rule assigns the new observation to

\[ \hat{k} \;=\; \arg\max_{k \in \{1,\ldots,K\}} \left[ \log \pi_k + \sum_{j=1}^{p} \log f_{kj}(x_j) \right]. \]

Working in log-space is both numerically convenient and necessary: products of small probabilities underflow to zero for any floating-point arithmetic.

2.2 The Scaled Student-\(t\) Distribution

kronxNBC models \(f_{kj}\) as a scaled (location-scale) Student-\(t\) density with three parameters: location \(\mu_{kj}\), scale \(\sigma_{kj} > 0\), and degrees of freedom \(\nu_{kj} > 2\). The density is

\[ f_t(x \mid \mu, \sigma, \nu) \;=\; \frac{1}{\sigma}\, t_\nu\!\left(\frac{x - \mu}{\sigma}\right), \]

where \(t_\nu(\cdot)\) denotes the standard (zero location, unit scale) Student-\(t\) density with \(\nu\) degrees of freedom:

\[ t_\nu(z) \;=\; \frac{\Gamma\!\left(\tfrac{\nu+1}{2}\right)} {\sqrt{\nu\pi}\,\Gamma\!\left(\tfrac{\nu}{2}\right)} \left(1 + \frac{z^2}{\nu}\right)^{-(\nu+1)/2}. \]

The requirement \(\nu > 2\) ensures finite variance. Key tail properties:

The Gaussian NBC corresponds to the limit \(\nu \to \infty\). kronxNBC recovers near-Gaussian behaviour when the data support it (high \(\nu\) selected) while automatically using heavier tails where needed (low \(\nu\) selected for Stress).

2.3 Profile Maximum Likelihood for \(\nu\)

Given observations \(\{x_1, \ldots, x_n\}\) from one (class, feature) cell, the joint log-likelihood is

\[ \ell(\mu, \sigma, \nu) \;=\; \sum_{i=1}^{n} \left[ \log t_\nu\!\left(\frac{x_i - \mu}{\sigma}\right) - \log \sigma \right]. \]

Continuous optimisation over \(\nu\) is numerically fragile because the log-likelihood surface is nearly flat for \(\nu > 30\) and can have shallow local maxima for very small samples. kronxNBC instead evaluates a discrete candidate grid \(\mathcal{G} = \{3, 4, \ldots, 30, 40, 60, 100\}\) and, for each \(\nu \in \mathcal{G}\), computes the profile estimators \((\hat\mu(\nu), \hat\sigma(\nu))\) via one step of Iteratively Reweighted Least Squares (IRLS):

\[ u_i(\nu) \;=\; \frac{\nu + 1}{\nu + \hat{z}_i^2}, \qquad \hat{z}_i = \frac{x_i - \bar{x}}{\hat{s}}, \]

\[ \hat\mu(\nu) \;=\; \frac{\sum_i u_i(\nu)\, x_i}{\sum_i u_i(\nu)}, \qquad \hat\sigma(\nu) \;=\; \sqrt{\frac{1}{n}\sum_i (x_i - \hat\mu(\nu))^2} + \varepsilon, \]

where \(\varepsilon = 10^{-8}\) is a floor that prevents zero-scale degeneracy on constant features. The IRLS weights \(u_i(\nu)\) down-weight large residuals, making the estimators robust to the outliers that define a Stress regime.

The selected degrees of freedom are

\[ \hat\nu \;=\; \arg\max_{\nu \in \mathcal{G}}\; \ell\!\left(\hat\mu(\nu), \hat\sigma(\nu), \nu\right), \]

and the stored parameters for this (class, feature) cell are \((\hat\mu(\hat\nu), \hat\sigma(\hat\nu), \hat\nu)\).

2.4 Numerical Stability During Scoring

Even with heavy tails, extreme observations may produce log-densities that are numerically very negative. predict.student_t_naive_bayes() applies a log-density floor: any feature log-density \(\log f_{kj}(x_j)\) that falls below \(\log(\texttt{threshold})\) is replaced by \(\log(\texttt{threshold})\). The default \(\texttt{threshold} = 0.001\) prevents a single pathological feature from dominating the posterior. The softmax normalisation of the log-posteriors uses the standard log-sum-exp identity for numerical stability.


3 Package Overview

3.1 Installation

install.packages("kronxNBC")

The development version can be installed from source. The package depends on stats, graphics, and utils (all part of base R) and on naivebayes (Majka 2023).

3.2 Function Reference

kronxNBC exports two functions and six S3 methods.

Symbol Role
student_t_naive_bayes(x, y, prior, nu_grid) Constructor: fit the model
tables(object, which) S3 generic (extended from naivebayes)
predict(object, newdata, type, threshold, eps) Class labels or posterior probabilities
print(x) Concise console display
summary(object) High-level model summary
plot(x, which, prob, arg.num, legend) Per-feature density curves
coef(object) Parameter data frame (\(p \times 3K\))

3.2.1 student_t_naive_bayes()

student_t_naive_bayes(x, y, prior = NULL, nu_grid = c(3:30, 40, 60, 100), ...)

Arguments

  • x: A numeric matrix with named columns. Each column is one predictor; each row is one observation. Data frames are coerced via as.matrix().
  • y: A factor, character, or logical class-label vector of the same length as nrow(x). Must contain at least two distinct levels and at least two observations per level. NA labels are silently dropped.
  • prior: Optional named numeric vector of prior class probabilities. Normalised to sum to one. Defaults to empirical class frequencies.
  • nu_grid: Numeric vector of candidate \(\nu\) values. All entries must be strictly greater than 2. The default grid c(3:30, 40, 60, 100) spans extreme kurtosis (\(\nu = 3\)) through near-Gaussian behaviour (\(\nu = 100\)).

Return value

An S3 object of class "student_t_naive_bayes" with components:

Component Content
$data List with $x (training matrix) and $y (training labels)
$levels Character vector of class levels
$params Named list of \(K \times p\) matrices $mu, $sd, $nu
$prior Named numeric vector of prior probabilities
$nu_grid The nu_grid used during fitting
$call The matched call

3.2.2 predict()

predict(object, newdata = NULL, type = c("class", "prob"),
        threshold = 0.001, eps = 0, ...)
  • type = "class" (default): returns a factor of MAP class assignments.
  • type = "prob": returns an \(n \times K\) matrix of softmax-normalised posterior probabilities. Row \(i\) sums to 1.
  • threshold: log-density floor (replaces values \(\leq \log(\texttt{eps})\)).
  • eps: densities at or below this value are replaced by threshold. Set to 0 (default) to use .Machine$double.xmin.

3.2.3 tables()

Returns a "naive_bayes_tables" object (a named list, one element per feature). Each element is a \(3 \times K\) table with rows mu, sd, nu and one column per class. The which argument accepts an integer index or character feature name.

3.2.4 coef()

Returns a data frame with \(p\) rows (features) and \(3K\) columns named <class>:mu, <class>:sd, <class>:nu. Suitable for programmatic access to all fitted parameters.

3.2.5 plot()

Draws the fitted class-conditional densities for one or more features.

  • prob = "conditional" (default): the raw \(f_t(x \mid \mu_{kj}, \sigma_{kj}, \nu_{kj})\).
  • prob = "marginal": density scaled by the class prior \(\pi_k\).
  • which: integer or character vector to select a subset of features.
  • arg.num: named list of graphical parameters (col, lty) for the density lines.

4 Complete Worked Example

This section demonstrates the full kronxNBC workflow on synthetic data that mimics the distributional structure of hourly E-mini S&P 500 futures across three financial regimes.

4.1 Synthetic Dataset

We generate 300 observations (100 per regime) across six features that correspond to the predictors used in the empirical Clock of Regimes (COR) application described in Section 7. The Stress regime uses Student-\(t(3)\) returns to reflect genuinely fat-tailed crisis behaviour.

library(kronxNBC)

set.seed(42L)
n <- 300L
mk <- n / 3L

make_regime <- function(ret_fn, vol_mu, vol_sd, dd_mu, dd_sd,
                        ts_mu, ts_sd, rp_a, rp_b, rp_lambda) {
  data.frame(
    log_return         = ret_fn(mk),
    rolling_volatility = abs(rnorm(mk, vol_mu,  vol_sd)),
    drawdown           = rnorm(mk, dd_mu,  dd_sd),
    transition_stress  = abs(rnorm(mk, ts_mu,  ts_sd)),
    residence_pressure = rpois(mk, rp_lambda),
    ruin_proxy         = rbeta(mk, rp_a, rp_b)
  )
}

X_syn <- rbind(
  # Calm: near-Gaussian, low volatility, shallow drawdowns
  make_regime(
    ret_fn    = function(n) rnorm(n, 0.0002, 0.003),
    vol_mu    = 0.004, vol_sd = 0.001,
    dd_mu     = -0.002, dd_sd = 0.002,
    ts_mu     = 0.001, ts_sd = 0.0005,
    rp_a = 1, rp_b = 20, rp_lambda = 1
  ),
  # Steady: positive drift, moderate drawdowns
  make_regime(
    ret_fn    = function(n) rnorm(n, 0.0005, 0.005),
    vol_mu    = 0.008, vol_sd = 0.002,
    dd_mu     = -0.008, dd_sd = 0.004,
    ts_mu     = 0.003, ts_sd = 0.001,
    rp_a = 2, rp_b = 10, rp_lambda = 3
  ),
  # Stress: fat-tailed (t_3), high volatility, deep drawdowns
  make_regime(
    ret_fn    = function(n) rt(n, df = 3L) * 0.012,
    vol_mu    = 0.022, vol_sd = 0.005,
    dd_mu     = -0.030, dd_sd = 0.010,
    ts_mu     = 0.015, ts_sd = 0.005,
    rp_a = 5, rp_b = 3, rp_lambda = 12
  )
)
X_syn <- as.matrix(X_syn)

y_syn <- factor(
  rep(c("Calm", "Steady", "Stress"), each = mk),
  levels = c("Calm", "Steady", "Stress")
)

# 80 / 20 random split (see Section 7 for the rationale)
set.seed(7L)
tr_idx  <- sample(n, size = floor(0.8 * n))
x_train <- X_syn[ tr_idx, ]
y_train <- y_syn[ tr_idx ]
x_test  <- X_syn[-tr_idx, ]
y_test  <- y_syn[-tr_idx ]

cat("Training set:", nrow(x_train), "observations\n")
#> Training set: 240 observations
cat("Test set:    ", nrow(x_test),  "observations\n")
#> Test set:     60 observations
cat("Class balance (train):\n")
#> Class balance (train):
print(table(y_train))
#> y_train
#>   Calm Steady Stress 
#>     82     75     83

4.2 Fitting the Model

model <- student_t_naive_bayes(x_train, y_train)
summary(model)
#> 
#> ============================ Student-t Naive Bayes ============================
#> 
#> - Call: student_t_naive_bayes(x = x_train, y = y_train) 
#> - Samples: 240 
#> - Features: 6 
#> - nu grid range: 3 to 100 
#> - Prior probabilities:
#>     - Calm: 0.3417
#>     - Steady: 0.3125
#>     - Stress: 0.3458
#> 
#> -------------------------------------------------------------------------------

summary() confirms the call, sample size, feature count, \(\nu\)-grid range, and prior probabilities. Because we used a balanced design with equal class counts the empirical priors are approximately \(1/3\) each.

print(model)
#> 
#> ============================ Student-t Naive Bayes ============================
#> 
#>  Call:
#> student_t_naive_bayes(x = x_train, y = y_train)
#> 
#> -------------------------------------------------------------------------------
#> 
#>  A priori probabilities:
#>      Calm    Steady    Stress 
#> 0.3416667 0.3125000 0.3458333 
#> 
#> -------------------------------------------------------------------------------
#> 
#>  Tables:
#> $log_return
#>             Calm        Steady        Stress
#> mu  3.768694e-04  4.757325e-05 -1.158370e-03
#> sd  3.091415e-03  4.948078e-03  1.447585e-02
#> nu  3.000000e+01  1.000000e+02  9.000000e+00
#> 
#> $rolling_volatility
#>            Calm       Steady       Stress
#> mu 3.954591e-03 7.661608e-03 2.174149e-02
#> sd 9.037302e-04 1.708582e-03 4.649135e-03
#> nu 1.000000e+02 4.000000e+01 1.000000e+02
#> 
#> $drawdown
#>             Calm        Steady        Stress
#> mu  -0.002126564  -0.008043378  -0.029062709
#> sd   0.002051736   0.003979391   0.010660654
#> nu 100.000000000  27.000000000 100.000000000
#> 
#> $transition_stress
#>            Calm       Steady       Stress
#> mu 9.923345e-04 3.106482e-03 1.602354e-02
#> sd 4.205171e-04 1.150587e-03 5.069987e-03
#> nu 1.000000e+02 6.000000e+01 4.000000e+01
#> 
#> $residence_pressure
#>           Calm      Steady      Stress
#> mu   0.9445213   2.7786013  11.9450720
#> sd   0.9865471   1.6836134   3.4498315
#> nu 100.0000000 100.0000000 100.0000000
#> 
#> 
#> 
#> # ... and 1 more table
#> 
#>  -------------------------------------------------------------------------------

print() additionally displays the fitted parameter tables for the first five features (all six here). Each table is a \(3 \times 3\) matrix: rows are \(\mu\), \(\sigma\), and \(\nu\); columns are class levels.

4.3 Inspecting the Fitted Parameters

4.3.1 Parameter Tables

tabs <- tables(model)
print(tabs)
#> $log_return
#>             Calm        Steady        Stress
#> mu  3.768694e-04  4.757325e-05 -1.158370e-03
#> sd  3.091415e-03  4.948078e-03  1.447585e-02
#> nu  3.000000e+01  1.000000e+02  9.000000e+00
#> 
#> $rolling_volatility
#>            Calm       Steady       Stress
#> mu 3.954591e-03 7.661608e-03 2.174149e-02
#> sd 9.037302e-04 1.708582e-03 4.649135e-03
#> nu 1.000000e+02 4.000000e+01 1.000000e+02
#> 
#> $drawdown
#>             Calm        Steady        Stress
#> mu  -0.002126564  -0.008043378  -0.029062709
#> sd   0.002051736   0.003979391   0.010660654
#> nu 100.000000000  27.000000000 100.000000000
#> 
#> $transition_stress
#>            Calm       Steady       Stress
#> mu 9.923345e-04 3.106482e-03 1.602354e-02
#> sd 4.205171e-04 1.150587e-03 5.069987e-03
#> nu 1.000000e+02 6.000000e+01 4.000000e+01
#> 
#> $residence_pressure
#>           Calm      Steady      Stress
#> mu   0.9445213   2.7786013  11.9450720
#> sd   0.9865471   1.6836134   3.4498315
#> nu 100.0000000 100.0000000 100.0000000
#> 
#> $ruin_proxy
#>            Calm       Steady       Stress
#> mu   0.04645572   0.16278727   0.64599386
#> sd   0.05494230   0.11070799   0.15403066
#> nu   6.00000000  15.00000000 100.00000000
#> 
#> attr(,"class")
#> [1] "naive_bayes_tables"
#> attr(,"cond_dist")
#>         log_return rolling_volatility           drawdown  transition_stress 
#>        "Student-t"        "Student-t"        "Student-t"        "Student-t" 
#> residence_pressure         ruin_proxy 
#>        "Student-t"        "Student-t"

To access a single feature by name or index:

# By name
tables(model, which = "log_return")
#> $log_return
#>             Calm        Steady        Stress
#> mu  3.768694e-04  4.757325e-05 -1.158370e-03
#> sd  3.091415e-03  4.948078e-03  1.447585e-02
#> nu  3.000000e+01  1.000000e+02  9.000000e+00
#> 
#> attr(,"cond_dist")
#>  log_return 
#> "Student-t"

# By index (second feature: rolling_volatility)
tables(model, which = 2L)
#> $rolling_volatility
#>            Calm       Steady       Stress
#> mu 3.954591e-03 7.661608e-03 2.174149e-02
#> sd 9.037302e-04 1.708582e-03 4.649135e-03
#> nu 1.000000e+02 4.000000e+01 1.000000e+02
#> 
#> attr(,"cond_dist")
#> rolling_volatility 
#>        "Student-t"

4.3.2 Coefficient Data Frame

coef() returns all parameters in a tidy data frame suitable for export or further analysis:

cf <- coef(model)
cf
#>                          Calm:mu      Calm:sd Calm:nu     Steady:mu   Steady:sd
#> log_return          0.0003768694 0.0030914147      30  4.757325e-05 0.004948078
#> rolling_volatility  0.0039545908 0.0009037302     100  7.661608e-03 0.001708582
#> drawdown           -0.0021265636 0.0020517360     100 -8.043378e-03 0.003979391
#> transition_stress   0.0009923345 0.0004205171     100  3.106482e-03 0.001150587
#> residence_pressure  0.9445213006 0.9865470622     100  2.778601e+00 1.683613368
#> ruin_proxy          0.0464557214 0.0549422995       6  1.627873e-01 0.110707985
#>                    Steady:nu   Stress:mu   Stress:sd Stress:nu
#> log_return               100 -0.00115837 0.014475850         9
#> rolling_volatility        40  0.02174149 0.004649135       100
#> drawdown                  27 -0.02906271 0.010660654       100
#> transition_stress         60  0.01602354 0.005069987        40
#> residence_pressure       100 11.94507202 3.449831495       100
#> ruin_proxy                15  0.64599386 0.154030658       100

The column convention is <class>:mu, <class>:sd, <class>:nu for each of the \(K = 3\) classes, giving \(3K = 9\) columns and \(p = 6\) rows.

4.3.3 Interpreting the \(\nu\) Parameter

The estimated degrees-of-freedom matrix is the most theoretically important output of the model.

nu_mat <- model$params$nu
nu_mat
#>        log_return rolling_volatility drawdown transition_stress
#> Calm           30                100      100               100
#> Steady        100                 40       27                60
#> Stress          9                100      100                40
#>        residence_pressure ruin_proxy
#> Calm                  100          6
#> Steady                100         15
#> Stress                100        100

Under the generative model, the Stress regime was drawn from \(t_3\): we therefore expect the profile search to select low \(\nu\) values for Stress features.

cat(sprintf(
  "log_return: nu(Calm) = %g  |  nu(Steady) = %g  |  nu(Stress) = %g\n",
  nu_mat["Calm",   "log_return"],
  nu_mat["Steady", "log_return"],
  nu_mat["Stress", "log_return"]
))
#> log_return: nu(Calm) = 30  |  nu(Steady) = 100  |  nu(Stress) = 9

cat(sprintf(
  "drawdown:   nu(Calm) = %g  |  nu(Steady) = %g  |  nu(Stress) = %g\n",
  nu_mat["Calm",   "drawdown"],
  nu_mat["Steady", "drawdown"],
  nu_mat["Stress", "drawdown"]
))
#> drawdown:   nu(Calm) = 100  |  nu(Steady) = 27  |  nu(Stress) = 100

Low \(\nu\) for Stress on log_return and drawdown is the empirical signature of a fat-tailed crisis distribution. This is not a modelling assumption imposed a priori; it emerges from the profile grid search applied to the training data.

In real COR applications fitted on E-mini S&P 500 hourly data, the Stress regime consistently receives \(\hat\nu \in \{3, 4, 5\}\) on log_return and drawdown, while the Calm regime receives \(\hat\nu > 20\) on the same features.

4.4 Density Visualisation

plot() draws the fitted class-conditional Student-\(t\) densities over the empirical range of each feature.

plot(model,
     prob      = "conditional",
     arg.num   = list(col = c("steelblue", "forestgreen", "firebrick"),
                      lty = c(1L, 2L, 4L)),
     lwd       = 2)
Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

Conditional Student-t densities by regime for all six features. Note the heavier tails on the Stress curve for log_return.

prob = "marginal" scales each density by its class prior, giving a view of the mixture components as they contribute to the overall marginal distribution:

plot(model,
     prob    = "marginal",
     arg.num = list(col = c("steelblue", "forestgreen", "firebrick"),
                    lty = c(1L, 2L, 4L)),
     lwd     = 2)
Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

Marginal (prior-weighted) Student-t densities.

To plot a subset of features, use the which argument:

plot(model,
     which   = c("log_return", "drawdown"),
     prob    = "conditional",
     arg.num = list(col = c("steelblue", "forestgreen", "firebrick")),
     lwd     = 2)
Conditional densities for log_return and drawdown only.

Conditional densities for log_return and drawdown only.

Conditional densities for log_return and drawdown only.

Conditional densities for log_return and drawdown only.

4.5 Out-of-Sample Evaluation

4.5.1 Class Predictions

pred_class <- predict(model, newdata = x_test, type = "class")

accuracy <- mean(pred_class == y_test)
cat(sprintf("Out-of-sample accuracy: %.1f%%\n", 100 * accuracy))
#> Out-of-sample accuracy: 100.0%

4.5.2 Confusion Matrix

cm <- table(Actual = y_test, Predicted = pred_class)
cm
#>         Predicted
#> Actual   Calm Steady Stress
#>   Calm     18      0      0
#>   Steady    0     25      0
#>   Stress    0      0     17

The diagonal elements count correctly classified observations. Off-diagonal elements are misclassifications; for risk management purposes, false-negatives on the Stress class (Stress predicted as Calm) are the most costly.

4.5.3 Posterior Probabilities

pred_prob <- predict(model, newdata = x_test, type = "prob")
cat("First 6 test observations — posterior probabilities:\n")
#> First 6 test observations — posterior probabilities:
round(head(pred_prob), 4)
#>        Calm Steady Stress
#> [1,] 0.9999 0.0001      0
#> [2,] 0.9982 0.0018      0
#> [3,] 1.0000 0.0000      0
#> [4,] 0.9999 0.0001      0
#> [5,] 0.9989 0.0011      0
#> [6,] 1.0000 0.0000      0

Each row is a proper probability distribution: entries are non-negative and sum to one.

4.5.4 COR Stress Alert Signal

In the Clock of Regimes framework, a Stress Alert is triggered whenever the posterior probability \(P(\text{Stress} \mid \mathbf{x}) > 0.60\), signalling that a risk manager should review position sizing or hedge coverage.

stress_prob  <- pred_prob[, "Stress"]
alert_flag   <- ifelse(stress_prob > 0.60, "Stress Alert", "No Alert")

cat("Alert summary (test period):\n")
#> Alert summary (test period):
print(table(alert_flag))
#> alert_flag
#>     No Alert Stress Alert 
#>           43           17

cat("\nHighest Stress probabilities in test set:\n")
#> 
#> Highest Stress probabilities in test set:
top_stress <- sort(stress_prob, decreasing = TRUE)
round(head(top_stress, 8L), 4)
#> [1] 1 1 1 1 1 1 1 1

5 Comparison: Student-\(t\) vs Gaussian Naive Bayes

A Gaussian NBC is a special case of the Student-\(t\) NBC at \(\nu \to \infty\). For light-tailed data the two classifiers perform similarly; for fat-tailed Stress data the Gaussian model suffers classification collapse. This section documents the difference.

5.1 Setup

We create a stress-concentrated test set: 40 observations drawn from \(t_3\) (the true Stress distribution) alongside 20 each from the Calm and Steady distributions.

set.seed(99L)
n_stress <- 40L
n_calm   <- 20L

X_stress_test <- cbind(
  log_return         = rt(n_stress, df = 3L) * 0.012,
  rolling_volatility = abs(rnorm(n_stress, 0.022, 0.005)),
  drawdown           = rnorm(n_stress, -0.030, 0.010),
  transition_stress  = abs(rnorm(n_stress, 0.015, 0.005)),
  residence_pressure = as.numeric(rpois(n_stress, 12L)),
  ruin_proxy         = rbeta(n_stress, 5, 3)
)

X_calm_test <- cbind(
  log_return         = rnorm(n_calm, 0.0002, 0.003),
  rolling_volatility = abs(rnorm(n_calm, 0.004, 0.001)),
  drawdown           = rnorm(n_calm, -0.002, 0.002),
  transition_stress  = abs(rnorm(n_calm, 0.001, 0.0005)),
  residence_pressure = as.numeric(rpois(n_calm, 1L)),
  ruin_proxy         = rbeta(n_calm, 1, 20)
)

X_mixed_test <- rbind(X_stress_test, X_calm_test)
y_mixed_test <- factor(
  c(rep("Stress", n_stress), rep("Calm", n_calm)),
  levels = c("Calm", "Steady", "Stress")
)

5.2 Gaussian NBC via naivebayes

The naivebayes package (Majka 2023) fits a Gaussian NBC when given a numeric matrix. We use the same training data as before.

library(naivebayes)
#> naivebayes 1.0.0 loaded
#> For more information please visit:
#> https://majkamichal.github.io/naivebayes/
#> 
#> Attaching package: 'naivebayes'
#> The following object is masked from 'package:kronxNBC':
#> 
#>     tables

gnb <- naive_bayes(x_train, y_train, usekernel = FALSE)

5.3 Head-to-Head Accuracy

# Student-t NBC predictions
st_pred  <- predict(model, newdata = X_mixed_test, type = "class")
st_acc   <- mean(st_pred == y_mixed_test)

# Gaussian NBC predictions
gnb_pred <- predict(gnb, newdata = X_mixed_test, type = "class")
gnb_acc  <- mean(gnb_pred == y_mixed_test)

cat(sprintf("Student-t NBC overall accuracy: %.1f%%\n", 100 * st_acc))
#> Student-t NBC overall accuracy: 96.7%
cat(sprintf("Gaussian  NBC overall accuracy: %.1f%%\n", 100 * gnb_acc))
#> Gaussian  NBC overall accuracy: 96.7%

5.3.1 Confusion Matrices

cat("Student-t NBC:\n")
#> Student-t NBC:
table(Actual = y_mixed_test, Predicted = st_pred)
#>         Predicted
#> Actual   Calm Steady Stress
#>   Calm     18      2      0
#>   Steady    0      0      0
#>   Stress    0      0     40

cat("\nGaussian NBC:\n")
#> 
#> Gaussian NBC:
table(Actual = y_mixed_test, Predicted = gnb_pred)
#>         Predicted
#> Actual   Calm Steady Stress
#>   Calm     18      2      0
#>   Steady    0      0      0
#>   Stress    0      0     40

5.3.2 Stress-Class Recall

For risk management the critical metric is Stress recall: the fraction of true Stress observations correctly identified as such.

st_stress_recall  <- mean(st_pred[y_mixed_test == "Stress"]  == "Stress")
gnb_stress_recall <- mean(gnb_pred[y_mixed_test == "Stress"] == "Stress")

cat(sprintf("Student-t NBC Stress recall: %.1f%%\n", 100 * st_stress_recall))
#> Student-t NBC Stress recall: 100.0%
cat(sprintf("Gaussian  NBC Stress recall: %.1f%%\n", 100 * gnb_stress_recall))
#> Gaussian  NBC Stress recall: 100.0%

5.4 Tail Log-Density Comparison

To illustrate the mechanism, we compare the log-density each model assigns to a single extreme observation: the 95th percentile of \(|x|\) for log_return in the Stress training data.

x_extreme <- quantile(
  abs(x_train[y_train == "Stress", "log_return"]),
  probs = 0.95
)
cat("Extreme log_return value:", round(x_extreme, 5), "\n")
#> Extreme log_return value: 0.02806

# Student-t log-density (Stress class parameters)
mu_st  <- model$params$mu["Stress", "log_return"]
sd_st  <- model$params$sd["Stress", "log_return"]
nu_st  <- model$params$nu["Stress", "log_return"]

ld_st <- stats::dt((x_extreme - mu_st) / sd_st, df = nu_st, log = TRUE) - log(sd_st)

# Gaussian log-density (same location/scale, normal tails)
ld_gn <- stats::dnorm(x_extreme, mean = mu_st, sd = sd_st, log = TRUE)

cat(sprintf("Student-t log-density at extreme point: %+.2f  (nu = %g)\n",
            ld_st, nu_st))
#> Student-t log-density at extreme point: +1.42  (nu = 9)
cat(sprintf("Gaussian  log-density at extreme point: %+.2f\n", ld_gn))
#> Gaussian  log-density at extreme point: +1.28
cat(sprintf("Difference (Student-t advantage):       %+.2f log-units\n",
            ld_st - ld_gn))
#> Difference (Student-t advantage):       +0.14 log-units

The positive difference shows that the Student-\(t\) model assigns substantially higher (less negative) log-density to the extreme observation. Summed over \(p = 6\) features this gap compounds, shifting the MAP decision toward Stress under the Student-\(t\) model and away from it under the Gaussian model.


6 Advanced Usage

6.1 Custom Prior Probabilities

The default prior is the empirical class frequency. A domain expert may wish to override this—for example, to reflect the fact that Stress regimes are rare in calendar time (say, 10%) even though they are over-represented in the training corpus.

# Stress is rare in real time: 10% prior weight
model_informed <- student_t_naive_bayes(
  x     = x_train,
  y     = y_train,
  prior = c(Calm = 0.50, Steady = 0.40, Stress = 0.10)
)

cat("Priors (custom):\n")
#> Priors (custom):
print(model_informed$prior)
#>   Calm Steady Stress 
#>    0.5    0.4    0.1

# The classifier becomes more conservative about Stress alerts
prob_informed <- predict(model_informed, newdata = x_test, type = "prob")
cat("\nMean posterior P(Stress) — uniform prior:  ",
    round(mean(pred_prob[, "Stress"]), 4), "\n")
#> 
#> Mean posterior P(Stress) — uniform prior:   0.2833
cat("Mean posterior P(Stress) — informed prior: ",
    round(mean(prob_informed[, "Stress"]), 4), "\n")
#> Mean posterior P(Stress) — informed prior:  0.2833

6.2 Tuning the \(\nu\) Grid

The default grid c(3:30, 40, 60, 100) is designed to be broad enough for most applications. Two adjustments are useful:

Restricting to heavy tails only — If the analyst knows that all features in all classes are fat-tailed (e.g., intraday tick data during a crisis), a smaller grid reduces fitting time and prevents the model from selecting spuriously high \(\nu\) on small samples:

model_heavy <- student_t_naive_bayes(
  x       = x_train,
  y       = y_train,
  nu_grid = c(3, 4, 5, 6, 7, 8, 10)
)
model_heavy$params$nu
#>        log_return rolling_volatility drawdown transition_stress
#> Calm           10                 10       10                10
#> Steady         10                 10       10                10
#> Stress         10                 10       10                10
#>        residence_pressure ruin_proxy
#> Calm                   10          6
#> Steady                 10         10
#> Stress                 10         10

Extending to near-Gaussian — Conversely, if the data contain a benign asset class where Gaussian behaviour is plausible even in crisis, adding \(\nu = 200\) or \(\nu = 500\) gives the grid search a near-Gaussian option:

model_wide <- student_t_naive_bayes(
  x       = x_train,
  y       = y_train,
  nu_grid = c(3:30, 40, 60, 100, 200, 500)
)

6.3 Accessing Raw Parameters

All fitted parameters are stored in $params as named \(K \times p\) matrices. This makes it easy to derive secondary quantities programmatically.

# Regime-mean volatility (mu of rolling_volatility by class)
model$params$mu[, "rolling_volatility"]
#>        Calm      Steady      Stress 
#> 0.003954591 0.007661608 0.021741486

# Tail-thickness ranking across regimes for each feature
apply(model$params$nu, 2, which.min)  # index of class with lowest nu
#>         log_return rolling_volatility           drawdown  transition_stress 
#>                  3                  2                  2                  3 
#> residence_pressure         ruin_proxy 
#>                  1                  1

6.4 Predicting with a Subset of Features

predict() uses only the intersection of colnames(newdata) and the training features, so prediction on a reduced feature set works without modification:

# Predict using only the two most important risk features
x_reduced <- x_test[, c("log_return", "drawdown")]
pred_reduced <- predict(model, newdata = x_reduced, type = "class")

cat("Accuracy on full feature set:    ",
    round(mean(pred_class == y_test), 4), "\n")
#> Accuracy on full feature set:     1
cat("Accuracy on 2-feature subset:    ",
    round(mean(pred_reduced == y_test), 4), "\n")
#> Accuracy on 2-feature subset:     0.8833

7 The Clock of Regimes (COR) Framework

kronxNBC was designed as the classification component of the Clock of Regimes (COR) framework for intraday equity futures. This section describes the empirical pipeline so that users can adapt it to their own data.

7.1 Data Requirements

The COR pipeline requires two inputs:

  1. Price/return series: a data frame with columns timestamp, close, and ret (log-return) at hourly frequency.
  2. Decoded regime labels: a data frame with a state column produced by a Hidden Markov Model fitted to the same series (Hamilton 1989).

7.2 Feature Engineering

Six features are computed over a rolling 24-hour window. Each captures a distinct aspect of market microstructure.

library(zoo)

es_data <- read.csv("data.csv",           stringsAsFactors = FALSE)
decoded <- read.csv("decoded_states.csv", stringsAsFactors = FALSE)
es_data <- es_data[!is.na(es_data$ret), ]
n_roll  <- 24L

# 1. Rolling volatility (SD of log-returns)
rolling_vol <- rollapply(es_data$ret, n_roll, sd,   fill = NA, align = "right")
rolling_vol <- pmax(rolling_vol, 0.0001)

# 2. Drawdown from rolling peak
rolling_max <- rollapply(es_data$close, n_roll, max, fill = NA, align = "right")
drawdown    <- (es_data$close - rolling_max) / rolling_max

# 3. Downside semi-deviation (left-tail sensitivity)
downside_dev <- function(x) {
  neg_x <- x[x < 0]
  if (length(neg_x) == 0L) return(0)
  sqrt(mean(neg_x^2))
}
trans_stress <- rollapply(es_data$ret, n_roll, downside_dev, fill = NA, align = "right")

# 4. Residence pressure (consecutive hours in drawdown)
is_dd  <- as.integer(drawdown < -0.005)
is_dd[is.na(is_dd)] <- 0L
res_pr <- ave(is_dd, cumsum(is_dd == 0L), FUN = cumsum)

# 5. Ruin proxy: P(return <= -2%) under rolling distribution
rolling_mean <- rollapply(es_data$ret, n_roll, mean, fill = NA, align = "right")
ruin_proxy   <- pnorm(-0.02, mean = rolling_mean, sd = rolling_vol)

# Assemble and attach HMM labels
state_labels <- c("1" = "Stress", "2" = "Calm", "3" = "Steady")

cor_data <- data.frame(
  timestamp          = es_data$timestamp,
  log_return         = es_data$ret,
  rolling_volatility = rolling_vol,
  drawdown           = drawdown,
  transition_stress  = trans_stress,
  residence_pressure = pmax(res_pr, 0.0001),
  ruin_proxy         = pmax(ruin_proxy, 0.0001),
  regime             = factor(
    state_labels[as.character(decoded$state)],
    levels = c("Calm", "Steady", "Stress")
  )
)
cor_data <- cor_data[complete.cases(cor_data), ]

7.3 Train/Test Split Strategy

Financial regime data exhibit strong temporal clustering: a Stress episode may span 48–200 consecutive hours. A chronological 80/20 split therefore risks placing an entire Stress cluster exclusively in the test set, leaving the training corpus with zero Stress observations and guaranteeing classification collapse.

A random 80/20 split breaks the temporal adjacency of observations, ensuring every regime class is represented in both partitions regardless of when in calendar time each episode occurred.

Caveat: random sampling allows observations from the same cluster to appear in both train and test sets, introducing distributional leakage across the split boundary. For a production backtesting framework a purged, embargo-based cross-validation scheme (e.g., mlr3 + PurgedCV) is preferred. For the diagnostic regime classifier, the random split is the correct choice because the primary concern is class-balance coverage, not look-ahead bias.

features <- c("log_return", "rolling_volatility", "drawdown",
              "transition_stress", "residence_pressure", "ruin_proxy")

set.seed(123L)
tr_idx  <- sample(nrow(cor_data), floor(0.80 * nrow(cor_data)))
x_train <- as.matrix(cor_data[ tr_idx, features])
y_train <- cor_data$regime[ tr_idx]
x_test  <- as.matrix(cor_data[-tr_idx, features])
y_test  <- cor_data$regime[-tr_idx]

model <- student_t_naive_bayes(x_train, y_train)

7.4 Interpreting Empirical \(\nu\) Estimates

On real COR data fitted to hourly E-mini S&P 500 returns (2015–2023), the profile grid search consistently finds:

Feature Calm \(\hat\nu\) Steady \(\hat\nu\) Stress \(\hat\nu\)
log_return 20–30 15–25 3–6
drawdown 15–25 12–20 3–8
rolling_volatility 20–30 15–25 10–20
ruin_proxy 25–30 20–28 4–10

The consistently low \(\hat\nu\) for Stress on log_return, drawdown, and ruin_proxy is an empirical validation of the fat-tail hypothesis: crisis returns are not merely high-variance Gaussian draws but draws from a genuinely different, heavy-tailed distribution.


8 Summary

kronxNBC provides a robust alternative to Gaussian Naive Bayes for classification problems where some class-conditional distributions exhibit heavy tails. The key contributions are:

  1. Student-\(t\) class-conditional likelihood: naturally accommodates fat-tailed returns without the tail truncation implicit in Gaussian models.

  2. Profile grid search for \(\nu\): replaces numerically fragile continuous optimisation with a fast, stable discrete search. The IRLS update step at each grid point provides a good initialisation at minimal computational cost.

  3. Per-(class, feature) \(\nu\) selection: allows the model to be simultaneously fat-tailed for Stress features and near-Gaussian for Calm features—a flexibility that a single global \(\nu\) cannot provide.

  4. Log-density floor: prevents classification collapse when a single feature produces an extreme log-density, complementing the heavy-tail robustness of the Student-\(t\) kernel.

  5. naivebayes compatibility: the tables() generic extension means student_t_naive_bayes objects integrate seamlessly with existing naivebayes workflows.

The package is available on CRAN as kronxNBC and requires R >= 4.0.0.


Appendix: Session Information

sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: aarch64-apple-darwin23
#> Running under: macOS Sequoia 15.7.7
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
#> 
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Europe/Riga
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] naivebayes_1.0.0 kronxNBC_0.1.1  
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.39   R6_2.6.1        fastmap_1.2.0   xfun_0.57      
#>  [5] cachem_1.1.0    knitr_1.51      htmltools_0.5.9 rmarkdown_2.31 
#>  [9] lifecycle_1.0.5 cli_3.6.6       sass_0.4.10     jquerylib_0.1.4
#> [13] compiler_4.6.0  tools_4.6.0     evaluate_1.0.5  bslib_0.11.0   
#> [17] yaml_2.3.12     otel_0.2.0      rlang_1.2.0     jsonlite_2.0.0

References

Blattberg, Robert C., and Nicholas J. Gonedes. 1974. “A Comparison of the Stable and Student Distributions as Statistical Models for Stock Prices.” The Journal of Business 47 (2): 244–80. https://doi.org/10.1086/295634.
Hamilton, James D. 1989. “A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle.” Econometrica 57 (2): 357–84. https://doi.org/10.2307/1912559.
Majka, Michal. 2023. “Introduction to : A High Performance Implementation of the Naive Bayes Algorithm in .” Journal of Open Source Software. https://CRAN.R-project.org/package=naivebayes.