Comparison with the GAS Package

Vladimír Holý

2024-01-30

Introduction

We compare the gasmodel package with the GAS package developed by Ardia et al. (2019). The GAS package provides functionality for both univariate and multivariate GAS models. The current version, 0.3.4, supports 16 distributions. However, the model specification in the GAS package is somewhat limited, only allowing for basic dynamics without the inclusion of exogenous variables. Additionally, this package lacks distributions for certain more specialized data types, such as circular, compositional, and ranking data. The package thus supports only a limited selection of GAS models found in the literature.

The Bookshop Orders Case Study

First, let us attempt to replicate the results from the Bookshop Orders case study in the case_durations vignette based on Tomanová and Holý (2021) using the GAS package. This case study focuses on modeling the durations between orders from a Czech antiquarian bookshop. Specifically, We analyze durations adjusted for the diurnal pattern.

library("dplyr")
library("gasmodel")
library("GAS")

data("bookshop_sales")

y <- bookshop_orders$duration_adj[-1]

The case study employs the general gamma distribution and its specific instances to model durations. The GAS package offers the exponential and gamma distributions but does not support the Weibull and generalized gamma distributions. The exponential distribution is parametrized in terms of the rate parameter with the logistic link function in the GAS package. The gasmodel package allows for both the scale and rate parametrizations as well as the identical and logarithmic link functions. When the logarithmic link function is used, however, the only difference between the scale and rate parametrizations is in the sign of the intercept We can therefore compare the exponential model using the scale parametrization estimated by the gasmodel package with the exponential model using the rate parametrization estimated by the GAS package.

est_exp <- gas(y = y, distr = "exp")
est_exp
#> GAS Model: Exponential Distribution / Scale Parametrization / Unit Scaling 
#> 
#> Coefficients: 
#>                      Estimate  Std. Error   Z-Test  Pr(>|Z|)    
#> log(scale)_omega  -0.00089754  0.00117598  -0.7632    0.4453    
#> log(scale)_alpha1  0.04992815  0.00657547   7.5931 3.123e-14 ***
#> log(scale)_phi1    0.96278385  0.00918996 104.7647 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Log-Likelihood: -5571.078, AIC: 11148.16, BIC: 11168.11

spec_exp <- UniGASSpec(Dist = "exp", GASPar = list(location = TRUE))
fit_exp <- UniGASFit(spec_exp, data = y)
fit_exp
#> 
#> ------------------------------------------
#> -          Univariate GAS Fit            -
#> ------------------------------------------
#> 
#> Model Specification: 
#> T =  5718
#> Conditional distribution:  exp
#> Score scaling type:  Identity
#> Time varying parameters:  location
#> ------------------------------------------
#> Estimates:
#>            Estimate  Std. Error     t value     Pr(>|t|)
#> kappa1 0.0008988645 0.001176163   0.7642349 2.223636e-01
#> a1     0.0499298880 0.006583222   7.5844146 1.665335e-14
#> b1     0.9627795337 0.009205272 104.5900093 0.000000e+00
#> 
#> ------------------------------------------
#> Unconditional Parameters:
#> location 
#> 1.024444 
#> 
#> ------------------------------------------
#> Information Criteria:
#>       AIC       BIC        np       llk 
#> 11148.156 11168.110     3.000 -5571.078 
#> 
#> ------------------------------------------
#> Convergence: 0
#> ------------------------------------------
#> 
#> Elapsed time: 0.01 mins

The results are nearly identical, within a reasonable level of precision. Other than the inverted sign of the intercept, the only difference lies in the reported p-values: the GAS package employs one-tailed hypotheses, whereas the gasmodel package uses two-tailed hypotheses.

Next, we estimate the model with the gamma distribution and the rate parametrization.

est_gamma <- gas(y = y, distr = "gamma")
est_gamma
#> GAS Model: Gamma Distribution / Scale Parametrization / Unit Scaling 
#> 
#> Coefficients: 
#>                    Estimate Std. Error   Z-Test  Pr(>|Z|)    
#> log(scale)_omega  0.0010440  0.0013489   0.7740    0.4389    
#> log(scale)_alpha1 0.0526020  0.0071647   7.3418 2.107e-13 ***
#> log(scale)_phi1   0.9627838  0.0094368 102.0247 < 2.2e-16 ***
#> shape             0.9491683  0.0155575  61.0102 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Log-Likelihood: -5565.939, AIC: 11139.88, BIC: 11166.48

spec_gamma <- UniGASSpec(Dist = "gamma", GASPar = list(scale = TRUE, shape = FALSE))
fit_gamma <- UniGASFit(spec_gamma, data = y)
fit_gamma
#> 
#> ------------------------------------------
#> -          Univariate GAS Fit            -
#> ------------------------------------------
#> 
#> Model Specification: 
#> T =  5718
#> Conditional distribution:  gamma
#> Score scaling type:  Identity
#> Time varying parameters:  scale
#> ------------------------------------------
#> Estimates:
#>           Estimate  Std. Error   t value     Pr(>|t|)
#> kappa1 -0.01013261 0.004387366 -2.309496 1.045803e-02
#> kappa2 -0.05727855 0.021342698 -2.683754 3.640035e-03
#> a1      0.05273107 0.008356407  6.310256 1.392869e-10
#> b1      0.84825707 0.045825664 18.510524 0.000000e+00
#> 
#> ------------------------------------------
#> Unconditional Parameters:
#>     scale     shape 
#> 0.9354058 0.9443310 
#> 
#> ------------------------------------------
#> Information Criteria:
#>       AIC       BIC        np       llk 
#> 11257.770 11284.375     4.000 -5624.885 
#> 
#> ------------------------------------------
#> Convergence: 0
#> ------------------------------------------
#> 
#> Elapsed time: 0.03 mins

The result of the GAS package significantly contrasts with the outcome of the gasmodel package. The default optimizer within the GAS package identifies a suboptimal solution, yielding a significantly lower log-likelihood compared to the exponential model. Note that the gamma distribution is a generalization of the exponential distribution and should therefore result in the same or better fit.

The default optimizer within the gasmodel package finds a considerably superior solution, likely the optimal one, albeit demanding more computational resources. In both packages, it is possible to alter the optimizers. However, in the GAS package, the optimizer’s parameters cannot be directly provided through the UniGASFit() function. Instead, a complete replacement of the optimizer is necessary, rendering it a more intricate process to manage.

Let us compare the computational speed of both packages. The gasmodel package is entirely written in R, whereas the GAS package also utilizes C++ to expedite certain functions. As the gasmodel package incorporates some more exotic distributions, it relies on various statistical packages not implemented in C++. By default, the GAS package employs the BFGS algorithm for optimization, while the gasmodel package uses the Nelder–Mead algorithm with a much lower termination tolerance for the variables. In the table below, we compare the running speed of the GAS package with its default optimizer, the gasmodel package with the default GAS optimizer, and the gasmodel package with its default optimizer. We can see that when the optimizer is the same, the gasmodel package is approximately two times slower than the GAS package. The default optimizer of the gasmodel package is even slower; however, it is more reliable, as discussed above. The computations were performed on a 5.2 GHz processor.

Running times of estimation in seconds.
Model Package Optimizer Running Time
Exponential GAS BFGS 0.83151
Exponential gasmodel BFGS 1.70054
Exponential gasmodel Nelder-Mead 3.86932
Gamma GAS BFGS 1.95717
Gamma gasmodel BFGS 3.41864
Gamma gasmodel Nelder-Mead 11.69872

After performing parameter estimation for the Weibull and generalized gamma distributions, the case study proceeds by introducing a trend into the model. Regrettably, the GAS package lacks the capacity for accommodating exogenous variables, thus preventing this extension. This shortcoming stands as a substantial limitation that considerably restricts the package’s potential applications. The case study also utilizes the bootstrapping function provided by the gasmodel package. Such a feature is absent in the GAS package. This primarily affects convenience, as bootstrapping can still be executed using custom code from the user along with specialized packages. The functionality for simulation of GAS processes is very similar in both packages.

The Ice Hockey Rankings Case Study

The Ice Hockey Rankings case study in the case_rankings vignette based on Holý and Zouhar (2022) is not replicable at all using the GAS package due to its lack of support for the Plackett–Luce distribution or any distribution based on rankings. Furthermore, the GAS package does not facilitate the imposition of constraints on coefficients, which is useful, for instance, in creating random walk models or multivariate models with a panel structure. In the same case study, the process of forecasting and deriving confidence bands on time-varying parameters is illustrated. Similar functionality is also offered by the GAS package.

Comparison of Supported Distributions and Functionalities

The following tables compare the supported distributions and the available functionalities in both packages. In general, the gasmodel package offers much broader range of GAS models, encompassing various probability distributions and model specifications. The gasmodel package (version 0.6.0) supports 35 distributions, whereas the GAS package (version 0.3.4) includes only 16 distributions. The GAS package features asymmetric and skewed versions of the normal and Student’s t distributions, which are currently absent in the gasmodel package. Conversely, the gasmodel package incorporates 23 distributions tailored for count, duration, categorical, circular, compositional, and ranking data, which are not present in the GAS package. While the GAS package caters primarily to standard GAS models without the ability to handle missing values, the gasmodel package offers enhanced flexibility, allowing for various model structures, incorporation of exogenous variables, and the handling of missing values in time series. Apart from differences in probability distributions and model specification, both packages provide analogous functionalities for inference, forecasting, and simulation. The GAS package also computes the probability integral transform and offers certain capabilities for backtesting one-step ahead density and Value-at-Risk. However, these functionalities are limited to continuous distributions, which constitute only a subset of GAS models. Furthermore, such functionalities can be derived from the output generated by the gasmodel package. For these reasons, we have decided not to implement them in gasmodel.

List of the supported distributions.
Distribution gasmodel GAS
Asymmetric Laplace
Asymmetric Student’s t with One Tail Decay
Asymmetric Student’s t with Two Tail Decay
Bernoulli
Beta
Birnbaum–Saunders
Burr
Categorical
Dirichlet
Double Poisson
Exponential
Exponential-Logarithmic
Fisk
Gamma
Generalized Gamma
Geometric
Kumaraswamy
Laplace
Logistic
Log-Normal
Logit-Normal
Lomax
Multivariate Normal
Multivariate Student’s t
Negative Binomial
Normal
Plackett–Luce
Poisson
Rayleigh
Skellam
Skewed Normal
Skewed Student’s t
Student’s t
vonMises
Weibull
Zero-Inflated Geometric
Zero-Inflated Negative Binomial
Zero-Inflated Poisson
Zero-Inflated Skellam
List of the available functionality.
Functionality gasmodel GAS
Various parametrizations and link functions
Exogenous variables
Higher score and autoregressive orders
Custom initial values of time-varying parameters
Fixed and bounded values of coefficients
Missing values
Custom optimization function
Hessian-based inference
Probability integral transform
Confidence bands
Forecasting
Backtesting and rolling re-estimation
Basic simulation
Bootstrapping
Easy visualization

References

Ardia, D., Boudt, K, and Catania, L. (2019). Generalized Autoregressive Score Models in R: The GAS Package. Journal of Statistical Software, 88(6), 1–28. doi: 10.18637/jss.v088.i06.

Holý, V. and Zouhar, J. (2022). Modelling Time-Varying Rankings with Autoregressive and Score-Driven Dynamics. Journal of the Royal Statistical Society: Series C (Applied Statistics), 71(5), 1427–1450. doi: 10.1111/rssc.12584.

Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.