We compare the gasmodel package with the GAS package developed by Ardia et al. (2019). The GAS package provides functionality for both univariate and multivariate GAS models. The current version, 0.3.4, supports 16 distributions. However, the model specification in the GAS package is somewhat limited, only allowing for basic dynamics without the inclusion of exogenous variables. Additionally, this package lacks distributions for certain more specialized data types, such as circular, compositional, and ranking data. The package thus supports only a limited selection of GAS models found in the literature.
First, let us attempt to replicate the results from the Bookshop
Orders case study in the case_durations
vignette based on
Tomanová and Holý (2021) using the GAS package. This
case study focuses on modeling the durations between orders from a Czech
antiquarian bookshop. Specifically, We analyze durations adjusted for
the diurnal pattern.
library("dplyr")
library("gasmodel")
library("GAS")
data("bookshop_sales")
y <- bookshop_orders$duration_adj[-1]
The case study employs the general gamma distribution and its specific instances to model durations. The GAS package offers the exponential and gamma distributions but does not support the Weibull and generalized gamma distributions. The exponential distribution is parametrized in terms of the rate parameter with the logistic link function in the GAS package. The gasmodel package allows for both the scale and rate parametrizations as well as the identical and logarithmic link functions. When the logarithmic link function is used, however, the only difference between the scale and rate parametrizations is in the sign of the intercept We can therefore compare the exponential model using the scale parametrization estimated by the gasmodel package with the exponential model using the rate parametrization estimated by the GAS package.
est_exp <- gas(y = y, distr = "exp")
est_exp
#> GAS Model: Exponential Distribution / Scale Parametrization / Unit Scaling
#>
#> Coefficients:
#> Estimate Std. Error Z-Test Pr(>|Z|)
#> log(scale)_omega -0.00089754 0.00117598 -0.7632 0.4453
#> log(scale)_alpha1 0.04992815 0.00657547 7.5931 3.123e-14 ***
#> log(scale)_phi1 0.96278385 0.00918996 104.7647 < 2.2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Log-Likelihood: -5571.078, AIC: 11148.16, BIC: 11168.11
spec_exp <- UniGASSpec(Dist = "exp", GASPar = list(location = TRUE))
fit_exp <- UniGASFit(spec_exp, data = y)
fit_exp
#>
#> ------------------------------------------
#> - Univariate GAS Fit -
#> ------------------------------------------
#>
#> Model Specification:
#> T = 5718
#> Conditional distribution: exp
#> Score scaling type: Identity
#> Time varying parameters: location
#> ------------------------------------------
#> Estimates:
#> Estimate Std. Error t value Pr(>|t|)
#> kappa1 0.0008988645 0.001176163 0.7642349 2.223636e-01
#> a1 0.0499298880 0.006583222 7.5844146 1.665335e-14
#> b1 0.9627795337 0.009205272 104.5900093 0.000000e+00
#>
#> ------------------------------------------
#> Unconditional Parameters:
#> location
#> 1.024444
#>
#> ------------------------------------------
#> Information Criteria:
#> AIC BIC np llk
#> 11148.156 11168.110 3.000 -5571.078
#>
#> ------------------------------------------
#> Convergence: 0
#> ------------------------------------------
#>
#> Elapsed time: 0.01 mins
The results are nearly identical, within a reasonable level of precision. Other than the inverted sign of the intercept, the only difference lies in the reported p-values: the GAS package employs one-tailed hypotheses, whereas the gasmodel package uses two-tailed hypotheses.
Next, we estimate the model with the gamma distribution and the rate parametrization.
est_gamma <- gas(y = y, distr = "gamma")
est_gamma
#> GAS Model: Gamma Distribution / Scale Parametrization / Unit Scaling
#>
#> Coefficients:
#> Estimate Std. Error Z-Test Pr(>|Z|)
#> log(scale)_omega 0.0010440 0.0013489 0.7740 0.4389
#> log(scale)_alpha1 0.0526020 0.0071647 7.3418 2.107e-13 ***
#> log(scale)_phi1 0.9627838 0.0094368 102.0247 < 2.2e-16 ***
#> shape 0.9491683 0.0155575 61.0102 < 2.2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Log-Likelihood: -5565.939, AIC: 11139.88, BIC: 11166.48
spec_gamma <- UniGASSpec(Dist = "gamma", GASPar = list(scale = TRUE, shape = FALSE))
fit_gamma <- UniGASFit(spec_gamma, data = y)
fit_gamma
#>
#> ------------------------------------------
#> - Univariate GAS Fit -
#> ------------------------------------------
#>
#> Model Specification:
#> T = 5718
#> Conditional distribution: gamma
#> Score scaling type: Identity
#> Time varying parameters: scale
#> ------------------------------------------
#> Estimates:
#> Estimate Std. Error t value Pr(>|t|)
#> kappa1 -0.01013261 0.004387366 -2.309496 1.045803e-02
#> kappa2 -0.05727855 0.021342698 -2.683754 3.640035e-03
#> a1 0.05273107 0.008356407 6.310256 1.392869e-10
#> b1 0.84825707 0.045825664 18.510524 0.000000e+00
#>
#> ------------------------------------------
#> Unconditional Parameters:
#> scale shape
#> 0.9354058 0.9443310
#>
#> ------------------------------------------
#> Information Criteria:
#> AIC BIC np llk
#> 11257.770 11284.375 4.000 -5624.885
#>
#> ------------------------------------------
#> Convergence: 0
#> ------------------------------------------
#>
#> Elapsed time: 0.03 mins
The result of the GAS package significantly contrasts with the outcome of the gasmodel package. The default optimizer within the GAS package identifies a suboptimal solution, yielding a significantly lower log-likelihood compared to the exponential model. Note that the gamma distribution is a generalization of the exponential distribution and should therefore result in the same or better fit.
The default optimizer within the gasmodel package
finds a considerably superior solution, likely the optimal one, albeit
demanding more computational resources. In both packages, it is possible
to alter the optimizers. However, in the GAS package,
the optimizer’s parameters cannot be directly provided through the
UniGASFit()
function. Instead, a complete replacement of
the optimizer is necessary, rendering it a more intricate process to
manage.
Let us compare the computational speed of both packages. The gasmodel package is entirely written in R, whereas the GAS package also utilizes C++ to expedite certain functions. As the gasmodel package incorporates some more exotic distributions, it relies on various statistical packages not implemented in C++. By default, the GAS package employs the BFGS algorithm for optimization, while the gasmodel package uses the Nelder–Mead algorithm with a much lower termination tolerance for the variables. In the table below, we compare the running speed of the GAS package with its default optimizer, the gasmodel package with the default GAS optimizer, and the gasmodel package with its default optimizer. We can see that when the optimizer is the same, the gasmodel package is approximately two times slower than the GAS package. The default optimizer of the gasmodel package is even slower; however, it is more reliable, as discussed above. The computations were performed on a 5.2 GHz processor.
Model | Package | Optimizer | Running Time |
---|---|---|---|
Exponential | GAS | BFGS | 0.83151 |
Exponential | gasmodel | BFGS | 1.70054 |
Exponential | gasmodel | Nelder-Mead | 3.86932 |
Gamma | GAS | BFGS | 1.95717 |
Gamma | gasmodel | BFGS | 3.41864 |
Gamma | gasmodel | Nelder-Mead | 11.69872 |
After performing parameter estimation for the Weibull and generalized gamma distributions, the case study proceeds by introducing a trend into the model. Regrettably, the GAS package lacks the capacity for accommodating exogenous variables, thus preventing this extension. This shortcoming stands as a substantial limitation that considerably restricts the package’s potential applications. The case study also utilizes the bootstrapping function provided by the gasmodel package. Such a feature is absent in the GAS package. This primarily affects convenience, as bootstrapping can still be executed using custom code from the user along with specialized packages. The functionality for simulation of GAS processes is very similar in both packages.
The Ice Hockey Rankings case study in the case_rankings
vignette based on Holý and Zouhar (2022) is not replicable at all using
the GAS package due to its lack of support for the
Plackett–Luce distribution or any distribution based on rankings.
Furthermore, the GAS package does not facilitate the
imposition of constraints on coefficients, which is useful, for
instance, in creating random walk models or multivariate models with a
panel structure. In the same case study, the process of forecasting and
deriving confidence bands on time-varying parameters is illustrated.
Similar functionality is also offered by the GAS
package.
The following tables compare the supported distributions and the available functionalities in both packages. In general, the gasmodel package offers much broader range of GAS models, encompassing various probability distributions and model specifications. The gasmodel package (version 0.6.0) supports 35 distributions, whereas the GAS package (version 0.3.4) includes only 16 distributions. The GAS package features asymmetric and skewed versions of the normal and Student’s t distributions, which are currently absent in the gasmodel package. Conversely, the gasmodel package incorporates 23 distributions tailored for count, duration, categorical, circular, compositional, and ranking data, which are not present in the GAS package. While the GAS package caters primarily to standard GAS models without the ability to handle missing values, the gasmodel package offers enhanced flexibility, allowing for various model structures, incorporation of exogenous variables, and the handling of missing values in time series. Apart from differences in probability distributions and model specification, both packages provide analogous functionalities for inference, forecasting, and simulation. The GAS package also computes the probability integral transform and offers certain capabilities for backtesting one-step ahead density and Value-at-Risk. However, these functionalities are limited to continuous distributions, which constitute only a subset of GAS models. Furthermore, such functionalities can be derived from the output generated by the gasmodel package. For these reasons, we have decided not to implement them in gasmodel.
Distribution | gasmodel | GAS |
---|---|---|
Asymmetric Laplace | ✓ | ✓ |
Asymmetric Student’s t with One Tail Decay | ✕ | ✓ |
Asymmetric Student’s t with Two Tail Decay | ✕ | ✓ |
Bernoulli | ✓ | ✓ |
Beta | ✓ | ✓ |
Birnbaum–Saunders | ✓ | ✕ |
Burr | ✓ | ✕ |
Categorical | ✓ | ✕ |
Dirichlet | ✓ | ✕ |
Double Poisson | ✓ | ✕ |
Exponential | ✓ | ✓ |
Exponential-Logarithmic | ✓ | ✕ |
Fisk | ✓ | ✕ |
Gamma | ✓ | ✓ |
Generalized Gamma | ✓ | ✕ |
Geometric | ✓ | ✕ |
Kumaraswamy | ✓ | ✕ |
Laplace | ✓ | ✕ |
Logistic | ✓ | ✕ |
Log-Normal | ✓ | ✕ |
Logit-Normal | ✓ | ✕ |
Lomax | ✓ | ✕ |
Multivariate Normal | ✓ | ✓ |
Multivariate Student’s t | ✓ | ✓ |
Negative Binomial | ✓ | ✓ |
Normal | ✓ | ✓ |
Plackett–Luce | ✓ | ✕ |
Poisson | ✓ | ✓ |
Rayleigh | ✓ | ✕ |
Skellam | ✓ | ✓ |
Skewed Normal | ✕ | ✓ |
Skewed Student’s t | ✕ | ✓ |
Student’s t | ✓ | ✓ |
vonMises | ✓ | ✕ |
Weibull | ✓ | ✕ |
Zero-Inflated Geometric | ✓ | ✕ |
Zero-Inflated Negative Binomial | ✓ | ✕ |
Zero-Inflated Poisson | ✓ | ✕ |
Zero-Inflated Skellam | ✓ | ✕ |
Functionality | gasmodel | GAS |
---|---|---|
Various parametrizations and link functions | ✓ | ✕ |
Exogenous variables | ✓ | ✕ |
Higher score and autoregressive orders | ✓ | ✕ |
Custom initial values of time-varying parameters | ✓ | ✕ |
Fixed and bounded values of coefficients | ✓ | ✕ |
Missing values | ✓ | ✕ |
Custom optimization function | ✓ | ✓ |
Hessian-based inference | ✓ | ✓ |
Probability integral transform | ✕ | ✓ |
Confidence bands | ✓ | ✓ |
Forecasting | ✓ | ✓ |
Backtesting and rolling re-estimation | ✕ | ✓ |
Basic simulation | ✓ | ✓ |
Bootstrapping | ✓ | ✕ |
Easy visualization | ✓ | ✓ |
Ardia, D., Boudt, K, and Catania, L. (2019). Generalized Autoregressive Score Models in R: The GAS Package. Journal of Statistical Software, 88(6), 1–28. doi: 10.18637/jss.v088.i06.
Holý, V. and Zouhar, J. (2022). Modelling Time-Varying Rankings with Autoregressive and Score-Driven Dynamics. Journal of the Royal Statistical Society: Series C (Applied Statistics), 71(5), 1427–1450. doi: 10.1111/rssc.12584.
Tomanová, P. and Holý, V. (2021). Clustering of Arrivals in Queueing Systems: Autoregressive Conditional Duration Approach. Central European Journal of Operations Research, 29(3), 859–874. doi: 10.1007/s10100-021-00744-7.