A discharge rating curve is a model that describes the relationship between water elevation and discharge in a river. The rating curve is estimated from paired observations of water elevation and discharge and it is used to predict discharge for a given water elevation. This is the main practical usage of rating curves, as water elevation is substantially easier to directly observe than discharge. Four different discharge rating curve models are implemented in this R package using a Bayesian hierarchical model:

`plm0()`

- Power-law model with a constant error variance
(hence the 0). This is a Bayesian hierarchical implementation of the
most commonly used discharge rating curve model in hydrological
practice.

`plm()`

- Power-law model with error variance that varies
with water elevation.

`gplm0()`

- Generalized power-law model with a constant
error variance (hence the 0). The generalized power law is introduced in
Hrafnkelsson et al. (2022).

`gplm()`

- Generalized power-law model with error variance
that varies with water elevation. The generalized power law is
introduced in Hrafnkelsson et al. (2022).

For further details about the different models, see Hrafnkelsson et
al. (2022). The models differ in their complexity, `gplm` being
the most flexible and complex model. We will focus on the use of
`gplm` throughout this introduction vignette and explore the
different ways to fit the `gplm` and visualize its output.
However, the API of the functions for the other three models are
completely identical so this vignette also helps users to run those
models.

```
> library(bdrc)
> set.seed(1) #set seed for reproducibility
```

We will use a dataset from a stream gauging station in Sweden, called Krokfors, that comes with the package:

```
> data(krokfors)
> krokfors
#> W Q
#> 1 9.478000 10.8211700
#> 2 8.698000 1.5010000
#> 3 9.009000 3.3190000
#> 4 8.097000 0.1595700
#> 5 9.104000 4.5462500
#> 6 8.133774 0.2121178
#> 7 8.569583 1.1580000
#> 8 9.139151 4.8110000
#> 9 9.464250 10.9960000
#> 10 8.009214 0.0984130
#> 11 8.961843 2.7847910
#> 12 8.316000 0.6631890
#> 13 8.828716 1.8911800
#> 14 9.897000 20.2600000
#> 15 7.896000 0.0190000
#> 16 9.534000 12.1000000
#> 17 9.114000 4.3560000
#> 18 8.389000 0.6200000
#> 19 8.999000 2.6800000
#> 20 9.099000 3.7310000
#> 21 8.502000 0.8930000
#> 22 8.873000 1.9000000
#> 23 8.240000 0.3200000
#> 24 9.219000 5.9000000
#> 25 9.271000 6.9000000
#> 26 8.370000 0.4420000
#> 27 9.431000 9.0000000
```

It is very simple to fit a discharge rating curve with the
*bdrc* package. All you need are two mandatory input arguments,
`formula` and `data`. The `formula` is of the form
`y`~`x`, where `y` is the discharge in square
meters per second (m\(^3/\)s), and
`x` is the water elevation in meters (m). It is very important
that the data is in the correct units! The `data` argument must
be a data.frame including `x` and `y` as column names. In
our case, the Krokfors data has the discharge and water elevation
measurements stored in columns named `Q` and `W`,
respectively. We are ready to fit a discharge rating curve using the
`gplm` function:

`> gplm.fit <- gplm(Q~W,data=krokfors,parallel=TRUE,num_cores=2) # parallel=TRUE by default and by default, the number of cores is detected on the machine`

The `gplm` function returns an object of class “gplm” which we
can summarize and visualize using familiar functions such as

```
> summary(gplm.fit)
#>
#> Formula:
#> Q ~ W
#> Latent parameters:
#> lower-2.5% median-50% upper-97.5%
#> a 1.50 1.94 2.23
#> b 1.83 1.84 1.84
#>
#> Hyperparameters:
#> lower-2.5% median-50% upper-97.5%
#> c 7.71211 7.8102 7.856
#> sigma_beta 0.42368 0.7066 1.288
#> phi_beta 0.57452 1.1910 2.874
#> sigma_eta 0.00329 0.0912 0.534
#> eta_1 -4.93445 -4.2566 -3.557
#> eta_2 -5.94891 -4.0659 -2.280
#> eta_3 -6.87756 -4.2049 -1.611
#> eta_4 -7.71587 -4.4949 -1.223
#> eta_5 -8.24321 -4.6293 -0.680
#> eta_6 -8.78471 -4.6121 -0.250
#>
#> WAIC: 1.112881
```

and

`> plot(gplm.fit)`

In the next section, we will dive deeper into visualizing the “gplm” object.

The *bdrc* package provides several tools to visualize the
results from model objects which can give insight into the physical
properties of the river at hand. For instance, the hyperparameter \(c\) corresponds to the water elevation of
zero discharge. To visualize the posterior of \(c\), we can write

`> plot(gplm.fit,type='histogram',param='c')`

Technically, instead of inferring \(c\) directly, \(h_{min}-c\) is inferred, where \(h_{min}\) is the lowest water elevation value in the data. Since the parameter \(h_{min}-c\) is strictly positive, a transformation \(\zeta=log(h_{min}-c)\) is used for the Bayesian inference so that it has support on the real line. To plot the transformed posterior we write

`> plot(gplm.fit,type='histogram',param='c',transformed=TRUE)`

The `param` argument can also be a vector containing multiple
parameter names. For example, to visualize the posterior distributions
of the parameters \(a\) and \(c\), we can write

`> plot(gplm.fit,type='histogram',param=c('a','c'))`

There is a shorthand to visualize the hyperparameters all at once

`> plot(gplm.fit,type='histogram',param='hyperparameters')`

Similarly, writing ‘latent_parameters’ plots the latent parameters in one plot. To plot the hyperparameters transformed on the same scale as in the Bayesian inference, we write

`> plot(gplm.fit,type='histogram',param='hyperparameters',transformed=TRUE)`

Finally, we can visualize the components of the model that are
allowed (depending on the model) to vary with water elevation, that is,
the power-law exponent, \(f(h)\), and
the standard deviation of the error terms at the response level, \(\sigma_{\varepsilon}(h)\). Both
`gplm0` and `gplm` generalize the power-law exponent by
modeling it as a sum of a constant term, \(b\), and Gaussian process, \(\beta(h)\), namely \(f(h)=b+\beta(h)\), where \(\beta(h)\) is assumed to be twice
differentiable with mean zero. On the other hand, `plm` and
`plm0` both model the power-law exponent as a constant by setting
\(\beta(h)=0\), which gives \(f(h)=b\). We can plot the inferred
power-law exponent with

`> plot(gplm.fit,type='f')`

Both `plm` and `gplm` model the standard deviation of
the error terms at the response level, \(\sigma_{\varepsilon}(h)\), as a function of
water elevation, using B-splines basis functions, while `plm0`
and `gplm0` model the standard deviation as a constant. We can
plot the inferred standard deviation by writing

`> plot(gplm.fit,type='sigma_eps')`

To get a visual summary of your model, the ‘panel’ option in the plot type is useful:

`> plot(gplm.fit,type='panel',transformed=TRUE)`

The package has several functions for convergence diagnostics of a bdrc model, most notably the residual plot, trace plots, autocorrelation plots, and Gelman-Rubin diagnostic plots. The log-residuals can be plotted with

`> plot(gplm.fit,type='residuals')`

The log-residuals are calculated by subtracting the posterior estimate (median) of the log-discharge, \(log(\hat{Q})\), from the observed log-discharge, \(log(Q)\). Additionally, the plot includes the 95% predictive intervals of log(Q) (- -) and 95% credible intervals for the expected value of log(Q) (—), the latter reflecting the rating curve uncertainty.

`> plot(gplm.fit,type='trace',param='c',transformed=TRUE)`

To plot a trace plot for all the transformed hyperparameters, we write

`> plot(gplm.fit,type='trace',param='hyperparameters',transformed=TRUE)`