The (European) Pareto distribution is probably the most popular
distribution for modeling large losses in reinsurance pricing. There are
good reasons for this popularity, which are discussed in detail in
Fackler (2013). We recommend Philbrick (1985) and Schmutz
*et.al.* (1998) for an impression of how the (European) Pareto
distribution is applied in practice.

In cases where the Pareto distribution is not flexible enough, pricing actuaries sometimes use piecewise Pareto distributions. For instance, a Pareto alpha of 1.5 is used to model claim sizes between USD 1M and USD 5M and an alpha of 2.5 is used above USD 5M. A particularly useful and non-trivial application of the piecewise Pareto distribution is that it can be used to match a tower of expected layer losses with a layer independent collective loss model. Details are described in Riegel (2018), who also provides a matching algorithm that works for an arbitrary number of reinsurance layers.

The package provides a tool kit for the Pareto, the piecewise Pareto and the generalized Pareto distribution, which is useful for pricing of reinsurance treaties. In particular, the package provides the matching algorithm for layer losses.

**Definition:** Let \(t>0\) and \(\alpha>0\). The *Pareto
distribution* \(\text{Pareto}(t,\alpha)\) is defined by the
distribution function \[
F_{t,\alpha}(x):=\begin{cases}
0 & \text{ for $x\le t$} \\
\displaystyle 1-\left(\frac{t}{x}\right)^{\alpha} & \text{ for
$x>t$.}
\end{cases}
\] This version of the Pareto distribution is also known as
*Pareto type I*, *European Pareto* or *single-parameter
Pareto*.

The functions `pPareto`

and `dPareto`

provide
the distribution function and the density function of the Pareto
distribution:

```
library(Pareto)
<- c(1:10) * 1000
x pPareto(x, 1000, 2)
```

```
## [1] 0.0000000 0.7500000 0.8888889 0.9375000 0.9600000 0.9722222 0.9795918
## [8] 0.9843750 0.9876543 0.9900000
```

`plot(pPareto(1:5000, 1000, 2), xlab = "x", ylab = "CDF(x)")`

`dPareto(x, 1000, 2)`

```
## [1] 2.000000e-03 2.500000e-04 7.407407e-05 3.125000e-05 1.600000e-05
## [6] 9.259259e-06 5.830904e-06 3.906250e-06 2.743484e-06 2.000000e-06
```

`plot(dPareto(1:5000, 1000, 2), xlab = "x", ylab = "PDF(x)")`

The package also provides the quantile function:

`qPareto(0:10 / 10, 1000, 2)`

```
## [1] 1000.000 1054.093 1118.034 1195.229 1290.994 1414.214 1581.139 1825.742
## [9] 2236.068 3162.278 Inf
```

`rPareto(20, 1000, 2)`

```
## [1] 1896.289 1180.420 2943.743 1323.985 1055.317 2929.332 1021.523
## [8] 3735.630 17514.558 2367.470 8347.912 1084.410 1675.636 1002.364
## [15] 1254.839 1096.003 2317.094 3702.355 1181.540 1066.404
```

Let \(X\sim \text{Pareto}(t,\alpha)\) and \(a, c\ge 0\). Then \[ E(\min[c,\max(X-a,0)]) = \int_a^{c+a}(1-F_{t,\alpha}(x))\, dx =: I_{t,\alpha}^{\text{$c$ xs $a$}} \] is the layer mean of \(c\) xs \(a\), i.e. the expected loss to the layer given a single loss \(X\).

*Example:* \(t=500\), \(\alpha = 2\), Layer 4000 xs 1000

`Pareto_Layer_Mean(4000, 1000, 2, t = 500)`

`## [1] 200`

Let \(X\sim
\text{Pareto}(t,\alpha)\) and \(a, c\ge
0\). Then the variance of the layer loss \(\min[c,\max(X-a,0)]\) can be calculated
with the function `Pareto_Layer_Var`

.

*Example:* \(t=500\), \(\alpha = 2\), Layer 4000 xs 1000

`Pareto_Layer_Var(4000, 1000, 2, t = 500)`

`## [1] 364719`

**Lemma:**

- Let \(X \sim \text{Pareto}(t,\alpha )\). Then \(cX \sim \text{Pareto}(ct,\alpha )\) for all \(c>0\).
- Let \(X \sim \text{Pareto}(t_{1} ,\alpha )\). For \(t_2 > t_1\) we then have \(X|(X>t_2 ) \sim \text{Pareto}(t_2 ,\alpha )\)

**Consequences:**

- The
*Pareto alpha*is invariant wrt scaling (which implies that \(\alpha\) does not depend on currencies and inflation) - For Pareto distributed data the Pareto alpha does not depend on the reporting threshold
- For layers and thresholds above \(t\) the ratio between expected layer losses and/or excess frequencies depends only on \(\alpha\) (and not on \(t\))

Consider two layers \(c_i\) xs \(a_i\) and a \(\text{Pareto}(t,\alpha)\) distributed severity with sufficiently small \(t\). What is the expected loss of \(c_2\) xs \(a_2\) given the expected loss of \(c_1\) xs \(a_1\)?

*Example:* Assume \(\alpha =
2\) and the expected loss of 4000 xs 1000 is 500. Calculate the
expected loss of the layer 5000 xs 5000.

`Pareto_Extrapolation(4000, 1000, 5000, 5000, 2) * 500`

`## [1] 62.5`

`Pareto_Extrapolation(4000, 1000, 5000, 5000, 2, ExpLoss_1 = 500)`

`## [1] 62.5`

Given the expected losses of two layers, there is typically a unique Pareto alpha \(\alpha\) which is consistent with the ratio of the expected layer losses.

*Example:* Expected loss of 4000 xs 1000 is 500. Expected loss
of 5000 xs 5000 is 62.5. Alpha between the two layers:

`Pareto_Find_Alpha_btw_Layers(4000, 1000, 500, 5000, 5000, 62.5)`

`## [1] 2`

Check: see previous example

Given the expected excess frequency at a threshold and the expected loss of a layer, then there is typically a unique Pareto alpha \(\alpha\) which is consistent with this data.

*Example:* Expected frequency in excess of 500 is 2.5.
Expected loss of 4000 xs 1000 is 500. Alpha between the frequency and
the layer:

`Pareto_Find_Alpha_btw_FQ_Layer(500, 2.5, 4000, 1000, 500)`

`## [1] 2`

Check:

`Pareto_Layer_Mean(4000, 1000, 2, t = 500) * 2.5`

`## [1] 500`

Given the expected losses of two layers, we can use these techniques to obtain a Poisson-Pareto model which matches the expected loss of both layers.

*Example:* Expected loss of 30 xs 10 is 26.66 (Burning Cost).
Expected loss of 60 xs 40 is 15.95 (Exposure model).

`Pareto_Find_Alpha_btw_Layers(30, 10, 26.66, 60, 40, 15.95)`

`## [1] 1.086263`

Frequency @ 10:

`26.66 / Pareto_Layer_Mean(30, 10, 1.086263)`

`## [1] 2.040392`

A collective model \(\sum_{n=1}^NX_n\) with \(X_N\sim \text{Pareto}(10, 1.09)\) and \(N\sim \text{Poisson}(2.04)\) matches both expected layer losses.

Given the frequency \(f_1\) in excess of \(t_1\) the frequency \(f_2\) in excess of \(t_2\) can directly be calculated as follows: \[ f_2 = f_1 \cdot \left(\frac{t_1}{t_2}\right)^\alpha \] Vice versa, we can calculate the Pareto alpha, if the two excess frequencies \(f_1\) and \(f_2\) are given: \[ \alpha = \frac{\log(f_2/f_1)}{\log(t_1/t_2)}. \]

*Example:*

Expected frequency excess 1000 is 2. What is the expected frequency excess 4000 if we have a Pareto alpha of 2.5?

```
<- 1000
t_1 <- 2
f_1 <- 4000
t_2 <- f_1 * (t_1 / t_2)^2.5) (f_2
```

`## [1] 0.0625`

Vice versa:

`Pareto_Find_Alpha_btw_FQs(t_1, f_1, t_2, f_2)`

`## [1] 2.5`

For \(i=1,\dots,n\) let \(X_i\sim \text{Pareto}(t,\alpha)\) be Pareto
distributed observations. Then we have the ML estimator \[
\hat{\alpha}^{ML}=\frac{n}{\sum_{i=1}^n\log(X_i/t)}.
\] *Example:*

Pareto distributed losses with a reporting threshold of \(t=1000\) and \(\alpha = 2\):

```
<- rPareto(1000, t = 1000, alpha = 2)
losses Pareto_ML_Estimator_Alpha(losses, t = 1000)
```

`## [1] 2.02407`

In reinsurance, sometimes large loss data from different sources are used for severity fits. Then the losses are typically only available in excess of certain reporting thresholds which may vary by data source. Assume that two data sources each contain 5000 losses in excess of 1000, which are Pareto distributed with an alpha of 2 but from data source 2 we only know the losses exceeding a reporting threshold of 3000. If we apply the standard ML estimator with a threshold of 1000, then we obtain an alpha which is too low, since we ignore that the loss data is not complete in excess of 1000:

```
<- rPareto(5000, t = 1000, alpha = 2)
losses_1 <- rPareto(5000, t = 1000, alpha = 2)
losses_2 <- losses_2 > 3000
reported <- losses_2[reported]
losses_2 <- c(losses_1, losses_2)
losses Pareto_ML_Estimator_Alpha(losses, t = 1000)
```

`## [1] 1.592139`

In the function `Pareto_ML_Estimator_Alpha`

the user can
define reporting threshold for each loss in order to handle this
situation:

```
<- rep(1000, length(losses_1))
reporting_thresholds_1 <- rep(3000, length(losses_2))
reporting_thresholds_2 <- c(reporting_thresholds_1, reporting_thresholds_2)
reporting_thresholds Pareto_ML_Estimator_Alpha(losses, t = 1000, reporting_thresholds = reporting_thresholds)
```

`## [1] 1.950612`

Now, assume that the underlying policies have limits of 5000 or 10000 and that a loss is censored if it exceeds the respective limit. If the underlying losses are Pareto distributed before they are censored then ML estimation leads to a too large value for alpha:

```
<- sample(c(5000, 10000), length(losses), replace = T)
limits <- losses > limits
censored <- limits[censored]
losses[censored] <- losses > reporting_thresholds
reported <- losses[reported]
losses <- reporting_thresholds[reported]
reporting_thresholds Pareto_ML_Estimator_Alpha(losses, t = 1000, reporting_thresholds = reporting_thresholds)
```

`## [1] 2.058482`

In order to deal with this situation the function allows to specify for each loss if it is censored or not:

```
Pareto_ML_Estimator_Alpha(losses, t = 1000, reporting_thresholds = reporting_thresholds,
is.censored = censored)
```

`## [1] 1.953108`

Let \(X\sim
\text{Pareto}(t,\alpha)\) and \(T>t\). Then \(X|(X<T)\) has a *truncated Pareto
distribution*. The Pareto functions mentioned above are also
available for the truncated Pareto distribution.

**Definition:** Let \(\mathbf{t}:=(t_1,\dots,t_n)\) be a vector
of thresholds with \(0<t_1<\dots<t_n<t_{n+1}:=+\infty\)
and let \(\boldsymbol\alpha:=(\alpha_1,\dots,\alpha_n)\)
be a vector of Pareto alphas with \(\alpha_i\ge 0\) and \(\alpha_n>0\). The *piecewise
Pareto* distribution} \(\text{PPareto}(\mathbf{t},\boldsymbol\alpha)\)
is defined by the distribution function \[
F_{\mathbf{t},\boldsymbol\alpha}(x):=\begin{cases}
0 & \text{ for $x<t_1$} \\
\displaystyle
1-\left(\frac{t_{k}}{x}\right)^{\alpha_k}\prod_{i=1}^{k-1}\left(\frac{t_i}{t_{i+1}}\right)^{\alpha_i}
& \text{ for $x\in [t_k,t_{k+1}).$}
\end{cases}
\]

The family of piecewise Pareto distributions is very flexible:

**Proposition:** The set of Piecewise Pareto
distributions is dense in the space of all positive-valued distributions
(with respect to the Lévy metric).

This means that we can approximate any positive valued distribution as good as we want with piecewise Pareto. A very good approximation typically comes at the cost of many Pareto pieces. Piecewise Pareto is often a good alternative to a discrete distribution, since it is much better to handle!

The Pareto package also provides functions for the piecewise Pareto distribution. For instance:

```
<- c(1:10) * 1000
x <- c(1000, 2000, 3000, 4000)
t <- c(2, 1, 3, 20)
alpha pPiecewisePareto(x, t, alpha)
```

```
## [1] 0.0000000 0.7500000 0.8333333 0.9296875 0.9991894 0.9999789 0.9999990
## [8] 0.9999999 1.0000000 1.0000000
```

`plot(pPiecewisePareto(1:5000, t, alpha), xlab = "x", ylab = "CDF(x)")`

`dPiecewisePareto(x, t, alpha)`

```
## [1] 2.000000e-03 1.250000e-04 1.666667e-04 3.515625e-04 3.242592e-06
## [6] 7.048328e-08 2.768239e-09 1.676381e-10 1.413089e-11 1.546188e-12
```

`plot(dPiecewisePareto(1:5000, t, alpha), xlab = "x", ylab = "PDF(x)")`

`rPiecewisePareto(20, t, alpha)`

```
## [1] 1015.502 1499.866 4055.737 1847.381 2350.878 1944.604 1343.026 1320.471
## [9] 1289.198 1193.922 1376.018 1776.002 1541.265 1352.772 1107.721 1123.625
## [17] 1285.675 1937.020 3225.122 2316.607
```

`PiecewisePareto_Layer_Mean(4000, 1000, t, alpha)`

`## [1] 826.6969`

`PiecewisePareto_Layer_Var(4000, 1000, t, alpha)`

`## [1] 922221.2`

Let \(\mathbf{t}:=(t_1,\dots,t_n)\) be a vector of thresholds and let \(\boldsymbol\alpha:=(\alpha_1,\dots,\alpha_n)\) be a vector of Pareto alphas. For \(i=1,\dots,n\) let \(X_i\sim \text{PPareto}(\mathbf{t},\boldsymbol\alpha)\). If the vector \(\mathbf{t}\) is known, then the parameter vector \(\boldsymbol\alpha\) can be estimated with maximum likelihood.

*Example:*

Piecewise Pareto distributed losses with \(\mathbf{t}:=(1000,\,2000,\, 3000)\) and \(\boldsymbol\alpha:=(1,\, 2,\, 3)\):

```
<- rPiecewisePareto(10000, t = c(1000, 2000, 3000), alpha = c(1, 2, 3))
losses PiecewisePareto_ML_Estimator_Alpha(losses, c(1000, 2000, 3000))
```

`## [1] 1.007083 2.032210 3.066960`

Reporting thresholds and censoring of losses can be taken into
account as described for the function
`Pareto_ML_Estimator_Alpha`

.

```
<- rPiecewisePareto(5000, t = c(1000, 2000, 3000), alpha = c(1, 2, 3))
losses_1 <- rPiecewisePareto(5000, t = c(1000, 2000, 3000), alpha = c(1, 2, 3))
losses_2 <- losses_2 > 3000
reported <- losses_2[reported]
losses_2 <- c(losses_1, losses_2)
losses PiecewisePareto_ML_Estimator_Alpha(losses, c(1000, 2000, 3000))
```

`## [1] 0.8029695 1.2305628 3.1015749`

```
<- rep(1000, length(losses_1))
reporting_thresholds_1 <- rep(3000, length(losses_2))
reporting_thresholds_2 <- c(reporting_thresholds_1, reporting_thresholds_2)
reporting_thresholds PiecewisePareto_ML_Estimator_Alpha(losses, c(1000, 2000, 3000),
reporting_thresholds = reporting_thresholds)
```

`## [1] 1.049237 2.041128 3.101575`

```
<- sample(c(2500, 5000, 10000), length(losses), replace = T)
limits <- losses > limits
censored <- limits[censored]
losses[censored] <- losses > reporting_thresholds
reported <- losses[reported]
losses <- reporting_thresholds[reported]
reporting_thresholds <- censored[reported]
censored PiecewisePareto_ML_Estimator_Alpha(losses, c(1000, 2000, 3000),
reporting_thresholds = reporting_thresholds)
```

`## [1] 1.049237 2.972463 3.529461`

```
PiecewisePareto_ML_Estimator_Alpha(losses, c(1000, 2000, 3000),
reporting_thresholds = reporting_thresholds,
is.censored = censored)
```

`## [1] 1.049237 2.065581 3.151667`

The package also provides truncated versions of the piecewise Pareto distribution. There are two options available:

`truncation_type = 'lp'`

: Below the largest threshold \(t_n\), the distribution function equals the distribution of the piecewise Pareto distribution without truncation. The last Pareto piece, however, is truncated at`truncation`

`truncation_type = 'wd'`

: The whole piecewise Pareto distribution is truncated at `truncation’

The Pareto distribution can be used to build a collective model which matches the expected loss of two layers. We can use piecewise Pareto if we want to match the expected loss of more than two layers.

Consider a sequence of attachment points \(0 < a_1 <\dots < a_n<a_{n+1}:=+\infty\). Let \(c_i:=a_{i+1}-a_i\) and let \(e_i\) be the expected loss of the layer \(c_i\) xs \(a_i\). Moreover, let \(f_1\) be the expected frequency in excess of \(a_1\).

The following matching algorithm uses one Pareto piece per layer and is straight forward:

- Calculate the Pareto alpha \(\alpha_1\) between the excess frequency \(f_1\) and the layer \(c_1\) xs \(a_1\)
- Calculate the frequency \(f_2\) in excess of \(a_2\): \(f_2:=(a_1/a_2)^{\alpha_1}\cdot f_1\)
- Calculate the Pareto alpha \(\alpha_2\) between the excess frequency \(f_2\) and the layer \(c_2\) xs \(a_2\)
- Calculate the frequency \(f_3\) in excess of \(a_3\): \(f_3:=(a_2/a_3)^{\alpha_2}\cdot f_3\)
- \(\dots\)
- Use a collective model \(\sum_{n=1}^NX_n\) with \(E(N)=f_1\) and \(X_n\sim\text{PPareto}(\mathbf{t},\boldsymbol\alpha)\).

This approach always works for three layers, but it often does not work if we have three or more layers. For instance, Riegel (2018) shows that it does not work for the following example:

\(i\) | Cover \(c_i\) | Att. Pt. \(a_i\) | Exp. Loss \(e_i\) | Rate on Line \(e_i/c_i\) |
---|---|---|---|---|

1 | 500 | 1000 | 100 | 0.20 |

2 | 500 | 1500 | 90 | 0.18 |

3 | 500 | 2000 | 50 | 0.10 |

4 | 500 | 2500 | 40 | 0.08 |

The Pareto package provides a more complex matching approach that uses two Pareto pieces per layer. Riegel (2018) shows that this approach works for an arbitrary number of layers with consistent expected losses.

*Example:*

```
<- c(1000, 1500, 2000, 2500, 3000)
attachment_points <- c(100, 90, 50, 40, 100)
exp_losses <- PiecewisePareto_Match_Layer_Losses(attachment_points, exp_losses)
fit fit
```

```
##
## Panjer & Piecewise Pareto model
##
## Collective model with a Poisson distribution for the claim count and a Piecewise Pareto distributed severity.
##
## Poisson Distribution:
## Expected Frequency: 0.2136971
##
## Piecewise Pareto Distribution:
## Thresholds: 1000 1500 1932.059 2000 2147.531 2500 2847.756 3000
## Alphas: 0.3091209 0.1753613 9.685189 3.538534 0.817398 0.7663698 5.086828 2.845488
## The distribution is not truncated.
##
## Status: 0
## Comments: OK
```

The function `PiecewisePareto_Match_Layer_Losses`

returns
a `PPP_Model`

object (PPP stands for Panjer & Piecewise
Pareto) which contains the information required to specify a collective
model with a Panjer distributed claim count and a piecewise Pareto
distributed severity. The results can be checked using the attributes
`FQ`

, `t`

and `alpha`

of the
object:

```
c(PiecewisePareto_Layer_Mean(500, 1000, fit$t, fit$alpha) * fit$FQ,
PiecewisePareto_Layer_Mean(500, 1500, fit$t, fit$alpha) * fit$FQ,
PiecewisePareto_Layer_Mean(500, 2000, fit$t, fit$alpha) * fit$FQ,
PiecewisePareto_Layer_Mean(500, 2500, fit$t, fit$alpha) * fit$FQ,
PiecewisePareto_Layer_Mean(Inf, 3000, fit$t, fit$alpha) * fit$FQ)
```

`## [1] 100 90 50 40 100`

There are, however, functions which can directly use PPP_Models:

```
<- c(diff(attachment_points), Inf)
covers Layer_Mean(fit, covers, attachment_points)
```

`## [1] 100 90 50 40 100`

The function `PiecewisePareto_Match_Layer_Losses`

can be
used to match the expected losses of a complete tower of layers. If we
want to match the expected losses of some reference layers which do not
form a complete tower then it is more convenient to use the function
`Fit_References`

. Also excess frequencies can be provided as
reference information. The function can be seen as a user interface for
`PiecewisePareto_Match_Layer_Losses`

:

```
<- c(1000, 1000, 1000)
covers <- c(1000, 2000, 5000)
att_points <- c(100, 50, 10)
exp_losses <- c(4000, 10000)
thresholds <- c(0.04, 0.005)
fqs <- Fit_References(covers, att_points, exp_losses, thresholds, fqs)
fit Layer_Mean(fit, covers, att_points)
```

`## [1] 100 50 10`

`Excess_Frequency(fit, thresholds) `

`## [1] 0.040 0.005`

If the package `lpSolve`

is installed then the funcion
`Fit_References`

can handle ovelapping layers.

The function `Fit_PML_Curve`

can be used fit a
`PPP_Model`

that reproduces and interpolates the information
provided in the PML curve. A PML curve is a table containing return
periods and the corresponding loss amounts:

\(i\) | Return Period \(r_i\) | Amount \(x_i\) |
---|---|---|

1 | 1 | 1000 |

2 | 5 | 4000 |

3 | 10 | 7000 |

4 | 20 | 10000 |

5 | 50 | 13000 |

6 | 100 | 14000 |

The information contained in such a PML curve can be used to create a
`PPP_Model`

that has the expected excess frequency \(1/r_i\) at \(x_i\).

*Example:*

```
<- c(1, 5, 10, 20, 50, 100)
return_periods <- c(1000, 4000, 7000, 10000, 13000, 14000)
amounts <- Fit_PML_Curve(return_periods, amounts)
fit 1 / Excess_Frequency(fit, amounts)
```

`## [1] 1 5 10 20 50 100`

A `PPP_Model`

object contains the information required to
specify a collective model with a Panjer distributed claim count and a
piecewise Pareto distributed severity.

**Claim count distribution:** The Panjer class contains
the binomial distribution, the Poisson distribution and the negative
binomial distribution. The distribution of the claim count \(N\) is specified by the expected frequency
\(E(N)\) (attribute `FQ`

of
the object) and the dispersion \(D(N):=Var(N)/E(N)\) (attribute
`dispersion`

of the object). We have the following cases:

`dispersion < 1`

: binomial distribution`dispersion = 1`

: Poisson distribution`dispersion > 1`

: negative binomial distribution.

**Severity distribution:** The piecewise Pareto
distribution is specified by the vectors `t`

,
`alpha`

, `truncation`

and
`truncation_type`

.

The function `PiecewisePareto_Match_Layer_Losses`

returns
`PPP_Model`

object. Such an object can also be directly
created using the constructor function:

```
<- PPP_Model(FQ = 2, t = c(1000, 2000), alpha = c(1, 2),
PPPM truncation = 10000, truncation_type = "wd", dispersion = 1.5)
PPPM
```

```
##
## Panjer & Piecewise Pareto model
##
## Collective model with a Negative Binomial distribution for the claim count and a Piecewise Pareto distributed severity.
##
## Negative Binomial Distribution:
## Expected Frequency: 2
## Dispersion: 1.5 (i.e. contagion = 0.25)
##
## Piecewise Pareto Distribution:
## Thresholds: 1000 2000
## Alphas: 1 2
## Truncation: 10000
## Truncation Type: 'wd'
##
## Status: 0
## Comments: OK
```

A `PPP_Model`

can directly be used to calculate the
expected loss, the standard deviation or the variance of a reinsurance
layer: function:

```
<- PPP_Model(FQ = 2, t = c(1000, 2000), alpha = c(1, 2),
PPPM truncation = 10000, truncation_type = "wd", dispersion = 1.5)
Layer_Mean(PPPM, 4000, 1000)
```

`## [1] 2475.811`

`Layer_Sd(PPPM, 4000, 1000)`

`## [1] 2676.332`

`Layer_Var(PPPM, 4000, 1000)`

`## [1] 7162754`

A `PPP_Model`

can directly be used to calculate the
expected frequency in excess of a threshold:

```
<- PPP_Model(FQ = 2, t = c(1000, 2000), alpha = c(1, 2),
PPPM truncation = 10000, truncation_type = "wd", dispersion = 1.5)
<- c(0, 1000, 2000, 5000, 10000, Inf)
thresholds Excess_Frequency(PPPM, thresholds)
```

`## [1] 2.0000000 2.0000000 0.9795918 0.1224490 0.0000000 0.0000000`

A `PPP_Model`

can directly be used to simulate losses with
the corresponding collective model:

```
<- PPP_Model(FQ = 2, t = c(1000, 2000), alpha = c(1, 2),
PPPM truncation = 10000, truncation_type = "wd", dispersion = 1.5)
Simulate_Losses(PPPM, 10)
```

```
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1049.576 2277.154 2190.743 2082.267 2027.094 6256.14
## [2,] 1144.362 1012.695 NaN NaN NaN NaN
## [3,] 1029.560 5909.302 1485.112 NaN NaN NaN
## [4,] 3914.593 3806.222 NaN NaN NaN NaN
## [5,] 2438.013 1022.212 NaN NaN NaN NaN
## [6,] NaN NaN NaN NaN NaN NaN
## [7,] NaN NaN NaN NaN NaN NaN
## [8,] 9952.716 3202.051 NaN NaN NaN NaN
## [9,] 2098.255 NaN NaN NaN NaN NaN
## [10,] 1536.806 2204.064 3212.567 NaN NaN NaN
```

The function `Simulate_Losses`

returns a matrix where each
row contains the losses from one simulation.

Note that for a given expected frequency `FQ`

not every
dispersion `dispersion < 1`

is possible for the binomial
distribution. In this case a binomial distribution with the smallest
dispersion larger than or equal to `dispersion`

is used for
the simulation.

**Definition:** Let \(t>0\) and \(\alpha_\text{ini},
\alpha_\text{tail}>0\). The *generalized Pareto
distribution* \(\text{GenPareto}(t,\alpha_\text{ini},
\alpha_\text{tail})\) is defined by the distribution function
\[
F_{t,\alpha_\text{ini}, \alpha_\text{tail}}(x):=\begin{cases}
0 & \text{ for $x\le t$} \\
\displaystyle 1-\left(1+\frac{\alpha_\text{ini}}{\alpha_\text{tail}}
\left(\frac{x}{t}-1\right)\right)^{-\alpha_\text{tail}} & \text{ for
$x>t$.}
\end{cases}
\] We do not the standard parameterization from extreme value
theory but the parameterization from Riegel (2008) which is useful in a
reinsurance context.

The functions `pGenPareto`

and `dGenPareto`

provide the distribution function and the density function of the Pareto
distribution:

```
<- c(1:10) * 1000
x pGenPareto(x, t = 1000, alpha_ini = 1, alpha_tail = 2)
```

```
## [1] 0.0000000 0.5555556 0.7500000 0.8400000 0.8888889 0.9183673 0.9375000
## [8] 0.9506173 0.9600000 0.9669421
```

`plot(pGenPareto(1:5000, 1000, 1, 2), xlab = "x", ylab = "CDF(x)")`

`dGenPareto(x, t = 1000, alpha_ini = 1, alpha_tail = 2)`

```
## [1] 1.000000e-03 2.962963e-04 1.250000e-04 6.400000e-05 3.703704e-05
## [6] 2.332362e-05 1.562500e-05 1.097394e-05 8.000000e-06 6.010518e-06
```

`plot(dGenPareto(1:5000, 1000, 1, 2), xlab = "x", ylab = "PDF(x)")`

The package also provides the quantile function:

`qGenPareto(0:10 / 10, 1000, 1, 2)`

```
## [1] 1000.000 1108.185 1236.068 1390.457 1581.989 1828.427 2162.278 2651.484
## [9] 3472.136 5324.555 Inf
```

`rGenPareto(20, 1000, 1, 2)`

```
## [1] 3147.274 1734.730 2123.732 1595.111 1197.549 1532.917 8691.040 6366.773
## [9] 1131.391 1651.110 3682.903 1401.417 2538.556 6660.384 1643.598 1193.262
## [17] 1390.637 1073.550 1500.037 3276.521
```

`GenPareto_Layer_Mean(4000, 1000, t = 500, alpha_ini = 1, alpha_tail = 2)`

`## [1] 484.8485`

`GenPareto_Layer_Var(4000, 1000, t = 500, alpha_ini = 1, alpha_tail = 2)`

`## [1] 908942.5`

Let \(t>0\) and \(\alpha_\text{ini}, \alpha_\text{tail}>0\) and let \(X_i\sim \text{GenPareto}(t,\alpha_\text{ini}, \alpha_\text{tail})\). For known \(t\) the parameters \(\alpha_\text{ini}, \alpha_\text{tail}\) can be estimated with maximum likelihood.

*Example:*

Generalized Pareto distributed losses with \(t:=1000\) and \(\alpha_\text{ini}=1\), \(\alpha_\text{tail}=2\):

```
<- rGenPareto(10000, t = 1000, alpha_ini = 1, alpha_tail = 2)
losses GenPareto_ML_Estimator_Alpha(losses, 1000)
```

`## [1] 0.9767616 2.0160404`

Reporting thresholds and censoring of losses can be taken into
account as described for the function
`Pareto_ML_Estimator_Alpha`

.

```
<- rGenPareto(5000, t = 1000, alpha_ini = 1, alpha_tail = 2)
losses_1 <- rGenPareto(5000, t = 1000, alpha_ini = 1, alpha_tail = 2)
losses_2 <- losses_2 > 3000
reported <- losses_2[reported]
losses_2 <- c(losses_1, losses_2)
losses GenPareto_ML_Estimator_Alpha(losses, 1000)
```

`## [1] 0.6665051 2.0286163`

```
<- rep(1000, length(losses_1))
reporting_thresholds_1 <- rep(3000, length(losses_2))
reporting_thresholds_2 <- c(reporting_thresholds_1, reporting_thresholds_2)
reporting_thresholds GenPareto_ML_Estimator_Alpha(losses, 1000,
reporting_thresholds = reporting_thresholds)
```

`## [1] 0.9997103 1.9035688`

```
<- sample(c(2500, 5000, 10000), length(losses), replace = T)
limits <- losses > limits
censored <- limits[censored]
losses[censored] <- losses > reporting_thresholds
reported <- losses[reported]
losses <- reporting_thresholds[reported]
reporting_thresholds <- censored[reported]
censored GenPareto_ML_Estimator_Alpha(losses, 1000,
reporting_thresholds = reporting_thresholds)
```

`## [1] 0.8992253 6.0849359`

```
GenPareto_ML_Estimator_Alpha(losses, 1000,
reporting_thresholds = reporting_thresholds,
is.censored = censored)
```

`## [1] 0.9823981 1.9917070`

Let \(X\sim \text{GenPareto}(t,
\alpha_\text{ini}, \alpha_\text{tail})\) and \(T>t\). Then \(X|(X<T)\) has a *truncated
generalized Pareto distribution*. The Pareto functions mentioned
above are also available for the truncated generalized Pareto
distribution.

A `PGP_Model`

object contains the information required to
specify a collective model with a Panjer distributed claim count and a
generalized Pareto distributed severity.

**Claim count distribution:** Like in a
`PPP_Model`

the claim count distribution from the Panjer
class is specified by the expected frequency \(E(N)\) (attribute `FQ`

of the
object) and the dispersion \(D(N):=Var(N)/E(N)\) (attribute
`dispersion`

of the object).

**Severity distribution:** The generalized Pareto
distribution is specified by the parameters `t`

,
`alpha_ini`

, `alpha_tail`

and
`truncation`

.

A `PPP_Model`

object can be created using the constructor
function:

```
<- PGP_Model(FQ = 2, t = 1000, alpha_ini = 1, alpha_tail = 2,
PGPM truncation = 10000, dispersion = 1.5)
PGPM
```

```
##
## Panjer & Generalized Pareto model
##
## Collective model with a Negative Binomial distribution for the claim count and a generalized Pareto distributed severity.
##
## Negative Binomial Distribution:
## Expected Frequency: 2
## Dispersion: 1.5 (i.e. contagion = 0.25)
## Generalized Pareto Distribution:
## Threshold: 1000
## alpha_ini: 1
## alpha_tail: 2
## Truncation: 10000
##
## Status: 0
## Comments: OK
```

For PGP_Models the same methods are available as for PPP_Models:

```
<- PGP_Model(FQ = 2, t = 1000, alpha_ini = 1, alpha_tail = 2,
PGPM truncation = 10000, dispersion = 1.5)
Layer_Mean(PGPM, 4000, 1000)
```

`## [1] 2484.33`

`Layer_Sd(PGPM, 4000, 1000)`

`## [1] 2756.15`

`Layer_Var(PGPM, 4000, 1000)`

`## [1] 7596365`

```
<- c(0, 1000, 2000, 5000, 10000, Inf)
thresholds Excess_Frequency(PGPM, thresholds)
```

`## [1] 2.0000000 2.0000000 0.8509022 0.1614435 0.0000000 0.0000000`

`Simulate_Losses(PGPM, 10)`

```
## [,1] [,2] [,3]
## [1,] 1851.611 NaN NaN
## [2,] NaN NaN NaN
## [3,] 2617.298 NaN NaN
## [4,] 1113.183 1079.235 3369.614
## [5,] 1378.958 1741.678 1333.982
## [6,] NaN NaN NaN
## [7,] 2213.376 1957.310 NaN
## [8,] 1912.848 1715.038 NaN
## [9,] 1618.580 NaN NaN
## [10,] 1266.242 2168.332 4114.312
```

Fackler, M. (2013) Reinventing Pareto: Fits for both small and large losses. ASTIN Colloquium Den Haag

Johnson, N.L., and Kotz, S. (1970) Continuous Univariate Distributions-I. Houghton Mifflin Co

Philbrick, S.W. (1985) A Practical Guide to the Single Parameter Pareto Distribution. PCAS LXXII: 44–84

Riegel, U. (2008) Generalizations of common ILF models. Bl"{a}tter der DGVFM 29: 45–71

Riegel, U. (2018) Matching tower information with piecewise Pareto. European Actuarial Journal 8(2): 437–460

Schmutz, M., and Doerr, R.R. (1998) Das Pareto-Modell in der Sach-Rueckversicherung. Formeln und Anwendungen. Swiss Re Publications, Zuerich