This vignette will cover pure IBNR functions of the NetSimR package. It will overview the theory behind these functions, propose possible uses and illustrate examples. The reporting delay distributions supported are LogNormal and Gamma. An application of the pure IBNR functions is suggested in the article “Escaping the triangle” accessible here.

Reporting delay is the time difference between when the accident occured and when the insurer was notified. Reporting delay is a positive and positively skewed variable. It can me modelled with a LogNormal or Gamma regression. As suggested by the article cited in the introduction, estimating reserves can be more accurate if the reserves are split to pure IBNR and IBNER. Also note tha reporting delay is a right truncated variable. The closer the exposure period to the valuation date, the more the truncation. As such reporting delay would be better modelled by a right truncated regression.

Pure IBNR refers to claims not reported yet to the insurer, while IBNER refers to the over/under estimation in the outstanding part of reported claims. Together the two form the insurer’s reserves for claims occured. Unearned premium will also be calculated by the functions provided, and this relates to future claims from business already written. If we split the period to days, one can think of the reporting delay cumulative distribution function as the proportion of the day earned net of pure IBNR, based on the maximum reporting delay (Maximum reporting delay would be the difference between the reference day and the valuation day). As integration is equivalent to summing the function values, when the partition size is one, one can derive that integrating from the maximum to the minimum reporting delay calculates the pure IBNR earned exposure for the period. This can be proportioned with the full duration and the earned duration, to provide different metrics.

In this section we will suggest possible uses of the specified functions.

The function will input the inception, expiry, valuation dates and estimate the proportions unearned and allocated to pure IBNR. The pure IBNR proportion multiplied by the frequency-severity from exposure rating, can provide an accurate estimate of the reserve. Alternatively mutliplying this with the premium and a loss ratio proxy will provide an acceptable approximation of the reserve. The proportion of the unearned exposure multiplied by the premium will provide the unearned premium reserve. One can use the exposure net of unearned period and pure IBNR as the offset (exposure) for modelling frequency in exposure or experience rating. This will be the modellers choice rather than removing the last periods in situations where data is less. The pure IBNR and unearned exposures can be used for simulating stochastic claim reserves. Following the above, the analyst can incorporate the unearned and pure IBNR exposure models to dashboards, hence automate estimating the reserves. For all the above this method is able to drill down individual segments and be insensitive to portfolio mix changes, when the reporting delay is modelled with generalised linear models. When this method is combined with a truncated regression, the method can adjust for reporting delay development for the latest periods. The method can also reveal patterns otherwise invisible to the analyst. Reinsurance structures are easier to be implemneted on top of this method compared to triangles.

In the below example we will following these steps:

- Construct a policy dataset with policy inception/expiry dates
- Generate claims
- Generate accident and reporing dates, through a reporting delay disribution
- Derive reported claims up to the valuation date
- Model reporting delay
- Use the function of the NetSimR package to estimate PureIBNR duration
- Model claim frequency with adjusted claim duration and compare modelling without adjusting
- Predict claim frequency for pure IBNR and unearned duration and compare with our data
- Generate a chain ladder estimate for pure IBNR claims count and compare with the data and the GLM approach
- Simulate the pure IBNR reserve distribution

```
library("NetSimR")
#set the seed
set.seed(0)
#Set the reporting delay distribution parameters
<- 3
mu <- 1
sigma
#Generate reporting delays and preview the distribution
<-rlnorm(1000,mu,sigma)
xsummary(x)
```

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.7895 9.8902 18.9372 32.7484 39.9504 526.5859
```

`hist(x, breaks = 100)`

```
#Generate dates data
<-100
PoliciesPerDay<-365*2
PeriodLength<-as.Date("2011/1/1")
InceptionDateStart<-seq(InceptionDateStart, by = "day", length.out = PeriodLength)
InceptionDate<-rep(InceptionDate,PoliciesPerDay)
InceptionDate#assume all policies are yearly
<-365
DayPolicyDuration<-InceptionDate+DayPolicyDuration
ExpiryDate<-max(InceptionDate)
ValuationDate<-data.frame(InceptionDate,ExpiryDate)
Datasummary(Data)
```

```
## InceptionDate ExpiryDate
## Min. :2011-01-01 Min. :2012-01-01
## 1st Qu.:2011-07-02 1st Qu.:2012-07-01
## Median :2011-12-31 Median :2012-12-30
## Mean :2011-12-31 Mean :2012-12-30
## 3rd Qu.:2012-07-01 3rd Qu.:2013-07-01
## Max. :2012-12-30 Max. :2013-12-30
```

` ValuationDate`

`## [1] "2012-12-30"`

```
#Generate Claims Counts Data and reporting delays
#Assume only 1 claim per policy possible with probability of claiming 10%
<-0.1
ClaimFreq$ClaimCount<-ifelse(runif(nrow(Data))<ClaimFreq,1,0)
Data#generate accident date assuming uniformly happening throughout the year
$AccidentDate<-Data$InceptionDate+runif(nrow(Data))*(Data$ExpiryDate-Data$InceptionDate)
Data#generate reporting delays
$ReportingDelay<-round(Data$ClaimCount*rlnorm(nrow(Data),mu,sigma),1)
Data#generate reporting days
$NotificationDate<-Data$AccidentDate+Data$ReportingDelay
Data#remove claims reported beyond valuation day
$ClaimReportedBeforeValuation<-ifelse(Data$NotificationDate>ValuationDate,0,Data$ClaimCount)
Data#Maximum reporting delays
$MaxReportingDelay<-as.numeric(ValuationDate-Data$AccidentDate)+1
Data
#Earned Duration
$EarnedDuration<-ifelse(Data$ExpiryDate>ValuationDate,as.numeric(ValuationDate-Data$InceptionDate),as.numeric(Data$ExpiryDate-Data$InceptionDate))/365
Data
#Filter claims only data
<-Data[Data$ClaimReportedBeforeValuation==1,]
ModelData
#model reporting delays with a linear model
<-lm(log(ModelData$ReportingDelay)~1)
lm#summary(lm)
cbind(coefficients(lm),summary(lm)$sigma)
```

```
## [,1] [,2]
## (Intercept) 2.916398 0.958557
```

```
#model reporting delays with a truncated model (splitting to monthly periods would be more accurate in terms of maximum reporting delay)
library("crch")
<-crch(log(ModelData$ReportingDelay)~1, truncated = TRUE, right = log(ModelData$MaxReportingDelay), dist = "gaussian", link.scale = "identity")
rTLm#summary(rTLm)
coefficients(rTLm)
```

```
## (Intercept) (scale)_(Intercept)
## 2.9752603 0.9919743
```

```
#Generate and predict pure IBNR and UPR Claims
$PureIBNRClaims<-ifelse(Data$AccidentDate>ValuationDate,0,Data$ClaimCount-Data$ClaimReportedBeforeValuation)
Data$UPRClaims<-Data$ClaimCount-Data$ClaimReportedBeforeValuation-Data$PureIBNR
Data<-PureIBNRLNorm(Data$InceptionDate,Data$ExpiryDate,ValuationDate,coefficients(rTLm)[1],coefficients(rTLm)[2])
PureIBNRPrediction$PredictedPureIBNRYears<-PureIBNRPrediction$PureIBNRDuration/DayPolicyDuration
Data$PredictedUPRYears<-PureIBNRPrediction$UnearnedDuration/DayPolicyDuration
Data$NetEarnedDuration<-1-Data$PredictedUPRYears-Data$PredictedPureIBNRYears+0.00001
Data
#Predict Claim Frequency with reduced exposure to adjust for pure IBNR
<-glm(Data$ClaimReportedBeforeValuation~1, offset = log(Data$NetEarnedDuration), family = "poisson")
AdjFreqGLM#summary(AdjFreqGLM)
<-exp(AdjFreqGLM$coefficients)
PredictedFrequency
#Compare as if not adjusting for pure IBNR
<-glm(Data$ClaimReportedBeforeValuation~1, offset = log(Data$EarnedDuration+0.00001), family = "poisson")
UnAdjFreqGLM#summary(UnAdjFreqGLM)
cbind(AdjustedModel=round(exp(AdjFreqGLM$coefficients),3),UnAdjustedModel=round(exp(UnAdjFreqGLM$coefficients),3),Actual=ClaimFreq)
```

```
## AdjustedModel UnAdjustedModel Actual
## (Intercept) 0.1 0.094 0.1
```

```
#Compare UPR Claim predictions
cbind(Actual=sum(Data$UPRClaims),Predicted=round(sum(Data$PredictedUPRYears)*PredictedFrequency,0))
```

```
## Actual Predicted
## (Intercept) 1888 1822
```

```
#Compare Pure IBNR Claim predictions
cbind(Actual=sum(Data$PureIBNRClaims),Predicted=round(sum(Data$PredictedPureIBNRYears)*PredictedFrequency,0), Theoretical=327)
```

```
## Actual Predicted Theoretical
## (Intercept) 344 318 327
```

```
#Theoretical comes from re-running with the theoretical parameter values
#Compare Pure IBNR predictions with Chain Ladder
<-sum(Data[Data$AccidentDate<as.Date("2012/1/1") & Data$NotificationDate<as.Date("2012/1/1"), ]$ClaimReportedBeforeValuation)
ClaimsAY1RY1
<-sum(Data[Data$AccidentDate<as.Date("2012/1/1") & Data$NotificationDate<as.Date("2013/1/1"), ]$ClaimReportedBeforeValuation)
ClaimsAY1RY2
<-sum(Data$ClaimReportedBeforeValuation)-ClaimsAY1RY1-ClaimsAY1RY2
ClaimsAY2RY1
<-(ClaimsAY1RY2/ClaimsAY1RY1-1)*ClaimsAY2RY1
PredictedPureIBNRClaimsChainLadder
cbind(Actual=sum(Data$PureIBNRClaims),PredictedGLM=round(sum(Data$PredictedPureIBNRYears)*PredictedFrequency,0),PredictedCL=round(PredictedPureIBNRClaimsChainLadder,0))
```

```
## Actual PredictedGLM PredictedCL
## (Intercept) 344 318 321
```

```
#Note: Chain ladder can predict better if periods brake down to months or quarters
#Sliced LogNormal-Pareto claim Severity assumption
=5.6
mu=1.65
sigma=10000
s=1.5
alpha
#Simulate the reserve distribution
=10000
numberOfSimulations<- numeric(length = numberOfSimulations)
SimulatedYears <- sum(Data$PredictedPureIBNRYears)*PredictedFrequency
ExpectedFrequency
for (i in 1 :numberOfSimulations){
=round(sum(qSlicedLNormPareto(runif(rpois(1,ExpectedFrequency)),mu,sigma,s,alpha)),0)
SimulatedYears[i]
}
#Visualising pure IBNR reserve distribution
head(SimulatedYears)
```

`## [1] 389186 507867 1628801 236865 471060 437146`

`hist(SimulatedYears, breaks = 1000, xlim = c(0,1.5e6))`

`summary(SimulatedYears)`

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 157777 293934 335702 373093 394701 25837662
```

`#Note: the strength of the method illustrated is that it can have GLM factors for reporting delay, frequency and severity, and run the simulations per risk`