Parameter Estimation of the Geometric Magnitude Distribution

2024-01-15

Introduction

The geometric distribution of meteor magnitudes is a frequently used statistical model to describe the real magnitude distribution of a meteor shower. The observable magnitude distribution of meteors is then \[ {\displaystyle P[M = m] \sim f(m) \, \mathrm r^{-m}} \,\mathrm{,} \] where m >= -0.5 is the difference between the limiting magnitude and the meteor magnitude. f(m) is the perception probability function.

The estimation of the population index r, briefly called the r-value, is a common task in the evaluation of meteor magnitudes. Here we demonstrate two methods for unbiased estimation of this parameter.

First, we obtain some magnitude observations from the example data set, which also includes the limiting magnitude.

observations <- with(PER_2015_magn$observations, {
    idx <- !is.na(lim.magn) & sl.start > 135.81 & sl.end < 135.87
    data.frame(
        magn.id = magn.id[idx],
        lim.magn = lim.magn[idx]
    )
})
head(observations, 5) # Example values
magn.id lim.magn
225413 5.30
225432 5.95
225438 6.01
225449 6.48
225496 5.50

Next, the observed meteor magnitudes are matched with the corresponding observations. This is necessary as we need the limiting magnitudes of the observations to determine the r-value.

Using

magnitudes <- merge(
  observations,
  as.data.frame(PER_2015_magn$magnitudes),
  by = 'magn.id'
)
magnitudes$magn <- as.integer(as.vector(magnitudes$magn))
head(magnitudes[magnitudes$Freq>0,], 5) # Example values

we obtain a data frame with the absolute observed frequencies Freq for each observation of a magnitude class:

magn.id lim.magn magn Freq
9 225413 5.30 4 1.0
11 225413 5.30 1 2.0
14 225413 5.30 3 3.0
15 225432 5.95 4 2.0
17 225432 5.95 3 1.5

This data frame contains a total of 97 meteors. This is a sufficiently large number to estimate the r-value.

Maximum Likelihood Method I

The maximum likelihood method can be used to estimate the r-value in an unbiased manner. For this, the function dvmgeom() is needed, which returns the probability density of the observable meteor magnitudes when the r-value and the limiting magnitudes are known.

The following algorithm estimates the r-value by maximizing the likelihood with the optim() function. The function ll returns the negative log-likelihood, as optim() identifies a minimum. The expression subset(magnitudes, (magnitudes$lim.magn - magnitudes$magn) > -0.5 ensures that meteors fainter than the limiting magnitude are not used if they exist.

# maximum likelihood estimation (MLE) of r
result <- with(subset(magnitudes, (magnitudes$lim.magn - magnitudes$magn) > -0.5), {
    # log likelihood function
    ll <- function(r) -sum(Freq * dvmgeom(magn, lim.magn, r, log=TRUE))
    r.start <- 2.0 # starting value
    r.lower <- 1.2 # lowest expected value
    r.upper <- 4.0 # highest expected value
    # find minimum
    optim(r.start, ll, method='Brent', lower=r.lower, upper=r.upper, hessian=TRUE)
})

This gives the expected value and the variance of the r-value:

r.mean <- result$par # mean of r
print(r.mean)
#> [1] 2.34661
r.var <- 1/result$hessian[1][1] # variance of r
print(r.var)
#> [1] 0.0171495

Maximum Likelihood Method II

With the maximum likelihood method, it can be demonstrated that the mean difference between meteor magnitudes and the limiting magnitude is an unbiased estimator for the r-value. This mean is straightforward to calculate:

m.mean <- with(magnitudes, sum((lim.magn - magn) * Freq)/sum(Freq))
print(m.mean)
#> [1] 3.353402

Similarly, its variance is:

m.var <- with(magnitudes, {
    n <- sum(Freq)
    sum((lim.magn - magn - m.mean)^2 * Freq)/((n-1) * n)
})
print(m.var)
#> [1] 0.02248767

We can easily determine the mean for an r-value using the Laplace transform of the perception probabilities by setting s=log(r). However, since we aim to inversely determine the r-value from the mean value, we first generate the necessary values and then employ the splinefun() function for interpolation:

r.mean.fun <- with(new.env(), {
    r <- seq(1.3, 3.5, 0.1)
    s <- log(r)
    m.mean <- -vmperception.l(s, deriv.degree = 1L)/vmperception.l(s)
    splinefun(m.mean, r)
})

This approach yields the r-value as follows:

r.mean <- r.mean.fun(m.mean)
print(r.mean)
#> [1] 2.349886

Assuming that the mean is normally distributed and that the variance of magnitudes m.var is small, we can obtain the variance of the r-value:

r.var <- r.mean.fun(m.mean, deriv = 1L)^2 * m.var
print(r.var)
#> [1] 0.01182143

The method described herein for estimating the r-value offers an advantage over the previous method. It is not only more straightforward to execute but also less computationally demanding.

Residual Analysis

So far, we have operated under the assumption that the real distribution of meteor magnitudes is exponential and that the perception probabilities are accurate. We now use the Chi-Square goodness-of-fit test to check whether the observed frequencies match the expected frequencies. Then, using the estimated r-value, we retrieve the relative frequencies p for each observation and add them to the data frame magnitudes:

magnitudes$p <- with(magnitudes, dvmgeom(m = magn, lm = lim.magn, r.mean))

We must also consider the probabilities for the magnitude class with the brightest meteors.

magn.min <- min(magnitudes$magn)

The smallest magnitude class magn.min is -6. In calculating the probabilities, we assume that the magnitude class -6 contains meteors that are either brighter or equally bright as -6 and thus use the function pvmgeom() to determine their probability.

idx <- magnitudes$magn == magn.min
magnitudes$p[idx] <- with(
    magnitudes[idx,],
    pvmgeom(m = magn + 1L, lm = lim.magn, r.mean, lower.tail = TRUE)
)

This ensures that the probability of observing a meteor of any given magnitude is 100%. This is known as the normalization condition. Accordingly, the Chi-Square goodness-of-fit test will fail if this condition is not met.

We now create the contingency table magnitutes.observed for the observed meteor magnitudes and its margin table.

magnitutes.observed <- xtabs(Freq ~ magn.id + magn, data = magnitudes)
magnitutes.observed.mt <- margin.table(magnitutes.observed, margin = 2) 
print(magnitutes.observed.mt)
#> magn
#>   -6   -5   -4   -3   -2   -1    0    1    2    3    4    5    6    7 
#>  0.0  0.0  0.0  0.0  3.0  4.0  7.0 10.0 23.0 26.5 20.0  3.0  0.5  0.0

Next, we check which magnitude classes need to be aggregated so that each contains at least 10 meteors, allowing us to perform a Chi-Square goodness-of-fit test.

The last output shows that meteors of magnitude class 0 or brighter must be combined into a magnitude class 0-. Meteors with a brightness less than 4 are grouped here in the magnitude class 4+, and a new contingency table magnitudes.observed is created:

magnitudes$magn[magnitudes$magn <= 0] <- '0-'
magnitudes$magn[magnitudes$magn >= 4] <- '4+'
magnitutes.observed <- xtabs(Freq ~ magn.id + magn, data = magnitudes)
print(margin.table(magnitutes.observed, margin = 2))
#> magn
#>   0-    1    2    3   4+ 
#> 14.0 10.0 23.0 26.5 23.5

We now need the corresponding expected relative frequencies

magnitutes.expected <- xtabs(p ~ magn.id + magn, data = magnitudes)
magnitutes.expected <- magnitutes.expected/nrow(magnitutes.expected)
print(sum(magnitudes$Freq) * margin.table(magnitutes.expected, margin = 2))
#> magn
#>       0-        1        2        3       4+ 
#> 13.03208 13.09614 19.07151 21.27210 30.52817

and then carry out the Chi-Square goodness-of-fit test:

chisq.test.result <- chisq.test(
    x = margin.table(magnitutes.observed, margin = 2),
    p = margin.table(magnitutes.expected, margin = 2)
)

As a result, we obtain the p-value:

print(chisq.test.result$p.value)
#> [1] 0.3406622

If we set the level of significance at 5 percent, then it is clear that the p-value with 0.3406622 is greater than 0.05. Thus, under the assumption that the magnitude distribution follows an geometric meteor magnitude distribution and assuming that the perception probabilities are correct (i.e., error-free or precisely known), the assumptions cannot be rejected. However, the converse is not true; the assumptions may not necessarily be correct. The total count of meteors here is too small for such a conclusion.

To verify the p-value, we also graphically represent the Pearson residuals:

chisq.test.residuals <- with(new.env(), {
    chisq.test.residuals <- residuals(chisq.test.result)
    v <- as.vector(chisq.test.residuals)
    names(v) <- rownames(chisq.test.residuals)
    v
})

plot(
    chisq.test.residuals,
    main="Residuals of the chi-square goodness-of-fit test",
    xlab="m",
    ylab="Residuals",
    ylim=c(-3, 3),
    xaxt = "n"
)
abline(h=0.0, lwd=2)
axis(1, at = seq_along(chisq.test.residuals), labels = names(chisq.test.residuals))