---
title: "Inside kofn: Building on the rlang MLE Stack"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Inside kofn: Building on the rlang MLE Stack}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4.5
)
```

## The design in one paragraph

`kofn` is a small package.  Most of what a user sees when working with
it, the inference methods, the optimizer composition, the ability to
combine independent fits, the bridge to distribution algebra, is not
defined inside `kofn`.  It comes from four sibling packages that sit
underneath `kofn` in a deliberately layered stack.  This vignette is a
guided tour of that stack using a `kofn` model as the running example.
The point is to show what `kofn` had to specialize (the
domain-specific content) and what it gets for free by conforming to
ecosystem-wide generics.

The layering is a design choice, not an accident.  Building
reliability packages on a uniform substrate means the next package up
the chain, `dagsys` for general coherent systems, or a future
`maskedparallel`, can reuse the same inference machinery without
rebuilding it.  Specialization only enters when a new concept genuinely
cannot be expressed in the existing vocabulary.

## The stack

```
  kofn                 (this package: k-out-of-m censoring MLE)
    |
  likelihood.model     (likelihood calculus as a set of generics)
  compositional.mle    (MLE solvers as first-class composable objects)
    |
  algebraic.mle        (MLE objects with asymptotic inference)
    |
  algebraic.dist       (probability distribution algebra)
```

Each layer crystallizes a single concept:

| Layer | Concept |
|---|---|
| `algebraic.dist`    | A probability distribution is a first-class value.  `normal(0, 1) + normal(0, 1)` is a `normal(0, 2)`.  Distribution arithmetic simplifies when it can and falls back to Monte Carlo when it cannot. |
| `algebraic.mle`     | An MLE is not a number.  It is an object with an asymptotic covariance.  `confint`, `vcov`, `AIC`, `bias`, `mse`, `pred`, and `combine` are defined on it. |
| `likelihood.model`  | A likelihood model is specified by generics (`loglik`, `score`, `hess_loglik`, `fim`, `rdata`, `assumptions`).  A package that implements these generics gets Fisherian inference (support functions, relative likelihoods, likelihood intervals) for free. |
| `compositional.mle` | A solver is a function.  Functions compose.  `lbfgsb() %>>% nelder_mead()` is a sequential chain; `lbfgsb() %|% nelder_mead()` is a parallel race.  `with_restarts(lbfgsb(), n = 5)` is a wrapper. |

The `compositional.mle` package has its own vignette, `mle-ecosystem`,
that walks through the layers from the other direction.  Read it if
you want the bottom-up view.  This vignette is top-down: start with a
`kofn` model and peel back the layers.

## A running example

We will use a 3-component series system.  The system fails when the
first component fails (series, $k = 1$); component lifetimes are
i.i.d. exponential with unknown rates.  Series is the cleanest case
for demonstrating the ecosystem generics; richer configurations (and
the identifiability issues that come with them) are the subject of
other vignettes.

```{r example-setup}
library(kofn)
library(flexhaz)
set.seed(42)

model <- kofn(k = 1, m = 3, component = dfr_exponential())
gen   <- rdata(model)
df    <- gen(theta = c(0.3, 0.8, 1.5), n = 100)
head(df, 3)
```

The object `model` is our entry point into the ecosystem.  It is an S3
object inheriting from both `kofn` and `likelihood_model`.

```{r model-class}
class(model)
```

That second class, `likelihood_model`, is the contract: `kofn` has
promised to implement the likelihood calculus as generics.

## Layer 1: the likelihood calculus (`likelihood.model`)

`likelihood.model` exports four small generics: `loglik`, `score`,
`hess_loglik`, and `fim`.  Each one is a *closure factory*: you pass a
model and get back a function that takes data.

```{r likelihood-generics}
ll <- loglik(model)                  # ll: function(df, par) -> numeric
sc <- score(model)                   # sc: function(df, par) -> gradient
H  <- hess_loglik(model)             # H:  function(df, par) -> Hessian

par0 <- c(0.3, 0.8, 1.5)
ll(df, par0)
sc(df, par0)
```

The package implements `loglik.exp_kofn` and `score.exp_kofn` as S3
methods.  The generic dispatch is what makes it possible for any other
package, one that only knows about `likelihood_model` objects, to
consume a `kofn` model without knowing anything about k-out-of-m
systems.  This is the payoff of conforming to a published interface.

`likelihood.model` also provides `rdata` (a closure factory for
simulating data from the model) and `assumptions` (a description of
what the model assumes).  `kofn` implements both.

```{r rdata-assumptions}
assumptions(model)[1:3]              # first 3 assumptions
```

## Layer 2: the optimizer (`compositional.mle`)

`fit(model)` returns a closure that optimizes the likelihood.
Internally it delegates to `kofn::solve_mle`, which wraps
`compositional.mle`'s L-BFGS-B primary solver with a log-scale
Nelder-Mead fallback.  That detail is intentionally opaque to the
caller.

```{r fit-default}
fit_fn <- fit(model)
res <- fit_fn(df, n_starts = 1L)
round(coef(res), 3)
```

What if you want a different optimization strategy?  You do not have
to write a new solver.  You compose existing ones:

```{r compose-solver}
library(compositional.mle)

# Build the problem yourself to use compositional.mle directly
prob <- mle_problem(
  loglike    = function(par) ll(df, par),
  constraint = mle_constraint(support = function(par) all(par > 0))
)

# Sequential: grid search as a warm start, then L-BFGS-B for polish
strategy <- grid_search(lower = rep(0.05, 3),
                         upper = rep(2.0, 3), n = 3) %>>%
            lbfgsb(lower = rep(1e-10, 3))
res_chain <- strategy(prob, theta0 = rep(0.5, 3))
round(coef(res_chain), 3)
```

The `%>>%` operator is **sequential composition**: run the first
solver, pipe its estimate into the second.  There is also `%|%`
(**parallel race**), which runs two solvers in parallel and keeps
whichever gets a higher log-likelihood:

```{r parallel-race}
race <- lbfgsb(lower = rep(1e-10, 3)) %|% nelder_mead()
res_race <- race(prob, theta0 = rep(0.5, 3))
round(coef(res_race), 3)
```

Neither composition operator is defined in `kofn`.  They apply to any
MLE problem specified through `mle_problem`.  `kofn` simply built its
likelihood to be compatible.

## Layer 3: the MLE object (`algebraic.mle`)

Fit results carry inference operations.  All of the following come
from `algebraic.mle` (or methods that `algebraic.mle` registers on its
result classes), not from `kofn`:

```{r mle-inference}
library(algebraic.mle)

round(coef(res), 3)                  # point estimates
round(se(res), 3)                    # standard errors
round(confint(res), 3)[1:3, ]        # 95% CIs, first three params
logLik(res)                          # log-likelihood
AIC(res); BIC(res)                   # information criteria
nobs(res); nparams(res)              # bookkeeping
```

Beyond the classics, `algebraic.mle` supplies inference-theoretic
machinery:

```{r mle-advanced}
round(bias(res), 4)                  # O(1/n) bias (zero for MLE at truth)
round(observed_fim(res), 1)[1:3, 1:3]  # observed Fisher information
```

The payoff for bundling these under one class becomes clear when we
compose estimates.  Suppose we run the same experiment twice
(independent replicates) and want to pool them:

```{r combine-estimates}
df2  <- gen(theta = c(0.3, 0.8, 1.5), n = 100)
res1 <- fit_fn(df,  n_starts = 1L)
res2 <- fit_fn(df2, n_starts = 1L)

pooled <- combine(res1, res2)        # inverse-variance weighted
round(coef(pooled), 3)
round(se(pooled), 3)
nobs(pooled)                         # 200 = 100 + 100
```

`combine` is not a `kofn` function.  It is defined in `algebraic.mle`
for any MLE object whose asymptotic covariance is known.  `kofn` gets
it because it uses the ecosystem's currency.

## Layer 4: the distribution bridge (`algebraic.dist`)

MLE results are asymptotically normal.  `algebraic.mle::as_dist`
converts any MLE into the corresponding multivariate normal
distribution, at which point the full `algebraic.dist` algebra
applies:

```{r as-dist}
d <- as_dist(res)
class(d)
```

This matters when you want to reason about the sampling distribution
of derived quantities.  For a redundant system, the mean time to the
first failure is a nonlinear function of the rates; its sampling
distribution can be obtained through the delta method, but
`algebraic.mle::rmap` does the bookkeeping automatically:

```{r rmap-delta}
# g(lambda) = 1/sum(lambda): mean time to first failure (series system)
g <- function(lam) 1 / sum(lam)
res_mttff <- rmap(res, g)
round(coef(res_mttff), 3)            # point estimate
round(se(res_mttff), 4)              # delta-method SE
```

The delta method is general.  Nothing above is specific to `kofn`.

## What had to be specialized

Everything in the previous four sections is generic ecosystem code.
So what did `kofn` actually have to write?  Three kinds of thing:

**1. The likelihood itself**, which encodes k-out-of-m censoring:

```r
# R/exp_kofn.R (simplified)
loglik.exp_kofn <- function(model) {
  function(df, par) {
    # Inclusion-exclusion expansion for k = m (parallel).
    # Critical-state enumeration for general k.
    # Interval-censored contributions for Scheme 1 rows.
    ...
  }
}
```

This is where the domain work lives: the IE expansion for exponential
parallel, the critical-state enumeration for general k, the truncated
Weibull moments for the EM E-step, and the interval-censoring
contributions for Scheme 1.  These could not be inherited because
they encode the physics of k-out-of-m censoring.

**2. Data-generating processes**.  `rdata.exp_kofn` and
`rdata.wei_kofn` implement the forward simulator (generate component
lifetimes, compute system failure time, record censoring pattern).
Trivial in principle, but the package has to know the model.

**3. Custom observation schemes**.  `loglik_scheme1` (periodic
inspection) and `loglik_masked` (partial autopsy) are two extra
observation mechanisms that require their own likelihood construction.
Each of them, once written, plugs back into the same generics above.

Everything else, the solver, the inference, the composition, the
distributional bridge, came from the ecosystem.

## Extending the stack

The same interface is what makes further specialization cheap.  A
future `dagsys` package, for arbitrary coherent systems described as
DAGs of components, would do exactly what `kofn` does:

1. Define a new S3 class (e.g., `dag_system`) inheriting from
   `likelihood_model`.
2. Implement `loglik.dag_system`, `score.dag_system`, and
   `rdata.dag_system`.
3. Arrange for `fit.dag_system` to call `solve_mle` (or a custom
   `compositional.mle` strategy).

Step 3 alone brings in `coef`, `vcov`, `confint`, `summary`,
`logLik`, `AIC`, `BIC`, `bias`, `mse`, `combine`, `rmap`, `as_dist`,
and all of distribution algebra.  The specialization is only the
domain-specific likelihood, which is where the real work is anyway.

This is the bet the rlang stack is making: keep the generic layer
small and stable, and the specialization layer earns the right to
exist by introducing a genuinely new concept.  `kofn` earned it by
introducing k-out-of-m censoring.  `maskedcauses` earned it by
introducing series-system masked cause of failure and publishing the
`series_md` protocol (an S3 class specialising `likelihood_model`)
that `maskedhaz` then implements against a different hazard class.
The shared infrastructure is what lets both the protocol package and
its alternative implementations stay small.

## Further reading

- `vignette("mle-ecosystem", package = "compositional.mle")` walks
  the same stack bottom-up with simpler examples.
- `vignette("getting-started", package = "likelihood.model")` goes
  deeper on the likelihood calculus generics.
- `vignette("algebraic-mle-integration", package = "likelihood.model")`
  covers the `fisher_mle` class and how it inherits from `mle`.
- `vignette("strategy-design", package = "compositional.mle")` details
  the composition operators and why SICP-style closure matters for
  solvers.