Principal Curves of Oriented Points

Rpcop computes principal curves of oriented points (PCOP) following Delicado (2001) and Delicado and Huerta (2003). The package wraps the original C++ implementation with an R interface and returns both the original PCOP table and a princurve-style projected curve.

The main function is pcop(). It accepts a numeric matrix or a data frame whose columns are all numeric, containing only finite values. Rows are observations and columns are coordinates.

library(Rpcop)

set.seed(1)
n <- 120
t <- runif(n, -1, 1)
x <- cbind(t, t^2 + rnorm(n, sd = 0.08))

fit <- pcop(x, Ch = 1.5, Cd = 0.3, plot.true = FALSE)
names(fit)
## [1] "pcop.f1"     "pcop.f2"     "parameters"  "input_names" "call"
head(fit$pcop.f1)
##        param      dens      span    orth.var       pop1       pop2   pr.dir1
## 1 -1.0690409 0.2503839 0.2083333 0.002938929 -0.7546700 0.57376523 0.6164722
## 2 -0.9314294 0.3057348 0.2166667 0.003106390 -0.6708639 0.46461640 0.6543643
## 3 -0.7938234 0.3452349 0.3000000 0.002758622 -0.5811377 0.36028703 0.6906801
## 4 -0.6566439 0.4168378 0.3750000 0.003084806 -0.4832350 0.26419702 0.7582364
## 5 -0.5189354 0.5058162 0.4333333 0.003958056 -0.3754251 0.17851951 0.7893142
## 6 -0.3837072 0.5699055 0.4583333 0.005957643 -0.2655122 0.09974237 0.8939312
##      pr.dir2
## 1 -0.7873767
## 2 -0.7561794
## 3 -0.7231604
## 4 -0.6519797
## 5 -0.6139895
## 6 -0.4482043
summary(fit)
## Principal curve of oriented points summary
## Dimension: 2 
## Curve points: 18 
## Projected points: 120 
## Ch: 1.5 Cd: 0.3

pcop.f1 stores the PCOP output in the original tabular format. pcop.f2 stores the projected curve in the format used by princurve; fit$pcop.f2$s holds the projection of each observation onto the curve, one row per input row.

str(fit$pcop.f2, max.level = 1)
## List of 5
##  $ s       : num [1:120, 1:2] -0.55 -0.245 0.141 0.819 -0.557 ...
##   ..- attr(*, "dimnames")=List of 2
##  $ ord     : int [1:120] 116 10 27 47 92 118 55 69 56 38 ...
##  $ lambda  : num [1:120] 0.594 0.983 1.38 2.298 0.584 ...
##  $ dist_ind: num [1:120] 1.33e-02 1.01e-03 2.12e-04 9.83e-06 3.18e-03 ...
##  $ dist    : num 0.592
##  - attr(*, "class")= chr "principal_curve"
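To draw a curve from an object in this format, the rows of s are usually reordered by ord so that they follow increasing lambda. The sketch below uses a hand-built stand-in object (pc, a hypothetical name) with the same field names instead of a real fit, so it runs without the package. With a real fit, fit$pcop.f2 would take its place.

```r
# Stand-in for a princurve-style object (hypothetical; not produced by pcop()).
set.seed(1)
t0  <- runif(50, -1, 1)
s   <- cbind(t0, t0^2)           # points lying on a parabola, in input order
ord <- order(t0)                 # indices that sort the points along the curve

# Arc length along the ordered polyline, stored back in input order.
seg    <- sqrt(rowSums(diff(s[ord, , drop = FALSE])^2))
lambda <- numeric(length(t0))
lambda[ord] <- c(0, cumsum(seg))
pc <- list(s = s, ord = ord, lambda = lambda)

# Reorder the projected points so consecutive rows are neighbours on the curve.
curve_xy <- pc$s[pc$ord, , drop = FALSE]

# Uncomment to draw the points and the ordered curve:
# plot(pc$s, pch = 16, cex = 0.5); lines(curve_xy, col = 2, lwd = 2)
```

Plotting s without applying ord would connect the points in input order and produce a zigzag rather than a curve.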

The Ch argument scales the smoothing bandwidth as a multiple of the normal reference rule bandwidth. The Cd argument sets the approximate spacing between consecutive principal oriented points as a multiple of that smoothing bandwidth. The accepted ranges are documented in ?pcop.
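As a rough illustration of how the two constants interact, the sketch below computes a normal reference bandwidth for a single coordinate with stats::bw.nrd() and scales it by Ch and then by Cd. This is an assumption about the general idea only: the exact reference rule used by the native backend, including how it treats several dimensions, is not described here, and h_ref, h, and spacing are illustrative names.

```r
set.seed(1)
z <- rnorm(120)                  # one coordinate of some data

Ch <- 1.5
Cd <- 0.3

h_ref   <- stats::bw.nrd(z)      # normal reference bandwidth for a 1-D sample
h       <- Ch * h_ref            # larger Ch means a wider smoothing window
spacing <- Cd * h                # approximate gap between consecutive curve points
```

Under this reading, raising Ch smooths more and also spreads the curve points further apart, while raising Cd alone only changes the spacing.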

PCOP is distance-based, so coordinate scaling can affect the fitted curve. The wrapper first runs the native backend on the submitted scale. If that backend fails, it retries on internally standardized coordinates and maps the returned curve coordinates back to the submitted scale. The submitted data are not modified.
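The standardize-then-map-back step can be sketched in base R as follows. This only illustrates the idea and is not the wrapper's actual code; curve_z is a hypothetical placeholder for whatever a fit on the standardized data would return.

```r
set.seed(1)
x <- cbind(a = rnorm(100, mean = 5, sd = 1000), b = rnorm(100, sd = 0.01))

# Standardize each column; scale() stores the centers and scales as attributes.
z  <- scale(x)
mu <- attr(z, "scaled:center")
sg <- attr(z, "scaled:scale")

# Pretend curve_z came back from a fit on the standardized data (hypothetical).
curve_z <- z[1:5, , drop = FALSE]

# Map back to the submitted scale: multiply by the column scale, add the center.
curve_x <- sweep(sweep(curve_z, 2, sg, `*`), 2, mu, `+`)
```

Because x itself is never overwritten, the caller's data stay on the submitted scale throughout.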

The wrapper rejects missing, infinite, and non-numeric inputs before calling the native backend. Plotting is available for two-dimensional inputs:

pcop(x, plot.true = TRUE, lwd = 2, col = 2)

## Principal curve of oriented points
## Dimension: 2 
## Curve points: 18 
## Ch: 1.5 Cd: 0.3
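The input checks described earlier (missing, infinite, and non-numeric values) could look roughly like the helper below. check_input() is a hypothetical function written for illustration; it is not part of Rpcop and its error messages are made up.

```r
# Hypothetical helper; not part of Rpcop.
check_input <- function(x) {
  if (is.data.frame(x)) {
    # Every column must be numeric before converting to a matrix.
    if (!all(vapply(x, is.numeric, logical(1)))) {
      stop("all columns must be numeric")
    }
    x <- as.matrix(x)
  }
  if (!is.matrix(x) || !is.numeric(x)) {
    stop("x must be a numeric matrix or data frame")
  }
  # is.finite() is FALSE for NA, NaN, Inf, and -Inf.
  if (!all(is.finite(x))) {
    stop("x must not contain NA, NaN, or infinite values")
  }
  x
}

ok  <- check_input(data.frame(a = c(1, 2), b = c(3, 4)))
bad <- tryCatch(check_input(cbind(1, NA)), error = function(e) conditionMessage(e))
```

Running such checks in R before handing data to compiled code keeps error messages readable and avoids undefined behaviour in the backend.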