R already has two OO systems built-in (S3 and S4) and many additional OO systems are available in CRAN packages. Why did we decide more work was needed? This vignette will discuss some of the motivations behind S7, focussing on the aspects of S3 and S4 that have been found to be particularly challenging in practice.
S3 is very informal, meaning that there’s no formal definition of a class. This makes it impossible to know exactly which properties an object should or could possess, or even what its parent class should be. S7 resolves this problem with a formal definition encoded in a class object produced by new_class(). This includes support for validation (and avoiding validation where needed) as inspired by S4.
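As a minimal sketch of what a formal class definition looks like, assuming the S7 package is installed: the Range class, its two properties, and the validator messages below are hypothetical and chosen only for illustration.

```r
library(S7)

# Hypothetical class: a numeric range with two declared properties.
Range <- new_class("Range",
  properties = list(
    start = class_double,
    end = class_double
  ),
  # The validator returns NULL for a valid object, or a character
  # vector describing the problem.
  validator = function(self) {
    if (length(self@start) != 1 || length(self@end) != 1) {
      "@start and @end must each be length 1"
    } else if (self@end < self@start) {
      "@end must be greater than or equal to @start"
    }
  }
)

r <- Range(start = 1, end = 10)
r@end

# Construction fails when the validator reports a problem.
bad <- tryCatch(Range(start = 10, end = 1), error = function(e) "invalid")
bad
```

Because the properties and their types are declared up front, both the reader and the system can tell what a Range object must contain.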
When a new user encounters an S3 generic, they are often confused because the implementation of the function appears to be missing. S7 has a thoughtfully designed print method that makes it clear what methods are available and how to find their source code.
Properties of an S3 class are usually stored in attributes, but, by default, attr() does partial matching, which can lead to bugs that are hard to diagnose. Additionally, attr() returns NULL if an attribute doesn’t exist, so misspelling an attribute can lead to subtle bugs. @ fixes both of these problems.
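The difference can be sketched as follows, assuming the S7 package is installed. The units attribute and the Length class are hypothetical names used only for this illustration.

```r
library(S7)

# Base R: attr() partially matches attribute names by default.
x <- structure(1:3, units = "cm")
partial <- attr(x, "un")               # partial match on "units"
exact <- attr(x, "un", exact = TRUE)   # no match, silently NULL
typo <- attr(x, "unitz")               # misspelling, silently NULL

# S7: property names must match exactly, and an unknown
# property name is an error rather than NULL.
Length <- new_class("Length", properties = list(units = class_character))
len <- Length(units = "cm")
len@units
res <- tryCatch(len@un, error = function(e) "error")
res
```

The silent NULL results in the base R half are the kind of bug that can go unnoticed; the S7 half fails immediately at the point of the mistake.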
S3 method dispatch is complicated for compatibility with S. This complexity affects relatively little code, but when you attempt to dive into the details it makes UseMethod() hard to understand. As much as possible, S7 avoids any “funny” business with environments or promises, so that there is no distinction between argument values and local values.
S3 is primarily designed for single dispatch and double dispatch is only provided for a handful of base generics. It’s not possible to reuse the implementation for user generics. S7 provides a standard way of doing multiple dispatch (including double dispatch) that can be used for any generic.
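A minimal sketch of double dispatch on a user-defined generic, assuming the S7 package is installed. The Dog and Cat classes and the meet() generic are hypothetical.

```r
library(S7)

# Hypothetical classes for illustration.
Dog <- new_class("Dog")
Cat <- new_class("Cat")

# A user-defined generic that dispatches on both arguments.
meet <- new_generic("meet", c("x", "y"))

# Methods are registered for a pair of classes.
method(meet, list(Dog, Cat)) <- function(x, y, ...) "dog chases cat"
method(meet, list(Cat, Dog)) <- function(x, y, ...) "cat hisses at dog"

meet(Dog(), Cat())
meet(Cat(), Dog())
```

The same pattern extends to any generic you define; it is not restricted to a fixed set of base operators.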
NextMethod() is unpredictable since you can’t tell exactly which method will be called by only reading the code; you instead need to know both the complete class hierarchy and what other methods are currently registered (and loading a package might change those methods). S7 takes a different approach with super(), requiring explicit specification of the superclass to be used.
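A minimal sketch of explicit superclass dispatch, assuming the S7 package is installed. The Base and Child classes and the describe() generic are hypothetical.

```r
library(S7)

# Hypothetical two-level class hierarchy.
Base <- new_class("Base")
Child <- new_class("Child", parent = Base)

describe <- new_generic("describe", "x")

method(describe, Base) <- function(x, ...) "base"

# The child method names the superclass it delegates to, so the
# target is visible from the code alone.
method(describe, Child) <- function(x, ...) {
  paste("child, then", describe(super(x, to = Base)))
}

describe(Child())
```

Here the delegation target is written in the method body, so a reader does not need to know what else is registered to see where the call goes.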
Conversion between S3 classes is only implemented via loose convention: if you implement a class foo, then you should also provide a generic as.foo() to convert other objects to that type. S7 avoids this problem by providing the double-dispatch convert() generic so that you only need to provide the appropriate methods.
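A minimal sketch of registering a conversion, assuming the S7 package is installed. The Celsius and Fahrenheit classes are hypothetical.

```r
library(S7)

# Hypothetical classes for illustration.
Celsius <- new_class("Celsius", properties = list(value = class_double))
Fahrenheit <- new_class("Fahrenheit", properties = list(value = class_double))

# One method on the shared convert() generic, keyed on the pair
# (class of `from`, target class `to`).
method(convert, list(Celsius, Fahrenheit)) <- function(from, to) {
  Fahrenheit(value = from@value * 9 / 5 + 32)
}

f <- convert(Celsius(value = 100), to = Fahrenheit)
f@value
```

No new as.*() generic is needed; each conversion is one more method on the same generic.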
Multiple inheritance in S4 seemed like a powerful idea at the time, but in practice it appears to generate more problems than it solves. S7 does not support multiple inheritance.
S4’s method dispatch uses a principled but complex distance metric to pick the best method in the presence of ambiguity. Time has shown that this approach is hard for people to understand and makes it hard to predict what will happen when new methods are registered. S7 implements a much simpler, greedy approach that trades some additional work on the part of the class author for a system that is simpler and easier to understand.
S4 is a clean break from S3. This made it possible to make radical changes but it made it harder to switch from S3 to S4, leading to a general lack of adoption in the R community. S7 is designed to be drop-in compatible with S3, making it possible to convert existing packages to use S7 instead of S3 with only an hour or two of work.
@ or @<-. Secondly, users know about @ and use it to access object internals even though they’re not supposed to. S7 avoids these problems by accepting the fact that R is a data language, and that there’s no way to stop users from pulling the data they need out of an object. To make it possible to change the internal implementation details of an object while preserving existing @ usage, S7 provides dynamic properties.
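A minimal sketch of a dynamic property, assuming the S7 package is installed. The Range class and its length property are hypothetical.

```r
library(S7)

# Hypothetical class: `length` is not stored, it is computed from
# the other properties each time it is accessed.
Range <- new_class("Range",
  properties = list(
    start = class_double,
    end = class_double,
    length = new_property(
      getter = function(self) self@end - self@start
    )
  )
)

r <- Range(start = 1, end = 10)
r@length

# Callers keep using @ even though nothing named `length` is stored;
# updating a stored property changes the computed one.
r@end <- 21
r@length
```

If the stored representation changes later, the getter can be rewritten while code that reads r@length stays the same.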