loggit2
is an easy-to-use, yet powerful, ndjson
logger. It is
very fast, has zero external dependencies, and can be as straightforward
or as integral as you want to make it.
R has a selection of built-in functions for handling different
exceptions, or special cases where diagnostic messages are
provided, and/or function execution is halted because of an error.
However, R itself provides nothing to record this diagnostic post-hoc;
useRs are left with what is printed to the console as their only means
of analyzing the what-went-wrong of their code. There are some slightly
hacky ways of capturing this console output, such as
sink
ing to a text file, repetitively cat
ing
identical exception messages that are passed to existing handler calls,
etc. But there are two main issues with these approaches:
The console output is not at all easy to parse, so that a user can quickly identify the causes of failure without manually scanning through it
Even if the user tries to structure a text file output, they would likely have to ensure consistency in that output across all their work, and there is still the issue of parsing that text file into a familiar, usable format
loggit2
addresses these issues by writing logs as
newline-delimited JSON
(ndjson
). This format exhibits very fast disk write speeds,
while still being machine-parsable, human-readable, and ideal for log
stream collection systems.
loggit2
To write a log entry using loggit2
via its exception
handlers, you just load loggit2
, set its log file location,
and use the same handlers you always do:
library(loggit2)
set_logfile("/path/to/my/log/directory/loggit.log") # loggit2 enforces no specific file extension
message("This is a message")
#> {"timestamp": "2024-05-02T19:02:53+0200", "log_lvl": "INFO", "log_msg": "This is a message__LF__"}
#> This is a message
warning("This is a warning")
#> {"timestamp": "2024-05-02T19:02:53+0200", "log_lvl": "WARN", "log_msg": "This is a warning"}
#> Warning: This is a warning
stop("This is a critical error")
#> {"timestamp": "2024-05-02T19:02:53+0200", "log_lvl": "ERROR", "log_msg": "This is a critical error"}
#> Error in eval(expr, envir, enclos): This is a critical error
You can see that the handlers will pring both the
loggit()
-generated log entry, as well as their base default
output. To only have the JSON print, wrap the call in the appropriate
suppressor (i.e. suppressMessages()
or
suppressWarnings()
). To only have the base text printed,
pass echo = FALSE
to the handler.
And… that’s it! You’ve introduced human-readable, machine-parsable logging into your workflow!
However, surely you want more control over your logs.
Behind the scenes, loggit2
’s core function, called
loggit()
, is executed right before the base handlers with
some sane defaults. However, the loggit()
function is also
exported for use by the developer:
loggit("INFO", "This is also a message")
#> {"timestamp": "2024-05-02T19:02:53+0200", "log_lvl": "INFO", "log_msg": "This is also a message"}
loggit("WARN", "This is also a warning")
#> {"timestamp": "2024-05-02T19:02:53+0200", "log_lvl": "WARN", "log_msg": "This is also a warning"}
loggit("ERROR", "This is an error, but it won't stop")
#> {"timestamp": "2024-05-02T19:02:53+0200", "log_lvl": "ERROR", "log_msg": "This is an error__COMMA__ but it won't stop"}
“But why wouldn’t I just use the handlers instead?”
Because loggit()
exposes much greater flexibility to the
user, by way of custom fields.
loggit(
"INFO",
"This is a message",
but_maybe = "you want more fields?",
sure = "why not?",
like = 2
)
#> {"timestamp": "2024-05-02T19:02:53+0200", "log_lvl": "INFO", "log_msg": "This is a message", "but_maybe": "you want more fields?", "sure": "why not?", "like": "2"}
Since JSON is considered semi-structured data (sometimes called “schema-on-read”), you can log any custom fields you like, as inconsistently as you like. It all just ends up as text in a file, with no column structure to worry about.
So, loggit2
’s log format is a special type of JSON. JSON
objects are like list
s – and so are
data.frames
. To allow for the most flexibility, the
read_logs()
function is available to you, which reads in
the currently-set log file as a data frame:
read_logs()
#> timestamp log_lvl log_msg but_maybe sure like
#> 1 2024-05-02T19:02:53+0200 INFO This is a message\n <NA> <NA> <NA>
#> 2 2024-05-02T19:02:53+0200 WARN This is a warning <NA> <NA> <NA>
#> 3 2024-05-02T19:02:53+0200 ERROR This is a critical error <NA> <NA> <NA>
#> 4 2024-05-02T19:02:53+0200 INFO This is also a message <NA> <NA> <NA>
#> 5 2024-05-02T19:02:53+0200 WARN This is also a warning <NA> <NA> <NA>
#> 6 2024-05-02T19:02:53+0200 ERROR This is an error, but it won't stop <NA> <NA> <NA>
#> 7 2024-05-02T19:02:53+0200 INFO This is a message you want more fields? why not? 2
Notice that read_logs()
handles any columnar
inconsistencies as mentioned above. If read_logs()
finds a
field that other entries don’t have, it maps it to an empty string for
that log entry. This was chosen over NA
s to allow for
consistency on re-write. You can, however, just replace all the empty
strings with NA
after read, if you want to.
You can also pass a file path to read_logs()
, and read
that loggit2
log file instead.
The other helpful utilities are as follows:
"%Y-%m-%dT%H:%M:%S%z"
, but you may set it
yourself using set_timestamp_format()
. Note that this
format is ultimately passed to format.Date()
, so the
supplied format needs to be valid.set_logfile(logfile)
. Similarly, you can retrieve the
location of the current log file using get_logfile()
.loggit2
will default to writing to an R temporary
directory. As per CRAN policies, a package cannot
write to a user’s “home filespace” without approval.
Therefore, you need to set the log file before any logs are written to
disk, using set_logfile(logfile)
(I recommend in your
working directory, and naming it “loggit.log”). If you are using loggit2
in your own package, you can wrap this in a call to
.onLoad()
, so that logging is set on package load. If not,
then make the set call as soon as possible (e.g. at the top of your
script(s), right after your calls to library()
); otherwise,
no logs will be written to persistent storage!