knitr icon indicating copy to clipboard operation
knitr copied to clipboard

A logging solution for debugging `knitr` and dependent packages

Open leogama opened this issue 3 years ago • 10 comments

Hi, @yihui. I've been working on solutions for Python in knitr with reticulate. I'm also exploring and extending (for me) features of rmarkdown and bookdown, including some templating with jinjar. It's been all fun and games, but the interdependence and complexity of these separate packages is starting to slow down my adventures...

Beyond that, I hit a nasty file location bug recently and I'm unable to use some standard R debugging tools like browser() or recover() to inspect the environment at the failure point because it happens many levels deep (an R function called from a Python hook, called by a knitr block, possibly in a child process). I thought of using some "cat/print" calls, but there's another problem as stdout and stderr are both used in my setup for one thing or another and I'm never sure if the message could be swallowed somehow by knitr itself...


I'd like you to consider adding some logging facility to knitr. It may be either a standalone package like logging or a streamlined version of it included in the source. I'm thinking of something that could:

  • log to a file, configurable somehow
  • include and format context information like function scope, working dir, etc.
  • be called from dependent packages like rmarkdown and reticulate

For example, knitr could expose a logger object, like implemented in Python's and R's logging, that would be used from other interacting packages by calling knitr:::logger$debug("sample message"). The code base(s) could then have some basic messages like "entering R/Python block", "running after hooks", "reading results from cache", etc. predefined for localizing the stage of code execution, and when debugging something we would be able to add a couple of temporary extra logging messages without needing to care about the whole logging setup.

I'm motivated to write a draft PR, but I wanna know your opinion first.


By filing an issue to this repo, I promise that

  • [x] I have fully read the issue guide at https://yihui.org/issue/.
  • [x] I have provided the necessary information about my issue.
    • If I'm asking a question, I have already asked it on Stack Overflow or RStudio Community, waited for at least 24 hours, and included a link to my question there.
    • If I'm filing a bug report, I have included a minimal, self-contained, and reproducible example, and have also included xfun::session_info('knitr'). I have upgraded all my packages to their latest versions (e.g., R, RStudio, and R packages), and also tried the development version: remotes::install_github('yihui/knitr').
    • If I have posted the same issue elsewhere, I have also mentioned it in this issue.
  • [x] I have learned the Github Markdown syntax, and formatted my issue correctly.

I understand that my issue may be closed if I don't fulfill my promises.

leogama avatar Dec 14 '22 14:12 leogama

I don't have much experience here, but if the (initial) PR is not complicated, I can definitely have a look. Thank you!

yihui avatar Dec 14 '22 15:12 yihui

I don't have much experience here, but if the (initial) PR is not complicated, I can definitely have a look. Thank you!

Great! It's not meant to be complicated, just need to get a couple of details right. Performance impacts should be negligible. And I don't think this feature should be exposed as public API, it's solely to ease debugging and development.

Are you fine with a new dependency on the package logging (CRAN)? It has no sub-dependencies and implements the same API as Python's logging module, which is nice.

leogama avatar Dec 14 '22 15:12 leogama

Adding soft dependencies (i.e., Suggests) is almost always okay to me.

Keeping this functionality internal is also okay. Thanks!

yihui avatar Dec 14 '22 16:12 yihui

I like the idea of debugging options. I think it should be really lightweight, for internal debugging mainly. Otherwise it could become quite verbose.

What motivates the logging package ?

I know about more recent option in logger 📦 which I like the flexibility https://daroczig.github.io/logger/

Or a very lightweight option useful that makes add it as suggest package quite easy https://cran.rstudio.com/web/packages/debugme/index.html Licenced MIT

Anyway, just adding some other choices I am more used to. There are several logging packages - not sure what is the best for us.

cderv avatar Dec 15 '22 15:12 cderv

Thanks for the suggestions, @cderv. I had seen logger when looking for options, but didn't know about debugme —it's really interesting!

I like the idea of debugging options. I think it should be really lightweight, for internal debugging mainly. Otherwise it could become quite verbose.

Yes, I'm proposing this only for debugging and development of knitr and packages that extend it, not for users.

What motivates the logging package ?

I've suggested logging because it's stable, implements the nice Python logging API and, more important, it has no dependencies. But there are alternatives with only a few dependencies.

I know about more recent option in logger package which I like the flexibility https://daroczig.github.io/logger/

Oh! I just noticed that logger's dependencies are in "Suggests", so this is fine too.

Or a very lightweight option useful that makes add it as suggest package quite easy https://cran.rstudio.com/web/packages/debugme/index.html Licenced MIT

This logging approach is very attractive, ~however it doesn't seem to be~ and it is able to log to a file. As knitr can run code many levels deep and generally sinks stderror output, this is an important requisite.


Now, I'm not sure if @yihui is willing to accept a lightweight package as hard dependency. If not, than we need to find a package with a logging API that can be set up, exposed and called from other packages (e.g. rmarkdown) conditionally. But if such a solution exists and is practical to use, we won't even need to argue about dependencies...

debugme is incredibly easy to set up and enable/disable with multiple packages. ~As we only need a single output stream to a single file, I'm thinking of writing a small extension to be able to redirect debugme's output and take advantage of it's cool features.~

If you have any other thoughts, please share! But let me brew these ideas for a few days.

leogama avatar Dec 15 '22 16:12 leogama

You can write to file with debugme. Everything is controled by env var externally by the user. See redirecting output section in the help page https://rdrr.io/cran/debugme/man/debugme.html

cderv avatar Dec 15 '22 16:12 cderv

I will try adding this snippet to .onLoad():

if (requireNamespace('debugme', quietly = TRUE))
    debugme::debugme()

leogama avatar Dec 15 '22 17:12 leogama

I'd prefer an explicit opt-in like if (getOption('knitr.enable.debugme', FALSE)) debugme::debugme(), otherwise debugme will always be loaded if installed when loading knitr.

yihui avatar Dec 15 '22 19:12 yihui

Well, the DEBUGME environment variable must be set for debugme to do anything. Shouldn't we check this instead of creating a redundant setting?

Example:

if (Sys.getenv('DEBUGME') != '' && requireNamespace('debugme', quietly = TRUE))
    debugme::debugme()

leogama avatar Dec 16 '22 02:12 leogama

Sure. That's definitely more appropriate!

yihui avatar Dec 16 '22 14:12 yihui