languageserver Provide more detailed outline of R6 classes

I'm not sure how to make this happen in VS Code, or whether it belongs in https://github.com/REditorSupport/vscode-r-lsp , but I have noticed that the outline of R6 classes could be improved. Currently, the outline shows only the names of R6 classes. It would be great if it could also show the members declared inside each R6 class, which would be immensely helpful for big classes.

Here is my VSCode settings.json

{
    "files.associations": {
        "*.rmd": "markdown"
    },
    "workbench.editorAssociations": {
        "*.ipynb": "jupyter.notebook.ipynb"
    },
    "r.bracketedPaste": true,
    "r.rpath.linux": "/usr/local/bin/R",
    "r.rterm.linux": "/usr/local/bin/radian",
    "r.sessionWatcher": true,
    "r.workspaceViewer.removeHiddenItems": true,
    "editor.cursorSmoothCaretAnimation": true,
    "editor.formatOnType": true,
    "editor.inlineSuggest.enabled": true,
    "files.insertFinalNewline": true,
    "files.trimFinalNewlines": true,
    "files.trimTrailingWhitespace": true,
}

The outline:

Environment: languageserver version 0.3.10, R LSP Client 0.1.16

Jul 15 '21 12:07 kpagacz

It is perfectly possible to improve the parsing so that we also capture the internal structure of the R6 class definition by going through the syntax tree of R6Class calls.

Jul 16 '21 02:07 renkun-ken

The tricky part is handling the source references of non-top-level expressions in parse_expr. There, we use R's language-object representation of the syntax tree because it makes it easier to walk through R expressions programmatically. The drawback is that parse() only provides the srcref of top-level expressions. We don't know the exact locations of child expressions within a top-level expression unless we use getParseData() or the token-level XML syntax tree, neither of which is as convenient to walk through as R expressions.

If we work out a way to associate the token-level information from getParseData() with each element of the parsed expressions, it would be much easier to generate an R6 class definition from R6Class calls with correct symbol locations. Only then could the document outline of an R6 class properly appear as a hierarchy of public and private members.
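To illustrate the gap described above, here is a minimal sketch using only base R. The `Foo` class is a made-up snippet for illustration, not code from languageserver. It shows that parse() attaches a single srcref to the whole top-level assignment, while utils::getParseData() exposes one row per token, including the names of the members.

```r
# Hypothetical R6 snippet, used only to illustrate the point
code <- 'Foo <- R6::R6Class("Foo",
  public = list(
    x = 1,
    hello = function() NULL
  )
)'

exprs <- parse(text = code, keep.source = TRUE)

# parse() keeps a srcref only for each top-level expression
n_srcref <- length(attr(exprs, "srcref"))

# Token-level table with line1, col1, line2, col2 for every node
pd <- utils::getParseData(exprs)

# Named arguments (public, x, hello) appear as SYMBOL_SUB tokens
subs <- pd[pd$token == "SYMBOL_SUB", c("line1", "col1", "text")]
print(subs)
```

The `line1`/`col1` columns are the kind of information the outline would need for each member; what remains open is how to attach them to the language objects that parse_expr walks.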

Jul 19 '21 06:07 renkun-ken

Hi @renkun-ken, I have explored a way to get R6 objects using the XML syntax tree, and I would like to continue and fix languageserver as well, if possible. I was browsing the source code but, forgive me, I don't know where to start.

file <- tempfile()
cat(
  file = file,
  'someR6class <- R6::R6Class(
    classname = "some",
    parent = parent_class,
    public = list(
      public_field = character(0),
      initialize = function(arg, arg2 = NULL) {
        inside_method <- function() {
          NULL
        }
        NULL
      },
      public_method = function(...) {
        NULL
      }
    ),
    list(
      private_field1 = NA,
      private_field2 = NULL,
      private_method = function() {
        NULL
      }
    )
  )'
)

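# keep.source = TRUE is required so that parse data is available
# (the default is FALSE in non-interactive sessions)
parsed <- xmlparsedata::xml_parse_data(parse(file, keep.source = TRUE))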
xml <- xml2::read_xml(parsed)

xml |>
  xml2::xml_find_all(
    paste0(
      "//SYMBOL_FUNCTION_CALL[text()='R6Class']/../..", # R6 call
      "/expr/expr/SYMBOL_FUNCTION_CALL[text()='list']/../..", # list call (within R6 call)
      "/expr/FUNCTION/../preceding-sibling::SYMBOL_SUB[1]" # method name (within list call)
    )
  ) |>
  xml2::xml_text()

# [1] "initialize"     "public_method"  "private_method"
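As a possible next step, the same XPath query can also return where each method name sits in the file, since xml_parse_data stores `line1`/`col1`/`line2`/`col2` as attributes on every node. The sketch below assumes xml2 and xmlparsedata are installed and uses a made-up `Foo` class; it is not how languageserver does it today.

```r
# Hypothetical class used only for illustration
code <- 'Foo <- R6::R6Class("Foo",
  public = list(
    hello = function() NULL
  )
)'

xml <- xml2::read_xml(
  xmlparsedata::xml_parse_data(parse(text = code, keep.source = TRUE))
)

# Same query as above: names of functions defined directly in a list()
# that is an argument of an R6Class() call
nodes <- xml2::xml_find_all(
  xml,
  paste0(
    "//SYMBOL_FUNCTION_CALL[text()='R6Class']/../..",
    "/expr/expr/SYMBOL_FUNCTION_CALL[text()='list']/../..",
    "/expr/FUNCTION/../preceding-sibling::SYMBOL_SUB[1]"
  )
)

# 1-based line/column of each method name, read from the node attributes
members <- data.frame(
  name = xml2::xml_text(nodes),
  line = as.integer(xml2::xml_attr(nodes, "line1")),
  col  = as.integer(xml2::xml_attr(nodes, "col1"))
)
print(members)
```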

Could you give me more hints where and how files are parsed in languageserver?

Jan 26 '22 19:01 gogonzo

I have just started looking into this as well :).

The parsing happens in languageserver:::parse_expr (https://github.com/REditorSupport/languageserver/blob/master/R/document.R#L317) and languageserver:::parse_document (https://github.com/REditorSupport/languageserver/blob/master/R/document.R#L400)

I think it should be possible to match the xml_parse_data and parse() versions by source locations?

To get a location of a source-ref, one can do this:

base_parse_data <- parse(file, keep.source = TRUE)
source_ref <- attr(base_parse_data, "srcref")
languageserver:::expr_range(source_ref[[1]])

edit: naming

Jan 26 '22 19:01 ikruv

Correction: languageserver:::expr_range gives the location in 0-based UTF-16 code units. To match with xml_parse_data, we need something like this, I think:

get_srcref_range <- function(srcref) {
  list(
    line1 = srcref[1],
    col1 = srcref[5],
    line2 = srcref[3],
    col2 = srcref[6]
  )
}
# so, with above:
get_srcref_range(attr(base_parse_data, "srcref")[[1]])
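For what it's worth, here is a small sketch of how matching by source location could look, using base R's utils::getParseData(), which reports the same `line1`/`col1`/`line2`/`col2` coordinates as xml_parse_data. The helper `find_node` and the two-expression sample are hypothetical names introduced only for this example.

```r
# Hypothetical two-expression file, for illustration only
code <- c(
  "x <- 1",
  "f <- function(a) {",
  "  a + 1",
  "}"
)
exprs <- parse(text = code, keep.source = TRUE)
pd <- utils::getParseData(exprs)

# Same idea as get_srcref_range above: 1-based lines and columns
get_srcref_range <- function(srcref) {
  s <- as.integer(srcref)
  list(line1 = s[1], col1 = s[5], line2 = s[3], col2 = s[6])
}

# Hypothetical helper: parse-data rows covering exactly the given range
find_node <- function(pd, rng) {
  pd[pd$line1 == rng$line1 & pd$col1 == rng$col1 &
       pd$line2 == rng$line2 & pd$col2 == rng$col2, ]
}

srcrefs <- attr(exprs, "srcref")
rng1 <- get_srcref_range(srcrefs[[1]])
rng2 <- get_srcref_range(srcrefs[[2]])
node1 <- find_node(pd, rng1)
node2 <- find_node(pd, rng2)
print(rbind(node1, node2)[, c("line1", "col1", "line2", "col2", "id", "token")])
```

Each top-level srcref lines up with one parse-data row here, and its `id` gives an entry point into the token tree. In general several nested nodes can share the same range, so a real implementation would need a tie-break rule.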

Jan 26 '22 20:01 ikruv
