vim-scala

Fix case class/object ctags to work with Universal Ctags

Open andrewrembrandt opened this issue 4 years ago • 6 comments

Hi Derek, Many thanks for a great plugin.

I've discovered that Universal Ctags (Exuberant Ctags no longer seems to be in common use these days) is not compatible with spaces in kind names, i.e. it's stricter. Using kinddef with spaces is the way to get around this.

Let me know if you'd like any other changes, thanks, Andrew

andrewrembrandt, Mar 02 '20 21:03

Thank you @andrewrembrandt, sorry for the long silence here.

I use our integrated ctags + Tagbar definitions, and the ctags binary I still have on my machines is Exuberant Ctags.

I'm not opposed to the change in principle. Exuberant Ctags seems dead in terms of activity, so the world is probably moving on as you say, but I'm not happy about the upgrade experience: the --kinddef options don't seem to be supported by e-ctags, so I'd get errors myself once I updated vim-scala with this patch. I suspect I wouldn't be alone. From a quick look, there are still working packages called ctags on current OS distros that install e-ctags (macOS Homebrew and Ubuntu/Debian, to name a few).

It'd be annoying to maintain two versions of the Scala ctags defs in this repo that are almost the same, but that's the best suggestion I have so far, and at least they've hardly changed in years. We could maybe have a new setting to go along with g:scala_use_builtin_tagbar_defs that indicates "I'm a u-ctags user, configure the scala.uctags file!"

Alternatively we could say you're on your own to set g:tagbar_type_scala.deffile, pointing to your own conf. But if many ctags users are in the know and moving to u-ctags, that'd be a disservice.

I'm open to a second file plus a setting variable, or to another idea if anyone has one.

ches, Aug 11 '21 19:08

A kind is made up of three items: a one-letter flag, a long-name, and a description. The --regex-<LANG> option of e-ctags allows users to specify a kind without the description.

--regex-scala=/^[ \t]*((abstract|final|sealed|implicit|lazy)[ \t]*)*(private[^ ]*|protected)?[ \t]*((abstract|final|sealed|implicit|lazy)[ \t ]*)*case class[ \t ]+([a-zA-Z0-9_]+)/\6/C,case classes/

In this example, C is the one-letter flag, and case classes is the long-name. The description is omitted. When the description is omitted, the long-name is also used for the description.
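The three items and the fallback described above can be sketched in a few lines of Python. This is only a simplified model of the behaviour described in this comment, not ctags' actual parser; `parse_kind_spec` is a hypothetical helper name, and it assumes the spec always contains at least a letter and a long-name.

```python
def parse_kind_spec(spec):
    """Split 'letter,long-name[,description]' into its three items.

    Simplified model: assumes at least a letter and a long-name are present.
    """
    letter, name, *rest = spec.split(",", 2)
    # An omitted description falls back to the long-name.
    description = rest[0] if rest else name
    return letter, name, description


print(parse_kind_spec("C,case classes"))
# ('C', 'case classes', 'case classes')
print(parse_kind_spec("C,caseClass,case classes"))
# ('C', 'caseClass', 'case classes')
```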

If you read the definition of the tags file format, whitespace in a long-name is not allowed. I'm talking about the file format, not about the implementation of ctags.

Though the format doesn't allow it, e-ctags accepts a long-name that includes whitespace. u-ctags doesn't accept it.

To share your .ctags file between e-ctags and u-ctags, you have to do two things:

  • don't use any whitespace in a long-name for u-ctags, and
  • add a description to a kind definition explicitly.

You don't have to use the --kinddef-<LANG> option for this purpose.
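As a rough illustration, the first rule could be checked with a small Python script like the one below. It is a sketch under stated assumptions: `check_options` and `KIND_SPEC` are hypothetical names, the sample rules are shortened stand-ins for the real ones in ctags/scala.ctags, and the script assumes each --regex line ends with the kind spec and has no trailing flags.

```python
import re

# Matches the trailing "/<letter>,<long-name>[,<description>]/" of a --regex-<LANG> line.
# Assumption: no flags follow the kind spec.
KIND_SPEC = re.compile(r"/([A-Za-z]),([^,/]+)(?:,([^/]*))?/\s*$")


def check_options(text):
    """Return (line number, letter, long-name) for every long-name containing whitespace."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.startswith("--regex-"):
            continue
        m = KIND_SPEC.search(line)
        if m is None:
            continue
        letter, name, _description = m.groups()
        if re.search(r"\s", name):
            problems.append((lineno, letter, name))
    return problems


# Shortened, hypothetical rules: line 1 uses the old style, line 2 the shared style.
SAMPLE = "\n".join([
    r"--regex-scala=/^[ \t]*case object[ \t]+([a-zA-Z0-9_]+)/\1/O,case objects/",
    r"--regex-scala=/^[ \t]*case object[ \t]+([a-zA-Z0-9_]+)/\1/O,caseObject,case objects/",
])

print(check_options(SAMPLE))
# [(1, 'O', 'case objects')]
```

The second rule, an explicit description, is what keeps the readable "case objects" text once the whitespace is removed from the long-name.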

The change you may need is:

diff --git a/ctags/scala.ctags b/ctags/scala.ctags
index b7d3125..c51ad07 100644
--- a/ctags/scala.ctags
+++ b/ctags/scala.ctags
@@ -3,8 +3,8 @@
 
 --regex-scala=/^[ \t]*((abstract|final|sealed|implicit|lazy)[ \t]*)*(private[^ ]*|protected)?[ \t]*class[ \t]+([a-zA-Z0-9_]+)/\4/c,classes/
 --regex-scala=/^[ \t]*((abstract|final|sealed|implicit|lazy)[ \t]*)*(private[^ ]*|protected)?[ \t]*object[ \t]+([a-zA-Z0-9_]+)/\4/o,objects/
---regex-scala=/^[ \t]*((abstract|final|sealed|implicit|lazy)[ \t]*)*(private[^ ]*|protected)?[ \t]*((abstract|final|sealed|implicit|lazy)[ \t ]*)*case class[ \t ]+([a-zA-Z0-9_]+)/\6/C,case classes/
---regex-scala=/^[ \t]*((abstract|final|sealed|implicit|lazy)[ \t]*)*(private[^ ]*|protected)?[ \t]*case object[ \t]+([a-zA-Z0-9_]+)/\4/O,case objects/
+--regex-scala=/^[ \t]*((abstract|final|sealed|implicit|lazy)[ \t]*)*(private[^ ]*|protected)?[ \t]*((abstract|final|sealed|implicit|lazy)[ \t ]*)*case class[ \t ]+([a-zA-Z0-9_]+)/\6/C,caseClass,case classes/
+--regex-scala=/^[ \t]*((abstract|final|sealed|implicit|lazy)[ \t]*)*(private[^ ]*|protected)?[ \t]*case object[ \t]+([a-zA-Z0-9_]+)/\4/O,caseObject,case objects/
 --regex-scala=/^[ \t]*((abstract|final|sealed|implicit|lazy)[ \t]*)*(private[^ ]*|protected)?[ \t]*trait[ \t]+([a-zA-Z0-9_]+)/\4/t,traits/
 --regex-scala=/^[ \t]*type[ \t]+([a-zA-Z0-9_]+)/\1/T,types/
 --regex-scala=/^[ \t]*((abstract|final|sealed|implicit|lazy|override|private[^ ]*(\[[a-z]*\])*|protected)[ \t]*)*def[ \t]+([a-zA-Z0-9_]+)/\4/m,methods/

With the modified .ctags:

$ u-ctags --options=vim-scala/ctags/scala.ctags --list-kinds-full=scala
#LETTER NAME       ENABLED REFONLY NROLES MASTER DESCRIPTION
C       caseClass  yes     no      0      NONE   case classes
O       caseObject yes     no      0      NONE   case objects
T       types      yes     no      0      NONE   types
V       values     yes     no      0      NONE   values
c       classes    yes     no      0      NONE   classes
m       methods    yes     no      0      NONE   methods
o       objects    yes     no      0      NONE   objects
p       packages   yes     no      0      NONE   packages
t       traits     yes     no      0      NONE   traits
v       variables  yes     no      0      NONE   variables
$ e-ctags --options=vim-scala/ctags/scala.ctags --list-kinds=scala     
c  classes 
o  objects 
C  case classes 
O  case objects 
t  traits 
T  types 
m  methods 
V  values 
v  variables 
p  packages 

masatake, Aug 12 '21 04:08

Are you satisfied with the line-oriented regex patterns for extracting definitions from Scala source files? I don't think so. They are not enough for such a complicated language.
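One such limitation can be sketched in Python, assuming each pattern is tried once per line. The pattern is the `type` rule from ctags/scala.ctags shown earlier; `tags_for` is a hypothetical helper and only imitates line-oriented matching, it is not how ctags is implemented.

```python
import re

# Pattern copied from the `type` rule in ctags/scala.ctags.
TYPE_RULE = re.compile(r"^[ \t]*type[ \t]+([a-zA-Z0-9_]+)")


def tags_for(source):
    """Apply the rule once per line, imitating a line-oriented matcher."""
    names = []
    for line in source.splitlines():
        m = TYPE_RULE.match(line)
        if m:
            names.append(m.group(1))
    return names


source = "type A = Int; type B = String\n  type C = Long"
print(tags_for(source))
# ['A', 'C']
```

The second declaration on the first line, `B`, is never tagged, because the pattern is anchored to the start of the line.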

u-ctags has a lot to offer parser developers:

  • byte-oriented stateful regex engine
    • https://goral.net.pl/post/ctags-for-notes/
    • https://stackoverflow.com/questions/10741664/ctags-regex-for-multiple-declarations-in-one-line
    • https://docs.ctags.io/en/latest/optlib.html#advanced-pattern-matching-with-multiple-regex-table
  • .ctags to C translator (optlib2c) https://docs.ctags.io/en/latest/optlib.html#translating-an-option-file-into-c-source-code-optlib2c
  • optscript, a general-purpose programming language like PostScript https://ctags.io/2021/01/05/optscript/

masatake, Aug 12 '21 04:08

Thank you for the suggestion @masatake, we can try making our current long names compatible (whitespace-free) and using the description for the "pretty" names, which is the change you suggest. I'm certainly happy if we have one file that's compatible with both e-ctags and u-ctags.

> Are you satisfied with the line-oriented regex patterns for extracting definitions from Scala source files? I don't think so. They are not enough for such a complicated language.

This is a bit of a different discussion, but personally, these days when I want more I turn to a language server with semantic capability. In vim-scala, the ctags definition we have is only directly used in support of a sidebar "outline" plugin that parses the single file/buffer you're working on, on the fly. For light editing tasks, this is useful and sufficient for me.

For complete indexing of large projects, e.g. to facilitate tags-based code navigation in Vim, I'd agree e-ctags certainly has limitations and people may benefit from u-ctags. This just isn't in scope of anything vim-scala is trying to provide currently. Vim users can explore their options for ctags implementations and for integrating them with their Vim workflow; that's not Scala-specific aside from the parsing rules.

ches, Aug 12 '21 05:08

@andrewrembrandt if you're still on the line, would you like to try the change @masatake suggested without --kinddef? We'll also need to update the Tagbar mapping for the changed names.

ches, Aug 12 '21 05:08

Thanks for the feedback @ches & @masatake (universal-ctags maintainer no less!) Will make some time to update and test this.

(I do wonder if there's interest amongst users to have a benefit or two which @masatake mentioned - e.g. multiple variables on the same line - which I guess would mean the first suggestion of two versions with a config option. But yes, that's best for a separate PR. Sadly, I'm not using Scala these days, so can't gauge how useful I'd find this personally).

andrewrembrandt, Aug 17 '21 12:08