
Ruby Parser Improvements

Open jfelchner opened this issue 10 years ago • 32 comments

Here are some improvements that, based on the way Ruby files are parsed, could be added to the parser:

Upper-Cased Constants

class Foo
  THIS_IS_MY_CONSTANT = 3
end

Should create entries for variable:

  • THIS_IS_MY_CONSTANT

attr_* methods

class Foo
  attr_accessor :bar
  attr_reader :baz
  attr_writer :qux
end

Should create entries for methods:

  • bar
  • bar=
  • baz
  • qux=
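
The expected entries mirror what Ruby itself defines at runtime. A minimal check using only core reflection, with the same class body as in the example above:

```ruby
# Same class as in the example above.
class Foo
  attr_accessor :bar   # defines bar and bar=
  attr_reader :baz     # defines baz only
  attr_writer :qux     # defines qux= only
end

# Methods defined directly on Foo, excluding inherited ones.
p Foo.public_instance_methods(false).sort
```

This only shows what the interpreter ends up with. How a tags generator should report these methods (kind letter, pattern line) is a separate decision.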

Alias Method

class Foo
  def bar
  end

  alias_method :baz, :bar
end

Should create entries for methods:

  • baz
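
As a quick sanity check of why baz deserves its own entry, here is a small sketch (the return value :bar_result is only there for illustration):

```ruby
class Foo
  def bar
    :bar_result
  end

  alias_method :baz, :bar   # baz becomes a second name for bar
end

p Foo.public_instance_methods(false).sort    # both names are real methods
p Foo.new.baz                                # calls the body of bar
p Foo.instance_method(:baz).original_name    # the name the alias points at
```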

Basic Rails Parsing

I don't think this is something to go crazy on, but there are some basic Rails-y things that people would expect.

class Foo < ActiveRecord::Base
  has_many :bars
  has_one :baz
  belongs_to :qux

  validates :buz
end

Should create entries for properties/attributes:

  • bars
  • bars=
  • baz
  • baz=
  • qux
  • qux=
  • buz
  • buz=

Hope this helps.

jfelchner avatar Jun 30 '15 23:06 jfelchner

Thank you for the input.

Can I ask you to convert this issue to test cases? I want to have the items you reported as test cases that are marked as 'known bugs' (.b), so people can work on fixing them.

(The Universal Ctags project considers test cases a first-class output of the project.)

See docs/units.rst, especially the part about the *.b marker, and Units/variables-prototypes.cpp.b/ as an example. You may have to write the ideal output (expected.tags) by hand.

masatake avatar Jul 01 '15 04:07 masatake

@masatake I have no clue about C but I'll do my best. I may have to just copy/paste some stuff and then have you all put the finishing touches on it. :smile:

jfelchner avatar Jul 01 '15 04:07 jfelchner

To create a test case, you don't need to know C. On the other hand, I don't know Ruby, so I need your help.

masatake avatar Jul 01 '15 05:07 masatake

ripper-tags is "a fast, accurate ctags generator for ruby source code using Ripper".

Perhaps it would make sense to use ripper-tags as an xcmd, rather than adding these features to the Ruby parser written in C?

nelstrom avatar Jul 21 '15 09:07 nelstrom

The test suite for ripper-tags suggests that it already supports most of the features @jfelchner mentioned:

nelstrom avatar Jul 21 '15 09:07 nelstrom

I would like to continue to maintain the built-in Ruby parser. However, I will provide a way to override the built-in parser with an xcmd-based Ruby parser, here ripper-tags. I'm afraid that there are gaps in the kind letters between the two parsers.

Anyway, a PR adding test cases that the built-in parser cannot handle is really welcome. I treat test cases as a first-class result of this project. Test cases may drive the development of parsers.

masatake avatar Jul 21 '15 16:07 masatake

@nelstrom I tried ripper to limited success. It missed all kinds of things that universal-ctags properly caught. I compared the diff of the tags file output when generating it on Rails. I think universal-ctags is really close to getting all the info it can from Ruby files and I like not having to install another dependency.

jfelchner avatar Jul 21 '15 18:07 jfelchner

@nelstrom I think that ripper has a ton of promise though. Because it can actually parse the Ruby file for intent rather than syntax. I think it could be pretty awesome. So I think that universal-ctags should probably allow it to be used to parse Ruby files.

jfelchner avatar Jul 21 '15 18:07 jfelchner

@masatake I haven't forgotten about getting these test cases to you, I'm just super swamped right now. Probably next weekend.

jfelchner avatar Jul 21 '15 18:07 jfelchner

universal-ctags doesn't seem to handle these cases that ripper-tags does:

  • alias
  • alias_method
  • recognizing that methods defined within class << self are class methods, not instance methods
  • methods defined within class_eval or module_eval properly scoped to the corresponding module
  • constant definitions
  • classes defined with Class.new
  • classes defined with Struct.new
  • modules defined with Module.new
  • methods defined with define_method
  • methods generated from attr_accessor, attr_reader, attr_writer

And those are just the cases that I could spot within 3 minutes of casual testing. So you see, I'm not really convinced that ripper-tags "missed all kinds of things that universal-ctags properly caught". If you find some method or class definition not being properly recognized with ripper-tags, please file an issue there. Thanks!

I'm :+1: for improving Ruby tags within universal-ctags project, but due to technical limitations it will hardly be able to support all the metaprogramming constructs that Ripper can recognize because it's a true Ruby parser.
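
To make the limitation concrete, here is a rough sketch of a purely textual scanner. It is not how universal-ctags or ripper-tags is implemented; guess_tags and its regular expressions are made up for illustration. It can pick up literal constants, attr_* calls, and alias_method, but it has no way to see names that only exist after the code runs.

```ruby
# Hypothetical helper: scan Ruby source line by line and guess tag names.
# Purely textual, so anything built by metaprogramming at runtime
# (define_method with computed names, Class.new, ...) is out of reach.
def guess_tags(source)
  tags = []
  source.each_line do |line|
    case line
    when /^\s*([A-Z][A-Z0-9_]*)\s*=[^=~>]/
      # Upper-case name followed by a plain assignment.
      tags << [$1, "constant"]
    when /^\s*attr_(accessor|reader|writer)\s+(.+)$/
      kind = $1
      names = $2.scan(/:(\w+)/).flatten
      names.each do |name|
        tags << [name, "method"] unless kind == "writer"
        tags << ["#{name}=", "method"] unless kind == "reader"
      end
    when /^\s*alias_method\s+:(\w+[?!=]?)\s*,/
      # Only the new name gets a tag; the old one has its own def.
      tags << [$1, "method"]
    end
  end
  tags
end

sample = <<~RUBY
  class Foo
    THIS_IS_MY_CONSTANT = 3
    attr_accessor :bar
    attr_reader :baz
    attr_writer :qux
    alias_method :quux, :bar
  end
RUBY

guess_tags(sample).each { |name, kind| puts "#{name}\t#{kind}" }
```

A scanner like this breaks as soon as a call spans several lines or a name is computed, which is where a real parser such as Ripper has the advantage.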

mislav avatar Jul 21 '15 19:07 mislav

@mislav and that's why I said:

I think that ripper has a ton of promise though. Because it can actually parse the Ruby file for intent rather than syntax.

:smile:

But it did miss things. A few things from a language point of view, but more often the tags file that it generated didn't have nearly all of the fields and options that a standard ctags generated file does.

I'm not trying to knock your gem @mislav. I'm a huge fan of what ripper is trying to achieve. I personally think that writing a tags parser using something that can parse the language itself rather than just the text is the best approach. But we all know that open source is a hard and sometimes thankless job. Projects come and go. I'm sure there've been projects that you wish you had time for, but just don't. There are only so many hours in the day.

So all I'm saying is that if universal-ctags can get 90% of the way there, so that someone can just pick it up and go (without needing to know that ripper is a thing), then switch to ripper when it's appropriate, then that both helps the new developer who's just coming to a tagging system for the first time, and also future-proofs universal-ctags from having to depend on a project which may or may not be there in the future.

So again, I think that the idea behind ripper is awesome. I just think that there also needs to be an integrated solution inside universal-ctags.

jfelchner avatar Jul 21 '15 20:07 jfelchner

I'm not trying to knock your gem @mislav.

I understand. And it's not my library; I just worked on it to bring those improvements.

But in a thread that's about comparing different ctags implementations of a language, I think it's more constructive to point out specifically which construct does an implementation fail to handle instead of saying "it missed all kinds of things". At the very least, I'd appreciate bug reports.

mislav avatar Jul 21 '15 22:07 mislav

It would be nice if you guys could provide the example Ruby code that is not being handled correctly, as well as the expected tags list. Ideally this would come as a pull request adding new test units, but if that is not possible, then as long as the examples are provided, one of us could create the test units and try to improve the existing parser.

vhda avatar Jul 21 '15 23:07 vhda

@mislav cool. :+1: :heart: I definitely would have added issues, but the project hasn't had any activity for over a year and a half and has issues and PRs that are older than that, so my impression was that it had been abandoned, which is also why I didn't go into much detail on what it didn't do correctly. I'm glad that it's still being maintained. Like I said, I think a parser is by far the better solution. :smile:

@vhda it's on my list! :)

jfelchner avatar Jul 21 '15 23:07 jfelchner

I've just filed the following Ruby parser bugs: #452 #453 #454 #455

They're unrelated to the enhancements proposed above, but they were the result of generating ctags on Rails source code and comparing it to ripper-tags output. @jfelchner: ripper-tags handles Rails source code pretty well, but I found what you meant re: dropped tag definitions. It currently ignores method definitions inside other methods:

def foo
  def obj.bar() end # ripper-tags ignores this
end

as well as within DSL-like blocks

included do
  def foo() end # ripper-tags ignores this
end

as well as when defining methods for anonymous classes/modules:

Class.new do
  def foo() end # ripper-tags ignores this
end
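
For anyone comparing the two parsers, here is a small sketch showing that all three forms really do define methods at runtime, which is why a tags generator would ideally report them. The names add_bar, Host, and included_like are made up; included_like is only a stand-in for a DSL hook such as included.

```ruby
# 1. A def nested inside another method: the singleton method is
#    created only when the outer method actually runs.
def add_bar(obj)
  def obj.bar
    :bar
  end
end

target = Object.new
add_bar(target)
p target.singleton_methods

# 2. A def inside a DSL-like block that is evaluated in class context.
class Host
  # Hypothetical stand-in for a hook like `included do ... end`.
  def self.included_like(&block)
    class_eval(&block)
  end

  included_like do
    def foo
      :foo
    end
  end
end

p Host.new.foo

# 3. A def inside an anonymous class.
anonymous = Class.new do
  def foo
    :anonymous_foo
  end
end

p anonymous.instance_methods(false)
```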

I'll take those issues with the ripper-tags project and won't bother this thread about that anymore; just sending a heads-up to anyone who wants to try both parsers.

I'm :+1: on the enhancements proposed originally by the OP here, starting with recognizing constant definitions, but I think solving the current Ruby parser bugs is a bigger priority.

mislav avatar Jul 22 '15 23:07 mislav

Thanks for your help @mislav. We'll look into those issues as soon as possible.

vhda avatar Jul 23 '15 05:07 vhda

[yamato@x201]~/var/ctags-github% ripper-tags a.rb --list-kinds=Ruby
c  classes
f  methods
m  modules
F  singleton methods
C  constants
a  aliases
[yamato@x201]~/var/ctags-github% ./ctags --list-kinds=Ruby        
c  classes
f  methods
m  modules
F  singleton methods
d  describes
C  contexts

contexts and constants are conflicted. Obviously constants are more important. Do we have to distinguish describe and context? I think we should not. Ruby is so popular that we would like to continue maintaining the built-in crafted parser. However, an xcmd based alternative parser may be useful. For preparing it ctags has to translate F in ripper-tag output to !.

One open question is $GLOBAL variables.

[yamato@x201]~/var/ctags-github% cat a.rb
$global = "A"
CONSTANT = "B"
describe = "X"
context = "Y"
[yamato@x201]~/var/ctags-github% ripper-tags -f - ./a.rb
!_TAG_FILE_FORMAT   2   /extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED   1   /0=unsorted, 1=sorted, 2=foldcase/
CONSTANT    ./a.rb  /^CONSTANT = "B"$/;"    C   class:
[yamato@x201]~/var/ctags-github% ./ctags -f - ./a.rb
= "X"   ./a.rb  /^describe = "X"$/;"    d
= "Y"   ./a.rb  /^context = "Y"$/;" C   describe:= "X"

I wonder why people don't want to tag global variables.

A suggestion with a test case is welcome.

masatake avatar Jul 31 '15 03:07 masatake

contexts and constants are conflicted. Obviously constants are more important.

Yes, I suggested you drop the (broken) support for contexts and describes here https://github.com/universal-ctags/ctags/issues/453#issuecomment-124327142

However, an xcmd based alternative parser may be useful. For preparing it ctags has to translate F in ripper-tag output to !.

What did you mean exactly by this last sentence? Isn't ripper-tags already xcmd compatible?

mislav avatar Jul 31 '15 04:07 mislav

I wonder why people don't want to tag global variables.

We don't use them that much in Ruby. Also, unlike a constant, a global variable can be reassigned easily. Generating ctags for global variables would entail generating a tag for each of its assignments. It could be a lot of noise and not very useful. I don't think we need to track global variables.

mislav avatar Jul 31 '15 04:07 mislav

@mislav, thank you for explaining about global variables in Ruby. I understand the situation.

What did you mean exactly by this last sentence? Isn't ripper-tags already xcmd compatible?

I'm thinking about what is needed to use ripper-tags via xcmd. The letter F is reserved for the input file itself in ctags:

[yamato@x201]~/var/ctags-github% ./ctags --extra=+f main/main.h 
[yamato@x201]~/var/ctags-github% grep ' F$' tags 
main.h  main/main.h 1;" F

This conflicts with "F singleton methods" of ripper-tags. The built-in Ruby parser also uses 'F' for the same kind.
However, the ctags main part translates the F of the input file tag to !.

% ./ctags --extra=+f a.rb
% cat tags
...
a.rb    a.rb    1;" !

Ctags can translate it because the Ruby parser is built-in. We have to extend the translation code (flavour) so that it can be used with xcmd.

masatake avatar Jul 31 '15 05:07 masatake

So should we then switch to using ! for singleton methods? ! is not a very intuitive character to be marking this. What is the text editor support for these letters that indicate tag "kind"?

mislav avatar Jul 31 '15 06:07 mislav

So should we then switch to using ! for singleton methods?

I'm sorry, I was confused. The tag for the input file is injected by ctags itself even if xcmd is used as a parser, so no translation is needed.

@mislav, do you want to try ripper-tags+xcmd parser?

masatake avatar Jul 31 '15 06:07 masatake

I made a new release of ripper-tags which I hoped was xcmd-compatible but the integration doesn't seem to work perfectly.

$ ctags --version
Universal Ctags Development, Copyright (C) 1996-2009 Darren Hiebert
  Compiled: Jul 21 2015, 12:41:18
  Addresses: <[email protected]>, https://github.com/universal-ctags/ctags
  Optional compiled features: +wildcards, +regex, +debug, +option-directory, +coproc

$ ctags --xcmd-ruby=ripper-tags -R -f -

The output of the above command is the Ruby tags that universal-ctags would normally generate. However, I expected to see in the output some tags that only ripper-tags would have recognized, such as constant definitions. Indeed, when I run ripper-tags -R -f - manually, I see those definitions.

So, it doesn't seem that ctags is using ripper-tags for Ruby language here. Any pointers how to debug?

$ ripper-tags --list-kind=ruby
c  classes
f  methods
m  modules
F  singleton methods
C  constants
a  aliases

mislav avatar Jul 31 '15 08:07 mislav

Here's what I got with --verbose:

Reading initial options from command line
  Option: --xcmd-ruby=ripper-tags
loading path kinds of Ruby from [ripper-tags --list-kinds=Ruby]
        status: 32512
xcmd: the ripper-tags backend is not available
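
One possible reading of that status value, as a sketch: assuming 32512 is a raw POSIX wait status printed unmodified (as returned by system() or pclose(); this is an assumption, not something the log confirms), it can be decoded as below. decode_wait_status is a made-up helper name.

```ruby
# Decode a raw POSIX wait status using the conventional layout.
# Assumption: ctags prints the value returned by system()/pclose() as-is.
def decode_wait_status(status)
  {
    exit_code: (status >> 8) & 0xff,  # high byte: exit code of the child
    signal:    status & 0x7f          # low 7 bits: terminating signal, 0 if none
  }
end

p decode_wait_status(32512)
```

Under that assumption the exit code is 127 with no signal. POSIX shells use exit code 127 for "command not found", so the command ctags tried to run may simply not have been found where ctags looked for it.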

mislav avatar Jul 31 '15 08:07 mislav

Oh, I'm surprised that you are a developer of ripper-tags. O.K., I will prepare a prototype. Please wait for a while.

masatake avatar Jul 31 '15 08:07 masatake

https://github.com/masatake/ctags/tree/ripper-tags-xcmd

Could you try this branch?

%  ./ctags --libexec-dir=./libexec --data-dir=./data --options=ripper-tags --list-kinds=ripper-ruby
c  classes 
f  methods 
m  modules 
F  singleton methods 
C  constants 
a  aliases 
% cat a.rb
$global = "A"
CONSTANT = "B"
describe = "X"
context = "Y"
%  ./ctags --libexec-dir=./libexec --data-dir=./data --options=ripper-tags  -o - ./a.rb
CONSTANT    ./a.rb  /^CONSTANT = "B"$/;"    C   class:
%  ./ctags -o - ./a.rb
= "X"   ./a.rb  /^describe = "X"$/;"    d
= "Y"   ./a.rb  /^context = "Y"$/;" C   describe:= "X"

masatake avatar Jul 31 '15 08:07 masatake

Could you try this branch?

Your branch works for me. Any idea why it didn't work with --xcmd-ruby=ripper-tags?

mislav avatar Jul 31 '15 17:07 mislav

Shouldn't it be possible to use --xcmd-ruby=ripper-tags without defining a driver either in ctags core or in ~/.ctags.d/drivers/ripper-tags?

mislav avatar Jul 31 '15 17:07 mislav

Shouldn't it be possible to use --xcmd-ruby=ripper-tags without defining a driver either in ctags core or in ~/.ctags.d/drivers/ripper-tags?

As you wrote, the interface between ctags and the xcmd backend may be a bit redundant. I made it so because I was not sure what the interface between the two should be. I'm still not sure. I'll note what you wrote here as an important suggestion. Thank you. As far as I can remember, you are the first person who has tried the xcmd feature voluntarily.

It would be nice if ctags used ripper-tags instead of the built-in Ruby parser when ripper-tags is available where ctags runs.

--langdef=ripper_ruby
--altname-ripper_ruby=ruby
`ripper_ruby+ripper_ruby-ruby

(I just remembered that '-' cannot be used in a language name.) --altname-<LANG>= is obviously useful. --languages=? is too ad hoc.

When --fields=+l is given, ctags adds a language field, e.g.

timeStamp       main/main.c     /^#undef timeStamp$/;"  d       language:C      file:

Currently % ./ctags --fields=+l --libexec-dir=./libexec --data-dir=./data --options=ripper-tags -o - ./a.rb may generate:

CONSTANT    ./a.rb  /^CONSTANT = "B"$/;"    C   class:  language:ripper_ruby

Users may want

CONSTANT    ./a.rb  /^CONSTANT = "B"$/;"    C   class:  language:ruby

--altname-ripper_ruby=ruby is for this purpose.

masatake avatar Aug 01 '15 03:08 masatake

@mislav, can I ask you to add the following pseudo tags to ripper-tags?

  • !_TAG_PROGRAM_AUTHOR
  • !_TAG_PROGRAM_NAME
  • !_TAG_PROGRAM_URL
  • !_TAG_PROGRAM_VERSION

They are not mandatory, but ctags uses these tags when they are returned from the xcmd backend.
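
A minimal sketch of how a Ruby tags generator could emit such lines, following the three tab-separated fields of the !_TAG_FILE_FORMAT line shown earlier in this thread. pseudo_tag is a made-up helper and every value below is a placeholder, not ripper-tags' real metadata.

```ruby
# Build one pseudo-tag line: name, value, and a /comment/ field,
# separated by tabs. Hypothetical helper, not part of ripper-tags.
def pseudo_tag(name, value, comment = "")
  "!_TAG_#{name}\t#{value}\t/#{comment}/"
end

# Placeholder values for illustration only.
lines = [
  pseudo_tag("PROGRAM_AUTHOR", "Example Author", "author@example.com"),
  pseudo_tag("PROGRAM_NAME", "example-tags"),
  pseudo_tag("PROGRAM_URL", "https://example.com/example-tags", "official site"),
  pseudo_tag("PROGRAM_VERSION", "0.0.0"),
]

puts lines
```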

masatake avatar Aug 02 '15 17:08 masatake