yard icon indicating copy to clipboard operation
yard copied to clipboard

@api-filtered class is incorrectly hidden if containing module not marked with same @api

Open smackesey opened this issue 3 years ago • 5 comments

When I mark a class MyModule::MyClass with @api public, then run yard doc --api public:

  • the class MyClass incorrectly does not show up in the "Class List"
  • the methods of MyClass correctly show up in the "Method List"
  • if I also mark MyModule with @api public, then it works.

Steps to reproduce

  • $ mkdir -p yardtest/lib
  • $ cd yardtest
  • put this is in lib/test.rb:
module MyModule
  # @api public
  class MyClass
    def my_method; end
  end
end
  • $ yard doc --api public # output shows 1 class, 1 method
  • $ open doc/index.html
  • click on "Classes" in upper-left; notice that MyClass is not listed
  • edit lib/test.rb to add # @api public as the first line, over module MyModule
  • $ yard doc --api public
  • reload doc/index.html, this time the output is correct MyClass is in the class list

Expected Output

Classes marked with @api public should show in the class list regardless of whether the containing module is marked as such.

Environment details:

  • OS: macOS 11.3
  • Ruby version (ruby -v): ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-darwin20]
  • YARD version (yard -v): yard 0.9.26

smackesey avatar May 02 '21 01:05 smackesey

Classes marked with @api public should show in the class list regardless of whether the containing module is marked as such.

Unfortunately that's not the expectation YARD provides, though I understand the confusion. There is a bug here, but I'm not sure fixing it will necessarily do what you want. Specifically, the bug is YARD is not warning you about this behavior and creating an inconsistent state. If YARD was going to "fix" this, it would probably do so by hiding MyClass and warning you that your public API is invalid. Again, I'm sure that's not what you want.

The issue is that by applying @api public only to the child class, you've filtered out the parent when using --api public (which is an alias for --query @api.text == 'public'). Namespaces that have been filtered out of generation will never show up in generated output, and that includes anything they contain. What you've created is a bit of a contradictory scenario where MyMixin, a private API (and by extension a private module with private children), should be hidden, but with an explicitly public child class. This contradiction effectively creates an undefined behavior.

At the end of the day, you might want to ask whether MyModule is actually private, or whether you just mis-documented it as such? Remember: if you omit public in your example, you're effectively calling it non-public. Presumably if MyClass is to be found, its parent namespace must also be publicly available in your API, no?

As a suggestion, it might be easier to accomplish the same result by doing this in the opposite direction: rather than assuming everything unmarked is private and only those with @api public are visible, consider marking only your private APIs as such and assuming that anything unmarked is public as expected. You would do this with --hide-api private and @api private tags. In practice this should require much less annotation, since typically private namespace surface-area is much smaller. If that's not true for your codebase, it might be hinting at a worthwhile refactor to collect these private namespaces into a single root namespace (MyMixinImpl, maybe?).

Alternatively, a little added tip here: @api is a "transitive" tag, in other words, the tag applies to all descendent namespaces. That is why when you apply @api public to MyModule it "works as expected"-- because it's also marking MyClass as @api public. You can see this for yourself by running the following in the directory with your above example repro code:

$ ruby -ryard -e 'YARD::CLI::Yardoc.run("-q"); p YARD::Registry.at("MyModule::MyClass").docstring.to_raw'
"@api public"

Note, that this means everything inside MyModule now has @api public, so that may not be exactly what you want and why I'm only suggesting this as an alternative.

lsegal avatar May 02 '21 02:05 lsegal

Thanks @lsegal for the clear and detailed response.

The reasoning that the container must be public for the child to be public does make sense. I would have inferred that but for the fact that all of the methods of MyClass (which by your explanation should be hidden) actually show up in the method list even when MyClass itself is hidden (the methods disappear as expected when you pass --non-transitive-tag api). This is a bug, no?

Now if I understand you, the intended rule of YARD queries is: "a construct (module/class/method) is only matched by a query if all of its ancestors in the container hierarchy also match the query". Call this the "ancestor-match rule". IMO this rule is counterintuitive, as code objects that live deeper in the container hierarchy are often classified in a way that doesn't make sense for their containers. For instance, this implies that to query for @api private methods within a gem, the gem root module must also be marked @api private, which makes no sense.

That said, I haven't thought enough about the issue to propose an intelligent alternative that also accounts for the logic of namespace hiding you mention.


As for your suggestions regarding the "direction" of API-marking (default public or default private):

The example I posted is a simplified version of my actual use-case, in which the container module is actually the root module of my gem. In my gem (and I find in most of the stuff I code-- I try to keep my public API as small as possible) the majority of methods are private. So I prefer treating everything as private by default, and explicitly marking constructs as public.

But according to the ancestor-match rule, that's impossible with transitive api tags, because it's impossible for a query to return any results if it doesn't match the root module. So It is impossible to query any api that is not the default api (since the default is set by the transitive root module tag).

IMO this makes the --api option useless under the default settings (API transitivity enabled). Any use case of the api tag can be more efficiently satisfied with the --hide-api tag. Ultimately though, I was able to achieve what I want by turning off API transitivity and explicitly marking all containers public.

I do find the current state of affairs pretty counterintuitive though. Maybe some docs would help? I'd actually be happy to translate the knowledge I've gained debugging this issue into a short guide for inclusion in the YARD docs showing how to generate docs under various API scenarios, but I'm not sure of where the guide should live in the docs.

smackesey avatar May 02 '21 18:05 smackesey

The reasoning that the container must be public for the child to be public does make sense. I would have inferred that but for the fact that all of the methods of MyClass (which by your explanation should be hidden) actually show up in the method list even when MyClass itself is hidden (the methods disappear as expected when you pass --non-transitive-tag api). This is a bug, no?

I'm not sure I'm following here. What's the repro case for this? If you apply @api public to MyClass in a transitive fashion (default behavior), my_method will be marked as public via transitive property of the tag and should be visible. If you use --non-transitive-tag api, the tag will not be inherited by the methods and they will be filtered out and be hidden. Both of these scenarios work as described with YARD from my tests.

But according to the ancestor-match rule, that's impossible with transitive api tags, ... IMO this makes the --api option useless under the default settings (API transitivity enabled).

Sorry, one of the things I did not make very clear at all with my last paragraph was actually a pretty important feature of transitivity, so let me correct my last message:

Transitive tags are applied transitively to child objects unless there is an overriding tag in the child. In other words, transitive tags are not applied to children that already have a tag of that type defined on them. This is a very important distinction and why you can in fact do something as simple as:

# @api public
module MyModule
  class MyClass
    def my_method; end

    # @api private
    def my_private_api_method; end
  end

  # @api private
  class MyPrivateClass; end
end

And yard --api public will do the "intuitive" thing:

image

One might assume that you would end up with both @api public and @api private on the child, but this is explicitly not the case for transitive tag logic. This might be the missing piece for you?

I do find the current state of affairs pretty counterintuitive though. Maybe some docs would help?

FWIW this is documented in the tag documentation, though even I completely missed explaining this detail above, so one could argue it needs better explanation than what is below:

image

I'd actually be happy to translate the knowledge I've gained debugging this issue into a short guide for inclusion in the YARD docs showing how to generate docs under various API scenarios, but I'm not sure of where the guide should live in the docs.

If you have recommendations / suggestions, PRs are always welcome to improve docs. The Getting Started guide is long overdue for an improvement, and perhaps that is what you're describing, but you may also consider that some of this belongs in separate documentation, i.e. littered through quick start, advanced topics, tag doc improvements, etc. I would love to see any contributions there!

lsegal avatar May 02 '21 23:05 lsegal

Also just adding a bit extra to one part of your initial response which I think is relevant:

Call this the "ancestor-match rule". IMO this rule is counterintuitive, as code objects that live deeper in the container hierarchy are often classified in a way that doesn't make sense for their containers.

This is a fair callout, though I wouldn't consider it counter-intuitive. It kind of depends on what your expectation is. If the expectation is to generate with --api public and then again with --api private with the above unchanged code (unmarked elements), yes it can be confusing. Again I think the reason it's counter-intuitive has to do with the conflicting semantics of how to display generated output and why this is in fact a fairly hard thing to fix (as you noted), mostly because "filtering" objects and "displaying" objects are pretty separable concerns.

For example, a displayed object might require the context of objects that were otherwise hidden by a filter. A common scenario here is a subclass that inherits a class that was filtered out by --query. YARD shows the superclass in documentation, but if that class is hidden, how should that work? If methods are made available to a subclass but those methods exist on the hidden class, how do we document them? Furthermore, this may be a problem in an API reference template (default) but may not at all be a problem when generating different style docs, or maybe even docs in a different format (yri style etc). So really the responsibility lies with the documentor to use queries in a way that doesn't slice their data up in too small of a chunk for the tool to use reliably. This creates some conflicting cases, but overall, the default behavior has so far been be fairly intuitive for typical use cases based on the "data" we've collected on the topic-- and by data I mean the very small amount of reports about this tag not working over a very large period of time.

FWIW, while we're on the subject of filtering, it's worth pointing out that --api is just a convenience option for --query. I pointed this out before, but it's worth noting that if you do end up with contradicting cases or complex scenarios, you can always create more complex filters that find the objects you want.

In the case of the original code pasted in this issue, you could create a filter that finds anything with @api public, but also anything that omitted an @api tag altogether. That's not really a best practice, but --query '@api.text == 'public' || !@api' would do this. Incidentally, YARD does have a short-hand for this too: --api public --no-api would work on the unchanged original repro code and be equally compatible if you wanted to go down the --api private route, but to be honest that is mostly just relying on leaving objects undefined, which is not ideal from a documentation POV.

lsegal avatar May 02 '21 23:05 lsegal

I'm not sure I'm following here. What's the repro case for this? If you apply @api public to MyClass in a transitive fashion (default behavior), my_method will be marked as public via transitive property of the tag and should be visible. If you use --non-transitive-tag api, the tag will not be inherited by the methods and they will be filtered out and be hidden. Both of these scenarios work as described with YARD from my tests.

The repro is the original case I posted. Yes, my_method is marked public, but shouldn't it be hidden if the containing class MyClass is? You explained that YARD hides MyClass even though it is specifically marked @api public because the containing module MyModule is not marked @api public:

Namespaces that have been filtered out of generation will never show up in generated output, and that includes anything they contain.

But we have a situation where the methods of MyClass are not being hidden (they are present in the method list) even while MyClass itself is (absent from the class list). So the above statement is false in the current YARD implementation.


Regarding transitive tags, I understand what you're saying here about explicit marking overriding transitivity. My point is that the only API that can ever be matched is the default one. That is what I find counterintuitive. You can mark up constructs with arbitrary values for @api, but the --api option can only ever be used to match an @api value that is present on the root. For instance, in the example you posted --api private will produce no output. This makes the --api option sort of pointless when API is transitive, since it will always be more economical (less markup since no explicit declaration of the @api of the root module) to use --hide-api to achieve an equivalent result. In your example, you would remove @api public from the root and just run yard --hide-api private.


Regarding docs, I will poke around and look for a good place to make my contribution.

smackesey avatar May 02 '21 23:05 smackesey