doc Decide which methods should be documented

Problem

Right now, we do not have a clear policy on what methods should be documented in a Type. This means that it's harder to tell if a method was left out accidentally or intentionally and correspondingly harder to detect methods that should have been documented but slipped through the cracks. See, e.g., #3569, #3563, and #3564 (just from the last ~10 issues).

I don't have strong feelings on what the policy should be. There are reasonable points to be made in favor of different approach, as was clear in #3572. But I do strongly believe that we should have a clear, unambiguous policy. Having a clear rule will have two large advantages:

Contributors can spend less time wondering "does this need to be documented" and more time actually writing docs.
Automated tooling (including the list-missing-methods script I recently revised) can reliably detect methods that need documentation. In my dream world, we will document 100% of the methods we think should be documented, and then can add tests to ensure that we are aware of any new method to document.

Suggestions

Here are a few suggestions for where we could draw the line, from most inclusive to least inclusive:

1. Document all public methods declared in Rakudo's implementation of a Type that are not is implementation-detail

pros:

Docs have a close relationship to code. If fully implemented, users can look at the docs page for a Type and know exactly what methods that Type implements.
Deciding what methods to document is fully unambiguous and require no manual decision-making.

cons:

Adds "clutter" to the type docs, with more methods that aren't particularly useful to many users. This could be mitigated by resolving #392 or by finding a different way to direct attention to the most important methods.
More implementation dependent: it is possible that a non-Rakudo implementation of Raku would use inherited methods where Rakudo uses locally declared methods (or the reverse). The more closely we follow Rakudo's implementation, the larger an issue this is.
Gives us the most methods to document – 2,018 by my count (about half of which are Exception methods), though this is mitigated by how easy/copy paste some of the documentation would be.

2. As above, but exclude well-defined subcategories of methods For example, we could exclude some or all of: ALL-CAPS methods that are automatically called for the user; methods involved in printing a Type or text related to it (gist, raku, message, etc); coercion methods (which all basically are documented as "coerces $type into $other-type").

pros:

Once we have decided on categories to exclude, maintaining the categories requires no ongoing work.
Fewer methods to document/cluttering the docs.

cons:

Requires deciding on categories to exclude.
Doc readers lose the ability to know where methods are implemented by looking at docs.

3. As above, but exclude additional methods on an ad-hoc basis For example, we might decide that the standard is "document a method only if it provides semantically different behavior than would be provided by an inherited method of the same name" (here, I'm thinking of many methods that are declared locally in a Class because the class for performance reasons – because the class knows more about it's data structure, it can produce the same result more efficiently).

Pros:

Results in the leanest docs, with all/nearly all documented methods relevant to users.
Our tooling is already set up to support excluding methods in this way (true for 1 & 2 too, but mentioned here to avoid doubt).

Cons:

Requires significant, ongoing work to maintain the list of intentionally-undocumented types.
Makes getting to "100% documentation" more difficult.
Leads to the problem was-this-left-out-intentionally issue I mentioned above, which could both slow existing contributors and discourage new ones.

4. Entirely decouple what methods we document from Rakudo and rely on Roast instead

Pros:

Implementation independent.

Cons:

Require huge doc rewrite.
How would this even work??
Really just listed here for analytical completeness.

I've listed labeled this issue as a Blocker because I think it significantly impedes our ability to fully document the many missing methods. However, that doesn't mean that I am fully blocked: I intend to keep working to document methods pending reaching consensus here. I plan to proceed under the 1/document everything approach for now – not because I'm convinced it's the right answer but just because it's easier to delete things later than to add them. That means that, the longer this issue stays unresolved, the work potentially wasted work I'll have done – so I appreciate everyone's help in figuring out the best approach here.

Aug 21 '20 14:08 codesections

Thanks a lot for this issue, I really appreciate you taking the time to formulate this. In principle, I'm for option 1. But the main problem is that of priorities. In principle, there are more "meaty" methods and those that are not so much (.Set will obviously return a Set). Also, higher priority will go to those new added methods than to those that have been disregarded (and nobody has cared about them). Also, we always need to take into account Roast; if it's not in roast, it's simply not spec and behavior might change at any time. That does not mean that it shouldn't be documented, but it should be so with all due caveats. Finally, with respect to ALL-CAPS, they might be spec, or they might not. If they have been "roasted", they should probably be documented.

So, long story short, there's no doubt we should document everything that is user-facing, is not an implementation detail, and has been specced. Everything else, if it's user-facing, it should be done... probably, but with a lower priority, and even more so if it's "in the wild" or used in some ecosystem module. Then... the rest.

Aug 21 '20 15:08 JJ

In principle, I'm for option 1. But the main problem is that of priorities.

I agree 100%. That said – at least speaking personally – it can be significantly more efficient to work on a type-by-type basis. That's how I submitted #3572: I wanted to add documentation for SetHash's set and unset methods, but once I had the SetHash.pod6 file, the SetHash.pm6 file, and a REPL all up and and understood the structure of the SetHash.pm6 code, it seemed easier to go ahead and add the other 14 missing SetHash methods while I had the context to do so quickly. I know that some of those methods were "lower priority", but they were the ones I had the context to add at that moment. Within the next few days, I plan to do the same thing with BagHash – add and remove seem like high-priority methods, but I plan to add the low priority methods while I'm there.

Maybe no one else works the way I do – it wouldn't be the first time! But, again just speaking for myself, I anticipate adding some low-priority documentation even before we've finished with the higher priority stuff.

Aug 21 '20 16:08 codesections

+1 for "all public, not impl. detail".

If it's not in roast, I think there are 2 options:

document it and request tests be added to roast (just in a todo list on roast is fine for our purposes, ticket better, PR best)
don't document it and have rakudo mark it as implementation detail (and therefore we don't have to document it)

Doc editors/authors aren't the arbiters of whether or not it should be in roast. But we can definitely help highlight the gaps as we go and make recommendations.

Aug 21 '20 16:08 coke

Also, we always need to take into account Roast; if it's not in roast, it's simply not spec and behavior might change at any time. That does not mean that it shouldn't be documented, but it should be so with all due caveats.

After giving this more thought, I believe that – as valid as the "if it's not in Roast it's not spec" principle is – it doesn't help answer the question of which methods should be documented on which types. Indeed, unless I'm very confused about how Roast works, it intentionally can't answer that question.

For example, Roast tests ensure that implementations allow (via indexing) calls to both SetHash.AT-KEY and SetHash.EXISTS-KEY. What Roast doesn't and (I believe) can't test is whether those calls are carried out by methods declared directly on SetHash or on some Class/Role SetHash inherits from. As it happens, SetHash directly declares AT-KEY but not EXISTS-KEY, but Roast is totally agnostic on that point; and a different implementation could make a different choice there, while still passing Roast.

And the same is true for other methods. So, as much as I respect Roast's role, I don't think it has any opinions about where methods are declared, as opposed to from where they can be called.

Aug 21 '20 16:08 codesections

Well, if it's not in roast, it might or might not be in the Synopsis. If it's there, well, it should go to roast and then an issue should be raised. If it's not, well... Probably a request for clarification is in order.

Aug 21 '20 16:08 JJ

Also, we always need to take into account Roast; if it's not in roast, it's simply not spec and behavior might change at any time. That does not mean that it shouldn't be documented, but it should be so with all due caveats.

After giving this more thought, I believe that – as valid as the "if it's not in Roast it's not spec" principle is – it doesn't help answer the question of which methods should be documented on which types. Indeed, unless I'm very confused about how Roast works, it intentionally can't answer that question.

For example, Roast tests ensure that implementations allow (via indexing) calls to both SetHash.AT-KEY and SetHash.EXISTS-KEY. What Roast doesn't and (I believe) can't test is whether those calls are carried out by methods declared directly on SetHash or on some Class/Role SetHash inherits from. As it happens, SetHash directly declares AT-KEY but not EXISTS-KEY, but Roast is totally agnostic on that point; and a different implementation could make a different choice there, while still passing Roast.

But this is a different problem. It's not about what to document, but about how to document it. In general, we add a reference for classes methods, but not for classes syntax, although we could obviously do that (mainly in an ad-hoc basis, in the intro). So the question is not whether to document AT-KEY or not, but if documenting in as a class method or as a syntactic option. What we will see in roast is the latter, not the former.

And the same is true for other methods. So, as much as I respect Roast's role, I don't think it has any opinions about where methods are declared, as opposed to from where they can be called.

Which is probably what we should be interested in.

Aug 21 '20 17:08 JJ

I'm not sure I'm communicating my point clearly. In my view, the question we're trying to figure out is how to document something (not what to document), so we might be talking past one another a bit. Let me try again with a more concrete example: the kv method.

Roast tests that code can call the kv method on Sets and SetHashes, so we need to document the kv method. Everyone is sure of that.

The question, then, is where/in how many places do we document the kv method. In my understanding, Roast has no bearing on answering this question. Right now, (before my PR) we document kv on Setty and do not document it on Set or SetHash.

Looking to Rakudo, it turns out that Setty does not declare a kv method, but Set and SetHash both do. I'm inclined to say that this means our docs are "wrong" about kv – that, to be more accurate, the method should be removed from Setty and added to Set and SetHash. Do you agree?

If so, then my point is that this determination is independent of Roast – a different implementation could declare kv in Setty and still pass all the tests. If not, then I'm kind of lost in figuring out where different methods should live within our /type/* documentation hierarchy.

Aug 21 '20 17:08 codesections

I'm not sure I'm communicating my point clearly. In my view, the question we're trying to figure out is how to document something (not what to document), so we might be talking past one another a bit. Let me try again with a more concrete example: the kv method.

Well, they are not so different. You might decide to document AT-KEY (what) but then you might decide to document the syntax [] (how) instead of method AT-KEY.

Roast tests that code can call the kv method on Sets and SetHashes, so we need to document the kv method. Everyone is sure of that.

The question, then, is where/in how many places do we document the kv method. In my understanding, Roast has no bearing on answering this question. Right now, (before my PR) we document kv on Setty and do not document it on Set or SetHash.

Well, if the spec says that the behavior is different than what was described in Setty, and it's shown that way in roast, we should. If not, the documentation for Setty.kv is already attached to the bottom of all the classes that mix it in. If the behavior is still the same, why should we document it?

Looking to Rakudo, it turns out that Setty does not declare a kv method, but Set and SetHash both do. I'm inclined to say that this means our docs are "wrong" about kv – that, to be more accurate, the method should be removed from Setty and added to Set and SetHash. Do you agree?

If this is not a trick question, of course I agree.

If so, then my point is that this determination is independent of Roast – a different implementation could declare kv in Setty and still pass all the tests. If not, then I'm kind of lost in figuring out where different methods should live within our /type/* documentation hierarchy.

Exactly where they are declared, and/or where they exhibit a sufficiently different, or outstanding, behavior, specced and tested in roast.

(Also please raise the issue on the wrong .kv in Setty)

Aug 21 '20 17:08 JJ

If this is not a trick question, of course I agree.

Definitely not a trick question. No tricks here!

[We should document methods] exactly where they are declared, and/or where they exhibit a sufficiently different, or outstanding, behavior

I think the and/or here gets at the heart of where I'm confused. This seems like an xor situation to me: If we're going by "where they are declared", then .kv should be documented on Set and SetHash but not Setty. If we're going by "where they exhibit sufficiently different behavior", then it makes sense to keep the kv docs in Setty: the docs say that "Returns a Seq of the set's elements and True values interleaved", which is a perfectly accurate description of the behavior in both Set and SetHash. If we're going by "where they're declared and/or where they exhibit different behavior" then I genuinely don't know where kv should be documented.

(Also please raise the issue on the wrong .kv in Setty)

I'm happy to do so.

Aug 21 '20 17:08 codesections

After an extremely interesting an mind-expanding conversation with @coke in the IRC channel (lines 529-609), I've come to believe that I was fundamentally mistaken above, and no longer believe that any of the first three potential solutions I presented in the OP are correct options.

That's because they were all answers to the wrong question. They were implicitly answering the question "what methods should be documented in the pod6 file for each Type". That's close to right, but not quite.

The right question to ask is "what methods should be visible in the HTML (or other output) page for each Type". That is, we shouldn't care about where a method is documented (which is a parallel concern to where it is implemented in Rakudo). Instead, we should care about where a method is visible to the user – which is a parallel concern to where it is callable, the only thing Roast cares about.

This is a bit abstract and took me a while to wrap my head around, so let me make it concrete. Under the document-all-public-methods approach (1 in the OP), the documentation for Setty's kv method should be deleted and from the Setty.pod6 and inserted into SetHash.pod6. Those methods aren't declared at all in Setty, but they are declared as public methods methods in SetHash. On the other hand, the documentation for Setty's of method is correctly in Setty.pod6 – and moving it to SetHash.pod6 would be wrong – since that method is a public Setty method and is not a SetHash method. The document-all-public-methods rule gives a clear answer; the kv docs need to move and the of docs need to stay.

But that ignores one thing: Roast doesn't care.

Roast doesn't test that SetHash has a kv method or that Setty has an of method. Roast just tests that SetHash can call a kv and an of method. Those methods could be implemented locally, they could come from Setty, or they could come from some other Class or Role that Raku doesn't even include. So long as the tests pass, Roast doesn't care. And neither should our docs.

So, what does that mean in practice? It means that I'd like to propose a new definition for when a type is fully documented:

A type is fully documented only if all methods Roast tests call on that type appear in the generated documentation for that type

That definition is not quite as easy to operationalize/test as the all-public-methods definition. But, complex or not, it has the virtue of being true.

That said, I haven't given up on the goal of having a simple way to determine what should be documented. I stand by what I said in the OP: having a simple, clear list of missing methods would make things a lot easier for contributors. And having a tool that can automatically check when we're missing something would help prevent gaps from growing in our docs. So, how can we turn that true definition into a workable one that can be applied by a tool.

Well, we need to compare two lists: the list of all methods that render in the output documentation for a Type, and the list of all methods Roast calls for a Type.

The first one of these is pretty easy, and will require only relatively minor changes to the list-missing-methods script. (It will mean that we won't be able to have output for just one pod6.file – because the concept of whether one pod6 file is complete documentation for a Type is no longer meaningful). But that's not a large issue.

The second problem – checking which methods Roast calls for each Type – is more challenging, but hardly insurmountable. It's effectively a sub-problem of implementing a tool for measuring code-coverage tool, after all. And, in the meantime, we can fairly easily modify the list-missing-methods program to check our documentation against all methods that a Type can call (that is, the results of Type.^methods(:all), which should be a very good first approximation.

So, this is my plan:

[x] 1. close my HashSet PR and resubmit a more limited version that just adds set and unset.
[ ] 2. Prepare a PR to CONTRIBUTING.md that clarifies the standard for whether a method should be documented.
[ ] 3. Revise list-missing-methods to use the definition and approximation described above, instead of the all-public-methods definition (and spin it off into a separate module, as previously discussed).
[ ] 4. Start using that script to resume adding undocumented methods.
[ ] 5. Work on the ability to generate a list of all methods tested in Roast, as the second part of our long-term solution.

I feel pretty confidant that this is the right plan, but I'm definitely open to contrary viewpoints. After all, I've already learned quite a lot from this conversation, and that was just today. (Thanks!)

Aug 21 '20 22:08 codesections

Was a great convo on IRC, and I think you summed things up very eloquently. Thank you for all the effort you're putting in here.

Aug 22 '20 01:08 coke

[We should document methods] exactly where they are declared, and/or where they exhibit a sufficiently different, or outstanding, behavior

I think the and/or here gets at the heart of where I'm confused. This seems like an xor situation to me: If we're going by "where they are declared", then .kv should be documented on Set and SetHash but not Setty. If we're going by "where they exhibit sufficiently different behavior", then it makes sense to keep the kv docs in Setty: the docs say that "Returns a

OK, but the fact that they should actually be used in that particular setting was implicit. Setty was probably once endowed with a kv, but not any more. The "or" case would be, for instance, a role that implements something like a trivial conversion to .WhateverHash, which is only meaningfully implemented where it's mixed in. In this case, you wouldn't want to have a method description "this does a trivial thing" attached at the bottom of the page of every single class that mixes it in, but only in the class that actually does something meaningful. At the end of the day, this ends up being a judgment call, I know. But again, it needs to be looked at from the point of view of priorities and also information overload on the user's end.

Aug 22 '20 08:08 JJ

So, what does that mean in practice? It means that I'd like to propose a new definition for when a type is fully documented:

A type is fully documented only if all methods Roast tests call on that type appear in the generated documentation for that type

That definition is not quite as easy to operationalize/test as the all-public-methods definition. But, complex or not, it has the virtue of being true.

It's asymptotically true, at least.

The second problem – checking which methods Roast calls for each Type – is more challenging, but hardly insurmountable. It's effectively a sub-problem of implementing a tool for measuring code-coverage tool, after all. And, in the meantime, we can fairly easily modify the list-missing-methods program to check our documentation against all methods that a Type can call (that is, the results of Type.^methods(:all), which should be a very good first approximation.

And it would be great to have that tool...

1. close my `HashSet` PR and resubmit a more limited version that just adds `set` and `unset`.

2. Prepare a PR to CONTRIBUTING.md that clarifies the standard for whether a method should be documented.

Again, please also consider adding some text about documenting the syntax (for instance [] instead of ALL-CAPS methods.

5. Work on the ability to generate a list of all methods tested in Roast, as the second part of our long-term solution.

A test coverage tool would be absolutely great, in general.

I feel pretty confidant that this is the right plan, but I'm definitely open to contrary viewpoints. After all, I've already learned quite a lot from this conversation, and that was just today. (Thanks!)

Thanks to you!

Aug 22 '20 08:08 JJ