intellij-community [java,doc] Allow nested markup in inline return tag

Fixed Youtrack issues

https://youtrack.jetbrains.com/issue/IDEA-300185/Support-code-tags-inside-inline-return
https://youtrack.jetbrains.com/issue/IDEA-329877/Javadoc-does-not-render-correctly-for-inline-return
Maybe more

Before

Previously IntelliJ did not parse nested markup in inline tags, resulting in the following display for a simple method doc comment.

    /**
     * {@return {@code something cody} in here !}
     *
     * @param args my args
     * @throws IOException if things happen
     */
    public static void main(String[] args) throws IOException {}

After

Threats to validity

The javadoc parsing code seems incredibly complicated for what it does, so I might have missed some intricacies. As my changes are quite limited in scope though, that should be okay to review.

Alternative approaches and future work

My first commit only fixes inline return tags. My second commit allows nested content everywhere except literal tags, which prettifies the display somewhat in other cases (e.g. {@link Test {@code foo} bar}). The standard resolves such text and the spec says

Escape sequences cannot be used in inline tags that contain literal text; this includes {@code}, {@literal}, {@snippet}, and user-defined tags.

The last part, user-defined tags, I did not model. They will also have their markup parsed recursively. I do not know how to detect them in a future-proof way in IntelliJ.

Feb 09 '24 23:02 I-Al-Istannen

Can I do anything here to help you review the PR @sirgl?

Mar 19 '24 16:03 I-Al-Istannen

@sirgl What is the status of this PR? Can I do anything to help?

Apr 15 '24 15:04 I-Al-Istannen

Something is not quite right (probably in remapAndAdvance), for instance in the following example there is a DOC_INLINE_TAG_END in excess:

/**
 * {@code opening curly brace { closing curly brace } this is still code }
 */

      PsiDocToken:DOC_COMMENT_START('/**')(0,3)
      PsiDocToken:DOC_COMMENT_LEADING_ASTERISKS('*')(5,6)
      PsiDocToken:DOC_COMMENT_DATA(' This is a class. ')(6,24)
      PsiInlineDocTag:@code(24,95)
        PsiDocToken:DOC_INLINE_TAG_START('{')(24,25)
        PsiDocToken:DOC_TAG_NAME('@code')(25,30)
        PsiDocToken:DOC_COMMENT_DATA(' opening curly brace ')(30,51)
        PsiDocToken:DOC_COMMENT_DATA('{')(51,52)
        PsiDocToken:DOC_COMMENT_DATA(' closing curly brace ')(52,73)
        PsiDocToken:DOC_INLINE_TAG_END('}')(73,74)
        PsiDocToken:DOC_COMMENT_DATA(' this is still code ')(74,94)
        PsiDocToken:DOC_INLINE_TAG_END('}')(94,95)
      PsiDocToken:DOC_COMMENT_END('*/')(97,99)

Before changes the closing brace was remapped to DOC_COMMENT_DATA correctly:

      PsiDocToken:DOC_COMMENT_START('/**')(0,3)
      PsiDocToken:DOC_COMMENT_LEADING_ASTERISKS('*')(5,6)
      PsiDocToken:DOC_COMMENT_DATA(' This is a class. ')(6,24)
      PsiInlineDocTag:@code(24,95)
        PsiDocToken:DOC_INLINE_TAG_START('{')(24,25)
        PsiDocToken:DOC_TAG_NAME('@code')(25,30)
        PsiDocToken:DOC_COMMENT_DATA(' opening curly brace ')(30,51)
        PsiDocToken:DOC_COMMENT_DATA('{')(51,52)
        PsiDocToken:DOC_COMMENT_DATA(' closing curly brace ')(52,73)
        PsiDocToken:DOC_COMMENT_DATA('}')(73,74)
        PsiDocToken:DOC_COMMENT_DATA(' this is still code ')(74,94)
        PsiDocToken:DOC_INLINE_TAG_END('}')(94,95)
      PsiDocToken:DOC_COMMENT_END('*/')(97,99)

Apr 18 '24 09:04 yopox

That does look broken, I will have a look at it :) Thanks!

Apr 22 '24 10:04 I-Al-Istannen

Okay, I looked at it for quite a while now and I don't see an obvious fix for this in the current architecture. The Javadoc parser code is quite convoluted, interleaving multiple consuming while loops with recursion, token remapping and advancing/rolling back the lexer in leaf calls.

The problem I currently have is correctly detecting the end of an enclosed region. If you prevent a starting { from counting as a start tag, you also need to prevent everything until the next balancing closing } brace from becoming a tag marker. I tried to do this in various ways, but I don't have quite enough information to decide this down where the lexer advances.

I tried adding a markupBraceCutoff which is incremented every time a tag that can contain markup is encountered. Then you would have the following situation:

Token           : /**   {   @code   {   }   }
Brace level     : 0     1   1       2   1   0
Cutoff level    : 1     1   1       1   1   1
Brace <= cutoff : -     y   -       n   n   y
Allow           : -     y   -       n   n   y

In IJ however, as the brace levels are updated in the inner leaf calls, you get this:

Token           : /**   {   @code   {   }   }
Brace level     : 0     0   1       1   2   1
Cutoff level    : 1     1   1       1   1   1
Brace <= cutoff : -     y   -       y   n   y
Allow           : -     y   -       y   n   y

This is still workable if you implement asymmetric boundaries for { and }. The opening must be strictly smaller, the ending must be smaller or equal. This works fineish.

But when you have free-standing { and }, IJ manually rolls back the classification of { and does not increase the brace level. This results in a growing imbalance, with my code thinking all the closing braces actually appear on the root level. There, markup is obviously allowed. Consequently, the closing braces are marked as inline tag ends, even if they shouldn't be. But even if incrementing the brace level there, we still aren't on much better footing. Inside of { inline tags can appear and be rendered by the javadoc tool. This would necessitate incrementing the cutoff level — making it impossible to mark the free-standing braces as normal comment data.

I can't help but feel that the current parser implementation is a bit more convoluted and interestingly factored than it has to be. Changing that would require quite some work and relatively large diff though. But I am not sure how I can properly fix nested markup tags without such a refactor.

May 03 '24 17:05 I-Al-Istannen

intellij-community intellij-community copied to clipboard

[java,doc] Allow nested markup in inline return tag

Fixed Youtrack issues

Before

After

Threats to validity

Alternative approaches and future work

intellij-community
intellij-community copied to clipboard