typedoc-plugin-markdown icon indicating copy to clipboard operation
typedoc-plugin-markdown copied to clipboard

Broken tables

Open KillyMXI opened this issue 1 year ago • 11 comments
trafficstars

What package is the bug related to?

typedoc-plugin-markdown

Describe the issue

Related to #615

Issues I notice with table formatting, in version 4.0.2.

Interface members, unexpected line-breaks

export interface Rule {
  // ...

  /**
   * Switch to another lexer function after this match,
   * concatenate it's results and continue from where it stopped.
   */
  push?: Lexer;

  // ...
}
export interface RegexRule extends Rule {
  /**
   * Regular expression to match.
   *
   * - Can't have the global flag.
   *
   * - All regular expressions are used as sticky,
   *   you don't have to specify the sticky flag.
   *
   * - Empty matches are considered as non-matches -
   *   no token will be emitted in that case.
   */
  regex: RegExp;
  /**
   * Replacement string can include patterns,
   * the same as [String.prototype.replace()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#specifying_a_string_as_a_parameter).
   *
   * This will only affect the text property of an output {@link Token}, not it's offset or length.
   *
   * Note: the regex has to be able to match the matched substring when taken out of context
   * in order for replace to work - boundary/neighborhood conditions may prevent this.
   */
  replace?: string;
}
## Properties

| Property | Type | Description | Inherited from |
| :------ | :------ | :------ | :------ |
| `push?` | [`Lexer`](../type-aliases/Lexer.md) | Switch to another lexer function after this match,
 concatenate it's results and continue from where it stopped. | [`Rule`](Rule.md).`push` |
| `regex` | `RegExp` | <p>Regular expression to match.</p>
<ul><li>Can't have the global flag.</li><li></li><li>All regular expressions are used as sticky,</li></ul>  you don't have to specify the sticky flag.
<ul><li>Empty matches are considered as non-matches -</li></ul>  no token will be emitted in that case. | - |
| `replace?` | `string` | <p>Replacement string can include patterns,</p>
<p> the same as [String.prototype.replace()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#specifying_a_string_as_a_parameter).</p><p> This will only affect the text property of an output [Token](../type-aliases/Token.md), not it's offset or length.</p><p> Note: the regex has to be able to match the matched substring when taken out of context</p>
<p> in order for replace to work - boundary/neighborhood conditions may prevent this.</p> | - |
... abbreviated ...

Type alias, unexpected line-break

export type Token = {
  // ...

  /**
   * The length of the matched substring.
   *
   * _(Might be different from the text length in case replace value
   * was used in a {@link RegexRule}.)_
   */
  len: number;

  // ...
}
## Type declaration

| Member | Type | Description |
| :------ | :------ | :------ |
| `len` | `number` | <p>The length of the matched substring.</p><p>_(Might be different from the text length in case replace value</p>
<p> was used in a [RegexRule](../interfaces/RegexRule.md).)_</p> |
... abbreviated ...

Function parameters, unexpected line-break

/**
 * Create a lexer function.
 *
 * @param rules - Non-empty array of lexing rules.
 *
 * Rules are processed in provided order, first match is taken.
 *
 * Rules can have the same name - you can have separate rules
 * for keywords and use the same name "keyword" for example.
 *
 * @param options - Lexer options object.
 */
export function createLexer (
  rules: Rules,
  options?: Options
): Lexer;

### Parameters
| Parameter | Type | Description |
| :------ | :------ | :------ |
| `rules` | [`Rules`](../type-aliases/Rules.md) | <p>Non-empty array of lexing rules.</p><p> Rules are processed in provided order, first match is taken.</p><p> Rules can have the same name - you can have separate rules</p>
<p> for keywords and use the same name "keyword" for example.</p> |
| `options`? | [`Options`](../type-aliases/Options.md) | Lexer options object. |

Index, missing description

/**
 * Lexer function.
 *
 * @param str - A string to tokenize.
 * @param offset - Initial offset. Used when composing lexers.
 */
export type Lexer = (
  str: string,
  offset?: number
) => LexerResult;
## Type Aliases

| Type alias | Description |
| :------ | :------ |
| [Lexer](type-aliases/Lexer.md) | - |
... abbreviated ...

(expected "Lexer function." in the description column)

Index, unexpected line-break

/**
 * Lexing rule.
 *
 * Base rule looks for exact match by it's name.
 *
 * If the name and the lookup string have to be different
 * then specify `str` property as defined in {@link StringRule}.
 */
export interface Rule {
  // ...
}
| Interface | Description |
| :------ | :------ |
| [Rule](interfaces/Rule.md) | Lexing rule.
 |
... abbreviated ...

Index, description trimmed at line-break, no full paragraph, no ellipsis

/**
 * String rule - looks for exact string match that
 * can be different from the name of the rule.
 */
export interface StringRule extends Rule {
  // ...
}
| Interface | Description |
| :------ | :------ |
| [StringRule](interfaces/StringRule.md) | String rule - looks for exact string match that
 |
... abbreviated ...

(expected "String rule - looks for exact string match that can be different from the name of the rule." in the description column) (there is an unexpected line-break instead)

TypeDoc configuration

{
  "$schema": "https://typedoc.org/schema.json",
  "disableSources": true,
  "entryFileName": "index.md",
  "entryPoints": [ "src/code.ts" ],
  "githubPages": false,
  "hideBreadcrumbs": true,
  "includeVersion": true,
  "readme": "none",
  "out": "docs",
  "plugin": ["typedoc-plugin-markdown"],
  "outputFileStrategy": "members",
  "useCodeBlocks": true,
  "expandParameters": true,
  "parametersFormat": "table",
  "propertiesFormat": "table",
  "typeDeclarationFormat": "table",
  "indexFormat": "table",
}

Expected behavior

No response

KillyMXI avatar May 15 '24 20:05 KillyMXI

Thank you for spending time to document this. It is extremely useful.

I think parsing of markdown for table cells is probably going to be best handled by a 3rd party. I'll report back with a favoured approach shortly.

tgreyuk avatar May 15 '24 21:05 tgreyuk

Half-assed solution could be to defensively remove line-breaks from any cell contents. But a proper parser would certainly be better than homegrown regexes.

The case of missing description in the index table is most puzzling though - it doesn't seem to fall into the same pattern as others. And I'm not sure what makes that one description special from others.

KillyMXI avatar May 15 '24 22:05 KillyMXI

@KillyMXI I think rather than trying to parse the markdown to html in the table cell it might be a better to provide an option to wrap the markdown in an html table as follows:

Input

export type Token = {
  // ...

  /**
   * The length of the matched substring.
   *
   * _(Might be different from the text length in case replace value
   * was used in a {@link RegexRule}.)_
   */
  len: number;

  // ...
}

format="table"

Will create a markdown table and remove all linebreaks from the comments. This will work for short descriptions or if html is being used for block elements.

## Type declaration

| Member | Type | Description |
| :------ | :------ | :------ |
| `len` | `number` | The length of the matched substring.  _(Might be different from the text length in case replace value was used in a [RegexRule](../interfaces/RegexRule.md).)_ |

format="htmlTable"

Rather than trying to parse the markdown to html we simply wrap the markdown in an html table. This also means that users will be able to render triple line code blocks in the same way as outside of table cells.

## Type declaration

<table><tr><th>Member</th><th>Type</th><th>Description</th></tr><tr><td>

`len`

</td><td>

`number`

</td><td>

The length of the matched substring.

_(Might be different from the text length in case replace value
was used in a [RegexRule](../interfaces/RegexRule.md).)_

</td></tr></table>

I would be interested to know your thoughts?

tgreyuk avatar May 16 '24 21:05 tgreyuk

I love this idea! It will help keeping everything readable as well.

One note though: with format="table", beware of repeating #331 - I see an extra space in the example above.

KillyMXI avatar May 16 '24 21:05 KillyMXI

As discussed, htmlTable key now available to options in [email protected].

tgreyuk avatar Jun 22 '24 11:06 tgreyuk

Thank you. Trying it now.

First thing I tripped on: indexFormat only supports list and table. And table produces broken result.

So I'm limited to either list and zero information or table and cut the documentation from my code to conform the the limitations of table.

What I wish is that it would be possible to have another option for index - get the first paragraph, trim all line breaks in it. So I can have the right balance of the level of details.

KillyMXI avatar Jun 22 '24 14:06 KillyMXI

Where htmlTable is supported - it seems to work wonderfully.

KillyMXI avatar Jun 22 '24 14:06 KillyMXI

First thing I tripped on: indexFormat only supports list and table. And table produces broken result.

Will update so also supported for index.

What I wish is that it would be possible to have another option for index - get the first paragraph, trim all line breaks in it. So I can have the right balance of the level of details.

Happy to look at this. Can you provide a specific example so I don't implement the wrong thing?

tgreyuk avatar Jun 22 '24 15:06 tgreyuk

These two were mentioned in the issue above:

/**
 * Lexing rule.
 *
 * Base rule looks for exact match by it's name.
 *
 * If the name and the lookup string have to be different
 * then specify `str` property as defined in {@link StringRule}.
 */
export interface Rule {
  // ...
}

Desirable index description: Lexing rule.

/**
 * String rule - looks for exact string match that
 * can be different from the name of the rule.
 */
export interface StringRule extends Rule {
  // ...
}

Desirable index description: String rule - looks for exact string match that can be different from the name of the rule.

This is another one:

/**
 * Non-empty array of rules.
 *
 * Rules are processed in provided order, first match is taken.
 *
 * Rules can have the same name. For example, you can have
 * separate rules for various keywords and use the same name "keyword".
 */
export type Rules = [
  //...
];

Desirable index description: Non-empty array of rules.

Previously mentioned issue of missing description is now fixed, that's nice.

KillyMXI avatar Jun 22 '24 15:06 KillyMXI

hmm Is htmlTable for indexFormat included in 4.1.1? If I try to set it - it doesn't fail like with unknown value, but it outputs a list instead.

KillyMXI avatar Jul 02 '24 16:07 KillyMXI

Sorry - the option was exposed but the logic wasn't merged into the release. It will go into 4.1.2.

tgreyuk avatar Jul 02 '24 16:07 tgreyuk

All issues in this ticket should be fixed in [email protected]

tgreyuk avatar Jul 05 '24 13:07 tgreyuk

sigh Something is not right.

What I currently see in index for Rules with "indexFormat": "htmlTable", for example:

<tr>
<td>

[Rules](type-aliases/Rules.md)

</td>
<td>

Non-empty array of rules.  Rules are processed in provided order, first match is taken.  Rules can have the same name. For example, you can have separate rules for various keywords and use the same name "keyword".

</td>
</tr>

Expectation for htmlTable - markdown formatting left unchanged.

Currently, it hurts readability of both raw and processed output with "indexFormat": "htmlTable", and the whole point of introducing HTML tables was to preserve formatting both in raw and processed Markdown.

There might also be some misunderstanding - I was expecting at least a comment after https://github.com/typedoc2md/typedoc-plugin-markdown/issues/618#issuecomment-2184070450 , and it should be attributed to "...Format": "table" or a separate format. (Because such reduced output should be generally usable with Markdown tables without breaking them.)

KillyMXI avatar Jul 05 '24 14:07 KillyMXI

I have set up a demo repo with this issue:

https://github.com/typedoc2md/typedoc-plugin-markdown-scratchpad/tree/main/issues/618

As you can see this output matches your highlighted comments.

Screenshot 2024-07-05 at 15 32 35

Desirable index description: Lexing rule. Desirable index description: String rule - looks for exact string match that can be different from the name of the rule. Desirable index description: Non-empty array of rules.

I have just discovered that your observed behaviour is confined to Windows OS - i'll have to fix it with next release (sorry).

If you think I have misunderstood anything else let me know.

tgreyuk avatar Jul 05 '24 14:07 tgreyuk

I need to clarify once again, that the combination of my desirable behavior and htmlTable in itself is rather pointless and can come in conflict with someone's use cases.

htmlTable is safe for any formatting and thus might be beneficial to keep it unaltered. table is not safe and can be made more safe by cutting content. Cutting down content might also be considered a separate concern, separate option from output format.

Thus, I see the content of the demo repo as a result of miscommunication. I should've made more clear that the desirable index description is not linked to html.

KillyMXI avatar Jul 05 '24 14:07 KillyMXI

Yes i understand. htmlTable should display the complete summary (without the block tags) in the table cell?

tgreyuk avatar Jul 05 '24 15:07 tgreyuk

You have also alluded to providing a separate option of what should be displayed in the table column for indexes which might also be a good idea. (First para or complete summary etc).

tgreyuk avatar Jul 05 '24 15:07 tgreyuk

Expectation for htmlTable: to behave the same everywhere (being a safe way to show all available markdown content):

<tr>
<td>

[Rules](type-aliases/Rules.md)

</td>
<td>

Non-empty array of rules.

Rules are processed in provided order, first match is taken.

Rules can have the same name. For example, you can have
separate rules for various keywords and use the same name "keyword".

</td>
</tr>

Desirable separate option: index table with the description that is cut down to first paragraph. There are multiple ways to achieve it, depending on what implementation is more convenient:

  • separate output format, abbreviatedTable or something like that - expected to be a markdown table, since that's more readable for limited content;
  • or separate flag, abbreviatedIndex or something like that - cuts the description for index regardless of output format - most flexible, in case someone needs a table-breaking syntax in the first paragraph;
  • or maybe some other way. (Actually most flexible option would be independent configuration of output format and abbreviation for each of targets - I might've gone for it if I were implementing it myself, but that's beyond my pragmatic needs on my current projects)

KillyMXI avatar Jul 05 '24 15:07 KillyMXI

thank you. I think for now (and for simplicity):

  • "table" - will display the first paragraph of the summary in a markdown table (with the windows bug fixed).
  • "htmlTable" - will display the full markdown content in an html table.

Any other configurations i'll wait to see if there is any community feedback / requests.

tgreyuk avatar Jul 05 '24 17:07 tgreyuk

[email protected] will now work as above.

See demo repo - https://github.com/typedoc2md/typedoc-plugin-markdown-scratchpad/tree/main/issues/618.

tgreyuk avatar Jul 10 '24 22:07 tgreyuk

Nice.

But now the bug of missing description in the index is back. Now with functions though.

/**
 * Create a lexer function.
 *
 * @param rules - Non-empty array of lexing rules.
 *
 * Rules are processed in provided order, first match is taken.
 *
 * Rules can have the same name - you can have separate rules
 * for keywords and use the same name "keyword" for example.
 *
 * @param options - Lexer options object.
 */
export function createLexer (
  rules: Rules,
  options?: Options
): Lexer;

I got hyphen both with table and htmlTable:


## Functions

| Function | Description |
| ------ | ------ |
| [createLexer](functions/createLexer.md) | - |


## Functions

<table>
<tr>
<th>Function</th>
<th>Description</th>
</tr>
<tr>
<td>

[createLexer](functions/createLexer.md)

</td>
<td>

&hyphen;

</td>
</tr>
</table>

Expected description: Create a lexer function.

KillyMXI avatar Jul 10 '24 23:07 KillyMXI

[email protected] . fingers crossed :)

https://github.com/typedoc2md/typedoc-plugin-markdown-scratchpad/tree/main/issues/618

tgreyuk avatar Jul 11 '24 20:07 tgreyuk

Yay! Thank you!

All seems to work, no more broken tables, and I can have perfect index.

KillyMXI avatar Jul 11 '24 22:07 KillyMXI