git-scm.com icon indicating copy to clipboard operation
git-scm.com copied to clipboard

New manpage format

Open jnavila opened this issue 1 year ago β€’ 24 comments

Changes

This PR changes the way the asciidoc source of manpage is processed, by adding the "synopsis" paragraph style and reworking the backtick format.

Context

The style change has been pushed to master and will be applied to git-clone and git-init in the next version.

jnavila avatar Nov 17 '24 21:11 jnavila

I just triggered a pair of workflow runs to update the manual pages and to update the translated manual pages, fetched the result and rendered it locally. Here are two examples:

language before after
English image image
French image image

Personally, I cannot spot any difference, apart from the version number (because this here PR branch is based on v2.46.2 while the updated manual pages include v2.47.0) and the incorrect =<regexp> on the "before" side of the French version (fixed on the "after" side).

Even looking at the HTML of the synopses (taking the French version, so that there is a known difference), I only see this:

diff --git a/before b/after
index 1a87d1348..6185fb72b 100644
--- a/before
+++ b/after
@@ -1,5 +1,5 @@
 <pre class="content"><em>git config list</em> [&lt;option-de-fichier&gt;] [&lt;option-d-affichage&gt;] [--includes]
-<em>git config get</em> [&lt;option-de-fichier&gt;] [&lt;option-d-affichage&gt;] [--includes] [--all] [--regexp=&lt;regexp&gt;] [--value=&lt;valeur&gt;] [--fixed-value] [--default=&lt;default&gt;] &lt;nom&gt;
+<em>git config get</em> [&lt;option-de-fichier&gt;] [&lt;option-d-affichage&gt;] [--includes] [--all] [--regexp] [--value=&lt;valeur&gt;] [--fixed-value] [--default=&lt;default&gt;] &lt;nom&gt;
 <em>git config set</em> [&lt;option-de-fichier&gt;] [--type=&lt;type&gt;] [--all] [--value=&lt;valeur&gt;] [--fixed-value] &lt;nom&gt; &lt;valeur&gt;
 <em>git config unset</em> [&lt;option-de-fichier&gt;] [--all] [--value=&lt;valeur&gt;] [--fixed-value] &lt;nom&gt; &lt;valeur&gt;
 <em>git config rename-section</em> [&lt;option-de-fichier&gt;] &lt;ancien-name&gt; &lt;nouveau-name&gt;

@jnavila what am I missing?

dscho avatar Nov 18 '24 18:11 dscho

The manpage of git-config has not been converted yet. I pushed a branch "test-refactor" on git-html-l10n, where I hand-edited fr/git-add.txt.

After importing, here is the result:

image

I'm not satisfied with the styles, particularly when dealing with inline formats:

image

you can test by yourself locally, and tell me your judgment.

jnavila avatar Nov 18 '24 21:11 jnavila

@dscho I updated the CSS, so it is ready for review.

jnavila avatar Dec 22 '24 18:12 jnavila

These regexes are basically the same ones that I already pushed to git/git. They are applied during the conversion phase, not in live, and only on the quoted strings of text, after the initial asciidoc parsing has been performed.

Anyway, as they are cryptic, I can try to convert the code to a parser, but I doubt this will be a lot clearer.

jnavila avatar Jan 04 '25 21:01 jnavila

Writing a parser turns out to be more involving than expected, mainly because the new parser must be resilient to formatting mistakes in the translations (particularly in Chinese) and manpages that were not converted to the new format (where the backticks are used for widely varying strings).

jnavila avatar Jan 25 '25 18:01 jnavila

A parser would at least document a lot more cohesively what is being done.

As they are, I know what the regular expressions do, but only after studying them extensively, an effort that I would most likely have to repeat were I in the need to fix any bugs in that code in the future. I am extremely uncomfortable with that, source code should be as obvious as possible, and this is not it. As such, I will not review this any further and not merge it. I won't oppose anybody else merging it, but I want nothing to do with the added code, myself.

dscho avatar Jan 26 '25 12:01 dscho

I've pushed an updated version with a parser. In case the parser fails, we just roll back to the basis code format.

jnavila avatar Jan 27 '25 20:01 jnavila

@dscho Is this version better for you?

jnavila avatar Feb 07 '25 18:02 jnavila

@jnavila I am sorry, I had not planned any involvement in this anymore, due to time constraints.

dscho avatar Feb 07 '25 18:02 dscho

Ah cool, I just realize you're trying to fix https://github.com/git/git-scm.com/issues/1972 here.

To1ne avatar Mar 12 '25 09:03 To1ne

@jnavila I've noticed we have a synopsis processor at https://github.com/git/git/blob/master/Documentation/asciidoctor-extensions.rb.in. Why did you chose to write it yourself? I rather not maintain 2 versions.

I've used that extension locally (without the postprocessor) and this is how the synopsis on git-clone looks:

image

Or on git-diff-index:

image

I feel it's very much a "drop-in" extension we can/should use.

To1ne avatar Mar 12 '25 09:03 To1ne

@jnavila I've noticed we have a synopsis processor at https://github.com/git/git/blob/master/Documentation/asciidoctor-extensions.rb.in. Why did you chose to write it yourself? I rather not maintain 2 versions.

In fact, I wrote both of the extensions.. :laughing: . The rewrite here is a request from @dscho which is very convincing to me. The synopsis is a language in itself, so it is more comprehensive to have a dedicated parser for this language instead of a bunch of repelling regexps.

This rewrite came after I proposed the regexp version for inclusion in git/git . Now, I'm contemplating reverting the git/git version to the parser flavor, and at the same time removing this ugly pre-processor hack by a38edab.

jnavila avatar Mar 12 '25 13:03 jnavila

In fact, I wrote both of the extensions.. πŸ˜† . The rewrite here is a request from @dscho which is very convincing to me. The synopsis is a language in itself, so it is more comprehensive to have a dedicated parser for this language instead of a bunch of repelling regexps.

@jnavila I'm sorry for dropping in completely ignorant!

This rewrite came after I proposed the regexp version for inclusion in git/git . Now, I'm contemplating reverting the git/git version to the parser flavor

I haven't looked at it in detail yet, but at first sight that makes sense.

and at the same time removing this ugly pre-processor hack by a38edab.

Do you mean https://github.com/git/git/commit/a38edab7c88b5503bb2b5f5cbd49f6b97e9a6a4e ? If you mean the hack to fill in @GIT_VERSION@, I agree it would be nice to get rid of it. Although it's probably not directly related to the implementation details of the parser.

That said, can we deduplicate having a synopsis parser here and one in git/git.git ?

To1ne avatar Mar 12 '25 15:03 To1ne

After importing, here is the result: ... I'm not satisfied with the styles

@jnavila Without thinking about how to achieve this, how do you prefer the end result will look like? In a quick mockup I've ended up with the following:

image

Is that something you'd like?

To1ne avatar Mar 13 '25 06:03 To1ne

That would be great πŸ˜ƒ. The parser cannot make a difference between the "git commit" part and the options, so they would have the same style ( bold red… for instance)

jnavila avatar Mar 13 '25 10:03 jnavila

The parser cannot make a difference between the "git commit" part and the options

Couldn't the parser learn that Git commands start with git and then continue with white-space followed by a command-name that matches [a-z][-0-9a-z]* (yes, git p4 contains digits)? That should make it possible to discern between commands and options quite nicely.

dscho avatar Mar 13 '25 12:03 dscho

@jnavila I see there's more work being done in this direction: https://lore.kernel.org/git/[email protected]/. I'm sorry for my lazy question, but what is the status of this PR? What do you/we need to drive it forward?

To1ne avatar May 05 '25 10:05 To1ne

@To1ne In its current state, I think the PR is already able to correctly parse and format all the elements in back-quotes. There was a request to be able to format more specifically the "git <command>" part, but I didn't find time to work on it.

We may merge this PR and open another issue with the requested enhancement.

jnavila avatar May 07 '25 08:05 jnavila

I should rebase and address the comments from @dscho though.

jnavila avatar May 07 '25 08:05 jnavila

In its current state, I think the PR is already able to correctly parse and format all the elements in back-quotes.

For your entertainment deployed to https://to1ne.github.io/git-scm.com/docs/git-clone

To1ne avatar May 07 '25 15:05 To1ne

That said, can we deduplicate having a synopsis parser here and one in git/git.git ?

This can be pushed also to git/git, but only for asciidoctor. For asciidoc, we are stuck with regex.

jnavila avatar May 07 '25 21:05 jnavila

@jnavila @To1ne what's the status on this? Do we need a fix for #2002 before merging this (preemptively addressing the <placeholder> tag)?

dscho avatar May 13 '25 12:05 dscho

The PR was reworked according to your remarks. I won't have much time to work on adding the code to deal with "git foo" formatting. Another span tag is added, so #2002 really needs to be fixed before.

jnavila avatar May 14 '25 09:05 jnavila

Another span tag is added, so #2002 really needs to be fixed before.

@To1ne do you think you'd have time to work on that? I won't have time for at least another week... 😦

dscho avatar May 14 '25 09:05 dscho

@jnavila you probably want to rebase to gh-pages, force-push and then run the update-git-version-and-manual-pages workflow and then the update-translated-manual-pages workflow, both times forcing a full rebuild by checking that checkbox.

dscho avatar Jul 02 '25 18:07 dscho

@jnavila FYI I canceled the gh-pages run (because it would not have picked up the changes from this here PR) and triggered a new, new_manpage_format run.

dscho avatar Jul 04 '25 12:07 dscho

Ouch.

dscho avatar Jul 04 '25 12:07 dscho

Now the run I triggered will be in vain.

dscho avatar Jul 04 '25 12:07 dscho

OK Obviously, I didn't understand what your comment meant. What was I suppose to do?

jnavila avatar Jul 04 '25 12:07 jnavila

@jnavila FYI I canceled the run that would have updated the now-deleted PR branch and triggered a new gh-pages one.

Next time, let's do let the workflow make the updates before the PR is merged. That way, we can verify (and validate) the results before integrating them into the default branch.

dscho avatar Jul 04 '25 12:07 dscho