WIP: Convert 2.0 extensions markdown to Asciidoctor and add Makefile

Open oddhack opened this issue 3 years ago • 9 comments

First version for WG review.

This is the initial stage of conversion. It simply converts each extensions/2.0/{Vendor,Khronos}/*/README.md to a corresponding <extension_name>.adoc. As with the glTF 2.0 spec, this is somewhat readable in the Github viewer but missing important features like math rendering and include::s. The intention is that we generate HTML outputs that are published in the registry as with the 2.0 spec.
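As a rough illustration of that rename rule (a sketch only: the helper name `adoc_target` is hypothetical and is not part of the Makefile in this PR), the mapping from each README.md to its .adoc file can be expressed as:

```python
from pathlib import PurePosixPath


def adoc_target(readme_path: str) -> str:
    """Map extensions/2.0/<group>/<extension_name>/README.md to the
    <extension_name>.adoc file in the same directory."""
    path = PurePosixPath(readme_path)
    if path.name != "README.md":
        raise ValueError(f"not an extension README: {readme_path}")
    # The directory that holds the README is named after the extension.
    extension_name = path.parent.name
    return str(path.with_name(f"{extension_name}.adoc"))


print(adoc_target("extensions/2.0/Khronos/KHR_xmp_json_ld/README.md"))
```

The extension directory name is assumed to be the extension name, which holds for the layout described above.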

To generate HTML outputs in extensions/2.0/out:

cd extensions/2.0 ; make

Status notes / items for WG feedback before merging and publishing to the glTF registry:

  • The extension markup has been renamed from README.md -> <extension name>.adoc.
  • This does not yet use wetzel to convert the schemas into asciidoctor that can be inlined, and the links from the HTML files to the schemas aren't working (they would have to be converted into links into the github repository since they'll be linking from the registry when published). This is the big thing to fix before switching over. What I'm having a little trouble with, given my almost nonexistent knowledge of wetzel:
    • How to deal with the schema/ directories that have multiple .json schema files - should I run wetzel on all of them at the same time (if that works), or separately? If run on separate files we'll necessarily end up with multiple property reference / JSON schema includes, which leads to:
    • Whether the generated wetzel output will naturally drop in as a replacement for the handcoded markup. At present the handcoded markup appears somewhat split across the extension source, while in the glTF 2.0 spec we just have a "Properties" chapter and a "JSON Schema" appendix, each of which fold in the entire content of all the schemas used in the document.
  • I did some minor consistency cleanup - using the same titles for Notes sections in all extensions, that sort of thing. In general this shouldn't be semantically meaningful or objectionable, but two comments:
    • In the process I noticed that some "Implementation Note" sections don't actually say that they're non-normative, and probably should. That isn't something I'd touch without WG signoff but wanted to raise awareness.
    • Some of the links from "References" sections no longer render the explicit URL of the link, just the link text - some already did this even in the Markdown form. To be clear it's the difference between Khronos and https://www.khronos.org/.
  • Source .adoc files for Khronos extensions have been moved to the same CC-BY-4.0 license as the glTF 2.0 specification, while generated HTML outputs to be published by Khronos remain under the restrictive Khronos Spec Copyright. This matches the license approach for the core glTF 2.0 spec. Vendor extensions generally do not have explicit copyright statements (which vendors can fix, but they should remain consistent with CC-BY-4.0 if possible).
  • The latexmath: markup could use more eyes to make sure everything translated over correctly and is consistent with the non-latexmath inline math markup. It was fairly mechanical but errors may have crept in.
  • I've added a new README.md in each converted extension directory, which just contains a link to the glTF Registry on khronos.org and explains the change in markup.

Once I get some feedback on the wetzel plan I'll incorporate that. In order to minimize tracking issues it would be good to merge this first stage fairly quickly afterwards. The next stage of combining the extensions into the 2.0 spec can be discussed with more clarity at that point.

oddhack avatar Jan 04 '22 14:01 oddhack

it makes more sense to have them named e.g. KHR_xmp_json_ld.adoc

Yes and I think we should replace the contents of all README.md files with a short standard boilerplate describing the new layout to avoid 404 on existing direct GitHub links.
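A minimal sketch of what generating such a boilerplate could look like. The wording of the stub, the helper name `readme_boilerplate`, and the registry URL are all placeholders for illustration, not agreed text:

```python
def readme_boilerplate(extension_name: str, registry_url: str) -> str:
    """Return a short Markdown stub for a converted extension directory."""
    return (
        f"# {extension_name}\n\n"
        f"The specification for this extension is now maintained in "
        f"`{extension_name}.adoc` in this directory.\n\n"
        f"The rendered HTML version is published in the glTF Registry: "
        f"{registry_url}\n"
    )


# Placeholder URL for illustration only; not the real registry location.
print(readme_boilerplate("KHR_xmp_json_ld", "https://registry.example/KHR_xmp_json_ld.html"))
```

Keeping a README.md with this content in every extension directory means existing direct GitHub links keep resolving.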

How to deal with the schema/ directories that have multiple .json schema files

wetzel should be able to take *.schema.json as input and generate a single reference section composed of all schemas.

Whether the generated wetzel output will naturally drop in as a replacement for the handcoded markup.

Yes, we should reformat the extension specs to be similar to the main spec: with a dedicated autogenerated normative Property Reference section and non-normative JSON-Schema appendix composed of the schema sources.

Some of the links from "References" sections no longer render the explicit URL of the link

That's fine for now. Before publishing to the registry, we should normalize all such references, probably to follow the style of the main spec.

make the source markup CC-BY 4.0

Agreed.

lexaknyazev avatar Jan 16 '22 16:01 lexaknyazev

Jon, thanks for taking this on, this is a great start to a large effort.

it makes more sense to have them named e.g. KHR_xmp_json_ld.adoc

Yes and I think we should replace the contents of all README.md files with a short standard boilerplate describing the new layout to avoid 404 on existing direct GitHub links.

Agreed with both.

  • How to deal with the schema/ directories that have multiple .json schema files

Certainly wetzel could be upgraded to make this process easier. I'm not sure I can promise a timeframe for wetzel upgrades given other priorities at the moment, but it's something I would like to do given time to work on it.

  • Whether the generated wetzel output will naturally drop in as a replacement for the handcoded markup.

It will not. This was by far the most time-consuming part of my contributions to the main glTF ISO spec, more time-consuming than upgrading wetzel itself to generate AsciiDoc-flavored output. But that was a huge, monolithic set of diffs for me to work through, whereas in this effort, each extension can be considered in isolation, which could break the problem into smaller pieces.

As a first step, perhaps we could hook up wetzel to produce its own reference section alongside whatever reference the handwritten part of the extension offers, and then folks from the WG can compare and trim down from there?

emackey avatar Jan 17 '22 16:01 emackey

Renamed extension markup from README.adoc -> extension_name.adoc

oddhack avatar Jan 20 '22 16:01 oddhack

I removed the extension flag, as currently it is used for current and new extensions - but not on the documentation level.

UX3D-nopper avatar Feb 16 '22 12:02 UX3D-nopper

Took me a while to create this filter: https://github.com/KhronosGroup/glTF/labels/extension

UX3D-nopper avatar Feb 16 '22 12:02 UX3D-nopper

I just saw the activity here, and noticed some points that are related to wetzel. I recently wanted to apply wetzel (in a completely different context), and stumbled over similar questions. I don't have an overview of what shape the property references for the extensions are currently in, or details about what they should look like, but a few comments:


This does not yet use wetzel to convert the schemas into asciidoctor that can be inlined, and the links from the HTML files to the schemas aren't working (they would have to be converted into links into the github repository since they'll be linking from the registry when published).

One question that I wondered about (and that may be what you referred to here):

  • On the GitHub preview, the links from the ADOC files should probably point into the GitHub repository. This makes sense, to allow people to do live-browsing
  • In the registry that is hosted on the Khronos website, these links should probably not point to GitHub. The hosted HTML version should be standalone and frozen (without external references to schema files that may become invalid).

One possible solution for that could be conditionals, roughly like

The schema is at 
ifdef::env-github[]
https://github.example.com/example.schema.json
endif::[]
ifndef::env-github[]
https://khronos.registry.example.com/example.schema.json
endif::[]

This would require a few tweaks and configuration options for wetzel, but might be doable.


How to deal with the schema/ directories that have multiple .json schema files - should I run wetzel on all of them at the same time (if that works), or separately? If run on separate files we'll necc. end up with multiple property reference / JSON schema includes, which leads to: Whether the generated wetzel output will naturally drop in as a replacement for the handcoded markup. At present the handcoded markup appears somewhat split across the extension source, while in the glTF 2.0 spec we just have a "Properties" chapter and a "JSON Schema" appendix, each of which fold in the entire content of all the schemas used in the document.

The question of multiple input files/paths also turned out to be non-trivial. For the future of the extensions properties reference, one could make assumptions (or rather, define the common shape for them):

  • The property reference for each extension should probably be "self-contained", and only contain the part that really is defined by the extension (even though it may refer to other extensions, or the core spec). One could consider linking to the glTF property reference (e.g. when an extension says "This thing is a glTFProperty"), but this may have some caveats.
  • The property reference should probably not be "scattered" across the main extension README. Instead, it should always be a dedicated PropertiesReference.adoc document, or at least one contiguous section at the end of the main README.

There is a refactored state of wetzel at https://github.com/CesiumGS/wetzel/tree/generate-3dtiles that offers a few options that I found missing in the current main state. Some information about how to use this state is at https://github.com/CesiumGS/3d-tiles/blob/draft-1.1/specification/BUILDING.md#generating-the-properties-reference , but ... this is mainly an example for now...

javagl avatar Aug 02 '22 16:08 javagl

Thanks for the feedback!

The github preview is not going to be fully functional - in particular, github's asciidoctor renderer does not support 'include::' directives. For the links, it's probably easiest to set an asciidoctor attribute with an appropriate URL prefix globally and refer to that within links, rather than conditionalizing each one. Since include:: isn't supported, there is no reasonable way to embed wetzel output in the github preview.
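Roughly, the global-attribute approach could look like the sketch below. The attribute name `schema-base` and the helper `asciidoctor_command` are assumptions for illustration and are not what the Makefile in this PR does; `-a` (set a document attribute) and `-D` (destination directory) are standard asciidoctor options.

```python
import shlex
from typing import List


def asciidoctor_command(source: str, out_dir: str, schema_base: str) -> List[str]:
    """Build an asciidoctor invocation that defines a global URL prefix.

    Inside the .adoc source, schema links would then be written as
    link:{schema-base}/<file>.schema.json[...], so only this one
    attribute changes between build targets.
    """
    return [
        "asciidoctor",
        "-a", f"schema-base={schema_base}",  # attribute name is an assumption
        "-D", out_dir,                        # where the generated HTML goes
        source,
    ]


cmd = asciidoctor_command(
    "extensions/2.0/Khronos/KHR_xmp_json_ld/KHR_xmp_json_ld.adoc",
    "extensions/2.0/out",
    "https://khronos.registry.example.com",  # example prefix from earlier in the thread
)
print(shlex.join(cmd))
```

The command is only printed here, not executed, so the sketch runs without asciidoctor installed.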

I started trying to sort out the wetzel situation in January but wasn't getting enough feedback. I have a bit more time to put into this now and have just brought the PR branch up to date with updates and new extensions in main, so hoping to progress this and get it merged soon. I see there's quite a bit of extension PR activity that will also need to be converted once we have an acceptable solution; fortunately the markup conversions only take a few minutes in most cases.

oddhack avatar Aug 03 '22 01:08 oddhack

Again, I'll need to get an overview of the current state of the extension specs, but a few more thoughts:

For the links, it's probably easiest to set an asciidoctor attribute with an appropriate URL prefix globally and refer to that within links,

I tried that in the other repo (3D Tiles), but it always glitched out in one way or another. What I found to be distressingly non-working was cross-linking between ADOC files that are in the same directory. The workaround that is described in the build instructions right now looks like an 👀 odd hack 👀 , and I'm curious if there is a better solution...

Regarding the include: IIRC, most of the specifications for extensions are relatively small and do not have a deeply nested structure. So the lack of include might not be sooo critical (at least, I think that there are hardly any cases where one "top-level" file is supposed to just be a list of include statements for chapter[1...n].adoc).

The common case where an include might be useful would be the include for the PropertiesReference.adoc and JsonSchemaReference.adoc that are supposed to be generated by wetzel, and that could be inlined into the main README. But leaving them as standalone ADOC files should be OK for the GitHub preview. In the final registry HTML, they can easily be inlined.

I see there's quite a bit of extension PR activity that will also need to be converted once we have an acceptable solution; fortunately the markup conversions only take a few minutes in most cases.

For the first pass of the 3D Tiles conversion, I used kramdoc (as described in an earlier version of the build instructions). I assume that you're also doing the bulk of the conversion with this or a similar tool...? (Some manual tweaks may be necessary, but it's at least not a tedious, repetitive process...)

javagl avatar Aug 03 '22 13:08 javagl

I tried that in the other repo (3D Tiles), but it always glitched out in one way or another. What I found to be distressingly non-working was cross-linking between ADOC files that are in the same directory. The workaround that is described in the build instructions right now looks like an 👀 odd hack 👀 , and I'm curious if there is a better solution...

Not sure about better, but easiest is to use the same directory structure for the extension markup and the rendered HTML in the registry. I threw all the generated HTML from asciidoctor into a single output directory thus far in this MR.

The common case where an include might be useful would be the include for the PropertiesReference.adoc and JsonSchemaReference.adoc that are supposed to be generated by wetzel, and that could be inlined into the main README. But leaving them as standalone ADOC files should be OK for the GitHub preview. In the final registry HTML, they can easily be inlined.

Sounds good. I'm not sure how much utility the github preview has if the WG commits to publishing generated HTML docs. I don't have an investment in the answer but it simplifies matters if the generated HTML is the sole source of truth. I mean, sure, people can attempt to view the ADOC markup of the Vulkan spec in github, but we don't expect anyone to make sense out of that. It's possible for these extensions because they're so short and simple in structure, so they fall within the scope of what the github renderer can do (which isn't much).

For the first pass of the 3D Tiles conversion, I used kramdoc (as described in an earlier version of the build instructions). I assume that you're also doing the bulk of the conversion with this or a similar tool...? (Some manual tweaks may be necessary, but it's at least not a tedious, repetitive process...)

A few editor macros for links + minor cleanup. But if kramdoc works then great - I don't know how good a job it does, and my prior experience with pandoc was that it introduced a large amount of noise and excess formatting markup that annoyed me more than whatever time savings it may have produced.

My thought is that after this PR is accepted, there should be a process to ensure MD -> ADOC conversion when an extension PR is accepted, and a template to drive new extension PRs to ADOC from the beginning.

oddhack avatar Aug 03 '22 15:08 oddhack