Support for Relax NG/Schematron

Open adunning opened this issue 5 years ago • 16 comments

It would be absolutely brilliant if this project could reimplement the functionality of the Atom package at https://github.com/aerhard/linter-autocomplete-jing to provide validation and autocomplete for XML using Relax NG and Schematron. (This package uses Jing, which the Oxygen XML editor also uses for validation.) These schema languages are widely used by the Text Encoding Initiative and other complex encoding applications.

adunning avatar Nov 22 '18 14:11 adunning

Can probably be implemented as an extension. However, the team is stretched pretty thin at the moment so it'd be faster if someone provided a PR for that.

@adunning could you add a set of sample xml + schemas so we get an idea of how it works?

fbricon avatar Nov 22 '18 15:11 fbricon

@adunning if you could give us a sample with XML + Schematron, plus Java code that validates this XML file with Jing, it would help us a lot in adding Schematron support as an extension.

angelozerr avatar Nov 22 '18 15:11 angelozerr

Thank you! It's usually a matter of supporting <?xml-model> to associate a file with a schema. For example, a basic DocBook 5.1 file:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://docbook.org/xml/5.1/rng/docbook.rng" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="http://docbook.org/xml/5.1/sch/docbook.sch" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<article xmlns="http://docbook.org/ns/docbook"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="5.1">
    <info>
        <title>Test file</title>
    </info>
    <sect1>
        <title>Test</title>
        <para>Sample text</para>
    </sect1>
</article>

In this case the Relax NG and Schematron schemata are in two separate files, but Schematron rules can also be embedded within an RNG schema, such as the example at https://github.com/DigitalLatin/caesar-balex/blob/master/balex-auto.xml (I can come up with many others).

The Atom package that validates such files mostly runs off a Java implementation at https://github.com/aerhard/xml-tools. It also provides a good set of test files at https://github.com/aerhard/linter-autocomplete-jing/tree/master/spec/validation/xml.

There is also a newer Java library at https://github.com/phax/ph-schematron/ that might be useful for Schematron support.
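
To make the association mechanism concrete, here is a minimal sketch of reading the <?xml-model?> declarations from a document prolog using only the JDK's DOM parser. The class name XmlModelReader and its nested XmlModel type are hypothetical and are not part of lemminx or Jing; the sketch only extracts the href and schematypens pseudo-attributes and does not fetch or validate anything.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// Hypothetical helper (not lemminx API): lists the schemas declared via <?xml-model?>.
class XmlModelReader {

    /** One association: schema location plus schema language namespace (may be null). */
    static final class XmlModel {
        final String href;
        final String schematypens;

        XmlModel(String href, String schematypens) {
            this.href = href;
            this.schematypens = schematypens;
        }
    }

    // Pseudo-attributes inside the processing instruction: name="value" or name='value'.
    private static final Pattern PSEUDO_ATTRIBUTE =
            Pattern.compile("([A-Za-z_][\\w.-]*)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    static List<XmlModel> read(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        List<XmlModel> models = new ArrayList<>();
        // xml-model declarations belong to the prolog, so stop at the root element.
        for (Node node = document.getFirstChild(); node != null; node = node.getNextSibling()) {
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                break;
            }
            if (node.getNodeType() != Node.PROCESSING_INSTRUCTION_NODE) {
                continue;
            }
            ProcessingInstruction pi = (ProcessingInstruction) node;
            if (!"xml-model".equals(pi.getTarget())) {
                continue;
            }
            String href = null;
            String schematypens = null;
            Matcher matcher = PSEUDO_ATTRIBUTE.matcher(pi.getData());
            while (matcher.find()) {
                String value = matcher.group(2) != null ? matcher.group(2) : matcher.group(3);
                if ("href".equals(matcher.group(1))) {
                    href = value;
                } else if ("schematypens".equals(matcher.group(1))) {
                    schematypens = value;
                }
            }
            if (href != null) {
                models.add(new XmlModel(href, schematypens));
            }
        }
        return models;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<?xml-model href=\"http://docbook.org/xml/5.1/rng/docbook.rng\""
                + " schematypens=\"http://relaxng.org/ns/structure/1.0\"?>\n"
                + "<?xml-model href=\"http://docbook.org/xml/5.1/sch/docbook.sch\""
                + " schematypens=\"http://purl.oclc.org/dsdl/schematron\"?>\n"
                + "<article xmlns=\"http://docbook.org/ns/docbook\" version=\"5.1\"/>";
        for (XmlModel model : read(xml)) {
            System.out.println(model.schematypens + " -> " + model.href);
        }
    }
}
```

A validator could then pick an engine per entry based on schematypens, which is how the two declarations in the DocBook example above would be told apart.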

adunning avatar Nov 22 '18 16:11 adunning

Thanks @adunning! Is there any chance that you could contribute to this Schematron support? I could help you if you wish to set up the Schematron extension.

angelozerr avatar Nov 22 '18 16:11 angelozerr

Sorry for the delayed response! That's an extremely kind offer; looking at the code, I don't think that I have sufficient skills with Java to write the entire thing, but I am certainly happy to test and help in any other way I can.

adunning avatar Nov 30 '18 20:11 adunning

After looking briefly into this, there are really two features here:

  • Allowing <?xml-model … ?> to specify schemas (of any type, not just RelaxNG) by adding a new CMPrologContentModelProvider to PrologPlugin.
  • Modifying XMLValidator to use SchemaFactory (example here) to allow additional schema language support to be discovered dynamically. It also allows binding multiple schemas to a single document if we refactor a bit.

@angelozerr does that sound right to you? I'm not sure how the latter could be introduced through a plugin instead, because XMLValidator always runs on documents.
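
As a rough illustration of the dynamic discovery mentioned in the second point, the sketch below asks the standard javax.xml.validation.SchemaFactory lookup whether any implementation on the classpath handles a given schema language. The class name SchemaLanguageProbe is hypothetical. On a plain JDK only W3C XML Schema is normally available; whether RELAX NG is reported as supported depends on which libraries are on the classpath, which this sketch does not assume.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

// Hypothetical helper (not lemminx API): checks which schema languages can be resolved.
class SchemaLanguageProbe {

    /** Returns true if some SchemaFactory implementation handles the given language URI. */
    static boolean isSupported(String schemaLanguage) {
        try {
            // newInstance throws IllegalArgumentException when no implementation is found.
            SchemaFactory.newInstance(schemaLanguage);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] languages = {
            XMLConstants.W3C_XML_SCHEMA_NS_URI, // http://www.w3.org/2001/XMLSchema
            XMLConstants.RELAXNG_NS_URI         // http://relaxng.org/ns/structure/1.0
        };
        for (String language : languages) {
            System.out.println(language + " supported: " + isSupported(language));
        }
    }
}
```

The language URIs are the same values that appear in the schematypens pseudo-attribute, so the lookup key could come straight from the document.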

zwaldowski avatar May 20 '19 20:05 zwaldowski

I have never played with SchemaFactory. We should try it, but my fear is that we will not get the same information as the Xerces report handler provides, such as the location and the arguments string array, which is used in some cases to retrieve the element name. As we have a lot of tests, I suggest you try it. Please give me feedback; I will try it when I have time.
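
For reference, here is a minimal sketch of what the javax.xml.validation route exposes to an error handler, using the XSD support built into the JDK. The class name LocatedValidation and the tiny inline schema are made up for illustration. Each SAXParseException carries a line number, a column number and a formatted message, but not a separate arguments array; whether a RELAX NG implementation plugged in the same way reports comparable detail is not something this sketch establishes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper (not lemminx API): validates a document and keeps error locations.
class LocatedValidation {

    static List<String> validate(String schemaLanguage, String schemaText, String xml)
            throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(schemaLanguage);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(schemaText)));
        Validator validator = schema.newValidator();

        final List<String> problems = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add(format("warning", e));
            }

            @Override
            public void error(SAXParseException e) {
                problems.add(format("error", e)); // recoverable: validation continues
            }

            @Override
            public void fatalError(SAXParseException e) {
                problems.add(format("fatal", e));
            }
        });

        try {
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // A fatal error stops validation; it was already recorded by the handler.
        }
        return problems;
    }

    private static String format(String severity, SAXParseException e) {
        return severity + " at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) throws Exception {
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xs:element name=\"note\"><xs:complexType><xs:sequence>"
                + "<xs:element name=\"to\" type=\"xs:string\"/>"
                + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
        String xml = "<note>\n  <from>someone</from>\n</note>";
        for (String problem : validate(XMLConstants.W3C_XML_SCHEMA_NS_URI, xsd, xml)) {
            System.out.println(problem);
        }
    }
}
```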

angelozerr avatar May 20 '19 23:05 angelozerr

This feature would be great. I would even think about spending some money for it, if it could help.

pgundlach avatar Sep 10 '19 06:09 pgundlach

Yes, it would be great, as it would allow a seamless use of TEI in VSCode with the RedHat XML extension.

astronaute-nope avatar Sep 19 '19 20:09 astronaute-nope

Allowing <?xml-model … ?> to specify schemas (of any type, not just RelaxNG) by adding a new CMPrologContentModelProvider to PrologPlugin.

It's now supported for DTD and XSD.

angelozerr avatar Jun 25 '20 19:06 angelozerr

@angelozerr am I right to assume that nobody is working on this at the moment? If so, I'd volunteer to give it a shot.
I would open separate issues for all the steps necessary (according to the list suggested by @angelozerr in redhat-developer/vscode-xml#236).

I did look into the Jing package a bit, and I'm confident I can get RelaxNG validation to work (rather easily, in fact). However, I would be grateful for some help or suggestions on how best to integrate it into Lemminx.

BalduinLandolt avatar Jul 26 '20 15:07 BalduinLandolt

@angelozerr am I right to assume that nobody is working on this at the moment?

I would like to study this issue when I find time, but if you can do it, that would be very nice! Please go ahead and give us feedback.

The first thing to check is whether Jing and its dependencies can be used under the EPL license. I think we will have to open a CQ to use Jing, but if it's not compatible with EPL we cannot use it.

Another important thing is the size of the LemMinX uber jar if we include Jing and its dependencies.

I did look into the Jing package a bit, and I'm confident I can get RelaxNG validation to work (rather easily, in fact).

Great!

However, I would be grateful for some help or suggestions on how best to integrate it into Lemminx.

I think you should start by handling validation with RelaxNG. The idea is to create a class XMLModelRelaxNGValidator and call it in https://github.com/eclipse/lemminx/blob/b1407bb52ce3ff6cd350a98b9c5aa8ea3392df0e/org.eclipse.lemminx/src/main/java/org/eclipse/lemminx/extensions/xerces/xmlmodel/XMLModelHandler.java#L123

I don't know if it will be possible with Jing. It needs investigating... Good luck!

angelozerr avatar Jul 26 '20 22:07 angelozerr

I would like to study this issue when I find time, but if you can do it, that would be very nice! Please go ahead and give us feedback.

Ok, I'll keep you updated. :)

The first thing to check is whether Jing and its dependencies can be used under the EPL license. I think we will have to open a CQ to use Jing, but if it's not compatible with EPL we cannot use it.

Honestly, if I could leave the license question to somebody else, I would feel more at ease... It would be embarrassing to mess that up...

Another important thing is the size of the LemMinX uber jar if we include Jing and its dependencies.

We'll have to see what difference it makes.

I think you should start by handling validation with RelaxNG. The idea is to create a class XMLModelRelaxNGValidator and call it in

https://github.com/eclipse/lemminx/blob/b1407bb52ce3ff6cd350a98b9c5aa8ea3392df0e/org.eclipse.lemminx/src/main/java/org/eclipse/lemminx/extensions/xerces/xmlmodel/XMLModelHandler.java#L123

All right!

BalduinLandolt avatar Jul 26 '20 22:07 BalduinLandolt

The latest snapshot of lemminx supports validating files using RelaxNG schemas.

If you would like to try it out in VS Code, you can try the prerelease version of vscode-xml, which should release at around 3 AM EST tomorrow.

If you have any feedback on it, feel free to open an issue.

datho7561 avatar Oct 13 '22 16:10 datho7561

Does this also work for the compact format?

denismaier avatar Oct 13 '22 18:10 denismaier

It should work with the compact format.

datho7561 avatar Oct 13 '22 19:10 datho7561

This is absolutely fabulous; thank you! This is working perfectly with RelaxNG. I assume it does not yet include Schematron.

adunning avatar Oct 19 '22 09:10 adunning

This is absolutely fabulous; thank you! This is working perfectly with RelaxNG.

Great, thank you for your feedback. I think RelaxNG support still needs improvement, for example validating RNC and RNG grammars, completion inside RNC, and find definition in RNG, but validation, completion and hover in XML based on RelaxNG should work.

Don't hesitate to create issues to improve the support.

I assume it does not yet include Schematron.

Indeed, we don't support Schematron. Please create a new issue for that, but I don't know when we will have time to work on it.

This issue is fixed with https://github.com/eclipse/lemminx/pull/841

angelozerr avatar Oct 19 '22 16:10 angelozerr

Thank you again – this will be transformative for users of the Text Encoding Initiative (TEI), which relies heavily on Relax NG and Schematron.

adunning avatar Oct 19 '22 16:10 adunning