
Validation for RelaxNG schema

Open BalduinLandolt opened this issue 5 years ago • 10 comments

First step of #237, see also redhat-developer/vscode-xml#236.

To Do:

  • [ ] Check license compatibility.
    For jing license, see https://github.com/relaxng/jing-trang/blob/master/copying.txt.

  • [ ] Ensure the Lemminx uber jar doesn't get out of hand.

  • [ ] use jing
    (Note to self: github, maven, website )

  • [ ] start managing validation with RelaxNG. The idea is to create a class XMLModelRelaxNGValidator and call it in
    https://github.com/eclipse/lemminx/blob/b1407bb52ce3ff6cd350a98b9c5aa8ea3392df0e/org.eclipse.lemminx/src/main/java/org/eclipse/lemminx/extensions/xerces/xmlmodel/XMLModelHandler.java#L123

  • [ ] write unit tests

  • [ ] test manually

BalduinLandolt avatar Jul 26 '20 22:07 BalduinLandolt

Having an XMLModelRNGValidator that is called in extensions/xerces/xmlmodel/XMLModelHandler.java proves to be difficult: I can't really get jing to work in the context of the xerces framework. And calling it from XMLModelHandler requires it to extend XMLSchemaValidator, which is very much tied to xerces.

Instead, I'd go for a RelaxNG extension like there is for XSD (i.e. lemminx/extensions/rng/), and add classes like RNGPlugin, RNGDiagnosticsParticipant and RNGValidator (analogous to what there is for XSD).
If I understand the existing code correctly, it should then work like this: XMLLanguageService calls for diagnostics in XMLDiagnostics, which calls for diagnostics in all DiagnosticsParticipants it has. If I register RNGDiagnosticsParticipant there, then that should work.
In the RNGValidator, I could use jing for the actual validation.
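
The flow described above can be sketched roughly as follows. This is a simplified illustration, not lemminx's actual API: `RngDiagnostic`, `DiagnosticsParticipant` and `DiagnosticsService` are hypothetical stand-ins, and the RelaxNG participant only marks the place where a jing-based validator would be invoked.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a diagnostic; not the LSP4J type. */
final class RngDiagnostic {
    final String source;
    final String message;

    RngDiagnostic(String source, String message) {
        this.source = source;
        this.message = message;
    }
}

/** Hypothetical, simplified participant contract. */
interface DiagnosticsParticipant {
    void doDiagnostics(String xmlText, List<RngDiagnostic> diagnostics);
}

/** Placeholder participant; real validation would be delegated to jing here. */
final class RNGDiagnosticsParticipant implements DiagnosticsParticipant {
    @Override
    public void doDiagnostics(String xmlText, List<RngDiagnostic> diagnostics) {
        // Assumption: only documents that mention the RelaxNG namespace are handled.
        if (xmlText.contains("http://relaxng.org/ns/structure/1.0")) {
            diagnostics.add(new RngDiagnostic("rng", "placeholder: run the RelaxNG validator here"));
        }
    }
}

/** Asks every registered participant for diagnostics and merges the results. */
final class DiagnosticsService {
    private final List<DiagnosticsParticipant> participants = new ArrayList<>();

    void register(DiagnosticsParticipant participant) {
        participants.add(participant);
    }

    List<RngDiagnostic> doDiagnostics(String xmlText) {
        List<RngDiagnostic> diagnostics = new ArrayList<>();
        for (DiagnosticsParticipant participant : participants) {
            participant.doDiagnostics(xmlText, diagnostics);
        }
        return diagnostics;
    }
}
```

Registering the participant once with the service is enough for it to take part in every diagnostics request; the service itself needs no knowledge of RelaxNG.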

@angelozerr does that seem reasonable to you?

BalduinLandolt avatar Jul 27 '20 12:07 BalduinLandolt

Well... never mind! I just realized that the XSDValidator is for validating xsd files, not for validating against(!) xsd files.

I'll keep digging :)

BalduinLandolt avatar Jul 27 '20 13:07 BalduinLandolt

@angelozerr I'm getting somewhere with this! :)

I just drafted a pull request with what I have so far.
It's currently very much a hack and I know that lots of things need improvement.

It does not yet work when packaged and used in vs code, but from the test cases you can see that it can produce proper diagnostics.

Firstly, I'd be grateful if you could give me general feedback on whether the direction I'm taking seems reasonable to you.

Furthermore, I have some more concrete questions:

  • when I know the href pointing to the schema from xml-model, what is the best way of making this schema available for validation? I'm assuming there is already a system that can handle local files, remote files, relative paths, etc.?
  • I don't get the catalog.xml... how do I use that? and should I use it at all?
  • the jing dependency... in which of the two pom.xml do I add that?
  • I added one of the TEI .rng schemas to the resources for test purposes. Is that a problem license-wise?
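
For the first question, one possible way to turn the `href` from `xml-model` into an absolute location is to resolve it against the URI of the XML document being validated. The sketch below uses only `java.net.URI`; `SchemaLocations` and `resolveSchemaHref` are hypothetical names, and it does not reflect how lemminx actually resolves, downloads or caches schemas.

```java
import java.net.URI;

final class SchemaLocations {
    private SchemaLocations() {}

    /**
     * Resolves the href of an xml-model processing instruction against the
     * URI of the document that contains it. An href that is already absolute
     * (for example an https URL) is returned as given.
     */
    static URI resolveSchemaHref(URI documentUri, String href) {
        return documentUri.resolve(href).normalize();
    }
}
```

With a document at `file:///home/user/project/doc.xml`, an href of `schemas/tei.rng` resolves to the path `/home/user/project/schemas/tei.rng`, and `../schemas/tei.rng` resolves to `/home/user/schemas/tei.rng`. Fetching remote schemas and consulting a catalog would still have to happen on top of this.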

BalduinLandolt avatar Jul 28 '20 16:07 BalduinLandolt

also, the validation creates errors and warnings, but honestly, I can't think of a way/case to get a warning. Any ideas?
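
Assuming the validator reports problems through the SAX `ErrorHandler` interface, whether something surfaces as a warning or an error depends only on which callback the validator invokes. The sketch below shows a handler that records that distinction; `CollectingErrorHandler`, `Severity` and `Problem` are hypothetical names. It does not answer which conditions, if any, make jing call `warning`.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

/** Collects SAX validation callbacks together with a severity label. */
final class CollectingErrorHandler implements ErrorHandler {
    /** Hypothetical severity levels, loosely mirroring editor diagnostics. */
    enum Severity { WARNING, ERROR }

    /** One reported problem with its position in the document. */
    static final class Problem {
        final Severity severity;
        final int line;
        final int column;
        final String message;

        Problem(Severity severity, SAXParseException e) {
            this.severity = severity;
            this.line = e.getLineNumber();
            this.column = e.getColumnNumber();
            this.message = e.getMessage();
        }
    }

    private final List<Problem> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        problems.add(new Problem(Severity.WARNING, e));
    }

    @Override
    public void error(SAXParseException e) {
        problems.add(new Problem(Severity.ERROR, e));
    }

    @Override
    public void fatalError(SAXParseException e) {
        // Fatal errors are recorded as errors instead of aborting the run.
        problems.add(new Problem(Severity.ERROR, e));
    }

    List<Problem> problems() {
        return problems;
    }
}
```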

BalduinLandolt avatar Jul 28 '20 16:07 BalduinLandolt

Firstly, I'd be grateful if you could give me general feedback on whether the direction I'm taking seems reasonable to you.

I will do my best to find time and play with your PR. But your implementation means that the XML document is parsed twice (once by xerces and once by jing). That's why I wonder if we could create an XMLModelRelaxNGValidator that extends XMLModelValidator and validates the XML content while it is being parsed by xerces. But I'm not sure it's possible.

For your other comments, please give me time to study jing more.
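
To illustrate the single-parse idea in general terms, here is a minimal sketch using the JDK's `javax.xml.validation.ValidatorHandler`, which validates the SAX event stream while the document is being parsed and forwards the same events to a downstream handler. It is not the Xerces-level validator discussed here: it works at the SAX level, it uses W3C XML Schema only because that is what the JDK ships, and the assumption that a RelaxNG `SchemaFactory` could be plugged in at the marked line is untested. `SinglePassValidation` and `validateWhileParsing` are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class SinglePassValidation {
    private SinglePassValidation() {}

    /** Parses the XML exactly once and returns the validation messages raised during that parse. */
    static List<String> validateWhileParsing(String xsd, String xml) {
        final List<String> messages = new ArrayList<>();
        try {
            // Assumption: a RelaxNG-capable SchemaFactory could be substituted here.
            SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(xsd)));

            ValidatorHandler validator = schema.newValidatorHandler();
            validator.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    messages.add("warning: " + e.getMessage());
                }

                @Override
                public void error(SAXParseException e) {
                    messages.add("error: " + e.getMessage());
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e;
                }
            });
            // Downstream consumer of the same event stream (e.g. a model builder); a no-op here.
            validator.setContentHandler(new DefaultHandler());

            SAXParserFactory parserFactory = SAXParserFactory.newInstance();
            parserFactory.setNamespaceAware(true); // the validator needs namespace-aware events
            XMLReader reader = parserFactory.newSAXParser().getXMLReader();
            reader.setContentHandler(validator);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse or validate the document", e);
        }
        return messages;
    }
}
```

The parser reads the input once; validation happens as a filter between the parser and whatever consumes the events, which is the property the comment above is asking for.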

angelozerr avatar Jul 29 '20 08:07 angelozerr

I will do my best to find time and play with your PR.

For your other comments, please give me time to study jing more.

Sure thing, no hurry!

But your implementation means that the XML document is parsed twice (once by xerces and once by jing). That's why I wonder if we could create an XMLModelRelaxNGValidator that extends XMLModelValidator and validates the XML content while it is being parsed by xerces. But I'm not sure it's possible.

I absolutely see your point. I just could not figure out how to do that... But I can try and see what I can do.
If you have a look at "JARV" here, maybe this could help too.

In any case, I'll see what I can do and will keep you updated. And if you find time to look into it, let me know if you have any suggestions.

BalduinLandolt avatar Jul 29 '20 09:07 BalduinLandolt

If you have a look at "JARV" here, maybe this could help too.

Today Xerces is on top for validation because we customize it to provide advanced support, like an error range instead of an offset range. If I understand correctly, JARV provides a validation API, which would mean that Xerces is no longer on top of validation.

In other words, I would like to provide a Xerces XNI component implemented with Jing. Perhaps https://github.com/georgebina/dita-ng/blob/master/src/org/ditang/relaxng/defaults/RelaxNGDefaultsComponent.java is a good start?

To study...

In any case, I'll see what I can do and will keep you updated. And if you find time to look into it, let me know if you have any suggestions.

I will try as soon as I find time.

angelozerr avatar Jul 29 '20 14:07 angelozerr

you mean something like this? :) https://people.apache.org/~andyc/neko/doc/index.html

I just came across this page, which led me to NekoXNI. I haven't tried it out yet, but if I understand you correctly, that might be something, right?

BalduinLandolt avatar Jul 29 '20 15:07 BalduinLandolt

Well, the download link is dead, and I can't seem to find the files anywhere online... So NekoXNI is clearly not an option. (Or maybe you have better luck?)

If we were to write a Jing-XNI-component ourselves... roughly where would that come into play? In the LSPXMLParserConfiguration, so it gets passed to the LSPSAXParser before parsing?

BalduinLandolt avatar Jul 30 '20 10:07 BalduinLandolt

@BalduinLandolt please see my draft PR https://github.com/eclipse/lemminx/pull/841

angelozerr avatar Aug 10 '20 14:08 angelozerr