kafka-connect-transform-xml
ERROR: White spaces are required between publicId and systemId
Version 0.1.0.18
Installed using:
confluent-hub install --no-prompt jcustenborder/kafka-connect-transform-xml:0.1.0.18
Config:
curl -i -X PUT -H "Content-Type:application/json" http://localhost:8083/connectors/source-file-01/config \
-d '{
"connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
"tasks.max": "1",
"file": "/tmp.xml",
"topic": "xmltest",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "org.apache.kafka.connect.storage.StringConverter",
"transforms": "xml",
"transforms.xml.type": "com.github.jcustenborder.kafka.connect.transform.xml.FromXml$Value",
"transforms.xml.schema.path": "http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd"
}'
Transform failed with error org.xml.sax.SAXParseException; systemId: http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.
[2020-09-08 14:08:41,647] INFO [source-file-01|task-0] FromXmlConfig values:
package = com.github.jcustenborder.kafka.connect.transform.xml.model
schema.path = [http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd]
xjc.options.automatic.name.conflict.resolution.enabled = false
xjc.options.strict.check.enabled = true
xjc.options.verbose.enabled = false
(com.github.jcustenborder.kafka.connect.transform.xml.FromXmlConfig:347)
[2020-09-08 14:08:41,699] INFO [source-file-01|task-0] compileContext() - Generating source for http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd (com.github.jcustenborder.kafka.connect.transform.xml.XSDCompiler:99)
[2020-09-08 14:08:42,278] ERROR [source-file-01|task-0] fatalError (com.github.jcustenborder.kafka.connect.transform.xml.XSDCompiler:36)
org.xml.sax.SAXParseException; systemId: http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1472)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanExternalID(XMLScanner.java:1072)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.scanDoctypeDecl(XMLDocumentScannerImpl.java:642)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:924)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:842)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at com.sun.tools.xjc.reader.internalizer.DOMForest.parse(DOMForest.java:395)
at com.sun.tools.xjc.reader.internalizer.DOMForest.parse(DOMForest.java:275)
at com.sun.tools.xjc.api.impl.s2j.SchemaCompilerImpl.parseSchema(SchemaCompilerImpl.java:158)
at com.github.jcustenborder.kafka.connect.transform.xml.XSDCompiler.compileContext(XSDCompiler.java:103)
at com.github.jcustenborder.kafka.connect.transform.xml.FromXml.configure(FromXml.java:130)
at org.apache.kafka.connect.runtime.ConnectorConfig.transformations(ConnectorConfig.java:264)
at org.apache.kafka.connect.runtime.Worker.buildWorkerTask(Worker.java:515)
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:467)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:1186)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:127)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$12.call(DistributedHerder.java:1201)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$12.call(DistributedHerder.java:1197)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
org.xml.sax.SAXParseException; systemId: http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1239)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at com.sun.tools.xjc.reader.internalizer.DOMForest.parse(DOMForest.java:395)
at com.sun.tools.xjc.reader.internalizer.DOMForest.parse(DOMForest.java:275)
at com.sun.tools.xjc.api.impl.s2j.SchemaCompilerImpl.parseSchema(SchemaCompilerImpl.java:158)
at com.github.jcustenborder.kafka.connect.transform.xml.XSDCompiler.compileContext(XSDCompiler.java:103)
at com.github.jcustenborder.kafka.connect.transform.xml.FromXml.configure(FromXml.java:130)
at org.apache.kafka.connect.runtime.ConnectorConfig.transformations(ConnectorConfig.java:264)
at org.apache.kafka.connect.runtime.Worker.buildWorkerTask(Worker.java:515)
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:467)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:1186)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:127)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$12.call(DistributedHerder.java:1201)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$12.call(DistributedHerder.java:1197)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Source XML file:
Tried tweaking a couple of the exposed config values, but got the same error.
[2020-09-08 14:19:04,531] INFO [source-file-01c|task-0] FromXmlConfig values:
package = com.github.jcustenborder.kafka.connect.transform.xml.model
schema.path = [http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd]
xjc.options.automatic.name.conflict.resolution.enabled = false
xjc.options.strict.check.enabled = true
xjc.options.verbose.enabled = false
(com.github.jcustenborder.kafka.connect.transform.xml.FromXmlConfig:347)
[2020-09-08 14:19:04,533] INFO [source-file-01c|task-0] compileContext() - Generating source for http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd (com.github.jcustenborder.kafka.connect.transform.xml.XSDCompiler:99)
org.xml.sax.SAXParseException; systemId: http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1239)
[2020-09-08 14:19:04,846] ERROR [source-file-01c|task-0] fatalError (com.github.jcustenborder.kafka.connect.transform.xml.XSDCompiler:36)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at com.sun.tools.xjc.reader.internalizer.DOMForest.parse(DOMForest.java:395)
org.xml.sax.SAXParseException; systemId: http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.
at com.sun.tools.xjc.reader.internalizer.DOMForest.parse(DOMForest.java:275)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
at com.sun.tools.xjc.api.impl.s2j.SchemaCompilerImpl.parseSchema(SchemaCompilerImpl.java:158)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
at com.github.jcustenborder.kafka.connect.transform.xml.XSDCompiler.compileContext(XSDCompiler.java:103)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1472)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanExternalID(XMLScanner.java:1072)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.scanDoctypeDecl(XMLDocumentScannerImpl.java:642)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:924)
at com.github.jcustenborder.kafka.connect.transform.xml.FromXml.configure(FromXml.java:130)
This is a weird one. It looks like the schema load fails when the URL answers with a 301 redirect. Changing the setting to
schema.path = https://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd
got me to the point where it loads the XSD. The next problem is that this schema defines two Comment elements, which makes the schema compiler fail again. I'm going to add support for controlling some of this output.
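A quick way to check the redirect theory is to download whatever the http:// URL returns and look at its first line. The stack trace above fails inside scanDoctypeDecl at line 1, which would be consistent with the parser receiving an HTML redirect page (whose DOCTYPE has a public ID but no system ID) instead of the XSD. That the response body is HTML is an assumption here, not something the logs show directly. The helper name classify_schema_file and the /tmp/schema.xsd path are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: look at the first line of a downloaded file and
# print "html" if it starts with an HTML doctype (typical of a redirect
# or error page), or "other" if it does not.
classify_schema_file() {
  if head -n 1 "$1" | grep -qi '^<!DOCTYPE HTML'; then
    echo html
  else
    echo other
  fi
}

# Usage (needs network access, so it is left commented out):
#   curl -sI http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd   # show status line and headers
#   curl -s -o /tmp/schema.xsd http://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd
#   classify_schema_file /tmp/schema.xsd
```

If the helper prints html for the http:// download and other for the https:// one, the SAXParseException is coming from the redirect page rather than from the schema itself.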
#32
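For reference, here is the connector config from the report with only schema.path switched to the https:// URL, as suggested above. This is a sketch rather than a confirmed fix: per the comment above, on 0.1.0.18 it should get past the "White spaces are required" error but is then expected to fail on the duplicate Comment elements. The CONFIG variable and the put_connector_config function are names introduced here for illustration. The worker address localhost:8083 and the connector name source-file-01 are taken from the original command.

```shell
#!/bin/sh
# Connector config from the report; the only change is https:// in schema.path.
CONFIG='{
  "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
  "tasks.max": "1",
  "file": "/tmp.xml",
  "topic": "xmltest",
  "key.converter": "org.apache.kafka.connect.storage.StringConverter",
  "value.converter": "org.apache.kafka.connect.storage.StringConverter",
  "transforms": "xml",
  "transforms.xml.type": "com.github.jcustenborder.kafka.connect.transform.xml.FromXml$Value",
  "transforms.xml.schema.path": "https://datex2.eu/schema/1_0/1_0/DATEXIISchema_1_0_1_0.xsd"
}'

# Hypothetical wrapper: PUT the config to the Connect REST API.
# Assumes a Connect worker is listening on localhost:8083.
put_connector_config() {
  curl -i -X PUT -H "Content-Type:application/json" \
    http://localhost:8083/connectors/source-file-01/config \
    -d "$CONFIG"
}
```

Call put_connector_config once the worker is running. The single quotes around the JSON keep the shell from expanding $Value in the transform class name.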