MalformedByteSequenceException
Hello, I get a MalformedByteSequenceException when executing this:
`String url = "http://www.investigacionesgeograficas.com/oai";
OAIClient oaiClient = new HttpOAIClient(url);
Context context;
try {
    context = new Context()
        .withOAIClient(oaiClient)
        .withMetadataTransformer(FORMAT, KnownTransformer.OAI_DC);
    ServiceProvider underTest = new ServiceProvider(context);
    ListRecordsParameters parameters = ListRecordsParameters.request();
    Calendar cal = Calendar.getInstance();
    cal.add(Calendar.DAY_OF_MONTH, -50);
    Date from = cal.getTime();
    parameters.withFrom(from);
    parameters.withMetadataPrefix("oai_dc");
    Iterator<Record> it = underTest.listRecords(parameters);
    while (it.hasNext()) {
        Record record = it.next();
        System.out.println(record.getMetadata().getValue().searcher().findAll("dc.title"));
    }
} catch (BadArgumentException e) {
    e.printStackTrace();
} catch (TransformerFactoryConfigurationError e1) {
    e1.printStackTrace();
}`
I am using xoai 4.1.0.
The stack trace:
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Invalid byte 3 of 3-byte UTF-8 sequence.
at com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager.getDTM(XSLTCDTMManager.java:460)
at com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager.getDTM(XSLTCDTMManager.java:248)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.getDOM(TransformerImpl.java:542)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:725)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:336)
at com.lyncode.xoai.xml.XSLPipeline.process(XSLPipeline.java:37)
at com.lyncode.xoai.serviceprovider.parsers.RecordParser.parse(RecordParser.java:73)
at com.lyncode.xoai.serviceprovider.parsers.ListRecordsParser.next(ListRecordsParser.java:70)
at com.lyncode.xoai.serviceprovider.handler.ListRecordHandler.nextIteration(ListRecordHandler.java:83)
at com.lyncode.xoai.serviceprovider.lazy.ItemIterator.hasNext(ItemIterator.java:40)
at com.lyncode.xoai.serviceprovider.lazy.ItemIterator.
The problem is in the "parse" method of RecordParser. It converts the content to bytes without specifying an encoding, so the platform default charset is used: inputStream = new ByteArrayInputStream(content.getBytes());
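To illustrate the diagnosis, here is a minimal, self-contained sketch. It is not xoai code: the class `EncodingSketch` and the helper `parses` are hypothetical names, the JDK's `DocumentBuilder` stands in for the XSLT pipeline from the stack trace, and ISO-8859-1 stands in for a non-UTF-8 platform default charset. It assumes the record content is meant to be read as UTF-8 (the XML default when no encoding is declared).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class EncodingSketch {

    // Hypothetical helper: encodes the content with the given charset and
    // reports whether an XML parser accepts the resulting bytes. The content
    // has no XML declaration, so the parser reads the bytes as UTF-8.
    static boolean parses(String content, Charset charset) {
        InputStream in = new ByteArrayInputStream(content.getBytes(charset));
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            return true;
        } catch (IOException | SAXException e) {
            // Invalid UTF-8 byte sequences surface here as a parse failure.
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A title with a non-ASCII character (o with acute accent).
        String content = "<title>Investigaci\u00f3n</title>";

        // Simulates content.getBytes() on a platform whose default
        // charset is not UTF-8 (assumption: ISO-8859-1).
        System.out.println(parses(content, StandardCharsets.ISO_8859_1));

        // Explicit charset: the bytes are the same on every platform.
        System.out.println(parses(content, StandardCharsets.UTF_8));
    }
}
```

If this assumption holds for RecordParser, a possible fix would be to call `content.getBytes(StandardCharsets.UTF_8)` instead of `content.getBytes()`, so that the result no longer depends on the machine's default encoding.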
Hi,
Would you be able to provide a pull request with:
- A test case demonstrating the issue (with the expected behaviour)
- The corresponding fix
If you can provide this soon I will be able to integrate it in the next release.
Thanks!