
MalformedByteSequenceException

Open mpecero opened this issue 9 years ago • 1 comments

Hello, I get a MalformedByteSequenceException when executing this:

`String url = "http://www.investigacionesgeograficas.com/oai";

OAIClient oaiClient = new HttpOAIClient(url);

Context context;
try {
    // FORMAT is not defined in this snippet; presumably a constant in the reporter's code.
    context = new Context()
        .withOAIClient(oaiClient)
        .withMetadataTransformer(FORMAT, KnownTransformer.OAI_DC);

    ServiceProvider underTest = new ServiceProvider(context);

    // Request records from the last 50 days in oai_dc format.
    ListRecordsParameters parameters = ListRecordsParameters.request();
    Calendar cal = Calendar.getInstance();
    cal.add(Calendar.DAY_OF_MONTH, -50);
    Date from = cal.getTime();
    parameters.withFrom(from);
    parameters.withMetadataPrefix("oai_dc");

    // The exception is thrown while iterating over the harvested records.
    Iterator<Record> it = underTest.listRecords(parameters);
    while (it.hasNext()) {
        Record record = it.next();
        System.out.println(record.getMetadata().getValue().searcher().findAll("dc.title"));
    }
} catch (BadArgumentException e) {
    e.printStackTrace();
} catch (TransformerFactoryConfigurationError e1) {
    e1.printStackTrace();
}`

I am using xoai 4.1.0

The stacktrace:

com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Invalid byte 3 of 3-byte UTF-8 sequence.
    at com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager.getDTM(XSLTCDTMManager.java:460)
    at com.sun.org.apache.xalan.internal.xsltc.dom.XSLTCDTMManager.getDTM(XSLTCDTMManager.java:248)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.getDOM(TransformerImpl.java:542)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:725)
    at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:336)
    at com.lyncode.xoai.xml.XSLPipeline.process(XSLPipeline.java:37)
    at com.lyncode.xoai.serviceprovider.parsers.RecordParser.parse(RecordParser.java:73)
    at com.lyncode.xoai.serviceprovider.parsers.ListRecordsParser.next(ListRecordsParser.java:70)
    at com.lyncode.xoai.serviceprovider.handler.ListRecordHandler.nextIteration(ListRecordHandler.java:83)
    at com.lyncode.xoai.serviceprovider.lazy.ItemIterator.hasNext(ItemIterator.java:40)
    at com.lyncode.xoai.serviceprovider.lazy.ItemIterator.<init>(ItemIterator.java:30)
    at com.lyncode.xoai.serviceprovider.ServiceProvider.listRecords(ServiceProvider.java:65)

The problem is in the "parse" method of RecordParser. It converts the content to bytes without specifying an encoding, so the platform default charset is used instead of UTF-8: inputStream = new ByteArrayInputStream(content.getBytes());
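To illustrate why this matters, here is a minimal, standalone sketch that does not use xoai at all. It assumes the failure occurs on a JVM whose default charset is not UTF-8; ISO-8859-1 is used here as a stand-in for such a default. The class name `EncodingSketch` and the helpers `isValidUtf8` and `toUtf8Stream` are hypothetical names chosen for the example, not part of the library, and the sample title string is made up.

```java
import java.io.ByteArrayInputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodingSketch {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Hypothetical helper: builds the stream with an explicit charset
    // instead of relying on the platform default.
    static ByteArrayInputStream toUtf8Stream(String content) {
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Made-up sample content containing non-ASCII characters.
        String content = "<dc:title>Investigaci\u00f3n geogr\u00e1fica</dc:title>";

        // ISO-8859-1 stands in for a non-UTF-8 platform default (an assumption).
        byte[] legacy = content.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = content.getBytes(StandardCharsets.UTF_8);

        System.out.println("Platform default charset: " + Charset.defaultCharset());
        System.out.println("ISO-8859-1 bytes are valid UTF-8: " + isValidUtf8(legacy));
        System.out.println("UTF-8 bytes are valid UTF-8: " + isValidUtf8(utf8));
    }
}
```

If the default charset really is the cause, passing `StandardCharsets.UTF_8` to `getBytes` in `RecordParser.parse` (as `toUtf8Stream` does here) would be one plausible fix, provided the XML parser downstream expects UTF-8 input.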

mpecero, Jul 26 '16 07:07

Hi,

Would you be able to provide a pull request with:

  1. A test case demonstrating the issue (with the expected behaviour)
  2. The corresponding fix

If you can provide this soon I will be able to integrate it in the next release.

Thanks!

mmalmeida, Feb 19 '17 14:02