protobuf-java-format

Exception on Special Characters "//"

Open hebergentilin opened this issue 8 years ago • 3 comments

I'm getting errors when a proto bytes field contains special characters such as '//' (characters produced by Base64-encoding a file).
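For context, '/' is one of the 64 characters of the standard Base64 alphabet, so it shows up routinely in encoded binary data. The snippet below is only an illustration using `java.util.Base64` from the JDK; the class name `Base64SlashSketch` and the sample bytes are made up for this example and do not come from the reporter's data.

```java
import java.util.Base64;

public class Base64SlashSketch {
    /** Encodes raw bytes with the standard (non-URL-safe) Base64 alphabet. */
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // Three 0xFF bytes map to four times the index 63, which is '/'.
        System.out.println(encode(new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF})); // ////
        // The first three bytes of a JPEG file (FF D8 FF) also produce slashes.
        System.out.println(encode(new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF})); // /9j/
    }
}
```

Any parser that receives Base64 text as an element value therefore has to accept '/' (and '+' and '=') inside that value.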

protos

message InformaContestacaoCliente {
	required Contestacao contestacao = 1;
}

message Contestacao {
	repeated Anexo anexos = 1;
}

message Anexo {
    optional bytes anexo = 2;
}

formatFactory.java

// Create a formatter of type XML_JAVAX.
FormatFactory formatFactory = new FormatFactory();
ProtobufFormatter formatter = formatFactory.createFormatter(FormatFactory.Formatter.XML_JAVAX);
// paramString holds the XML request; merge it into the message builder.
InputStream in = TextUtils.toInputStream(paramString);
formatter.merge(in, this.builder);

This throws a java.lang.RuntimeException with the message "Can't get here." at XmlJavaxFormat.java:566.

When I change the formatter from XML_JAVAX to XML, I get a different exception: com.googlecode.protobuf.format.ProtobufFormatter$ParseException: 4:22: Expected ">".

Request sent:

<ANEXOS>
    <anexos>
        <tipoAnexo>3</tipoAnexo>
        <descricao>foto frontal</descricao>
        <anexo><![CDATA[//]]></anexo>
    </anexos>
</ANEXOS>

hebergentilin commented on Feb 14 '17 16:02

Would you mind making a pull request with a test that fails because of this?

scr commented on Feb 23 '17 16:02

This is the same issue as #44: the tokenizer is too restrictive and doesn't tolerate special characters in values, i.e. it fails to tokenize the content of the 'anexo' node.
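To make the failure mode concrete, here is a minimal sketch of a regex-driven tokenizer that only knows identifiers, numbers and angle brackets. The pattern and the class name `TokenizerSketch` are hypothetical and deliberately simplified; this is not the pattern used in XmlFormat.java, only an illustration of how a restrictive token regex rejects a value like '//'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenizerSketch {
    // Hypothetical token pattern (NOT the library's): identifiers, integers,
    // the closing-tag opener "</", and single '<' or '>' characters.
    static final Pattern TOKEN =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|[0-9]+|</|[<>]");

    /** Splits the input into tokens; throws if a character cannot start any token. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++; // whitespace between tokens is skipped
                continue;
            }
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                        "Unexpected character '" + input.charAt(pos) + "' at offset " + pos);
            }
            tokens.add(m.group());
            pos = m.end();
        }
        return tokens;
    }
}
```

With this sketch, `tokenize("<tipoAnexo>3</tipoAnexo>")` succeeds, while `tokenize("<anexo>//</anexo>")` throws at the first '/' because no alternative in the pattern can start with that character.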

bouviervj commented on Jan 12 '18 19:01

You can try to fix the regex used to match the next token, which is the core of the problem. It can be found here: https://github.com/bivas/protobuf-java-format/blob/091d247393772e94d64c2d8835ef4cedcdfc244e/src/main/java/com/googlecode/protobuf/format/XmlFormat.java#L320

So far I have not managed to do it, since making the regex more flexible tends to produce side effects.

IMO the best solution would be to drop the hand-written XML parser and build on an existing one, which would be more reliable.
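As a rough sketch of what relying on an existing parser could look like, the snippet below reads an element's text with the JDK's StAX API (`javax.xml.stream`). It is not how either formatter in this library is implemented, and the class and method names (`StaxAnexoSketch`, `readElementText`) are invented for the example; it only shows that a standard parser returns the CDATA content '//' unchanged.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxAnexoSketch {
    /** Returns the text of the first element with the given name, or null if absent. */
    static String readElementText(String xml, String elementName) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    // getElementText() collects character data and CDATA sections.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<ANEXOS><anexos><anexo><![CDATA[//]]></anexo></anexos></ANEXOS>";
        System.out.println(readElementText(xml, "anexo")); // prints //
    }
}
```

Mapping the extracted text onto protobuf fields would still need to be written on top of this; the sketch covers only the tokenizing and value-extraction part that the regex currently handles.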

whiver commented on Jan 22 '18 21:01