
Respect `MessageEncoding` (tag 347)

Open sheinbergon opened this issue 4 years ago • 4 comments

Today, QF/J supports only fixed-width encodings (the US-ASCII or ISO-8859-1 charsets), both to ensure compatibility with QF bindings in other languages and to keep message-length calculations predictable.

However, the FIX specification defines the MessageEncoding header tag and states that this encoding applies to the Encoded* fields. Honoring this field should ensure that data carried in such tags is properly encoded/decoded, as it implies that both the sender and the receiver are fully aware of the expected encoding for these fields.

Given that each such field has a companion Encoded*Len field carrying its length in bytes, we should also be able to read them properly from the byte stream.
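To illustrate what I mean by reading via the length field, here is a minimal sketch using only the JDK. The class `EncodedFieldReader` and its methods are hypothetical names, not QF/J API, and the example assumes EncodedTextLen (354) and EncodedText (355) as the length/data pair. It reads exactly the announced number of bytes and only then decodes them with the agreed charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of QF/J: reads a length-prefixed Encoded* field.
final class EncodedFieldReader {
    private static final byte SOH = 0x01;

    // Returns the index of the first value byte of "tag=", searching from 'from', or -1.
    // Simplification: a real parser walks field by field instead of scanning for SOH.
    static int valueStart(byte[] msg, int tag, int from) {
        byte[] prefix = (tag + "=").getBytes(StandardCharsets.US_ASCII);
        for (int i = from; i + prefix.length <= msg.length; i++) {
            if (i != 0 && msg[i - 1] != SOH) {
                continue; // a tag can only start at a field boundary
            }
            boolean match = true;
            for (int j = 0; j < prefix.length && match; j++) {
                match = msg[i + j] == prefix[j];
            }
            if (match) {
                return i + prefix.length;
            }
        }
        return -1;
    }

    // Reads the byte count from lenTag, then decodes that many bytes of dataTag.
    static String readEncoded(byte[] msg, int lenTag, int dataTag, Charset charset) {
        int lenStart = valueStart(msg, lenTag, 0);
        if (lenStart < 0) {
            return null;
        }
        int lenEnd = lenStart;
        while (lenEnd < msg.length && msg[lenEnd] != SOH) {
            lenEnd++;
        }
        int length = Integer.parseInt(
                new String(msg, lenStart, lenEnd - lenStart, StandardCharsets.US_ASCII));
        int dataStart = valueStart(msg, dataTag, lenEnd + 1);
        if (dataStart < 0 || dataStart + length > msg.length) {
            return null;
        }
        // The length counts bytes, not characters, so slice first and decode second.
        return new String(msg, dataStart, length, charset);
    }
}
```

Because the length is a byte count, the payload may even contain SOH bytes without confusing the reader, as long as the data field is located via its length field rather than by scanning for the next delimiter.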

QF/J doesn't really do anything with MessageEncoding today, and custom encoding for Encoded* fields is not supported.

I'm willing to contribute a PR, but would like to hear your initial thoughts on the matter first.

Cheers

Idan

sheinbergon avatar Oct 06 '21 13:10 sheinbergon

I think respecting the MessageEncoding field would be a sensible enhancement. :+1:

Actually, QFJ has also supported other encodings for some time, but of course this depends on mutual agreement between the counterparties and also does not allow the encoding to be specified on a per-message basis.

Thanks and cheers, Chris.

chrjohn avatar Oct 06 '21 22:10 chrjohn

> and also does not allow the encoding to be specified on a per-message basis.

Why is that? Why shouldn't we be able to decode tags on a per-message (per-tag) basis? That's the purpose of MessageEncoding, after all...

sheinbergon avatar Oct 25 '21 11:10 sheinbergon

You misread my comment. :) The statement you quoted refers to the status quo in QFJ, where you can only specify the encoding per JVM.

chrjohn avatar Oct 25 '21 23:10 chrjohn

Sorry for going MIA, it's been a busy few months. Given how things are implemented, and conceptually speaking, what would be the correct way to read such encoded fields from the bytestream? Is that even feasible? It sounds like we should first scan the bytestream for the MessageEncoding header field and, if it exists, scan for the Encoded* fields and decode them manually. That sounds like it might hurt performance a bit.
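To make the first step concrete, here is a rough sketch of the MessageEncoding lookup I have in mind, using only the JDK. `MessageEncodingLookup` and `resolve` are hypothetical names, not existing QF/J code. It looks for tag 347 in the raw bytes and falls back to the configured charset when the tag is absent or names an unsupported charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of QF/J: resolves MessageEncoding (347) from raw bytes.
final class MessageEncodingLookup {
    private static final byte SOH = 0x01;
    // SOH + "347=" so that tags merely ending in 347 do not match.
    private static final byte[] NEEDLE = "\u0001347=".getBytes(StandardCharsets.US_ASCII);

    static Charset resolve(byte[] msg, Charset fallback) {
        // Simplification: a raw scan could also match inside a data field's payload;
        // a real implementation would do this while parsing the header field by field.
        for (int i = 0; i + NEEDLE.length <= msg.length; i++) {
            boolean match = true;
            for (int j = 0; j < NEEDLE.length && match; j++) {
                match = msg[i + j] == NEEDLE[j];
            }
            if (!match) {
                continue;
            }
            int start = i + NEEDLE.length;
            int end = start;
            while (end < msg.length && msg[end] != SOH) {
                end++;
            }
            String name = new String(msg, start, end - start, StandardCharsets.US_ASCII);
            try {
                return Charset.isSupported(name) ? Charset.forName(name) : fallback;
            } catch (IllegalArgumentException e) {
                return fallback; // empty or syntactically illegal charset name
            }
        }
        return fallback; // no MessageEncoding: keep the current behavior
    }
}
```

Messages without tag 347 would pay for one pass over the bytes and nothing else, and the per-field decoding of Encoded* values would only happen when the tag is present. Whether that cost is acceptable would need measuring.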

sheinbergon avatar Feb 12 '22 15:02 sheinbergon