auto detect separator

Open thawk opened this issue 2 years ago • 5 comments

Because the first three tags in a FIX message are 8/9/35, and we can assume the same separator is used throughout the message, we can detect the separator from these three tags. The value of 9= is always numeric, so we can take the string from the first non-digit character after 9= up to 35= as the separator.
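A minimal sketch of the idea described above, written in Python for illustration only. It is not the code from the PR: the function name `detect_separator` and the extra check that the same string also precedes `9=` are assumptions made here.

```python
import re
from typing import Optional

# Capture whatever sits between the BodyLength digits and "35=".
# DOTALL lets "." match any character, including control characters and newlines.
_HEADER_RE = re.compile(r"9=\d+(.+?)35=", re.DOTALL)


def detect_separator(message: str) -> Optional[str]:
    """Infer the field separator from the 8/9/35 header, or return None.

    Hypothetical helper: assumes the message starts with tag 8 and that
    the separator does not begin with a digit.
    """
    if not message.startswith("8="):
        return None
    match = _HEADER_RE.search(message)
    if match is None:
        return None
    separator = match.group(1)
    # Sanity check (an assumption of this sketch): the same string must
    # also separate the value of tag 8 from "9=".
    if not message[: match.start()].endswith(separator):
        return None
    return separator


if __name__ == "__main__":
    print(repr(detect_separator("8=FIX.4.2;9=130;35=AE;49=LSEHub;")))
    print(repr(detect_separator("8=FIX.4.2[SOH]9=130[SOH]35=AE[SOH]")))
```

Returning `None` instead of guessing leaves the caller free to fall back to some other strategy when the header is not in the standard form.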

thawk avatar Feb 03 '23 07:02 thawk

Thanks for the PR! Can you share some data to test this change with, that shows a case where it improves parsing?

drewnoakes avatar Feb 03 '23 08:02 drewnoakes

Because different software produces different formats of FIX log, <SOH> gets replaced by different strings to make it visible; even nul (0x01) is used to replace SOH, for some reason I don't know :-(

The following are several types of log we have encountered:

8=FIX.4.2<SOH>9=130<SOH>35=AE<SOH>49=LSEHub<SOH>56=LSETR<SOH>115=BROKERX<SOH>34=2287<SOH>43=N<SOH>52=20120330-12:14:09<SOH>370=20120330-12:14:09.816<SOH>571=00008661533TRLO1-1-1-0<SOH>150=H<SOH>10=074<SOH>
8=FIX.4.2[SOH]9=130[SOH]35=AE[SOH]49=LSEHub[SOH]56=LSETR[SOH]115=BROKERX[SOH]34=2287[SOH]43=N[SOH]52=20120330-12:14:09[SOH]370=20120330-12:14:09.816[SOH]571=00008661533TRLO1-1-1-0[SOH]150=H[SOH]10=074[SOH]
8=FIX.4.2;9=130;35=AE;49=LSEHub;56=LSETR;115=BROKERX;34=2287;43=N;52=20120330-12:14:09;370=20120330-12:14:09.816;571=00008661533TRLO1-1-1-0;150=H;10=074;

thawk avatar Feb 24 '23 06:02 thawk

Because the first 3 tags in FIX message is 8/9/35

Are we sure about that? I don't work much with FIX at the moment. My recollection of FIX is that there are very few standard behaviours in the wild. I'm all for auto-detecting the separator, but I am concerned about doing so based on an assumption that might not be universally true.

drewnoakes avatar Feb 24 '23 09:02 drewnoakes

Hi both,

Indeed, FIX tags 8/9/35 are the usual start of messages from what I saw in my FIX years, but I also remember that FIX can have very "exotic" variations, to say the least ;-)

I guess adding some sort of auto-detection is indeed nice. It would be good if the detection is not "stubborn": it should behave like an extra "magic" feature when it detects a format, and just quietly do nothing (or maybe show a little warning in the UI) when the format does not match a list of pre-configured formats (that users may be able to customize?).

Sorry if I am not clear ...

whatthefrog avatar Feb 24 '23 14:02 whatthefrog

Because the first 3 tags in FIX message is 8/9/35

Are we sure about that? I don't work much with FIX at the moment. My recollection of FIX is that there are very few standard behaviours in the wild. I'm all for auto-detecting the separator, but I am concerned about doing so based on an assumption that might not be universally true.

On page 20 of the FINANCIAL INFORMATION EXCHANGE PROTOCOL (FIX) Version 5.0 Service Pack 2, Volume 1, in the section FIX "Tag=Value" SYNTAX, under Message Format, rule 2 says:

The first three fields in the standard header are Begin String (tag #8) followed by BodyLength (tag #9) followed by MsgType (tag #35).

So, if the message uses the standard header, it should work. If not, the algorithm falls back to one of the separators (/\||;|\x001|\[SOH\]|<SOH>|\^A/). This extends the list of supported separators with three multi-character separators, [SOH], <SOH> and ^A, which I have encountered in my work.
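The detect-then-fall-back behaviour described above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the PR's implementation: `split_fields`, `detect_separator` and `FALLBACK_SEPARATORS` are hypothetical names, and the fallback list writes the SOH byte as `\x01`, which is this sketch's reading of the list in the comment.

```python
import re
from typing import List, Optional, Tuple

# Header-based detection: text between the BodyLength digits and "35=".
HEADER_RE = re.compile(r"9=\d+(.+?)35=", re.DOTALL)

# Assumed fallback list, modelled on the separators named in the thread.
FALLBACK_SEPARATORS = ["\x01", "|", ";", "[SOH]", "<SOH>", "^A"]
FALLBACK_RE = re.compile("|".join(re.escape(s) for s in FALLBACK_SEPARATORS))


def detect_separator(message: str) -> Optional[str]:
    """Return the separator implied by the 8/9/35 header, or None."""
    if not message.startswith("8="):
        return None
    match = HEADER_RE.search(message)
    return match.group(1) if match else None


def split_fields(message: str) -> List[Tuple[str, str]]:
    """Split a message into (tag, value) pairs.

    Uses the detected separator when the standard header is present,
    otherwise splits on any of the known fallback separators.
    """
    separator = detect_separator(message)
    if separator is not None:
        parts = message.split(separator)
    else:
        parts = FALLBACK_RE.split(message)
    fields = []
    for part in parts:
        if not part:
            continue  # skip the empty piece after a trailing separator
        tag, _, value = part.partition("=")
        fields.append((tag, value))
    return fields


if __name__ == "__main__":
    # Standard header: separator "^A" is detected from 8/9/35.
    print(split_fields("8=FIX.4.2^A9=5^A35=0^A10=000^A"))
    # No standard header: falls back to the known separator list.
    print(split_fields("35=D|49=SENDER|56=TARGET|"))
```

One consequence of this design is that a detected separator can be any string, including ones not in the fallback list, while messages without the standard header are still handled by the fixed list.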

thawk avatar Mar 01 '23 01:03 thawk