python-suitcase
Support for variable-length, non-greedy Payload()
tl;dr: It would be nice if Payload(), when no length is specified, consumed data only up to the first null byte (\x00) rather than up to the last one, as it currently appears to do.
I'm running into a problem with a Structure containing two null-delimited strings. The protocol doesn't specify the length of these strings. I wrote my own protocol parser before discovering this library, so I already have the protocol broken down into logical segments/Structures, but I was hoping to replace my hack job with Suitcase for parsing the individual fields.
Reading the docs, I see this is probably an explanation for my situation:
Parameters: length_provider – The LengthField with which this variable length payload is associated. If not included, it is assumed that the length_provider should consume the remainder of the bytes available in the string. This is only valid in cases where the developer knows that they will be dealing with a fixed sequence of bytes (already boxed).
What would be the best way to work within the framework with such a constraint? Is there any way to make Payload() be lazy rather than greedy?
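To make the request concrete, here is a minimal sketch of the non-greedy behaviour in question, written with the standard library only rather than Suitcase. The helpers `read_cstring` and `parse_header` are hypothetical names for illustration, not part of Suitcase; the field names and sizes are taken from the Header structure in failure.py further down.

```python
import struct
from io import BytesIO


def read_cstring(stream):
    """Read up to the FIRST null byte; consume the null but do not return it."""
    chunk = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise ValueError("stream ended before a null terminator was found")
        if byte == b"\x00":
            return bytes(chunk)
        chunk += byte


def parse_header(data):
    """Hypothetical stand-in for the Header structure (big-endian fields)."""
    stream = BytesIO(data)
    # UBInt32 + SBInt8
    header_size, version = struct.unpack(">Ib", stream.read(5))
    # Two null-delimited strings of unknown length, read lazily
    nis_id = read_cstring(stream)
    msg_id = read_cstring(stream)
    # SBInt64 followed by four SBInt16 values
    timestamp, event_size, job_discard_size, num_jobs, num_discards = (
        struct.unpack(">qhhhh", stream.read(16))
    )
    return {
        "header_size": header_size,
        "version": version,
        "nis_id": nis_id,
        "msg_id": msg_id,
        "timestamp": timestamp,
        "event_size": event_size,
        "job_discard_size": job_discard_size,
        "num_jobs": num_jobs,
        "num_discards": num_discards,
    }


sample = (b'\x00\x00\x007\n123.45.67.89-8888\x00Some-String-2\x00'
          b'\x00\x00\x01Z\xa5\xfb\xc8\xab\x00\x0e\x00\x15\x00\x01\x00\x00')
header = parse_header(sample)
print(header["nis_id"])  # b'123.45.67.89-8888'
print(header["msg_id"])  # b'Some-String-2'
```

Because each string stops at its own terminator, the fixed-size fields after it line up without either string needing a length field.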
success.py:
from suitcase.structure import Structure
from suitcase.fields import (
    SBInt8,
    SBInt16,
    UBInt32,
    SBInt64,
    Payload,
    Magic
)

class Header(Structure):
    a = UBInt32()
    b = SBInt8()
    c = Payload()
    e = SBInt64()
    f = SBInt16()
    g = SBInt16()
    h = SBInt16()
    i = SBInt16()
Success output:
In [1]: import success
In [2]: s = success.Header()
In [3]: s.unpack(b'\x00\x00\x007\n123.45.67.89-8888\x00Some-String-2\x00\x00\x00\x01Z\xa5\xfb\xc8\xab\x00\x0e\x00\x15\x00\x01\x00\x00')
In [4]: s
Out[4]:
Header (
a=55,
b=10,
c=b'123.45.67.89-8888\x00Some-String-2\x00',
e=1488843425963,
f=14,
g=21,
h=1,
i=0,
)
failure.py:
from suitcase.structure import Structure
from suitcase.fields import (
    SBInt8,
    SBInt16,
    UBInt32,
    SBInt64,
    Payload,
    Magic
)

class Header(Structure):
    header_size = UBInt32()
    version = SBInt8()
    nis_id = Payload()
    msg_id = Payload()
    timestamp = SBInt64()
    event_size = SBInt16()
    job_discard_size = SBInt16()
    num_jobs = SBInt16()
    num_discards = SBInt16()
Failure output:
In [1]: import failure
In [2]: f = failure.Header()
In [3]: f.unpack(b'\x00\x00\x007\n123.45.67.89-8888\x00Some-String-2\x00\x00\x00\x01Z\xa5\xfb\xc8\xab\x00\x0e\x00\x15\x00\x01\x00\x00')
---------------------------------------------------------------------------
SuitcaseParseError Traceback (most recent call last)
<ipython-input-3-91ea443f72a0> in <module>()
----> 1 f.unpack(b'\x00\x00\x007\n123.45.67.89-8888\x00Some-String-2\x00\x00\x00\x01Z\xa5\xfb\xc8\xab\x00\x0e\x00\x15\x00\x01\x00\x00')
/home/mpelikan/.local/lib/python3.6/site-packages/suitcase/structure.py in unpack(self, data, trailing)
339
340 def unpack(self, data, trailing=False):
--> 341 return self._packer.unpack(data, trailing)
342
343 def pack(self):
/home/mpelikan/.local/lib/python3.6/site-packages/suitcase/structure.py in unpack(self, data, trailing)
62 def unpack(self, data, trailing=False):
63 stream = BytesIO(data)
---> 64 self.unpack_stream(stream)
65 stream.tell()
66 if trailing:
/home/mpelikan/.local/lib/python3.6/site-packages/suitcase/structure.py in unpack_stream(self, stream)
150 "%r we tried to read %s bytes but "
151 "we were only able to read %s." %
--> 152 (_name, length, len(data)))
153 try:
154 field.unpack(data)
SuitcaseParseError: While attempting to parse field 'd' we tried to read None bytes but we were only able to read 32.
Since the two strings in my case are found in succession and can be read out by the same Payload() field, I've used a bit of a hack to work around this for now.
...
    _c_d = Payload()
    c = FieldProperty(
        _c_d, onget=lambda v: str(v.split(b'\x00')[0]))
    d = FieldProperty(
        _c_d, onget=lambda v: str(v.split(b'\x00')[1]))
...
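One caveat with this hack on Python 3: calling str() on a bytes object returns its repr (for example "b'abc'"), not the decoded text. A small sketch of a helper that splits and decodes instead is below; `nth_cstring` is a hypothetical name, and the ASCII default is an assumption about the protocol's string encoding.

```python
def nth_cstring(payload, index, encoding="ascii"):
    """Return the index-th null-terminated string in payload, decoded to str."""
    parts = payload.split(b"\x00")
    # A trailing terminator leaves one empty element at the end; drop it.
    if parts and parts[-1] == b"":
        parts = parts[:-1]
    return parts[index].decode(encoding)


combined = b'123.45.67.89-8888\x00Some-String-2\x00'
print(nth_cstring(combined, 0))  # 123.45.67.89-8888
print(nth_cstring(combined, 1))  # Some-String-2
```

It could be plugged into the properties above as onget=lambda v: nth_cstring(v, 0) and onget=lambda v: nth_cstring(v, 1).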
I looked through the closed Issues and PRs, and noticed a similar request here: #21. In my case all of this data is within a fixed frame, but there are two Payload() fields of unknown size, whereas in that case one of them is constrained to a fixed size, which is probably what makes the test work.