pylsl
pylsl copied to clipboard
Support for pushing arbitrary byte sequences in cf_string channels
We would like to transmit arbitrary byte sequences using pylsl and cf_string channels. Unfortunately, the current pylsl implementation does not support that. Technically, it should work just fine though.
liblsl states the following about the cft_string channel format:
/** For variable-length ASCII strings or data blobs, such as video frames, complex event
descriptions, etc. */
cft_string = 3,
For the user's convenience, pylsl cf_string outlets accept Python str objects and encode them on the fly. Similarly, the cf_string inlets decode incoming byte strings to str objects.
While this is helpful 99% of the time, it makes sending/receiving arbitrary byte strings impossible, e.g. when one wants to transmit encoded video frame data. Decoding our bytestring to utf-8 first and then have pylsl encode it again is unfortunately not a viable workaround as not all byte sequences can be decoded to utf-8.
Proposed solution
We suggest adding a boolean flag to the corresponding pull_* and push_* function that indicates wether the data should be encoded/decoded, defaulting to True. This should fit conceptionally, given that it is the users responsibility to interpret the data.
Dear pylsl maintainers, what do you think about this proposal?
/cc @ClaraKuper
Just thinking out loud...
We could add another pylsl-only channel_format for cf_bytestring. This would require an additional mapping from cf_bytestring to the same lib.pull_ and lib.push_ functions as cf_string, but it wouldn't pass the cf_string check to trigger the encode/decode.
How does that sound?
I prefer doing something like this instead of adding a kwarg that is only relevant in << 1% of cases and might be confusing if not documented exactly correctly.
@cboulay I think this is a very good alternative. 👍🏻 @clarakuper and I will work on an implementation proposal.
I heard from @ClaraKuper in Slack that you are now encountering a problem with your char-arrays being interpreted as null-terminated strings, effectively cutting your data whenever a 0 appears. This is indeed inconsistent with the doctoring.
There's another push_ method in lsl's C-API that can handle your use-case. Unfortunately it is not exposed in pylsl, probably because the method requires an additional argument so it would be hard to make the call transparent to the user.
https://github.com/sccn/liblsl/blob/05f7e16525930058c2e00681b10da9031fa7b90e/src/lsl_outlet_c.cpp#L161-L173
Since we are providing the length of the char array, it uses the 'from sequence' std::string constructor (5 here), and it shouldn't do a null termination.
So, as above, if you use a new cf_bytestring type, you can modify the function maps to have that entry redirect to lib.lsl_push_sample_buftp. I don't know how to best handle the extra argument; it might require passing a ..., *args.
Even with the above change, I suspect you'll still encounter some difficulties on the receiving end. For example, xdf importers may attempt to coerce the blob into a string and will do the null termination... I haven't looked too closely at that.