tungstenite-rs
tungstenite-rs copied to clipboard
allow sending messages with static or shared data
This allows the server to skip allocating buffers holding the message data to be send if the data is static.
- This doesn't really save anything for clients atm due to the need of masking, although changing the masking logic to run on the output buffer instead of the message buffer could solve that.
- The change in the
Message
fields data make this a breaking change for any code directly creating or deconstructing theMessage
enum. - Keeping the
text
andbinary
methods intact requires adding separatestatic_*
methods, an alternative might be to changetext
/binary
to take anInto<Cow<_>>
directly, this would break any code calling those methods with anything that doesn't implementInto<Cow<_>>
but should be fine for the most common parameters ofString/&str/Vec<u8>/&[u8]
an alternative might be to change text/binary to take an Into<Cow<_>> directly
Another approach would be to implement https://github.com/snapview/tungstenite-rs/issues/96 similar to the way it was tried to be implemented here: https://github.com/snapview/tungstenite-rs/pull/104 , but to use Bytes
as a least common denominator for the type of the data that the message holds. Of course it's not the same as holding 'static
data, yet more efficient than storing Vec<u8>
or String
.
I've changed things around a bit, switching to custom enums for the binary and string data to allow dealing with shared data trough Bytes
and changed the Message::text
/Mesage::binary
methods to handle both the unique(Vec<u8>
/String
), shared(Bytes
) and static(&'static [u8]
/&'static str
) cases transparently.
I've also made the datatype that stores the message data opaque so things can be changed in the future without any api breaks.
I've currently not added support from shared string data, as there is no way to go from Arc<String>
to Bytes
without copying, it is still possible to add this later by making the Frame
deal with the shared text case seperate.
I'd eventually like to review this, but I'll probably need a couple more days before I manage so. @jxs, do you want too?
I'm trying to understand whether this would make the client side any more inefficient.
I have some mostly finished code that removes the need to make the data a unique copy when applying the mask that I was originally going to put into a follow up pr but I can add it here is desired,
I have some mostly finished code that removes the need to make the data a unique copy when applying the mask that I was originally going to put into a follow up pr but I can add it here is desired
That would be interesting to check (probably in a separate commit?).
pushed
When will this be fully merged? I think it's a very nice feature and doesn't make stuff too complex.
I need to send data that allocated by custom allocator.
Can it be trait
instead of enum MessageData
, with method to provide &[u8]
and Vec<u8>
?
When will this be fully merged? I think it's a very nice feature and doesn't make stuff too complex.
It has some unresolved merge conflicts. It seems like replacing the binary bytes with Bytes
would be the simplest solution and will take less code, so unless it has some unfortunate consequences for the performance, I think it would be a way to go.
Can it be trait instead of enum MessageData, with method to provide &[u8] and Vec
?
Do you mean Message
becoming Message<T: AsRef>
? In theory, it's of course possible (not sure if it would be particularly ergonomic though). If we used Bytes
instead of a Vec<u8>
, would it solve your problem?
If we used Bytes instead of a Vec
, would it solve your problem?
I don't see how Bytes
can help without unsafe
.
I have data allocated with libc::malloc
and want to send them via web socket.
Yes, I can transmutate
them into &'static [u8]
and then call Bytes::from_static
.
But this is if I really desperate for performance. So Bytes
and Vec
is not what I need.
&[u8]
would be nice interface (but tungstenite-rs
for some reason may want to modify data?),
so trait
looks the most promising.
Do you mean Message becoming Message<T: AsRef>
Yes, something like this, it can be type Message = GenericMessage<Vec<u8>>
,
so the end user don't see the difference.
I don't see how Bytes can help without unsafe. I have data allocated with libc::malloc and want to send them via web socket.
What I mean is that imagine that you're allocating memory with libc::malloc
, but store it internally as BytesMut
or something like this (might indeed require unsafe
), then it would be easy to take Bytes
and pass it to tungstenite
without copying the data one more time.
&[u8] would be nice interface (but tungstenite-rs for some reason may want to modify data?), so trait looks the most promising.
&[u8]
is not a problem, but the question is how are we going to store it (taking into account that the library could be used with Tokio runtime), i.e. when users read messages, the returned Message
must own the data. When users write messages, the Message
must own something that allows getting a reference to the underlying data in a thread-safe way (so that the Message
could be passed between threads).
The only universal solution that I have in mind is to make Message
generic over T
where T
would implement AsRef
or something like this. This means that the connect
and similar user-friendly methods would use Vec<u8>
for T
, whereas when the users create sockets by themselves, they can specify any type they want (in your case it would mean wrapping the data into some sort of an Arc
though if you're using it with Tokio for instance).
@Dushistov, so for your particular question the workaround would be to create Vec<u8>
from the data that you allocated with libc
and then pass it to the library.
The only problem with this approach is that you'll have to copy the data which is not particularly efficient, but if we make the type of Message
more generic (as I described in the comment above), that alone won't solve your problem, because whenever we receive a frame from the user, we write it to the send queue (that eventually gets flushed), e.g. imaging calling write_message()
4 times in a row with 4 different messages. If they all end up being in a queue, tungstenite
will have to own the data inside to be able to send them.
In the case of this PR, it's mitigated by making the Frame
hold Bytes
(e.g. a strong shared reference to the original data). In other words: you'll have to place your data inside Bytes
anyway (or inside another Arc
based structure, but that would effectively be a reinvention of a Bytes
🙂).
so for your particular question the workaround would be to create Vec
I am already doing so. But this looks like waste, C++ allocate buffer, copy data to it
transfer ownership to Rust code, Rust code then again allocate memory via Vectungstenite-rs
.
If they all end up being in a queue, tungstenite will have to own the data inside to be able to send them.
if tungstenite only want to own data then this is not problem, wraper around allocated data has proper Drop
implementation that call libc::free
, so it is actually like Vec<u8>
with custom allocator and without capacity field.
The only universal solution that I have in mind is to make Message generic
generic Message
simplify thing, it would be possible to have two Message
, one for sending, and one for receiving.
if tungstenite only want to own data then this is not problem, wraper around allocated data has proper Drop implementation that call libc::free, so it is actually like Vec
with custom allocator and without capacity field.
Ok, I think I finally got your point :) So basically you don't really need sending the shared data (like this PR mentions), you're more concerned about the fact that currently you can't pass a custom T: AsRef<[u8]>
structure to the Message
, right?
generic Message simplify thing, it would be possible to have two Message, one for sending, and one for receiving.
Hm, not quite. I mean, of course we could define different data types for incoming and outgoing message (strictly speaking there is nothing wrong with it), but we would need to update quite a few things to make it ergonomic with tokio-tungstenite
then (e.g. won't be able to use StreamExt::split()
and similar things), but there are some solutions for this available though.