r2dbc-postgresql
How do I fetch COPY output results?
Is there an official way of getting COPY output results? I see that CopyData messages arrive in PostgresqlResult.messages, but there is no way for me to get at them because PostgresqlResult.map is hardcoded to return regular rows only.
Thank you.
There's no direct API for CopyData consumption. PostgresReplicationStream uses CopyData internally, which lets you consume the replication stream of START_REPLICATION SLOT … LOGICAL ….
Care to elaborate on your use case so we can understand what you want to achieve and discuss how to make it possible?
@mp911de My use case is data replication, so it's possible that a single COPY query will return hundreds of GB of data. As a result, I need to parse the response in a streaming fashion.
Data replication as in logical decoding? How would you consume COPY through the pgjdbc driver? Maybe we can come up with something similar.
I read it as: he just wants to consume the COPY stream. Data replication is orthogonal to the request.
Dave is correct. I want to consume a very long stream of binary data. Typically COPY output is either CSV or BINARY, so my client code will take care of parsing it.
What's also important is knowing when the stream has been fully consumed. For that I need to receive CopyDone.
If I were to modify the existing code with minimal intrusion, I'd probably add something like CopyDataStreaming, where the data field is of type Flux<ByteBuf>.
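A minimal, self-contained sketch of the idea. CopyDataStreaming and Flux<ByteBuf> are only proposed above and don't exist yet, so the chunk stream is modeled here with a plain java.util.Iterator of byte arrays; iterator exhaustion stands in for the CopyDone signal:

```java
import java.util.Iterator;
import java.util.List;

public class CopyStreamSketch {
    // Stand-in for the proposed CopyDataStreaming: chunks arrive one at a
    // time, and the consumer never holds the whole payload in memory.
    static long consume(Iterator<byte[]> chunks) {
        long total = 0;
        while (chunks.hasNext()) {
            byte[] chunk = chunks.next();
            total += chunk.length; // real code would parse CSV/BINARY here
        }
        // Iterator exhaustion models CopyDone: the stream is complete.
        return total;
    }

    public static void main(String[] args) {
        Iterator<byte[]> chunks = List.of(
            "1,alice\n".getBytes(),
            "2,bob\n".getBytes()
        ).iterator();
        System.out.println(consume(chunks)); // prints 14
    }
}
```

With a reactive Flux<ByteBuf> the shape would be the same, except backpressure decides when the next chunk is read from the socket.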
Is that something we could map onto PostgresqlResult in the form of e.g.
/**
* Return a stream of mapped objects from {@link CopyData}. The stream terminates when the underlying stream receives {@link CopyDone}.
*/
<T> Flux<T> mapCopy(Function<CopyData, ? extends T> mappingFunction);
Since I'm not terribly familiar with Copy (and CopyIn, CopyOut), I'm happy to take suggestions. Also, an example of how the calling code could look would help find a suitable approach. CopyData holds a ByteBuf that can be consumed by the caller.
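A sketch of how calling code might use the proposed mapCopy. Since mapCopy doesn't exist yet, both the CopyData stand-in and the method itself are stubbed here with plain java.util.stream types so the example is runnable; in the real driver the source would be a Flux of backend messages terminated by CopyDone:

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;

public class MapCopyUsageSketch {
    // Hypothetical stand-in for the CopyData backend message; it only
    // exposes the raw payload bytes, as in the driver.
    record CopyData(byte[] data) {}

    // Stand-in for the proposed PostgresqlResult.mapCopy(...): applies the
    // mapping function to each CopyData chunk. Stream termination models
    // the arrival of CopyDone.
    static <T> Stream<T> mapCopy(Stream<CopyData> messages,
                                 Function<CopyData, ? extends T> mappingFunction) {
        return messages.map(mappingFunction);
    }

    public static void main(String[] args) {
        Stream<CopyData> messages = Stream.of(
            new CopyData("1,alice".getBytes(StandardCharsets.UTF_8)),
            new CopyData("2,bob".getBytes(StandardCharsets.UTF_8)));

        // Calling code: decode each chunk as UTF-8 CSV text.
        List<String> rows = mapCopy(messages,
            cd -> new String(cd.data(), StandardCharsets.UTF_8)).toList();

        System.out.println(rows); // prints [1,alice, 2,bob]
    }
}
```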
Thank you, Mark. I'm afraid returning the response in a ByteBuf won't cut it; it can easily be a terabyte of data. Let me fork and experiment with implementation options, and I'll raise a PR to discuss whether the solution is feasible.
Sure, happy to continue the discussion over the actual code.
Note that the driver is built on the assumption that an entire frame (message) is decoded (see BackendMessageDecoder) before a consumer can process it. According to the Postgres protocol spec, a single frame may be up to 2 GB, which is a lot of memory.
No, COPY is a stream; it could be TBs, as mentioned.
Does the server send multiple CopyData frames or is that part of a single frame?
Good question, I actually haven't looked. You are probably correct, though; it probably does send multiple frames.
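For what it's worth, the PostgreSQL protocol documentation says a COPY TO operation is a CopyOutResponse followed by one CopyData message per row and a final CopyDone, so the per-frame memory ceiling applies per row rather than per result. A runnable sketch of decoding that frame sequence, assuming the documented layout (Byte1 type, Int32 length that includes the length field itself, then the payload):

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class CopyFrameDecoder {
    // Collects CopyData ('d') payloads from a buffer of backend messages,
    // stopping at CopyDone ('c'). Frame layout per the protocol spec:
    // Byte1 type, Int32 length (includes the length field), then payload.
    static List<byte[]> decode(ByteBuffer buf) {
        List<byte[]> rows = new ArrayList<>();
        while (buf.remaining() >= 5) {
            byte type = buf.get();
            int len = buf.getInt() - 4;         // payload length
            byte[] payload = new byte[len];
            buf.get(payload);
            if (type == 'c') break;             // CopyDone: stream complete
            if (type == 'd') rows.add(payload); // CopyData: one row
        }
        return rows;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        byte[] row = "1,alice\n".getBytes();
        buf.put((byte) 'd').putInt(4 + row.length).put(row); // CopyData
        buf.put((byte) 'c').putInt(4);                       // CopyDone
        buf.flip();
        System.out.println(decode(buf).size()); // prints 1
    }
}
```

A streaming variant would hand each payload to the consumer as it is decoded instead of accumulating a list, which is exactly the behavior the thread is asking the driver to expose.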