r2dbc-postgresql
How do I fetch COPY output results?
Is there an official way of getting COPY output results? I see that CopyData messages arrive in PostgresqlResult.messages, but there is no way for me to get at them because PostgresqlResult.map is hardcoded to return regular rows only.
Thank you.
There's no direct API for CopyData consumption. PostgresReplicationStream uses CopyData internally, which lets you consume the replication stream of START_REPLICATION SLOT … LOGICAL ….
Care to elaborate on your use case so we can understand what you want to achieve and discuss how to make it possible?
@mp911de My use case is data replication, so it's possible that a single COPY query will return hundreds of GB of data. As a result, I need to parse the response in a streaming fashion.
Data replication as in logical decoding? How would you consume COPY through the pgjdbc driver? Maybe we can come up with something similar.
I read it as: he just wants to consume the COPY stream. Data replication is orthogonal to the request.
Dave is correct. I want to consume a very long stream of binary data. Typically COPY output is either CSV or BINARY, so my client code will take care of parsing it.
What's also important is knowing when the stream has been fully consumed. For that I need to receive CopyDone.
If I were to modify the existing code with minimal intrusion, I'd probably add something like CopyDataStreaming, where the data field is of type Flux<ByteBuf>.
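A minimal, self-contained sketch of the idea. CopyDataStreaming and Flux<ByteBuf> are only proposed above and don't exist yet, so the chunk stream is modeled here with a plain java.util.Iterator of byte arrays; iterator exhaustion stands in for the CopyDone signal:

```java
import java.util.Iterator;
import java.util.List;

public class CopyStreamSketch {
    // Stand-in for the proposed CopyDataStreaming: chunks arrive one at a
    // time, and the consumer never holds the whole payload in memory.
    static long consume(Iterator<byte[]> chunks) {
        long total = 0;
        while (chunks.hasNext()) {
            byte[] chunk = chunks.next();
            total += chunk.length; // real code would parse CSV/BINARY here
        }
        // Iterator exhaustion models CopyDone: the stream is complete.
        return total;
    }

    public static void main(String[] args) {
        Iterator<byte[]> chunks = List.of(
            "1,alice\n".getBytes(),
            "2,bob\n".getBytes()
        ).iterator();
        System.out.println(consume(chunks)); // prints 14
    }
}
```

With a reactive Flux<ByteBuf> the shape would be the same, except backpressure decides when the next chunk is read from the socket.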
Is that something we could map onto PostgresqlResult in the form of e.g.
/**
* Return a stream of mapped objects from {@link CopyData}. The stream terminates when the underlying stream receives {@link CopyDone}.
*/
<T> Flux<T> mapCopy(Function<CopyData, ? extends T> mappingFunction);
Since I'm not terribly familiar with Copy (and CopyIn, CopyOut), I'm happy to take suggestions. Also, an example of how the calling code could look would help find a suitable approach. CopyData holds a ByteBuf that can be consumed by the caller.
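A sketch of how calling code might use the proposed mapCopy. Since mapCopy doesn't exist yet, both the CopyData stand-in and the method itself are stubbed here with plain java.util.stream types so the example is runnable; in the real driver the source would be a Flux of backend messages terminated by CopyDone:

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Stream;

public class MapCopyUsageSketch {
    // Hypothetical stand-in for the CopyData backend message; it only
    // exposes the raw payload bytes, as in the driver.
    record CopyData(byte[] data) {}

    // Stand-in for the proposed PostgresqlResult.mapCopy(...): applies the
    // mapping function to each CopyData chunk. Stream termination models
    // the arrival of CopyDone.
    static <T> Stream<T> mapCopy(Stream<CopyData> messages,
                                 Function<CopyData, ? extends T> mappingFunction) {
        return messages.map(mappingFunction);
    }

    public static void main(String[] args) {
        Stream<CopyData> messages = Stream.of(
            new CopyData("1,alice".getBytes(StandardCharsets.UTF_8)),
            new CopyData("2,bob".getBytes(StandardCharsets.UTF_8)));

        // Calling code: decode each chunk as UTF-8 CSV text.
        List<String> rows = mapCopy(messages,
            cd -> new String(cd.data(), StandardCharsets.UTF_8)).toList();

        System.out.println(rows); // prints [1,alice, 2,bob]
    }
}
```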
Thank you, Mark. I'm afraid returning the response in a ByteBuf won't cut it; it can easily be a terabyte of data. Let me fork and experiment with implementation options, and I'll raise a PR to discuss whether the solution is feasible.
Sure, happy to continue the discussion over the actual code.
Note that the driver is built on the assumption that an entire frame (message) is decoded (see BackendMessageDecoder) before a consumer can process it. According to the Postgres protocol spec, a single frame may be up to 2 GB, which is a lot of memory.
No, COPY is a stream; it could be TBs, as mentioned.
Does the server send multiple CopyData frames or is that part of a single frame?
Good question, I actually haven't looked. You are probably correct, though; it probably does send multiple frames.
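For what it's worth, the PostgreSQL protocol documentation says a COPY TO operation is a CopyOutResponse followed by one CopyData message per row and a final CopyDone, so the per-frame memory ceiling applies per row rather than per result. A runnable sketch of decoding that frame sequence, assuming the documented layout (Byte1 type, Int32 length that includes the length field itself, then the payload):

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class CopyFrameDecoder {
    // Collects CopyData ('d') payloads from a buffer of backend messages,
    // stopping at CopyDone ('c'). Frame layout per the protocol spec:
    // Byte1 type, Int32 length (includes the length field), then payload.
    static List<byte[]> decode(ByteBuffer buf) {
        List<byte[]> rows = new ArrayList<>();
        while (buf.remaining() >= 5) {
            byte type = buf.get();
            int len = buf.getInt() - 4;         // payload length
            byte[] payload = new byte[len];
            buf.get(payload);
            if (type == 'c') break;             // CopyDone: stream complete
            if (type == 'd') rows.add(payload); // CopyData: one row
        }
        return rows;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        byte[] row = "1,alice\n".getBytes();
        buf.put((byte) 'd').putInt(4 + row.length).put(row); // CopyData
        buf.put((byte) 'c').putInt(4);                       // CopyDone
        buf.flip();
        System.out.println(decode(buf).size()); // prints 1
    }
}
```

A streaming variant would hand each payload to the consumer as it is decoded instead of accumulating a list, which is exactly the behavior the thread is asking the driver to expose.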