asyncpg icon indicating copy to clipboard operation
asyncpg copied to clipboard

Q: disabling codecs while transferring data?

Open ale-dd opened this issue 2 years ago • 11 comments

I have a couple of potentially silly questions.

Suppose I am reading rows in batches from a table in a source database and writing the same rows in batches into another table in another database.

  1. Can I disable all decoders on the connection to the source database as well as all encoders on the connection to the destination database and just deal with records that have a binary representation of each column?
  2. Can I do so on a statement by statement basis such that for all the statements that are not directly involved into this data triangulation the codecs still apply? Or should I have a separate connection for all the meta operations like introspecting the current status and checkpointing progress?

Thank you!

ale-dd avatar Feb 17 '22 00:02 ale-dd

Codecs are configured per-connection, so if you want them off in one case and on in another, you must use separate connections.

elprans avatar Feb 18 '22 04:02 elprans

Would turning them off also avoid the introspection query that takes place when preparing a statement? That would be a big plus to me, since I often have to coerce types via explicit casting to avoid that query breaking when I am dealing with CRDB...

ale-dd avatar Feb 24 '22 01:02 ale-dd

No. The introspection query is necessary for builtin decoders also.

elprans avatar Feb 24 '22 02:02 elprans

Is there no way to avoid even builtin decoders and encoders so as to be able to just "shuttle" the raw data between two databases without any attempt at interpreting it?

ale-dd avatar Feb 25 '22 01:02 ale-dd

Theres is: use copy_from_table/copy_to_table.

elprans avatar Feb 25 '22 01:02 elprans

Unfortunately, the destination table is expected to already contain older versions of many of the same rows I'd be copying into it. What I need to perform are UPSERTs. For this reason, I cannot use copy_from_table/copy_to_table.

ale-dd avatar Feb 25 '22 03:02 ale-dd

Use a temporary table and then INSERT ... ON CONFLICT from it: https://schinckel.net/2019/12/13/asyncpg-and-upserting-bulk-data/

elprans avatar Feb 25 '22 04:02 elprans

I've seen that recommendation before. I'll keep it in mind in the future. Thank you!

I'm curious if what I was hoping for is something that is just not feasible because of intrinsic reasons pertaining the protocol, or if it is just not a feature that asyncpg presently offers.

ale-dd avatar Feb 25 '22 05:02 ale-dd

It is inherent. asyncpg needs introspection to make sense of the incoming bytes, it's not even text.

elprans avatar Feb 25 '22 05:02 elprans

Yes, I understand that bit, and I was wondering if the same blob of bytes attained from a SELECT could be provided to an INSERT, without interpreting it, provided that the same columns are specified by the SELECT and by the subsequent INSERT.

ale-dd avatar Feb 25 '22 08:02 ale-dd

No, that's not how the implementation works. Turning off data decoding completely is not supported and is not planned to be supported (well, except raw copy which is already implemented).

elprans avatar Feb 25 '22 18:02 elprans