
Idea: Allow Mammoth to serialize/deserialize values, for more precise customization

cakoose opened this issue 4 years ago

A problem I have:

  • I want different int8 columns to deserialize differently: number vs BigInt vs Buffer.
  • I have schema-aware JSON deserializers. I'd like to use those to deserialize different jsonb columns differently.

With "pg", you can only have a single serializer/deserializer for a type.

Mammoth has richer type information, e.g. jsonb<Article>. If Mammoth allowed transforming values, I could get exactly the serialization/deserialization I want.
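To make that concrete, here is a hypothetical sketch of the kind of per-column hook meant here. The withDeserializer option is invented for illustration and the exact column helper names are approximate; nothing like this exists in Mammoth today:

import { defineTable, uuid, bigint, jsonb } from "@ff00ff/mammoth";

interface Article { title: string; tags: string[] }

// Hypothetical application-level, schema-aware JSON parser.
declare function parseArticle(raw: string): Article;

// Hypothetical per-column deserializers. `withDeserializer` does not exist
// in Mammoth; it only illustrates the customization being requested.
const articles = defineTable({
    id: uuid().primaryKey(),
    viewCount: bigint().withDeserializer((raw: string) => BigInt(raw)),
    body: jsonb<Article>().withDeserializer(parseArticle),
});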

One downside is performance. But maybe there won't be a performance hit if Mammoth hooks into "pg" at a lower level, parsing the text/binary protocol directly. (Hopefully "pg-protocol" can do most of the work.)

cakoose · Jan 28 '21 17:01

Just ran into another reason: "pg" doesn't correctly deserialize arrays of enums: https://github.com/brianc/node-pg-types/issues/56

Mammoth has the type information to do it correctly.
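For illustration, a sketch of how a schema-aware layer could handle an enum[] column itself, using the postgres-array package (the same array parser pg-types uses internally). The enum type and raw literal here are made up:

import { parse as parseArray } from "postgres-array";

type ArticleStatus = "draft" | "published";

// pg hands back the raw array literal for a user-defined enum[] column
// because no parser is registered for the enum's array OID. A layer that
// knows the column type can parse it correctly itself:
const raw = "{draft,published}";
const statuses = parseArray(raw, (value) => value as ArticleStatus);
// -> ["draft", "published"]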

cakoose · Mar 05 '21 08:03

And would you suggest letting mammoth handle the pool and everything pg is doing right now as well, or do you see another way?

martijndeh · Apr 07 '21 07:04

The statement about "parsing the text/binary protocol directly" was a bit extreme. I think we might get acceptable performance just by passing in a custom value parser, e.g.:

import pg from "pg";

// Identity parser: relay every raw text/binary value untouched so the
// layer above (Mammoth) can decide how to parse it per column.
const client = new pg.Client({
    connectionString: "...",
    types: {
        getTypeParser(typeId: number, format: 'text' | 'binary') {
            return (value: any) => value;
        }
    }
} as any);

This just relays the raw string/binary value, so Mammoth can parse the values depending on the Mammoth schema. (It can use 'pg-types' to do the low-level work.)
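As a rough sketch of that last part: on top of the identity parser above, a schema-aware layer could pick a parser per column and fall back to pg-types for the standard conversions. The column-kind metadata here is hypothetical; Mammoth would derive it from its own schema types:

import { getTypeParser } from "pg-types";

// 1184 is the OID for timestamptz; pg-types exposes its built-in parsers
// through getTypeParser(oid, format).
const parseTimestamptz = getTypeParser(1184, "text");

// Hypothetical column descriptor derived from the Mammoth schema.
type ColumnKind = "int8AsBigInt" | "timestamptz" | "raw";

function deserializeColumn(raw: string, kind: ColumnKind): unknown {
    switch (kind) {
        case "int8AsBigInt":
            return BigInt(raw);
        case "timestamptz":
            return parseTimestamptz(raw);
        default:
            return raw;
    }
}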

cakoose · Apr 08 '21 03:04