sling-cli
sling-cli copied to clipboard
Best way to sling a MySQL BLOB of integers to a PostgreSQL int array
I have a MySQL source table with a BLOB column storing a list of integers from 0-255. Pixel values essentially. Is there an optimal way to sling these to a PostgreSQL table with an INT ARRAY column type? Appreciate any help!
No way for native int array, but jsonb should be possible.
Since blob is mapped to text (see here), you will to cast it as a json yourself with the source.options.columns options. Put something like { my_blob_col: json }, and it should create a jsonb column in postgres.
If int array type in postgres is a must, best I can see is you running a target.options.post_sql query to do that yourself.
Thanks! When setting the source column to "json" and running Sling, it reports that it can't insert into the target database because:
pq: invalid byte sequence for encoding "UTF8": 0x00
Does a transform need to be changed?
can you try with source options { transforms: [ replace_0x00 ] }? That's the unicode NULL byte
Aha, stuck on an older Sling version at the moment but will try that as soon as possible.
Looking at https://stackoverflow.com/a/25600307
Can you try one of the decode transforms? See https://github.com/slingdata-io/sling-cli/blob/main/core/sling/tranforms.go.
On Thu, Jan 25, 2024, 6:21 PM Cody Kilpatrick @.***> wrote:
can you try with source options { transforms: [ replace_0x00 ] }? That's the unicode NULL byte
That worked after switching to v1.0.71, but now it complains about byte character 0xba 😄
— Reply to this email directly, view it on GitHub https://github.com/slingdata-io/sling-cli/issues/132#issuecomment-1911017853, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB2QZYUK5ZRERUQQCIWFRBDYQLEGZAVCNFSM6AAAAABCLBXO26VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMJRGAYTOOBVGM . You are receiving this because you commented.Message ID: @.***>
Also try a different encoding, not too familiar with MySQL advanced settings:
ALTER DATABASE mydb CHARSET utf8;
Just some updates not success yet but I tried:
"transforms": ["replace_0x00", "Each decode transformation"]
"hex_representation": "text and json"
With my postgres table typed as both jsonb and text.
The closest result I got was with just replace_0x00, no transforms into a postgres text column and I received this error:
pq: invalid byte sequence for encoding "UTF8": 0xc1 0xe4 Which almost feels like the value in my int array are being converted to Ascii?
Also just to re-set to the same page: I am running sling with this command:
docker run --network=forerunner-network --rm -v "/Users/codykilpatrick/code/spear/forerunner/packages/forerunner-legacy-api/sling-docker/task.json:/task.json" -i slingdata/sling:v1.0.71 run -d -c task.json
my task.json:
{
"source": {
"conn": "mysql_conn_string",
"stream": "SELECT heading, typekey, bearing_time_configuration, date, id, pixels AS hex_representation FROM forerunner_legacy.bearing_time_record WHERE {incremental_where_cond};",
"primary_key": "id",
"update_key": "date",
"options": {}
},
"target": {
"conn": "postgres_conn_string",
"object": "forerunner_public.bearing_time_record_staging",
"options": {
"table_tmp": "forerunner_public.bearing_time_record_staging_temp",
"pre_sql": "CREATE INDEX temp_idx_id ON forerunner_public.bearing_time_record_staging_temp (id);"
}
},
"mode": "incremental"
}
My Postgres Table
CREATE TABLE forerunner_public.bearing_time_record_staging (
id int PRIMARY KEY,
hex_representation text,
date bigint,
heading bigint,
bearing_time_configuration text,
typekey int,
is_synced boolean
);
My MySQL table:
CREATE TABLE bearing_time_record (
id INT NOT NULL,
date BIGINT NOT NULL,
heading INT NOT NULL,
pixels BLOB NOT NULL,
bearing_time_configuration VARCHAR(255) NOT NULL,
typekey INT NOT NULL,
PRIMARY KEY (id)
);
Also thank you so much for all of your help, you don't know how appreciative I am.
Ah ok, can you find out what the default charset / encoding is for your MySQL source?
On Thu, Jan 25, 2024, 9:08 PM Cody Kilpatrick @.***> wrote:
Just some updates not success yet but I tried: "transforms": ["replace_0x00", "Each decode transformation"] "hex_representation": "text and json" With my postgres table typed as both jsonb and text.
The closest result I got was with just replace_0x00, no transforms into a postgres text column and I received this error: pq: invalid byte sequence for encoding "UTF8": 0xc1 0xe4 Which almost feels like the value in my int array are being converted to Ascii?
Also just to re-set to the same page: I am running sling with this command:
docker run --network=forerunner-network --rm -v "/Users/codykilpatrick/code/spear/forerunner/packages/forerunner-legacy-api/sling-docker/task.json:/task.json" -i slingdata/sling:v1.0.71 run -d -c task.json
my task.json:
{ "source": { "conn": "mysql_conn_string", "stream": "SELECT heading, typekey, bearing_time_configuration, date, id, pixels AS hex_representation FROM forerunner_legacy.bearing_time_record WHERE {incremental_where_cond};", "primary_key": "id", "update_key": "date", "options": {} }, "target": { "conn": "postgres_conn_string", "object": "forerunner_public.bearing_time_record_staging", "options": { "table_tmp": "forerunner_public.bearing_time_record_staging_temp", "pre_sql": "CREATE INDEX temp_idx_id ON forerunner_public.bearing_time_record_staging_temp (id);" } }, "mode": "incremental" }
My Postgres Table
CREATE TABLE forerunner_public.bearing_time_record_staging ( id int PRIMARY KEY, hex_representation text, date bigint, heading bigint, bearing_time_configuration text, typekey int, is_synced boolean );
My MySQL table:
CREATE TABLE bearing_time_record ( id INT NOT NULL, date BIGINT NOT NULL, heading INT NOT NULL, pixels BLOB NOT NULL, bearing_time_configuration VARCHAR(255) NOT NULL, typekey INT NOT NULL, PRIMARY KEY (id) );
Also thank you so much for all of your help, you don't know how appreciative I am.
— Reply to this email directly, view it on GitHub https://github.com/slingdata-io/sling-cli/issues/132#issuecomment-1911198242, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB2QZYX5TOK5MLKAQYZXQDTYQLXW5AVCNFSM6AAAAABCLBXO26VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMJRGE4TQMRUGI . You are receiving this because you commented.Message ID: @.***>
Hey Sorry for the late response on this one.
This query:
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA WHERE schema_name = "my_db";
Returns: DEFAULT_CHARACTER_SET_NAME = latin1 DEFAULT_COLLATION_NAME=latin1_swedish_ci