[source-postgres] Array fields in PostgreSQL table are not populating in Snowflake target table
Connector Name
source-postgres
Connector Version
3.1.7
What step the error happened?
During the sync
Relevant information
Note: the connection is from source-postgres 3.1.7 to destination-snowflake 3.1.2
We have a PostgreSQL table that tracks transfers of a list of customer from one agent to another.
create table customer_transfers
(
id uuid default gen_random_uuid() not null
primary key,
transferred_customer_ids uuid[] default '{}'::uuid[],
from_agent_id uuid not null,
to_agent_id uuid not null,
created_at timestamp not null,
updated_at timestamp not null,
);
We are syncing it from PostgreSQL into Snowflake for analysis and noticed that the transferred_customer_ids in the Snowflake target table was NULL for all records. Additionally, we are seeing the following in the _airbyte_meta field for each record in the target table
{ "errors": [ "Problem with `transferred_customer_ids`"] }
When looking at the intermediate _airbyte_raw_customer_transfers table, I noticed that the json object in the _airbyte_data field for the records looked like
{
"created_at": "2018-12-26T22:39:09.357020",
"from_agent_id": "79e7c905-84cc-4674-a10f-a89e89710a21",
"id": "32f9096a-717b-4be9-9b7f-d0d5c579c29f",
"to_agent_id": "b6c85557-c920-4ad3-9be4-fabfc49f3dd0",
"transferred_customer_ids": "{9d324c1c-cee7-4050-8e93-31add0625073,ae664d72-1cd2-4647-82a5-864b09cfd696,ae664d72-1cd2-4647-82a5-864b09cfd696}",
"updated_at": "2018-12-26T22:39:09.357020"
}
Since the value for transferred_customer_ids is stored as a string (following Postgres conventions) rather than as a json array, I suspect that is causing the snowflake-target import process to report the values as the incorrect data type for the column in the target table and replacing them with NULLs when importing the other values.
Also, I'm happy to take a stab at submitting a fix, but it has been a very long time since I've worked in a Java code base so I'd likely be quite slow to get this resolved--assuming my diagnosis of the problem is even correct.
Relevant log output
No response
Contribute
- [ ] Yes, I want to contribute
@asimy in the settings of your connection, what is the data type for the column? Destinations trust this catalog for typing information, so if the data type is correct there then the destinations should cast it properly.
@jbfbell, the data type is "Array" in the connection.
@asimy, did you get a solution to this problem. I have the same issue.
@Wamolambo I have not found a solution. None of the array fields in our PG tables are populating in Snowflake.
This is source-postgres issue - transferring to the right team
Possible duplicate of airbytehq/airbyte#34986
Just verified with the latest PG -> Snowflake and BigQuery syncs. uuid arrays can now be correctly synced. I'm closing this ticket. Please reopen if this happens again.