
duckdb can't insert due to constraint, but also there is no constraint (there is one)


What happens?

I have a table in Postgres that I want to insert into from DuckDB. The table has a unique constraint spanning multiple columns that determines what counts as a unique row. On conflict, the duplicate rows should be ignored.

Here is the code I use to set it up (it lives in a class, but that makes no difference):

# `import duckdb` happens at module level; this runs inside the class
self.conn = duckdb.connect()
self.conn.install_extension('postgres')
self.conn.load_extension('postgres')
self.conn.sql("ATTACH 'dbname=player_data user=postgres host=127.0.0.1 password=mypassword' AS postgres (TYPE POSTGRES);")

If I run this

r = self.conn.sql('''
	INSERT INTO postgres.ranked.ranked_data
	SELECT * FROM 'data/*.json';
''')

print(r)

it gives me this error

duckdb.duckdb.Error: Failed to copy data: ERROR:  duplicate key value violates unique constraint "unique_row"
DETAIL:  Key (leagueid, queuetype, tier, rank, leaguepoints)=(766896e1-2b75-450f-acd4-083d5ef60b55, RANKED_SOLO_5x5, DIAMOND, I, 50) already exists.
CONTEXT:  COPY ranked_data, line 45
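
The CONTEXT line suggests that the extension ships the insert to Postgres over COPY, so it is the remote "unique_row" constraint that rejects the batch.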

If I add an ON CONFLICT clause

r = self.conn.sql('''
	INSERT INTO postgres.ranked.ranked_data
	SELECT * FROM 'data/*.json'
	ON CONFLICT DO NOTHING;
''')

print(r)

it gives me this error

duckdb.duckdb.BinderException: Binder Error: There are no UNIQUE/PRIMARY KEY Indexes that refer to this table, ON CONFLICT is a no-op
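
Apparently the binder only checks UNIQUE/PRIMARY KEY indexes that DuckDB itself manages, and doesn't see the constraint that lives on the Postgres side.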

According to DuckDB, the insert fails because there is a constraint, yet ON CONFLICT is rejected because there is no constraint. Help me make this make sense, please!

To Reproduce

This is the table

CREATE TABLE "ranked_data" (
	"leagueid" UUID NULL DEFAULT NULL,
	"queuetype" VARCHAR NULL DEFAULT NULL,
	"tier" VARCHAR NULL DEFAULT NULL,
	"rank" VARCHAR NULL DEFAULT NULL,
	"summonerid" VARCHAR NULL DEFAULT NULL,
	"leaguepoints" BIGINT NULL DEFAULT NULL,
	"wins" BIGINT NULL DEFAULT NULL,
	"losses" BIGINT NULL DEFAULT NULL,
	"veteran" BOOLEAN NULL DEFAULT NULL,
	"inactive" BOOLEAN NULL DEFAULT NULL,
	"freshblood" BOOLEAN NULL DEFAULT NULL,
	"hotstreak" BOOLEAN NULL DEFAULT NULL,
	"region" VARCHAR NULL DEFAULT NULL,
	"api_page" BIGINT NULL DEFAULT NULL,
	"timestamp" TIMESTAMP NULL DEFAULT NULL,
	UNIQUE "unique_row" ("leagueid", "queuetype", "tier", "rank", "leaguepoints")
)
;

OS:

Ubuntu, latest

DuckDB Version:

whatever comes from pip

DuckDB Client:

python 3.10

Full Name:

Nicholas Hansen-Feruch

Affiliation:

None

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

No - I cannot easily share my data sets due to their large size

Did you include all code required to reproduce the issue?

  • [ ] Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • [X] Yes, I have

NFeruch avatar May 13 '24 02:05 NFeruch

https://github.com/duckdb/postgres_scanner/issues/189

Tishj avatar May 13 '24 07:05 Tishj

I am also facing the same issue

karthik-sr87 avatar Jun 04 '24 12:06 karthik-sr87

As mentioned in https://github.com/duckdb/postgres_scanner/issues/189#issuecomment-1986799254, this is currently an expected restriction: upserts are not yet supported in the postgres scanner. Closing this as a duplicate of that issue.

Mytherin avatar Jun 04 '24 20:06 Mytherin
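
Until upserts are supported by the extension, one possible workaround is to emulate ON CONFLICT DO NOTHING on the client side: filter the source rows against the key columns of the remote table and insert only the remainder. A minimal sketch, assuming the same attachment, table, and JSON layout as in the report; the connection setup and file glob are taken from above, and everything else is illustrative rather than an official API:

import duckdb

conn = duckdb.connect()
conn.install_extension('postgres')
conn.load_extension('postgres')
conn.sql("ATTACH 'dbname=player_data user=postgres host=127.0.0.1 password=mypassword' AS postgres (TYPE POSTGRES);")

# Emulate ON CONFLICT DO NOTHING: insert only rows whose constraint key
# (leagueid, queuetype, tier, rank, leaguepoints) is not already present in
# the remote table. Caveats: this is not atomic (a concurrent writer can
# still insert the same key between the check and the insert), and rows with
# NULLs in the key columns always pass the filter, mirroring how Postgres
# unique constraints treat NULLs as distinct.
conn.sql('''
	INSERT INTO postgres.ranked.ranked_data
	SELECT src.*
	FROM read_json_auto('data/*.json') AS src
	WHERE NOT EXISTS (
		SELECT 1
		FROM postgres.ranked.ranked_data AS dst
		WHERE dst.leagueid = src.leagueid
		  AND dst.queuetype = src.queuetype
		  AND dst.tier = src.tier
		  AND dst."rank" = src."rank"
		  AND dst.leaguepoints = src.leaguepoints
	);
''')

Note that the NOT EXISTS filter only guards against rows already in Postgres; if the source files themselves can contain duplicate keys, deduplicate them first as well (for example with DISTINCT ON over the key columns in DuckDB).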