django-taggit
Different unicode but same slug exception error
Here's an interesting exception:
Traceback (most recent call last):
File "[...]/python3.4/site-packages/django/db/backends/utils.py", line 64, in execute
return self.cursor.execute(sql, params)
psycopg2.IntegrityError: duplicate key value violates unique constraint "taggit_tag_slug_key"
DETAIL: Key (slug)=(musique-vocale-france-18e-siecle) already exists.
After some time troubleshooting this, I realised that the error was being caused by the fact that two different unicode strings [used as tags] lead to the same slug. They were:
>>> 'musique vocale -- france -- 18e siècle'.encode('utf-8')
b'musique vocale -- france -- 18e si\xc3\xa8cle'
>>> 'musique vocale -- france -- 18e siècle'.encode('utf-8')
b'musique vocale -- france -- 18e sie\xcc\x80cle'
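siècle looks identical to siècle, but they are in fact different, as shown above: the first uses the single precomposed character è (U+00E8), while the second uses a plain e followed by a combining grave accent (U+0300). The first string was the name of an existing tag; the second was being applied as a new tag.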
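The difference between the two names can be checked with the standard library alone. The following is a minimal sketch, not django-taggit's or Django's actual code: `approx_slug` is a hypothetical helper that only approximates ASCII slug generation, and the escapes `\u00e8` and `\u0300` spell out the two forms explicitly so they survive copy and paste.

```python
import re
import unicodedata

# Same visible text, two different code point sequences.
composed = "musique vocale -- france -- 18e si\u00e8cle"     # è as one code point (U+00E8)
decomposed = "musique vocale -- france -- 18e sie\u0300cle"  # e + combining grave (U+0300)


def approx_slug(value):
    """Hypothetical ASCII slugifier; an approximation for illustration only."""
    # Decompose, then drop everything that is not ASCII (including the accent).
    value = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    # Remove characters that are not word characters, whitespace, or hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value).strip("-_")


names_differ = composed != decomposed
same_after_nfc = (
    unicodedata.normalize("NFC", composed) == unicodedata.normalize("NFC", decomposed)
)
slugs_collide = approx_slug(composed) == approx_slug(decomposed)
```

Under these assumptions the raw names compare unequal, yet both become equal after NFC normalization and both reduce to the same slug. Normalizing tag names before looking them up would be one possible way to treat them as the same tag.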
In this case [when two tag names are technically different], shouldn't we end up with different slugs [instead of this nasty exception]? Something like musique-vocale-france-18e-siecle-2 would have solved the problem, albeit not the Unicode confusion for the end user.
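The suffix idea could look roughly like the sketch below. It is only an illustration under stated assumptions: `unique_slug` and the `taken` set are hypothetical stand-ins for a database lookup, and this is not how django-taggit is known to resolve collisions.

```python
def unique_slug(base, taken):
    """Return base, or the first of base-2, base-3, ... that is not in `taken`."""
    if base not in taken:
        return base
    n = 2
    while f"{base}-{n}" in taken:
        n += 1
    return f"{base}-{n}"


# Hypothetical set of slugs that already exist in the tag table.
existing = {"musique-vocale-france-18e-siecle"}
candidate = unique_slug("musique-vocale-france-18e-siecle", existing)
```

A check-then-insert approach like this can still race with concurrent writers, so the unique constraint on the slug column remains the final safeguard.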
I have recently discovered that this error seems to happen only when django_postgrespool is active.
How did you solve this? Did you really need to disable django_postgrespool?
Hi @luisehk! It's been a while since I dealt with this issue. I couldn't confirm the causal link with django_postgrespool that I suggested earlier, as we no longer use it in that project.
If you find something else out, please update this issue.
Several past issues also cover collation problems around slugs and the like. I feel this is probably still an issue (or at least that more documentation about collation is needed).