django-taggit

Different unicode but same slug exception error

Open avorio opened this issue 9 years ago • 4 comments

Here's an interesting exception:

Traceback (most recent call last):
  File "[...]/python3.4/site-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
psycopg2.IntegrityError: duplicate key value violates unique constraint "taggit_tag_slug_key"
DETAIL:  Key (slug)=(musique-vocale-france-18e-siecle) already exists.

After some time troubleshooting this, I realised that the error was being caused by the fact that two different unicode strings [used as tags] lead to the same slug. They were:

>>> 'musique vocale -- france -- 18e siècle'.encode('utf-8')
b'musique vocale -- france -- 18e si\xc3\xa8cle'
>>> 'musique vocale -- france -- 18e siècle'.encode('utf-8')
b'musique vocale -- france -- 18e sie\xcc\x80cle'

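The two spellings of siècle render identically, but they are different strings, as the bytes above show: the first uses the precomposed è (U+00E8), while the second uses a plain e followed by a combining grave accent (U+0300). The first string was the name of an existing tag; the second was being applied as a new tag.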
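To make the difference concrete, here is a minimal sketch using only the standard library. The helper `ascii_fold` is a hypothetical stand-in for the ASCII-folding step a slugifier typically performs; it is not django-taggit's or Django's actual code, just an illustration of why both spellings could collapse to the same slug.

```python
import unicodedata

composed = "si\u00e8cle"     # 'è' as a single code point, U+00E8
decomposed = "sie\u0300cle"  # 'e' followed by U+0300 COMBINING GRAVE ACCENT


def ascii_fold(text):
    """Hypothetical helper: decompose, then drop anything non-ASCII."""
    folded = unicodedata.normalize("NFKD", text)
    return folded.encode("ascii", "ignore").decode("ascii")


# The strings differ code point by code point...
print(composed == decomposed)
# ...but are equal once both are normalized to the same form (NFC here).
print(unicodedata.normalize("NFC", composed) == unicodedata.normalize("NFC", decomposed))
# After ASCII folding, both reduce to the same text.
print(ascii_fold(composed), ascii_fold(decomposed))
```

If tag names were normalized to a single form (for example NFC) before being looked up or saved, the two spellings would presumably match the same existing tag instead of producing a second tag with a colliding slug.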

In this case, where the two tag names are technically different, shouldn't we end up with different slugs instead of this nasty exception? Something like musique-vocale-france-18e-siecle-2 would have solved the problem, although it would not resolve the Unicode confusion for the end user.
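The suffixing idea could look roughly like the sketch below. The function `unique_slug` and the in-memory set `taken` are hypothetical and only illustrate the proposal; this is not how django-taggit is confirmed to behave, and a real implementation would check the database instead of a set.

```python
def unique_slug(base, existing):
    """Return base, or the first of base-2, base-3, ... not in existing."""
    if base not in existing:
        return base
    n = 2
    while f"{base}-{n}" in existing:
        n += 1
    return f"{base}-{n}"


taken = {"musique-vocale-france-18e-siecle"}
print(unique_slug("musique-vocale-france-18e-siecle", taken))
```

Note that a check-then-insert approach like this can still race against a concurrent insert, so the unique constraint on the slug column may fire anyway and the IntegrityError would still need handling.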

avorio • Sep 09 '15 22:09

I recently discovered that this error seems to happen only when django_postgrespool is active.

avorio • Oct 20 '16 17:10

How did you solve this? Did you really need to disable django_postgrespool?

luisehk • Jul 18 '17 18:07

Hi @luisehk! It's been a while since I dealt with this issue. I couldn't confirm the link to django_postgrespool that I suggested earlier, as we no longer use it in that project.

If you find something else out, please update this issue.

avorio • Jul 19 '17 16:07

Several past issues also cover collation problems around slugs and the like. I feel this is probably still an issue, or at least that collation needs more documentation.

rtpg • Apr 14 '21 06:04