django-taggit

Integrity Error duplicate entry when adding tags with unequal case

Open ghost opened this issue 9 years ago • 10 comments

To reproduce:

- Go to the admin and find a model with tags.
- Choose a model instance and add the tag 'test'.
- Choose another model instance and add the tag 'Test'.

You should get an integrity error.
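
The same failure can be reproduced outside the admin with a minimal sketch. It assumes a simplified stand-in for the `taggit_tag` table and uses SQLite's `COLLATE NOCASE` to mimic a case-insensitive MySQL collation; the function name and schema are illustrative, not taggit's actual migration.

```python
import sqlite3


def reproduce_duplicate_error():
    """Insert 'test' then 'Test' into a case-insensitively unique column.

    Returns the database error message, or None if both inserts succeed.
    """
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for taggit_tag. COLLATE NOCASE is an assumption
    # used here to imitate a case-insensitive MySQL collation.
    conn.execute(
        "CREATE TABLE taggit_tag ("
        "id INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL UNIQUE COLLATE NOCASE, "
        "slug TEXT NOT NULL UNIQUE)"
    )
    conn.execute(
        "INSERT INTO taggit_tag (name, slug) VALUES (?, ?)", ("test", "test")
    )
    try:
        # Different slug, but the name collides under the collation.
        conn.execute(
            "INSERT INTO taggit_tag (name, slug) VALUES (?, ?)", ("Test", "test_1")
        )
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


if __name__ == "__main__":
    print(reproduce_duplicate_error())
```

The second insert is rejected even though 'Test' and 'test' differ byte for byte, because the unique index compares names with the column's collation.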

ghost avatar Apr 28 '15 13:04 ghost

@digiology what version of taggit?

frewsxcv avatar Apr 28 '15 13:04 frewsxcv

Both versions 0.12.2 and 0.14.0

ghost avatar Apr 28 '15 13:04 ghost

The same happened to me. To avoid it, I had to clean the tags in my own forms, converting all of them to lower case.
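
A minimal sketch of that kind of cleaning, assuming the tags arrive as a list of strings; `clean_tag_names` is a hypothetical helper that could be called from a form's clean method, not part of taggit.

```python
def clean_tag_names(raw_tags):
    """Lower-case and de-duplicate tag names, keeping first-seen order."""
    seen = set()
    cleaned = []
    for tag in raw_tags:
        name = tag.strip().lower()
        # Skip empty strings and names already seen in lower-cased form.
        if name and name not in seen:
            seen.add(name)
            cleaned.append(name)
    return cleaned


if __name__ == "__main__":
    print(clean_tag_names(["test", "Test", " Django "]))
```

Lowercasing before saving means the database never sees two names that differ only by case, at the cost of losing the original capitalization.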

seocam avatar Nov 23 '15 21:11 seocam

The bug also happens with accents. For example the tags test and Tést
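
To see why such names can collide, here is a rough sketch of a case- and accent-insensitive comparison key. It only approximates what an accent-insensitive MySQL collation does; `fold` is a hypothetical helper and does not reproduce MySQL's exact rules.

```python
import unicodedata


def fold(name):
    """Rough case- and accent-insensitive key (an approximation only)."""
    # Decompose accented characters into base character + combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, then fold case.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


if __name__ == "__main__":
    print(fold("test") == fold("Tést"))
```

Two names with the same key would be treated as duplicates by a unique index that uses a comparable collation, even though Python's `==` sees them as different strings.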

seocam avatar Nov 24 '15 21:11 seocam

Got the same with version 0.17.6. Using it from aldryn-newsblog.

The generated SQL command is:

mysql> INSERT INTO `taggit_tag` (`name`, `slug`) VALUES ('réseaux de chaleur', 'reseaux-de-chaleur_1')            
    -> ;
ERROR 1062 (23000): Duplicate entry 'réseaux de chaleur' for key 'taggit_tag_name_4ed9aad194b72af1_uniq'

'Réseaux de chaleur' already exists, so the insert fails even though it leads to a different slug…

mysql> select * from taggit_tag where slug="reseaux-de-chaleur";
+----+---------------------+--------------------+
| id | name                | slug               |
+----+---------------------+--------------------+
|  8 | Réseaux de chaleur  | reseaux-de-chaleur |
+----+---------------------+--------------------+

From the admin interface, it works OK. It tells me that this slug and name already exist.

The database engine in my case is MySQL / MyISAM:

mysql> select * from taggit_tag where name = "Réseaux de chaleur" collate utf8_bin;
+----+---------------------+--------------------+
| id | name                | slug               |
+----+---------------------+--------------------+
|  8 | Réseaux de chaleur  | reseaux-de-chaleur |
+----+---------------------+--------------------+
1 row in set (0.00 sec)

mysql> select * from taggit_tag where name = "réseaux de chaleur" ;
+----+---------------------+--------------------+
| id | name                | slug               |
+----+---------------------+--------------------+
|  8 | Réseaux de chaleur  | reseaux-de-chaleur |
+----+---------------------+--------------------+
1 row in set (0.00 sec)

mysql> select * from taggit_tag where name = "réseaux de chaleur" collate utf8_bin;
Empty set (0.00 sec)

It seems to be a MySQL encoding/collation problem…

I found a note about that in the Django docs: https://docs.djangoproject.com/en/1.8/ref/databases/#collation-settings

I tried to change the code to handle this case… until I saw this setting and set it to True:

TAGGIT_CASE_INSENSITIVE = True

Problem solved !
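
As far as I understand, this setting makes taggit look up existing tags case-insensitively before creating a new one. The sketch below illustrates that idea on a plain list of names; `get_or_create_tag` is a hypothetical helper, not taggit's implementation.

```python
def get_or_create_tag(existing_names, new_name, case_insensitive=True):
    """Return (stored_name, created) for new_name against existing_names."""
    for stored in existing_names:
        if case_insensitive:
            matches = stored.casefold() == new_name.casefold()
        else:
            matches = stored == new_name
        if matches:
            # Reuse the existing tag instead of attempting an insert.
            return stored, False
    existing_names.append(new_name)
    return new_name, True


if __name__ == "__main__":
    tags = ["test"]
    print(get_or_create_tag(tags, "Test"))
    print(get_or_create_tag(tags, "Test", case_insensitive=False))
```

With the case-insensitive lookup, 'Test' resolves to the existing 'test' and no second row is inserted. With the case-sensitive lookup, a new row is attempted, which is the insert a case-insensitive unique index rejects.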

alexandrenorman avatar Dec 20 '15 23:12 alexandrenorman

Has this issue been fixed yet? I just came across the same thing this morning.

aidan-doherty avatar Jun 05 '17 08:06 aidan-doherty

For me, TAGGIT_CASE_INSENSITIVE fixed things for ASCII strings, but entering a tag with the Unicode character "é" (\u00e9) twice still seems to cause an integrity error.

(It might be fixable by changing the MySQL collation settings, but in the end I just switched to Postgres.)

hjwp avatar Sep 07 '17 05:09 hjwp

Maybe I'm misunderstanding, but isn't this a major feature which is completely inoperable? We've run into this on our project, and it seems that case-sensitive tags simply don't work at all. Is that right, or are there some configuration settings or usage patterns that I'm not understanding? Given that TAGGIT_CASE_INSENSITIVE defaults to False, I assume that case-sensitive is the primary code path, and should function as one would expect (i.e., tags of different cases are respected).

fildred13 avatar Jan 21 '21 18:01 fildred13

@fildred13 This is a MySQL issue in particular; we can't really do much about case sensitivity checks on this side short of documenting the issue more (see https://docs.djangoproject.com/en/1.8/ref/databases/#collation-settings). At least that's my understanding of this problem.

We should document this though, cuz this is a major footgun that's avoidable with the right set of changes.

rtpg avatar Apr 14 '21 06:04 rtpg

We also ran into this case-sensitivity problem with MySQL years ago. We are maintaining a branch where, in _to_tag_model_instances of taggit/managers.py, we use regular expressions instead of the 'regular' name__in filters: https://github.com/GrandComicsDatabase/django-taggit/tree/django32

I realized during testing today that, on a fresh DB, one seemingly also needs to adjust the uniqueness constraint on the 'name' of a Tag.
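
The regex approach can be sketched roughly as follows. This is an assumption about the general technique, not the code in the linked branch; `build_exact_name_pattern` is a hypothetical helper, and escaping rules differ between Python's `re` and MySQL's regex engine, so the pattern would need checking against the real database.

```python
import re


def build_exact_name_pattern(names):
    """Build an anchored alternation that matches any of the names exactly."""
    escaped = [re.escape(name) for name in names]
    return "^(" + "|".join(escaped) + ")$"


if __name__ == "__main__":
    pattern = build_exact_name_pattern(["test", "Réseaux de chaleur"])
    # In Django this pattern could be passed to a name__regex lookup
    # (hypothetical usage, not shown here to keep the sketch standalone).
    print(bool(re.search(pattern, "test")))
    print(bool(re.search(pattern, "Test")))
```

A case-sensitive regex match distinguishes 'test' from 'Test' where an equality or `IN` comparison under a case-insensitive collation would not.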

jochengcd avatar Oct 24 '21 19:10 jochengcd