django-taggit

Some unicode tags return empty slug

Open tpeaton opened this issue 10 years ago • 10 comments

Certain Unicode strings don't yield a usable slug.

>>> obj.tags.add('(ง’̀’́)ง')
>>> obj.tags.all()
[<Tag: ง’̀’́)ง>]
>>> obj.tags.all()[0].slug
u''

We've solved this internally by doing this:

from shortuuid import uuid

name = data['name'].encode('ascii', 'ignore').decode('utf-8')
if not taggit_slugify(name):
    data['slug'] = 'tag-{}'.format(uuid())

Seem reasonable? If so, I can add it to this method and create a PR.
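For readers who want to try the idea outside a project, here is a minimal standard-library sketch of the same fallback. It makes several assumptions: `ascii_only_slug` is a hypothetical stand-in for an ASCII-only slugify (it is not taggit's own function), `slug_with_fallback` is an illustrative name, and `uuid.uuid4()` from the standard library replaces `shortuuid`.

```python
import re
import uuid


def ascii_only_slug(name: str) -> str:
    """Hypothetical stand-in for an ASCII-only slugify (not taggit's code)."""
    # Drop every non-ASCII character, as the workaround above does.
    ascii_name = name.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of anything that is not a lowercase letter or digit.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def slug_with_fallback(name: str) -> str:
    """Return the ASCII slug, or a generated 'tag-<hex>' slug if it is empty."""
    slug = ascii_only_slug(name)
    if slug:
        return slug
    # Assumption: stdlib uuid4 is used here instead of shortuuid.
    return "tag-{}".format(uuid.uuid4().hex)
```

A generated slug keeps the tag addressable, but it is random, so the same tag name created twice would get two different slugs.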

tpeaton avatar Oct 09 '15 16:10 tpeaton

Do you have 'unidecode' installed?

frewsxcv avatar Oct 09 '15 16:10 frewsxcv

I do not.

tpeaton avatar Oct 09 '15 16:10 tpeaton

So if you install that, it might fix this issue. We should probably add a note about it in the docs somewhere.

Relevant PR: https://github.com/alex/django-taggit/pull/315
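A quick way to check whether unidecode is importable in a given environment is an optional import with a pass-through fallback. This is only a sketch: `HAVE_UNIDECODE` and `transliterate` are illustrative names, and the thread does not say whether taggit itself uses this exact pattern.

```python
try:
    # Third-party package: pip install unidecode
    from unidecode import unidecode
    HAVE_UNIDECODE = True
except ImportError:
    HAVE_UNIDECODE = False

    def unidecode(text):
        # Fallback when the package is missing: return the text unchanged.
        return text


def transliterate(text: str) -> str:
    """Transliterate to ASCII if unidecode is installed, else pass through."""
    return unidecode(text)


if __name__ == "__main__":
    print("unidecode installed:", HAVE_UNIDECODE)
```

If `HAVE_UNIDECODE` is `False`, non-ASCII tag names reach the slugify step untransliterated, which matches the empty-slug symptom reported above.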

frewsxcv avatar Oct 09 '15 16:10 frewsxcv

That seems to work fine, thanks!

tpeaton avatar Oct 09 '15 16:10 tpeaton

Yes, installing "unidecode" fixed the issue. (pip install unidecode)

uksmartsolutions avatar Apr 11 '22 19:04 uksmartsolutions

I have installed unidecode, but my Cyrillic tags still have empty slugs.

AliIslamov avatar Jul 13 '23 08:07 AliIslamov

@AliIslamov could you provide a snippet of how you are creating your tags, along with an assertion that the slug is indeed the empty string?

In particular, could you try things like calling tag.slugify(tag_string) to confirm that the slugify method is returning an empty string?

rtpg avatar Jul 14 '23 01:07 rtpg

I would like to add more tests or something here to work through this issue but am having a hard time conceptualizing at what part of the system this is failing

rtpg avatar Jul 14 '23 01:07 rtpg

> @AliIslamov could you provide a snippet of how you are creating your tags, along with an assertion that the slug is indeed the empty string?
>
> In particular, could you try things like calling tag.slugify(tag_string) to confirm that the slugify method is returning an empty string?

I have already solved my problem.

  1. Added this to settings.py:
     TAGGIT_STRIP_UNICODE_WHEN_SLUGIFYING = True
  2. Installed unidecode:
     pip install unidecode
  3. Added this to models.py for my application:
     from unidecode import unidecode

And now everything works well 👍

AliIslamov avatar Jul 14 '23 08:07 AliIslamov

> I would like to add more tests or something here to work through this issue but am having a hard time conceptualizing at what part of the system this is failing

We can conclude that, while Unicode is enabled by default, Cyrillic tags get an empty slug.

The situation is solved by forcing Unicode-to-ASCII conversion, installing the unidecode module, and importing it in the models.py file that contains the TaggableManager() field.

In my opinion, it makes sense to add these instructions to the documentation.
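To illustrate why Unicode-to-ASCII conversion alone can leave a Cyrillic tag with nothing to slugify, here is a standard-library sketch. `strip_to_ascii` is a hypothetical helper written for this illustration, not part of taggit or Django; it normalizes with NFKD and then drops whatever is not ASCII.

```python
import unicodedata


def strip_to_ascii(text: str) -> str:
    """Decompose characters (NFKD), then drop everything that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# An accented Latin letter keeps its base letter after decomposition.
print(repr(strip_to_ascii("caf\u00e9")))  # 'cafe'

# Cyrillic letters have no ASCII base letter, so nothing is left.
print(repr(strip_to_ascii("\u043f\u0440\u0438\u0432\u0435\u0442")))  # ''
```

A transliteration step such as unidecode maps those letters to Latin ones before they are stripped, which would explain why installing it is part of the fix described in this thread.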

AliIslamov avatar Jul 14 '23 08:07 AliIslamov