canvas icon indicating copy to clipboard operation
canvas copied to clipboard

slugify() method does not support other charsets

Open luiyongsheng opened this issue 5 years ago • 5 comments

I'm trying to add tags with Chinese characters in a blog post but resulted in 500 error during axios request due to the empty value in tag.slug probably caused by slugify.

Temporary fix applied:

const tag = {
    name: searchQuery,
    slug: this.slugify(searchQuery) || searchQuery,
    user_id: Canvas.user.id
}

|| searchQuery added behind the slug: this.slugify(searchQuery)

luiyongsheng avatar Jan 28 '20 18:01 luiyongsheng

Thanks for bringing this up @luiyongsheng. Just because I don't know, is the Chinese-character slug still a valid url slug? Or do you have an example of what you're trying to save so I can try it on my end?

Just making sure this sort of thing won't break other localizations.

austintoddj avatar Jan 28 '20 18:01 austintoddj

image

For example, I added the term 本地化 as a tag and when I click done, no error was shown. But by monitoring the Network activities in DevTools, the POST request to canvas/api/posts/{slug} returned 500 error as shown: image

I believed it was caused by the empty value here: image

So I added this code and repack the script as a temporary fix: image

luiyongsheng avatar Jan 28 '20 19:01 luiyongsheng

Thanks for the update. Yeah, verified the characters on my end too. This simple snippet:

/**
* Generate a URL friendly "slug" from a given string.
*
* @param text
* @return string
* @source https://gist.github.com/mathewbyrne/1280286
*/
slugify(text) {
    return text
        .toString()
        .toLowerCase()
        .replace(/\s+/g, '-')
        .replace(/[^\w\-]+/g, '')
        .replace(/--+/g, '-')
},

is the only part that's trying to "slugify" the name, it's not sufficient enough to handle localized characters at the moment.

I've come across Slugify and NodeSlug, and they may do the job here. I'll look at getting these tested out, but if you want to take a stab at it yourself and submit a PR for it I'd be more than happy to take a look. The method on line 16 of HelperMixin.js is where the new logic would need to go.

austintoddj avatar Jan 28 '20 20:01 austintoddj

@austintoddj I think it’s better to let the user enter the slug himself.

guladima avatar Feb 16 '20 06:02 guladima

Suggestion by @thewwwizard on #892:

/**
 * Return a URL-friendly slug.
 *
 * @param str
 * @returns {string}
 * @link https://gist.github.com/mathewbyrne/1280286#gistcomment-2588056
 */
slugify(str) {
    let text = str.toString().toLowerCase().trim();

    const sets = {}
    sets.default = [
        { to: 'a', from: '[ÀÁÂÃÄÅÆĀĂĄẠẢẤẦẨẪẬẮẰẲẴẶ]' },
        { to: 'c', from: '[ÇĆĈČ]' },
        { to: 'd', from: '[ÐĎĐÞ]' },
        { to: 'e', from: '[ÈÉÊËĒĔĖĘĚẸẺẼẾỀỂỄỆ]' },
        { to: 'g', from: '[ĜĞĢǴ]' },
        { to: 'h', from: '[ĤḦ]' },
        { to: 'i', from: '[ÌÍÎÏĨĪĮİỈỊ]' },
        { to: 'j', from: '[Ĵ]' },
        { to: 'ij', from: '[IJ]' },
        { to: 'k', from: '[Ķ]' },
        { to: 'l', from: '[ĹĻĽŁ]' },
        { to: 'm', from: '[Ḿ]' },
        { to: 'n', from: '[ÑŃŅŇ]' },
        { to: 'o', from: '[ÒÓÔÕÖØŌŎŐỌỎỐỒỔỖỘỚỜỞỠỢǪǬƠ]' },
        { to: 'oe', from: '[Œ]' },
        { to: 'p', from: '[ṕ]' },
        { to: 'r', from: '[ŔŖŘ]' },
        { to: 's', from: '[ߌŜŞŠ]' },
        { to: 't', from: '[ŢŤ]' },
        { to: 'u', from: '[ÙÚÛÜŨŪŬŮŰŲỤỦỨỪỬỮỰƯ]' },
        { to: 'w', from: '[ẂŴẀẄ]' },
        { to: 'x', from: '[ẍ]' },
        { to: 'y', from: '[ÝŶŸỲỴỶỸ]' },
        { to: 'z', from: '[ŹŻŽ]' },
        { to: '-', from: "[·/_,:;']" },
    ];

    /**
     * russian
     */
    sets.ru = [
        { to: 'a', from: '[А]' },
        { to: 'b', from: '[Б]' },
        { to: 'v', from: '[В]' },
        { to: 'g', from: '[Г]' },
        { to: 'd', from: '[Д]' },
        { to: 'e', from: '[ЕЭ]' },
        { to: 'yo', from: '[Ё]' },
        { to: 'zh', from: '[Ж]' },
        { to: 'z', from: '[З]' },
        { to: 'i', from: '[И]' },
        { to: 'j', from: '[Й]' },
        { to: 'k', from: '[К]' },
        { to: 'l', from: '[Л]' },
        { to: 'm', from: '[М]' },
        { to: 'n', from: '[Н]' },
        { to: 'o', from: '[О]' },
        { to: 'p', from: '[П]' },
        { to: 'r', from: '[Р]' },
        { to: 's', from: '[С]' },
        { to: 't', from: '[Т]' },
        { to: 'u', from: '[У]' },
        { to: 'f', from: '[Ф]' },
        { to: 'h', from: '[Х]' },
        { to: 'c', from: '[Ц]' },
        { to: 'ch', from: '[Ч]' },
        { to: 'sh', from: '[Ш]' },
        { to: 'shch', from: '[Щ]' },
        { to: 'y', from: '[Ы]' },
        { to: 'yu', from: '[Ю]' },
        { to: 'ya', from: '[Я]' },
    ];

    /**
     * bulgarian
     */
    sets.bg = [
        { to: 'a', from: '[А]' },
        { to: 'b', from: '[Б]' },
        { to: 'v', from: '[В]' },
        { to: 'g', from: '[Г]' },
        { to: 'd', from: '[Д]' },
        { to: 'e', from: '[ЕЭ]' },
        { to: 'zh', from: '[Ж]' },
        { to: 'z', from: '[З]' },
        { to: 'i', from: '[И]' },
        { to: 'y', from: '[Й]' },
        { to: 'k', from: '[К]' },
        { to: 'l', from: '[Л]' },
        { to: 'm', from: '[М]' },
        { to: 'n', from: '[Н]' },
        { to: 'o', from: '[О]' },
        { to: 'p', from: '[П]' },
        { to: 'r', from: '[Р]' },
        { to: 's', from: '[С]' },
        { to: 't', from: '[Т]' },
        { to: 'u', from: '[У]' },
        { to: 'f', from: '[Ф]' },
        { to: 'h', from: '[Х]' },
        { to: 'ts', from: '[Ц]' },
        { to: 'ch', from: '[Ч]' },
        { to: 'sh', from: '[Ш]' },
        { to: 'sht', from: '[Щ]' },
        { to: 'a', from: '[Ъ]' },
        { to: 'y', from: '[Ь]' },
        { to: 'yu', from: '[Ю]' },
        { to: 'ya', from: '[Я]' },
    ];

    // first try user locale's sets
    if(Canvas.user.locale in sets){
        sets[Canvas.user.locale].forEach((set) => {
            text = text.replace(new RegExp(set.from, 'gi'), set.to);
        });
    }

    // after that use detault sets
    sets.default.forEach((set) => {
        text = text.replace(new RegExp(set.from, 'gi'), set.to);
    });

    return text
        .replace(/\s+/g, '-') // Replace spaces with -
        .replace(/[^\w-]+/g, '') // Remove all non-word chars
        .replace(/--+/g, '-') // Replace multiple - with single -
        .replace(/^-+/, '') // Trim - from start of text
        .replace(/-+$/, ''); // Trim - from end of text
},

More work to be done here, but merging in that PR without this change, and keeping the conversation here until we can resolve this.

austintoddj avatar Dec 27 '20 18:12 austintoddj