canvas slugify() method does not support other charsets

I'm trying to add tags with Chinese characters in a blog post but resulted in 500 error during axios request due to the empty value in tag.slug probably caused by slugify.

Temporary fix applied:

const tag = {
    name: searchQuery,
    slug: this.slugify(searchQuery) || searchQuery,
    user_id: Canvas.user.id
}

|| searchQuery added behind the slug: this.slugify(searchQuery)

Jan 28 '20 18:01 luiyongsheng

Thanks for bringing this up @luiyongsheng. Just because I don't know, is the Chinese-character slug still a valid url slug? Or do you have an example of what you're trying to save so I can try it on my end?

Just making sure this sort of thing won't break other localizations.

Jan 28 '20 18:01 austintoddj

For example, I added the term 本地化 as a tag and when I click done, no error was shown. But by monitoring the Network activities in DevTools, the POST request to canvas/api/posts/{slug} returned 500 error as shown:

I believed it was caused by the empty value here:

So I added this code and repack the script as a temporary fix:

Jan 28 '20 19:01 luiyongsheng

Thanks for the update. Yeah, verified the characters on my end too. This simple snippet:

/**
* Generate a URL friendly "slug" from a given string.
*
* @param text
* @return string
* @source https://gist.github.com/mathewbyrne/1280286
*/
slugify(text) {
    return text
        .toString()
        .toLowerCase()
        .replace(/\s+/g, '-')
        .replace(/[^\w\-]+/g, '')
        .replace(/--+/g, '-')
},

is the only part that's trying to "slugify" the name, it's not sufficient enough to handle localized characters at the moment.

I've come across Slugify and NodeSlug, and they may do the job here. I'll look at getting these tested out, but if you want to take a stab at it yourself and submit a PR for it I'd be more than happy to take a look. The method on line 16 of HelperMixin.js is where the new logic would need to go.

Jan 28 '20 20:01 austintoddj

@austintoddj I think it’s better to let the user enter the slug himself.

Feb 16 '20 06:02 guladima

Suggestion by @thewwwizard on #892:

/**
 * Return a URL-friendly slug.
 *
 * @param str
 * @returns {string}
 * @link https://gist.github.com/mathewbyrne/1280286#gistcomment-2588056
 */
slugify(str) {
    let text = str.toString().toLowerCase().trim();

    const sets = {}
    sets.default = [
        { to: 'a', from: '[ÀÁÂÃÄÅÆĀĂĄẠẢẤẦẨẪẬẮẰẲẴẶ]' },
        { to: 'c', from: '[ÇĆĈČ]' },
        { to: 'd', from: '[ÐĎĐÞ]' },
        { to: 'e', from: '[ÈÉÊËĒĔĖĘĚẸẺẼẾỀỂỄỆ]' },
        { to: 'g', from: '[ĜĞĢǴ]' },
        { to: 'h', from: '[ĤḦ]' },
        { to: 'i', from: '[ÌÍÎÏĨĪĮİỈỊ]' },
        { to: 'j', from: '[Ĵ]' },
        { to: 'ij', from: '[Ĳ]' },
        { to: 'k', from: '[Ķ]' },
        { to: 'l', from: '[ĹĻĽŁ]' },
        { to: 'm', from: '[Ḿ]' },
        { to: 'n', from: '[ÑŃŅŇ]' },
        { to: 'o', from: '[ÒÓÔÕÖØŌŎŐỌỎỐỒỔỖỘỚỜỞỠỢǪǬƠ]' },
        { to: 'oe', from: '[Œ]' },
        { to: 'p', from: '[ṕ]' },
        { to: 'r', from: '[ŔŖŘ]' },
        { to: 's', from: '[ßŚŜŞŠ]' },
        { to: 't', from: '[ŢŤ]' },
        { to: 'u', from: '[ÙÚÛÜŨŪŬŮŰŲỤỦỨỪỬỮỰƯ]' },
        { to: 'w', from: '[ẂŴẀẄ]' },
        { to: 'x', from: '[ẍ]' },
        { to: 'y', from: '[ÝŶŸỲỴỶỸ]' },
        { to: 'z', from: '[ŹŻŽ]' },
        { to: '-', from: "[·/_,:;']" },
    ];

    /**
     * russian
     */
    sets.ru = [
        { to: 'a', from: '[А]' },
        { to: 'b', from: '[Б]' },
        { to: 'v', from: '[В]' },
        { to: 'g', from: '[Г]' },
        { to: 'd', from: '[Д]' },
        { to: 'e', from: '[ЕЭ]' },
        { to: 'yo', from: '[Ё]' },
        { to: 'zh', from: '[Ж]' },
        { to: 'z', from: '[З]' },
        { to: 'i', from: '[И]' },
        { to: 'j', from: '[Й]' },
        { to: 'k', from: '[К]' },
        { to: 'l', from: '[Л]' },
        { to: 'm', from: '[М]' },
        { to: 'n', from: '[Н]' },
        { to: 'o', from: '[О]' },
        { to: 'p', from: '[П]' },
        { to: 'r', from: '[Р]' },
        { to: 's', from: '[С]' },
        { to: 't', from: '[Т]' },
        { to: 'u', from: '[У]' },
        { to: 'f', from: '[Ф]' },
        { to: 'h', from: '[Х]' },
        { to: 'c', from: '[Ц]' },
        { to: 'ch', from: '[Ч]' },
        { to: 'sh', from: '[Ш]' },
        { to: 'shch', from: '[Щ]' },
        { to: 'y', from: '[Ы]' },
        { to: 'yu', from: '[Ю]' },
        { to: 'ya', from: '[Я]' },
    ];

    /**
     * bulgarian
     */
    sets.bg = [
        { to: 'a', from: '[А]' },
        { to: 'b', from: '[Б]' },
        { to: 'v', from: '[В]' },
        { to: 'g', from: '[Г]' },
        { to: 'd', from: '[Д]' },
        { to: 'e', from: '[ЕЭ]' },
        { to: 'zh', from: '[Ж]' },
        { to: 'z', from: '[З]' },
        { to: 'i', from: '[И]' },
        { to: 'y', from: '[Й]' },
        { to: 'k', from: '[К]' },
        { to: 'l', from: '[Л]' },
        { to: 'm', from: '[М]' },
        { to: 'n', from: '[Н]' },
        { to: 'o', from: '[О]' },
        { to: 'p', from: '[П]' },
        { to: 'r', from: '[Р]' },
        { to: 's', from: '[С]' },
        { to: 't', from: '[Т]' },
        { to: 'u', from: '[У]' },
        { to: 'f', from: '[Ф]' },
        { to: 'h', from: '[Х]' },
        { to: 'ts', from: '[Ц]' },
        { to: 'ch', from: '[Ч]' },
        { to: 'sh', from: '[Ш]' },
        { to: 'sht', from: '[Щ]' },
        { to: 'a', from: '[Ъ]' },
        { to: 'y', from: '[Ь]' },
        { to: 'yu', from: '[Ю]' },
        { to: 'ya', from: '[Я]' },
    ];

    // first try user locale's sets
    if(Canvas.user.locale in sets){
        sets[Canvas.user.locale].forEach((set) => {
            text = text.replace(new RegExp(set.from, 'gi'), set.to);
        });
    }

    // after that use detault sets
    sets.default.forEach((set) => {
        text = text.replace(new RegExp(set.from, 'gi'), set.to);
    });

    return text
        .replace(/\s+/g, '-') // Replace spaces with -
        .replace(/[^\w-]+/g, '') // Remove all non-word chars
        .replace(/--+/g, '-') // Replace multiple - with single -
        .replace(/^-+/, '') // Trim - from start of text
        .replace(/-+$/, ''); // Trim - from end of text
},

More work to be done here, but merging in that PR without this change, and keeping the conversation here until we can resolve this.

Dec 27 '20 18:12 austintoddj