demo icon indicating copy to clipboard operation
demo copied to clipboard

[Slugger] blog_index stops working if the title contains certain characters

Open o-alquimista opened this issue 4 years ago • 5 comments

When you write a post with the following title: /test, you get an error when accessing the blog_index route:

An exception has been thrown during the rendering of a template ("Parameter "slug" for route "blog_post" must match "[^/]++" ("/test" given) to generate a corresponding URL.").

The slug \test (backward slash) doesn't break it.

I use the same slugger on my project and it breaks both under the same conditions. I'll be investigating this too, and share back any developments.

EDIT: Oops, I just learned that adding an opening chevron < at the beginning of the title breaks blog_index too. I suspect more characters may also cause this problem. This may require a whitelist-like regex to solve.

o-alquimista avatar Aug 03 '19 17:08 o-alquimista

This is my solution for all forbidden characters:

    public static function slugify(string $string): string
    {
        $slug = preg_replace('/\s+/', '-', mb_strtolower(trim($string), 'UTF-8'));
        $slug = preg_replace('/[^a-z0-9-]/', '', $slug);

        return $slug;
    }

The function strip_tags() was removed because, as the PHP manual says:

Warning: Because strip_tags() does not actually validate the HTML, partial or broken tags can result in the removal of more text/data than expected.

Whenever we used a title such as <test, strip_tags() would remove the entire string in front of it, not just the chevron. An empty slug is then rejected.

So now the slug can only contain a-z, 0-9 and - (hyphen).

Are there any security concerns from removing strip_tags() in this case?

o-alquimista avatar Aug 03 '19 21:08 o-alquimista

It is better to replace everything not-allowed with a hyphen. I.e., /[^a-z0-9-]+/ with -. Then there is no need to replace the whitespace separately an it behaves better in edge cases, like foo :-) bar, where the Douglas's function returns foo---bar instead of a more desirable foo-bar.

Also, it is better to trim the dashes after, not before, because otherwise foo :-) becomes foo--.

Therefore, I suggest the following code:

public static function slugify(string $string): string
    {
        $slug = preg_replace('/[^a-z0-9-]+/', '-', mb_strtolower($string, 'UTF-8'));
        return trim($slug, '-');
    }

jkufner avatar Aug 04 '19 09:08 jkufner

@jkufner Your solution gives much better results, and it simplifies the code a bit.

I've created a pull request already. I can make the changes there and insert your author details if you provide them.

The continuous integration build keeps failing and I haven't found where to inspect that.

o-alquimista avatar Aug 04 '19 13:08 o-alquimista

@o-alquimista I've amended the commit, please see https://github.com/o-alquimista/demo/pull/1

jkufner avatar Aug 04 '19 15:08 jkufner

Isn't this fixed as we are using the new Symfony slugger provided by String component?

juanmiguelbesada avatar Jan 11 '20 19:01 juanmiguelbesada

Let's close this old issue as fixed because the application no longer suffers from these issues. Thanks.

javiereguiluz avatar Apr 17 '23 08:04 javiereguiluz