Routing icon indicating copy to clipboard operation
Routing copied to clipboard

The router fails to match unicode uri

Open steve-todorov opened this issue 8 years ago • 5 comments

I've been writing a wrapper library around this one and it turns out it's not possible to match a url containing anything but english letters. Here is an example url containing a string in Bulgarian:

http://localhost:8000/bg/Тестова-страница

As of Symfony 3.2, the symfony router now officially supports UTF-8 characters in the url. So I assumed symfony-cmf/routing supports it as well. Unfortunately the link above returns a 404 error.

I did a little digging through the source and found out that the class Symfony\Cmf\Component\Routing\Candidates\Candidates#getCandidates is causing this issue. The method calls getCandidatesFor($url), but the $url is not being urldecoded, which is why it fails to find any candidates. Adding a simple urldecode solves the problem.

/**
 * {@inheritdoc}
 */
public function getCandidates(Request $request)
{
    $url = urldecode($request->getPathInfo()); // if not decoded, the path here would look like this: /bg/%D0%A2%D0%B5%D1%81%D1%82%D0%BE%D0%B2%D0%B0-%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0
    $candidates = $this->getCandidatesFor($url);

    $locale = $this->determineLocale($url);
    if ($locale) {
        $candidates = array_unique(array_merge($candidates, $this->getCandidatesFor(substr($url, strlen($locale) + 1))));
    }

    return $candidates;
}

I would be happy to open a PR with the fix if there is no better solution.

steve-todorov avatar Dec 06 '16 23:12 steve-todorov

i am a bit afraid of the side effects like if somebody is using encoded paths in their repositoy.

would it make sense to provide a separate candidate strategy that does the decoding? and let the user choose in configuration which strategy they want?

Am 06.12.2016 3:41 nachm. schrieb "Steve Todorov" <[email protected]

:

I've been writing a wrapper library around this one and it turns out it's not possible to match a url containing anything but english letters. Here is an example url containing a string in Bulgarian:

http://localhost:8000/bg/Тестова-страница

As of Symfony 3.2, the symfony router now officially supports http://symfony.com/blog/new-in-symfony-3-2-unicode-routing-support UTF-8 characters in the url. So I assumed that the symfony-cmf/routing supports it as well. Unfortunately the link above returns a 404 error.

I did a little digging through the source and found out that the class Symfony\Cmf\Component\Routing\Candidates\Candidates#getCandidates is causing this issue. The method calls getCandidatesFor($url), but the $url is not being urldecoded, which is why it fails to find any candidates. Adding a simple urldecode solves the problem.

/**

  • {@inheritdoc} */ public function getCandidates(Request $request) { $url = urldecode($request->getPathInfo()); // if not decoded, the path here would look like this: /bg/%D0%A2%D0%B5%D1%81%D1%82%D0%BE%D0%B2%D0%B0-%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0 $candidates = $this->getCandidatesFor($url);

    $locale = $this->determineLocale($url); if ($locale) { $candidates = array_unique(array_merge($candidates, $this->getCandidatesFor(substr($url, strlen($locale) + 1)))); }

    return $candidates; }

I would be happy to open a PR with the fix if there is no better solution.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/symfony-cmf/routing/issues/183, or mute the thread https://github.com/notifications/unsubscribe-auth/AAErINTqBldf9Kv8lCmSYBHyy5AdA3IYks5rFfK3gaJpZM4LGBm8 .

dbu avatar Dec 07 '16 00:12 dbu

I'm currently using it with ORM and haven't had any problems so far, but you might be right. It would make sense to have another class which handles specifically this case. I'm wondering, if it would be too much work to make it possible to configure this in symfony-cmf/routing-bundle as well? I have that a feeling my library will evolve in a bundle which extends the routing-bundle.

steve-todorov avatar Dec 07 '16 05:12 steve-todorov

i think it indeed would make sense to have the decoding strategy in this library, and refactor the configuration on routing bundle a bit. right now we just have limit_candidates but we could move that to

candidates:
    strategy: plain|urldecode
    limit: [number]

we are in the process of preparing version 2 of the bundle anyways, so now would be an excellent time to do this.

can you do a pull request for a UrlDecode strategy class, and one on the bundle for the configuration change?

dbu avatar Dec 07 '16 17:12 dbu

I am struggeling with UTF-8 URLs and the CMF routing component right now. Did you make any progress on this one?

I tried to apply the patch in Candidates suggested by @steve-todorov and it did not work. But as I am using my own dynamic route provider, I could add the urldecode($url) in getRouteCollectionForRequest() which fixed the problem for now, but feels a little awkward.

Any ideas on how to improve the situation for the community?

kevinpapst avatar Jun 25 '17 11:06 kevinpapst

sorry for the late response.

looking at http://symfony.com/blog/new-in-symfony-3-2-unicode-routing-support, it seems that the symfony router scans the request for utf-8 and then automatically handles it as unicode.

we should probably not uncondidtionally urldecode. can you find the implementation in symfony and see if we could port this check over to the candidates strategy? as symfony itself already seems to have solved the problem, i would like to use exactly the same approach, if possible.

dbu avatar Jul 12 '17 06:07 dbu