Umbraco.Cloud.Issues
Special characters in URL causes custom redirect to fail on Umbraco Cloud
Issue description
Our website is built with Umbraco 10.5.0 and configured to allow Unicode characters in URLs. Here is a snippet containing the RequestHandler section in appsettings.json:
...
"Umbraco": {
  "CMS": {
    "RequestHandler": {
      "ConvertUrlsToAscii": "false",
      "EnableDefaultCharReplacements": false
    }
  }
}
...
This setting tells Umbraco to keep all URLs in UTF-8 and not convert them to ASCII. It works as expected, and native non-Latin letters appear as they should (not URL-encoded) in URLs. The issue occurs only when such URLs are used in redirects on Umbraco Cloud: the URL becomes encoded after the redirect, and the request results in a Not Found page. The redirect is done with Umbraco's special property type alias umbracoRedirect, but the issue is also reproducible with custom redirects.
I have tested it further and it seems to be a general Azure issue.
It's not something I have tried myself, but you could give something similar to https://stackoverflow.com/questions/27817266/how-to-correctly-use-urlencode-and-decode a go for now and see if that solves it for you :)
@RyuLindow thank you for the suggested solution. I tried it for a custom redirect, and it works locally and on .NET Fiddle, but unfortunately it has the same problem on Umbraco Cloud: the URL becomes encoded again after the redirect. Besides, it would still be an issue when using the Umbraco special property umbracoRedirect, where the redirect is controlled by Umbraco.
Thanks for trying, @ZupaDev
Much appreciated :)
We'll need @sajumb to take it further from here.
I have created a product backlog item for us to look at. I am not certain whether it is specifically related to Umbraco Cloud or whether it is something that we want to support. I will update this thread when we have new insight.
Hi,
Is there any news on this issue? Do you have a rough estimate of when it might be solved?
Hi, Thank you for your patience as we begin delving into this matter. Our team has taken note of your report and is actively considering avenues for investigation. To be honest, we have not got a whole lot of reports as most users naturally don't run into an issue with non-Latin characters for redirects or try to avoid them. To avoid the issue until it's resolved, consider manually URL-encoding non-Latin characters in URLs used for redirects.
Once we have more concrete findings or potential solutions, we will promptly communicate those to you.
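As a rough illustration of the workaround mentioned above, the snippet below percent-encodes a path as UTF-8 before it is used as a redirect target. It is written in Python only to show the encoding itself; the sample path and the `encode_redirect_path` helper are hypothetical and are not part of Umbraco or Umbraco Cloud.

```python
from urllib.parse import quote


def encode_redirect_path(path: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8, keeping '/' separators."""
    return quote(path, safe="/", encoding="utf-8")


# Hypothetical redirect target containing Danish letters.
print(encode_redirect_path("/nyheder/vækst-på-2023"))
# prints /nyheder/v%C3%A6kst-p%C3%A5-2023
```

Whether a pre-encoded target survives the redirect on Umbraco Cloud is not confirmed in this thread, so it is worth testing on the affected environment.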
Hi, We are starting to look into this in more detail. To better assist you, could you please provide some examples of the URLs that haven't worked as expected after the redirect? This will help me understand the specific encoding issues you're facing and propose a more targeted solution. Please post these here or in a dm. Thanks!
We will be closing this task within the next few days unless we get more details and a how-to-reproduce. Thank you for your understanding.
Hi Søren, sorry for the delay in replying; I missed the notifications about your comments.
We are using native Danish characters in URLs without converting them to ASCII on our website. For example, /nyheder/optimistisk-syn-på-inflation-og-vækst-gav-positiv-start-på-2023. Such a URL works fine when opened in a browser.
However, if a redirect is applied to a page with such a URL, either through the Umbraco special property type alias umbracoRedirect or through the Redirect method in a controller, all the special characters in the URL are replaced with percent-encoded sequences. The resulting URL looks like this: /nyheder/optimistisk-syn-p%E5-inflation-og-v%E6kst-gav-positiv-start-p%E5-2023, and the request ends in a 404 page.
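One detail in the failing URL may help narrow this down: %E5 and %E6 are the single-byte Latin-1 (ISO-8859-1) encodings of å and æ, whereas UTF-8 encodes them as %C3%A5 and %C3%A6. A server that decodes the path as UTF-8 cannot recover the original letters from the Latin-1 form. The Python sketch below only demonstrates that mismatch; it does not show where in the stack the conversion happens.

```python
from urllib.parse import quote, unquote

segment = "på"

latin1 = quote(segment, encoding="latin-1")  # 'p%E5', the form seen in the failing URL
utf8 = quote(segment, encoding="utf-8")      # 'p%C3%A5'

# Decoding the Latin-1 form as UTF-8 yields a replacement character, not 'å'.
decoded_as_utf8 = unquote(latin1, encoding="utf-8", errors="replace")
decoded_as_latin1 = unquote(latin1, encoding="latin-1")

print(latin1, utf8)                  # p%E5 p%C3%A5
print(decoded_as_utf8 == segment)    # False
print(decoded_as_latin1 == segment)  # True
```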
Hi @ZupaDev,
Thanks for your patience as we've been addressing the issue with non-Latin characters in URLs during redirects on Umbraco Cloud. After our investigation, we've identified a strategy that should help resolve this challenge.
The core of the issue lies in the way Azure App Services, which Umbraco Cloud relies on, handles URL encoding. To counter this, we suggest implementing a URL rewriting strategy using ASP.NET Core's Microsoft.AspNetCore.Rewrite.IRule and Microsoft.AspNetCore.Builder.UseRewriter. Here's how you can conceptually approach this:
- Define Custom Rewrite Rules: Using Microsoft.AspNetCore.Rewrite.IRule, you can define custom rules for how URLs should be rewritten. This allows for specific handling of non-Latin characters, ensuring they retain their original form after redirects.
- Implement the Rewrite Logic: With Microsoft.AspNetCore.Builder.UseRewriter, these rules can be applied to incoming requests. This middleware will intercept requests and apply your custom rewrite rules, effectively managing URLs with special characters.
- Testing and Adjustment: It's crucial to test these rewrite rules in a controlled environment. This step ensures that the rules work as expected and do not unintentionally affect other URL patterns.
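The logic such a rule might apply can be sketched independently of ASP.NET Core. The Python snippet below is only an illustration of one possible repair step, under the assumption that the broken paths arrive percent-encoded as Latin-1 (as in the %E5 and %E6 example earlier in this thread). The `normalize_path` helper is hypothetical; an actual rule would be a C# class implementing IRule and registered through UseRewriter.

```python
from urllib.parse import quote, unquote_to_bytes


def normalize_path(path: str) -> str:
    """Return the path percent-encoded as UTF-8, whichever encoding it arrived in."""
    raw = unquote_to_bytes(path)
    try:
        text = raw.decode("utf-8")    # already UTF-8 (or plain ASCII)
    except UnicodeDecodeError:
        text = raw.decode("latin-1")  # assumption: the only other encoding is Latin-1
    return quote(text, safe="/", encoding="utf-8")


print(normalize_path("/nyheder/p%E5-2023"))      # /nyheder/p%C3%A5-2023
print(normalize_path("/nyheder/p%C3%A5-2023"))   # /nyheder/p%C3%A5-2023
```

The UTF-8-first check is a heuristic: a Latin-1 byte sequence that also happens to be valid UTF-8 would be read as UTF-8, which is one reason the testing step above matters.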
Here is an example from the appsettings.json file, illustrating how these rewrite rules can be defined:
"Umbraco": {
  "CMS": {
    "RequestHandler": {
      "UserDefinedCharCollection": [
        {"Char": "æ", "Replacement": "ae"},
        {"Char": "ø", "Replacement": "oe"},
        {"Char": "å", "Replacement": "aa"}
        // Additional mappings can be added as needed
      ]
    }
  }
}
In this example, specific non-Latin characters are mapped to their desired ASCII representations. These mappings are then used in the rewrite rules to ensure URLs are transformed appropriately.
Hello everyone - I'm closing this issue, as a solution has been provided and there has been no activity since.
If this is still an issue, feel free to reopen it.
Kind regards - Mikkel