site-kit-wp
site-kit-wp copied to clipboard
Incorrect redirects to WP Dashboard for sites with Internationalised Domain Names
Bug Description
On a site which uses IDNs (domains with multi-byte UTF-8 characters), redirects to the SiteKit dashboard are incorrectly directed to the WP Dashboard in various scenarios. This set of bugs were identified when implementing #4776.
Steps to reproduce
Scenario 1 - Setup SiteKit with Google Analytics
- Go to 'https://türkish.com/wp-admin'. Login using credentials which can be obtained from @wpdarren, @jamesozzie or @jimmymadon.
- Setup SiteKit and tick the 'Connect Google Analytics' during SiteKit setup. After Step 4 - Connecting Google Analytics, instead of being redirected to the SiteKit dashboard, we are redirected to the WP Dashboard.
- Navigate to the SiteKit dashboard. Google Analytics is still not connected.
Scenario 2 - Setup SiteKit without Google Analytics
- Setup SiteKit but do not tick the 'Connect Google Analytics' during SiteKit setup. After Step 3, instead of being redirected to the SiteKit dashboard, we are redirected to the WP Dashboard.
Scenario 3 - Connect Analytics or AdSense
- Go to Settings and attempt to connect AdSense.
- After going through oAuth/setup, we are redirected back to WP Dashboard, and navigating back to settings shows that the set up is incomplete.
Screenshots
Additional Context
- PHP Version: 7.4.15
- OS: macOS 12.5.1
- Browser: Chrome Version 105.0.5195.102
- Plugin Version: 1.83.0
- Device: MacBook Pro 13
- WordPress Version: 6.0.2
Do not alter or remove anything below. The following sections will be managed by moderators only.
Acceptance criteria
- On sites which use multi-byte characters in their URLs / domain names (Internationalised Domain Names):
- When setting up Site Kit, after the final step, the user should be redirected to the Site Kit dashboard as normal.
- When setting up Site Kit with Analytics, the user should be redirected to the Site Kit dashboard as normal and Analytics should be connected successfully.
- When connecting all modules in the plugin from the settings page, the module should be connected successfully and the user should be redirected to the Site Kit dashboard as normal.
Implementation Brief
Test Coverage
QA Brief
Changelog entry
@jimmymadon feel free to add some initial AC here to start.
@aaemnnosttv Have added some preliminary ACs.
Thanks @jimmymadon – this looks good for the most part. I'm still a bit unclear exactly what the experience is we're trying to fix here. I understand the user ends up on the WP dashboard but from what I recall, it doesn't always happen? Also, is it specific to the initial connection only or can it happen anytime the user goes through oAuth?
@aaemnnosttv While the behaviour was slightly inconsistent in my testing, I was eventually able to reproduce all the three scenarios that @wpdarren initially found out. This is happening anytime the user goes through oAuth and not only on initial SiteKit connection.
Hm... this is weird... I was able to reproduce the issue on the https://türkish.com/wp-admin
site. Then I asked @jamesozzie to move the site to another hosting where I can log in via ssh to troubleshoot the problem. Once the site has been migrated to the new hosting, I can't reproduce the issue anymore.
@jimmymadon do you remember which version of the plugin was on the previous hosting? I installed the latest version from the wp.org
on the new site.
cc @asvinb
@eugene-manuilov I am 90% sure that it was WP 6.0.2 as I remember adding the Additional Context
section using WP's Site Info from the test site itself.
Thanks, @jimmymadon. I meant the plugin version. Was it the latest version installed from wp.org?
@eugene-manuilov That should have been 1.83.0 when I created the issue. I'm 95% sure that I used the "Site Info" from this live test site when writing the 'Additional Context' section of this bug. (I myself could not replicate it on my local so the Additional Context
would all be for the live site we found this issue on).
@wpdarren can you reproduce this issue on a different domain? The turkish.com
one doesn't work at the moment and I can't troubleshoot the problem.
Ok, @jamesozzie restored the turkish.com
site and I can reproduce the issue again.
@aaemnnosttv could you please review IB? The reason why the user is redirected to the wp dashboard page is that the redirect URL is a) harmed by the FILTER_SANITIZE_URL
filter (it returns https://trkish.com/...
instead of https://türkish.com/...
) and b) the wp_safe_redirect
function uses wp_validate_redirect
that also strips some unicode symbols from the domain name.
Thanks for the digging @eugene-manuilov ! I'm wondering if this means maybe we shouldn't be using wp_safe_redirect
at all? 🤔
@felixarntz would you please review while I am out?
Hey @felixarntz just checking you are ok to pick this up? This is in the current sprint so ideally we would want this to be reviewed asap. Let me know if you do not think you have capacity for this, thanks so much!
IB ✅
I wanted to add a note here as I've been at this issue awhile now 😆
First of all: on that site, the dashboard is never reachable, seemingly because User_Input_State::VALUE_REQUIRED === $this->user_input_state->get()
evaluates to true
and causes a wp_safe_redirect
to be run, which redirects to the WP Admin Dashboard instead of Site Kit's User Input page.
It seems there's a few things going on with this particular site/domain; part of it seems to be an issue with the site wanting to load the User Input page from the dashboard due to this check being true
: https://github.com/google/site-kit-wp/blob/c149fae22d37b5c565b1942d485a965ad92d6a81/includes/Core/Authentication/Authentication.php#L307
If that wp_safe_redirect
is changed to wp_redirect
the redirect there works. If that check has && Feature_Flags::enabled( 'userInput' )
added to it (which is arguably should), then with the User Input feature flag disabled the dashboard will load, though the redirect from the Proxy OAuth flow still redirects to the main WP dashboard.
Replacing all of the wp_safe_redirect
calls doesn't seem wise or scalable, so improving those calls to work with the IDN domains (I suspect it needs to support both the UTF-8 style AND the ASCII styles) should fix this issue everywhere…
As a further update here, looks like this is actually a bug in core we’ll need to work around :disappointed:
This function is used to validate URLs in wp_safe_redirect
: https://developer.wordpress.org/reference/functions/wp_validate_redirect/
And it turns out this section of code:
// In PHP 5 parse_url() may fail if the URL query part contains 'http://'.
// See https://bugs.php.net/bug.php?id=38143
$cut = strpos( $location, '?' );
$test = $cut ? substr( $location, 0, $cut ) : $location;
$lp = parse_url( $test );
results in a comparison between türkish.com
and t%C3%BCrkish.com
. We can work around this by always registering the ASCII version of an IDN with Site Kit, which will work and behave as-expected from my testing, but will require us to ensure we always use the ASCII-representation of an IDN.
This will also require all existing users with IDNs to re-connect Site Kit. So this might well be still more work, and might require thinking about how we handle the migration, especially as changing the siteurl
and home
settings in wp_options
resulting in a message requiring “re-connecting”… but reconnecting this way seems to trigger an “already connected” error :sweat_smile:
Turns out this one is quite the can of worms :smile:
QA Update: ❌
@tofumatt as per our conversation on Slack. I am still being redirected back to WP Dashboard after setting up Site KIt. I noticed that Analytics didn't set up successfully as the CTA is 'complete setup' in settings. I have included a screencast below to show you the steps that I took to test this.
Interestingly, if I set up Site Kit without Analytics then this passes, because I am redirected to the Site Kit Dashboard.
https://user-images.githubusercontent.com/73545194/212664508-a1554e72-3e32-4b87-aca9-95c691eb2866.mp4
QA Update: ⚠️
@tofumatt I have an observation that I'd like to run by you.
Scenario: I have a Google Analytics account for this domain already set up with UA and GA4 property. When setting up Site Kit from the splash screen and at the GA step, the account is automatically selected. I continue with the set up and it completes. I am redirected to the Site Kit Dashboard. All good!
Where it started to look odd, is when I disconnected and reset Site Kit.
I then started the set up again from the splash screen, and this time, I set up a new GA account. After I gave permission, I was redirected to the WP Dashboard, and analytics had not successfully set up.
I created a screencast so you could take a look.
https://user-images.githubusercontent.com/73545194/213163103-c4bd8638-11f4-432d-b1fb-67cb35e6c114.mp4
Another side note, if I select another account that does not have a web stream for this domain, the same happens.
It looks like when the Analytics site is set up it uses the ASCII characters even if the site is registered with UTF-8 characters
https://user-images.githubusercontent.com/90871/213318865-39282def-9be2-4dd3-8191-9795d037295c.mp4
If this is the case, it's arguably a separate bug as it’s not a redirect that’s failing, it’s the entire setup process. If that's the case let me know and I'll file a new issue; it shouldn't be part of this one.
Equally, if selecting an existing türkish.com
or xn--trkish-3ya.com
property fails, please let me know the property name (whether it's xn--trkish-3ya.com
or türkish.com
) and the value of the wp_options
value for siteurl
in your MySQL database. 🙂
QA Update: ⚠️
@tofumatt I've run through a few scenarios:
This is a screenshot when the Google analytics account is selected automatically. In this scenario the setup works with no issues. I created the analytics account and set up web stream for the domain.
When I set up a new Google analytics account via Site Kit, this is what I see. I added the account name but the property appeared as xn--trkish-3ya.com
by default. If I leave this as is, the issue occurs where I am redirected to WP Dashboard and analytics is not set up. I'm not taken to the page to create the GA account.
When I set up a new Google analytics account via Site Kit, and use a account name and property without any international characters, the same issue occurs.
When I select a Google account that does not have a web stream for this domain. I get the permissions box appearing to create the web stream for GA4 property. I then go through oAuth with my Google account. The same issue then occurs where I'm redirected to WP dashboard and Analytics is not connected.
Equally, if selecting an existing türkish.com or xn--trkish-3ya.com property fails, please let me know the property name (whether it's xn--trkish-3ya.com or türkish.com) and the value of the wp_options value for siteurl in your MySQL database.
Unfortunately, I don't have access to the database for türkish.com.
To recap: the redirect functions when you're not setting up Analytics, just search console. Also, when you have an already created Google Analytics account with web stream for that domain, then the redirect functions as expected. The issue is when you try and create a new GA account via Site KIt OR, select an account with no web stream for this domain.
I don't think it's about web streams, because I'm able to create a new account for the site if I use the Unicode domain format instead of the ASCII one that Site Kit supplies, see this video:
https://user-images.githubusercontent.com/90871/213423015-746c3b83-9bdb-4357-9ff2-c4b1b05dd3ee.mp4
That successfully creates a new Analytics account and everything works.
My suspicion is that we have a lot of code in Site Kit that is using the ASCII-formatted IDN, but if a site was set up to use the Unicode format this creates issues, maybe with data being submitted being mismatched or something. The above works when my siteurl
looks like this:
data:image/s3,"s3://crabby-images/9457a/9457a7f5c8f5a6786b4346693d6dc1e3e98b5d5e" alt="CleanShot 2023-01-19 at 10 49 41"
If I change it to xn--trkish-3ya.com
, now this works:
https://user-images.githubusercontent.com/90871/213423952-b5bcc64a-2aaa-49e6-8401-759d479e4b7b.mp4
Can you check your site URL in the WordPress admin and let me know what it is @wpdarren? https://xn--trkish-3ya.com/wp-admin/options-general.php should do the trick 🙂
I'm wondering if we want to file a bigger issue about how we handle IDNs and make the formatting consistent throughout the plugin. It seems things are getting mismatched and it's causing issues with redirects, hostnames, etc. 🤔
@tofumatt
Can you check your site URL in the WordPress admin and let me know what it is @wpdarren? https://xn--trkish-3ya.com/wp-admin/options-general.php should do the trick 🙂
The URL is https://türkish.com
Am I right in thinking that this ticket hasn't caused any new issues with IDNs, so we should move this into approved, as it does work for a site with no Analytics, and a site that has already created Google Analytics account for the domain. We can then create a new ticket to document the issues we've discovered here.
What do you think?
From my knowledge we have not had a huge number of support tickets about the issues reported here, so it makes me think this is the best approach for now.
A lot of those bugs we encountered here are unrelated to this change, yes. So let's move this to approval and we're use #6435 as the new issue for discovery on IDN issues and create more follow-up tickets as-needed. Thanks! 👍🏻
QA Update: ✅
Verified:
- Used a site with IDNs.
- When setting up Site Kit without Analytics, I was redirected to Site Kit dashboard.
- When setting up Site Kit with an Analytics account for the domain, I was redirected to the Site Kit dashboard.
- When setting up AdSense, Tag Manager and Optimize, I was redirected to the Site Kit dashboard.
There are a few bugs discovered during the testing and these are highlighted in #6435.