plugin-GoogleAnalyticsImporter
Making GA4 imports work with a proxy
Description:
We previously added the ability for UA imports to work with a proxy. This is to make GA4 imports work with a proxy as well. Fixes Issue: #521
Review
- [ ] Functional review done
- [ ] Potential edge cases thought about (behavior of the code with strange input, with strange internal state or possible interactions with other Matomo subsystems)
- [ ] Usability review done (is anything maybe unclear or think about anything that would cause people to reach out to support)
- [ ] Security review done
- [ ] Wording review done
- [ ] Code review done
- [ ] Tests were added if useful/possible
- [ ] Reviewed for breaking changes
- [ ] Developer changelog updated if needed
- [ ] Documentation added if needed
- [ ] Existing documentation updated if needed
@AltamashShaikh Do these changes look like they should work to you?
@snake14 No change, unfortunately. I'm pasting output from the console below. Please let me know if there are any specific debugging steps I could try that would help. I'll poke at it some today and see if I can figure anything out.
I did confirm that setting the https_proxy env var before a console run allows OAuth to succeed and the import to begin. I can use that to get around this issue for our current use case, but I still want to help with getting this PR to release.
/var/www/html# php ./console googleanalyticsimporter:import-ga4-reports --property="properties/XXXXXXXXX" --dates=2023-07-01,2024-01-01
ERROR [2024-04-29 16:25:13] 124 Uncaught exception:
/var/www/html/plugins/GoogleAnalyticsImporter/vendor/prefixed/guzzlehttp/guzzle/src/Handler/CurlFactory.php(146):
cURL error 28: Failed to connect to oauth2.googleapis.com port 443 after 300000 ms: Timeout was reached (see
https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://oauth2.googleapis.com/token [Query: , CLI mode: 1]
In CurlFactory.php line 146:
cURL error 28: Failed to connect to oauth2.googleapis.com port 443 after 300000 ms: Timeout was reached (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://oauth2.googleapis.com/token
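Since the environment-variable workaround depends on what the PHP CLI process actually inherits, a quick check along these lines may help when comparing a working run with a failing one. This is only a sketch: detectProxyEnv() is a hypothetical helper, not part of Matomo or the plugin, and the list of variable names is an assumption about which ones are worth inspecting.

```php
<?php
// Hypothetical debugging helper (not part of Matomo or this plugin):
// returns the proxy-related environment variables visible to this PHP process.
function detectProxyEnv(): array
{
    // Assumed list of names worth checking; adjust as needed.
    $names = ['https_proxy', 'HTTPS_PROXY', 'http_proxy', 'HTTP_PROXY', 'no_proxy', 'NO_PROXY'];
    $found = [];
    foreach ($names as $name) {
        $value = getenv($name);
        // getenv() returns false when the variable is not set.
        if ($value !== false && $value !== '') {
            $found[$name] = $value;
        }
    }
    return $found;
}

// Print whatever is set, one NAME=value pair per line.
foreach (detectProxyEnv() as $name => $value) {
    echo $name . '=' . $value . PHP_EOL;
}
```

Running this from the same shell as ./console shows whether the variable reaches PHP at all. Note that a proxy URL containing credentials would be printed in clear text.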
For the record, with this PR the GA3 import still works, no regression. With a little debugging I can see that the new code in AuthorizationGA4.php getClientClassArguments() is setting what looks like a properly configured Guzzle client (pasted below). I'm trying to figure out a way to verify that passing authHttpHandler to BetaAnalyticsDataClient and AnalyticsAdminServiceClient actually has the desired effect.
\\Matomo\\Dependencies\\GoogleAnalyticsImporter\\GuzzleHttp\\Client::__set_state(array(
'config' =>
array (
'proxy' => 'http://proxy.foo.com:3128',
'exceptions' => false,
'base_uri' =>
\\Matomo\\Dependencies\\GoogleAnalyticsImporter\\GuzzleHttp\\Psr7\\Uri::__set_state(array(
'scheme' => 'https',
'userInfo' => '',
'host' => 'www.googleapis.com',
'port' => NULL,
'path' => '',
'query' => '',
'fragment' => '',
'composedComponents' => NULL,
)),
'handler' =>
\\Matomo\\Dependencies\\GoogleAnalyticsImporter\\GuzzleHttp\\HandlerStack::__set_state(array(
'handler' =>
\\Closure::__set_state(array(
)),
'stack' =>
array (
0 =>
array (
0 =>
\\Closure::__set_state(array(
)),
1 => 'http_errors',
),
1 =>
array (
0 =>
\\Closure::__set_state(array(
)),
1 => 'allow_redirects',
),
2 =>
array (
0 =>
\\Closure::__set_state(array(
)),
1 => 'cookies',
),
3 =>
array (
0 =>
\\Closure::__set_state(array(
)),
1 => 'prepare_body',
),
),
'cached' => NULL,
)),
'allow_redirects' =>
array (
'max' => 5,
'protocols' =>
array (
0 => 'http',
1 => 'https',
),
'strict' => false,
'referer' => false,
'track_redirects' => false,
),
'http_errors' => true,
'decode_content' => true,
'verify' => true,
'cookies' => false,
'idn_conversion' => false,
'headers' =>
array (
'User-Agent' => 'GuzzleHttp/7',
),
),
))
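For reference, the 'proxy' value in the dump above is a single URL string. A minimal sketch of how such a value could be assembled from separate host, port, and credential settings follows. The function name buildProxyOption() and the array keys (host, port, username, password) are assumptions made for illustration; they are not taken from the plugin's code.

```php
<?php
// Hypothetical helper: builds a proxy URL string of the form seen in the
// dumped client config. Key names are assumptions, not the plugin's API.
function buildProxyOption(array $proxy): ?string
{
    $host = trim((string) ($proxy['host'] ?? ''));
    if ($host === '') {
        return null; // no proxy configured
    }

    $auth = '';
    $user = (string) ($proxy['username'] ?? '');
    if ($user !== '') {
        // Percent-encode credentials so characters like "@" or ":" stay unambiguous.
        $auth = rawurlencode($user);
        $pass = (string) ($proxy['password'] ?? '');
        if ($pass !== '') {
            $auth .= ':' . rawurlencode($pass);
        }
        $auth .= '@';
    }

    $port = (string) ($proxy['port'] ?? '');

    return 'http://' . $auth . $host . ($port !== '' ? ':' . $port : '');
}

// Example with the placeholder host from the dump.
$options = ['proxy' => buildProxyOption(['host' => 'proxy.foo.com', 'port' => 3128])];
var_export($options);
```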
Thank you @machinehum. Looking at the GuzzleHttp\Client you output, I'd say that looks correct. What does the https://github.com/matomo-org/plugin-GoogleAnalyticsImporter/blob/cad809ff8b9c7dddf13ba24607080f3a17a7152b/vendor/prefixed/google/gax/src/CredentialsWrapper.php#L110 look like if you comment out line 52 where authHttpHandler gets set in the arguments? I'm wondering if maybe I used the wrong base_uri value for GA4?
@snake14 Hm. In CredentialsWrapper.php at line 110 I dumped the value of $args['authHttpHandler'], and then of $authHttpHandler after it is set. $args['authHttpHandler'] is always null and $authHttpHandler is always a vanilla GuzzleHttp\Client instance, without the proxy or other custom settings. This is true regardless of whether line 52 in AuthorizationGA4.php is commented out or not.
I did confirm that the conditional on lines 51-53 is being entered. So it seems like the new form of setting the arguments isn't actually passing them through? I'm going to try some more debugging. I suppose you could do something similar even without a proxy to see whether this version of AuthorizationGA4.php is effectively passing the credentials argument or not.
With the change in this PR to AuthorizationGA4, the authHttpHandler argument is being set in an array next to \Matomo\Dependencies\GoogleAnalyticsImporter\Google\ApiCore\CredentialsWrapper, but from looking at the Google API code it looks like it needs to be an argument to the credentials wrapper builder, alongside keyFile. This change to AuthorizationGA4::getClientClassArguments():
    // Arguments for the credentials wrapper builder itself.
    $arguments = [
        'keyFile' => $this->getClientConfiguration()
    ];
    $proxyHttpClient = StaticContainer::get('GoogleAnalyticsImporter.proxyHttpClient');
    if ($proxyHttpClient) {
        // Pass the proxy-aware handler to the builder, alongside keyFile.
        $proxyHttpHandler = \Matomo\Dependencies\GoogleAnalyticsImporter\Google\Auth\HttpHandler\HttpHandlerFactory::build($proxyHttpClient);
        $arguments['authHttpHandler'] = $proxyHttpHandler;
    }
    $credentialWrapper = \Matomo\Dependencies\GoogleAnalyticsImporter\Google\ApiCore\CredentialsWrapper::build($arguments);
    return ['credentials' => $credentialWrapper];
}
results in CredentialsWrapper showing what looks like a good GuzzleHttp\Client object. However, the initial OAuth request still isn't using the proxy. The BetaAnalyticsDataGapicClient and AnalyticsAdminServiceGapicClient builders also allow passing in a set of config arguments via credentialsConfig, as opposed to a preconstructed CredentialsWrapper object. I tried that and got the same result. I'll keep poking at it.
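The nesting problem described in this comment can be illustrated without the Google libraries. The sketch below uses a simplified stand-in, resolveAuthHandler(), for a builder that reads authHttpHandler only from its own argument array. It is not the real CredentialsWrapper code, and the string-returning closure merely stands in for a real HTTP handler.

```php
<?php
// Simplified stand-in (NOT the real Google library code): a builder that
// only looks for 'authHttpHandler' in the arguments it is given directly.
function resolveAuthHandler(array $builderArgs): string
{
    $handler = $builderArgs['authHttpHandler'] ?? null;

    return $handler !== null ? $handler() : 'default client (no proxy)';
}

// Placeholder for a proxy-aware handler.
$proxyHandler = function (): string {
    return 'proxy client';
};

// Shape A: the handler sits next to the credentials entry,
// so the credentials builder never sees it.
$clientArgsA = [
    'credentials' => ['keyFile' => []],
    'authHttpHandler' => $proxyHandler,
];

// Shape B: the handler is part of the builder's own arguments, alongside keyFile.
$clientArgsB = [
    'credentials' => ['keyFile' => [], 'authHttpHandler' => $proxyHandler],
];

echo resolveAuthHandler($clientArgsA['credentials']) . PHP_EOL;
echo resolveAuthHandler($clientArgsB['credentials']) . PHP_EOL;
```

Under these assumptions, shape A falls back to the default client, which matches the null $args['authHttpHandler'] observed earlier in the thread, while shape B picks up the proxy handler. As noted in the comment, getting the handler into the wrapper did not by itself route the initial OAuth request through the proxy.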
Nice find @machinehum. I meant to pass the handler into the CredentialsWrapper constructor. I went ahead and updated the PR with the latest changes that you shared. It looks like maybe passing the transport argument similar to credentials might work? I found mention of it in Google's documentation.
@snake14 Yeah, I did try the transport config eventually as well. That didn't make any difference either. Ended up spending 9 hours debugging this today. The documentation on the Google PHP API side seems a little scattered. I did see that some of the modules of google-cloud-php, like Bigtable, implement their own proxy settings. This makes me suspicious about whether the "official" solution is fully tested and working. Most of the threads I can find on it say "just use environment variables", heh. I opened an issue at https://github.com/googleapis/google-cloud-php/issues/7274 to hopefully get some guidance.
Thank you for all of your effort @machinehum; it's greatly appreciated. I agree that it's hard to find a clear answer in Google's documentation. Pretty much all of my searches for using a proxy with Google's PHP API classes ended up pointing to answers for UA and not GA4.