Making GA4 imports work with a proxy

Open snake14 opened this issue 2 months ago • 9 comments

Description:

We previously added the ability for UA imports to work with a proxy. This PR makes GA4 imports work with a proxy as well. Fixes issue #521.

Review

snake14 avatar Apr 29 '24 04:04 snake14

@AltamashShaikh Do these changes look like they should work to you?

snake14 avatar Apr 29 '24 04:04 snake14

@snake14 No change, unfortunately. I'm pasting output from the console below. Please let me know if there are any specific debugging steps I could try that would help. I'll poke at it some today and see if I can figure anything out.

I did confirm that setting the https_proxy env var before a console run allows OAuth to succeed and the import to begin. I can use that to get around this issue for our current use case, but I still want to help with getting this PR to release.

/var/www/html# php ./console googleanalyticsimporter:import-ga4-reports --property="properties/XXXXXXXXX" --dates=2023-07-01,2024-01-01
ERROR     [2024-04-29 16:25:13] 124  Uncaught exception: 
/var/www/html/plugins/GoogleAnalyticsImporter/vendor/prefixed/guzzlehttp/guzzle/src/Handler/CurlFactory.php(146):
cURL error 28: Failed to connect to oauth2.googleapis.com port 443 after 300000 ms: Timeout was reached (see
 https://curl.haxx.se/libcurl/c/libcurl-errors.html) for https://oauth2.googleapis.com/token [Query: , CLI mode: 1]

In CurlFactory.php line 146:

  cURL error 28: Failed to connect to oauth2.googleapis.com port 443 after 300000 ms: Timeout was reached (see https:
  //curl.haxx.se/libcurl/c/libcurl-errors.html) for https://oauth2.googleapis.com/token
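
For reference, the environment-variable workaround mentioned above might look like the following sketch. The proxy URL is a placeholder reusing the example host that appears elsewhere in this thread, and the `[ -f ./console ]` guard is an added assumption so the snippet does nothing when run outside a Matomo root directory.

```shell
# Placeholder proxy URL (assumption); replace with your own proxy host and port.
PROXY_URL="http://proxy.foo.com:3128"

# Export the lowercase variable, the same name reported above as allowing
# OAuth to succeed for a console run.
export https_proxy="$PROXY_URL"

# Start the import only when run from the Matomo root, where ./console lives.
if [ -f ./console ]; then
    php ./console googleanalyticsimporter:import-ga4-reports \
        --property="properties/XXXXXXXXX" --dates=2023-07-01,2024-01-01
fi
```

Because the variable is exported in the shell rather than stored in the plugin configuration, it only affects console runs started from that shell session.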

machinehum avatar Apr 29 '24 16:04 machinehum

For the record, with this PR the GA3 import still works, so there is no regression. With a little debugging I can see that the new code in AuthorizationGA4.php getClientClassArguments() is building what looks like a properly configured Guzzle client (pasted below). I'm trying to figure out a way to verify that passing the authHttpHandler to BetaAnalyticsDataClient and AnalyticsAdminServiceClient actually has the desired effect.

\Matomo\Dependencies\GoogleAnalyticsImporter\GuzzleHttp\Client::__set_state(array(
   'config' => 
  array (
    'proxy' => 'http://proxy.foo.com:3128',
    'exceptions' => false,
    'base_uri' => 
    \Matomo\Dependencies\GoogleAnalyticsImporter\GuzzleHttp\Psr7\Uri::__set_state(array(
       'scheme' => 'https',
       'userInfo' => '',
       'host' => 'www.googleapis.com',
       'port' => NULL,
       'path' => '',
       'query' => '',
       'fragment' => '',
       'composedComponents' => NULL,
    )),
    'handler' => 
    \Matomo\Dependencies\GoogleAnalyticsImporter\GuzzleHttp\HandlerStack::__set_state(array(
       'handler' => 
      \Closure::__set_state(array(
      )),
       'stack' => 
      array (
        0 => 
        array (
          0 => 
          \Closure::__set_state(array(
          )),
          1 => 'http_errors',
        ),
        1 => 
        array (
          0 => 
          \Closure::__set_state(array(
          )),
          1 => 'allow_redirects',
        ),
        2 => 
        array (
          0 => 
          \Closure::__set_state(array(
          )),
          1 => 'cookies',
        ),
        3 => 
        array (
          0 => 
          \Closure::__set_state(array(
          )),
          1 => 'prepare_body',
        ),
      ),
       'cached' => NULL,
    )),
    'allow_redirects' => 
    array (
      'max' => 5,
      'protocols' => 
      array (
        0 => 'http',
        1 => 'https',
      ),
      'strict' => false,
      'referer' => false,
      'track_redirects' => false,
    ),
    'http_errors' => true,
    'decode_content' => true,
    'verify' => true,
    'cookies' => false,
    'idn_conversion' => false,
    'headers' => 
    array (
      'User-Agent' => 'GuzzleHttp/7',
    ),
  ),
))


machinehum avatar Apr 29 '24 17:04 machinehum

Thank you @machinehum. Looking at the GuzzleHttp\Client you output, I'd say that looks correct. What does the value at https://github.com/matomo-org/plugin-GoogleAnalyticsImporter/blob/cad809ff8b9c7dddf13ba24607080f3a17a7152b/vendor/prefixed/google/gax/src/CredentialsWrapper.php#L110 look like if you comment out line 52 in AuthorizationGA4.php, where authHttpHandler gets set in the arguments? I'm wondering if maybe I used the wrong base_uri value for GA4?

snake14 avatar Apr 29 '24 21:04 snake14

@snake14 Hm. In CredentialsWrapper.php at line 110 I dumped the value of $args['authHttpHandler'], and then of $authHttpHandler after it is set. $args['authHttpHandler'] is always null and $authHttpHandler is always a vanilla GuzzleHttp\Client instance, without the proxy or other custom settings. This is true regardless of whether line 52 in AuthorizationGA4.php is commented out or not.

I did confirm that the conditional on lines 51-53 is being entered. So it seems like the new form of setting the arguments isn't actually passing them through? I'm going to try some more debugging. I suppose you could do something similar even without a proxy to see whether this version of AuthorizationGA4.php is effectively passing the credentials argument or not.

machinehum avatar Apr 30 '24 15:04 machinehum

With the change in this PR to AuthorizationGA4, the authHttpHandler argument is being set in an array next to \Matomo\Dependencies\GoogleAnalyticsImporter\Google\ApiCore\CredentialsWrapper, but from looking at the Google API code it looks like it needs to be an argument to the credentials wrapper builder, alongside keyFile. This change to AuthorizationGA4::getClientClassArguments():

        $arguments = [
            'keyFile' => $this->getClientConfiguration()
        ];

        $proxyHttpClient = StaticContainer::get('GoogleAnalyticsImporter.proxyHttpClient');
        if ($proxyHttpClient) {
            $proxyHttpHandler = \Matomo\Dependencies\GoogleAnalyticsImporter\Google\Auth\HttpHandler\HttpHandlerFactory::build($proxyHttpClient);
            $arguments['authHttpHandler'] = $proxyHttpHandler;
        }

        $credentialWrapper = \Matomo\Dependencies\GoogleAnalyticsImporter\Google\ApiCore\CredentialsWrapper::build($arguments);

        return ['credentials' => $credentialWrapper];
    }

results in CredentialsWrapper showing what looks like a good GuzzleHttp\Client object. However, the initial OAuth request still isn't using the proxy. The BetaAnalyticsDataGapicClient and AnalyticsAdminServiceGapicClient builders also allow passing in a set of config arguments via credentialsConfig, as opposed to a preconstructed CredentialsWrapper object. I tried that and got the same result. I'll keep poking at it.

machinehum avatar Apr 30 '24 19:04 machinehum

Nice find @machinehum. I meant to pass the handler into the CredentialsWrapper constructor. I went ahead and updated the PR with the latest changes that you shared. It looks like passing the transport argument, similar to credentials, might work? I found mention of it in Google's documentation.

snake14 avatar Apr 30 '24 23:04 snake14

@snake14 Yeah, I did try the transport config eventually as well. That didn't make any difference either. I ended up spending 9 hours debugging this today. The documentation on the Google PHP API side seems a little scattered. I did see that some of the modules of google-cloud-php, like Bigtable, implement their own proxy settings. This makes me suspicious about whether the "official" solution is fully tested and working. Most of the threads I can find on it say "just use environment variables", heh. I opened an issue at https://github.com/googleapis/google-cloud-php/issues/7274 to hopefully get some guidance.

machinehum avatar May 01 '24 00:05 machinehum

Thank you for all of your effort @machinehum; it's greatly appreciated. I agree that it's hard to find a clear answer in Google's documentation. Pretty much all of my searches for using a proxy with Google's PHP API classes ended up pointing to answers for UA and not GA4.

snake14 avatar May 01 '24 00:05 snake14