
Wrong CacheEntry when using _DOMAINS configuration

Open p2media opened this issue 7 years ago • 2 comments

We are using subdomains for dynamic user-generated content with the following configuration:

'_DOMAINS' => array(
    'encode' => array(
        array(
            'GETvar' => 'tx_myext_myplugin[fieldname]',
            'value' => '1',
            'urlPrepend' => 'http://sub1.example.de'
        ),
        array(
            'GETvar' => 'tx_myext_myplugin[fieldname]',
            'value' => '2',
            'urlPrepend' => 'http://custom2.example.de'
        )
    ),
    'decode' => array(
        'sub1.example.de' => array(
            'GETvars' => array(
                'tx_myext_myplugin[fieldname]' => '1',
            ),
        ),
        'custom2.example.de' => array(
            'GETvars' => array(
                'tx_myext_myplugin[fieldname]' => '2',
            ),
        )
    )
)

When the _DOMAINS configuration is used, UrlDecoder populates the wrong GET parameters.

Example:

Browser 1 visits the URL "http://custom2.example.de/kontakt.html?no_cache=1".
UrlDecoder will decode the following GET-params ...

array(
    'id' => 42,
    'tx_myext_myplugin[fieldname]' => 2,
    'no_cache' => 1,
)

... and create a CacheEntry with the following data:

page_id:           42
rootpage_id:       1
original_url:      id=42&tx_myext_myplugin[fieldname]=2&no_cache=1
speaking_url:      kontakt.html?no_cache=1
request_variables: {"id":"42","tx_myext_myplugin[fieldname]":"2","no_cache":"1"}

Browser 2 visits the same page via another subdomain: "http://sub1.example.de/kontakt.html?no_cache=1".
UrlDecoder decodes the same GET parameters as for Browser 1, because its lookup in the tx_realurl_urldata cache table does not take the domain into account. The value 2 belongs to custom2.example.de; for sub1.example.de it should be 1.

array(
    'id' => 42,
    'tx_myext_myplugin[fieldname]' => 2,
    'no_cache' => 1,
)
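
The collision can be reproduced outside of realurl with a small stand-alone sketch. This is not realurl's code: the `$cache` array and the functions `lookupWithoutHost()` and `lookupWithHost()` are hypothetical stand-ins for the tx_realurl_urldata lookup, and the `host` field is an assumption added only to show what a domain-aware key would change.

```php
<?php
// Hypothetical stand-in for tx_realurl_urldata: NOT realurl's actual code.
// This is the entry written when Browser 1 requested custom2.example.de.
$cache = array(
    array(
        'host' => 'custom2.example.de', // assumption: not stored by realurl
        'rootpage_id' => 1,
        'speaking_url' => 'kontakt.html?no_cache=1',
        'request_variables' => array(
            'id' => '42',
            'tx_myext_myplugin[fieldname]' => '2',
            'no_cache' => '1',
        ),
    ),
);

// Lookup as described in the issue: the domain is not part of the key.
function lookupWithoutHost(array $cache, $rootPageId, $speakingUrl)
{
    foreach ($cache as $entry) {
        if ($entry['rootpage_id'] === $rootPageId
            && $entry['speaking_url'] === $speakingUrl) {
            return $entry['request_variables'];
        }
    }
    return null;
}

// Hypothetical variant that also compares the host.
function lookupWithHost(array $cache, $host, $rootPageId, $speakingUrl)
{
    foreach ($cache as $entry) {
        if ($entry['host'] === $host
            && $entry['rootpage_id'] === $rootPageId
            && $entry['speaking_url'] === $speakingUrl) {
            return $entry['request_variables'];
        }
    }
    return null;
}

// Browser 2 asks for sub1.example.de but gets the entry written for custom2.
$stale = lookupWithoutHost($cache, 1, 'kontakt.html?no_cache=1');
var_dump($stale['tx_myext_myplugin[fieldname]']); // string(1) "2"

// With the host in the key there is no hit, so the URL would be decoded afresh.
$miss = lookupWithHost($cache, 'sub1.example.de', 1, 'kontakt.html?no_cache=1');
var_dump($miss); // NULL
```

The first lookup returns the value 2 for a request that should yield 1; the second shows that a host-aware key would simply miss and force a fresh decode.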

If we exclude the subdomain-driven GET parameter "tx_myext_myplugin[fieldname]" via cache/ignoredGetParametersRegExp = /^(?:glcid|utm_[a-z]+|pk_campaign|pk_kwd|TSFE_ADMIN_PANEL.*|tx_myext_myplugin\[fieldname\])$/, the CacheEntry no longer stores that parameter in its requestVariables and decoding works.
Unfortunately, the GET parameter is then appended to the generated URL as-is instead of being replaced by the configured subdomain.
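
As a quick check of the pattern itself, the following sketch applies it to the request variables from the example. It only mimics the filtering step with plain `preg_match()`; how realurl applies the setting internally is not shown in this thread, so the loop is an assumption.

```php
<?php
// Pattern as quoted in the issue; the escaped brackets match the literal
// "[fieldname]" suffix of the parameter name.
$pattern = '/^(?:glcid|utm_[a-z]+|pk_campaign|pk_kwd|TSFE_ADMIN_PANEL.*|tx_myext_myplugin\[fieldname\])$/';

$requestVariables = array(
    'id' => '42',
    'tx_myext_myplugin[fieldname]' => '2',
    'no_cache' => '1',
);

// Keep only the variables whose names do NOT match the ignore pattern.
$kept = array();
foreach ($requestVariables as $name => $value) {
    if (!preg_match($pattern, $name)) {
        $kept[$name] = $value;
    }
}

print_r(array_keys($kept)); // remaining names: id, no_cache
```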

We suggest one of the following options:

  1. All GET parameters configured in the _DOMAINS configuration are removed from the requestVariables of a CacheEntry.
  2. A new configuration option (maybe cache/removeGetParametersRegExp) removes matching GET parameters while encoding.
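
Option 1 could look roughly like the sketch below. `stripDomainGetVars()` is a hypothetical helper, not an existing realurl function; it assumes the _DOMAINS array has the `encode`/`decode` shape shown in the configuration above. Whether this is safe for language domains is not addressed here.

```php
<?php
// Hypothetical helper, not part of realurl: drops every GET variable that the
// _DOMAINS configuration controls from a cache entry's request variables.
function stripDomainGetVars(array $requestVariables, array $domainsConfig)
{
    $names = array();

    // Names used by the encode rules ('GETvar' => ...).
    $encode = isset($domainsConfig['encode']) ? $domainsConfig['encode'] : array();
    foreach ($encode as $rule) {
        if (isset($rule['GETvar'])) {
            $names[$rule['GETvar']] = true;
        }
    }

    // Names set by the decode rules ('GETvars' => array(name => value)).
    $decode = isset($domainsConfig['decode']) ? $domainsConfig['decode'] : array();
    foreach ($decode as $hostConfig) {
        if (isset($hostConfig['GETvars'])) {
            $names += array_fill_keys(array_keys($hostConfig['GETvars']), true);
        }
    }

    return array_diff_key($requestVariables, $names);
}

// Usage with a trimmed version of the configuration from this issue.
$domains = array(
    'encode' => array(
        array(
            'GETvar' => 'tx_myext_myplugin[fieldname]',
            'value' => '1',
            'urlPrepend' => 'http://sub1.example.de',
        ),
    ),
    'decode' => array(
        'sub1.example.de' => array(
            'GETvars' => array('tx_myext_myplugin[fieldname]' => '1'),
        ),
    ),
);

$cleaned = stripDomainGetVars(
    array('id' => '42', 'tx_myext_myplugin[fieldname]' => '2', 'no_cache' => '1'),
    $domains
);
print_r($cleaned); // only 'id' and 'no_cache' remain
```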

Best regards

P.S.: For everyone wondering why we generate what looks like duplicate content:
It's for user generated content and we have proper canonical URLs.

p2media, Apr 20 '17 10:04

I think I am having the same problem, but in my case I get wrong page paths. Initially, all page paths are generated with the values from the default language. After a pages_language_overlay record is edited, the "old" paths are expired and a new entry is created for the localization. If all caches are flushed, however, the wrong params appear again. I think that adding the domain to a cache entry could help in that case.

akiessling, May 04 '17 11:05

Unfortunately, URLs will be ambiguous in this configuration. The URL has to change when you use parameters, otherwise realurl will not be able to decode it properly.

When you use _DOMAINS for languages, the URL changes because of the translation, so it is fetched correctly when decoding.

I believe that the suggested solutions would break language domains, so they cannot be accepted.

I am open to suggestions, but I think there is no solution for this case with the stock version of realurl. You need something custom here.

dmitryd, May 27 '17 10:05