typo3-realurl

$requestVariables['L'] should only store INT values to tx_realurl_urldata Table

Open velletti opened this issue 8 years ago • 18 comments

We have a multi-language setup for our homepage.

en.html is mapped to index.php?L=0 and de.html is mapped to index.php?L=1

Rootpage ID = 76

Today it happened that the following entry was stored into tx_realurl_urldata :

original_url = L=1%27A%3D0&id=76
speaking_url = de.html
request_variables = {"id":"76","L":"1'A=0"}

After that, "de.html" shows the content of the English version. Deleting that row in the table 'tx_realurl_urldata' fixes the problem, but it reappeared.

My (custom-made) RealUrl configuration file looks like this:

`$TYPO3_CONF_VARS['EXTCONF']['realurl'] = array(
  '_DEFAULT' => array(
    'init' => array(
        'enableCHashCache' => 1,
        'appendMissingSlash' => 'ifNotFile',
        'enableUrlDecodeCache' => 1,
        'enableUrlEncodeCache' => 1,
        'emptyUrlReturnValue' => '/',
        'postVarSet_failureMode' => '',
    ),

    'cache' => array(
        'banUrlsRegExp' => '/ContactLeadId=|gclid=|type=|(?:^|\?|&)q=/'
    ),

    'redirects' => array(),
    'preVars' => array(
        array(
            'GETvar' => 'L',
            'valueMap' => array(
                // disable all languages that are not going live after all
                //'en' => 0, //international (needs no url part because its the default language)
                'de' => 1, //germany
                'it' => 2, //italy
                'cz' => 3, //czechrepublic
                'fr' => 4, //france
                'ch_de' => 6, //switzerland - german
                'at' => 7, //austria
                'es' => 18, //spain
                'ch_fr' => 19, //switzerland - french
            ),
            'noMatch' => 'bypass',
        ),
    ),
    'pagePath' => array(
        'type' => 'user',
        'userFunc' => 'EXT:realurl/class.tx_realurl_advanced.php:&tx_realurl_advanced->main',
        'spaceCharacter' => '-',   ..... .... `

Solution:

In Classes/Controller/UrlCacheController.php, change lines 581 to 583 from

`if (isset($requestVariables['L'])) { $this->detectedLanguageId = (int)$requestVariables['L']; }`

to this:

`if (isset($requestVariables['L'])) { $requestVariables['L'] = (int)$requestVariables['L']; $this->detectedLanguageId = $requestVariables['L']; }`

and in Classes/Cache/DatabaseCache.php, add `$requestVariables['L'] = (int)$requestVariables['L'];` before line 141 (`$this->databaseConnection->exec_UPDATEquery('tx_realurl_urldata',`).
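The effect of the proposed cast can be illustrated outside of realurl. This is only a minimal sketch, assuming PHP 7 or later; `normalizeLanguageParameter()` is a hypothetical helper and not part of realurl's code.

```php
<?php
/**
 * Hypothetical helper (not part of realurl): cast the language parameter
 * to an integer before the request variables are stored or compared.
 */
function normalizeLanguageParameter(array $requestVariables): array
{
    if (isset($requestVariables['L'])) {
        // (int) keeps only the leading digits of a string: "1'A=0" becomes 1.
        $requestVariables['L'] = (int)$requestVariables['L'];
    }
    return $requestVariables;
}

// The request variables from the broken row reported above.
$stored = normalizeLanguageParameter(['id' => '76', 'L' => "1'A=0"]);
echo json_encode($stored), PHP_EOL;
```

With the cast in place, the malformed value and the valid value 1 would end up as the same language ID in the stored data.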

I will try to create a pull request.

velletti avatar Oct 11 '16 09:10 velletti

Sorry, but I was not successful in creating a pull request :-(

After my commits and branch creation did not work, I created a fork and committed there; see https://github.com/velletti/typo3-realurl/commit/69fc9ef464128e9f7858d4f6c8ca6a33b8009130

velletti avatar Oct 11 '16 10:10 velletti

Please, fix your TypoScript instead: https://docs.typo3.org/typo3cms/TyposcriptReference/Setup/Config/Index.html#linkvars

It is not realurl's job to change or manipulate parameters passed by users. Realurl uses and stores them as is. You should always have a check in your linkVars that L is an integer!

In addition your custom config uses non-existing options.

dmitryd avatar Oct 11 '16 17:10 dmitryd

Sorry for disturbing again: linkVars = L(int) does not help : Real Url was still storing Request like &L='A=123 to the table.

For now, I have added some lines to canCacheUrl() to check whether "L" is an allowed parameter.

It could also be done in the parseUrlParameters() function, because there Real Url just takes the incoming request without checking the parameters against the realurl configuration:

'preVars' => array (
            0 => array (
                'GETvar' => 'L',
                'valueMap' => array (

If you want, I can prepare a pull request for that.
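The valueMap check described above could look roughly like this. It is a sketch only, assuming PHP 7 or later: `isAllowedLanguage()` is a hypothetical function, not realurl's actual canCacheUrl(), and the `$valueMap` array is a shortened copy of the configuration from the first post.

```php
<?php
// Hypothetical check, not realurl's actual canCacheUrl(): accept L only if
// it is one of the values configured in the preVars valueMap.
function isAllowedLanguage($value, array $valueMap): bool
{
    // Compare as strings so "1'A=0" can never loosely match the integer 1.
    return in_array((string)$value, array_map('strval', $valueMap), true);
}

// Shortened copy of the valueMap from the configuration above.
$valueMap = ['de' => 1, 'it' => 2, 'cz' => 3];

var_dump(isAllowedLanguage('1', $valueMap));     // valid language ID
var_dump(isAllowedLanguage("1'A=0", $valueMap)); // malformed value
```

Note that a default language that is not listed in the valueMap (L=0 in the configuration above) would need to be allowed separately.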

I think this bug happens really rarely. According to my research, a misconfigured email campaign by one of our partners creates this kind of wrong link, but it was definitely breaking our website ...

velletti avatar Oct 13 '16 07:10 velletti

Realurl processes parameters as it gets them. It will not convert anything to int. Please, do it in the calling side.

The goal of RealURL is to convert URLs as is, according to the config.

linkVars = L(int) does not help : Real Url was still storing Request like &L='A=123 to the table.

What realurl & TYPO3 version do you use? It looks to me like it is some old realurl & TYPO3. L(int) is supported in the new TYPO3 and new realurl does not store anything when decoding. Thus the problem cannot happen.

dmitryd avatar Oct 13 '16 08:10 dmitryd

TYPO3: 6.2.7 LTS, updated from 4.5 one year ago

Real Url: 2.1.4, updated from 1.3.x four weeks ago via 2.0.x and all versions in between (that's why there are still some old values in my realurl configuration)

-> "Please, do it in the calling side." I would like to, but I do not know the "calling side". I think it is an email that has already been sent out to all customers, so it cannot be changed.

L(int) may help INSIDE of TYPO3, but I think the problem is in UrlEncoder.php:

`protected function parseUrlParameters() {
    $urlParts = parse_url($this->urlToEncode);
    $this->urlParameters = array();
    if ($urlParts['query']) {
        // Cannot use parse_str() here because we do not need deep arrays here.
        $parts = GeneralUtility::trimExplode('&', $urlParts['query']);
        foreach ($parts as $part) {
            list($parameter, $value) = explode('=', $part);
            // Remember: urldecode(), not rawurldecode()!
            $this->urlParameters[urldecode($parameter)] = urldecode($value);
        }
    }
    $this->originalUrlParameters = $this->urlParameters;

    $sortedUrlParameters = $this->urlParameters;
    $this->sortArrayDeep($sortedUrlParameters);
    $this->originalUrl = $this->createQueryStringFromParameters($sortedUrlParameters);
}`

So in my error case $this->originalUrl contains "L" => "1'A=1234", because the function does not check the value against the configuration.

If someone calls our website with index.php?id=76&L=1, I get "de.html" with the valid L parameter 1: perfect.

If someone calls our website with index.php?id=76&L=1'A=1234 (a bad guy or a bad external script), I get "de.html" with the invalid L parameter 1'A=1234. This is added as an additional row in the tx_realurl_urldata table, and the page shows the English version of the website.
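The difference between the two requests can be reproduced with a simplified stand-in for the parsing step. This is a sketch under assumptions, written for PHP 7 or later: `parseQuery()` is a hypothetical function that mimics the explode/urldecode logic without any TYPO3 classes, and the bad query string is the URL-encoded form stored in the table row from the first post.

```php
<?php
// Simplified, hypothetical stand-in for the parsing logic quoted above.
function parseQuery(string $query): array
{
    $parameters = [];
    foreach (explode('&', $query) as $part) {
        // A limit of 2 keeps any further "=" inside the value;
        // array_pad() covers parameters without a value.
        list($name, $value) = array_pad(explode('=', $part, 2), 2, '');
        $parameters[urldecode($name)] = urldecode($value);
    }
    return $parameters;
}

$good = parseQuery('id=76&L=1');
$bad  = parseQuery('id=76&L=1%27A%3D0');

var_dump($good['L'] === $bad['L']);           // compares the raw strings
var_dump((int)$good['L'] === (int)$bad['L']); // compares the integer casts
```

The raw strings differ, so the two requests produce two distinct entries for the same speaking URL, while an integer cast would map both to the same language ID.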

velletti avatar Oct 13 '16 11:10 velletti

First, there is no such thing as "Real Url".

Secondly, if the original URL contains L in this bad format, then your TypoScript is clearly wrong, because the limitation for L(int) definitely works in 6.2. See here. parseUrlParameter parses the link as it is sent to realurl by TYPO3. It is up to you how to generate the link. But you have to do it correctly. Maybe you improperly use addQueryString in typolink, maybe you do something else wrong. But you must ensure that links come in the correct format to realurl. Realurl always encodes URLs as is. It will not fix anything for you. If you give bad input, you get bad output. Not sure why you expect something else.

I cannot help you with this. Fix your installation to provide valid links to realurl. Use the debugger to see where the wrong URL comes from and fix that place. It is an error in your implementation if L comes as non-integer. TYPO3 provided a solution with linkVars for years especially to solve this exact problem.

dmitryd avatar Oct 13 '16 14:10 dmitryd

I have seen a similar problem as described here:

TYPO3-Version: 6.2.30
realurl-Version: 2.1.4

TypoScript:

realurl

config {
  simulateStaticDocuments = 0
  baseURL = ...
  tx_realurl_enable = 1
  uniqueLinkVars = 1
  linkVars = L(0-1)
  defaultGetVars.L = 0
  ....
}

Extract of tx_realurl_urldata:
original_url: L=0%27A%3D0&id=6
speaking_url: /de/page.../

The problem is, you can call URLs with any parameters you wish and they will be inserted into the realurl cache. I don't see config.linkVars really doing anything to prevent that.

The speaking URL is without the L=..., but it will be mapped into the original query.

sypets avatar Feb 14 '17 18:02 sypets

@sypets Try to update RealUrl version. I believe from version 2.1.5 no entries are added to realurl cache during decoding, so the problem you're describing won't happen.

chesio avatar Feb 15 '17 10:02 chesio

The problem still may happen if you use typolink.addQueryString without exclude=L. This option bypasses checks for config.linkVars and it is the most typical case of bad parameters in links.

dmitryd avatar Feb 16 '17 10:02 dmitryd

Good morning! We have the same problem:

TYPO3: 7.6.15
RealURL: 2.1.5

If we enable RealURL, &L=1 becomes &L=1'A=0, so the language cannot be used.

We tried everything from above, but it does not work! :(

tx_realurl_enable = 1
htmlTag_langKey = en

linkVars = L(0-13)
uniqueLinkVars = 1

$GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['realurl'] = array(
    '_DEFAULT' => array(
        'init' => array(
            'enableCHashCache' => 0,
            'appendMissingSlash' => 'ifNotFile',
            'enableUrlDecodeCache' => 0,
            'enableUrlEncodeCache' => 0,
            'postVarSet_failureMode' => '',
        ),
        'redirects' => array(),
        'preVars' => array(
            array(
                'GETvar' => 'no_cache',
                'valueMap' => array(
                    'no_cache' => 1,
                ),
                'noMatch' => 'bypass',
            ),
            array(
                'GETvar' => 'L',
                'valueMap' => array(
                    'us' => '1',
                    'de' => '2',
                    'fr' => '3',
                    'es' => '5',
                    'it' => '4',
                    'en' => '0',
                ),
                'noMatch' => 'bypass',
            ),
        ),
        'pagePath' => array(
            'type' => 'user',
            'userFunc' => 'EXT:realurl/class.tx_realurl_advanced.php:&tx_realurl_advanced->main',
            'spaceCharacter' => '-',
            'languageGetVar' => 'L',
            'expireDays' => 7,
            'rootpage_id' => 1,
            'firstHitPathCache' => 1,
            'disablePathCache' => 1,
        ),
        'fixedPostVars' => array(),
    ),
);

The other languages are working correctly.

What can we do?

Greetings, Johannes

jhamecher avatar Mar 16 '17 07:03 jhamecher

PS: After using this SQL query the error is gone: DELETE FROM tx_realurl_urldata WHERE request_variables LIKE '%A=%'

The problem is that RealURL is adding false entries to the tx_realurl_urldata table.

jhamecher avatar Mar 16 '17 09:03 jhamecher

@jhamecher: yes, the bad URL entries will come back if, for example, an external link carries this badly encoded parameter. In my case it is an email newsletter with added Google Tag Manager parameters, inserted with the wrong encoding by a marketing guy and finally decoded wrongly by the user's email program.

The TypoScript settings help TYPO3 behave correctly and select the correct language, but in Real URL $requestVariables['L'] still contains "1'A=0d..." ..

A simple intval() cast in Real URL before storing and before comparing would help (see my initial post; the line numbers may have changed in the meantime).

BTW: I love real Url ..

velletti avatar Mar 16 '17 10:03 velletti

@jhamecher

The problem is that RealURL is adding false entries to the tx_realurl_urldata table.

There is no such thing as "false entries". Realurl receives those URLs from TYPO3. If TYPO3 asked to encode such URLs, realurl has to handle it.

As I wrote before, it is your job to make sure that your URLs are secure and use correct format.

dmitryd avatar Mar 16 '17 13:03 dmitryd

@dmitryd will it help if we "donate" something (we have no PayPal and need a payment receipt)? Together with a new patch for the current version?

With each update of real url I have to reimplement my changes, because I AM NOT THE BAD GUY who is calling our website with www.allplan.com/id=12345&L=1'A=1234&_utm=123452134 ..

This link is in some old email newsletters already sent out by one of my co-workers. The mails are out in the wild, so there is no way back. And sometimes a new marketing guy makes the same mistake: he takes a link from the TYPO3 backend, adds his Google Analytics tracking code at the end, does not encode it correctly, and puts it into the link redirector tracking code of the newsletter tool.

And then there is such a broken link again.

Type casting of the L parameter helps TYPO3 react with the correct language, but in the realurl decoder script $requestVariables['L'] finally contains "1'A=1234". After that, calls to the realurl URL of that page (something like /de/products/specialprice.html/) deliver the English version instead of the German one.

It would really be helpful to add these 2 additional checks to real url ...

velletti avatar Mar 16 '17 14:03 velletti

Thank you, we will check it.

jhamecher avatar Mar 16 '17 15:03 jhamecher

@velletti

config.linkVars = L(0-2)

or something like that should solve your issue. If you use typolink.addQueryString, exclude L and set it manually there. That's all you need to do.

You can also add Apache rules that check that L is numeric and send a 410 gone. I did not test but something like this should do:

RewriteCond %{QUERY_STRING} (^|&)L=
RewriteCond %{QUERY_STRING} !(^|&)L=\d+(&|$)
RewriteRule .* - [G,L]

We are talking about security here. You need to secure your site to disallow arbitrary values.

Since there is demand for an automatic solution, I thought of a separate extension that would secure parameters. However, I did not have much time to do it. It has to be a separate extension because realurl's goal is to encode passed links as is. It is just the protocol that realurl follows.

So there should be something inserted before realurl that will fix links if the developer could not do it with TypoScript.

dmitryd avatar Mar 16 '17 15:03 dmitryd

Okay. I made a change. Now:

  1. There will be no exceptions
  2. Logging will go only to the devLog
  3. The url with wrong language will not be encoded
  4. Page cache will be disabled to prevent spreading of wrong L values.

Do you think it also makes sense to add X-Robots-Tag http header to prevent such bad pages from being indexed by Google?

dmitryd avatar Mar 17 '17 10:03 dmitryd

Thank you!

jhamecher avatar Jun 05 '17 13:06 jhamecher