flattr-extension

Recognize original URLs from translation services

Open da2x opened this issue 7 years ago • 4 comments

Flattr should recognize translation services, extract the original URL from their links, and credit that original URL. About 4% of the traffic to my own humble English-language corner of the web comes in through these translation proxy services.

Baidu:

  • http://fanyi.baidu.com/transpage?query=https%3A%2F%2Fwww.example.com%2F&source=url&ie=utf8&from=it&to=zh&render=1

Bing Translator and Microsoft Translate:

  • https://www.translatetheweb.com/?from=&to=en&a=https://www.example.com/
  • https://www.microsofttranslator.com/bv.aspx?from=&to=en&a=https://www.example.com/

Google Translate:

  • https://translate.google.com/translate?sl=auto&tl=en&js=y&prev=_t&hl=en&ie=UTF-8&u=https%3A%2F%2Fwww.example.com&edit-text=&act=url
  • https://translate.googleusercontent.com/translate_c?act=url&depth=1&hl=en&ie=UTF8&prev=_t&rurl=translate.google.com&sl=auto&sp=nmt4&tl=en&u=https://www.example.com/&xid={redacted}&usg={redacted}

Yandex Translate:

  • https://translate.yandex.ru/translate?url=https%3A%2F%2Fwww.example.com%2F&lang=en-ru
  • https://translate.yandex.com/translate?url=https%3A%2F%2Fwww.example.com%2F&lang=ru-en

Not sure if this should be performed in the extension or server-side, though.
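
To make the request concrete, here is a minimal sketch of the extraction in Python. It assumes the hosts and query parameters are exactly those in the example URLs above; the table `TRANSLATOR_PARAMS` and the function `original_url` are hypothetical names for illustration, not part of Flattr's extension or server code.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical lookup table: translation proxy host -> query parameter that
# carries the original URL. Derived only from the example URLs in this issue;
# the services may use other hosts or parameters as well.
TRANSLATOR_PARAMS = {
    "fanyi.baidu.com": "query",
    "www.translatetheweb.com": "a",
    "www.microsofttranslator.com": "a",
    "translate.google.com": "u",
    "translate.googleusercontent.com": "u",
    "translate.yandex.ru": "url",
    "translate.yandex.com": "url",
}


def original_url(url):
    """Return the proxied page's URL, or the input unchanged if no known proxy matches."""
    parts = urlsplit(url)
    param = TRANSLATOR_PARAMS.get((parts.hostname or "").lower())
    if param is None:
        return url
    # parse_qs percent-decodes values, so encoded and unencoded forms both work.
    values = parse_qs(parts.query).get(param)
    if not values:
        return url
    candidate = values[0]
    # Only accept absolute http(s) URLs as the page to credit.
    if urlsplit(candidate).scheme in ("http", "https"):
        return candidate
    return url
```

One caveat with this approach: the Bing and Microsoft links pass the original URL unencoded, so an original URL that itself contains `&` would be cut off at that character.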

da2x avatar Aug 08 '18 11:08 da2x

Hardcoding such services would likely not be a good approach, so I'll check whether there's metadata on those pages that points to the canonical page.
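
For reference, a metadata check of this kind could look roughly like the sketch below. It assumes the translated page's HTML has already been fetched and that a canonical pointer, if present, would be a standard `<link rel="canonical">` element; `CanonicalFinder` and `find_canonical` are hypothetical names, not existing extension code.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element, if any."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attributes.get("href"):
            self.canonical = attributes["href"]


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

If this returns `None` for a translated page, the page offers no canonical hint and the URL itself is the only remaining source of information.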

In general, please be aware that the extension doesn't support web applications at this point, as they have a different usage pattern than news sites, blogs, videos, or other content that can be consumed. This is therefore an interesting edge case: it's technically a web application, but its purpose is to show you consumable content.

ThomasGreiner avatar Aug 30 '18 11:08 ThomasGreiner

Oh, these aren't web applications either. They're content-translating HTTP proxies.

I already looked for useful metadata, but there are literally just the URL patterns to work with here. All of these come from market-leading search engine providers, which display links (in the above formats) to translations next to foreign-language search results.

da2x avatar Aug 30 '18 12:08 da2x

It'd be interesting to know how search engines treat such translated pages. Since there's no metadata on those they may just ignore and not include them in search results.

On another note I quickly wanted to mention that this issue has been passed on to the server team as they want to look into this use-case on their end.

ThomasGreiner avatar Sep 04 '18 13:09 ThomasGreiner

> It'd be interesting to know how search engines treat such translated pages. Since there's no metadata on those they may just ignore and not include them in search results.

All of the example links are actually included on search result pages. You’ll see them as “Translate page” links next to the main foreign-language search result on Bing, Baidu, Yandex, and Google search. Try searching for something in a foreign language on English-language Bing or Google, for example.

> On another note I quickly wanted to mention that this issue has been passed on to the server team as they want to look into this use-case on their end.

Thanks!

da2x avatar Sep 04 '18 13:09 da2x