no-google-amp-bot
no-google-amp-bot copied to clipboard
Please use "canonical" result from getting the google amp page
I think the most reliable way to get the correct target page is to retrieve the google amp link and look for canonical relation in the HTML and embedded Javascript. Consider the following PCRE regexps:
<link\s+rel="?canonical"?\s+href="?(.*?)"?\s*/?>
<link\s+href="?(.*?)"?\s+rel="?canonical"?\s*/?>
"canonicalUrl":"(.*?)"
The target URL would be in the first grouping.