
All pages have "X-Robots-Tag: none, noimageindex" header

piotrpog opened this issue 4 years ago · 37 comments

I installed this plugin only for its sitemap functionality, but I recently noticed that it attaches an X-Robots-Tag: none, noimageindex HTTP header to all pages by default.

Why is that? Can I fix it somehow?

piotrpog avatar Aug 25 '19 16:08 piotrpog

@piotrpog Hmm, that's new to me. Do you have a site that's currently exhibiting this behaviour you could link to? Is the header on all pages or just sitemap pages?
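A quick way to check from the command line (a sketch: the header line below is the value reported in this thread; in practice you'd capture real headers with `curl -sI https://your-site/`):

```shell
# Sample response headers; the x-robots-tag line is the one this issue
# is about. Capture real ones with: curl -sI https://your-site/
headers='HTTP/2 200
content-type: text/html; charset=UTF-8
x-robots-tag: none, noimageindex'

# Count matching header lines: 1 means the blocking header is present.
printf '%s\n' "$headers" | grep -ci '^x-robots-tag'
```

If this prints 0 for every page, the plugin isn't the culprit.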

alexjcollins avatar Aug 26 '19 11:08 alexjcollins

@alexjcollins It is on all pages, not only those in the sitemap. For the moment I can't link to the website because I disabled the plugin to make my site appear in Google again.

piotrpog avatar Sep 12 '19 15:09 piotrpog

@alexjcollins We're also having this issue; you can see it in the response headers at https://jonesfoster.com

Our hosting provider came back with:

# grep -rnw /xxx/xxx -e 'X-Robots-Tag'
/xxx/vendor/ether/seo/src/services/SeoService.php:18:  * Adds the `X-Robots-Tag` header to the request if needed.

So something around that line is adding it. The SEO plugin is at 3.4.4, but we can't update to the latest just yet.

bymayo avatar Sep 12 '19 15:09 bymayo

I've updated to Craft 3.3.3 and SEO 3.6.2 and the problem is still there.

Just for clarification: it was our client who pointed this out when they tried to put their site into Google Search Console and it wouldn't index any pages.

EDIT: It seems it only applies when dev mode is on! 🤦‍♂️ Case closed... But it might be worth mentioning this in the docs.

bymayo avatar Sep 13 '19 08:09 bymayo

I had the same issue. We switched production to devMode true just to quickly see what the exact error was. We changed back to devMode false and cleared the cache; somehow, however, this did not remove the header. Unfortunately we did not notice this: our client experienced a drop in ranking and notified us.

It may be safer to base this not on devMode but on the environment setting: anything other than 'production', perhaps.

rolfkokkeler avatar Dec 03 '19 14:12 rolfkokkeler

It looks like I have the same problem: the site was in dev mode, then changed to production via the .env file. But robots.txt is still set to:

User-agent: *
Disallow: /

how did you manage to apply the change to prod ?

puck3000 avatar May 27 '20 17:05 puck3000

ps: site is https://www.anis.ch

puck3000 avatar May 27 '20 17:05 puck3000

@puck3000 Is the site definitely in production mode? Also, what does your system Robots setting look like? Here's the default for reference:

Screenshot 2020-05-27 at 19 08 22

alexjcollins avatar May 27 '20 18:05 alexjcollins

@alexjcollins Thank you for caring ;-) Yes, the site definitely is in production; the "ENVIRONMENT" variable in .env is set to "production", and the robots settings are untouched and look the same as in your screenshot.

puck3000 avatar May 27 '20 18:05 puck3000

@puck3000 Thanks for the reply.

Okay, that's really strange – if the robots settings are identical, you should have a sitemap reference at the top of your robots.txt file.

Is there any chance that you already have a physical robots.txt file in /web that could be overriding the plugin generated version?

alexjcollins avatar May 28 '20 06:05 alexjcollins

Hi Alex, I checked and no, there's no robots.txt file. Then I tried to add one, and strangely, even if I add a physical robots.txt to the web root, I still see

User-agent: *
Disallow: /

at anis.ch/robots.txt. When I place another file, like text.txt, in the web root, it works as it should.

Is there any other place, where this "wrong" robots.txt could be generated?


puck3000 avatar May 28 '20 07:05 puck3000

@puck3000 When in production mode, do you have devMode set to true in config/general.php?

alexjcollins avatar May 28 '20 10:05 alexjcollins

no, it is only set in dev mode:


<?php
/**
 * General Configuration
 *
 * All of your system's general configuration settings go in here. You can see a
 * list of the available settings in vendor/craftcms/cms/src/config/GeneralConfig.php.
 *
 * @see \craft\config\GeneralConfig
 */

return [
    // Global settings
    '*' => [
        // Default Week Start Day (0 = Sunday, 1 = Monday...)
        'defaultWeekStartDay' => 1,

        // Whether generated URLs should omit "index.php"
        'omitScriptNameInUrls' => true,

        // Control Panel trigger word
        'cpTrigger' => 'admin',

        // The secure key Craft will use for hashing and encrypting data
        'securityKey' => getenv('SECURITY_KEY'),

        // Whether to save the project config out to config/project.yaml
        // (see https://docs.craftcms.com/v3/project-config.html)
        'useProjectConfigFile' => false,
    ],

    // Dev environment settings
    'dev' => [
        // Dev Mode (see https://craftcms.com/guides/what-dev-mode-does)
        'devMode' => true,
    ],

    // Staging environment settings
    'staging' => [
        // Set this to `false` to prevent administrative changes from being made on staging
        'allowAdminChanges' => true,
    ],

    // Production environment settings
    'production' => [
        // Set this to `false` to prevent administrative changes from being made on production
        'allowAdminChanges' => true,
    ],
];

should I set it explicitly to false in production?
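For reference, making the override explicit would only mean extending the 'production' block of the file above like this (a config sketch; Craft's devMode already defaults to false):

```php
// Production environment settings
'production' => [
    // Set this to `false` to prevent administrative changes from being made on production
    'allowAdminChanges' => true,

    // Explicitly disable dev mode in production (false is already the default)
    'devMode' => false,
],
```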


puck3000 avatar May 28 '20 11:05 puck3000

@puck3000 Might be worth giving it a go, although I'm pretty sure it'll be false by default.

alexjcollins avatar May 28 '20 13:05 alexjcollins

@alexjcollins Sadly you were right; setting it explicitly didn't change anything...

puck3000 avatar May 28 '20 13:05 puck3000

@puck3000 It’s a big ask, but is there any possibility of sending over your site files and a database dump?

If you can, please could you send to [email protected]

alexjcollins avatar May 28 '20 14:05 alexjcollins

Possibly having the same issue with X-Robots-Tag: none, noimageindex being applied, but I cannot find the source. Is there something I can look up or change to remove the tag?

uaextension avatar Aug 01 '20 03:08 uaextension

I had the same issue: ENVIRONMENT="production", but with devMode set to true in general.php it activated the X-Robots-Tag!

I ran into this while migrating content: devMode was true in production while I was troubleshooting, and I left it on in case something came up.

This resulted in about 60 important pages being unindexed over a couple of days.

I tested with https://search.google.com/search-console in both settings and found this to be the culprit.

I'm now wary, but it would be nice to have the option to ignore the X-Robots-Tag based on the general.php config settings. Was this a holdover from Craft CMS 2?

charlietriplett avatar Oct 16 '20 15:10 charlietriplett

I've also discovered this bug, for me it was because I had ENVIRONMENT=live instead of ENVIRONMENT=production. That's a pretty severe bug for an SEO plugin to have.

RyanRoberts avatar Jun 09 '21 11:06 RyanRoberts

Facing this issue as well. I use dev, staging and prod instead, and this plugin literally checks for the string 'production'. Not good.

src/services/SeoService.php :: 26
if (CRAFT_ENVIRONMENT !== 'production')
{
    $headers->set('x-robots-tag', 'none, noimageindex');
    return;
}

Neither a plugin nor Craft itself should hard-code this.
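For illustration, a check driven by explicit flags rather than a hard-coded environment name could look like this (a hypothetical sketch, not the plugin's actual code; in Craft the flags would come from the devMode and disallowRobots general config settings):

```php
<?php
// Hypothetical sketch: derive the header value from explicit flags
// instead of comparing CRAFT_ENVIRONMENT against the string 'production'.
function robotsHeaderValue(bool $devMode, bool $disallowRobots): ?string
{
    if ($devMode || $disallowRobots) {
        return 'none, noimageindex'; // block indexing outside production
    }

    return null; // production: send no X-Robots-Tag at all
}
```

That way the behaviour follows the site's actual configuration, whatever the environment happens to be called.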

bertoost avatar Sep 19 '21 07:09 bertoost

Thank you @bertoost! I was trying to find a solution where Google was reporting that the x-robots-tag was stopping a client site from being crawled. Our env was set to prod.

jamiematrix avatar Feb 03 '22 15:02 jamiematrix

Even removing

if (CRAFT_ENVIRONMENT !== 'production')
{
    $headers->set('x-robots-tag', 'none, noimageindex');
    return;
}

does not solve the issue: x-robots-tag is then still set to none.

ineghi avatar Feb 09 '22 15:02 ineghi

Still got that issue with Craft CMS 4 version.

// services/SeoService.php

$env = getenv('ENVIRONMENT') ?? getenv('CRAFT_ENVIRONMENT');

If I use CRAFT_ENVIRONMENT and not ENVIRONMENT, $env returns false from the line above, so the header is set to block robots.

If I include both (which is ridiculous), as above, it now works:

ENVIRONMENT=production
CRAFT_ENVIRONMENT=production
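What's likely happening: PHP's getenv() returns false (not null) when a variable is unset, and the null-coalescing operator ?? only falls back on null, so the fallback to CRAFT_ENVIRONMENT never runs. A minimal standalone demonstration (not the plugin's code):

```php
<?php
// Make sure only CRAFT_ENVIRONMENT is set, as in the report above.
putenv('ENVIRONMENT'); // unset
putenv('CRAFT_ENVIRONMENT=production');

// getenv() returns false, not null, for an unset variable, so `??`
// (which only falls back on null) never reaches the second call.
$env = getenv('ENVIRONMENT') ?? getenv('CRAFT_ENVIRONMENT');
var_dump($env); // bool(false) — not "production"

// An explicit false check, or `?:` instead of `??`, falls back as intended.
$env = getenv('ENVIRONMENT') ?: getenv('CRAFT_ENVIRONMENT');
var_dump($env); // string(10) "production"
```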

Why isn't that condition based on the devMode or disallowRobots Craft config settings? That way you wouldn't have to include specific environment variables that may differ from one dev to another.

jesuismaxime avatar Jun 14 '22 13:06 jesuismaxime

> If I include both (which is ridiculous), as above, it now works.

You just saved my life. Adding ENVIRONMENT=production on top of CRAFT_ENVIRONMENT=production fixed it for me.

pascalminator avatar Jul 13 '22 14:07 pascalminator

@pascalminator you're welcome! Still, I would like a follow up from the creators 😆

jesuismaxime avatar Jul 13 '22 14:07 jesuismaxime

I'm still having this issue: the pages are not being found. Pretty big issue! These are my configs:

.env

ENVIRONMENT=production
CRAFT_ENVIRONMENT=production

config/general.php
Screenshot 2022-09-07 at 09 20 52

Robots settings inside SEO plugin: Screenshot 2022-09-07 at 09 22 09

Running all of this on Craft CMS v4.2.3 and ether/seo v4.0.3.

When surfing to my domain.com/robots.txt I still get this:

User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env

Screenshot 2022-09-07 at 09 16 36

SkermBE avatar Sep 07 '22 07:09 SkermBE

@SkermBE

The URL next to "Referring Page" in your page indexing screenshot is indeed blocking Googlebot:

User-agent: Googlebot
Disallow: /?*

User-agent: Baiduspider
Disallow: /?*

User-agent: YandexBot
Disallow: /?*

User-agent: ichiro
Disallow:  /?*

User-agent: sogou spider
Disallow:  /?*

User-agent: Sosospider
Disallow: /?*

User-agent: YoudaoBot
Disallow: /?*

User-agent: YetiBot
Disallow: /?*

User-agent: bingbot
Crawl-delay: 2
Disallow: /?*

User-Agent: Yahoo! Slurp 
Crawl-delay: 2
Disallow: /?*

User-agent: rdfbot
Disallow: /?*

User-agent: Seznambot 
Request-rate: 1/2s
Disallow: /?*

User-agent: ia_archiver
Disallow: 

User-agent: Mediapartners-Google
Disallow: 

Is this the correct domain?

When surfing to my domain.com/robots.txt I still get this: User-agent: * Disallow: /cpresources/ Disallow: /vendor/ Disallow: /.env

The SEO settings screenshot shows this is correct, as you're in production mode.

jamiematrix avatar Sep 07 '22 10:09 jamiematrix

@jamiematrix

I know nothing about that referring page. When I surf to my own domain (which is not 4rank.bid or some weird thing) I get the robots.txt as mentioned:

User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env

But Google Search Console still says it's being blocked. When doing a live test now, this is the result: Screenshot 2022-09-07 at 14 39 31

SkermBE avatar Sep 07 '22 12:09 SkermBE

Seems to be fixed with: https://github.com/ethercreative/seo/issues/432

Arno-Ramon avatar Sep 26 '22 09:09 Arno-Ramon

Just got mega burned by this. Thanks to those who found and submitted PRs.

BigglesZX avatar Oct 04 '22 15:10 BigglesZX