puppet-mongodb icon indicating copy to clipboard operation
puppet-mongodb copied to clipboard

Mongos 4x 'errmsg: not master and slaveOk=false'

Open jeff1evesque opened this issue 7 years ago • 2 comments

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 5.5.8
  • Ruby: 2.0.0
  • Distribution: Centos 7
  • Module version: v2.4.1

How to reproduce (e.g Puppet code you use)

The following site.pp can be reproduced using a Vagrantfile:

## local variables
$mongo_repo = 'https://repo.mongodb.org/yum/redhat/7/mongodb-org/4.0/x86_64/'

## persistent puppetagent
service { 'puppet':
    ensure  => running,
    enable  => true,
}

##
## configsrv members
##
## Note: Starting in MongoDB 3.2, the WiredTiger storage engine
##       is the default storage engine.
##
##     - https://docs.mongodb.com/v4.0/core/wiredtiger/
##
node /^mongocfg(1|2|3)$/ {
  file {'/data':
    ensure => directory,
    owner  => root,
    group  => root,
    mode   => '0755',
  }
  -> class {'mongodb::globals':
    manage_package_repo => true,
    repo_location       => $mongo_repo,
  }
  -> class {'mongodb::server':
    replset        => 'cfg1',
    bind_ip        => ['0.0.0.0'],
    dbpath         => '/data/db',
    storage_engine => 'wiredTiger',
  }
  -> class {'mongodb::client': }
}

##
## configure mongos
##
## @configdb, connection string '<config replset name>/<host1:port>,<host2:port>,[...]'
##
node 'mongos' {
  file {'/data':
    ensure => directory,
    owner  => root,
    group  => root,
    mode   => '0755',
  }

  class {'mongodb::globals':
    manage_package_repo => true,
    repo_location       => $mongo_repo,
  }
  -> class {'mongodb::client': }
  -> class {'mongodb::mongos':
    configdb => 'cfg1/192.168.0.11:27019, 192.168.0.12:27019, 192.168.0.13:27019',
  }
  -> mongodb_shard { 'rs1':
    member => 'rs1/192.168.0.14:27018',
    keys   => [{
      'rs1' => {
        'name' => 1,
      }
    }],
  }
}

##
## shard1 members
##
## Note: Starting in MongoDB 3.2, the WiredTiger storage engine
##       is the default storage engine.
##
##     - https://docs.mongodb.com/v4.0/core/wiredtiger/
##
node /^mongod(1|2|3)$/ {
  file {'/data':
    ensure => directory,
    owner  => root,
    group  => root,
    mode   => '0755',
  }
  -> class {'mongodb::globals':
    manage_package_repo => true,
    repo_location       => $mongo_repo,
  }
  -> class {'mongodb::server':
    shardsvr       => true,
    replset        => 'rs1',
    bind_ip        => ['0.0.0.0'],
    storage_engine => 'wiredTiger',
  }
  -> class {'mongodb::client': }
  -> mongodb_replset{'rs1':
    members => [
        '192.168.0.14:27018',
        '192.168.0.15:27018',
        '192.168.0.16:27018',
    ],
  }
}

What are you seeing

The following error on the mongos instance:

[root@mongos vagrant]# /opt/puppetlabs/bin/puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for mongos
Info: Applying configuration version '1543453871'
Notice: /Stage[main]/Mongodb::Mongos::Service/Service[mongos]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Mongodb::Mongos::Service/Service[mongos]: Unscheduling refresh on Service[mongos]
Error: Failed to apply catalog: Execution of '/bin/mongo --quiet --eval printjson(sh.status())' returned 252: 2018-11-29T01:11:16.441+0000 E QUERY    [js] Error: error: {
        "operationTime" : Timestamp(0, 0),
        "ok" : 0,
        "errmsg" : "not master and slaveOk=false",
        "code" : 13435,
        "codeName" : "NotMasterNoSlaveOk",
        "$clusterTime" : {
                "clusterTime" : Timestamp(0, 0),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
DBCommandCursor@src/mongo/shell/query.js:708:1
DBQuery.prototype._exec@src/mongo/shell/query.js:113:28
DBQuery.prototype.hasNext@src/mongo/shell/query.js:288:5
DBCollection.prototype.findOne@src/mongo/shell/collection.js:260:10
printShardingStatus@src/mongo/shell/utils_sh.js:521:19
sh.status@src/mongo/shell/utils_sh.js:89:5
@(shell eval):1:11

Each mongod outputs the following:

[root@mongod1 vagrant]# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for mongod1
Info: Applying configuration version '1543453835'
Notice: /Stage[main]/Main/Node[__node_regexp__mongod123]/Mongodb_replset[rs1]/ensure: created
Error: /Stage[main]/Main/Node[__node_regexp__mongod123]/Mongodb_replset[rs1]: Could not evaluate: Can't connect to any member of replicaset rs1.
Notice: Applied catalog in 13.41 seconds

Each of the mongocfg outputs the following:

[root@mongocfg2 vagrant]# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for mongocfg2
Info: Applying configuration version '1543455233'
Notice: /Stage[main]/Mongodb::Server::Config/File[/etc/mongod.conf]/content:
--- /etc/mongod.conf    2018-11-28 23:05:18.037105359 +0000
+++ /tmp/puppet-file20181129-8644-1q494i4       2018-11-29 01:33:58.753453267 +0000
@@ -17,8 +17,6 @@
 journal = true
 # Turn on/off security.  Off is currently the default
 noauth=true
-# Is the mongod instance a shard server
-shardsvr = true
 # Configure ReplicaSet membership
 replSet = cfg1


Info: Computing checksum on file /etc/mongod.conf
Info: /Stage[main]/Mongodb::Server::Config/File[/etc/mongod.conf]: Filebucketed /etc/mongod.conf to puppet with sum 628427327cc0b76ee01640b02382a7b5
Notice: /Stage[main]/Mongodb::Server::Config/File[/etc/mongod.conf]/content: content changed '{md5}628427327cc0b76ee01640b02382a7b5' to '{md5}94b58fdf8723363500dd3bcdec9f25ae'
Info: Class[Mongodb::Server::Config]: Scheduling refresh of Class[Mongodb::Server::Service]
Info: Class[Mongodb::Server::Service]: Scheduling refresh of Service[mongodb]
Notice: /Stage[main]/Mongodb::Server::Service/Service[mongodb]: Triggered 'refresh' from 1 event
Notice: Applied catalog in 15.57 seconds

What behaviour did you expect instead

My hopes are to have a replicated shard of mongodb v4x.

Output log

mongodb-shard.log

[root@mongos vagrant]# ls -l /var/log/mongodb
total 68
-rw-r--r--. 1 root   root   28758 Nov 29 01:38 mongodb-shard.log
-rw-r-----. 1 mongod mongod 27005 Nov 29 01:45 mongod.log.rpmsave
[root@mongos vagrant]# cat /var/log/mongodb/mongodb-shard.log
2018-11-29T00:52:18.532+0000 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2018-11-29T00:52:18.544+0000 F -        [main] Failed global initialization: FailedToParse: More than one ':' detected. If this is an ipv6 address, it needs to be surrounded by '[' and ']'; ${::ipaddress}:27019
2018-11-29T00:52:18.551+0000 I CONTROL  [main] shutting down with code:14
2018-11-29T00:53:32.729+0000 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2018-11-29T00:53:32.737+0000 F -        [main] Failed global initialization: FailedToParse: More than one ':' detected. If this is an ipv6 address, it needs to be surrounded by '[' and ']'; ${::ipaddress}:27019
2018-11-29T00:53:32.745+0000 I CONTROL  [main] shutting down with code:14
2018-11-29T00:55:43.804+0000 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2018-11-29T00:55:43.910+0000 I NETWORK  [main] getaddrinfo(" 192.168.0.12") failed: Name or service not known
2018-11-29T00:55:45.946+0000 I NETWORK  [main] getaddrinfo(" 192.168.0.13") failed: Name or service not known
2018-11-29T00:55:45.953+0000 I CONTROL  [main]
2018-11-29T00:55:45.953+0000 I CONTROL  [main] ** WARNING: Access control is not enabled for the database.
2018-11-29T00:55:45.953+0000 I CONTROL  [main] **          Read and write access to data and configuration is unrestricted.
2018-11-29T00:55:45.953+0000 I CONTROL  [main] ** WARNING: You are running this process as the root user, which is not recommended.
2018-11-29T00:55:45.953+0000 I CONTROL  [main]
2018-11-29T00:55:45.953+0000 I CONTROL  [main] ** WARNING: This server is bound to localhost.
2018-11-29T00:55:45.953+0000 I CONTROL  [main] **          Remote systems will be unable to connect to this server.
2018-11-29T00:55:45.953+0000 I CONTROL  [main] **          Start the server with --bind_ip <address> to specify which IP
2018-11-29T00:55:45.953+0000 I CONTROL  [main] **          addresses it should serve responses from, or with --bind_ip_all to
2018-11-29T00:55:45.953+0000 I CONTROL  [main] **          bind to all interfaces. If this behavior is desired, start the
2018-11-29T00:55:45.953+0000 I CONTROL  [main] **          server with --bind_ip 127.0.0.1 to disable this warning.
2018-11-29T00:55:45.953+0000 I CONTROL  [main]
2018-11-29T00:55:45.988+0000 I SHARDING [mongosMain] mongos version v4.0.4
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] git version: f288a3bdf201007f3693c58e140056adf8b04839
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] allocator: tcmalloc
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] modules: none
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] build environment:
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain]     distmod: rhel70
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain]     distarch: x86_64
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain]     target_arch: x86_64
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] db version v4.0.4
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] git version: f288a3bdf201007f3693c58e140056adf8b04839
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] allocator: tcmalloc
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] modules: none
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] build environment:
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain]     distmod: rhel70
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain]     distarch: x86_64
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain]     target_arch: x86_64
2018-11-29T00:55:45.989+0000 I CONTROL  [mongosMain] options: { config: "/etc/mongodb-shard.conf", processManagement: { pidFilePath: "/var/run/mongodb/mongos.pid" }, sharding: { configDB: "cfg1/192.168.0.11:27019, 192.168.0.12:27019, 192.168.0.13:27019" }, systemLog: { quiet: true } }
2018-11-29T00:55:45.993+0000 E SHARDING [mongosMain] Failed to set up listener: SocketException: Address already in use
2018-11-29T00:55:45.993+0000 I CONTROL  [mongosMain] shutting down with code:48

Any additional information you'd like to impart

The associated git issue I'm working on can be reviewed.

jeff1evesque avatar Nov 29 '18 01:11 jeff1evesque

I changed my site.pp to the following:

[...EARLIER-CODE-UNCHANGED...]
##
## configure mongos
##
## @configdb, connection string '<config replset name>/<host1:port>,<host2:port>,[...]'
##
node 'mongos' {
  class {'mongodb::globals':
    manage_package_repo => true,
    repo_location       => $mongo_repo,
  }
  -> class {'mongodb::client': }
  -> class {'mongodb::mongos':
    configdb => 'cfg1/192.168.0.11:27019,192.168.0.12:27019,192.168.0.13:27019',
  }
  -> mongodb_shard { 'rs1':
    member => 'rs1/192.168.0.14:27018',
    keys   => [{
      'rs1' => {
        'name' => 1,
      }
    }],
  }
}
[...LATER-CODE-UNCHANGED...]

When I rerun the puppet agent on the mongos, I get the following error:

agent trace:

[root@mongos vagrant]# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for mongos
Info: Applying configuration version '1543465624'
Notice: /Stage[main]/Mongodb::Mongos::Service/Service[mongos]/ensure: ensure changed 'stopped' to '                                                                                running'
Info: /Stage[main]/Mongodb::Mongos::Service/Service[mongos]: Unscheduling refresh on Service[mongos                                                                                ]
Error: Failed to apply catalog: Execution of '/bin/mongo --quiet --eval printjson(sh.status())' ret                                                                                urned 252: 2018-11-29T04:27:07.402+0000 E QUERY    [js] Error: error: {
        "operationTime" : Timestamp(0, 0),
        "ok" : 0,
        "errmsg" : "not master and slaveOk=false",
        "code" : 13435,
        "codeName" : "NotMasterNoSlaveOk",
        "$clusterTime" : {
                "clusterTime" : Timestamp(0, 0),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
DBCommandCursor@src/mongo/shell/query.js:708:1
DBQuery.prototype._exec@src/mongo/shell/query.js:113:28
DBQuery.prototype.hasNext@src/mongo/shell/query.js:288:5
DBCollection.prototype.findOne@src/mongo/shell/collection.js:260:10
printShardingStatus@src/mongo/shell/utils_sh.js:521:19
sh.status@src/mongo/shell/utils_sh.js:89:5
@(shell eval):1:11

shard log:

[root@mongos vagrant]# cat /var/log/mongodb/mongodb-shard.log
2018-11-29T04:27:06.234+0000 I CONTROL  [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2018-11-29T04:27:06.243+0000 I CONTROL  [main]
2018-11-29T04:27:06.243+0000 I CONTROL  [main] ** WARNING: Access control is not enabled for the database.
2018-11-29T04:27:06.243+0000 I CONTROL  [main] **          Read and write access to data and configuration is unrestricted.
2018-11-29T04:27:06.243+0000 I CONTROL  [main] ** WARNING: You are running this process as the root user, which is not recommended.
2018-11-29T04:27:06.243+0000 I CONTROL  [main]
2018-11-29T04:27:06.243+0000 I CONTROL  [main] ** WARNING: This server is bound to localhost.
2018-11-29T04:27:06.243+0000 I CONTROL  [main] **          Remote systems will be unable to connect to this server.
2018-11-29T04:27:06.243+0000 I CONTROL  [main] **          Start the server with --bind_ip <address> to specify which IP
2018-11-29T04:27:06.243+0000 I CONTROL  [main] **          addresses it should serve responses from, or with --bind_ip_all to
2018-11-29T04:27:06.243+0000 I CONTROL  [main] **          bind to all interfaces. If this behavior is desired, start the
2018-11-29T04:27:06.243+0000 I CONTROL  [main] **          server with --bind_ip 127.0.0.1 to disable this warning.
2018-11-29T04:27:06.243+0000 I CONTROL  [main]
2018-11-29T04:27:06.258+0000 I SHARDING [mongosMain] mongos version v4.0.4
2018-11-29T04:27:06.258+0000 I CONTROL  [mongosMain] git version: f288a3bdf201007f3693c58e140056adf8b04839
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] allocator: tcmalloc
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] modules: none
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] build environment:
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain]     distmod: rhel70
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain]     distarch: x86_64
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain]     target_arch: x86_64
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] db version v4.0.4
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] git version: f288a3bdf201007f3693c58e140056adf8b04839
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] allocator: tcmalloc
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] modules: none
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] build environment:
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain]     distmod: rhel70
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain]     distarch: x86_64
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain]     target_arch: x86_64
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] options: { config: "/etc/mongodb-shard.conf", processManagement: { pidFilePath: "/var/run/mongodb/mongos.pid" }, sharding: { configDB: "cfg1/192.168.0.11:27019,192.168.0.12:27019,192.168.0.13:27019" }, systemLog: { quiet: true } }
2018-11-29T04:27:06.259+0000 E SHARDING [mongosMain] Failed to set up listener: SocketException: Address already in use
2018-11-29T04:27:06.259+0000 I CONTROL  [mongosMain] shutting down with code:48

jeff1evesque avatar Nov 29 '18 04:11 jeff1evesque

I decided to kill the current mongos pid, then rerun the puppet agent:

[root@mongos vagrant]# netstat -ntlup
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      783/master
tcp        0      0 0.0.0.0:27017           0.0.0.0:*               LISTEN      17666/mongod
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      321/rpcbind
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      664/sshd
tcp6       0      0 ::1:25                  :::*                    LISTEN      783/master
tcp6       0      0 :::111                  :::*                    LISTEN      321/rpcbind
tcp6       0      0 :::22                   :::*                    LISTEN      664/sshd
udp        0      0 127.0.0.1:323           0.0.0.0:*                           392/chronyd
udp        0      0 0.0.0.0:68              0.0.0.0:*                           15286/dhclient
udp        0      0 0.0.0.0:111             0.0.0.0:*                           321/rpcbind
udp        0      0 0.0.0.0:904             0.0.0.0:*                           321/rpcbind
udp6       0      0 ::1:323                 :::*                                392/chronyd
udp6       0      0 :::111                  :::*                                321/rpcbind
udp6       0      0 :::904                  :::*                                321/rpcbind
[root@mongos vagrant]# kill 17666
[root@mongos vagrant]#
[root@mongos vagrant]#
[root@mongos vagrant]#
[root@mongos vagrant]# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for mongos
Info: Applying configuration version '1543466578'
Notice: /Stage[main]/Mongodb::Mongos::Service/Service[mongos]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Mongodb::Mongos::Service/Service[mongos]: Unscheduling refresh on Service[mongos]
Notice: Failed to connect to mongodb within timeout window of 240 seconds; giving up.
Error: Unable to connect to mongodb server! (127.0.0.1:27017)
Error: /Stage[main]/Mongodb::Mongos::Service/Mongodb_conn_validator[mongos]/ensure: change from 'absent' to 'present' failed: Unable to connect to mongodb server! (127.0.0.1:27017)
Info: Class[Mongodb::Mongos::Service]: Unscheduling all events on Class[Mongodb::Mongos::Service]
Error: Failed to apply catalog: Execution of '/bin/mongo --quiet --eval printjson(sh.status())' returned 1: 2018-11-29T04:47:04.864+0000 E QUERY    [js] Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed: SocketException: Error connecting to 127.0.0.1:27017 :: caused by :: Connection refused :
connect@src/mongo/shell/mongo.js:257:13
@(connect):1:21
exception: connect failed
Error: Could not send report: SSL_connect returned=6 errno=0 state=unknown state

jeff1evesque avatar Nov 29 '18 04:11 jeff1evesque