nginx-gridfs

with replsets "Mongo Exception: Cannot connect to primary node"

Open mugenken opened this issue 13 years ago • 32 comments

I get this error message when I use replica sets. I am on Gentoo with nginx 0.8.53 and the latest GitHub version of nginx-gridfs. All MongoDB nodes are accessible from this server. The database name is myapp, and the replica set is also named myapp.

Nginx location config without replsets works:

location /gridfs/ {
    gridfs myapp field=filename type=string;
    mongo 192.168.8.20:27017;
}

With replsets it does not work:

location /gridfs/ {
    gridfs myapp field=filename type=string;
    mongo "myapp" 192.168.8.20:27017 192.168.8.30:27017 192.168.8.60:27017;
}

Am I doing something wrong, or is there information I could provide to help?

mugenken • May 24 '11 12:05

I am having the same problem. I am using the latest version of nginx and nginx-gridfs on Ubuntu 10.10.

Here is my log.

2011/05/25 09:13:15 [error] 14696#0: Mongo Exception: Cannot connect to primary node.
2011/05/25 09:13:15 [error] 14697#0: Mongo Exception: Cannot connect to primary node.
2011/05/25 09:13:16 [alert] 14695#0: worker process 14696 exited with fatal code 2 and can not be respawn
2011/05/25 09:13:16 [alert] 14695#0: worker process 14697 exited with fatal code 2 and can not be respawn
2011/05/25 09:13:16 [alert] 14695#0: waitpid() failed (10: No child processes)

mattatcha • May 25 '11 14:05

I tried to figure out what's going on. It seems the driver is called with the correct data but returns mongo_conn_cannot_find_primary before actually trying to connect to any node. At least, I can't see any packets going from the nginx server to the mongo servers. This is not a connectivity issue between the servers, because local mongo servers get no communication from the driver either.

mugenken • May 27 '11 15:05

I am also having the same problem, with nginx-1.0.3 and the latest version of nginx-gridfs on Debian Squeeze.

8bitbrad • Jun 10 '11 00:06

Same issue here: nginx 1.0.3, mongo 1.6.

elmacnifico • Jun 13 '11 16:06

Same issue here, too.

Can anyone figure out how to fix this?

LNGi • Jul 18 '11 03:07

+1 (nginx 1.0.5, mongo 1.8.2)

fwoeck • Aug 19 '11 08:08

I also had a similar issue. I was getting an error message like:

2011/11/05 01:32:20 [error] 89646#0: Mongo Exception: Replica set name myapp does not match.
2011/11/05 01:32:20 [alert] 89416#0: worker process 89646 exited with fatal code 2 and can not be respawn

I tried changing the replica set name to something else, and after a while it seemed to work. I couldn't figure out the reason.

I was using nginx-gridfs v0.8, nginx v1.0.0, mongo v2.0.1.

cheeming • Nov 05 '11 10:11

I just merged in abbat's commit - can anybody confirm whether or not that helps with this issue?

mdirolf • Nov 18 '11 15:11

Hi, with the new driver, I still get an exception when I try to connect to a replica set with

mongo "lsng" 127.0.0.1:27017 46.137.178.90:27017;

=> nginx error log:

2011/12/17 10:14:46 [error] 18964#0: Mongo Exception: Unknown Error
2011/12/17 10:14:46 [alert] 18963#0: worker process 18964 exited with fatal code 2 and cannot be respawned

However, connecting to the same set with

mongo "lsng" 127.0.0.1:27017;

works. --Frank

fwoeck • Dec 17 '11 10:12

Hmm, I just noticed that

mongo "lsng" 46.137.178.90:27017;

doesn't work either (46.137.178.90 being a secondary). Is it possible that I have to specify something like

rs.slaveOk();

in the replica set configuration? --Frank

fwoeck • Dec 17 '11 10:12

I did manage to make this module work on a Debian squeeze replica set with MongoDB 2.0.2 (10gen package), Nginx 0.7.67 and the Mongo C driver v0.4.

It seems the module had issues with the replica set name (as mentioned above) and with connection error codes. I fixed this in my fork. See https://github.com/viotti/nginx-gridfs/commit/93d02cce72b1fb4f254b8c7c4456ec06aea8d230.

I also had to reconfigure both MongoDB and nginx-gridfs to use dot-decimal notation, as suggested in the Mongo C driver documentation (http://api.mongodb.org/c/current/connections.html#basic-connections). With DNS host names it did not work.

Nginx config:

location /uploads/ {
    gridfs mydb root_collection=storage.uploads field=filename type=string;
    mongo "myreplica" 10.0.0.1:27017 10.0.0.2:27017;
}

MongoDB config:

MongoDB shell version: 2.0.2
connecting to: test
SECONDARY> rs.conf()
{
    "_id" : "myreplica",
    "version" : 3,
    "members" : [
        {
            "_id" : 0,
            "host" : "10.0.0.1:27017"
        },
        {
            "_id" : 1,
            "host" : "10.0.0.2:27017"
        },
        {
            "_id" : 2,
            "host" : "10.0.0.3:27017",
            "arbiterOnly" : true
        }
    ]
}

viotti • Feb 01 '12 18:02

I think this problem is caused by both the compile options of the mongo C driver and a bug in the driver itself:

  • The mongo C-Driver needs the compile option "-D_GNU_SOURCE=1" to enable getaddrinfo to translate a hostname to an IP address.
  • The mongo C-Driver fails to fail-over because of a bug.
    • ref: https://github.com/kiwanami/mongo-c-driver/commit/6ab5cdc36c448ede34c1cb0f0939796c8c117dd3
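
As a rough illustration of the first point, here is a minimal Python sketch (not the driver's actual C code). It assumes that a build without getaddrinfo support only understands IPv4 literals, which is consistent with the dot-decimal workaround reported earlier in this thread. The function names numeric_only and resolve are made up for this example.

```python
import ipaddress
import socket


def numeric_only(host):
    """Accept only an IPv4 literal, as a build without name resolution would (assumption)."""
    try:
        return str(ipaddress.IPv4Address(host))
    except ValueError:
        # A DNS name such as "ls03.livesein.de" cannot be used without a resolver.
        return None


def resolve(host, port):
    """Resolve a host name or an IPv4 literal through getaddrinfo."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})
```

With this split, numeric_only("10.0.0.1") hands the address back unchanged, while numeric_only("ls03.livesein.de") gives None. Only the getaddrinfo path in resolve can turn a host name into an address.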

With my forked sources, this problem is fixed on Linux, and I confirmed failover with a replica set. I sent pull requests; however, my quick fixes may be difficult to merge because the patches are not very clean...

kiwanami • Feb 02 '12 02:02

That explains the "Cannot connect to primary node" messages I got when I listed only the secondary node in the Nginx configuration file. The expected behavior was for the driver to find all the seeds and automatically connect to the primary.

I think the merge of our forks will resolve those replica set issues.

viotti • Feb 02 '12 18:02

Yes. I think so too. ++

kiwanami • Feb 03 '12 02:02

I found out that I couldn't connect to a single node/seed because the Nginx module was using the number of hosts to decide whether to connect directly or via a replica set. One node -> direct connection. More than one -> replica set connection. This is just plain wrong, since it's perfectly fine to connect to a single seed in a replica set setup (http://api.mongodb.org/c/current/connections.html#replica-set-connections).

I changed the code to check whether a replica set name was specified in the mongo configuration directive, instead of relying on the number of hosts provided.

https://github.com/viotti/nginx-gridfs/commit/c9b6fcfda4359f1c234377177048730d5f82e4fd
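
The difference between the two checks can be sketched as follows. This is a Python illustration of the decision only, not the module's C code. The function names and the "direct" / "replica_set" labels are hypothetical.

```python
def connection_mode_by_count(hosts):
    """Old behaviour described above: the number of hosts decides the mode."""
    return "direct" if len(hosts) == 1 else "replica_set"


def connection_mode_by_name(replset_name, hosts):
    """Changed behaviour: a replica set name in the mongo directive decides the mode."""
    # The host list is accepted but no longer influences the decision.
    return "replica_set" if replset_name else "direct"
```

For a directive such as mongo "lsng" 127.0.0.1:27017; the count-based check picks a direct connection, while the name-based check picks a replica set connection with one seed.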

viotti • Feb 03 '12 19:02

viotti, I tried your fork at c9b6fcfda4359f1c234377177048730d5f82e4fd with ngx_resty 1.0.10.48 and mongo c driver v0.4, and I still can't connect, whether I specify a single seed node, the full set (in any order), or just the primary:

  # mongo "loko0" 10.111.47.8:27017; # secondary only
  mongo "loko0" 10.40.129.174:27017; # primary only
  # mongo "loko0" 10.40.129.174:27017 10.111.47.8:27017; # primary first
  # mongo "loko0" 10.111.47.8:27017 10.40.129.174:27017;  # secondary first

They all log the error "Mongo Exception: Cannot connect to primary node". tcpdump shows traffic flowing to both mongod instances on startup.

davidbirdsong • Feb 26 '12 03:02

Are you using dot decimal notation in your MongoDB replica set configuration?

viotti • Feb 27 '12 13:02

As in the name attribute of the rs conf?

loko0:PRIMARY> rs.status().members[0].name
loko-2:27017
loko0:PRIMARY> rs.status().members[1].name
loko-3:27017

Then, no. If I change them to dotted IPs, I guess it will work?

davidbirdsong • Feb 27 '12 20:02

Yes, I guess things will work if you change the rs conf to dotted IPs (the host attribute of rs.conf()).

http://www.mongodb.org/display/DOCS/Moving+or+Replacing+a+Member

Using host names requires special compilation of the Mongo C driver, as reported by kiwanami.

viotti • Feb 27 '12 22:02

Hi guys, thanks for your work! Alas, I can't get it working with host names. I compiled against the kiwanami / mongo-c-driver master. As soon as I use more than one host, nginx crashes. If I wanted to use IP addresses instead, how could this be done if the nodes are NATed? Do you have any hints? Thanks, Frank

fwoeck • Mar 04 '12 14:03

Hi. Could you post the error messages from error.log and your environment details (OS, distribution, nginx and MongoDB versions, etc.)? Also, you may need to run 'git checkout master' in the submodule directory 'mongo-c-driver'. Here is the correct procedure for checking out the submodule:

git submodule init
git submodule update
cd mongo-c-driver
git checkout master

kiwanami • Mar 05 '12 01:03

@kiwanami - sure: I'm using nginx/1.0.12, the master commits from you and viotti, RHEL6/64bit, kernel 2.6.32-220.4.1.el6.x86_64, MongoDB 2.0.3

/etc/hosts

127.0.0.1         localhost localhost.localdomain ls03.livesein.de
46.137.178.90  ls04.livesein.de

this is the nginx conf snippet:

    location /gridfs/ {
      internal;
      gridfs livesein_ng_production field=filename type=string;
      mongo 127.0.0.1;
      # mongo "lsng" 127.0.0.1:27017 46.137.178.90:27017;
      # mongo "lsng" ls03.livesein.de:27017 ls04.livesein.de:27017;
      add_header Cache-Control public;
      add_header ETag "";
      expires max;
    }

and the replset conf:

{
    "_id" : "lsng",
    "version" : 8,
    "members" : [
        {
            "_id" : 1,
            "host" : "ls03.livesein.de:27017",
            "priority" : 3
        },
        {
            "_id" : 2,
            "host" : "ls03.livesein.de:27018",
            "arbiterOnly" : true
        },
        {
            "_id" : 3,
            "host" : "ls04.livesein.de:27017",
            "priority" : 2
        }
    ]
}

Using "mongo 127.0.0.1;" works fine.

I get the following errors:

  • with "mongo ls03.livesein.de:27017;"
2012/03/05 05:30:36 [alert] 3592#0: worker process 3593 exited with fatal code 2 and cannot be respawned
2012/03/05 05:31:57 [error] 3797#0: Mongo Exception: Connection Failure.
  • with "mongo "lsng" 127.0.0.1:27017 46.137.178.90:27017;" or "mongo "lsng" ls03.livesein.de:27017 ls04.livesein.de:27017;":
2012/03/05 05:31:57 [alert] 3796#0: worker process 3797 exited with fatal code 2 and cannot be respawned
2012/03/05 05:34:03 [error] 3991#0: Mongo Exception: Cannot connect to primary node.

--Frank

fwoeck • Mar 05 '12 05:03

Judging from your log, your mongo-c-driver may be outdated. The latest driver does not kill the process, while the original driver kills the worker process with the signal SIGPIPE. Could you recompile nginx and nginx-gridfs after running 'git checkout master'? You can then compare the output of 'git log' with the commit log at https://github.com/kiwanami/mongo-c-driver/commits/master.

kiwanami • Mar 06 '12 05:03

Hi @kiwanami, I used the latest master of https://github.com/mongodb/mongo-c-driver/commits/ as of today, which should contain your pull request, right? Alas, the errors are the same. --Frank

fwoeck • Mar 06 '12 06:03

Hi @fwoeck. Using my forked nginx-gridfs branch and the original C driver branch https://github.com/mongodb/mongo-c-driver/, I confirmed that the latest branch works fine. Which nginx-gridfs branch are you using?

kiwanami • Mar 06 '12 07:03

this commit: https://github.com/viotti/nginx-gridfs/commit/c9b6fcfda4359f1c234377177048730d5f82e4fd

fwoeck • Mar 06 '12 07:03

Could you apply this patch? It enables the C driver to use getaddrinfo to resolve host names. https://github.com/kiwanami/nginx-gridfs/commit/b9bc845456d560bdfba7b9fc8b07c01e0c2c5764

kiwanami • Mar 06 '12 07:03

Yay, it works with mongo "lsng" ls03.livesein.de:27017 ls04.livesein.de:27017; I love you! Thanks a lot - I'm sorry if I missed something. --Frank

fwoeck • Mar 06 '12 08:03

I'm glad to hear it works fine!

kiwanami • Mar 06 '12 08:03

I can confirm that a replica set with host names that match the replica set config works now, thanks! I used: https://github.com/kiwanami/nginx-gridfs/commit/b9bc845456d560bdfba7b9fc8b07c01e0c2c5764

Will nginx always prefer the first node in the list and only move down the list if there's a failure?

davidbirdsong • Jun 16 '12 01:06

The nodes in the list are just seeds. The Mongo driver will use those seeds to discover the replica set primary node, and connect to it.
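
A simplified Python sketch of that discovery step, not the driver's real code. Here ask_is_master is a hypothetical callback standing in for sending the isMaster command to a host. It is assumed to return a dict with the ismaster, setName and hosts fields, or None when the host cannot be reached.

```python
def find_primary(seeds, set_name, ask_is_master):
    """Walk from the seeds to the primary; return its address or None."""
    pending = list(seeds)
    seen = set()
    while pending:
        host = pending.pop(0)
        if host in seen:
            continue
        seen.add(host)
        reply = ask_is_master(host)
        if reply is None:
            continue  # unreachable seed, try the next candidate
        if reply.get("setName") != set_name:
            raise ValueError("Replica set name %s does not match." % set_name)
        if reply.get("ismaster"):
            return host
        # A secondary reports the other members; queue them as candidates.
        pending.extend(reply.get("hosts", []))
    return None
```

In this sketch a single secondary seed is enough, because its reply lists the other members. Those members appear under the names stored in the replica set config, so the sketch only reaches the primary if ask_is_master can contact it under that name.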

viotti • Jun 18 '12 17:06

Do I understand correctly that the seeds in nginx.conf need to match the names in the replica set config? For instance, if the set config contained:

node1:27017 node2:27017

but nginx was running locally on one of the set members, could I specify the following?

mongo "shard0" 127.0.0.1:27017;

davidbirdsong • Jun 19 '12 19:06