Failover does not work properly when primary and secondary are switched

Open rakusai opened this issue 11 years ago • 3 comments

Option : read: master

When the master and secondary have been switched (not down), Moped throws the error for 5 minutes (the default refresh time) and failover does not work properly.

The error thrown is Moped::Errors::QueryFailure: not master

This situation happens when:

  1. Each replica set node has a priority.
  2. The master goes down, so a secondary becomes the new primary.
  3. The old master comes back up and becomes master again (MongoDB switches master and secondary automatically).

I think we can fix this by also rescuing Errors::QueryFailure in with_retry (in read_preference/selectable.rb) and retrying when reconfiguring_replica_set? is true:

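      def with_retry(cluster, retries = cluster.max_retries, &block)
        begin
          return block.call
        rescue Errors::ConnectionFailure => e
          # Connection lost: retry while attempts remain.
          raise e unless retries > 0
        rescue Errors::QueryFailure => e
          # Query failed: retry only if the replica set is being reconfigured (e.g. "not master").
          raise e unless retries > 0 && e.reconfiguring_replica_set?
        end

        # Reached only after a rescued, retryable error.
        Loggable.warn("  MOPED:", "Retrying connection attempt #{retries} more time(s).", "n/a")
        sleep(cluster.retry_interval)
        # Refresh the cluster so the new primary is discovered before retrying.
        cluster.refresh
        with_retry(cluster, retries - 1, &block)
      end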
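
To show the intended behavior without a real replica set, here is a minimal, self-contained sketch of the retry logic above. `FakeCluster`, `primary_known?`, and the stand-in `Errors` classes are hypothetical and exist only for this illustration; in particular, the `reconfiguring_replica_set?` shown here simply matches the "not master" message, which is an assumption and not necessarily how Moped implements it.

```ruby
# Hypothetical stand-ins for Moped's error classes, so the sketch runs on its own.
module Errors
  class ConnectionFailure < StandardError; end

  class QueryFailure < StandardError
    # Assumption: any "not master" message counts as a reconfiguration.
    def reconfiguring_replica_set?
      message.include?("not master")
    end
  end
end

# Hypothetical cluster that only learns about the new primary after a refresh.
class FakeCluster
  attr_reader :refreshes

  def initialize
    @refreshes = 0
  end

  def max_retries
    3
  end

  def retry_interval
    0 # no real waiting in this sketch
  end

  def refresh
    @refreshes += 1
  end

  def primary_known?
    @refreshes > 0
  end
end

# Same control flow as the proposal above, minus the logging call.
def with_retry(cluster, retries = cluster.max_retries, &block)
  begin
    return block.call
  rescue Errors::ConnectionFailure => e
    raise e unless retries > 0
  rescue Errors::QueryFailure => e
    raise e unless retries > 0 && e.reconfiguring_replica_set?
  end

  sleep(cluster.retry_interval)
  cluster.refresh
  with_retry(cluster, retries - 1, &block)
end

cluster = FakeCluster.new
attempts = 0

# The first attempt hits the stale primary; the retry succeeds after one refresh.
result = with_retry(cluster) do
  attempts += 1
  raise Errors::QueryFailure, "not master" unless cluster.primary_known?
  :ok
end
```

With this shape, a "not master" failure triggers a cluster refresh and a retry right away instead of waiting for the periodic refresh, while any other QueryFailure is still re-raised immediately.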

rakusai, Mar 19 '14 17:03

We have the same issue: it keeps raising "not master" instead of switching to the new master.

Exception: Moped::Errors::OperationFailure: The operation: #<Moped::Protocol::Command
  @length=80
  @request_id=324
  @response_to=0
  @op_code=2004
  @flags=[]
  @full_collection_name="cache_production.$cmd"
  @skip=0
  @limit=-1
  @selector={:getlasterror=>1, "w"=>1}
  @fields=nil>
failed with error 10054: "not master"

See https://github.com/mongodb/mongo/blob/master/docs/errors.md
for details about this error.
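
The failure above carries the message "not master", so a retry hook needs some way to recognize it as a failover signal. A rough sketch of such a check follows; the helper name `not_master_error?` and the key names in the details hash are assumptions made for illustration, not Moped's actual API.

```ruby
# Hypothetical helper: the name and the hash keys are assumptions.
NOT_MASTER_MESSAGE = "not master"

# Returns true when the server's error details mention "not master".
def not_master_error?(details)
  text = details["err"] || details["errmsg"] || details["$err"] || ""
  text.include?(NOT_MASTER_MESSAGE)
end

# Details shaped like the failure reported above.
details = { "err" => "not master", "code" => 10054 }
failover_signal = not_master_error?(details)
```

Matching on the message rather than on a fixed list of error codes is a deliberate simplification here, since different operations may report the same condition under different codes.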

reidmorrison, Apr 08 '14 12:04

This is fixed with pull request https://github.com/mongoid/moped/pull/265

reidmorrison, Apr 18 '14 18:04

This is fixed in #351

mateusdelbianco, Feb 13 '15 19:02