Bad Security Object Error After Moving Shards

Open arifcse019 opened this issue 7 years ago • 6 comments

Security objects for some databases fail to sync properly in a new node after all shards are moved from an old node.

Expected Behavior

Security objects for all databases should sync when shards are moved to a new node.

Current Behavior

Security objects for some databases fail to sync properly in a new node after all shards are moved from an old node. The log contains entries such as:

" [error] 2018-09-19T17:24:05.388202Z [email protected] <0.19944.2> -------- Bad security object in <<"db-name">>: [{{[{<<"_id">>,<<"_security">>},{<<"admins">>,{[{<<"names">>,[]},{<<"roles">>,[]}]}},{<<"members">>,{[{<<"names">>,[<<"user-name">>]},{<<"roles">>,[]}]}}]},13},{{[]},7}] "

Steps to Reproduce (for bugs)

  1. Add a new node to the cluster
  2. Move all shards from an old node to the new one
  3. Shut down and delete the old node
  4. Verify the security objects on all databases
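Step 4 can be scripted. The following is a minimal sketch, not a tested procedure: `NODE_A` and `NODE_B` are hypothetical placeholder URLs, and it assumes that each node answers `GET /{db}/_security` on the clustered port and that the answers may differ between nodes while the security object is out of sync. The comparison is crude: it strips all whitespace and compares the remaining text, so a difference in key order is also reported as a mismatch.

```shell
#!/bin/sh
# Hypothetical node URLs; replace the credentials and host names with your own.
NODE_A=${NODE_A:-http://login:password@node-a.example:5984}
NODE_B=${NODE_B:-http://login:password@node-b.example:5984}

# Strip whitespace so formatting differences do not count as mismatches.
normalize() {
	tr -d ' \t\r\n'
}

# Print "match" when two security object strings are equal, else "MISMATCH".
compare_security() {
	a=$(printf '%s' "$1" | normalize)
	b=$(printf '%s' "$2" | normalize)
	if [ "$a" = "$b" ]; then echo "match"; else echo "MISMATCH"; fi
}

# Fetch the security object of database $2 from the node at base URL $1.
fetch_security() {
	curl -s "$1/$2/_security"
}

# Compare one database across both nodes and print the result.
# Names containing "/" or "+" must be percent-encoded by the caller.
verify_db() {
	printf '%s: ' "$1"
	compare_security "$(fetch_security "$NODE_A" "$1")" "$(fetch_security "$NODE_B" "$1")"
}
```

Calling `verify_db db-name` for each name returned by `/_all_dbs` would cover every database; the block itself only defines the functions and makes no requests until `verify_db` is called.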

Context

We are replacing CouchDB cluster instances with new ones to prepare for a scenario in which one instance goes away abruptly.

Your Environment

  • Version used: Couch 2.1.2, 3 node cluster

arifcse019 avatar Sep 19 '18 20:09 arifcse019

@arifcse019 Can you provide a minimal example for us with a script that uses cURL? I've personally performed the steps you mention above (1-4) many times and have never run across this.

What technique are you using to move the shards? There is newly updated documentation online on the approved approach. Can you follow that?

http://docs.couchdb.org/en/stable/cluster/sharding.html

wohali avatar Sep 19 '18 23:09 wohali

@wohali I am using the following two Ruby scripts to move shards. The first updates the cluster metadata to add the shards to the new node; the second updates the cluster metadata so that it stops looking for those shards on the old node.

https://gist.github.com/arifcse019/43a638e4ce837b029d62d59fd0b9a20f (move_shards_in.rb)
https://gist.github.com/arifcse019/c8a7096275e16d344f6c53ad884716ea (move_shards_out.rb)

The steps are as I described in my issue. These two scripts are run as part of step 2.

arifcse019 avatar Sep 20 '18 15:09 arifcse019

Hello, I'm facing the same issue on CouchDB 3.1:

[error] 2020-07-30T10:47:23.718632Z [email protected] <0.23319.995> -------- Bad security object in <<"_users">>: [{{[{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}]},8},{{[{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}]},8}]

This is the path I followed:

  • I added a new node to the cluster
  • I added all the shards/nodes to the metadata for the db
  • I invoked _sync_shards for the database

The shards correctly appeared on the other node but the security object got lost and the error started to appear in the log.

I opened the db in Fauxton and the permissions had reverted to the basic _admin/_admin.

I modified them and the error has now gone away.

So my guess is that there's something missing in the _sync_shards code when it comes to copying database permissions.
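One workaround consistent with the steps above is to save the security object before syncing the shards and write it back if it has changed afterwards. This is a minimal sketch under stated assumptions, not a confirmed fix: `DB_URL` is a hypothetical placeholder, the function names are made up for illustration, and only the `GET`/`PUT /{db}/_security` and `POST /{db}/_sync_shards` endpoints are used. The shard sync may still be in progress when the second read happens, so a real script would probably need to wait or re-check.

```shell
#!/bin/sh
# Hypothetical base URL; replace the credentials and host with your own.
DB_URL=${DB_URL:-http://login:password@localhost:5984}

# Succeed (exit 0) when the current security object differs from the saved one.
# Whitespace is ignored; a change in key order counts as a difference.
security_changed() {
	s=$(printf '%s' "$1" | tr -d ' \t\r\n')
	c=$(printf '%s' "$2" | tr -d ' \t\r\n')
	[ "$s" != "$c" ]
}

# Save the security object, trigger the shard sync, and restore the object
# if it no longer matches. No error handling: a failed first read would be
# written back as-is, so check the saved value before relying on this.
sync_and_restore() {
	db=$1
	before=$(curl -s "$DB_URL/$db/_security")
	curl -s -X POST "$DB_URL/$db/_sync_shards"
	after=$(curl -s "$DB_URL/$db/_security")
	if security_changed "$before" "$after"; then
		echo "restoring security object for $db"
		printf '%s' "$before" | curl -s -X PUT -H 'content-type: application/json' -d@- "$DB_URL/$db/_security"
	fi
}
```

`sync_and_restore db_name` would run the whole sequence for one database; defining the functions alone makes no requests.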

skeyby avatar Jul 30 '20 11:07 skeyby

Same problem here.

Steps to reproduce:

  • Create a DB on a single node.
  • Add permissions to the DB.
  • Add a second node to the cluster.
  • Add the second node to all shards.

Current Behavior:

  • The DB is visible on the second node, but its security object is not synchronized/copied from the first node.

Expected Behaviour:

  • The DB on the second node should have the same permissions as on the first node. When you change permissions on the first node, they are copied to the second node; the same is expected after applying the steps above.

Version:

couchdb-3.1.1-1.el7.x86_64 running on CentOS 7

kripper avatar Oct 17 '20 05:10 kripper

More info about the error message. A normal database security object should look like this:

[root@esrp-0a ~]# curl -s http://login:[email protected]:5984/db_name/_security| jq
{
  "members": {
    "roles": [
      "_admin"
    ]
  },
  "admins": {
    "roles": [
      "_admin"
    ]
  }
}

For a database where the error is present, the same command produces:

[root@esrp-0a ~]# curl -s http://login:[email protected]:5984/db_name/_security| jq
{}

sergey-safarov avatar Mar 09 '24 17:03 sergey-safarov

I have created a script to display the status of the security objects on my server. You need to edit the login and password in the script before running it.

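#!/bin/sh
# Report (and optionally repair) databases whose _security object is empty.

# Edit the login and password before running.
db_url=http://login:[email protected]:5984
# Set to "true" to overwrite empty security objects with security_json.
fix_db=false

# Percent-encode every "/" and "+" in a database name for use in a URL path.
escape_dbname() {
	local DBNAME=$1
	echo "$DBNAME" | sed -e 's:/:%2f:g' -e 's:+:%2B:g'
}

# The security object written to a database when fixing it.
security_json() {
cat << EOF
{"members":{"roles":["_admin"]},"admins":{"roles":["_admin"]}}
EOF
}

# Print every database name, one per line.
get_db_list() {
curl -s "${db_url}/_all_dbs" | jq -r '.[]'
}

# Print "false" when the security object is empty ({}), otherwise "true".
check_db_security() {
	local dbname=$1
	local esc_dbname="$(escape_dbname "${dbname}")"
	curl -s "${db_url}/${esc_dbname}/_security" | jq 'if . == {} then false else true end'
}

# Report the database, or rewrite its security object when fix_db is not "false".
maybe_fix_db_security() {
	local dbname=$1
	local esc_dbname="$(escape_dbname "${dbname}")"
	if [ "${fix_db}" = "false" ]; then
		echo "need to fix database: ${dbname}"
		return
	fi
	echo "fixing database: ${dbname}"
	security_json | curl -X PUT -H 'content-type: application/json' -H 'accept: application/json' -d@- -s "${db_url}/${esc_dbname}/_security"
}

for i in $(get_db_list)
do
	sec_status=$(check_db_security "$i")
	if [ "${sec_status}" = "false" ]; then
		maybe_fix_db_security "$i"
	fi
done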

To fix the security objects, set the "fix_db" variable to "true".

sergey-safarov avatar Mar 09 '24 17:03 sergey-safarov