incubator-pegasus icon indicating copy to clipboard operation
incubator-pegasus copied to clipboard

fix(duplication): fix some nodes can not send dup message cause by combine raw write message

Open ninsmiracle opened this issue 1 year ago • 10 comments

What problem does this PR solve?

#1840

What is changed and how does it work?

1.read the config of max_allowed_write_size 2.judge if current duplication batch_bytes greater than max_allowed_write_size So that we can prevent master cluster send some message that backup cluster can not receive.

Side effects
  • master cluster send message by batch or not

ninsmiracle avatar Jan 09 '24 07:01 ninsmiracle

What's the relationship between the newly added config and duplicate_log_batch_bytes?

acelyc111 avatar Jan 09 '24 11:01 acelyc111

What's the relationship between the newly added config and duplicate_log_batch_bytes?

duplicate_log_batch_bytes is the batch size you can set,always set to 4096. The 'newly' added config max_allowed_write_size is the max size of a write mutation that a cluster could received. Not only duplication write,normal write will be limit by this config also.

If we only consider duplicate_log_batch_bytes,master will send some mutations that backup cluster can not receive. Backup cluster will log:

E2024-01-04 11:42:02.88 (1704339722088300009 6004) replica.replica7.0400174218c587bd: replica_stub.cpp:1645:response_client(): [email protected]:42801: write fail: client = xx.xx.xx.xxx:42801, code = RPC_RRDB_RRDB_DUPLICATE, timeout = 236, status = replication::partition_status::PS_PRIMARY, error = ERR_INVALID_DATA

W2024-01-04 11:47:33.656 (1704340053656585710 6004) replica.replica7.0400174218c58cd1: replica_2pc.cpp:77:on_client_write(): [22.53@xxxx:42801] client from xxxx:42801 write request body size exceed threshold, request = [default], request_body_size = 1048716, max_allowed_write_size = 1048576, it will be rejected!

We can see it clearly that request_body_size(master cluster send to backup cluster) is greater than max_allowed_write_size.

ninsmiracle avatar Jan 09 '24 12:01 ninsmiracle

What's the relationship between the newly added config and duplicate_log_batch_bytes?

duplicate_log_batch_bytes is the batch size you can set,always set to 4096. The 'newly' added config max_allowed_write_size is the max size of a write mutation that a cluster could received. Not only duplication write,normal write will be limit by this config also.

If we only consider duplicate_log_batch_bytes,master will send some mutations that backup cluster can not receive. Backup cluster will log:

E2024-01-04 11:42:02.88 (1704339722088300009 6004) replica.replica7.0400174218c587bd: replica_stub.cpp:1645:response_client(): [email protected]:42801: write fail: client = xx.xx.xx.xxx:42801, code = RPC_RRDB_RRDB_DUPLICATE, timeout = 236, status = replication::partition_status::PS_PRIMARY, error = ERR_INVALID_DATA

W2024-01-04 11:47:33.656 (1704340053656585710 6004) replica.replica7.0400174218c58cd1: replica_2pc.cpp:77:on_client_write(): [22.53@xxxx:42801] client from xxxx:42801 write request body size exceed threshold, request = [default], request_body_size = 1048716, max_allowed_write_size = 1048576, it will be rejected!

We can see it clearly that request_body_size(master cluster send to backup cluster) is greater than max_allowed_write_size.

  1. duplicate_log_batch_bytes is optional, how to comprehend "always" set to 4096?
  2. I only saw dup_max_allowed_write_size is used in duplication module, how it take effect on "normal write"?
  3. What's the config value of duplicate_log_batch_bytes in your environment, if it's less than the config value of max_allowed_write_size in backup cluster (i.e. 1048576), how can this situaction happen?

acelyc111 avatar Jan 09 '24 16:01 acelyc111

What's the relationship between the newly added config and duplicate_log_batch_bytes?

duplicate_log_batch_bytes is the batch size you can set,always set to 4096. The 'newly' added config max_allowed_write_size is the max size of a write mutation that a cluster could received. Not only duplication write,normal write will be limit by this config also. If we only consider duplicate_log_batch_bytes,master will send some mutations that backup cluster can not receive. Backup cluster will log:

E2024-01-04 11:42:02.88 (1704339722088300009 6004) replica.replica7.0400174218c587bd: replica_stub.cpp:1645:response_client(): [email protected]:42801: write fail: client = xx.xx.xx.xxx:42801, code = RPC_RRDB_RRDB_DUPLICATE, timeout = 236, status = replication::partition_status::PS_PRIMARY, error = ERR_INVALID_DATA

W2024-01-04 11:47:33.656 (1704340053656585710 6004) replica.replica7.0400174218c58cd1: replica_2pc.cpp:77:on_client_write(): [22.53@xxxx:42801] client from xxxx:42801 write request body size exceed threshold, request = [default], request_body_size = 1048716, max_allowed_write_size = 1048576, it will be rejected!

We can see it clearly that request_body_size(master cluster send to backup cluster) is greater than max_allowed_write_size.

  1. duplicate_log_batch_bytes is optional, how to comprehend "always" set to 4096?
  2. I only saw dup_max_allowed_write_size is used in duplication module, how it take effect on "normal write"?
  3. What's the config value of duplicate_log_batch_bytes in your environment, if it's less than the config value of max_allowed_write_size in backup cluster (i.e. 1048576), how can this situaction happen?

1.Becauseduplicate_log_batch_bytes default value is 4096 2.I saw max_allowed_write_size in replica_2pc_cpp.Here it can effect on "normal write" 3.It's not easy for master cluster get the value of duplicate_log_batch_bytes config by backup cluster. I don't think it's necessary for master cluster to get this value,because when we doing duplication ,we usually keep the similarity config between two clusters.

I'm sorry that what I said above caused ambiguity.

ninsmiracle avatar Jan 10 '24 07:01 ninsmiracle

Is it necessary to add the new config? Can the issue be resolved by adjust duplicate_log_batch_bytes on master cluster or max_allowed_write_size on backup cluster?

acelyc111 avatar Jan 29 '24 11:01 acelyc111

3.It's not easy for master cluster get the value of duplicate_log_batch_bytes config by backup cluster. I don't think it's necessary for master cluster to get this value,because when we doing duplication ,we usually keep the similarity config between two clusters.

In this patch, a new config dup_max_allowed_write_size is added, it face the same issue.

An alternative way is to set DSN_TAG_VARIABLE(max_allowed_write_size, FT_MUTABLE), then you can change the value by HTTP (e.g. ip:port/updateConfig?max_allowed_write_size=2097152) without restart the server.

acelyc111 avatar Jan 29 '24 12:01 acelyc111

Is it necessary to add the new config? Can the issue be resolved by adjust duplicate_log_batch_bytes on master cluster or max_allowed_write_size on backup cluster?

If the maintainer of the duplication clusters config max_allowed_write_size to zero , the problem will be resolved. However , if maintainer config it to zero for every duplication cluster , when dup service send mutatioin, packaging functionality will no longer supported . I think it's a unsuitable design.

ninsmiracle avatar Jan 29 '24 12:01 ninsmiracle

Is it necessary to add the new config? Can the issue be resolved by adjust duplicate_log_batch_bytes on master cluster or max_allowed_write_size on backup cluster?

If the maintainer of the duplication clusters config max_allowed_write_size to zero , the problem will be resolved. However , if maintainer config it to zero for every duplication cluster , when dup service send mutatioin, packaging functionality will no longer supported . I think it's a unsuitable design.

How about limiting the message size separately for common client write operations and duplication write operations, you can distinguish them by rpc code, then use corresponding limits?

acelyc111 avatar Feb 04 '24 03:02 acelyc111

Is it necessary to add the new config? Can the issue be resolved by adjust duplicate_log_batch_bytes on master cluster or max_allowed_write_size on backup cluster?

The same purpose can be achieved through the configuration of operation and maintenance personnel. But I think it is better to provide some support on the Pegasus server side.

ninsmiracle avatar Feb 05 '24 12:02 ninsmiracle

ponding limits?

good idea ! I will resubmit this pr in rpc related logic later~

ninsmiracle avatar Feb 05 '24 12:02 ninsmiracle

@ninsmiracle Please try to rebase the master branch.

acelyc111 avatar Mar 01 '24 07:03 acelyc111