yugabyte-db
yugabyte-db copied to clipboard
[YSQL] COPY fails with [ERROR: Already present: Duplicate request]
Jira Link: DB-1635 I've seen this error several times now when loading data:
ERROR: Already present: [Already present (yb/consensus/retryable_requests.cc:241): Duplicate request]
According to my analysis, I believe that COPY is needlessly failing. This is the sequence of events that leads to the error:
- A
COPY
command is started by the client. - The Postgres backend sends a
yb.tserver.TabletServerService/Write
RPC request to a tablet for a batch of tuples - The
yb.tserver.TabletServerService/Write
operation is flushed to the tablet followers in ayb.consensus.ConsensusService/UpdateConsensus
request - Before the tablet LEADER can respond to the
yb.tserver.TabletServerService/Write
operation, the tablet switches leaders - Not knowing that the
yb.tserver.TabletServerService/Write
had succeeded, the client attempted the write again, and the tserver responded withERROR: Already present: [Already present (yb/consensus/retryable_requests.cc:241): Duplicate request]
- The query fails
I managed to reproduce the issue on a test cluster, and have pieced together the events from a wire capture:
[yugabyte@yb-demo-tvesely-kroger-copy-566-n1 ~]$ while [[ ! -f /tmp/error_occurred ]]; do ysqlsh -c "copy bar from '/tmp/copy.dump' with (rows_per_transaction 100)" || touch /tmp/error_occurred; done &^C
[yugabyte@yb-demo-tvesely-kroger-copy-566-n1 ~]$ sudo tcpdump -i any -w /tmp/copy_repro.pcap port 9100 or port 7100 or port 5433 &
[1] 2521
[yugabyte@yb-demo-tvesely-kroger-copy-566-n1 ~]$ tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
[yugabyte@yb-demo-tvesely-kroger-copy-566-n1 ~]$ while [[ ! -f /tmp/error_occurred ]]; do ysqlsh -c "copy bar from '/tmp/copy.dump' with (rows_per_transaction 50)" || touch /tmp/error_occurred; done &
[2] 2555
[yugabyte@yb-demo-tvesely-kroger-copy-566-n1 ~]$ while [[ ! -f /tmp/error_occurred ]]; do ysqlsh -c "copy bar from '/tmp/copy.dump' with (rows_per_transaction 50)" || touch /tmp/error_occurred; done &
[3] 2715
[yugabyte@yb-demo-tvesely-kroger-copy-566-n1 ~]$ while [[ ! -f /tmp/error_occurred ]]; do ysqlsh -c "copy bar from '/tmp/copy.dump' with (rows_per_transaction 50)" || touch /tmp/error_occurred; done &
[4] 2748
[yugabyte@yb-demo-tvesely-kroger-copy-566-n1 ~]$ while [[ ! -f /tmp/error_occurred ]]; do ysqlsh -c "copy bar from '/tmp/copy.dump' with (rows_per_transaction 50)" || touch /tmp/error_occurred; done &
[5] 2780
[yugabyte@yb-demo-tvesely-kroger-copy-566-n1 ~]$ while [[ ! -f /tmp/error_occurred ]]; do ysqlsh -c "copy bar from '/tmp/copy.dump' with (rows_per_transaction 50)" || touch /tmp/error_occurred; done &
[6] 2802
[yugabyte@yb-demo-tvesely-kroger-copy-566-n1 ~]$
ERROR: Already present: [Already present (yb/consensus/retryable_requests.cc:241): Duplicate request]
I killed the tserver on yb-demo-tvesely-copy-566-n3
(10.142.15.196
) and the error ERROR: Already present: [Already present (yb/consensus/retryable_requests.cc:241): Duplicate request]
happened immediately. From the Postgres logs:
I0212 20:47:40.074374 2563 tcp_stream.cc:308] { local: 10.142.15.195:49106 remote: 10.142.15.196:9100 }: Recv failed: Network error (yb/util/net/socket.cc:537): recvmsg error: Connection refused (system error 111)
I0212 20:47:40.590497 2936 tablet_rpc.cc:414] Already present (yb/consensus/retryable_requests.cc:241): Failed Write(tablet: a09121672a2d4f1ca3ab05e07481faf3, num_ops: 3, num_attempts: 71, txn: 9950613a-0a58-4739-9f7b-b695711af38a) to tablet a09121672a2d4f1ca3ab05e07481faf3 on tablet server { uuid: b84195c8e1cf45edb6371d123c43f658 private: [host: "10.142.15.195" port: 9100] cloud_info: placement_cloud: "gcp" placement_region: "us-east1" placement_zone: "us-east1-b" after 71 attempt(s): Duplicate request
2021-02-12 20:47:40.804 UTC [2783] ERROR: Already present: [Already present (yb/consensus/retryable_requests.cc:241): Duplicate request]
2021-02-12 20:47:40.804 UTC [2783] STATEMENT: copy bar from '/tmp/copy.dump' with (rows_per_transaction 50)
I0212 20:47:41.372903 2737 tcp_stream.cc:308] { local: 10.142.15.195:49278 remote: 10.142.15.196:9100 }: Recv failed: Network error (yb/util/net/socket.cc:537): recvmsg error: Connection refused (system error 111)
This appears to be the original write request that went to the tablet on a09121672a2d4f1ca3ab05e07481faf3
on node 10.142.15.196
before the tserver on that node went down:
Search String: protobuf.field.value.string == "�Pa:
XG9�{��q�" && protobuf.field.value.uint64 == 43147536 && ip.addr == 10.142.15.196 && tcp.stream == 79
327593 20:47:37.003132 10.142.15.195 10.142.15.196 YBRPC 9130 46484 → 9100 [PSH, ACK] Seq=3612270 Ack=422037 Win=178816 Len=9062 TSval=345000483 TSecr=345007251 (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.WriteRequestPB
Frame 327593: 9130 bytes on wire (73040 bits), 9130 bytes captured (73040 bits)
Linux cooked capture v1
Internet Protocol Version 4, Src: 10.142.15.195, Dst: 10.142.15.196
Transmission Control Protocol, Src Port: 46484, Dst Port: 9100, Seq: 3612270, Ack: 422037, Len: 9062
Yugabyte RPC
Message Length: 651
Message Length: 551
Message Length: 548
Message Length: 548
Message Length: 551
Message Length: 551
Message Length: 449
Message Length: 651
Message Length: 651
Message Length: 547
Protobuf Header Length: 48
Call ID: 22590
Protocol Buffers: yb.rpc.RequestHeader (Message: yb.rpc.RequestHeader)
call_id: 22590
remote_method: (39 bytes) (Message: yb.rpc.RemoteMethodPB)
service_name: yb.tserver.TabletServerService
method_name: Write
timeout_millis: 2500
Protobuf Message Length: 496
Protocol Buffers: yb.tserver.TabletServerService/Write,request (Message: yb.tserver.WriteRequestPB)
tablet_id: a09121672a2d4f1ca3ab05e07481faf3
propagated_hybrid_time: 6607515062281265152
include_trace: False
write_batch: (76 bytes) (Message: yb.docdb.KeyValueWriteBatchPB)
transaction: (72 bytes) (Message: yb.TransactionMetadataPB)
transaction_id: �Pa:\nXG9�{��q\032�
isolation: SNAPSHOT_ISOLATION (1)
status_tablet: 4a79b38093de4b098d81160a1e5289b8
priority: 8699739297894306930
start_hybrid_time: 6607515062202617856
DEPRECATED_may_have_metadata: True
read_time: (36 bytes) (Message: yb.ReadHybridTimePB)
read_ht: 6607515062202617856
DEPRECATED_max_of_read_time_and_local_limit_ht: 6607515062407417856
global_limit_ht: 6607515062407417856
in_txn_limit_ht: 6607515062277722112
pgsql_write_batch: (97 bytes) (Message: yb.PgsqlWriteRequestPB)
client: YQL_CLIENT_PGSQL (2)
stmt_id: 43147536
stmt_type: PGSQL_UPSERT (4)
table_id: 000030a900003000800000000000410d
schema_version: 0
hash_code: 1790
partition_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 6158
range_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 6158
range_column_values: (28 bytes) (Message: yb.PgsqlExpressionPB)
value: (26 bytes) (Message: yb.QLValuePB)
binary_value: G78S59\034�K�O\004�,mb�i6�
column_refs: (0 bytes) (Message: yb.PgsqlColumnRefsPB)
ysql_catalog_version: 125
pgsql_write_batch: (97 bytes) (Message: yb.PgsqlWriteRequestPB)
client: YQL_CLIENT_PGSQL (2)
stmt_id: 42568416
stmt_type: PGSQL_UPSERT (4)
table_id: 000030a900003000800000000000410d
schema_version: 0
hash_code: 2286
partition_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
range_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
range_column_values: (28 bytes) (Message: yb.PgsqlExpressionPB)
column_refs: (0 bytes) (Message: yb.PgsqlColumnRefsPB)
ysql_catalog_version: 125
pgsql_write_batch: (96 bytes) (Message: yb.PgsqlWriteRequestPB)
client: YQL_CLIENT_PGSQL (2)
stmt_id: 41892688
stmt_type: PGSQL_UPSERT (4)
table_id: 000030a900003000800000000000410d
schema_version: 0
hash_code: 73
partition_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 8930
range_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 8930
range_column_values: (28 bytes) (Message: yb.PgsqlExpressionPB)
value: (26 bytes) (Message: yb.QLValuePB)
binary_value: G��S*@&\r+>C��&���BSg
column_refs: (0 bytes) (Message: yb.PgsqlColumnRefsPB)
ysql_catalog_version: 125
client_id1: 16377370652852944639
client_id2: 7142353017941589943
request_id: 189
min_running_request_id: 189
rejection_score: 0
batch_idx: 0
Message Length: 351
Message Length: 351
Message Length: 548
Message Length: 650
Message Length: 449
Message Length: 951
The yb.tserver.TabletServerService/Write
RPC call never recieved a response before the node went down, however it WAS replicated over to its RAFT peers:
35722 2021-02-12 12:47:37.067085 10.142.15.196 10.142.15.195 YBRPC 2558 44201 → 9100 [PSH, ACK] Seq=1154541 Ack=161649 Win=1395 Len=2490 TSval=345007472 TSecr=345000523 (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.consensus.ConsensusRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.consensus.ConsensusRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.consensus.ConsensusRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.consensus.ConsensusRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.UpdateTransactionRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.UpdateTransactionRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.UpdateTransactionRequestPB (PROTOBUF) yb.rpc.RequestHeader (PROTOBUF) yb.tserver.UpdateTransactionRequestPB
Frame 35722: 2558 bytes on wire (20464 bits), 2558 bytes captured (20464 bits)
Linux cooked capture v1
Internet Protocol Version 4, Src: 10.142.15.196, Dst: 10.142.15.195
Transmission Control Protocol, Src Port: 44201, Dst Port: 9100, Seq: 1154541, Ack: 161649, Len: 2490
Yugabyte RPC
Message Length: 475
Message Length: 581
Message Length: 581
Protobuf Header Length: 58
Call ID: 31490775
Protocol Buffers: yb.rpc.RequestHeader (Message: yb.rpc.RequestHeader)
call_id: 31490775
remote_method: (48 bytes) (Message: yb.rpc.RemoteMethodPB)
service_name: yb.consensus.ConsensusService
method_name: UpdateConsensus
timeout_millis: 3000
Protobuf Message Length: 520
Protocol Buffers: yb.consensus.ConsensusService/UpdateConsensus,request (Message: yb.consensus.ConsensusRequestPB)
tablet_id: a09121672a2d4f1ca3ab05e07481faf3
caller_uuid: c3e539d6350e4937b22fb50116bbae14
caller_term: 2
preceding_id: (5 bytes) (Message: yb.OpIdPB)
term: 2
index: 12528
committed_op_id: (5 bytes) (Message: yb.OpIdPB)
term: 2
index: 12526
ops: (369 bytes) (Message: yb.consensus.ReplicateMsg)
id: (5 bytes) (Message: yb.OpIdPB)
term: 2
index: 12529
hybrid_time: 6607515062341758976
op_type: WRITE_OP (3)
write_request: (339 bytes) (Message: yb.tserver.WriteRequestPB)
tablet_id: a09121672a2d4f1ca3ab05e07481faf3
write_batch: (235 bytes) (Message: yb.docdb.KeyValueWriteBatchPB)
write_pairs: (51 bytes) (Message: yb.docdb.KeyValuePairPB)
key: G\006�H�
value: $
write_pairs: (51 bytes) (Message: yb.docdb.KeyValuePairPB)
key: G\b�H�
value: $
write_pairs: (51 bytes) (Message: yb.docdb.KeyValuePairPB)
key: G
value: $
transaction: (72 bytes) (Message: yb.TransactionMetadataPB)
transaction_id: �Pa:\nXG9�{��q\032�
isolation: SNAPSHOT_ISOLATION (1)
status_tablet: 4a79b38093de4b098d81160a1e5289b8
priority: 8699739297894306930
start_hybrid_time: 6607515062202617856
DEPRECATED_may_have_metadata: True
read_time: (36 bytes) (Message: yb.ReadHybridTimePB)
read_ht: 6607515062202617856
DEPRECATED_max_of_read_time_and_local_limit_ht: 6607515062407417856
global_limit_ht: 6607515062407417856
in_txn_limit_ht: 6607515062277722112
client_id1: 16377370652852944639
client_id2: 7142353017941589943
request_id: 189
min_running_request_id: 189
batch_idx: 0
committed_op_id: (5 bytes) (Message: yb.OpIdPB)
term: 2
index: 12525
monotonic_counter: 0
dest_uuid: b84195c8e1cf45edb6371d123c43f658
leader_lease_duration_ms: 2000
ht_lease_expiration: 1613162859048077
propagated_safe_time: 6607515062280314879
propagated_hybrid_time: 6607515062468923392
Message Length: 209
Message Length: 153
Message Length: 153
Message Length: 153
Message Length: 153
And the response:
35794 2021-02-12 12:47:37.081766 10.142.15.195 10.142.15.196 YBRPC 472 9100 → 44201 [PSH, ACK] Seq=161750 Ack=1157532 Win=1399 Len=404 TSval=345000561 TSecr=345007485 (PROTOBUF) yb.rpc.ResponseHeader (PROTOBUF) yb.tserver.UpdateTransactionResponsePB (PROTOBUF) yb.rpc.ResponseHeader (PROTOBUF) yb.tserver.UpdateTransactionResponsePB (PROTOBUF) yb.rpc.ResponseHeader (PROTOBUF) yb.tserver.UpdateTransactionResponsePB (PROTOBUF) yb.rpc.ResponseHeader (PROTOBUF) yb.tserver.UpdateTransactionResponsePB (PROTOBUF) yb.rpc.ResponseHeader (PROTOBUF) yb.consensus.ConsensusResponsePB (PROTOBUF) yb.rpc.ResponseHeader (PROTOBUF) yb.consensus.ConsensusResponsePB (PROTOBUF) yb.rpc.ResponseHeader (PROTOBUF) yb.consensus.ConsensusResponsePB (PROTOBUF) yb.rpc.ResponseHeader (PROTOBUF) yb.consensus.ConsensusResponsePB
Frame 35794: 472 bytes on wire (3776 bits), 472 bytes captured (3776 bits)
Linux cooked capture v1
Internet Protocol Version 4, Src: 10.142.15.195, Dst: 10.142.15.196
Transmission Control Protocol, Src Port: 9100, Dst Port: 44201, Seq: 161750, Ack: 1157532, Len: 404
Yugabyte RPC
Message Length: 18
Message Length: 18
Message Length: 18
Message Length: 18
Message Length: 75
Message Length: 75
Message Length: 75
Protobuf Header Length: 7
Call ID: 31490775
Protocol Buffers: yb.rpc.ResponseHeader (Message: yb.rpc.ResponseHeader)
call_id: 31490775
is_error: False
Protobuf Message Length: 66
Protocol Buffers: yb.consensus.ConsensusService/UpdateConsensus,response (Message: yb.consensus.ConsensusResponsePB)
responder_uuid: b84195c8e1cf45edb6371d123c43f658
responder_term: 2
status: (17 bytes) (Message: yb.consensus.ConsensusStatusPB)
last_received: (5 bytes) (Message: yb.OpIdPB)
term: 2
index: 12529
last_committed_idx: 12526
last_received_current_leader: (5 bytes) (Message: yb.OpIdPB)
term: 2
index: 12529
num_sst_files: 1
propagated_hybrid_time: 6607515062605508608
Message Length: 75
This is the Duplicate request
message, which caused the COPY
to terminate. It was sent to the new leader for tablet a09121672a2d4f1ca3ab05e07481faf3
, and because the original tablet LEADER had already replicated these tuples, this tablet already had a record of it:
Packet Filter: YBRPC.call_id == 24328 && tcp.stream == 77
Frame 5017: 619 bytes on wire (4952 bits), 619 bytes captured (4952 bits)
Linux cooked capture v1
Internet Protocol Version 4, Src: 10.142.15.195, Dst: 10.142.15.195
Transmission Control Protocol, Src Port: 34232, Dst Port: 9100, Seq: 4697686, Ack: 489810, Len: 551
Yugabyte RPC
Message Length: 547
Protobuf Header Length: 48
Call ID: 24328
Protocol Buffers: yb.rpc.RequestHeader (Message: yb.rpc.RequestHeader)
call_id: 24328
remote_method: (39 bytes) (Message: yb.rpc.RemoteMethodPB)
service_name: yb.tserver.TabletServerService
method_name: Write
timeout_millis: 2500
Protobuf Message Length: 496
Protocol Buffers: yb.tserver.TabletServerService/Write,request (Message: yb.tserver.WriteRequestPB)
tablet_id: a09121672a2d4f1ca3ab05e07481faf3
propagated_hybrid_time: 6607515062281265152
include_trace: False
write_batch: (76 bytes) (Message: yb.docdb.KeyValueWriteBatchPB)
transaction: (72 bytes) (Message: yb.TransactionMetadataPB)
transaction_id: �Pa:\nXG9�{��q\032�
isolation: SNAPSHOT_ISOLATION (1)
status_tablet: 4a79b38093de4b098d81160a1e5289b8
priority: 8699739297894306930
start_hybrid_time: 6607515062202617856
DEPRECATED_may_have_metadata: True
read_time: (36 bytes) (Message: yb.ReadHybridTimePB)
read_ht: 6607515062202617856
DEPRECATED_max_of_read_time_and_local_limit_ht: 6607515062407417856
global_limit_ht: 6607515062407417856
in_txn_limit_ht: 6607515062277722112
pgsql_write_batch: (97 bytes) (Message: yb.PgsqlWriteRequestPB)
client: YQL_CLIENT_PGSQL (2)
stmt_id: 43147536
stmt_type: PGSQL_UPSERT (4)
table_id: 000030a900003000800000000000410d
schema_version: 0
hash_code: 1790
partition_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 6158
range_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 6158
range_column_values: (28 bytes) (Message: yb.PgsqlExpressionPB)
value: (26 bytes) (Message: yb.QLValuePB)
binary_value: G78S59\034�K�O\004�,mb�i6�
column_refs: (0 bytes) (Message: yb.PgsqlColumnRefsPB)
ysql_catalog_version: 125
pgsql_write_batch: (97 bytes) (Message: yb.PgsqlWriteRequestPB)
client: YQL_CLIENT_PGSQL (2)
stmt_id: 42568416
stmt_type: PGSQL_UPSERT (4)
table_id: 000030a900003000800000000000410d
schema_version: 0
hash_code: 2286
partition_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 8688
range_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 8688
range_column_values: (28 bytes) (Message: yb.PgsqlExpressionPB)
value: (26 bytes) (Message: yb.QLValuePB)
binary_value: G_KS\005ܝ���Cp�\025\016�1ķ�
column_refs: (0 bytes) (Message: yb.PgsqlColumnRefsPB)
ysql_catalog_version: 125
pgsql_write_batch: (96 bytes) (Message: yb.PgsqlWriteRequestPB)
client: YQL_CLIENT_PGSQL (2)
stmt_id: 41892688
stmt_type: PGSQL_UPSERT (4)
table_id: 000030a900003000800000000000410d
schema_version: 0
hash_code: 73
partition_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 8930
range_column_values: (5 bytes) (Message: yb.PgsqlExpressionPB)
value: (3 bytes) (Message: yb.QLValuePB)
int32_value: 8930
range_column_values: (28 bytes) (Message: yb.PgsqlExpressionPB)
value: (26 bytes) (Message: yb.QLValuePB)
binary_value: G��S*@&\r+>C��&���BSg
column_refs: (0 bytes) (Message: yb.PgsqlColumnRefsPB)
ysql_catalog_version: 125
client_id1: 16377370652852944639
client_id2: 7142353017941589943
request_id: 189
min_running_request_id: 189
rejection_score: 0
batch_idx: 0
The tablet responds with ALREADY_PRESENT
, because this tuple had already been inserted into this tablet by the original leader:
Frame 5026: 171 bytes on wire (1368 bits), 171 bytes captured (1368 bits)
Linux cooked capture v1
Internet Protocol Version 4, Src: 10.142.15.195, Dst: 10.142.15.195
Transmission Control Protocol, Src Port: 9100, Dst Port: 34232, Seq: 489940, Ack: 4700999, Len: 103
Yugabyte RPC
Message Length: 99
Protobuf Header Length: 6
Call ID: 24328
Protocol Buffers: yb.rpc.ResponseHeader (Message: yb.rpc.ResponseHeader)
call_id: 24328
is_error: False
Protobuf Message Length: 91
Protocol Buffers: yb.tserver.TabletServerService/Write,response (Message: yb.tserver.WriteResponsePB)
error: (77 bytes) (Message: yb.tserver.TabletServerErrorPB)
code: UNKNOWN_ERROR (1)
status: (73 bytes) (Message: yb.AppStatusPB)
code: ALREADY_PRESENT (6)
message: Duplicate request
source_file: ../../src/yb/consensus/retryable_requests.cc
source_line: 241
errors:
pgsql_response_batch: (2 bytes) (Message: yb.PgsqlResponsePB)
status: PGSQL_STATUS_OK (0)
pgsql_response_batch: (2 bytes) (Message: yb.PgsqlResponsePB)
status: PGSQL_STATUS_OK (0)
pgsql_response_batch: (2 bytes) (Message: yb.PgsqlResponsePB)
status: PGSQL_STATUS_OK (0)
Based on the above evidence, it's my conclusion that the error above need not cancel the running transaction. Is there any reason that the COPY
should not be allowed to continue?
Seeing this in upgrade tests with LST on too, seems to be a duplicate of this:
2022-06-21 15:33:30,318 MainThread INFO --------------------------------------------------------------------------------
2022-06-21 15:33:30,318 MainThread INFO Running Long System Test 0.1
2022-06-21 15:33:30,318 MainThread INFO --------------------------------------------------------------------------------
2022-06-21 15:33:30,318 MainThread INFO
2022-06-21 15:33:30,354 MainThread INFO Reproduce with: git checkout 9a2992a1 && /var/lib/jenkins/code/yb-long-system-test/nohints/long_system_test.py --seed=0 --threads=4 --runtime=30 --nodes 10.9.133.26:5433,10.9.204.168:5433,10.9.219.0:5433 --upgrade
2022-06-21 15:33:30,642 MainThread INFO Database version: PostgreSQL 11.2-YB-2.12.5.0-b0 on x86_64-pc-linux-gnu, compiled by gcc (Homebrew gcc 5.5.0_4) 5.5.0, 64-bit
2022-06-21 15:33:30,645 MainThread INFO Creating tables for database db_lst_761509
2022-06-21 15:33:51,228 MainThread INFO Starting worker_0: SingleInsertAction, SingleUpdateAction, SingleDeleteAction, BulkInsertAction, BulkUpdateAction, SetConfigAction
2022-06-21 15:33:51,229 MainThread INFO Starting worker_1: RandomSelectAction, SetConfigAction
2022-06-21 15:33:51,229 MainThread INFO Starting worker_2: RandomSelectAction, SetConfigAction
2022-06-21 15:33:51,230 MainThread INFO Starting worker_3: RandomSelectAction, SetConfigAction
2022-06-21 15:34:01,240 MainThread INFO Worker queries/s: [035.5][015.6][017.9][015.7]
2022-06-21 15:34:11,250 MainThread INFO Worker queries/s: [028.1][013.9][014.2][014.1]
2022-06-21 15:34:21,260 MainThread INFO Worker queries/s: [012.9][004.8][005.0][004.3]
2022-06-21 15:34:31,269 MainThread INFO Worker queries/s: [030.6][013.3][015.8][013.5]
2022-06-21 15:34:41,279 MainThread INFO Worker queries/s: [022.7][009.1][010.4][009.4]
2022-06-21 15:34:51,285 MainThread INFO Worker queries/s: [035.8][016.1][018.6][016.5]
2022-06-21 15:35:01,295 MainThread INFO Worker queries/s: [033.3][014.7][016.8][015.9]
2022-06-21 15:35:11,305 MainThread INFO Worker queries/s: [027.9][015.9][014.8][016.0]
2022-06-21 15:35:21,315 MainThread INFO Worker queries/s: [027.5][016.9][014.8][016.8]
2022-06-21 15:35:31,323 MainThread INFO Worker queries/s: [029.5][013.5][013.9][016.4]
2022-06-21 15:35:41,333 MainThread INFO Worker queries/s: [029.8][014.4][014.3][014.7]
2022-06-21 15:35:51,343 MainThread INFO Worker queries/s: [030.7][015.9][014.8][016.3]
2022-06-21 15:36:01,355 MainThread INFO Worker queries/s: [027.0][015.7][013.8][014.3]
2022-06-21 15:36:11,365 MainThread INFO Worker queries/s: [030.6][015.6][013.2][015.5]
2022-06-21 15:36:21,374 MainThread INFO Worker queries/s: [035.4][016.6][016.3][017.7]
2022-06-21 15:36:31,384 MainThread INFO Worker queries/s: [035.9][017.0][019.4][017.6]
2022-06-21 15:36:41,393 MainThread INFO Worker queries/s: [035.9][015.2][016.1][017.5]
2022-06-21 15:36:51,403 MainThread INFO Worker queries/s: [033.8][015.1][017.6][017.3]
2022-06-21 15:37:01,413 MainThread INFO Worker queries/s: [035.6][015.5][018.7][017.2]
2022-06-21 15:37:11,423 MainThread INFO Worker queries/s: [028.1][014.8][012.8][013.7]
2022-06-21 15:37:21,433 MainThread INFO Worker queries/s: [028.5][014.3][013.1][014.7]
2022-06-21 15:37:31,444 MainThread INFO Worker queries/s: [025.6][014.1][013.1][014.6]
2022-06-21 15:37:41,452 MainThread INFO Worker queries/s: [027.7][013.1][011.7][014.0]
2022-06-21 15:37:51,463 MainThread INFO Worker queries/s: [026.2][013.9][012.8][014.3]
2022-06-21 15:38:01,472 MainThread INFO Worker queries/s: [025.9][013.3][011.7][013.2]
2022-06-21 15:38:11,482 MainThread INFO Worker queries/s: [024.9][012.5][012.5][013.3]
2022-06-21 15:38:21,492 MainThread INFO Worker queries/s: [026.8][012.8][011.3][014.0]
2022-06-21 15:38:31,502 MainThread INFO Worker queries/s: [024.6][013.5][011.9][012.5]
2022-06-21 15:38:41,511 MainThread INFO Worker queries/s: [024.5][012.6][011.4][013.0]
2022-06-21 15:38:51,521 MainThread INFO Worker queries/s: [024.7][012.1][011.7][014.8]
2022-06-21 15:39:01,536 MainThread INFO Worker queries/s: [023.8][012.9][011.0][013.5]
2022-06-21 15:39:11,546 MainThread INFO Worker queries/s: [022.1][012.8][010.0][012.4]
2022-06-21 15:39:21,556 MainThread INFO Worker queries/s: [023.7][013.0][012.0][012.4]
2022-06-21 15:39:31,562 MainThread INFO Worker queries/s: [024.8][013.5][011.2][013.4]
2022-06-21 15:39:41,572 MainThread INFO Worker queries/s: [026.3][014.3][012.8][014.1]
2022-06-21 15:39:51,582 MainThread INFO Worker queries/s: [024.7][013.4][011.8][012.8]
2022-06-21 15:40:01,592 MainThread INFO Worker queries/s: [023.5][012.5][011.8][013.3]
2022-06-21 15:40:11,604 MainThread INFO Worker queries/s: [021.2][011.5][009.9][012.4]
2022-06-21 15:40:21,614 MainThread INFO Worker queries/s: [022.6][011.8][010.6][012.5]
2022-06-21 15:40:31,625 MainThread INFO Worker queries/s: [023.4][012.3][011.8][012.3]
2022-06-21 15:40:41,635 MainThread INFO Worker queries/s: [019.3][010.8][010.5][012.3]
2022-06-21 15:40:51,648 MainThread INFO Worker queries/s: [022.9][012.3][010.0][012.2]
2022-06-21 15:41:01,658 MainThread INFO Worker queries/s: [021.1][011.4][011.3][011.9]
2022-06-21 15:41:11,669 MainThread INFO Worker queries/s: [021.9][012.6][011.8][012.6]
2022-06-21 15:41:21,674 MainThread INFO Worker queries/s: [022.5][012.8][011.6][012.3]
2022-06-21 15:41:31,684 MainThread INFO Worker queries/s: [023.9][012.8][012.2][011.8]
2022-06-21 15:41:41,697 MainThread INFO Worker queries/s: [027.4][012.2][013.5][001.6]
2022-06-21 15:41:51,706 MainThread INFO Worker queries/s: [028.8][013.7][013.4][011.8]
2022-06-21 15:42:01,716 MainThread INFO Worker queries/s: [027.3][013.4][014.5][013.0]
2022-06-21 15:42:11,726 MainThread INFO Worker queries/s: [027.8][013.4][014.5][014.6]
2022-06-21 15:42:21,736 MainThread INFO Worker queries/s: [027.7][012.7][013.5][013.2]
2022-06-21 15:42:31,746 MainThread INFO Worker queries/s: [025.0][013.0][013.4][012.0]
2022-06-21 15:42:41,756 MainThread INFO Worker queries/s: [023.7][011.9][012.2][012.0]
2022-06-21 15:42:51,766 MainThread INFO Worker queries/s: [026.6][013.5][014.4][013.7]
2022-06-21 15:43:01,777 MainThread INFO Worker queries/s: [019.5][010.5][009.7][010.2]
2022-06-21 15:43:11,787 MainThread INFO Worker queries/s: [012.9][006.0][008.2][000.5]
2022-06-21 15:43:21,797 MainThread INFO Worker queries/s: [020.0][013.2][013.3][000.3]
2022-06-21 15:43:31,807 MainThread INFO Worker queries/s: [025.4][014.4][013.6][011.6]
2022-06-21 15:43:41,817 MainThread INFO Worker queries/s: [024.0][012.2][012.8][012.3]
2022-06-21 15:43:51,827 MainThread INFO Worker queries/s: [010.7][004.6][006.3][004.9]
2022-06-21 15:44:01,837 MainThread INFO Worker queries/s: [013.6][004.8][007.0][004.9]
2022-06-21 15:44:11,847 MainThread INFO Worker queries/s: [022.9][010.9][012.7][009.6]
2022-06-21 15:44:21,856 MainThread INFO Worker queries/s: [025.1][013.3][013.9][012.6]
2022-06-21 15:44:31,866 MainThread INFO Worker queries/s: [018.9][006.8][007.4][005.6]
2022-06-21 15:44:41,879 MainThread INFO Worker queries/s: [023.2][011.4][012.2][010.3]
2022-06-21 15:44:51,888 MainThread INFO Worker queries/s: [017.2][006.8][007.5][008.1]
2022-06-21 15:45:01,900 MainThread INFO Worker queries/s: [016.2][007.1][009.3][007.3]
2022-06-21 15:45:11,910 MainThread INFO Worker queries/s: [018.9][009.3][010.8][008.0]
2022-06-21 15:45:21,920 MainThread INFO Worker queries/s: [019.8][010.6][010.3][010.7]
2022-06-21 15:45:31,932 MainThread INFO Worker queries/s: [023.3][011.6][012.9][011.4]
2022-06-21 15:45:41,942 MainThread INFO Worker queries/s: [015.8][007.6][008.5][007.7]
2022-06-21 15:45:51,951 MainThread INFO Worker queries/s: [018.0][007.8][009.3][008.1]
2022-06-21 15:46:01,962 MainThread INFO Worker queries/s: [024.5][010.8][012.2][010.9]
2022-06-21 15:46:11,971 MainThread INFO Worker queries/s: [014.9][007.4][007.7][009.3]
2022-06-21 15:46:21,981 MainThread INFO Worker queries/s: [023.6][011.9][011.9][011.0]
2022-06-21 15:46:31,990 MainThread INFO Worker queries/s: [028.2][012.5][013.6][012.3]
2022-06-21 15:46:42,001 MainThread INFO Worker queries/s: [027.4][011.9][012.6][012.8]
2022-06-21 15:46:52,010 MainThread INFO Worker queries/s: [024.3][012.5][011.8][008.5]
2022-06-21 15:47:02,020 MainThread INFO Worker queries/s: [024.2][012.1][010.7][000.0]
2022-06-21 15:47:12,030 MainThread INFO Worker queries/s: [022.5][011.4][009.9][000.0]
2022-06-21 15:47:22,040 MainThread INFO Worker queries/s: [021.7][010.1][009.7][009.4]
2022-06-21 15:47:32,050 MainThread INFO Worker queries/s: [024.3][012.0][011.2][009.3]
2022-06-21 15:47:42,060 MainThread INFO Worker queries/s: [027.4][013.7][012.3][013.1]
2022-06-21 15:47:52,070 MainThread INFO Worker queries/s: [024.8][011.2][012.1][010.9]
2022-06-21 15:48:02,080 MainThread INFO Worker queries/s: [025.4][011.5][011.5][013.3]
2022-06-21 15:48:12,090 MainThread INFO Worker queries/s: [023.2][011.3][012.1][010.6]
2022-06-21 15:48:22,100 MainThread INFO Worker queries/s: [025.2][012.9][012.2][012.3]
2022-06-21 15:48:32,110 MainThread INFO Worker queries/s: [025.3][012.7][012.1][012.1]
2022-06-21 15:48:42,120 MainThread INFO Worker queries/s: [024.4][011.5][010.7][011.1]
2022-06-21 15:48:52,126 MainThread INFO Worker queries/s: [021.0][010.4][009.4][008.0]
2022-06-21 15:49:02,136 MainThread INFO Worker queries/s: [019.9][010.8][011.8][009.8]
2022-06-21 15:49:12,147 MainThread INFO Worker queries/s: [024.3][010.5][011.1][010.8]
2022-06-21 15:49:22,157 MainThread INFO Worker queries/s: [012.8][005.4][007.6][007.3]
2022-06-21 15:49:32,167 MainThread INFO Worker queries/s: [016.3][008.2][009.0][008.3]
2022-06-21 15:49:42,177 MainThread INFO Worker queries/s: [017.3][008.3][009.6][008.6]
2022-06-21 15:49:52,187 MainThread INFO Worker queries/s: [020.0][009.3][010.7][009.7]
2022-06-21 15:50:02,197 MainThread INFO Worker queries/s: [017.1][009.6][010.6][010.4]
2022-06-21 15:50:12,207 MainThread INFO Worker queries/s: [016.4][009.5][009.0][007.8]
2022-06-21 15:50:22,216 MainThread INFO Worker queries/s: [019.3][008.5][009.7][008.5]
2022-06-21 15:50:32,226 MainThread INFO Worker queries/s: [020.9][010.3][010.6][010.2]
2022-06-21 15:50:42,236 MainThread INFO Worker queries/s: [018.4][010.0][009.4][008.5]
2022-06-21 15:50:52,245 MainThread INFO Worker queries/s: [018.2][009.0][008.9][009.2]
2022-06-21 15:51:02,255 MainThread INFO Worker queries/s: [017.5][006.7][008.5][007.4]
2022-06-21 15:51:12,264 MainThread INFO Worker queries/s: [021.1][008.8][010.3][008.1]
2022-06-21 15:51:22,274 MainThread INFO Worker queries/s: [021.7][009.7][010.6][007.5]
2022-06-21 15:51:32,284 MainThread INFO Worker queries/s: [020.9][010.2][010.6][010.2]
2022-06-21 15:51:42,294 MainThread INFO Worker queries/s: [022.7][011.7][010.8][010.5]
2022-06-21 15:51:52,300 MainThread INFO Worker queries/s: [018.9][010.8][010.2][009.7]
2022-06-21 15:52:02,309 MainThread INFO Worker queries/s: [017.5][009.0][009.9][009.0]
2022-06-21 15:52:12,319 MainThread INFO Worker queries/s: [020.0][011.2][012.1][009.4]
2022-06-21 15:52:22,329 MainThread INFO Worker queries/s: [017.8][010.6][010.9][008.6]
2022-06-21 15:52:32,338 MainThread INFO Worker queries/s: [020.9][009.5][009.3][008.7]
2022-06-21 15:52:42,348 MainThread INFO Worker queries/s: [017.3][006.7][009.0][008.0]
2022-06-21 15:52:52,358 MainThread INFO Worker queries/s: [021.9][009.7][009.9][010.4]
2022-06-21 15:53:02,369 MainThread INFO Worker queries/s: [019.5][006.1][007.8][007.8]
2022-06-21 15:53:12,379 MainThread INFO Worker queries/s: [017.9][008.1][008.4][009.0]
2022-06-21 15:53:22,389 MainThread INFO Worker queries/s: [019.1][009.2][008.4][009.7]
2022-06-21 15:53:32,398 MainThread INFO Worker queries/s: [013.4][006.1][007.6][008.2]
2022-06-21 15:53:42,406 MainThread INFO Worker queries/s: [018.7][008.9][009.1][010.0]
2022-06-21 15:53:52,416 MainThread INFO Worker queries/s: [018.3][008.8][008.4][007.9]
2022-06-21 15:54:02,426 MainThread INFO Worker queries/s: [012.6][010.1][009.4][007.5]
2022-06-21 15:54:12,435 MainThread INFO Worker queries/s: [018.4][009.8][009.3][009.2]
2022-06-21 15:54:22,445 MainThread INFO Worker queries/s: [017.3][009.7][009.8][008.7]
2022-06-21 15:54:32,455 MainThread INFO Worker queries/s: [020.9][009.9][009.4][008.5]
2022-06-21 15:54:42,465 MainThread INFO Worker queries/s: [017.1][009.2][007.8][008.7]
2022-06-21 15:54:52,475 MainThread INFO Worker queries/s: [021.1][011.0][009.8][011.2]
2022-06-21 15:55:02,482 MainThread INFO Worker queries/s: [015.2][006.5][007.2][007.6]
2022-06-21 15:55:12,491 MainThread INFO Worker queries/s: [010.9][007.7][000.0][009.7]
2022-06-21 15:55:22,501 MainThread INFO Worker queries/s: [021.6][010.0][000.0][010.2]
2022-06-21 15:55:30,478 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,478 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,495 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,495 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,512 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,512 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,531 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,531 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,548 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,549 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,566 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,566 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,585 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,585 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,602 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,602 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,619 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,619 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,636 worker_2 ERROR Exception while reconnecting to YugabyteDB
2022-06-21 15:55:30,636 worker_2 ERROR could not connect to server: Connection refused
Is the server running on host "10.9.219.0" and accepting
TCP/IP connections on port 5433?
2022-06-21 15:55:30,636 worker_2 ERROR Giving up
Exception in thread worker_2:
Traceback (most recent call last):
File "/usr/share/python-3.7.7/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/usr/share/python-3.7.7/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/var/lib/jenkins/code/yb-long-system-test/nohints/lst/worker.py", line 49, in run
with self.conn.cursor() as cur:
psycopg2.InterfaceError: connection already closed
2022-06-21 15:55:32,511 MainThread INFO Worker queries/s: [022.1][009.9][000.0][010.4]
2022-06-21 15:55:42,521 MainThread INFO Worker queries/s: [021.9][009.6][000.0][010.8]
2022-06-21 15:55:52,531 MainThread INFO Worker queries/s: [020.4][010.5][000.0][010.6]
2022-06-21 15:56:02,541 MainThread INFO Worker queries/s: [014.7][007.8][000.0][007.4]
2022-06-21 15:56:09,638 worker_0 ERROR Unexpected query failure: InternalError_
Query: UPDATE t0 SET c0_int4range = int4range((octet_length('66')), NULL) WHERE (64.63889331793206) != (-16.267694121145368);
values: None
runtime: 2022-06-21 15:56:03.092 - 2022-06-21 15:56:09.637
supports explain: True
supports rollback: True
affected rows: None
Action: SingleUpdateAction
Error class: InternalError_
Error code: XX000
Error message: ERROR: Already present: [Already present (yb/consensus/retryable_requests.cc:244): Duplicate request 341 from client 95e6fe92-ce24-4b86-9f23-44da1c3c1e42 (min running 341)]
Transaction isolation level: default
DB Node: host: 10.9.133.26, port: 5433
DB Backend PID: 17655
Duplicate retryable requests were resolved by fix for #15006
Reopen issue since it's happening on some overloaded clusters.