[YSQL] select count(*) on a table returns different counts when run multiple times
Jira Link: DB-3595
Description
We have observed that after data is ingested into a table, firing select count(*) from accounts
on that table multiple times, with no other operations being performed on the table in between, consistently returns different counts.
Observations are noted in this document - https://docs.google.com/document/d/1ha9dpdk7lyFvlBcA3doIHIhGbTTHkhMV9h2HhqxSNV8/edit?usp=sharing
In these experiments we fired count(*)
on the same table multiple times and saw the output change from run to run.
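A minimal sketch of how to observe this, assuming psycopg2 and placeholder connection settings (host/user are not the actual cluster values):

import psycopg2

# Placeholder connection settings; point these at the etl2 database.
conn = psycopg2.connect(host="127.0.0.1", dbname="etl2", user="yugabyte")
conn.autocommit = True
with conn.cursor() as cur:
    for i in range(5):
        cur.execute("SELECT count(*) FROM accounts")
        # With no writes in between, every iteration should print the same count.
        print(i, cur.fetchone()[0])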
The schema of the table -
etl2=# \dS accounts;
Table "public.accounts"
      Column      |  Type  | Collation | Nullable | Default
------------------+--------+-----------+----------+---------
 block            | bigint |           | not null |
 address          | text   |           | not null |
 dc_balance       | bigint |           | not null | 0
 dc_nonce         | bigint |           | not null | 0
 security_balance | bigint |           | not null | 0
 security_nonce   | bigint |           | not null | 0
 balance          | bigint |           | not null | 0
 nonce            | bigint |           | not null | 0
 staked_balance   | bigint |           |          |
Indexes:
"accounts_pkey" PRIMARY KEY, lsm (block HASH, address ASC)
"account_address_idx" lsm (address HASH)
"account_block_idx" lsm (block HASH)
Triggers:
account_insert AFTER INSERT ON accounts FOR EACH ROW EXECUTE PROCEDURE account_inventory_update()
YugabyteDB Cluster information -
The cluster name - pgupta-voyager-test-2
(https://portal.dev.yugabyte.com/universes/e0bd5274-9145-46a2-b27f-109c6524ac2e )
DB Build name - 2.14.2.0-b16
DB Version - 11.2-YB-2.14.2.0-b0
DB name - etl2
Table name - accounts
Data ingested was ~96GB (700172554 rows) and is present in S3 in this bucket - s3://yb-voyager-test-data/data/accounts_data.sql
@priyanshi-yb Is this with set yb_disable_transactional_writes=true; or with specific gflags set? I'll try to reproduce this and https://github.com/yugabyte/yugabyte-db/issues/14113
@def- No, yb_disable_transactional_writes=true was not set; the data was ingested with these session variables -
SET client_encoding TO 'UTF8';
SET session_replication_role TO replica;
SET yb_enable_upsert_mode TO true;
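For reference, a minimal sketch of applying these session variables before a COPY, assuming a psycopg2-based loader and a local copy of the data file; this is a sketch, not Voyager's actual loader:

import psycopg2

conn = psycopg2.connect(host="127.0.0.1", dbname="etl2", user="yugabyte")
with conn.cursor() as cur:
    cur.execute("SET client_encoding TO 'UTF8'")
    # replica mode skips ordinary row triggers such as account_insert.
    cur.execute("SET session_replication_role TO replica")
    # upsert mode turns inserts into blind writes that skip read-before-write checks.
    cur.execute("SET yb_enable_upsert_mode TO true")
    # Placeholder file name; the real data lives in s3://yb-voyager-test-data/data/accounts_data.sql.
    with open("accounts_data.sql") as f:
        cur.copy_expert("COPY accounts FROM STDIN", f)
conn.commit()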
I see that there were a lot of FATALs on node 2:
[yugabyte@ip ~]$ find /mnt/d0 -name '*FATAL.details*'
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-05T15_46_46.pid28930.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-05T15_46_50.pid18501.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_46_11.pid2149.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_51_46.pid3698.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_49_47.pid43912.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_56_34.pid12027.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T11_01_13.pid20756.txt
[yugabyte@ip-172-151-77-199 ~]$ cat /mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_46_11.pid2149.txt
F20220921 10:46:11 ../../src/yb/tablet/operations/operation_driver.cc:399] T f3acef454d564e4d81088e8e114427cf P 849132839a8f472c8df8106357f4987c S RD-P Ts { physical: 1663757111175053 } kChangeMetadata (0x0000000e0d23e240): Apply failed: Timed out (yb/util/operation_counter.cc:166): Timed out waiting for all pending operations to complete. 216 transactions pending. Waited for 60.000s, request: tablet_id: "f3acef454d564e4d81088e8e114427cf" schema { columns { id: 0 name: "block" type { main: INT64 } is_key: true is_hash_key: true is_nullable: false is_static: false is_counter: false sorting_type: 0 order: 1 pg_type_oid: 20 } columns { id: 1 name: "address" type { main: STRING } is_key: true is_nullable: false is_static: false is_counter: false sorting_type: 3 order: 2 pg_type_oid: 25 } columns { id: 2 name: "dc_balance" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 3 pg_type_oid: 20 } columns { id: 3 name: "dc_nonce" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 4 pg_type_oid: 20 } columns { id: 4 name: "security_balance" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 5 pg_type_oid: 20 } columns { id: 5 name: "security_nonce" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 6 pg_type_oid: 20 } columns { id: 6 name: "balance" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 7 pg_type_oid: 20 } columns { id: 7 name: "nonce" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 8 pg_type_oid: 20 } columns { id: 8 name: "staked_balance" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 9 pg_type_oid: 20 } table_properties { contain_counters: false is_transactional: true consistency_level: STRONG use_mangled_column_name: false num_tablets: 48 is_ysql_catalog_table: false retain_delete_markers: false partition_key_version: 0 } colocated_table_id { } pgschema_name: "public" } schema_version: 1 new_table_name: "accounts_350m_upsert_diable" dest_uuid: "849132839a8f472c8df8106357f4987c" propagated_hybrid_time: 6814749127353565184 alter_table_id: "000040000000300080000000000082a2"
@ 0x2a01089 google::LogDestination::LogToSinks()
@ 0x2a0050b google::LogMessage::SendToLog()
@ 0x2a01853 google::LogMessage::Flush()
@ 0x2a01e6f google::LogMessageFatal::~LogMessageFatal()
@ 0x34e2c71 yb::tablet::OperationDriver::ReplicationFinished()
@ 0x2dd4135 yb::consensus::ConsensusRound::NotifyReplicationFinished()
@ 0x2e20945 yb::consensus::ReplicaState::ApplyPendingOperationsUnlocked()
@ 0x2e1fc6d yb::consensus::ReplicaState::AdvanceCommittedOpIdUnlocked()
@ 0x2dff748 yb::consensus::RaftConsensus::UpdateMajorityReplicated()
@ 0x2dcb361 yb::consensus::PeerMessageQueue::NotifyObserversOfMajorityReplOpChangeTask()
@ 0x3a394ee yb::ThreadPool::DispatchThread()
@ 0x3a34c04 yb::Thread::SuperviseThread()
@ 0x7f06904e3694 start_thread
@ 0x7f06909e541d __clone
I’m guessing that a tablet server crashing during a copy into this table can cause lost rows. There should be an application error in that case; is it maybe being ignored in Voyager? We have a bunch of bugs open for this kind of FATAL: https://github.com/yugabyte/yugabyte-db/search?q=%22Timed+out+waiting+for+all+pending+operations+to+complete%22&type=issues
Locally trying to get a repro I’m always getting conflicts as soon as I copy from multiple connections at once: https://gist.github.com/def-/cff997c0d78a76747e8e6af976098411
psycopg2.errors.SerializationFailure: IO error: Errors occurred while reaching out to the tablet servers: . Errors from tablet servers: [Operation failed. Try again (yb/docdb/conflict_resolution.cc:116): 0de48430-ed30-4be3-a34c-20eb49f76e3c Conflicts with higher priority transaction: 64ba20d6-26bb-48a9-9cd4-a7e7b48bc8cf (transaction error 3), Operation failed. Try again (yb/tablet/transaction_participant.cc:209): Transaction was recently aborted: 0de48430-ed30-4be3-a34c-20eb49f76e3c: 40001 (pgsql error 40001) (transaction error 1)]
Rows then go missing: of the 1000 expected, I get half with 2 threads when this conflict happens.
Grepping through the postgres logs, I see the same conflicts on the universe, for example:
[yugabyte@ip ~]$ grep "Conflicts with higher priority transaction" /mnt/d0/yb-data/tserver/logs/postgresql-2022-09-21_000000.log
2022-09-21 10:56:45.199 UTC [15386] ERROR: Operation failed. Try again: [Operation failed. Try again (yb/docdb/conflict_resolution.cc:114): ef33d032-62cb-4540-8f52-d9cc32a291b1 Conflicts with higher priority transaction: abb62472-5e38-46cb-9298-626119b2cf72 (transaction error 3)]
2022-09-21 11:01:23.789 UTC [14970] ERROR: Operation failed. Try again: [Operation failed. Try again (yb/docdb/conflict_resolution.cc:114): 826d7e53-b443-4b3f-8056-2270b4060e96 Conflicts with higher priority transaction: 4af86ee3-8759-4914-ae2d-6e8a7e5a9124 (transaction error 3)]
Are these failures handled correctly, i.e. are the inserts retried?
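A client is expected to retry on SQLSTATE 40001; a generic retry pattern might look like the following sketch (an assumption about how a loader could handle it, not Voyager's actual code; the function name and backoff values are made up):

import time
import psycopg2
from psycopg2 import errors

def execute_with_retry(conn, sql, params=None, max_retries=5):
    # Retry a statement when YugabyteDB aborts it with SerializationFailure (40001).
    for attempt in range(max_retries):
        try:
            with conn.cursor() as cur:
                cur.execute(sql, params)
            conn.commit()
            return
        except errors.SerializationFailure:
            conn.rollback()                 # discard the aborted transaction
            time.sleep(0.1 * 2 ** attempt)  # exponential backoff before retrying
    raise RuntimeError(f"statement failed after {max_retries} retries")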
Please turn yb_enable_upsert_mode off for large data loading.
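In the loading session that would be, e.g. (same hypothetical psycopg2 session as above):

cur.execute("SET yb_enable_upsert_mode TO false")  # safer default for large loads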
cc @paullee-yb
I cannot reproduce this issue anymore with https://github.com/yugabyte/yugabyte-db/commit/03fc5792142beee569bdb6a0406e86fcfcbde666, closing as fixed.
Activating it for backports.