
[YSQL] select count(*) from a table returns different counts when run multiple times

Open · priyanshi-yb opened this issue 1 year ago · 5 comments

Jira Link: DB-3595

Description

We have observed that after data is ingested into a table, running select count(*) from accounts on that table multiple times, with no other operation performed on the table in between, keeps returning different counts.

Noted the observations in this document - https://docs.google.com/document/d/1ha9dpdk7lyFvlBcA3doIHIhGbTTHkhMV9h2HhqxSNV8/edit?usp=sharing

In these experiments we ran count(*) on the same table multiple times and saw the result change from run to run.
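For reference, the repeated counts can be reproduced with a loop along these lines (a minimal sketch using psycopg2; the connection parameters are placeholders):

# Minimal repro sketch (assumed connection parameters): run count(*) several
# times with no intervening writes and flag any change between runs.
import psycopg2

conn = psycopg2.connect(host="127.0.0.1", port=5433, dbname="etl2", user="yugabyte")
conn.autocommit = True
cur = conn.cursor()

counts = []
for i in range(10):
    cur.execute("SELECT count(*) FROM accounts")
    counts.append(cur.fetchone()[0])
    print(f"run {i}: {counts[-1]}")

# With no writes in between, every run should return the same value.
assert len(set(counts)) == 1, f"count(*) is unstable: {sorted(set(counts))}"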

The schema of the table -

etl2=# \dS accounts;
                Table "public.accounts"
      Column      |  Type  | Collation | Nullable | Default 
------------------+--------+-----------+----------+---------
 block            | bigint |           | not null | 
 address          | text   |           | not null | 
 dc_balance       | bigint |           | not null | 0
 dc_nonce         | bigint |           | not null | 0
 security_balance | bigint |           | not null | 0
 security_nonce   | bigint |           | not null | 0
 balance          | bigint |           | not null | 0
 nonce            | bigint |           | not null | 0
 staked_balance   | bigint |           |          | 
Indexes:
   "accounts_pkey" PRIMARY KEY, lsm (block HASH, address ASC)
   "account_address_idx" lsm (address HASH)
   "account_block_idx" lsm (block HASH)
Triggers:
   account_insert AFTER INSERT ON accounts FOR EACH ROW EXECUTE PROCEDURE account_inventory_update()

YugabyteDB Cluster information -

The cluster name - pgupta-voyager-test-2 (https://portal.dev.yugabyte.com/universes/e0bd5274-9145-46a2-b27f-109c6524ac2e)
DB Build name - 2.14.2.0-b16
DB Version - 11.2-YB-2.14.2.0-b0
DB name - etl2
Table name - accounts

The ingested data is ~96 GB (700172554 rows) and is available in S3 at s3://yb-voyager-test-data/data/accounts_data.sql.

priyanshi-yb commented Sep 21 '22 13:09

@priyanshi-yb Is this with set yb_disable_transactional_writes=true; or specific gflags? I'll try to reproduce this and https://github.com/yugabyte/yugabyte-db/issues/14113

def- commented Sep 21 '22 17:09

@def- No, yb_disable_transactional_writes was not set, but the data was ingested with these session variables -

SET client_encoding TO 'UTF8';
SET session_replication_role TO replica;
SET yb_enable_upsert_mode to true;
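For illustration, a loader would typically apply these per connection before each COPY; a hypothetical sketch (placeholder file name and connection parameters, not Voyager's actual code):

# Hypothetical loader sketch (not Voyager's implementation): apply the session
# settings above on each connection, then stream a batch with COPY.
import psycopg2

conn = psycopg2.connect(host="127.0.0.1", port=5433, dbname="etl2", user="yugabyte")
cur = conn.cursor()
for stmt in ("SET client_encoding TO 'UTF8'",
             "SET session_replication_role TO replica",
             "SET yb_enable_upsert_mode TO true"):
    cur.execute(stmt)

with open("accounts_batch.tsv") as f:  # placeholder data file
    cur.copy_expert("COPY accounts FROM STDIN", f)
conn.commit()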

priyanshi-yb commented Sep 21 '22 17:09

I see that there were a lot of FATALs on node 2:

[yugabyte@ip ~]$ find /mnt/d0 -name '*FATAL.details*'
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-05T15_46_46.pid28930.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-05T15_46_50.pid18501.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_46_11.pid2149.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_51_46.pid3698.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_49_47.pid43912.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_56_34.pid12027.txt
/mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T11_01_13.pid20756.txt
[yugabyte@ip-172-151-77-199 ~]$ cat /mnt/d0/yb-data/tserver/logs/yb-tserver.FATAL.details.2022-09-21T10_46_11.pid2149.txt
F20220921 10:46:11 ../../src/yb/tablet/operations/operation_driver.cc:399] T f3acef454d564e4d81088e8e114427cf P 849132839a8f472c8df8106357f4987c S RD-P Ts { physical: 1663757111175053 } kChangeMetadata (0x0000000e0d23e240): Apply failed: Timed out (yb/util/operation_counter.cc:166): Timed out waiting for all pending operations to complete. 216 transactions pending. Waited for 60.000s, request: tablet_id: "f3acef454d564e4d81088e8e114427cf" schema { columns { id: 0 name: "block" type { main: INT64 } is_key: true is_hash_key: true is_nullable: false is_static: false is_counter: false sorting_type: 0 order: 1 pg_type_oid: 20 } columns { id: 1 name: "address" type { main: STRING } is_key: true is_nullable: false is_static: false is_counter: false sorting_type: 3 order: 2 pg_type_oid: 25 } columns { id: 2 name: "dc_balance" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 3 pg_type_oid: 20 } columns { id: 3 name: "dc_nonce" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 4 pg_type_oid: 20 } columns { id: 4 name: "security_balance" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 5 pg_type_oid: 20 } columns { id: 5 name: "security_nonce" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 6 pg_type_oid: 20 } columns { id: 6 name: "balance" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 7 pg_type_oid: 20 } columns { id: 7 name: "nonce" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 8 pg_type_oid: 20 } columns { id: 8 name: "staked_balance" type { main: INT64 } is_key: false is_nullable: true is_static: false is_counter: false sorting_type: 0 order: 9 pg_type_oid: 20 } table_properties { contain_counters: false is_transactional: true consistency_level: STRONG use_mangled_column_name: false num_tablets: 48 is_ysql_catalog_table: false retain_delete_markers: false partition_key_version: 0 } colocated_table_id { } pgschema_name: "public" } schema_version: 1 new_table_name: "accounts_350m_upsert_diable" dest_uuid: "849132839a8f472c8df8106357f4987c" propagated_hybrid_time: 6814749127353565184 alter_table_id: "000040000000300080000000000082a2"
    @          0x2a01089  google::LogDestination::LogToSinks()
    @          0x2a0050b  google::LogMessage::SendToLog()
    @          0x2a01853  google::LogMessage::Flush()
    @          0x2a01e6f  google::LogMessageFatal::~LogMessageFatal()
    @          0x34e2c71  yb::tablet::OperationDriver::ReplicationFinished()
    @          0x2dd4135  yb::consensus::ConsensusRound::NotifyReplicationFinished()
    @          0x2e20945  yb::consensus::ReplicaState::ApplyPendingOperationsUnlocked()
    @          0x2e1fc6d  yb::consensus::ReplicaState::AdvanceCommittedOpIdUnlocked()
    @          0x2dff748  yb::consensus::RaftConsensus::UpdateMajorityReplicated()
    @          0x2dcb361  yb::consensus::PeerMessageQueue::NotifyObserversOfMajorityReplOpChangeTask()
    @          0x3a394ee  yb::ThreadPool::DispatchThread()
    @          0x3a34c04  yb::Thread::SuperviseThread()
    @     0x7f06904e3694  start_thread
    @     0x7f06909e541d  __clone

I'm guessing that a tablet server crashing while copying into this table can cause lost rows. There should be an application error in that case; is it maybe being ignored in Voyager? We have a bunch of bugs open for this kind of FATAL: https://github.com/yugabyte/yugabyte-db/search?q=%22Timed+out+waiting+for+all+pending+operations+to+complete%22&type=issues
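If the application swallows that error, a batch can vanish silently. A hedged sketch of the kind of client-side check that would surface it (hypothetical helper, not Voyager's code):

# Hypothetical client-side check: let COPY errors propagate and compare the
# number of rows reported by the server against the number of rows sent.
import io
import psycopg2

def copy_batch(conn, lines):
    """lines: list of tab-separated rows (with trailing newlines) for COPY FROM STDIN."""
    buf = io.StringIO("".join(lines))
    with conn.cursor() as cur:
        cur.copy_expert("COPY accounts FROM STDIN", buf)
        # psycopg2 exposes the server-reported "COPY n" count as cur.rowcount.
        if cur.rowcount != len(lines):
            raise RuntimeError(f"COPY reported {cur.rowcount} rows, sent {len(lines)}")
    conn.commit()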

def- commented Sep 21 '22 18:09

Locally, while trying to get a repro, I always get conflicts as soon as I copy from multiple connections at once: https://gist.github.com/def-/cff997c0d78a76747e8e6af976098411

psycopg2.errors.SerializationFailure: IO error: Errors occurred while reaching out to the tablet servers: . Errors from tablet servers: [Operation failed. Try again (yb/docdb/conflict_resolution.cc:116): 0de48430-ed30-4be3-a34c-20eb49f76e3c Conflicts with higher priority transaction: 64ba20d6-26bb-48a9-9cd4-a7e7b48bc8cf (transaction error 3), Operation failed. Try again (yb/tablet/transaction_participant.cc:209): Transaction was recently aborted: 0de48430-ed30-4be3-a34c-20eb49f76e3c: 40001 (pgsql error 40001) (transaction error 1)]

Rows then go missing: with 2 threads I get about half of the 1000 expected rows when this conflict happens.

Grepping through the postgres logs I am seeing the same conflicts on the universe, for example:

[yugabyte@ip ~]$ grep "Conflicts with higher priority transaction" /mnt/d0/yb-data/tserver/logs/postgresql-2022-09-21_000000.log
2022-09-21 10:56:45.199 UTC [15386] ERROR:  Operation failed. Try again: [Operation failed. Try again (yb/docdb/conflict_resolution.cc:114): ef33d032-62cb-4540-8f52-d9cc32a291b1 Conflicts with higher priority transaction: abb62472-5e38-46cb-9298-626119b2cf72 (transaction error 3)]
2022-09-21 11:01:23.789 UTC [14970] ERROR:  Operation failed. Try again: [Operation failed. Try again (yb/docdb/conflict_resolution.cc:114): 826d7e53-b443-4b3f-8056-2270b4060e96 Conflicts with higher priority transaction: 4af86ee3-8759-4914-ae2d-6e8a7e5a9124 (transaction error 3)]

Are these failures handled correctly, i.e. are the inserts retried?
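For context, a client-side retry of the failed batch on these 40001 errors would look roughly like the sketch below (an illustration with an assumed helper name, not Voyager's actual handling):

# Sketch of retrying a batch on SQLSTATE 40001 (serialization failure), which
# is what the conflict errors above surface as; not Voyager's actual code.
import time
import psycopg2
from psycopg2 import errors

def run_with_retry(conn, fn, max_attempts=5):
    for attempt in range(1, max_attempts + 1):
        try:
            fn(conn)
            conn.commit()
            return
        except errors.SerializationFailure:
            conn.rollback()
            if attempt == max_attempts:
                raise
            time.sleep(0.1 * 2 ** attempt)  # back off before retrying the whole batch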

def- commented Sep 21 '22 18:09

Please turn yb_enable_upsert_mode off for large data loading.

cc @paullee-yb

tedyu commented Sep 21 '22 19:09

I cannot reproduce this issue anymore with https://github.com/yugabyte/yugabyte-db/commit/03fc5792142beee569bdb6a0406e86fcfcbde666, closing as fixed.

def- commented Oct 25 '22 14:10

Activating it for backports.

rthallamko3 commented Oct 25 '22 15:10