ytsaurus
Data throughput statistics in master transactions
Master transactions often cover data uploads, sometimes distributed ones.
It is often desirable to know how fast a new version of the data is being uploaded. Currently, we can only see that an exclusive lock is held and, through monitoring, the total user write throughput, which is not broken down by transaction.
Proposal: introduce transient per-transaction counters, similar to the dyntable performance counters. For example, the counter state could be an EMA (exponential moving average). On each chunk completion, propagate an update (e.g., update(chunk.dataWeight)) up the chain of transactions, so that every ancestor transaction also reflects the throughput of its nested transactions.
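
To make the idea concrete, here is a minimal sketch in Python of how such a counter could behave. It is an illustration only, not YTsaurus code: the names `EmaRate`, `Transaction`, `on_chunk_completed`, and the 60-second window are all hypothetical, and the time-decayed form of the EMA is one possible choice among several.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EmaRate:
    """Time-decayed EMA of throughput, in bytes per second (hypothetical)."""
    window_seconds: float = 60.0  # assumed smoothing window
    rate: float = 0.0
    last_update: Optional[float] = None

    def _decay_factor(self, now: float) -> float:
        # Exponential decay for the time elapsed since the last update.
        if self.last_update is None:
            return 1.0
        elapsed = max(0.0, now - self.last_update)
        return math.exp(-elapsed / self.window_seconds)

    def update(self, data_weight: int, now: float) -> None:
        # Decay the old estimate, then add the new chunk's contribution.
        self.rate = self.rate * self._decay_factor(now) + data_weight / self.window_seconds
        self.last_update = now

    def get(self, now: float) -> float:
        # Read the current estimate without mutating the state.
        return self.rate * self._decay_factor(now)


@dataclass
class Transaction:
    """Stand-in for a master transaction; only the parent link matters here."""
    id: str
    parent: Optional["Transaction"] = None
    # Transient state: kept in memory only, not persisted.
    write_rate: EmaRate = field(default_factory=EmaRate)

    def on_chunk_completed(self, data_weight: int, now: float) -> None:
        # Propagate the update up the chain of transactions.
        tx: Optional[Transaction] = self
        while tx is not None:
            tx.write_rate.update(data_weight, now)
            tx = tx.parent


root = Transaction("root")
child = Transaction("child", parent=root)
sibling = Transaction("sibling", parent=root)

child.on_chunk_completed(600, now=0.0)
sibling.on_chunk_completed(1200, now=0.0)
```

In this sketch the root transaction aggregates the writes of both nested transactions, while each nested transaction only sees its own. With no further chunks, the estimate decays toward zero, so a stalled upload becomes visible without any explicit reset.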
Doing this for write throughput is the main goal, but it could in theory also be useful for read throughput (though that may be harder, since data can be read in various ways that do not involve transactions, e.g. through CHYT or various caches).