
Add big 1-D array with fixed-width elements

Open alnoki opened this issue 2 years ago • 4 comments

Per recent discussion with @hariria, @lightmark, and @davidiw, tagging @wrwg and @zekun000

Motivations

For fixed-size elements, e.g. an inner node in a critical-bit binary search tree as below, element lookup is O(1) in C, for example, based on simple pointer arithmetic using the size of an individual element:

    /// Inner node
    struct InnerNode has store {
        /// Critical bit position. Bit numbers 0-indexed from LSB:
        ///
        /// ```
        /// >    11101...1010010101
        /// >     bit 5 = 0 -|    |- bit 0 = 1
        /// ```
        critical_bit: u8,
        /// Parent node vector index. `ROOT` when node is root,
        /// otherwise corresponds to vector index of an inner node.
        parent_index: u64,
        /// Left child node index. When bit 63 is set, left child is an
        /// outer node. Otherwise left child is an inner node.
        left_child_index: u64,
        /// Right child node index. When bit 63 is set, right child is
        /// an outer node. Otherwise right child is an inner node.
        right_child_index: u64
    }

Here, each InnerNode is (8 + 64 + 64 + 64) / 8 = 25 bytes wide, such that accessing the nth InnerNode in a C-style array of inner nodes simply requires accessing the data at n * 25 bytes from the start of the array. In C, with the array loaded into memory, this is an efficient lookup.
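As a minimal sketch of that lookup, consider the following C analogue. The packed InnerNode struct here is a hypothetical mirror of the Move struct above, and the __attribute__((packed)) extension (GCC/Clang) is assumed in order to force the exact 25-byte width, since compilers otherwise pad for alignment:

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical packed C mirror of the Move InnerNode above:
     * 1 + 8 + 8 + 8 = 25 bytes per element, no padding. */
    typedef struct __attribute__((packed)) {
        uint8_t  critical_bit;
        uint64_t parent_index;
        uint64_t left_child_index;
        uint64_t right_child_index;
    } InnerNode;

    /* O(1) lookup: the nth element lives at base + n * sizeof(InnerNode),
     * i.e. base + n * 25 bytes. */
    const InnerNode *nth_inner_node(const InnerNode *nodes, size_t n) {
        return nodes + n;
    }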

Aptos data storage limiting factors

Aptos implements the vector-style array in storage as a blob, which requires serializing/deserializing the entire blob in order to access any element within. Moreover, the vector data type has no notion of fixed-width elements, so pointer arithmetic-style lookup is not possible: accessing the nth element requires searching from the beginning of the blob, an O(n) operation. Additionally, when a vector exceeds the 4K chunk size it is originally allocated, another 4K blob must be allocated, and accessing an element in the second blob requires traversing every element in the first blob. While specific documentation on this schema is not available for reference, it is conveyed here per the discussions above, and corrections/future plans are welcome.
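For illustration only, here is a C sketch of why variable-width serialization forces an O(n) scan; the length-prefixed layout is an assumption for the sketch, not a description of the actual on-chain serde format:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* With a length prefix per element (assumed layout), the byte offset
     * of element n depends on the lengths of all n preceding elements,
     * so locating it requires walking the blob from the start: O(n). */
    size_t offset_of_nth(const uint8_t *blob, size_t n) {
        size_t off = 0;
        for (size_t i = 0; i < n; i++) {
            uint32_t len;
            memcpy(&len, blob + off, sizeof len); /* read length prefix */
            off += sizeof len + len;              /* skip prefix + payload */
        }
        return off;
    }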

Here, the computational cost of access is O(n): looking up the 1000th element in a vector requires iterating over the previous 999 elements, and this direct computational cost introduces two potential issues: denial-of-service attacks and/or inconsistent gas fees. If vector reads/writes are billed the same regardless of underlying vector length, then a malicious actor simply has to keep appending to the end of a vector in a loop, instigating an I/O-style attack that degrades processing throughput. Conversely, if vector operation fees are assessed based on the underlying vector size, then users who want to add an element at the end of a protocol's registry, for instance, will pay progressively more as the registry grows.

Suggestions

If the 4K blob size is to be retained, one suggestion is to split the 1-D array into separate blobs, then perform pointer arithmetic-style O(1) lookup within each blob, which the fixed-size element assumption makes possible. This is similar to an aptos_std::big_vector::BigVector, but with O(1) lookup inside each blob. It is suggested that this be handled as a native function for improved efficiency.
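To make the bucketed indexing concrete, here is a hedged C sketch; the BigArray type and BUCKET_BYTES constant are hypothetical, and an actual implementation would presumably live behind a Move native rather than in contract code:

    #include <stddef.h>
    #include <stdint.h>

    #define BUCKET_BYTES 4096 /* assumed per-blob size, per discussion above */

    /* Hypothetical bucketed 1-D array of fixed-width elements, where each
     * bucket models one 4K storage blob. */
    typedef struct {
        uint8_t **buckets;       /* one pointer per allocated blob */
        size_t    element_width; /* fixed element width, in bytes */
        size_t    per_bucket;    /* BUCKET_BYTES / element_width */
    } BigArray;

    /* O(1) lookup: one division selects the blob, one multiplication gives
     * the byte offset within it; no element is traversed along the way. */
    uint8_t *big_array_borrow(const BigArray *a, size_t i) {
        size_t bucket = i / a->per_bucket;
        size_t offset = (i % a->per_bucket) * a->element_width;
        return a->buckets[bucket] + offset;
    }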

It would additionally be appropriate to amortize the cost of creating a new blob over every vector write, such that the following operations cost the same: a push_back() which fills the last slot in a bucket, and a push_back() which forces the allocation of a new bucket. This behavior could also be offered as an option in aptos_std::big_vector::BigVector, e.g. amortize_new_bucket_cost_over_push_backs: bool.
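As a rough sketch of the proposed amortization (names and numbers assumed, matching the figures in the follow-up comment below): with k elements per bucket, a base append cost c, and a bucket-allocation cost c_alloc, every push_back() could be charged a flat fee of roughly c + c_alloc / k:

    #include <stdint.h>

    /* Hypothetical flat per-append fee under amortization. With the
     * figures from the comment below (k = 64, c = 10, c_alloc = 640),
     * this yields 10 + 640 / 64 = 20 gas units per push_back(), whether
     * or not that particular append allocates a new bucket. */
    uint64_t amortized_push_back_fee(uint64_t c, uint64_t c_alloc, uint64_t k) {
        return c + c_alloc / k;
    }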

alnoki avatar Aug 31 '22 18:08 alnoki

> pointer arithmetic-style lookup is not possible: to access the nth element requires searching from the beginning of the blob, an O(n) operation.

I think there is some misunderstanding: vector definitely supports O(1) point lookup via the native public fun borrow<Element>(v: &vector<Element>, i: u64): &Element.

What we discussed is just about serde, which seems to make a BST in a vector less attractive considering the storage overhead.

lightmark avatar Aug 31 '22 21:08 lightmark

@lightmark Thanks for the clarification. Per our recent whiteboard discussion, all vectors are stored as a single blob in the global storage Merkle tree, so the 4K figure is only a recommendation based on typical filesystem parameters: if an SSD by default reads/writes 4K sectors, then enforcing a 4K bucket size should streamline I/O, while using 8K may not provide any meaningful difference in overall cost. Lookup within a vector, once it is loaded into memory, is still O(1), so the optimization is just a question of reading into and writing from memory.

One thing that is still unclear, though, is the exact relative cost of different operations. For example, with an element width of 64 bytes, and thus 64 elements per bucket, if it only costs 10 gas units to append an element in the general case, but 640 units to append in the case of allocating a new bucket, then the user who allocates a new bucket pays 64x. If the cost were amortized, however, the general case would then cost 20 units and fees would be uniform across operations.

@lightmark you mentioned that another engineer may be able to comment on this fee relativity/amortization issue - is there any documentation on the relevant gas schedule or the mechanisms in place? For example, the cost tradeoffs between optimizing for CPU operations versus hard drive I/O?

alnoki avatar Aug 31 '22 22:08 alnoki

@runtian-zhou

alnoki avatar Sep 01 '22 18:09 alnoki

This issue is stale because it has been open 45 days with no activity. Remove the stale label or comment - otherwise this will be closed in 15 days.

github-actions[bot] avatar Nov 07 '22 03:11 github-actions[bot]

This issue is stale because it has been open 45 days with no activity. Remove the stale label or comment - otherwise this will be closed in 15 days.

github-actions[bot] avatar Dec 24 '22 01:12 github-actions[bot]