Investigation: How to safely remove outcome assets ("garbage collect") over multiple blocks?
Problem
Zeitgeist should provide a fast environment in which outcome assets are created and unused ones are efficiently destroyed.
Substrate currently uses a base-16 modified Merkle Patricia tree ("trie") on top of a key-value database. Each storage read/write, in this case the deletion of an outcome asset, therefore takes O(log(n)), where n is the number of nodes in the trie. This makes storage accesses more expensive than in a plain (non-tree) key-value database.
However, to store the balances of traders in a prediction market, we use a StorageDoubleMap from orml_tokens. To properly delete all balances for a specific currency/asset, we are left with using the storage API.
The current challenge is that the orml_tokens StorageDoubleMap maps from AccountId to CurrencyId, which would require us to iterate over all accounts; that is unsustainable. That's why we tried to cache all accounts which hold a specific currency id. This adds extra storage as well as expensive reads and writes, and it will not scale well in the long run as long as the storage layer stays the same. There is no change in sight that would give us a StorageDoubleMap from CurrencyId to AccountId, because orml_tokens is optimised for querying one specific AccountId rather than one specific CurrencyId.
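For reference, the orml_tokens balance storage is laid out roughly as follows (simplified from the orml_tokens source; treat the exact hashers and field order as an approximation). Because the first key is the account, there is no per-currency prefix we could iterate or clear:
// Simplified sketch of the orml_tokens `Accounts` storage (details omitted).
// The FIRST key is the account id, so prefix iteration/removal only works
// per account, never per currency/outcome asset.
#[pallet::storage]
pub type Accounts<T: Config> = StorageDoubleMap<
    _,
    Blake2_128Concat,
    T::AccountId,   // first key: account
    Twox64Concat,
    T::CurrencyId,  // second key: currency (our outcome asset)
    AccountData<T::Balance>,
    ValueQuery,
>;

// Deleting every balance of ONE currency therefore means visiting ALL entries:
// for (account, currency, _data) in Accounts::<T>::iter() { ... }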
In addition, orml_tokens is really important for Zeitgeist, because it allows us to handle tokens from other chains (like aUSD) in the same way pallet_balances does (with NamedMultiReservableCurrency or MultiLockableCurrency). This needs to be pointed out when we consider using pallet_assets (maintained by Parity), which does not have all the functionality of orml_tokens. pallet_assets offers a destroy function, which is exactly what we need (at first glance). I therefore theoretically evaluated pallet_assets for the outcome assets, but found that an asset with too many accounts cannot be deleted (details below). The need to use orml_tokens for the XCM tokens persists. Should we ever adopt a solution based on another pallet or a custom implementation, we need to decouple the outcome asset logic from orml_tokens and migrate it to that other pallet / custom implementation. I would strongly suggest doing that, because Zeitgeist depends on an efficient way to delete the unused / finished outcome assets. I know that the ideal solution to this problem should take O(1) time, but then we couldn't use Substrate's trie layer and would need to implement or use another storage layer just for the outcome asset information (asset_id -> account -> balance).
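As a thought experiment: if the outcome asset balances were decoupled into their own storage item keyed by the asset first, the whole problem would reduce to a (chunkable) prefix removal. The names OutcomeBalances and BalanceOf below are made up for illustration:
// Hypothetical storage layout for decoupled outcome asset balances. Keying by
// the asset FIRST turns "delete all balances of this asset" into a prefix
// operation that can be spread across blocks with `clear_prefix`.
#[pallet::storage]
pub type OutcomeBalances<T: Config> = StorageDoubleMap<
    _,
    Twox64Concat,
    Asset<T::MarketId>, // first key: outcome asset
    Blake2_128Concat,
    T::AccountId,       // second key: holder
    BalanceOf<T>,       // assumed balance type alias
    ValueQuery,
>;

// Bounded, resumable removal of one resolved outcome asset:
// let results = OutcomeBalances::<T>::clear_prefix(asset_id, limit, maybe_cursor);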
Try with pallet_assets
pallet_assets provides a do_destroy function, which takes a DestroyWitness as argument and returns the resulting DestroyWitness. In the witness one specifies the number of accounts, which acts as the maximum limit for deletions. If the number of accounts in the Asset storage map exceeds the number given as parameter, the extrinsic fails. (The accounts counter of the Asset storage map is increased here.) This means that we can't really use do_destroy, because an asset id could have too many accounts for one block. do_destroy only allows destroying all account information at once, or destroying nothing and returning BadWitness if the witness parameter accounts is too small. If we specify it as u32::MAX and there are u32::MAX accounts for a specific asset id, the blockchain has to do far too much work in one extrinsic.
I created an issue for this here.
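For context, the old all-or-nothing destroy flow of pallet_assets looked roughly like this; the DestroyWitness fields and the do_destroy signature are taken from memory of the pre-#12310 API and should be double-checked against the actual source:
// Sketch of the old `pallet_assets` destroy flow (exact signature and
// visibility of `do_destroy` are an assumption).
use pallet_assets::DestroyWitness;

let witness = DestroyWitness {
    accounts: u32::MAX,    // upper bound on accounts that may be removed
    sufficients: u32::MAX,
    approvals: u32::MAX,
};
// If the asset has more accounts than the witness states, this fails with
// `BadWitness`; otherwise EVERY account is removed within this single call,
// which is far too much work for one block when the asset is widely held.
let _resulting_witness = pallet_assets::Pallet::<T>::do_destroy(asset_id, witness, None)?;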
However, one way or another, we need to iterate over and remove a map structure in Substrate. To remove items from a StorageDoubleMap, one can use drain_prefix or clear_prefix. drain_prefix is the right choice if we'd like to look at each element before we remove it; clear_prefix just makes sure the elements get removed, without inspecting every deleted element.
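A minimal side-by-side of the two, assuming (as in the sketches below) a double map whose first key is the asset id and whose second key is the account:
// `drain_prefix` yields every removed (second_key, value) pair, so each
// balance can be inspected (e.g. to emit an event) while it is deleted:
for (account, balance) in <Accounts<T>>::drain_prefix(asset_id) {
    // handle (account, balance) before it disappears
}

// `clear_prefix` removes up to `limit` entries without returning them and
// hands back a cursor, so the remainder can be cleared in a later block:
let results = <Accounts<T>>::clear_prefix(asset_id, limit, maybe_cursor);
let next_cursor = results.maybe_cursor; // `None` once the prefix is empty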
Because the number of accounts that hold balances for a specific outcome asset is unbounded, while a blockchain is a constrained environment, we need to limit the number of deletions per block. This leads us to distribute the required effort over multiple blocks.
First Solution Approach
The clear_prefix function serves us very well in that it tells us when to stop the deletion process for a specific first_key (in our case the outcome asset id).
Once the resultant maybe_cursor field is None, then no further items remain to be deleted.
This also requires us to block any further item insertion for this particular asset id. For example: no owner of an outcome asset from a resolved market should be able to trade or mint the related outcome assets for this market.
NOTE: After the initial call for any given map, it is important that no further items are inserted into the map which match the first_key. If so, then the map may not be empty when the resultant maybe_cursor is None.
Then there is the cursor function parameter, which points to the location where removal should continue.
None should only be passed once (in the initial call) for any given storage map and first_key. Subsequent calls operating on the same map/first_key should always pass Some, and this should be equal to the previous call result’s maybe_cursor field.
Example usage of clear_prefix:
#[pallet::storage]
pub type GarbageAssets<T: Config> = StorageValue<
    _,
    BoundedVec<AssetId, T::GarbageAssetsLimit>,
    ValueQuery,
>;

/// The cursors required to delete outcome assets over multiple blocks.
#[pallet::storage]
pub type GarbageAssetCursors<T: Config> = StorageMap<
    _,
    Blake2_128Concat,
    AssetId,
    Vec<u8>,
    OptionQuery,
>;

...

let mut garbage_assets = <GarbageAssets<T>>::get();
if garbage_assets.is_empty() {
    return;
}
// use a config constant here instead of 500
let mut iteration_limit: u32 = 500;
let mut cleared_indices: Vec<usize> = Vec::new();
let mut index_counter = 0usize;
for asset_id in garbage_assets.iter() {
    let stored_cursor = <GarbageAssetCursors<T>>::get(asset_id);
    let maybe_cursor: Option<&[u8]> = stored_cursor.as_deref();
    // Assumes a double map whose first key is the outcome asset id.
    let removal_results: MultiRemovalResults =
        <Accounts<T>>::clear_prefix(asset_id, iteration_limit, maybe_cursor);
    iteration_limit = iteration_limit.saturating_sub(removal_results.loops);
    if let Some(cursor) = removal_results.maybe_cursor {
        // Not done yet: remember where to continue in a later block.
        <GarbageAssetCursors<T>>::insert(asset_id, cursor);
    } else {
        // This asset is fully cleared.
        <GarbageAssetCursors<T>>::remove(asset_id);
        cleared_indices.push(index_counter);
    }
    if iteration_limit == 0 {
        break;
    }
    index_counter = index_counter.saturating_add(1);
}
// Remove fully cleared assets in reverse order so earlier removals don't
// shift the indices of later ones.
for i in cleared_indices.into_iter().rev() {
    if i < garbage_assets.len() {
        garbage_assets.remove(i);
    } else {
        log::error!("The index {} was not found in the garbage_assets vector. This should never happen.", i);
    }
}
<GarbageAssets<T>>::put(garbage_assets);
The above is a possible solution for the "garbage collection" over multiple blocks. In that case we need storage items to save the GarbageAssets and the GarbageAssetCursors.
Approach to query the remaining block weight
Ideally we'd like to "garbage collect" resolved outcome assets whenever there is space left in a block, because that approach wouldn't block the usage of the chain just for the garbage collection. One could think about querying the block weight in on_finalize. Alternatively, we could delete an equal portion of assets in each block in on_initialize. But the best solution overall is to use the on_idle hook, which receives the remaining_weight and is called before the on_finalize hook. Keep in mind that on_idle and on_finalize belong to the mandatory dispatch class, which allows an "unlimited" block weight. Therefore we are responsible for correctly calculating the weight we'd like to spend on the garbage collection.
First Approach to calculate the remaining weight:
let dispatch_class = frame_support::weights::DispatchClass::Operational;
let consumed_weight = *<frame_system::Pallet<T>>::block_weight().get(dispatch_class);
let weight_left: Weight = match T::BlockWeights::get().get(dispatch_class).max_total {
    Some(max_weight) => max_weight.saturating_sub(consumed_weight),
    // There is no `max_total` limit (`None`), so don't spend any weight
    // on garbage collection in this case.
    _ => 0u64,
};
Best approach (use the pallet hook on_idle):
fn on_idle(n: BlockNumber, remaining_weight: Weight) -> Weight {
    let (iterations_number, cleared_assets_number) = do_garbage_collection(remaining_weight);
    T::WeightInfo::do_garbage_collection(iterations_number, cleared_assets_number)
}
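A possible body for do_garbage_collection, just to illustrate how remaining_weight could be translated into an iteration budget. This assumes the two-dimensional Weight type; the per-item cost below is a placeholder, and a real implementation would use a benchmarked weight function:
// Hypothetical sketch: derive a `clear_prefix` limit from the weight that is
// still available in this block, then run the cursor-based loop from above.
fn do_garbage_collection(remaining_weight: Weight) -> (u32, u32) {
    // Fixed bookkeeping overhead: reading/writing the garbage list and cursors.
    let overhead = T::DbWeight::get().reads_writes(2, 2);
    if remaining_weight.any_lt(overhead) {
        return (0, 0); // not even enough weight for the bookkeeping
    }
    let budget = remaining_weight.saturating_sub(overhead);
    // Assumed cost of removing one (asset, account) entry: one storage write.
    let per_removal = T::DbWeight::get().writes(1);
    let iteration_limit = budget
        .ref_time()
        .checked_div(per_removal.ref_time())
        .unwrap_or(0)
        .min(u32::MAX as u64) as u32;
    // ... run the `clear_prefix` loop from the first approach with
    // `iteration_limit` and count iterations and fully cleared assets ...
    (iteration_limit, 0)
}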
Extrinsic approach
To avoid the additional storage information of cursors and garbage assets, Zeitgeist could provide an extrinsic which takes the garbage asset id as a parameter. The extrinsic could then check whether the specified asset id belongs to a resolved market (and therefore counts as a garbage asset). In that case, the information about the garbage assets is not required on-chain (same with the cursors). Zeitgeist could call this extrinsic whenever there is enough weight left in a block. It could be a DispatchClass::Operational extrinsic to allow more block weight, and it could charge no fee in the extrinsic header (Pays::No).
#[pallet::weight((5000, DispatchClass::Operational, Pays::No))]
#[transactional]
pub fn do_garbage_collection(
    origin: OriginFor<T>,
    market_id: T::MarketId,
    asset_id: Asset<T::MarketId>,
) -> DispatchResult {
    ensure_root(origin)?;
    // TODO: check if corresponding market assets are ready for garbage collection
    let mut garbage_left = false;
    let mut drain_counter = 0u32;
    for _ in <Accounts<T>>::drain_prefix(asset_id) {
        drain_counter += 1;
        if drain_counter >= T::RemoveKeysLimit::get() {
            garbage_left = true;
            break;
        }
    }
    if garbage_left {
        // emit event, that items are still present to get garbage collected
    } else {
        // emit event, that all items got removed
    }
    Ok(())
}
We are currently removing the deletion of outcome assets in this PR: https://github.com/zeitgeistpm/zeitgeist/pull/806.
@sea212 gave a great explanation of how to remove the orml_tokens outcome assets in a storage migration:
The market will be deleted completely out of storage later and we won't know which outcome assets to delete in the future. I think this is fine though, since we can deduce which markets got deleted (we got incremental market ids and some won't return a market -> deleted markets) and then iterate through all accounts in a (potentially multi-block) migration and delete outcome assets where the market id belongs to one of the deleted markets.
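A rough sketch of that migration idea; the helper is_outcome_of_deleted_market and the direct access to orml_tokens::Accounts are assumptions for illustration:
// One step of a (potentially multi-block) migration: walk the whole
// `orml_tokens::Accounts` double map in bounded chunks and remove every entry
// whose currency is an outcome asset of a market id that no longer exists.
// Returns the raw key to resume from, or `None` once the map has been visited.
fn migrate_step<T: orml_tokens::Config>(limit: u32, start_key: Option<Vec<u8>>) -> Option<Vec<u8>> {
    let mut iter = match start_key {
        Some(key) => orml_tokens::Accounts::<T>::iter_from(key),
        None => orml_tokens::Accounts::<T>::iter(),
    };
    for _ in 0..limit {
        match iter.next() {
            Some((account, currency, _data)) => {
                // Hypothetical helper: does `currency` belong to a deleted market?
                if is_outcome_of_deleted_market::<T>(&currency) {
                    orml_tokens::Accounts::<T>::remove(&account, &currency);
                }
            }
            // The whole map has been visited: the migration is complete.
            None => return None,
        }
    }
    // Entries remain; persist this raw key and continue in the next block.
    Some(iter.last_raw_key().to_vec())
}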
pallet_assets improvement https://github.com/paritytech/substrate/pull/12310
Some examples of multi-block execution:
- Runtime Task Executor: https://github.com/paritytech/substrate/pull/8197
- fast-unstake pallet: https://github.com/paritytech/substrate/tree/master/frame/fast-unstake
- pallet-contracts: https://github.com/paritytech/substrate/blob/master/frame/contracts/src/lib.rs
I propose we design a multi-block execution pallet that provides an interface other pallets can use to schedule multi-block executions. The calling pallet provides the function (which must take a weight budget as a parameter and return whether it completed) and the multi-block execution pallet keeps track of all the registrations and schedules them during on_idle.
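One possible shape for such an interface; all names here are placeholders rather than an existing API:
// Each participating pallet implements a task that works within a weight
// budget and reports whether it is finished.
pub trait MultiBlockTask {
    /// Do one chunk of work within `budget`.
    /// Returns the weight actually used and `true` once the task is complete.
    fn step(budget: Weight) -> (Weight, bool);
}

// Sketch of the driver inside the scheduling pallet, assuming the registered
// tasks are combined into a tuple type `T::Tasks: MultiBlockTask` (similar to
// how FRAME combines `OnRuntimeUpgrade` implementations):
fn on_idle(_n: BlockNumberFor<T>, remaining_weight: Weight) -> Weight {
    let (used, _all_finished) = T::Tasks::step(remaining_weight);
    used
}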
pallet_assets does not currently have exactly the same functionality as we get from orml_tokens. Look at the current API of pallet_assets: https://docs.rs/pallet-assets/latest/pallet_assets/index.html
There you can see that we are missing some kind of reserve functionality to deposit a bond. I added an issue here on the Substrate repository.
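Concretely, the bond handling Zeitgeist relies on goes through the orml_traits reservable-currency traits, which pallet_assets has no equivalent for. A minimal sketch with MultiReservableCurrency; T::MultiCurrency, currency_id, who and bond are placeholders:
// Reserving and releasing a bond via orml_traits, as implemented by orml_tokens.
use orml_traits::MultiReservableCurrency;

// Lock `bond` of `currency_id` from `who`'s free balance; fails if it is too low.
T::MultiCurrency::reserve(currency_id, &who, bond)?;

// Later (e.g. on market resolution), release the bond again; the return value
// is the amount that could not be unreserved.
let remainder = T::MultiCurrency::unreserve(currency_id, &who, bond);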
My current plan: we wait and deal with the problem when we need to scale (my preferred option) and potentially get the functionality added to the pallet_assets module along the way. If it is really required to clean the cache of dead assets, then we can manually do a storage migration, or we implement additional caching like Vivek did in his PR.
The market will be deleted completely out of storage later and we won't know which outcome assets to delete in the future. I think this is fine though, since we can deduce which markets got deleted (we got incremental market ids and some won't return a market -> deleted markets) and then iterate through all accounts in a (potentially multi-block) migration and delete outcome assets where the market id belongs to one of the deleted markets.
Solved in https://github.com/zeitgeistpm/zeitgeist/pull/1201