Refactor BlockRemover Component
There seems to be a lot of similarities with how BlockRemover & BlockWaiter operate and are used by GetCollections & RemoveCollections. Because of this there seems to be some items that can be refactored and shared.
- Combine
BlockErrorTypeinto one shared type
// BlockWaiter
pub enum BlockErrorType {
BlockNotFound,
BatchTimeout,
BatchError,
}
// BlockRemover
pub enum BlockErrorType {
Timeout,
Failed,
}
Proposed Shared BlockErrorType
pub enum BlockErrorType {
BlockNotFound,
BatchTimeout, // maps to block_remover::BlockErrorType::Timeout
BatchError, // maps to block_remover::BlockErrorType::Failed
}
- Change return type of
BlockRemoverto be the same asBlockWaiter
We should return an error per collection that is attempted to be removed, instead of one error for the batch of collections.
// BlockRemover
pub struct BlockRemoverError {
pub ids: Vec<CertificateDigest>,
pub error: BlockErrorType,
}
// BlockWaiter
pub struct BlockError {
pub id: CertificateDigest,
pub error: BlockErrorType,
}
- Change
tx_remove_block/rx_remove_blockin the diagram below to a oneshot channel instead of a mpsc channel.
We only send one list of collections in to the gRPC endpoint and then in turn to the BlockRemover. Oneshot channel will suffice here. Similar to how BlockWaiter does it

/cc @akichidis
I'll start from the end to the top on the above points:
- Change tx_remove_block/rx_remove_block in the diagram below to a oneshot channel instead of a mpsc channel.
Yes, we should absolutely do that
- Change return type of BlockRemover to be the same as BlockWaiter We should return an error per collection that is attempted to be removed, instead of one error for the batch of collections.
We can definitely do that, but we might want to see how far we want to go with this. Basically, if we want to do it just on the API level (meaning instead of sending one message back , just send N ones) then it should be easy. Otherwise, we might need to perform a deeper refactoring if we want to ensure that block removal is treated in a non batch fashion, but rather independently. block_remover is working primarily in a batched fashion in various places (e.x when we issue delete requests to delete the payload from the workers we do it for multiple certificates and are multiplexed) and an error there will propagate to the top without allowing continue with deletion on single entities. That being said, some light re-design will be needed to achieve this properly
- Combine BlockErrorType into one shared type
pub enum BlockErrorType { BlockNotFound, BatchTimeout, // maps to block_remover::BlockErrorType::Timeout BatchError, // maps to block_remover::BlockErrorType::Failed }
I am a bit skeptical with this in a sense that as those components might evolve in the future, other errors/necessities might be surfaced making it then harder to evolve our gRPC api since we have coupled the responses of those endpoints. That being said, if we go forward with aligning the errors, I would first reduce them to a smaller more abstract subset and then do the unification. For example, the BatchTimeout for the get_collections could be a generic now Timeout. So would look for a proto struct like:
pub enum BlockErrorType {
NotFound,
Timeout,
Error
}
@arun-koshy @huitseeker thoughts?