das: Parallelise `catchUp` method
Supplements https://github.com/celestiaorg/celestia-node/pull/473 by additionally parallelising the DASing of past headers. See the TODOs mentioned there.

Now that DASState is almost implemented, we should decide how errors from parallel catchUp workers are stored when this issue is tackled.
#870 introduces a lock into CacheAvailability, which can create performance-degrading lock contention; we should keep this in mind while implementing this issue.
Copy of the comment on the PR:

> By contention point TODO I meant specifically this place. Basically, the routine hitting the autobatching threshold will lock all the other writers in the future parallelized DASer until the batch is synced to disk. This fundamentally degrades the parallelization of the DASer. The solution I know of is to decouple reads and writes, as in header.Store.
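To illustrate the decoupling the comment describes, here is a hedged sketch of the header.Store-style pattern: writers append results to an in-memory batch under a short-lived lock, and a single background goroutine performs the slow, synced write. All names here (`resultStore`, `flushThreshold`, etc.) are hypothetical and only demonstrate the pattern, not the actual celestia-node API.

```go
package main

import (
	"fmt"
	"sync"
)

// resultStore accepts sampling results without blocking callers on disk I/O:
// writers append to an in-memory pending batch, while a single background
// goroutine flushes batches once they reach flushThreshold.
type resultStore struct {
	mu      sync.Mutex
	pending []uint64 // sampled heights awaiting flush

	flushCh chan []uint64 // batches handed off to the flusher
	done    chan struct{}

	flushThreshold int
	flushed        [][]uint64 // stands in for the on-disk state
}

func newResultStore(threshold int) *resultStore {
	rs := &resultStore{
		flushCh:        make(chan []uint64, 8),
		done:           make(chan struct{}),
		flushThreshold: threshold,
	}
	go rs.flushLoop()
	return rs
}

// Put records a sampled height. It only holds the in-memory lock briefly,
// so concurrent workers never wait on a disk sync.
func (rs *resultStore) Put(height uint64) {
	rs.mu.Lock()
	rs.pending = append(rs.pending, height)
	if len(rs.pending) >= rs.flushThreshold {
		batch := rs.pending
		rs.pending = nil
		rs.mu.Unlock()
		rs.flushCh <- batch // hand off; the flusher does the slow write
		return
	}
	rs.mu.Unlock()
}

func (rs *resultStore) flushLoop() {
	for batch := range rs.flushCh {
		// The slow, synced write happens here, off the writers' hot path.
		rs.flushed = append(rs.flushed, batch)
	}
	close(rs.done)
}

// Close flushes any remainder and stops the background goroutine.
// Callers must ensure no Put is in flight before calling Close.
func (rs *resultStore) Close() {
	rs.mu.Lock()
	if len(rs.pending) > 0 {
		rs.flushCh <- rs.pending
		rs.pending = nil
	}
	rs.mu.Unlock()
	close(rs.flushCh)
	<-rs.done
}

func main() {
	rs := newResultStore(4)
	var wg sync.WaitGroup
	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func(base uint64) {
			defer wg.Done()
			for h := base; h < base+10; h++ {
				rs.Put(h)
			}
		}(uint64(w * 10))
	}
	wg.Wait()
	rs.Close()

	total := 0
	for _, b := range rs.flushed {
		total += len(b)
	}
	fmt.Println("flushed heights:", total)
}
```

With this shape, contention is limited to the in-memory append; only the flusher goroutine ever touches the (simulated) disk.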
Idea for parallelisation — DASState will contain:
1. sampleState (state of the sampling routine)
2. an array of worker catchUp routine states
```go
type State struct {
	sampleStateLk sync.Mutex
	sampleState   RoutineState

	workerStates []WorkerState
}

type WorkerState struct {
	ID                uint64
	From, To          uint64
	LastSampledHeight uint64
	Err               SampleErr
}
```
Ref #848
We should revisit https://github.com/celestiaorg/celestia-node/issues/504 and close it if, after parallelisation is implemented, the issue is no longer reproducible.