mithril
Stabilize snapshot archive production in aggregator
Issue
As seen in #1137, snapshot archive production is unstable and occasionally produces corrupted files. To fix this, we could capture a copy of the evolving files (volatile, ledger, and latest immutable files) in a temporary folder before creating the archive. Building the archive from this copy would make it much more likely to be valid. The temporary folder would be wiped after the archive is created.
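The capture step described above could look roughly like the following sketch. It is not the aggregator's actual code: the function names (`copy_dir_recursive`, `capture_evolving_files`, `with_captured_files`) are hypothetical, and it assumes the usual Cardano node database layout (`volatile/`, `ledger/`, and `immutable/NNNNN.{chunk,primary,secondary}`). It uses only the standard library.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively copy `src` into `dst`, creating `dst` if needed.
fn copy_dir_recursive(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            copy_dir_recursive(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}

/// Hypothetical helper: copy the evolving files (volatile, ledger, and the
/// files of the latest immutable number) from `db_dir` into `staging`.
/// The directory and file naming is an assumption about the node db layout.
fn capture_evolving_files(db_dir: &Path, staging: &Path, latest_immutable: u64) -> io::Result<()> {
    for dir in ["volatile", "ledger"] {
        let src = db_dir.join(dir);
        if src.is_dir() {
            copy_dir_recursive(&src, &staging.join(dir))?;
        }
    }
    let immutable_src = db_dir.join("immutable");
    let immutable_dst = staging.join("immutable");
    fs::create_dir_all(&immutable_dst)?;
    for ext in ["chunk", "primary", "secondary"] {
        let name = format!("{:05}.{}", latest_immutable, ext);
        let src = immutable_src.join(&name);
        if src.is_file() {
            fs::copy(&src, immutable_dst.join(&name))?;
        }
    }
    Ok(())
}

/// Capture the evolving files, run `archive` against the captured copy, then
/// wipe the staging folder whether or not archiving succeeded.
fn with_captured_files<T>(
    db_dir: &Path,
    staging: &Path,
    latest_immutable: u64,
    archive: impl FnOnce(&Path) -> io::Result<T>,
) -> io::Result<T> {
    let result = capture_evolving_files(db_dir, staging, latest_immutable)
        .and_then(|_| archive(staging));
    // Best-effort cleanup: a cleanup failure should not mask the archive result.
    let _ = fs::remove_dir_all(staging);
    result
}
```

A caller would pass the archive creation as the closure, for example `with_captured_files(&db, &tmp, 42, |dir| build_archive(dir))`, where `build_archive` stands in for whatever produces the archive. Note that the copy itself is not atomic: it shortens the window in which the node can modify files under the archiver, but does not close it entirely.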
To do
- [ ] Implement the capture of evolving files prior to starting the archive production
- [ ] Evaluate if this is enough or if we need to stop the Cardano node when doing this operation
- [ ] Assess the impact of IO/CPU performance on how often this problem occurs
- [ ] Add a metric to the Prometheus endpoint to monitor the number of failed archive creations (see #1096)
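
For the last item, a failure counter could be as small as the sketch below. The type `ArchiveFailureCounter` and the metric name are hypothetical, and the real endpoint may well rely on an existing metrics crate. This version uses only the standard library and renders the counter in the Prometheus text exposition format.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical metric name; the name actually used by the aggregator may differ.
const METRIC_NAME: &str = "snapshot_archive_creation_failures_total";

/// Thread-safe counter of failed snapshot archive creations.
pub struct ArchiveFailureCounter(AtomicU64);

impl ArchiveFailureCounter {
    pub const fn new() -> Self {
        Self(AtomicU64::new(0))
    }

    /// Call this each time an archive creation fails.
    pub fn record_failure(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }

    /// Render the counter in the Prometheus text exposition format.
    pub fn render(&self) -> String {
        format!(
            "# HELP {0} Number of failed snapshot archive creations.\n# TYPE {0} counter\n{0} {1}\n",
            METRIC_NAME,
            self.get()
        )
    }
}
```

The string returned by `render()` would be appended to the body served by the metrics endpoint, and `record_failure()` would be called from the error path of archive production.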
Related issue
#1160