Birch icon indicating copy to clipboard operation
Birch copied to clipboard

Persistent array implementation for Array and RaggedArray

Open lawmurray opened this issue 3 years ago • 0 comments

Recursive data structures have a special place in Birch: they allow sharing of objects between particles across resampling epochs (Murray, 2020). For N particles and T steps, this enables O(T + N log N) (Jacob, Murray & Rubenthaler, 2015) or even O(T + N) (Koskela, Jenkins, Johansen & Spanò, 2020) memory use. Singly-linked containers such as Stack and Tape are particularly good in this regard. Dense arrays, on the other hand, do not enable object sharing. If dense arrays are used to store state history across resampling epochs, only O(TN) memory use is achieved. Dense arrays are represented in Birch using e.g. Real[_].

Two of the standard containers, Array and RaggedArray are implemented using a dense array under the hood. The aim is to replace them with a persistent array implementation instead. We will then have a consistent usage pattern where the containers provided by the standard library are all recursive data structures and provide efficient memory use across reampling epochs, while dense arrays are typically used only for numerical vectors and matrices. No more design paralysis between Array<MyClass> and MyClass[_]!

One possible way of implementing RaggedArray<Type> would be to use, essentially, Array<Array<Type>>.

References

  • L.M. Murray (2020). Lazy object copy as a platform for population-based probabilistic programming. [arxiv]

  • P.E. Jacob, L.M. Murray and S. Rubenthaler (2015). Path Storage in the Particle Filter. Statistics and Computing. 25(2):487--496. [doi] [arXiv]

  • J Koskela, P A Jenkins, A M Johansen, and D Spanò. Asymptotic genealogies of interacting particle systems with an application to sequential Monte Carlo. Annals of Statistics 48(1):560-583, 2020 [AoS] [arXiv]

lawmurray avatar Feb 15 '21 06:02 lawmurray