kotlinx.collections.immutable icon indicating copy to clipboard operation
kotlinx.collections.immutable copied to clipboard

Document garbage collection implications/guarantees

Open nanodeath opened this issue 4 years ago • 3 comments

It's unclear to me whether old state gets trapped in these persistent collections and maintain strong references that prevent garbage collection. For example if I have persistentMap().put("foo", GiantObject()).put("foo", GiantObject(2)), does the first value for key "foo" become immediately eligible for garbage collection? Does this behavior vary at all depending on how far "apart" those operations occur -- like what if I did 32 update operations between those two writes?

I can think of two naive implementations of persistent maps.

  • Every mutation copies the entire collection and returns it. Pros: no references to old data because it's caught up in the old collection which is itself eligible for garbage collection. Also, access time is the same as the basic HashMap. Cons: mutations are O(n). Generates O(n) garbage every time a mutation occurs.
  • Every mutation appends a "node" to a linked-list-like structure, which are then read in totality to determine current state. Pros: mutations take O(1) time. Cons: read operations take O(n) time. Any item that's ever been in the collection can never be garbage collected.

Clearly what this library is doing is far more sophisticated than either of those, however the GC ramifications are unclear, and reading the source has proven challenging.

Plus it should be documented anyway. If it's as simple as "This library has identical garbage-collection semantics to Java's HashMap, HashSet, and ArrayList counterparts." (if that's correct), that would be good enough for me.

nanodeath avatar Mar 21 '21 16:03 nanodeath

if I have persistentMap().put("foo", GiantObject()).put("foo", GiantObject(2)), does the first value for key "foo" become immediately eligible for garbage collection?

Yes

Does this behavior vary at all depending on how far "apart" those operations occur -- like what if I did 32 update operations between those two writes?

No

The situation when data structure retains reference to an unreachable element is usually called memory leak. Collections in this library are designed and implemented in such a way as to avoid memory leaks. That is, if after an operation an element becomes inaccessible from user code, the element becomes immediately eligible for garbage collection.

Please file an issue if you face memory leaks

qurbonzoda avatar Mar 22 '21 17:03 qurbonzoda

The main ask is to document the behavior; or is the behavior so "obvious" that it doesn't warrant documenting?

nanodeath avatar Mar 22 '21 17:03 nanodeath

Documenting the behavior will definitelly help. Contributions are welcome!

qurbonzoda avatar Mar 22 '21 23:03 qurbonzoda