CoW: Use weakref callbacks to track dead references
- [x] closes #55245
- [x] Tests added and passed if fixing a bug or adding a new feature
- [x] All code checks passed.
- [x] Added type annotations to new arguments/methods/functions.
- [ ] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
This is an alternative solution to #55518. What is happening here:

- Uses the `weakref.ref` callback to count the number of dead references present in the `referenced_blocks` list (see the sketch below).
- Takes ~20% more time in the constructor because of the function declaration!
  - The function declaration can be moved into `add_reference` and `add_index_reference`, but that just moves the performance hit to those methods.
- Everything is O(1) now, including `has_reference` (`_clear_dead_references` is amortized to O(1)).
- If the dead references number more than 256 and make up over 50% of the list, they are pruned.
- The implication is that the slow, linear cleanup of dead references happens next to the GC calls, which I think is more intuitive behavior.

(also check other possibilities in #55008)
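To make the idea concrete, here is a minimal sketch of the mechanism. The attribute and method names mirror the ones discussed above (`referenced_blocks`, `add_reference`, `has_reference`, `_clear_dead_references`), but the class name `BlockRefsSketch` and the details are illustrative, not the actual pandas implementation:

```python
import weakref


class BlockRefsSketch:
    """Illustrative sketch only; not the actual pandas BlockValuesRefs."""

    def __init__(self, blk=None) -> None:
        self.referenced_blocks: list[weakref.ref] = []
        self._n_dead = 0  # dead references, counted by the callback

        # Declaring this closure here is the ~20% constructor overhead
        # mentioned above; moving it into add_reference just moves the cost.
        def _on_dead(ref: weakref.ref) -> None:
            self._n_dead += 1

        self._on_dead = _on_dead
        if blk is not None:
            self.add_reference(blk)

    def add_reference(self, blk) -> None:
        # The callback fires when `blk` is collected, so the dead count
        # stays accurate without ever scanning referenced_blocks.
        self.referenced_blocks.append(weakref.ref(blk, self._on_dead))
        self._clear_dead_references()

    def _clear_dead_references(self) -> None:
        # Prune only when dead refs exceed 256 AND over half the list,
        # so the linear rebuild is amortized to O(1) per insertion.
        if self._n_dead > 256 and self._n_dead * 2 > len(self.referenced_blocks):
            self.referenced_blocks = [
                ref for ref in self.referenced_blocks if ref() is not None
            ]
            self._n_dead = 0

    def has_reference(self) -> bool:
        # O(1): live references = total - dead. The tracked object itself
        # holds one reference, hence the "> 1".
        return len(self.referenced_blocks) - self._n_dead > 1
```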
Let me know if I should continue working on this solution.
Assuming this is an alternate solution to @phofl's PR in #55518
#55518 needs to be backported. This is something that we can consider for main and 2.2, but I want to test this more before we rely on it.
I just fixed the tests. Any thoughts about it @phofl @WillAyd?
Any news?
Personally I would be more partial to exploring https://github.com/pandas-dev/pandas/issues/55631 (IMO I think the mental model of using a set would be simpler)
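I haven't worked out exactly what #55631 proposes, but for illustration, a set-based tracker built on `weakref.WeakSet` might look roughly like this (a hypothetical sketch; the class name and details are mine, not from the issue):

```python
import weakref


class SetRefsSketch:
    """Hypothetical illustration of the set-based mental model."""

    def __init__(self) -> None:
        # Dead entries disappear from a WeakSet automatically, so no
        # explicit dead-reference bookkeeping is needed.
        self.referenced_blocks: weakref.WeakSet = weakref.WeakSet()

    def add_reference(self, blk) -> None:
        self.referenced_blocks.add(blk)

    def has_reference(self) -> bool:
        # The tracked object itself holds one entry, hence "> 1";
        # len() on a WeakSet is O(1).
        return len(self.referenced_blocks) > 1
```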
Can you run the `ctors.py` benchmarks?
Sorry for the delay. Here are the benchmarks:
| Change | Before [593fa855] | After [98addada] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| + | 5.74±0.2ms | 7.37±0.2ms | 1.29 | frame_ctor.FromArrays.time_frame_from_arrays_int |
| + | 7.65±0.08ms | 8.63±0.2ms | 1.13 | frame_ctor.FromArrays.time_frame_from_arrays_sparse |
| + | 1.43±0.07ms | 1.47±0.06ms | 1.03 | frame_ctor.FromRecords.time_frame_from_records_generator(1000) |
How does this correspond to the other timings in the issue?
> How does this correspond to the other timings in the issue?
Not sure if I fully understand what is being asked, but the ~20% increase comes from the overhead in the constructor. Where this PR does better is mainly `has_reference`, which becomes O(1) independent of the number of references. Insertions should also be a little faster because there are no periodic calls to the cleaning method, but that is probably not meaningful in any way.
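For illustration, here is how the O(1) check behaves, reusing the hypothetical `BlockRefsSketch` class from the earlier comment (again a sketch, not the real pandas API):

```python
import gc


class Block:  # stand-in for a pandas block
    pass


blk = Block()
refs = BlockRefsSketch(blk)  # the tracked block registers itself

other = Block()
refs.add_reference(other)
print(refs.has_reference())  # True: another block shares the values

del other
gc.collect()  # the weakref callback has fired; the dead count is updated
print(refs.has_reference())  # False, computed without scanning the list
```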
Updates?
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
Hey @phofl, what is the current status of this PR?
@phofl?
Hi, @phofl, can you give an update?