BTrees
Length Object not working from outside tree class
PROBLEM REPORT / QUESTION
Hi, I've been using BTrees to optimize the performance of my code, and it works great for insertion and deletion. However, I've run into a major problem: I sometimes need to check the size of my BTree, and I realized that len(btree) is computationally very expensive, which is now what slows my code down.
I've seen the Length utility in the docs (https://btrees.readthedocs.io/en/latest/api.html#BTrees.OOBTree.BTree), but I really can't figure out how to use it. Is there an example somewhere, or can someone point me in the right direction?
I'm working on Python 3.11.5 with BTrees version 5.2 and an OOBTree.
As an update, I've tried implementing a custom class inheriting from OOBTree in order to integrate the Length object inside:
from BTrees.Length import Length
from BTrees.OOBTree import OOBTree

class AutoLengthOOBTree(OOBTree):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._length = Length()

    def __setitem__(self, key, value):
        print(f"Current Items length (from inside AutoLengthOOBTree) : {self._length()}")
        if key not in self:
            self._length.change(1)
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._length.change(-1)

    def __len__(self):
        return self._length()
The print in __setitem__ is only there for debugging. The interesting (and somewhat intriguing) fact is that whenever I update my tree, __setitem__ is called correctly and its print displays the right length. However, if I print len(myTree) anywhere else in my code, the displayed length is always 0...
self._itemsBtree = AutoLengthOOBTree()
# Update the items dictionary
self._itemsBtree.setdefault(timestamp, {}).update({variable_name: value})
print(f"ItemsBtree length : {len(self._itemsBtree)}") # Prints 0
Rémy Macherel wrote at 2024-2-29 02:26 -0800:
[...] However, if I print len(myTree) anywhere else in my code, the displayed length is always 0...
I tried something simpler:
>>> from BTrees.OOBTree import OOBTree
>>> from BTrees.Length import Length
>>> class AutoLengthOOBTree(OOBTree):
...     def __init__(self, *args, **kwargs):
...         super().__init__(*args, **kwargs)
...         self._length = Length()
...     def __setitem__(self, key, value):
...         print(f"Current Items length (from inside AutoLengthOOBTree) : {self._length()}")
...         if key not in self:
...             self._length.change(1)
...         super().__setitem__(key, value)
...     def __delitem__(self, key):
...         super().__delitem__(key)
...         self._length.change(-1)
...     def __len__(self):
...         return self._length()
...
>>> t=AutoLengthOOBTree()
>>> t[1]=1
Current Items length (from inside AutoLengthOOBTree) : 0
>>> t[2]=1
Current Items length (from inside AutoLengthOOBTree) : 1
>>> len(t)
2
i.e. the len has been correct. I assume that update does not call the derived __setitem__.
I know that the Products.PluginIndexes (part of Products.ZCatalog) use BTrees with Length. They do not use inheritance but instead delegation; this is more work but gives fewer surprises.
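To make the delegation idea concrete, here is a minimal sketch of a wrapper that owns both a tree and a Length and routes every mutation through its own methods. The class name CountedTree, its method set, and the sample keys are hypothetical and are not taken from Products.ZCatalog. The try/except fallback is only there so the sketch also runs where BTrees is not installed.

```python
try:
    from BTrees.Length import Length
    from BTrees.OOBTree import OOBTree
except ImportError:
    # Stand-ins (assumption for illustration only) so the sketch
    # still runs without BTrees installed.
    OOBTree = dict

    class Length:
        def __init__(self, v=0):
            self.value = v

        def change(self, delta):
            self.value += delta

        def __call__(self):
            return self.value


class CountedTree:
    """Hypothetical wrapper: holds a tree and a Length and keeps them in sync."""

    def __init__(self):
        self._tree = OOBTree()
        self._length = Length()

    def __setitem__(self, key, value):
        # Count the key only when it is new.
        if key not in self._tree:
            self._length.change(1)
        self._tree[key] = value

    def __delitem__(self, key):
        # A missing key raises KeyError here, before the count changes.
        del self._tree[key]
        self._length.change(-1)

    def __getitem__(self, key):
        return self._tree[key]

    def __contains__(self, key):
        return key in self._tree

    def setdefault(self, key, default):
        # Go through our own __setitem__ so the count stays correct.
        if key not in self._tree:
            self[key] = default
        return self._tree[key]

    def __len__(self):
        # Reads the stored counter instead of walking the tree.
        return self._length()


items = CountedTree()
items.setdefault(1709202360, {}).update({"temperature": 21.5})
items.setdefault(1709202360, {}).update({"pressure": 1013})
items[1709202420] = {"temperature": 21.7}
del items[1709202420]
print(len(items))  # 1
```

Because the wrapper exposes only the methods it defines, no tree operation can change the contents without also updating the counter. The price is that every mapping method you need has to be forwarded by hand.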
Thanks @d-maurer for your help. Do you have an example of the Products usage or implementation? Also, I think you're right: setdefault combined with update doesn't seem to call the __setitem__ method, which I find somewhat intriguing too, since it is still able to create and modify elements in the tree.
Rémy Macherel wrote at 2024-2-29 05:14 -0800:
Thanks @d-maurer for your help. Do you have an example of the Products usage or implementation?
"https://github.com/zopefoundation/Products.ZCatalog/blob/f2d6ea367497841d02c7c925d9a903653d06fafa/src/Products/PluginIndexes/unindex.py#L127"
Also, I think you're right: setdefault combined with update doesn't seem to call the __setitem__ method, which I find somewhat intriguing too, since it is still able to create and modify elements in the tree.
Python has a low-level C API and a high-level Python API, the former being considerably more efficient than the latter.
BTrees strive hard for efficiency. Therefore, they use the low-level C API directly, not the Python-level API.
For this reason, true BTrees operations (in contrast to operations overridden by the derived class) may fail to use methods defined by derived classes.
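The same effect can be reproduced without BTrees, using the built-in dict on CPython. The sketch below is an analogy rather than BTrees code: CountingDict is a hypothetical subclass whose overridden __setitem__ counts its calls, while setdefault and update write through the C implementation and never reach the override.

```python
class CountingDict(dict):
    """Hypothetical dict subclass that counts calls to its own __setitem__."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def __setitem__(self, key, value):
        self.calls += 1
        super().__setitem__(key, value)


d = CountingDict()
d["a"] = 1            # goes through the overridden __setitem__
d.setdefault("b", 2)  # handled in C, the override is not called
d.update(c=3)         # handled in C, the override is not called

print(len(d), d.calls)  # on CPython: 3 1
```

All three keys end up in the mapping, but only the plain item assignment was seen by the subclass, which matches the always-zero length in the original report.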