Using `semidbm` in a `shelve` object - a code snippet
Using Python 3.7's `shelve` with the default `dbm` on a Mac, I run into the same size limitation noted here: http://jamesls.com/semidbm-a-pure-python-dbm.html (notably `HASH: Out of overflow pages. Increase page size`). I have installed `gdbm`, but it doesn't show up in my Conda Pythons.

`semidbm` came to the rescue using the code snippet below. The class and function are lifted directly from Python's `shelve.py`. I see no speed difference, but I do see an ability to scale to more objects that `dbm` lacked. `gdbm` should have provided a similar solution, but on my Anaconda distribution I can't get it to work (for reference, `import dbm.gnu` generates `ModuleNotFoundError: No module named '_gdbm'`).
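As an aside, a quick way to see which backends a given interpreter can actually import is to try each one and catch the failure. This is only a diagnostic sketch: the helper name `available_dbm_backends` and the list of candidates are my own choices, not part of `shelve` or `semidbm`.

```python
import importlib

# Candidate backends to probe; this list is an assumption, adjust as needed.
CANDIDATES = ("dbm.gnu", "dbm.ndbm", "dbm.dumb", "semidbm")


def available_dbm_backends(candidates=CANDIDATES):
    """Return a dict mapping each module name to True if it imports cleanly."""
    found = {}
    for name in candidates:
        try:
            importlib.import_module(name)
            found[name] = True
        except ImportError:
            # ModuleNotFoundError (e.g. "No module named '_gdbm'") is a
            # subclass of ImportError, so it is caught here as well.
            found[name] = False
    return found


if __name__ == "__main__":
    for name, ok in available_dbm_backends().items():
        print(f"{name}: {'available' if ok else 'missing'}")
```

On a setup like the one described above, `dbm.gnu` would be expected to show up as missing, while `dbm.dumb` (pure Python, part of the standard library) should always import.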
Thank you for this package! I hope that the snippet below helps others who use `shelve` on a large dataset.
import semidbm
from shelve import Shelf


class DbfilenameShelfSemidbm(Shelf):
    """Shelf implementation using the semidbm interface.

    This is initialized with the filename for the semidbm database.
    See the shelve module's __doc__ string for an overview of the interface.
    """

    def __init__(self, filename, flag='c', protocol=None, writeback=False):
        # semidbm.open() replaces the dbm.open() call used in shelve.py.
        Shelf.__init__(self, semidbm.open(filename, flag), protocol, writeback)


def open_semidbm(filename, flag='c', protocol=None, writeback=False):
    """Open a persistent dictionary for reading and writing.

    The filename parameter is the base filename for the underlying
    database.  As a side-effect, an extension may be added to the
    filename and more than one file may be created.  The optional flag
    parameter has the same interpretation as the flag parameter of
    dbm.open().  The optional protocol parameter specifies the
    version of the pickle protocol.

    See the shelve module's __doc__ string for an overview of the interface.
    """
    return DbfilenameShelfSemidbm(filename, flag, protocol, writeback)
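For anyone who wants to sanity-check the idea before pointing it at real data, here is a small round-trip sketch. It builds the `Shelf` directly, the same way `DbfilenameShelfSemidbm.__init__` does, so that it runs on its own. The `roundtrip` helper and the fallback to `dbm.dumb` when `semidbm` is not installed are assumptions added purely so the snippet runs anywhere; they are not part of the snippet above.

```python
import os
import tempfile
from shelve import Shelf

try:
    import semidbm as backend   # third-party package (pip install semidbm)
except ImportError:
    import dbm.dumb as backend  # stdlib stand-in, an assumption for this demo only


def roundtrip(items):
    """Write items to a fresh shelf, reopen it, and return what is read back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_shelf")
        # Wrap the opened database in a Shelf, as DbfilenameShelfSemidbm does.
        with Shelf(backend.open(path, "c")) as shelf:
            for key, value in items.items():
                shelf[key] = value
        # Reopen the database to confirm the values were persisted to disk.
        with Shelf(backend.open(path, "c")) as shelf:
            return {key: shelf[key] for key in items}


if __name__ == "__main__":
    sample = {"numbers": [1, 2, 3], "nested": {"x": 1.5}}
    print(roundtrip(sample) == sample)
```

Keys must be strings, as with any `shelve` object; values can be anything `pickle` can handle.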
Timing on a smaller client task (prior to the HASH error above):
- `dbm` inside a default `shelve`: 1m20
- `semidbm` inside the derived `shelve`: 1m20 (the same as `dbm`)