Allow for alternate storage backends
To allow for larger data sets, multi-process computation and managed backups, it'd be handy to be able to store data in a pluggable storage backend rather than only in memory.
Possible backends:
- in-memory (default)
- memcached
- Redis
- PostgreSQL
I've got a need for this for a large-ish-scale classification project and will be working on it on my fork. If others would find value in this feature, I'm happy to discuss an approach that could eventually be merged into the main project. Thoughts?
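To make the proposal a bit more concrete, here is a minimal sketch of what a pluggable backend could look like. It is written in Python purely for illustration and is not this project's API: `StorageBackend`, `MemoryBackend`, `SQLiteBackend` and `record_token` are all hypothetical names, and SQLite (from the standard library) stands in for an external store such as Redis or PostgreSQL.

```python
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class StorageBackend(ABC):
    """Hypothetical key/value interface a classifier could persist its state to."""

    @abstractmethod
    def get(self, key: str) -> Optional[dict]:
        """Return the stored value for key, or None if it is absent."""

    @abstractmethod
    def set(self, key: str, value: dict) -> None:
        """Store value under key, replacing any previous value."""


class MemoryBackend(StorageBackend):
    """Default backend: everything lives in a plain dict."""

    def __init__(self) -> None:
        self._data: dict = {}

    def get(self, key: str) -> Optional[dict]:
        return self._data.get(key)

    def set(self, key: str, value: dict) -> None:
        self._data[key] = value


class SQLiteBackend(StorageBackend):
    """Stand-in for an external store; values are serialized as JSON text."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def get(self, key: str) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def set(self, key: str, value: dict) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self._conn.commit()


def record_token(backend: StorageBackend, label: str, token: str) -> None:
    """Increment the count of token under label, whichever backend is in use."""
    counts = backend.get(label) or {}
    counts[token] = counts.get(token, 0) + 1
    backend.set(label, counts)
```

The point of the sketch is that training code like `record_token` only talks to the interface, so swapping `MemoryBackend()` for `SQLiteBackend("model.db")` would not require touching the classifier logic.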
I'm totally open to this. It also comes at a time when I'm working on a corpus manager to handle algorithms that rely on large, downloadable datasets. The pilot algorithm for this new functionality is WordNet, but I hope to add many more in the future. I currently index the WordNet files and save the index in a JSON file, which is loaded into memory to serve queries, so it would be interesting to have other storage options.
I'm not sure whether it fits as a piece of a natural language toolkit, but I'm more than happy to flesh out the details and see if it'll all make sense.
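For readers unfamiliar with the index-then-load pattern described above, a rough sketch follows. It is in Python for illustration only and makes assumptions throughout: the one-entry-per-line input is a simplified stand-in, not WordNet's real file format, and `build_index`, `save_index` and `load_index` are hypothetical helpers rather than the corpus manager's actual functions.

```python
import json


def build_index(lines):
    """Map each entry's first word to the line numbers where it appears."""
    index = {}
    for line_no, line in enumerate(lines):
        if not line.strip():
            continue  # skip blank lines
        word = line.split()[0]
        index.setdefault(word, []).append(line_no)
    return index


def save_index(index, path):
    """Persist the index as a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(index, fh)


def load_index(path):
    """Load the whole index back into memory for querying."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Because `load_index` reads the entire file into memory, the approach is simple but scales with the size of the index, which is where an alternate storage backend could help.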
Thanks! -Ken
Hey, is there an update on this? Or any tips on how to store classification data in a backend?
:up:
+1
+1
+1 !!!
+1
Duplicate of #681
See #727
Support for a number of storage methods has been added in #727.