pokebase
Recent refactor results in significant performance hit
I was looking at the current state of the refactor and noticed a significant performance drop-off from the changes.
Namely, once the cache is populated, retrieving a pokemon takes about an order of magnitude longer (jumping from ~0.25s to ~3s).
~/programming/pokebase (pre-refactor)$ time python3 speed-test-old.py
real 0m41.041s
user 0m0.796s
sys 0m0.065s
~/programming/pokebase (pre-refactor)$ time python3 speed-test-old.py
real 0m0.241s
user 0m0.217s
sys 0m0.021s
~/programming/pokebase (pre-refactor)$ git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
~/programming/pokebase (master)$ time python3 speed-test.py
real 0m31.404s
user 0m1.572s
sys 0m0.546s
~/programming/pokebase (master)$ time python3 speed-test.py
real 0m3.095s
user 0m0.940s
sys 0m0.482s
~/programming/pokebase (master)$ cat speed-test.py
import pokebase
from pokebase.cache import set_cache
set_cache('speed-test-cache')
pokebase.pokemon('jigglypuff')
~/programming/pokebase (master)$ cat speed-test-old.py
import pokebase
from pokebase.api import set_cache
set_cache('speed-test-cache-old')
pokebase.pokemon('jigglypuff')
Originally posted by @jrubinator in https://github.com/GregHilmes/pokebase/issues/10#issuecomment-414134798
How very thorough of you, I appreciate this a bunch.
Unfortunately, I didn't finish my rewrite of pokebase before heading off to school, and I currently don't have the time to support it.
I believe I intended to try out a few caching methods. The old JSON tree was too complex, in my opinion. Currently pokebase is using the shelve module, which I chose for its simplicity. Unfortunately, it is slower, as you have pointed out. I was planning on moving away from shelve anyway, as it behaves differently on different platforms.
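For anyone following along, a shelve-backed cache can be sketched roughly like this. It is only a minimal illustration of the idea, not the code pokebase actually ships: the function name cached_fetch and the fetch callback are hypothetical, and the only library call it relies on is the standard shelve.open.

```python
import shelve


def cached_fetch(cache_path, key, fetch):
    """Return the value cached under `key`, calling `fetch(key)` only on a miss.

    `cached_fetch` and `fetch` are hypothetical names for illustration;
    they are not part of pokebase.
    """
    # shelve picks whichever dbm backend is available, so the files it
    # creates on disk can differ between platforms.
    with shelve.open(cache_path) as db:
        if key in db:
            return db[key]
        value = fetch(key)
        db[key] = value  # the value is pickled and written to disk
        return value
```

Every call opens and closes the shelf, which keeps the sketch simple but adds overhead per lookup.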
My next idea was to try some form of SQL (mysql?), but I didn't get the chance to try it out. If you're up to it, I'd be very interested in a pull request with a better caching algorithm/hard disk representation.
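As a possible starting point for such a pull request, here is a minimal sketch of a SQL-backed cache. It swaps in the standard-library sqlite3 module rather than mysql, purely so the example needs no server. The class name SqliteCache, the table layout, and the choice to store values as JSON text are all assumptions made for illustration, and nothing here has been benchmarked against shelve.

```python
import json
import sqlite3


class SqliteCache:
    """Hypothetical key/value cache stored in one SQLite table (not pokebase's API)."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self.conn.commit()

    def get(self, key):
        """Return the cached value for `key`, or None on a miss."""
        row = self.conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def set(self, key, value):
        """Store a JSON-serializable `value` under `key`, replacing any old entry."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
                (key, json.dumps(value)),
            )

    def close(self):
        self.conn.close()
```

Usage would look something like cache = SqliteCache('speed-test-cache.db'), then cache.get('pokemon/jigglypuff') before hitting the API and cache.set(...) after a miss. The key format is likewise just an assumption.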