scrapy-deltafetch
DBRunRecoveryError
File "/home/.virtualenvs/Spider_py2/local/lib/python2.7/site-packages/scrapy_deltafetch/middleware.py", line 79, in process_spider_output
if key in self.db:
DBRunRecoveryError: (-30974, 'DB_RUNRECOVERY: Fatal error, run database recovery -- PANIC: fatal region error detected; run recovery')
It looks like something is wrong with the middleware. The error occurred during a long-running spider crawl.
Any news regarding this issue?
I have the same problem. I tried to fix it by writing this method:
from bsddb3.db import DB  # bsddb3 is the Berkeley DB binding that scrapy-deltafetch uses

def recover_cache():
    # DELTAFETCH_DB is the path to the deltafetch database file
    db = DB()
    db.open(DELTAFETCH_DB)
    # Rewrite every entry whose key can no longer be read back,
    # which rebuilds the damaged parts of the index
    for k, v in db.items():
        try:
            db[k]
        except KeyError:
            db[k] = v
    db.close()
I call it to repair the index before starting the crawler. This works, but sooner or later the error appears again. I think the problem is that the method that stores a key/value pair in the Berkeley DB is not synchronized, so when two threads or processes call it at the same time the database gets corrupted. I think it is necessary to synchronize the method or use a semaphore, as in the sketch below.
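Something like this minimal sketch is what I have in mind: serialize every access to the Berkeley DB handle behind a threading.Lock. The SynchronizedDB wrapper and its names are mine, not part of scrapy-deltafetch:

import threading

from bsddb3.db import DB

class SynchronizedDB(object):
    # Hypothetical wrapper that serializes access to a Berkeley DB handle
    def __init__(self, path):
        self._lock = threading.Lock()
        self._db = DB()
        self._db.open(path)

    def __contains__(self, key):
        # the middleware does "key in self.db"; this makes that check thread-safe
        with self._lock:
            return self._db.has_key(key)

    def __setitem__(self, key, value):
        with self._lock:
            self._db[key] = value

    def close(self):
        with self._lock:
            self._db.close()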
@jmgomezsoriano
I call it to repair the index before starting the crawler. This works, but sooner or later the error appears again. I think the problem is that the method that stores a key/value pair in the Berkeley DB is not synchronized, so when two threads or processes call it at the same time the database gets corrupted. I think it is necessary to synchronize the method or use a semaphore.
Do you mean this: http://pybsddb.sourceforge.net/ref/transapp/recovery.html?
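If so, as far as I understand that page, recovery means opening a DBEnv with the DB_RECOVER flag, roughly as in this sketch (the home directory is a placeholder, and this only helps if the database lives in a transactional environment, which deltafetch's standalone .db file does not):

from bsddb3.db import (DBEnv, DB_CREATE, DB_INIT_LOG, DB_INIT_MPOOL,
                       DB_INIT_TXN, DB_RECOVER)

env = DBEnv()
# '.scrapy/deltafetch' is a placeholder for the environment home directory
env.open('.scrapy/deltafetch',
         DB_CREATE | DB_INIT_TXN | DB_INIT_LOG | DB_INIT_MPOOL | DB_RECOVER)
env.close()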
I tried running the command "db_recover -c", but the problem still exists.
In the end I gave up on repairing the db and just reset it with "-a deltafetch_reset=1".
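For reference, that spider argument goes on the scrapy command line; "myspider" below is just a placeholder for the real spider name:

scrapy crawl myspider -a deltafetch_reset=1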