ChangesetMD
Error with replication branch
opening replication file at http://planet.osm.org/replication/changesets/001/507/867.osm.gz
Traceback (most recent call last):
  File "changesetmd.py", line 190, in <module>
    md.doReplication(conn)
  File "changesetmd.py", line 150, in doReplication
    self.parseFile(connection, self.fetchReplicationFile(currentSequence), True)
  File "changesetmd.py", line 71, in parseFile
    action, root = context.next()
  File "iterparse.pxi", line 208, in lxml.etree.iterparse.__next__ (src/lxml/lxml.etree.c:131322)
lxml.etree.XMLSyntaxError: no element found
On next run
$ python changesetmd.py -d changesets -r
concurrent update in progress. Bailing out!
The contents of this replication diff are empty
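With nothing in the file, the parser has no root element to read, which would explain the `no element found` error. One way to guard against it is to check for an empty payload before parsing. This is only a minimal sketch: `iter_changesets` is a hypothetical helper, it assumes the fetched file is available as gzipped bytes, and it uses the standard library's `xml.etree.ElementTree` as a stand-in for the lxml parser that changesetmd.py uses.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def iter_changesets(gz_bytes):
    """Yield <changeset> elements from a gzipped diff; yield nothing if it is empty.

    Hypothetical helper, not part of changesetmd.py.
    """
    if not gz_bytes:
        return  # zero-byte download: nothing to parse
    data = gzip.decompress(gz_bytes)
    if not data.strip():
        return  # valid gzip file with empty contents: skip instead of raising
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "changeset":
            yield elem
```

With this shape, an empty diff just produces zero changesets, so the caller could still advance the sequence number and move on to the next file.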
I guess there are two issues here. One is handling empty diffs better; the other is making sure the status is set back to "not in progress" when exiting.
I can see two ways to do this...
1. Set the running flag
2. Do update work
   2b. Catch any errors, unset the running flag, then exit
3. Unset the running flag
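The first option maps fairly directly onto `try`/`finally`. Below is a rough sketch, not the actual changesetmd.py code. It uses an in-memory-friendly `sqlite3` connection as a stand-in for the real database, and the `status` table, its `running` column, and the `run_update` / `do_work` names are all assumptions made for illustration.

```python
import sqlite3


def run_update(conn, do_work):
    """Run do_work(conn) with the running flag set; always clear the flag after.

    Returns False without doing anything if another update appears to be running.
    Table and column names here are hypothetical.
    """
    cur = conn.cursor()
    cur.execute("SELECT running FROM status")
    if cur.fetchone()[0]:
        return False  # concurrent update in progress, bail out
    cur.execute("UPDATE status SET running = 1")
    conn.commit()
    try:
        do_work(conn)  # expected to commit its own work as it goes
    finally:
        conn.rollback()  # discard any half-finished, uncommitted work
        conn.execute("UPDATE status SET running = 0")
        conn.commit()
    return True
```

Because the flag is cleared in `finally`, an exception such as the `XMLSyntaxError` above still propagates to the caller, but the next run is not blocked by a stale flag.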
or
1. Acquire an explicit lock on the status table
2. begin transaction
3. Set the running flag
4. Commit. This keeps the lock
5. For each diff
   - download
   - begin transaction
   - insert new data
   - update sequence in status table
   - commit
6. catch errors from the above
7. begin transaction
8. de-set the running flag
9. commit
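The per-diff part of the steps above (insert the data and update the sequence in one transaction, roll back on error) could look roughly like this. It is a sketch under assumptions: `sqlite3` stands in for the real database, the `changeset` and `status` tables and the `apply_diffs` name are hypothetical, and the lock and running-flag steps are left out.

```python
import sqlite3


def apply_diffs(conn, diffs):
    """Apply each (sequence, rows) diff in its own transaction.

    rows is a list of 1-tuples of changeset ids. Either a diff's rows and its
    sequence number are both committed, or neither is. Names are hypothetical.
    """
    for sequence, rows in diffs:
        try:
            conn.executemany("INSERT INTO changeset (id) VALUES (?)", rows)
            conn.execute("UPDATE status SET last_sequence = ?", (sequence,))
            conn.commit()
        except Exception:
            conn.rollback()  # leave the database at the last good sequence
            raise
```

Keeping the sequence update in the same transaction as the inserts means a failed diff is simply retried on the next run, without duplicating or losing rows.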
Thinking about it, this could still leave the flag set if changesetmd crashes or is terminated after step 4 and before step 7, but it would take a crash hard enough that no exception is thrown, so nothing can be caught and the flag de-set.
To completely avoid that, the best route is probably to get rid of the flag and use the explicit lock to indicate if it's running or not.
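If the database is PostgreSQL, one way to do the lock-only variant is a session-level advisory lock, which the server releases automatically when the connection goes away, even after a hard crash. This is a sketch only: `try_acquire_run_lock` and `LOCK_KEY` are hypothetical names, and `conn` is assumed to be a DB-API connection (e.g. from psycopg2) that accepts `%s` parameters.

```python
LOCK_KEY = 8675309  # arbitrary application-chosen key (hypothetical value)


def try_acquire_run_lock(conn, key=LOCK_KEY):
    """Return True if this session now holds the advisory lock, False otherwise.

    pg_try_advisory_lock does not block; it returns false immediately if
    another session already holds the lock for the same key.
    """
    cur = conn.cursor()
    cur.execute("SELECT pg_try_advisory_lock(%s)", (key,))
    acquired = cur.fetchone()[0]
    cur.close()
    return bool(acquired)
```

A run would call this once at startup and bail out with the "concurrent update in progress" message when it returns False. There is no flag to clean up afterwards, since the lock disappears with the session.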