openstreetmap-website icon indicating copy to clipboard operation
openstreetmap-website copied to clipboard

add bulk_upload

Open jiaxuyang opened this issue 7 years ago • 19 comments

Key optimization

  1. Reduce redundant check Since all changes in an upload request belong to the same changeset. It's not necessary to query changesets table for every object in the osmChange file.
  2. Skip uniqueness rails validation Skip uniqueness validation and that in associated model. Delegate it to database unique index.
  3. Bulk check Bulk select data from database. Check it in batch. There can be dependencies between objects in one batch operation, which result in new problems to solve. Basically, we divide one batch into multi batch. a. batch create relation If relation1 is a member of relation2, and they are in same create batch. We can't save them in one batch. We have to save relation1 first to get its id, then use this id to replace the placeholder id and save relation2. b. batch delete relation If relation1 is a member of relation2, and they are in same delete batch. We just delete them all and don't raise an exception for the "used constraint".
  4. Bulk save a. Delete delete from ... where id in (...) b. Create Use activerecord-import gem to implement conveniently. c. Modify Postgres supports INSERT ... ON CONFLICT DO UPDATE
  5. update changeset bbox a. Before: query longitude and latitude for every object, update bbox for every object. b. After: query max, min of longitude and latitude once, update bbox once.

jiaxuyang avatar Sep 17 '18 14:09 jiaxuyang

What is this? How is it different to a normal upload?

tomhughes avatar Sep 17 '18 14:09 tomhughes

It's pretty much the same performance optimization I described in https://github.com/openstreetmap/openstreetmap-website/issues/1710 and also implemented in cgimap. I tested a few easy cases via url rewriting (upload --> bulk_upload), and they seemed ok. The current code introduces quite a bit of duplication, i.e. @gravitystorm probably needs to to through it in much more detail.

The general concept of bundling many small sql statments into larger ones works regardless of which language issues the sql requests. Still there's some more overhead in Rails processing vs. C++ (about factor 5 in one test case). In any case, we will see at least a 30x speedup.

mmd-osm avatar Sep 17 '18 21:09 mmd-osm

Well my point was why is there nothing in the PR to explain what it's about? or even in the commit messages...

Above all else, if it's just optimisations why is it a new method rather then just improving the existing method?

tomhughes avatar Sep 17 '18 21:09 tomhughes

Well my point was why is there nothing in the PR to explain what it's about? or even in the commit messages...

Above all else, if it's just optimisations why is it a new method rather then just improving the existing method?

Sorry for making you confused, I will write some descriptions later... The new bulk_upload method is slightly different from old upload method on response in some case, but won't break original logic. If it's OK, I will make a commit to replace the old one. Thanks!

jiaxuyang avatar Sep 18 '18 01:09 jiaxuyang

It's pretty much the same performance optimization I described in #1710 and also implemented in cgimap. I tested a few easy cases via url rewriting (upload --> bulk_upload), and they seemed ok. The current code introduces quite a bit of duplication, i.e. @gravitystorm probably needs to to through it in much more detail.

The general concept of bundling many small sql statments into larger ones works regardless of which language issues the sql requests. Still there's some more overhead in Rails processing vs. C++ (about factor 5 in one test case). In any case, we will see at least a 30x speedup.

Yes, I have seen your analysis and effort on cgimap. But it seems the new feature bulk_upload not finished? I just can't wait...

jiaxuyang avatar Sep 18 '18 02:09 jiaxuyang

But it seems the new feature bulk_upload not finished?

It's functionally complete and available for testing on https://upload.apis.dev.openstreetmap.org/ If you find some issue, please create an issue on Github, thanks!

See https://github.com/zerebubuth/openstreetmap-cgimap/issues/140#issuecomment-422082181 for additional details for JOSM.

mmd-osm avatar Sep 18 '18 05:09 mmd-osm

@jiaxuyang please don't spend a lot of time on this until we can understand what your goals are and decide whether this is a direction we want to pursue - the general expectation has been that @mmd-osm's cgimap work is the way forward.

tomhughes avatar Sep 18 '18 06:09 tomhughes

until we can understand what your goals are

My goal: improve the performance of diff upload api on rails. @mmd-osm I will follow any new progress in cgimap project. Hope to use cgimap in the future.

jiaxuyang avatar Sep 18 '18 07:09 jiaxuyang

It’s probably not a bad idea to have a roughly similar logic in both rails and cgimap, so I wouldn’t discard the idea right from the start based on what’s going on in cgimap. There’s a number of corner cases, and cross validating different implementations would be one use case which comes to mind.

mmd-osm avatar Sep 18 '18 09:09 mmd-osm

Well the goal was to have rails use the cgimap code as a library I thought?

But sure if it's API compatible then we could take it, but it sounded like it wasn't as there was talk of changes in the response?

Bigger problem is I'm not sure who would review it - the upload code is not something I have any experience of - it was all @zerebubuth's work I think.

tomhughes avatar Sep 18 '18 09:09 tomhughes

The new bulk_upload method is slightly different from old upload method on response in some case, but won't break original logic.

if it's API compatible then we could take it, but it sounded like it wasn't as there was talk of changes in the response?

I think the new logic in this pull request first processes all create, modify and finally the delete operations, so it could potentially re-order the sequence in which elements are being processed.

While this isn't an issue for e.g. iD (requests are already sorted in this sequence), other API users might send multiple create/modify/delete blocks in more or less random order. If they expect the {old_id,new_id, version} in the response to appear in the same order as in the request, this may cause some issues. Those cases would have to be validated in more detail. This equally applies to other use cases which work in today's implementation (which works step-by-step from top to bottom). One such use case is already mentioned in point 3. above.

Bigger problem is I'm not sure who would review it

Indeed, that's the most challenging issue. I struggle quite a bit with the Rails syntax, and can't really add much value here. The feedback so far is mostly based on testing, less so from trying to figure out what the code actually does.


The following example triggers an HTTP 500 Internal Server Error with the new code due to a deadlock, while the current code returns HTTP 400 "Placeholder Relation not found for reference -4 in relation -2."

<osmChange version="0.6">
<create>
<relation id="-2" version="0" changeset="1191">
  <member type="relation" role="" ref="-4" />
  <tag k="type" v="route" />
  <tag k="name" v="AtoB" />
</relation>
<relation id="-3" version="0" changeset="1191">
  <tag k="type" v="route" />
  <tag k="name" v="BtoA" />
</relation>
<relation id="-4" version="0" changeset="1191">
  <member type="relation" role="" ref="-2" />
  <member type="relation" role="" ref="-3" />
  <tag k="type" v="route_master" />
  <tag k="name" v="master" />
</relation>
</create>
</osmChange>

Potlatch2 uses different "if-unused" settings in different delete blocks, depending which object is to be deleted. The new code doesn't support this, as it assumes identical settings for all delete blocks. One possible issue here is that a user won't get an error message when trying to delete the way.

<osmChange version="0.6">
   <delete version="0.6">
    <way id="4000212040" version="5" changeset="1211"/>
  </delete>
  <delete version="0.6" if-unused="true">
    <node id="5003081701" version="1" changeset="1211"/>
    <node id="5003081706" version="1" changeset="1211"/>
    <node id="5003081703" version="1" changeset="1211"/>
  </delete>
</osmChange>

mmd-osm avatar Sep 18 '18 16:09 mmd-osm

The following test message currently produces an HTTP 500 Internal Server Errror - API threw unexpected NoMethodError exception: undefined method `check_consistency' for nil:NilClass

Expected result would be HTTP 404 not found

<?xml version="1.0" encoding="UTF-8"?>
<osmChange version="0.6" generator="iD">
  <modify>
    <node id="1200000000000" lon="-11" lat="-46" version="1" changeset="1216"/>
 </modify>
</osmChange>

mmd-osm avatar Oct 03 '18 09:10 mmd-osm

I have found another bug:

<?xml version="1.0" encoding="UTF-8"?>
<osmChange version="0.6" generator="iD">
  <create>
    <node id="-1" lat="1" lon="2" changeset="1248"/>
    </create>
   <create>
    <node id="-1" lat="1" lon="2" changeset="1248"/>
    </create>
</osmChange>

Expected Result:

HTTP 400: Placeholder IDs must be unique for created elements.

Actual Result:

HTTP 200 OK

<?xml version="1.0" encoding="UTF-8"?>
<diffResult version="0.6" generator="OpenStreetMap server" copyright="OpenStreetMap and contributors" attribution="http://www.openstreetmap.org/copyright" license="http://opendatacommons.org/licenses/odbl/1-0/">
    <node old_id="-1" new_id="5003081839" new_version="1"/>
    <node old_id="-1" new_id="5003081840" new_version="1"/>
</diffResult>

mmd-osm avatar Oct 13 '18 09:10 mmd-osm

I have found another bug:

<?xml version="1.0" encoding="UTF-8"?>
<osmChange version="0.6" generator="iD">
  <create>
    <node id="-1" lat="1" lon="2" changeset="1248"/>
    </create>
   <create>
    <node id="-1" lat="1" lon="2" changeset="1248"/>
    </create>
</osmChange>

Expected Result:

HTTP 400: Placeholder IDs must be unique for created elements.

Actual Result:

HTTP 200 OK

<?xml version="1.0" encoding="UTF-8"?>
<diffResult version="0.6" generator="OpenStreetMap server" copyright="OpenStreetMap and contributors" attribution="http://www.openstreetmap.org/copyright" license="http://opendatacommons.org/licenses/odbl/1-0/">
    <node old_id="-1" new_id="5003081839" new_version="1"/>
    <node old_id="-1" new_id="5003081840" new_version="1"/>
</diffResult>

This bug was bring in duo to the support for multiple action blocks. I have added unit test for this case. Thx.

jiaxuyang avatar Oct 14 '18 04:10 jiaxuyang

There's another issue with backwards compatibility: the current implementation assumes that every operation occurs from top to bottom. This way, you cannot reference objects which occur only later in the file. Although your implementation is not incorrect from a technical point of view, it allows for scenarios, which are forbidden in the current implementation. If we ever need to switch back to the current implementation, this will no longer be possible.

Here's an example:

<?xml version="1.0" encoding="UTF-8"?>
<osmChange version="0.6" generator="iD">
  <create>
    <node id="-1" lat="1" lon="2" changeset="1251"/>
    <way id="-1"  changeset="1251">
        <nd ref="-1"/>
        <nd ref="-2"/>
    </way>
    <node id="-2" lat="1" lon="2" changeset="1251"/>
    </create>
</osmChange>

Expected result:

HTTP 400 - Placeholder node not found for reference -2 in way -1

Actual result:

HTTP 200 OK

<?xml version="1.0" encoding="UTF-8"?>
<diffResult version="0.6" generator="OpenStreetMap server" copyright="OpenStreetMap and contributors" attribution="http://www.openstreetmap.org/copyright" license="http://opendatacommons.org/licenses/odbl/1-0/">
    <node old_id="-1" new_id="5003081852" new_version="1"/>
    <node old_id="-2" new_id="5003081853" new_version="1"/>
    <way old_id="-1" new_id="4000212054" new_version="1"/>
</diffResult>

Furthermore, due to the internal resorting of objects, the diffResult result message no longer shows the object in the exact same sequence as they occured in the osmChange message. I also consider this to be a breaking change which needs to be corrected.

mmd-osm avatar Oct 15 '18 15:10 mmd-osm

@jiaxuyang : thanks for the update, looks good so far!

@tomhughes : the functional issues I have reported earlier seem to be fixed now,. I will continue testing the code, though. If you like, you could maybe have a look through the code to see if the coding style is ok, overall code complexity is still reasonable, etc.

mmd-osm avatar Oct 26 '18 15:10 mmd-osm

Just to clarify this now replaces the existing upload endpoint rather than adding something in parallel right? Only I seem to see a lot more added code than removed code?

Realistically reviewing this is going to be a huge job, especially for somebody who doesn't really understand the current upload code - my best guess is that is likely a day's work to review it, if not a weekend's. So don't expect me to be able to do it immediately...

tomhughes avatar Oct 26 '18 15:10 tomhughes

Just to clarify this now replaces the existing upload endpoint rather than adding something in parallel right?

Yes, that's exactly right, it's a drop in replacement for the existing code, only with performance improvements. The results (and error messages) should be exactly the same as in today's version.

Only I seem to see a lot more added code than removed code?

Bulk operations introduced a lot more complexity due to the additional packaging logic, so this makes sense.

So don't expect me to be able to do it immediately...

That's perfectly fine, there's no hurry at all. I'm also waiting on @zerebubuth to review the corresponding cgimap implementation.

Overall, this may seem to be doubling efforts. Nevertheless, I still find it useful to also improve the Rails port by using bulk operations (similar to how the cgimap implementation works). cgimap will be faster in the end, but the difference won't be in the 100x range anymore, which is also good news for people who don't use cgimap as acceleration layer.

mmd-osm avatar Oct 26 '18 16:10 mmd-osm

I've had a look through this today. It's a large and complex PR touching one of the most complex parts of the codebase, so it's hard to review. But I want to say that I completely support the idea of refactoring the changeset upload processing to import entities in bulk, in order to improve performance.

I've merged with the current master and fixed the conflicts, since there's been some controller refactoring since this PR was made. You can see the results at https://github.com/gravitystorm/openstreetmap-website/tree/changeset_bulk . I haven't updated this PR, since the tests no longer pass.

It turns out that there is a problem somewhere between this PR and the updated version of activerecord-import that we are now using. This PR uses v0.25.0 and we now use v1.0.1. I think it's caused by our use of composite-primary-keys which allows belongs_to relations to have a foreign key specified as an array instead of a string, e.g.

belongs_to :old_node, :foreign_key => [:node_id, :version]

activerecord-import then blows up when trying to convert this array to a symbol. This was introduced in activerecord-import 0.28.1.

So I think there are two things that we should work on here now:

  1. Extract the additional changeset upload tests into their own PR. They contain useful checks that @mmd-osm and @jiaxuyang worked on as part of this PR, and will come in useful at some point even if this PR isn't merged.

  2. Figure out the best way forward with this foreign key problem. This could involve a combination of

    • rewriting this PR
    • changes to activerecord-import to restore compatibility with composite-primary-keys
    • rewriting this PR to work with the upcoming rails 6 "insert_all"
    • something else entirely

gravitystorm avatar Apr 03 '19 13:04 gravitystorm

I recommend closing this, as it's had a changes requested label for over three years with no response.

pnorman avatar Nov 28 '22 08:11 pnorman

I'm not expecting any updates from the OP any more, but there's some useful stuff in here (particularly from my point 1 above) that I'm hoping not to lose into the void.

gravitystorm avatar Nov 28 '22 15:11 gravitystorm

Testcases are now available as #3821.

PR can be closed.

mmd-osm avatar Nov 30 '22 21:11 mmd-osm