tilemaker icon indicating copy to clipboard operation
tilemaker copied to clipboard

Sorting OutputObjects is very slow

Open systemed opened this issue 5 years ago • 5 comments

callgrind suggests that, by far, the largest amount of time spent in tile output is in sorting OutputObjects. This is at TilesAtZoomIterator::RefreshData https://github.com/systemed/tilemaker/blob/master/src/tile_data.cpp#L114, calling a comparison function at https://github.com/systemed/tilemaker/blob/master/src/output_object.cpp#L140.

Without coastlines, this sorting takes up 60% of time. (With coastlines it's 30%; see #173.) We should be able to improve this.

The callers are roughly:

void std::__introsort_loop
TilesAtZoomIterator::RefreshData()
TilesAtZoomIterator::operator++()
TileData::GetTilesAtZoomBegin()
outputProc(...)

I don't fully understand how tile_data.cpp works (it's something that TimSC refactored) so it will probably require diving into that.

systemed avatar Jun 29 '20 11:06 systemed

https://github.com/systemed/tilemaker/commit/491b235570e9d74e074be16f6f3df2bbf5098c5f has improved this in some circumstances - we now don't add OutputObjects to the list if they fail the minZoom test.

systemed avatar Sep 10 '20 23:09 systemed

You dont actually need to sort. You just want to filter out duplicate outputobjects when merging tiles for lower zoom levels. You can use unordered set for this. This has much lower computational complexity.

Shall we try this out? This might be a very quick win if this works.

kleunen avatar Mar 21 '21 10:03 kleunen

Definitely worth a try!

systemed avatar Mar 21 '21 10:03 systemed

operator<(boost::intrusive_ptr<OutputObject> const&, boost::intrusive_ptr<OutputObject> const&) is using the most time, but is actually only 7% of total execution time. I don't think that this is in sort, probably other datastructures use this operator< as well.

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
  6.84      5.55     5.55 69243814     0.00     0.00  operator<(boost::intrusive_ptr<OutputObject> const&, boost::intrusive_ptr<OutputObject> const&)
  5.82     10.27     4.72 126438112     0.00     0.00  void boost::geometry::detail::recalculate::recalculate_point<2ul>::apply<boost::geometry::model::point<long long, 2ul, boost::geometry
  3.70     13.28     3.01  8748454     0.00     0.00  bool boost::geometry::detail::get_turns::get_turns_in_sections<boost::geometry::model::multi_polygon<boost::geometry::model::polygon<bo
  3.43     16.06     2.78 22543587     0.00     0.00  boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::offset_ptr<void, long, unsigned long, 0ul>
  3.30     18.74     2.68 40669082     0.00     0.00  OSMStoreImpl<NodeStore>::impl_nodes_at(unsigned long) const
  1.92     20.30     1.56 30322738     0.00     0.00  vector_tile::Tile::IsInitialized() const
  1.91     21.85     1.55   729209     0.00     0.00  bool boost::geometry::detail::get_turns::get_turns_in_sections<std::vector<mmap::polygon_t, boost::interprocess::node_allocator<mmap::p
  1.90     23.39     1.54 19226158     0.00     0.00  boost::intrusive::bstree_algorithms<boost::intrusive::rbtree_node_traits<boost::interprocess::offset_ptr<void, long, unsigned long, 0ul
  1.89     24.92     1.53  1515706     0.00     0.00  void boost::geometry::detail::sectionalize::sectionalize_part<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesi
  1.85     26.42     1.51 35201562     0.00     0.00  vector_tile::Tile_Value::~Tile_Value()
  1.85     27.92     1.50  3122137     0.00     0.00  std::pair<std::_Rb_tree_iterator<AttributeStore::key_value_set_store_t>, bool> std::_Rb_tree<AttributeStore::key_value_set_store_t, Att
  1.83     29.41     1.49 50715862     0.00     0.00  OsmLuaProcessing::Find(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const
  1.76     30.84     1.43   959475     0.00     0.00  OutputObject::findValue(std::vector<vector_tile::Tile_Value, std::allocator<vector_tile::Tile_Value> >*, vector_tile::Tile_Value const&
  1.73     32.24     1.41 26712627     0.00     0.00  vector_tile::Tile_Value::Tile_Value(vector_tile::Tile_Value const&)
  1.43     33.40     1.16 23738356     0.00     0.00  boost::geometry::policies::relate::segments_tupled<boost::geometry::policies::relate::segments_intersection_points<boost::geometry::seg
  1.34     34.49     1.09 66814150     0.00     0.00  int boost::geometry::strategy::side::side_by_triangle<void>::apply<boost::geometry::model::point<long long, 2ul, boost::geometry::cs::c
  1.27     35.52     1.03  2413609     0.00     0.00  std::back_insert_iterator<std::deque<boost::geometry::detail::overlay::traversal_turn_info<boost::geometry::model::d2::point_xy<double,
  1.22     36.51     0.99  2323938     0.00     0.00  boost::geometry::model::linestring<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, std::vector, std::allo
  1.21     37.49     0.98   475337     0.00     0.00  void boost::geometry::detail::sectionalize::sectionalize_part<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesi
  1.20     38.47     0.98 18054041     0.00     0.00  std::back_insert_iterator<std::deque<boost::geometry::detail::overlay::turn_info<boost::geometry::model::d2::point_xy<double, boost::ge
  1.06     39.33     0.86 50715862     0.00     0.00  int kaguya::nativefunction::_call_apply<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (OsmLuaProcessi
  1.02     40.16     0.83 22543578     0.00     0.00  boost::interprocess::segment_manager_base<boost::interprocess::rbtree_best_fit<boost::interprocess::mutex_family, boost::interprocess::
  1.02     40.99     0.83  2491967     0.00     0.00  void boost::geometry::detail::get_turns::get_turns_in_sections<std::vector<mmap::polygon_t, boost::interprocess::node_allocator<mmap::p
  1.02     41.82     0.83     1587     0.00     0.00  MergeSingleTileDataAtZoom(TileCoordinates_, unsigned int, unsigned int, std::map<TileCoordinates_, std::vector<boost::intrusive_ptr<Out
  

kleunen avatar Mar 21 '21 17:03 kleunen

Seems the Lua generation "OsmLuaProcessing::Layer" is the biggest bottleneck now. Also because it is called single threaded.

Also: void OsmLuaProcessing::CorrectGeometry, if you disable that, it gives a measurable speedup

ndex % time    self  children    called     name
                                                 <spontaneous>
[1]     28.3    0.06   25.04                 kaguya::lua_type_traits<kaguya::FunctionInvokerType<std::tuple<void (OsmLuaProcessing::*)(std::__cxx11::basic_string<char, std::char_traits<char
                0.31   24.60 1948985/1948985     OsmLuaProcessing::Layer(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) [2]
                0.05    0.06 1948985/5923905     OsmLuaProcessing* kaguya::get_pointer<OsmLuaProcessing>(lua_State*, int, kaguya::types::typetag<OsmLuaProcessing>) [146]
                0.02    0.01 1948985/2951227     kaguya::lua_type_traits<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void>::get(lua_State*, int) [296]
-----------------------------------------------
                0.31   24.60 1948985/1948985     kaguya::lua_type_traits<kaguya::FunctionInvokerType<std::tuple<void (OsmLuaProcessing::*)(std::__cxx11::basic_string<char, std::char_traits<
[2]     28.1    0.31   24.60 1948985         OsmLuaProcessing::Layer(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) [2]
                0.08   13.41  977385/977385      void OsmLuaProcessing::CorrectGeometry<boost::geometry::model::multi_polygon<boost::geometry::model::polygon<boost::geometry::model::d2::poi
                0.49    6.83  977386/977386      OSMStore::store_multi_polygon<boost::geometry::model::multi_polygon<boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<dou
                0.01    1.70 1343365/1358273     OsmLuaProcessing::linestringCached() [64]
                0.07    0.97  373033/373033      void OSMStore::perform_mmap_operation<OSMStore::store_linestring<boost::geometry::model::linestring<boost::geometry::model::d2::point_xy<dou
                0.00    0.23    7053/14088       OsmLuaProcessing::multiPolygonCached() [127]
                0.03    0.17  373033/373033      void OsmLuaProcessing::CorrectGeometry<boost::geometry::model::linestring<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::
                0.11    0.00 1948985/1978623     std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, st
                0.10    0.00 1948985/3865670     std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned int, std::less<std::__cxx11::basic_string
                0.09    0.00 1948985/1978623     void std::deque<std::pair<boost::intrusive_ptr<OutputObject>, boost::container::small_vector<std::_Rb_tree_const_iterator<std::pair<std::__c
                0.06    0.00 4473606/22348907     void std::vector<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, std::allocator<boost::geometry::model::d2::p
                0.04    0.00 1948985/1978623     std::pair<boost::intrusive_ptr<OutputObject>, boost::container::small_vector<std::_Rb_tree_const_iterator<std::pair<std::__cxx11::basic_stri
                0.04    0.00  598567/40669082     OSMStoreImpl<NodeStore>::impl_nodes_at(unsigned long) const [46]
                0.03    0.00 1948985/1978623     std::pair<boost::intrusive_ptr<OutputObject>, boost::container::small_vector<std::_Rb_tree_const_iterator<std::pair<std::__cxx11::basic_stri
                0.03    0.00  970332/1066281     void std::vector<boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, true, true, s
                0.02    0.00 1343365/1684935     std::vector<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, std::allocator<boost::geometry::model::d2::point_x
                0.02    0.00  977385/2110814     std::vector<boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, true, true, std::v
                0.02    0.00 1948985/6032476     intrusive_ptr_release(OutputObject*) [266]
                0.02    0.00  970332/1021347     boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, true, true, std::vector, std::
                0.01    0.01   18686/19631       void std::deque<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, boost::interprocess::node_allocator<boost::geo
                0.00    0.00    7053/454913      std::vector<boost::geometry::model::polygon<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, true, true, std::v
                0.00    0.00  598567/598567      void OsmLuaProcessing::CorrectGeometry<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian> >(boost::geometry::model
                0.00    0.00  598567/628205      std::_Deque_iterator<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, boost::geometry::model::d2::point_xy<doub
                0.00    0.00   61086/122204      std::_Deque_iterator<std::vector<mmap::polygon_t, boost::interprocess::node_allocator<mmap::polygon_t, boost::interprocess::segment_manager<
                0.00    0.00   23314/69970       std::_Deque_iterator<mmap::linestring_base_t<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, std::vector, boos
                0.00    0.00   18686/39290       std::_Deque_iterator<boost::geometry::model::d2::point_xy<double, boost::geometry::cs::cartesian>, boost::geometry::model::d2::point_xy<doub
-

kleunen avatar Mar 21 '21 17:03 kleunen