carto icon indicating copy to clipboard operation
carto copied to clipboard

Possibly suboptimal translation to XML

Open ajashton opened this issue 13 years ago • 18 comments

I'm not sure whether this is a bug, but I managed to create ~50 lines of Carto that get translated into something like 12,000 lines of XML. I probably should just be breaking things up differently, but perhaps there's room for optimization on Carto's side as well.

The offending style is at https://gist.github.com/859095 and the data to go with it is http://dl.dropbox.com/u/2398828/country-labels.geojson

Not the main issue, but I noticed many identical fontsets are created as well.

ajashton avatar Mar 07 '11 20:03 ajashton

The fontset duplication was fixed in #21, I'll check out this output to see if there's a way around duplication here, though I'm suspicious that this style layout simply creates some exponential number of styles because it needs to.

tmcw avatar Jun 15 '11 22:06 tmcw

This is triggered by having multiple column names in the same style (Z_ABBREV, Z_ADMIN, Z_NAME in ajashton's example). You can get the same thing easily with osm rules with amenity, landuse, leisure etc exploding exponentially.

A workaround that ajashton suggested is to use a different ::attachment for each object. It's sufficient to use one for each column name, for example:

.poi[amenity='post_box'][zoom>=17]::amenity { .,. } .poi[railway='station'][zoom>=10]::railway { ... } .poi[amenity='post_office][zoom>=17]::amenity { ... }

Note that if you do this, then the ordering of the features changes from that expected from the layer datasource, since all the objects with a given ::attachment are rendered first, then all the second ::attachment, and so on.

gravitystorm avatar Aug 05 '11 18:08 gravitystorm

Pushing this to 0.2.5: very high priority for a fix, but barely livable for the time being.

tmcw avatar Sep 07 '11 15:09 tmcw

Okay, this is still a high priority, but I'm still kind of at a loss for a compact summary of the expected output and how this differs.

I can see some issue in https://gist.github.com/4329932 in which excess rules are generated that will never be used, but not sure if this is the same problem that this ticket addresses.

tmcw avatar Dec 18 '12 17:12 tmcw

Experimental branch for this at https://github.com/mapbox/carto/tree/condense

Since this work will likely change the output of a lot of tests and we don't have visual tests for carto, it's not an easy thing to do on my own.

tmcw avatar Dec 18 '12 18:12 tmcw

re: condense branch, sweet, happy to try to test re: overall issue of what the heck is ideal/expected - I'm also kind of at a loss. I need to play around with small examples more to get a handle on this. But, what I'm wondering is if we may also need a non-cascading mode (e.g. a non filter-mode="first" mode)...

springmeyer avatar Dec 18 '12 18:12 springmeyer

The main issue I'm seeing is that zoom rules are not properly taken into consideration when we're condensing filters, so we're generating 3 rules where there should be two for each zoom/filter combo.

tmcw avatar Dec 18 '12 18:12 tmcw

fantastic observation.

springmeyer avatar Dec 18 '12 19:12 springmeyer

Adding more benchmarking statements shows that for the above testcase (https://gist.github.com/859095) most time is taken up in tree.Definition.prototype.toXML. It does not appear that any one definition takes a long time, but rather that their are so many to process (245). So, @tmcw 's attempt to reduce rule generation - in any way possible - seems solid.

$ ./bin/carto issue_20.mss -b
Parsing time: 8ms
Rule generation time: 1ms
Inheritance time: 28ms
Sorted time: 1ms
Create style "0": 0ms
        Style level properties: 5ms
        Calling toXML on 245 definitions (rules): 2161ms
Calling toXML for style "0": 2168ms
Total Style toXML time: 2168ms
TOTAL: 2214ms

springmeyer avatar Dec 19 '12 03:12 springmeyer

Finally got some profiling going: https://gist.github.com/4338438

Basic approach:

node --prof ./bin/carto -b test/rendering/issue20.mss

Check out v8, install, set

export D8_PATH=/Users/tmcw/src/v8/

then

~/src/v8/tools/mac-tick-processor >> analysis.txt

Basic jist of this is in 4ef52a82af83ddb35c5f1f96d100b9234c76b897 which reduces render time for this input from 0.35s to 0.14s on my machine. This is only a generation improvement, though - the output still needs to be fixed.

tmcw avatar Dec 19 '12 17:12 tmcw

Losing track of how this is a bug again - here's a reduced test case: https://gist.github.com/4339857

I get the 'extra filter generated' case, but that's at most adding a few more rules - I need a better example of ideal versus generated output, since it's very difficult to tell the difference between basic filter-explosions based on NxMxX combinations (which we can never fix in Carto until Mapnik does cascading) and a bug that can be fixed here.

tmcw avatar Dec 19 '12 19:12 tmcw

okay, back in my court to see if I can gin up a better testcase.

springmeyer avatar Dec 19 '12 22:12 springmeyer

So I think this issue represents a "fundamental" problem/feature of carto, rather than a bug that is easily addressed.

It's really, really easy to run into when dealing with OSM data because of the multiple-key approach to tagging. But what's easy in XML turns into a filter explosion in carto. A typical landcover layer - say 20 polygon types spread over 4 or 5 column names - is easy to write in XML with 20 filters in the XML. In carto, however, 20 rules end up with hundreds or thousands of filters in the output, and you end up with the kind of complex gymnastics shown in the sql query at https://github.com/gravitystorm/openstreetmap-carto/issues/15 to achieve a similar rendering effect.

I see two ways forward with this, but neither are great:

a) Have a non-filter-mode-first output option for particular layers. This would allow writing carto that is a direct port of a simple xml landcover-style layer, i.e. where each feature could match multiple rules without needing NxMxX filters. But I suspect that might involve a bit of a rewrite of carto.

b) Have an "I don't care" or "match-one-rule" flag for particular layers. This would keep the filter-mode-first flag on the xml output, but instruct carto not to build all of the NxMxX filters. It would effectively say "if there's a feature with both landuse=forest and tourism=attraction, I'm happy with whichever rule comes first. Don't try figuring out the NxM combo". It's of limited applicability but would solve the opaque-polygon-fill case.

Slightly hand-wavy, but maybe provides ideas? My aim is for the carto required for these scenarios to be as simple to write as the XML, and ideally keep the complexity of the SQL to a minimum too.

gravitystorm avatar Jan 07 '13 11:01 gravitystorm

Hey @gravitystorm - I agree. There are some filter-layer combinations that weren't really thought of in the early days of Carto and are handled poorly.

At this point, there are a few factors which basically explain why this is not moving quickly: limited time to work on Carto, lack of non-XML tests, requirement of backwards compatibility, lack of 'ideal output examples' for this ticket. Hence everyone being aware of the problem and its importance but a little limited in ability to jfdi it.

At some point, Dane will be working on symbolizer-based tests, and if you can figure out blue-sky examples of input & output, that would be super-helpful.

tmcw avatar Jan 07 '13 13:01 tmcw

How much of a show stopper is this issue for rolling out osm-carto on osm.org? Is @springmeyer's suggestion to patch in the old landcover style XML a feasible stop gap until we have the time to tackle the underlying issue?

lxbarth avatar Jan 07 '13 13:01 lxbarth

It's not a showstopper at all - there are workarounds, and some are in place. This is more about the general case, and removing the need for workarounds in the first place. #235 is more of an problem for roll-out at the moment.

gravitystorm avatar Jan 07 '13 13:01 gravitystorm

I experienced the XML explosion and had a hard time figuring out what was causing it. I haven't looked deep into the issue, but it does seems the XML has a lot of unnecessary rules. For example, the style I'm working with generates this nonsensical rule that looks for ways that are bridges and tunnels at the same time:

  <Rule>
    <MaxScaleDenominator>400000</MaxScaleDenominator>
    <MinScaleDenominator>500</MinScaleDenominator>
    <Filter>([tunnel] = 1) and ([bridge] = 1)</Filter>
    <LineSymbolizer stroke-dasharray="3, 3" stroke="#adaca8" stroke-linecap="butt" stroke-linejoin="round" />
  </Rule>

I didn't to a reduction of this yet, so there isn't not much point is showing the carto code, but nowhere does it look for bridgetunnels :-)

emiltin avatar Sep 14 '13 15:09 emiltin

Are there any updates on ways to avoid this issue without doing complicated "feature" column expressions in SQL?

pnorman avatar Jun 29 '14 09:06 pnorman