
facet fails on geoshape

Open iliatimofeev opened this issue 6 years ago • 13 comments

facet fails on geojson with Error: Undefined data set name: "child_main"

example

{
  "data": {
    "format": {"type": "json", "property": "features"},
    "url": "https://gist.githubusercontent.com/dwtkns/c6945b98afe6cc2fc410/raw/77de7bc07fc3974ec892aa6be46e7e035f637ea8/us.geojson"
  },
  "facet": {"row": {"field": "properties.region_big", "type": "nominal"}},
  "spec": {
    "projection": {"type": "albersUsa"},
    "mark": "geoshape",
    "encoding": {"color": {"field": "properties.name", "type": "nominal"}}
  },
  "$schema": "https://vega.github.io/schema/vega-lite/v2.4.1.json"
}

generated vega (editor)

 "projections": [
    {
      "name": "projection",
      "size": {"signal": "[child_width, child_height]"},
      "fit": {"signal": "data('child_main')"}, //<------
      "type": "albersUsa"
    }
  ],

it works without projection

{
  "data": {
    "format": {"type": "json", "property": "features"},
    "url": "https://gist.githubusercontent.com/dwtkns/c6945b98afe6cc2fc410/raw/77de7bc07fc3974ec892aa6be46e7e035f637ea8/us.geojson"
  },
  "facet": {"column": {"field": "properties.region_big", "type": "nominal"}},
  "spec": {
    "mark": "circle",
    "encoding": {
      "y": {"field": "properties.name", "type": "nominal"},
      "x": {"field": "properties.shape_area", "type": "quantitative"}
    }
  },
  "$schema": "https://vega.github.io/schema/vega-lite/v2.4.1.json"
}

iliatimofeev avatar May 10 '18 00:05 iliatimofeev

I just ran into this, and it still seems to be an issue in the latest version of Vega-Lite (5.2). Here is a spec that reproduces it with a sample dataset; the color scale seems correct, but the maps are not showing:

{
  "data": {
    "url": "https://cdn.jsdelivr.net/npm/[email protected]/data/us-10m.json",
    "format": {"feature": "states", "type": "topojson"}
  },
  "facet": {"row": {"field": "key", "type": "nominal"}},
  "spec": {
    "mark": {"type": "geoshape", "tooltip": true},
    "encoding": {"color": {"field": "value", "type": "quantitative"}},
    "height": 100,
    "projection": {"type": "albersUsa"},
    "transform": [
      {
        "lookup": "id",
        "from": {
          "data": {
            "url": "https://cdn.jsdelivr.net/npm/[email protected]/data/population_engineers_hurricanes.csv"
          },
          "key": "id",
          "fields": ["population", "engineers", "hurricanes"]
        }
      },
      {"fold": ["population", "engineers", "hurricanes"]}
    ]
  },
  "resolve": {"scale": {"color": "independent"}}
}

[Image: screenshot of the resulting chart, with a link to open the chart in the Vega Editor]

joelostblom avatar Apr 29 '22 23:04 joelostblom

In the compiled Vega, one has to change the line

"fit": {"signal": "data('child_main')"},

into

"fit": {"signal": "data('source_0')"},

to make it work. It looks good then! I searched the Vega-Lite codebase to see where child_main comes from, but I could not find where it is defined.
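For anyone who wants to apply this edit programmatically rather than by hand, here is a minimal sketch in Python. It assumes the compiled Vega spec has already been loaded as a dict; the function name `retarget_projection_fit` is hypothetical, and `source_0` is just the dataset name from the specs in this thread, so it may differ for other charts. This is a workaround on the compiled output, not a fix in Vega-Lite itself.

```python
import copy


def retarget_projection_fit(vega_spec, old="child_main", new="source_0"):
    """Return a copy of a compiled Vega spec in which every projection
    `fit` signal that reads data('<old>') reads data('<new>') instead.

    Hypothetical helper: the dataset names are assumptions taken from
    this thread and may differ in other compiled specs.
    """
    patched = copy.deepcopy(vega_spec)  # leave the caller's spec untouched

    def visit(node):
        if isinstance(node, dict):
            projections = node.get("projections")
            if isinstance(projections, list):
                for proj in projections:
                    if not isinstance(proj, dict):
                        continue
                    fit = proj.get("fit")
                    if isinstance(fit, dict) and isinstance(fit.get("signal"), str):
                        fit["signal"] = fit["signal"].replace(
                            f"data('{old}')", f"data('{new}')"
                        )
            # Recurse so that projections nested in group marks are also reached.
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(patched)
    return patched


# Usage with the projection fragment quoted earlier in this issue.
compiled = {
    "projections": [
        {
            "name": "projection",
            "size": {"signal": "[child_width, child_height]"},
            "fit": {"signal": "data('child_main')"},
            "type": "albersUsa",
        }
    ]
}
print(retarget_projection_fit(compiled)["projections"][0]["fit"])
# prints {'signal': "data('source_0')"}
```

The patched dict can then be serialized with `json.dumps` and pasted back into the Vega Editor.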

mattijn avatar Oct 31 '22 21:10 mattijn

Any chance this will ever get fixed? This is currently a pretty big blocker in a project I'm developing where just concatenating charts is not a viable workaround.

velochy avatar Dec 23 '23 12:12 velochy

If you could help triage to see where the issue lies, I'm happy to help with the pull request. I have a lot on my plate so making time to triage this specific issue may take some time.

domoritz avatar Dec 23 '23 13:12 domoritz

What do you mean by triage in this case? I'm happy to help, but I am quite unfamiliar with the Vega codebase and its internal functioning, so I'm not sure how much help I can be.

velochy avatar Dec 24 '23 08:12 velochy

Vega-Lite is basically a transpiler. The first step would be to find out which part of the Vega spec that Vega-Lite generates is wrong. Then we need to find out what code causes it. If you step through the translation of an example, you may be able to see where the code generates the wrong output.

domoritz avatar Dec 24 '23 10:12 domoritz

I just tried and can recreate the issue with @joelostblom's spec and resolve it with @mattijn's fix. Does that help @domoritz?

{
  "data": {
    "url": "https://cdn.jsdelivr.net/npm/[email protected]/data/us-10m.json",
    "format": {"feature": "states", "type": "topojson"}
  },
  "facet": {"row": {"field": "key", "type": "nominal"}},
  "spec": {
    "mark": {"type": "geoshape", "tooltip": true},
    "encoding": {"color": {"field": "value", "type": "quantitative"}},
    "height": 100,
    "projection": {"type": "albersUsa"},
    "transform": [
      {
        "lookup": "id",
        "from": {
          "data": {
            "url": "https://cdn.jsdelivr.net/npm/[email protected]/data/population_engineers_hurricanes.csv"
          },
          "key": "id",
          "fields": ["population", "engineers", "hurricanes"]
        }
      },
      {"fold": ["population", "engineers", "hurricanes"]}
    ]
  },
  "resolve": {"scale": {"color": "independent"}}
}

In the compiled Vega, change the line

"fit": {"signal": "data('child_main')"},

into

"fit": {"signal": "data('source_0')"},

and it then works.

PBI-David avatar Dec 24 '23 10:12 PBI-David

Update: This example works. My assumption is that the geometry data somehow cannot be selected and transformed to GeoJSON even when the mark is set to geoshape, so I load the us-10m data again and combine it into the geo column.

===========================================

I compared the generated Vega spec with this example and found that the issue may be a missing line in the transform: {"type": "geojson", "signal": "child_main"}. Open the Chart in the Vega Editor

ChiaLingWeng avatar Jan 24 '24 01:01 ChiaLingWeng

Thanks for exploring this @ChiaLingWeng! If I understand correctly, your first working spec is similar to this example from the Altair gallery. The key in these workarounds seems to be using a lookup transform for the geo data instead of specifying it as the default data object of the chart. Does that sound correct?

It would be great to get it working without a lookup transform, and it seems like replacing data('child_main') with data('source_0') as mentioned above would achieve this, but we can't figure out where in the code this change needs to be made. Are you suggesting that it is more appropriate to add {"type": "geojson", "signal": "child_main"} instead?

joelostblom avatar Mar 13 '24 21:03 joelostblom

Hi @joelostblom, yes, my workaround is to create another lookup transform. For this bug, I think the data signal is pushed from here https://github.com/vega/vega-lite/blob/5b7f963f52dd3483687d0608a045cd0f5aa4cbdb/src/compile/projection/parse.ts#L69, and the name child_main is generated here https://github.com/vega/vega-lite/blob/5b7f963f52dd3483687d0608a045cd0f5aa4cbdb/src/compile/model.ts#L469-L475. From my understanding, it is quite reasonable for the child model in a facet chart to be named "child" (correct me if I'm wrong). So I searched for a faceted geo chart example and concluded that adding {"type": "geojson", "signal": "child_main"} might be correct.

ChiaLingWeng avatar Mar 15 '24 07:03 ChiaLingWeng

@ChiaLingWeng Ah I see, thanks for elaborating. I think what you are suggesting makes sense and it's exciting to see a potential solution that would bring faceted maps to VL! I will say that I'm not the most familiar with the code base; would you be able to create a PR that implements your solution so that we can have a review from someone on the team with more experience?

joelostblom avatar Mar 15 '24 17:03 joelostblom

Hi @joelostblom, I switched the positions of the two data sources and found that this works. I'm a little confused now, but I'll see if I can help with this.

{
  "data": {
    "url": "https://cdn.jsdelivr.net/npm/[email protected]/data/population_engineers_hurricanes.csv"
  },
  "facet": {"row": {"field": "key", "type": "nominal"}},
  "spec": {
    "height": 100,
    "transform": [
      {"fold": ["population", "engineers", "hurricanes"]},
      {
        "lookup": "id",
        "from": {
          "data": {
            "url": "data/us-10m.json",
            "format": {"type": "topojson", "feature": "states"}
          },
          "key": "id"
        },
        "as": "geo"
      }
    ],
    "projection": {"type": "albersUsa"},
    "mark": "geoshape",
    "encoding": {
      "shape": {"field": "geo", "type": "geojson"},
      "color": {"field": "value", "type": "quantitative"}
    }
  },
  "resolve": {"scale": {"color": "independent"}}
}

ChiaLingWeng avatar Mar 19 '24 04:03 ChiaLingWeng

Thanks for taking the time to look into this further @ChiaLingWeng. I was trying to understand it better myself by using a simpler example that does not involve a transform and where the only data source is the geo data:

[Image: screenshot of the unfaceted chart, with a link to open the chart in the Vega Editor]

If I add faceting based on the ID to this chart, we can see that all the charts are empty, but the color scale is actually correct. So the data is passed correctly to the scale, but not to the map projection:

[Image: screenshot of the faceted chart, with a link to open the chart in the Vega Editor (scroll all the way to the right)]

If you open the Vega spec for the faceted VL chart, you can see that the data is referenced in three places:

  • On line 22 inside the projections key, it is data('child_main')
  • On line 51 inside the from key, it is source_0
  • On line 84 inside the scales key, it is also source_0

Based on that, it seems like the simplest solution might be to figure out how to replace the first occurrence with data('source_0') as @mattijn originally mentioned. Maybe the geojson signals you mentioned are only needed when there are transforms in the spec?
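To make this kind of comparison less manual, here is a rough diagnostic sketch in Python. It walks a compiled Vega spec (loaded as a dict), collects the dataset names that are declared, and reports any name used in a data('...') signal expression that is never declared. The function names are hypothetical, and the sample spec is a hand-written fragment that only imitates the structure described above; it is not the actual compiler output. The check covers data('...') expressions only, not plain string references such as "data": "source_0".

```python
import re

# Matches data('name') or data("name") inside a Vega expression string.
DATA_REF = re.compile(r'''data\(\s*['"]([^'"]+)['"]\s*\)''')


def defined_datasets(node, found=None):
    """Collect names declared in `data` arrays and in facet definitions."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        data = node.get("data")
        if isinstance(data, list):  # a `data` array declares named datasets
            for entry in data:
                if isinstance(entry, dict) and isinstance(entry.get("name"), str):
                    found.add(entry["name"])
        facet = node.get("facet")
        if isinstance(facet, dict) and isinstance(facet.get("name"), str):
            found.add(facet["name"])  # a facet names its per-group dataset
        for value in node.values():
            defined_datasets(value, found)
    elif isinstance(node, list):
        for item in node:
            defined_datasets(item, found)
    return found


def referenced_in_signals(node, found=None):
    """Collect names used as data('<name>') in any string value."""
    if found is None:
        found = set()
    if isinstance(node, str):
        found.update(DATA_REF.findall(node))
    elif isinstance(node, dict):
        for value in node.values():
            referenced_in_signals(value, found)
    elif isinstance(node, list):
        for item in node:
            referenced_in_signals(item, found)
    return found


def undefined_data_references(vega_spec):
    """Names referenced via data('...') that are never declared."""
    return sorted(referenced_in_signals(vega_spec) - defined_datasets(vega_spec))


# Hand-written fragment imitating the faceted spec (an assumption, not real output).
sample = {
    "data": [{"name": "source_0", "url": "data/us-10m.json"}],
    "projections": [
        {"name": "projection", "fit": {"signal": "data('child_main')"}, "type": "albersUsa"}
    ],
    "marks": [
        {
            "type": "group",
            "from": {"facet": {"name": "facet", "data": "source_0", "groupby": ["id"]}},
            "marks": [{"type": "shape", "from": {"data": "facet"}}],
        }
    ],
}
print(undefined_data_references(sample))  # prints ['child_main']
```

Running this on the real compiled spec should flag child_main in the same way, which matches the "Undefined data set name" error from the original report.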

joelostblom avatar Mar 19 '24 16:03 joelostblom