sort by channel not used to visualize?
I have a dataset sorted by a column, let's say rate. I then create a stacked bar chart using a different column, let's say value.
I want a stacked bar chart on value but sorted by rate. Even though my data is already sorted by rate, when I do the following, the y domain is ordered alphabetically:
Plot.barX(data, { x: "value", y: "name", fill: "type" })
If I try to use sort: { y: "x", reverse: true, limit: 20 }, I can't figure out how to get it to sort by my rate column instead.
I'm able to work around this by explicitly setting the y domain to the sorted order of names I want, but it feels like I might be missing something?
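For anyone landing here, the workaround described above might look roughly like the following. This is a minimal sketch in plain JavaScript; the sample rows and the yDomain name are made up for illustration, and it assumes each name carries a single rate.

```javascript
// Hypothetical rows: several "type" segments per name, one rate per name.
const data = [
  {name: "alpha", type: "a", value: 3, rate: 0.2},
  {name: "alpha", type: "b", value: 1, rate: 0.2},
  {name: "charlie", type: "a", value: 2, rate: 0.9},
  {name: "bravo", type: "a", value: 5, rate: 0.5},
  {name: "bravo", type: "b", value: 2, rate: 0.5}
];

// Sort a copy by rate (descending), then keep each name once in first-seen order.
const yDomain = [...new Set(
  [...data].sort((a, b) => b.rate - a.rate).map((d) => d.name)
)];

console.log(yDomain); // ["charlie", "bravo", "alpha"]
```

The resulting array can then be passed as the y scale's domain, e.g. Plot.plot({y: {domain: yDomain}, marks: [...]}), which overrides the default alphabetical ordering.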
Can you share a notebook? When you use sort: { y: "x", reverse: true, limit: 20 } inside a mark's options, the "x" part must refer to an existing channel (and you can also specify a reducer).
So one example might be to use:
Plot.barX(data, {
  x: "value",
  y: "name",
  fill: "type",
  stroke: "rate",
  sort: { y: "stroke", reduce: "median", reverse: true, limit: 20 }
})
but it implies you then have this stroke channel that you probably didn't want initially. Of course it's possible to use a channel with an invisible consequence (e.g. use "strokeOpacity", since there is no stroke), but it feels a bit like cheating.
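To make the semantics of those sort options concrete, here is a rough model in plain JavaScript of what a sort like { y: "stroke", reduce: "median", reverse: true, limit: 20 } does: group the channel values by y, reduce each group, order the domain by the reduced value, then reverse and truncate. The sortedDomain and median helpers are hypothetical names for this sketch, not Plot's implementation.

```javascript
// Median of a non-empty array of numbers.
function median(values) {
  const s = [...values].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Hypothetical helper: order the distinct values of `key` by a reduced value of `by`.
function sortedDomain(rows, {key, by, reduce = median, reverse = false, limit = Infinity}) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row[key])) groups.set(row[key], []);
    groups.get(row[key]).push(row[by]);
  }
  const ordered = [...groups]
    .map(([k, values]) => [k, reduce(values)])
    .sort((a, b) => a[1] - b[1]) // ascending by reduced value
    .map(([k]) => k);
  if (reverse) ordered.reverse();
  return ordered.slice(0, limit);
}

const rows = [
  {name: "alpha", rate: 1}, {name: "alpha", rate: 3},
  {name: "bravo", rate: 10},
  {name: "charlie", rate: 4}, {name: "charlie", rate: 6}
];

// Medians are alpha 2, bravo 10, charlie 5.
console.log(sortedDomain(rows, {key: "name", by: "rate", reverse: true, limit: 2}));
// ["bravo", "charlie"]
```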
I ran into this the other day but I forget the context now. It would be nice if there were some syntax to materialize a column in the sort options for when you don’t have an appropriate existing channel. Maybe something like sort: {y: {value: "rate"}}?
Maybe we could check the optional (explicit) channels used for the tip mark?
@Fil It should already work if you use the channels option to declare an additional channel.
Yes, this works (paste it into https://observablehq.com/d/c4d7fa2856198eea):
Plot.plot({
  marks: [
    Plot.barX(clean_monthly_april, {
      y: 'title',
      x: 'activations',
      channels: {difference: {value: 'difference'}},
      sort: {y: 'difference'}
    })
  ]
})
meaning we can close this issue!?
Since 0.6.7 you can shorten slightly too:
Plot.plot({
  marks: [
    Plot.barX(clean_monthly_april, {
      y: 'title',
      x: 'activations',
      channels: {difference: 'difference'},
      sort: {y: 'difference'}
    })
  ]
})
We could potentially allow this shorthand in the future:
Plot.plot({
  marks: [
    Plot.barX(clean_monthly_april, {
      y: 'title',
      x: 'activations',
      sort: {y: 'difference'}
    })
  ]
})
where we implicitly materialize the difference channel from the difference field if a channel with that name doesn’t already exist. But that introduces some ambiguity because it puts channel names and channel values in the same namespace. So, it's probably better to declare the channel explicitly as above.
More discussion: https://talk.observablehq.com/t/observable-plot-cell-how-to-order-y-axis-by-a-specific-field/8120/6
I think we should probably allow field names to be specified, with channel names taking priority — assuming that it’s possible for us to inspect whether a channel with the given name exists at runtime. (I think it should be possible?) And then we’ll need to introduce some disambiguation syntax. For channels, probably:
sort: {y: {channel: "x", order: "descending"}}
For fields, probably:
sort: {y: {value: "latitude", order: "descending"}}
This is consistent with how the sort transform works, which is nice.
Unfortunately the latter is already supported for referencing channel names. So we’ll probably need to continue to allow that, but we could issue a warning saying that it’s deprecated and may be removed in the future? This would mean you’d need to specify a function if the field name collides with a channel:
sort: {y: {value: (d) => d.x, order: "descending"}}
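To illustrate how the disambiguation proposed above could resolve, here is a small sketch in plain JavaScript. Everything in it is hypothetical: resolveSortSource, its return shape, and the deprecated flag are invented for illustration and are not part of Plot. It assumes channels is an object mapping channel names to arrays of already-materialized values.

```javascript
// Hypothetical resolver for the proposed sort syntax; not Plot's API.
function resolveSortSource(spec, channels, data) {
  // {channel: "x"}: explicit channel reference.
  if ("channel" in spec) {
    if (!Object.hasOwn(channels, spec.channel)) {
      throw new Error(`unknown channel: ${spec.channel}`);
    }
    return {source: "channel", deprecated: false, values: channels[spec.channel]};
  }
  const {value} = spec;
  // {value: (d) => ...}: accessor function, always evaluated against the data.
  if (typeof value === "function") {
    return {source: "function", deprecated: false, values: data.map(value)};
  }
  // {value: "x"} where "x" is a channel: legacy behavior, kept but flagged.
  if (Object.hasOwn(channels, value)) {
    return {source: "channel", deprecated: true, values: channels[value]};
  }
  // {value: "latitude"}: otherwise treat the string as a field name.
  return {source: "field", deprecated: false, values: data.map((d) => d[value])};
}

const sample = [{x: 1, latitude: 10}, {x: 2, latitude: 20}];
const sampleChannels = {x: [100, 200]};

console.log(resolveSortSource({value: "latitude"}, sampleChannels, sample).source); // "field"
console.log(resolveSortSource({value: "x"}, sampleChannels, sample).deprecated); // true
console.log(resolveSortSource({value: (d) => d.x}, sampleChannels, sample).values); // [1, 2]
```

The last call shows the escape hatch mentioned above: when a field name collides with a channel name, an accessor function reads the field unambiguously.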