Support histograms
It's not clear from the documentation how to support histogram plots. Is it possible out of the box, or do we need to do our own binning and then use a bar chart?
I think it's one of the most widely used chart types, so it might be useful to have a few examples in the documentation.
Here is an existing example (among others) consisting of three linked histograms: https://uwdata.github.io/mosaic/examples/flights-200k.html
And yes, the current model is to perform binning and create a bar chart.
Agreed that the documentation might benefit from a simpler example with "Histogram" in the name!
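For readers unfamiliar with the "bin, then draw bars" model described above, here is a minimal TypeScript sketch of the binning step on its own. It does not use the Mosaic or vgplot API; `binValues` and the `Bin` shape are hypothetical names chosen for illustration, and it assumes fixed-width bins aligned to multiples of `step`.

```typescript
// One histogram bar: the half-open interval [x1, x2) and how many values fall in it.
interface Bin {
  x1: number;
  x2: number;
  count: number;
}

// Hypothetical helper (not part of Mosaic or vgplot): fixed-width binning.
// Bins that contain no values are omitted from the result.
function binValues(values: number[], step: number): Bin[] {
  if (!(step > 0)) throw new Error("step must be positive");

  // Count values per bin index, where index = floor(value / step).
  const counts: Record<number, number> = {};
  for (const v of values) {
    const index = Math.floor(v / step);
    counts[index] = (counts[index] || 0) + 1;
  }

  // Turn bin indices back into [x1, x2) intervals, ordered left to right.
  return Object.keys(counts)
    .map(Number)
    .sort((a, b) => a - b)
    .map((index) => ({
      x1: index * step,
      x2: (index + 1) * step,
      count: counts[index],
    }));
}

const bins = binValues([1, 2, 2, 7, 11, 12, 14], 5);
console.log(bins);
```

Each resulting object maps onto one bar: `x1` and `x2` give the horizontal extent and `count` gives the height.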
Thanks!
I think it would also help to make the distinction between Mosaic and vgplot clearer. The introduction is very good about talking about Mosaic, what it is and how it works, but maybe we can be more explicit about the fact that vgplot is not the only way to use Mosaic.
I agree
It would also be useful to explain how to easily build an "escape hatch" for using any methods from vega-lite / observable from Mosaic.
Can you elaborate? We already explain extensibility in https://uwdata.github.io/mosaic/why-mosaic/#mosaic-is-extensible and have docs for building clients at https://uwdata.github.io/mosaic/core/#clients.
Another example that would be really fantastic would be one showing how to bin timestamp data. There are some really good examples of how to plot data by day of week or month of year, but I just can't get a linear timescale to plot.
vgplot.bin() throws an error "Binder Error: No function matches the given name and argument types '-(TIMESTAMP, BIGINT)' You might need to add explicit type casts." when you try to do that for example with the rectY mark. Perhaps there is something really rudimentary I am missing, in order to get that to work?
I am able to get an areaY to render with timestamp data, but I believe that for performance reasons it would be significantly better to have the data binned first.
An update on the histogram with a timescale dimension - I was able to make that work, using the following approach:
- Created a customized dateYearMonthDay() function, based on the existing dateMonth() API. It needs an input argument to state whether the X1 or X2 parameter should be calculated (X2 will add a +1 on the year date).
- Modified the rectY function to take x1 and x2 inputs, referencing the above functions.
However, I am not sure this would be considered best practice. Should the bin() function be able to seamlessly support this, as it does with other data types (or, if it doesn't today, is the aspiration that it should once implemented)? Feel free to comment on any better approach to achieve this. I do think an example for this would be very useful, as many people also use timescale elements for bar charts (business reports, etc.).
Hi @Unemyr, this is the direction I would recommend. The vgplot bin transform is specifically focused on binning quantitative values, and by design it does not operate on date-time data and related intervals (year, quarter, month, etc). I'd recommend opening a new feature request issue for support for time bin functions that produce the desired intervals (not unlike what Vega-Lite provides). We'd also be happy to review PRs along these lines.
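To make the x1/x2 interval idea from this exchange concrete, here is a minimal TypeScript sketch of time binning. It is not the Mosaic or vgplot API and not the custom function described above: `monthStart` and `binByMonth` are hypothetical helpers, the month granularity and UTC handling are assumptions, and the work is done client-side purely to illustrate the interval logic (in Mosaic the equivalent expressions would run in the database).

```typescript
// One time bin: the half-open interval [x1, x2) and the number of timestamps in it.
interface TimeBin {
  x1: Date;
  x2: Date;
  count: number;
}

// Hypothetical helper: start of the UTC month containing `d`, shifted by `offset` months.
// offset = 0 gives the bin start (x1); offset = 1 gives the bin end (x2).
function monthStart(d: Date, offset: number): Date {
  return new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth() + offset, 1));
}

// Hypothetical helper: group timestamps into calendar-month bins (empty months are omitted).
function binByMonth(dates: Date[]): TimeBin[] {
  const counts: Record<number, number> = {};
  for (const d of dates) {
    const key = monthStart(d, 0).getTime(); // epoch milliseconds of the month start
    counts[key] = (counts[key] || 0) + 1;
  }

  return Object.keys(counts)
    .map(Number)
    .sort((a, b) => a - b)
    .map((t) => {
      const x1 = new Date(t);
      return { x1, x2: monthStart(x1, 1), count: counts[t] };
    });
}

const timeBins = binByMonth([
  new Date("2024-01-15T10:00:00Z"),
  new Date("2024-01-31T23:59:59Z"),
  new Date("2024-02-01T00:00:00Z"),
  new Date("2024-12-25T08:00:00Z"),
]);
console.log(timeBins.map((b) => [b.x1.toISOString(), b.x2.toISOString(), b.count]));
```

Computing x2 as "the start of the next interval" rather than adding a fixed duration keeps bins contiguous even though months differ in length, and the December bin rolls over into the following year.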
OK, noted. I would be open to contributing PRs for that later. Thanks for the quick reply!
I'll close this for now since Mosaic supports histograms.