Rendering with maxbins when the data has fewer data points than the number of bins
I have the following data:
cdf_data = [
  { d_percentages: 0, student_percentages: 35 },
  { d_percentages: 10, student_percentages: 42 },
  { d_percentages: 20, student_percentages: 55 },
  { d_percentages: 30, student_percentages: 75 },
  { d_percentages: 40, student_percentages: 85 },
  { d_percentages: 50, student_percentages: 91 },
  { d_percentages: 60, student_percentages: 96 },
  { d_percentages: 70, student_percentages: 98 },
  { d_percentages: 80, student_percentages: 98 },
  { d_percentages: 90, student_percentages: 100 },
  { d_percentages: 100, student_percentages: 100 }
]
I created the following visualization:
cdf_in_js_with_minbins = {
  const plot = vl.markBar()
    .data(cdf_data)
    .encode(
      vl.y()
        .fieldQ('student_percentages'),
      vl.x()
        .fieldQ('d_percentages') //.bin(true)
        .scale({ "domain": [0, 100] })
        .bin({ minbins: 10 })
    ).width(500).height(250);
  return plot.render();
}
This outputs:

Initially, before using minbins: 10 as above, I had tried maxbins: 30, and it rendered the following:

This confused me a lot, especially because there are two bars in the range 90-100. Also, nowhere in cdf_data does it say that the 0-5 range has 35% of students and the 5-10 range has 0% of students. I assumed that, being a "max" limit, maxbins would end up showing just 10 bins, as in the first figure. Instead, it created 20 bins. Am I misunderstanding something here, or is it a bug?
Here is the observablehq notebook rendering both plots.
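I think this is as expected. If you have prebinned data, use the binned property. The last bin is inclusive of its upper bound, not exclusive, so the value 100 falls into the final bin instead of starting a new one. The actual number of bins depends only on the range of the values (together with maxbins), not on how many data points there are.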
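To make the last point concrete, here is a rough sketch of how a "nice" step can be picked from the value range and maxbins alone. This is my simplified reading of the approach, not the library's actual code: `sketchBinStep` is a hypothetical helper, and the base of 10 and divisors [5, 2] are assumed to match the documented defaults for the `base` and `divide` bin parameters.

```javascript
// Hypothetical helper: a simplified sketch of choosing a "nice" bin step.
// Not the library's actual code; base 10 and divisors [5, 2] are assumptions.
function sketchBinStep(extent, maxbins) {
  const base = 10;
  const divide = [5, 2];
  const span = extent[1] - extent[0];

  // Start from a power of the base that is fine-grained enough.
  const level = Math.ceil(Math.log(maxbins) / Math.log(base));
  let step = Math.pow(base, Math.round(Math.log(span) / Math.log(base)) - level);

  // Coarsen the step while it would produce more than maxbins bins.
  while (Math.ceil(span / step) > maxbins) step *= base;

  // Refine by each divisor as long as the bin count stays within maxbins.
  for (const d of divide) {
    const candidate = step / d;
    if (span / candidate <= maxbins) step = candidate;
  }

  return { step, bins: Math.ceil(span / step) };
}

console.log(sketchBinStep([0, 100], 30)); // step 5, 20 bins
console.log(sketchBinStep([0, 100], 10)); // step 10, 10 bins
```

Note that the data never enters the calculation, only the extent [0, 100] and maxbins. With maxbins: 30, a step of 5 (20 bins) is the finest "nice" step that stays under the limit, which is why the 0-5 bin picks up the 35% value and the 5-10 bin is empty.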