Inconsistent error message and documentation on px.histogram parameter nbins
The plotly express documentation states that the parameter nbins of px.histogram should be an int, however, the error message when inputting something else states that the nbinsx property of histograms should be an int or a float that will be cast to an int. The validation seems to check if the parameter is an instance of an int. The documentation of go.Histogram does not specify types for nbinsx.
It seems to me the documentation of go.Histogram could also include that nbinsx (and nbinsy?) should be int and that the type cannot be a float that will be cast to an int. Would be happy to update documentation and code to reflect this.
Reproducible example with float input:
import plotly.express as px
df = px.data.iris()
px.histogram(
df,
x="sepal_length",
nbins=47.0,
)
This produces a ValueError:
ValueError Traceback (most recent call last)
Cell In[11], line 1
----> 1 px.histogram(
2 df,
3 x="sepal_length",
4 nbins=47.0,
5 )
File ~/.cache/pypoetry/virtualenvs/dsp-rNgR53Wo-py3.10/lib/python3.10/site-packages/plotly/express/_chart_types.py:480, in histogram(data_frame, x, y, color, pattern_shape, facet_row, facet_col, facet_col_wrap, facet_row_spacing, facet_col_spacing, hover_name, hover_data, animation_frame, animation_group, category_orders, labels, color_discrete_sequence, color_discrete_map, pattern_shape_sequence, pattern_shape_map, marginal, opacity, orientation, barmode, barnorm, histnorm, log_x, log_y, range_x, range_y, histfunc, cumulative, nbins, text_auto, title, template, width, height)
434 def histogram(
435 data_frame=None,
436 x=None,
(...)
472 height=None,
473 ) -> go.Figure:
474 """
475 In a histogram, rows of `data_frame` are grouped together into a
476 rectangular mark to visualize the 1D distribution of an aggregate
477 function `histfunc` (e.g. the count or sum) of the value `y` (or `x` if
478 `orientation` is `'h'`).
479 """
--> 480 return make_figure(
481 args=locals(),
482 constructor=go.Histogram,
483 trace_patch=dict(
484 histnorm=histnorm,
485 histfunc=histfunc,
486 cumulative=dict(enabled=cumulative),
487 ),
488 layout_patch=dict(barmode=barmode, barnorm=barnorm),
489 )
File ~/.cache/pypoetry/virtualenvs/dsp-rNgR53Wo-py3.10/lib/python3.10/site-packages/plotly/express/_core.py:2256, in make_figure(args, constructor, trace_patch, layout_patch)
2251 group[var] = 100.0 * group[var] / group_sum
2253 patch, fit_results = make_trace_kwargs(
2254 args, trace_spec, group, mapping_labels.copy(), sizeref
2255 )
-> 2256 trace.update(patch)
2257 if fit_results is not None:
2258 trendline_rows.append(mapping_labels.copy())
File ~/.cache/pypoetry/virtualenvs/dsp-rNgR53Wo-py3.10/lib/python3.10/site-packages/plotly/basedatatypes.py:5139, in BasePlotlyType.update(self, dict1, overwrite, **kwargs)
5137 BaseFigure._perform_update(self, kwargs, overwrite=overwrite)
5138 else:
-> 5139 BaseFigure._perform_update(self, dict1, overwrite=overwrite)
5140 BaseFigure._perform_update(self, kwargs, overwrite=overwrite)
5142 return self
File ~/.cache/pypoetry/virtualenvs/dsp-rNgR53Wo-py3.10/lib/python3.10/site-packages/plotly/basedatatypes.py:3940, in BaseFigure._perform_update(plotly_obj, update_obj, overwrite)
3937 plotly_obj[key] = val
3938 else:
3939 # Assign non-compound value
-> 3940 plotly_obj[key] = val
3942 elif isinstance(plotly_obj, tuple):
3944 if len(update_obj) == 0:
3945 # Nothing to do
File ~/.cache/pypoetry/virtualenvs/dsp-rNgR53Wo-py3.10/lib/python3.10/site-packages/plotly/basedatatypes.py:4874, in BasePlotlyType.__setitem__(self, prop, value)
4870 self._set_array_prop(prop, value)
4872 # ### Handle simple property ###
4873 else:
-> 4874 self._set_prop(prop, value)
4875 else:
4876 # Make sure properties dict is initialized
4877 self._init_props()
File ~/.cache/pypoetry/virtualenvs/dsp-rNgR53Wo-py3.10/lib/python3.10/site-packages/plotly/basedatatypes.py:5218, in BasePlotlyType._set_prop(self, prop, val)
5216 return
5217 else:
-> 5218 raise err
5220 # val is None
5221 # -----------
5222 if val is None:
5223 # Check if we should send null update
File ~/.cache/pypoetry/virtualenvs/dsp-rNgR53Wo-py3.10/lib/python3.10/site-packages/plotly/basedatatypes.py:5213, in BasePlotlyType._set_prop(self, prop, val)
5210 validator = self._get_validator(prop)
5212 try:
-> 5213 val = validator.validate_coerce(val)
5214 except ValueError as err:
5215 if self._skip_invalid:
File ~/.cache/pypoetry/virtualenvs/dsp-rNgR53Wo-py3.10/lib/python3.10/site-packages/_plotly_utils/basevalidators.py:937, in IntegerValidator.validate_coerce(self, v)
933 else:
934 # Check int
935 if not isinstance(v, int):
936 # don't let int() cast strings to ints
--> 937 self.raise_invalid_val(v)
939 # Check min/max
940 if self.has_min_max:
File ~/.cache/pypoetry/virtualenvs/dsp-rNgR53Wo-py3.10/lib/python3.10/site-packages/_plotly_utils/basevalidators.py:296, in BaseValidator.raise_invalid_val(self, v, inds)
293 for i in inds:
294 name += "[" + str(i) + "]"
--> 296 raise ValueError(
297 """
298 Invalid value of type {typ} received for the '{name}' property of {pname}
299 Received value: {v}
300
301 {valid_clr_desc}""".format(
302 name=name,
303 pname=self.parent_name,
304 typ=type_str(v),
305 v=repr(v),
306 valid_clr_desc=self.description(),
307 )
308 )
ValueError:
Invalid value of type 'builtins.float' received for the 'nbinsx' property of histogram
Received value: 47.0
The 'nbinsx' property is a integer and may be specified as:
- An int (or float that will be cast to an int)
in the interval [0, 9223372036854775807]
The same error can also be produced for a go.Histogram using a float input to nbinsx:
go.Histogram(
x=df["sepal_length"],
nbinsx=47.0,
)
Good catch @gunnhildsp! My guess is this is a holdover from the JavaScript behavior, since there’s only one number type in JS. That said once plotly.js gets a hold of this number, it does check if nbins % 1 == 0 and if not it will discard the value, so either we should keep the current only-accept-int and fix the docs, or we should also accept float but only if it matches an int as far as % is concerned (and assuming that behavior is the same between Python and JS). I guess only-accept-int seems cleaner, so let’s keep that, and we’d welcome a PR to fix the messages/docs.
@alexcjohnson thank you for such a quick reply!
Should I create a separate issue in the docs repo for the doc change or just reference this issue?
I see that the ValueError message is made in the IntegerValidator. My proposal is to introduce a new parameter to the class, float_can_be_cast_to_int which defaults to True, and set it to False from the validators NbinsxValidator and NbinsyValidator. What do you think about that approach?
Should I create a separate issue in the docs repo for the doc change or just reference this issue?
This issue is fine for both. But let's nail down the behavior here first before updating the docs.
I see that the
ValueErrormessage is made in theIntegerValidator. My proposal is to introduce a new parameter to the class,float_can_be_cast_to_intwhich defaults toTrue, and set it toFalsefrom the validatorsNbinsxValidatorandNbinsyValidator. What do you think about that approach?
Are there any integer attributes that DON'T behave the way you're seeing with nbinsx/nbinsy? All the usage of these validators is autogenerated from the plotly.js schema, so if there is any distinction in how different cases of IntegerValidator behave it would need to be tied to something in the schema, which I'm pretty sure there isn't... so I think all integer attributes are handled the same way inside plotly.js and therefore they should all inherit this behavior.
Thank you for the input @alexcjohnson!