matplotlib
matplotlib copied to clipboard
[Bug]: TypeError when plotting against list of datetime.date where 0th element of list is None
Bug summary
pyplot raises an error when a list of dates starts with None.
Code for reproduction
from datetime import *
from matplotlib import pyplot as plt
y = [6, 2, 8, 3, 1, 8, 5, 3, 0, 7]
x = [date.today() + timedelta(days=i) for i in range(10)]
x[5] = None
x[-1] = None
plt.plot(x, y) # works
plt.show()
x[0] = None
plt.plot(x, y) # TypeError
plt.show()
Actual outcome
Traceback (most recent call last):
File "pyplot_test.py", line 17, in <module>
plt.plot(x, y)
File "<python_path>\lib\site-packages\matplotlib\pyplot.py", line 2769, in plot
return gca().plot(
File "<python_path>\lib\site-packages\matplotlib\axes\_axes.py", line 1634, in plot
self.add_line(line)
File "<python_path>\lib\site-packages\matplotlib\axes\_base.py", line 2288, in add_line
self._update_line_limits(line)
File "<python_path>\lib\site-packages\matplotlib\axes\_base.py", line 2311, in _update_line_limits
path = line.get_path()
File "<python_path>\lib\site-packages\matplotlib\lines.py", line 999, in get_path
self.recache()
File "<python_path>\lib\site-packages\matplotlib\lines.py", line 652, in recache
x = _to_unmasked_float_array(xconv).ravel()
File "<python_path>\lib\site-packages\matplotlib\cbook\__init__.py", line 1298, in _to_unmasked_float_array
return np.asarray(x, float)
File "<python_path>\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
return array(a, dtype, copy=False, order=order)
TypeError: float() argument must be a string or a number, not 'datetime.date'
Expected outcome
plot data starting at index 1.
Additional information
No response
Operating system
windows 10
Matplotlib Version
3.5.2
Matplotlib Backend
TkAgg
Python version
3.8.2
Jupyter version
No response
Installation
pip
Unfortunately I think this is the expected behavior (I was actually a bit surprised it worked with None
s at all). It works because in the time handling code paths we delegate most of the work to numpy and it will cast None
to NaT
:
In [7]: np.asarray(x, dtype=np.datetime64)
Out[7]:
array(['2022-08-07', '2022-08-08', '2022-08-09', '2022-08-10',
'2022-08-11', 'NaT', '2022-08-13', '2022-08-14',
'2022-08-15', '2022-08-16'], dtype='datetime64[D]')
In [8]: x
Out[8]:
[datetime.date(2022, 8, 7),
datetime.date(2022, 8, 8),
datetime.date(2022, 8, 9),
datetime.date(2022, 8, 10),
datetime.date(2022, 8, 11),
None,
datetime.date(2022, 8, 13),
datetime.date(2022, 8, 14),
datetime.date(2022, 8, 15),
datetime.date(2022, 8, 16)]
The reason this fails with None
as the first element is that we do not correctly identify the units (and hence which handlers to use). For the automatic unit detection (we handle datetime through the unit system because time is a number that has time-like units attached to it) we look at the first element (because we assume that the input lists are homogeneous).
We could add a special case to that dispatch to keep walking down the sequence if a None
is found. On one hand that would fix this issue and smooth out the rough edge identified here. On the other hand it makes that code even more complex.
I could also see a case for exploding a lot earlier if there is a None
in the data, but that is just doubling down on a failure mode that depends on the location in the input of a bad value which is less than ideal.
In the short term, the user-side fix is to use np.array(..., dytpe=np.datetime64)
before you pass the values to Matplotlib which ensure that the first element is always a type that will go down the correct unit dispatch.
diff --git i/lib/matplotlib/cbook/__init__.py w/lib/matplotlib/cbook/__init__.py
index b9d5f5eb26..9cbf273259 100644
--- i/lib/matplotlib/cbook/__init__.py
+++ w/lib/matplotlib/cbook/__init__.py
@@ -1691,19 +1691,12 @@ def safe_first_element(obj):
This is an type-independent way of obtaining the first element, supporting
both index access and the iterator protocol.
"""
- if isinstance(obj, collections.abc.Iterator):
- # needed to accept `array.flat` as input.
- # np.flatiter reports as an instance of collections.Iterator
- # but can still be indexed via [].
- # This has the side effect of re-setting the iterator, but
- # that is acceptable.
- try:
- return obj[0]
- except TypeError:
- pass
- raise RuntimeError("matplotlib does not support generators "
- "as input")
- return next(iter(obj))
+ if isinstance(obj, np.flatiter):
+ return obj[0]
+ elif isinstance(obj, collections.abc.Iterator):
+ raise RuntimeError("matplotlib does not support generators as input")
+ else:
+ return next(val for val in obj if val is not None)
def sanitize_sequence(data):
seems reasonable enough to me? (except that the function becomes a bit misnamed, but such is life). This assumes that the goal of the first cutout is indeed just to support flatiter (which seems to be the intent of #6712).
None working elsewhere in the list seems to me an implementation accident rather than something we purposely designed. I'm not aware of us using None to mean no data elsewhere in the library (but I could be wrong) so I am not sure we should special case it here.
I think this is an "in for a penny, in for a pound" situation. We are very permissive in terms of what we take as input and currently do take lists of datetimes with None
anywhere but the first element. To be consistent we either need to lock down what we take as input more (which is not really feasible) or take this extra complexity.
I'm leaning towards "take the complexity" because that is our job.
I'm labeling this as a good first issue because we have a plausbile looking patch and there is no (explicit) new public API.
Steps:
- convert the patch in https://github.com/matplotlib/matplotlib/issues/23580#issuecomment-1207477074 into a commit
- add a test (the OP's example is enough)
Hi there, Can I work on this?
@MikiPWata Yes, just be aware we do not consider an issue "claimed" until there is a PR and that there is not complete consensus among the devs on this yet.
I checked that None is also ignored in bare lists of floats, so I guess this is OK. We probably should document this somehow though...
@jklymak my PR is pretty much good to go(passed all checks), so would love to have a review. Thank you in advance!
well, we finally had two people take up a "good first issue" at about the same time.
Is there anything I can/should do to fix the test failures from CI?
Finally passed all checks! I was wondering if I should keep using the original methods(safe_first_element()) or the _safe_non_none() for all tests that uses safe_first_element.
I am pretty sure that I should be using the new method for the test made for this issue.
Closed by #23587.