VOTE: Core Team Vote on Keeping/Removing pyarrow warning
@pandas-dev/pandas-core
At the development meeting on February 14, we agreed to take a vote on whether to remove the DeprecationWarning about pyarrow being required in version 2.2.1. We agreed that the decision about whether pyarrow will still be required with version 3.0 is delayed.
Core team should vote below on one of these 3 options:
OPTION 1: Keep the DeprecationWarning in Version 2.2.1
OPTION 2: Remove the DeprecationWarning in Version 2.2.1
OPTION 3: Indifferent (equivalent to a +0 on up/down vote issues)
Voting will close at Noon Eastern Time on February 20, 2024. In the comments, choose OPTION 1 or OPTION 2 or OPTION 3. The decision will be based on which option receives the most votes. If OPTION 3 receives the most votes, then either OPTION 1 or OPTION 2 will be chosen based on which has the most votes. If both of those receive the same number of votes, I don't know what we will do!
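The decision rule above can be sketched in a few lines. This is only an illustration: the function name `decide` and the sample ballots are hypothetical, not part of the actual vote. Because OPTION 3 defers to whichever of OPTION 1 or OPTION 2 has more votes, the rule as stated reduces to comparing those two counts, with a tie left undefined.

```python
from collections import Counter
from typing import Optional


def decide(votes: list[int]) -> Optional[int]:
    """Apply the stated rule to a list of ballots (each 1, 2, or 3).

    Returns 1 (keep the warning), 2 (remove it), or None when
    OPTION 1 and OPTION 2 tie, a case the rule leaves open.
    """
    counts = Counter(votes)
    keep, remove = counts[1], counts[2]
    if keep == remove:
        return None  # tie between 1 and 2: no outcome defined
    # OPTION 3 ballots never change which of 1 or 2 is ahead
    return 1 if keep > remove else 2


# Hypothetical ballots: OPTION 3 has the most votes, so 1 vs 2 decides
sample = [3, 3, 3, 1, 1, 2]
outcome = decide(sample)
```

With these made-up ballots, OPTION 3 leads but OPTION 1 beats OPTION 2, so the sketch selects OPTION 1.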
For reference: Current warning that users see when importing pandas in version 2.2.0:
Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
GitHub issue with feedback: https://github.com/pandas-dev/pandas/issues/54466
GitHub issue with discussion about not requiring pyarrow: https://github.com/pandas-dev/pandas/issues/57073
I'll list the reasons for keeping/removing the warning here, based on my recall of the discussion. Others can feel free to add additional reasons in the comments, or correct my memory.
Reasons for keeping the warning:
- pandas 2.2.0 has only been out for 1 month, so we may obtain more feedback, as many people may not have upgraded yet
- There may be additional reasons for not requiring pyarrow that we have not considered
- If we remove the warning, then users might infer that we have decided to not require pyarrow in version 3.0
Reasons for removing the warning:
- Too many people who are not affected by requiring pyarrow are confused by the warning
- We have enough feedback already to make a decision
- It's too noisy for a variety of use cases
Option 2
2, I think enough feedback has been collected
Option 1
Option 1
Option 2
(Side Note: I'm not sure the size and capabilities of pyarrow-core and whether Option 1 with an updated message about the dependency would change the feedback received)
Option 1, keep the warning
Option 2
Option 2
Option 1, retain the warning.
Option 2
Option 3
(it's an annoying warning but also a major change~~and DeprecationWarning are not printed by default~~)
Option 1
people complain about everything - it's a good warning and useful
Option 1. It's annoying and I think it should be removed for the final 2.2.x release, but for now it's only been out for 1 month, so keep it.
~Option 2. Lots of users are affected by the warning, whether or not they rely on pandas directly. I don't think it's the end user's responsibility to suppress these warnings.~
Edit (@phofl): This is a core team vote, so please refrain from commenting here
Option 1
Option 1
Option 2
@twoertwein
(it's an annoying warning but also a major change and DeprecationWarning are not printed by default)
Not trying to sway anyone's vote, but I do think it's important to know that this is not always true. It will print if you run python foo.py and foo.py imports pandas directly prior to importing any other module that imports pandas. Likewise, it will print when you import pandas directly in a jupyter notebook - again only if prior to importing any other module that imports pandas.
If you first import another module that imports pandas, you will not see the warning by default.
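The behaviour described here follows from Python's default warning filters, which show a DeprecationWarning only when it is attributed to `__main__`. A minimal sketch, assuming the filters of a normal (non-debug) interpreter; the helper name `is_shown`, the message text, and the module name `some_library` are illustrative, and this does not run pandas' own import code:

```python
import warnings


def is_shown(module_name: str) -> bool:
    """Check whether a DeprecationWarning attributed to `module_name`
    would get past (a recreation of) the default filters."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()
        # Recreate the two relevant default filters. filterwarnings()
        # inserts at the front, so the __main__ rule takes priority.
        warnings.filterwarnings("ignore", category=DeprecationWarning)
        warnings.filterwarnings(
            "default", category=DeprecationWarning, module="__main__"
        )
        warnings.warn_explicit(
            "illustrative deprecation notice",
            category=DeprecationWarning,
            filename="example.py",
            lineno=1,
            module=module_name,  # the module the warning is attributed to
            registry={},
        )
    return len(caught) > 0


shown_in_script = is_shown("__main__")       # direct import in foo.py
shown_via_library = is_shown("some_library")  # import triggered elsewhere
```

Under these assumptions the first call reports the warning as visible and the second reports it as suppressed, which matches the two cases above.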
You are right! (For some reason, I get a different DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated on python 3.12 only when changing the warning level).
This is probably not the place for a new option, but changing it to an ImportWarning (which is not printed by default) might be a nice middle ground.
I changed my vote to "Option 3" (indifferent). Option 2 doesn't make sense to me as a large percentage of "users" (downloads) is not yet using 2.2: https://www.pepy.tech/projects/pandas
Option 2
Option 2
conditional on backing out making pyarrow required as per PDEP 10.
Option 2
conditional on backing out making pyarrow required as per PDEP 10.
@lithomas1 are you saying that you only support option 2 if pyarrow is no longer required? But given that we have not yet decided whether to reverse PDEP-10, does that change your vote?
The final tally is
- Option 1: 8
- Option 2: 9
- Option 3: 1
Since I did say this would be a majority vote, we should remove the warning for 2.2.1.
Having said that, @jorisvandenbossche and I have discussed that we really don't have a process for revoking parts of a PDEP. In other words, PDEP-10 says a warning would be issued from 2.2 onwards. By removing the warning, we are changing the outcome of the PDEP via an ad-hoc voting process created to resolve this particular issue. So I'm not entirely comfortable with making this decision based on a difference of 1 vote. I'm not sure how others feel about the procedural aspect of this decision, where a simple majority determines the revocation of part of a PDEP.
I'll make the PR to remove the warning. I didn't have time yesterday, so will put off the release until Friday.
This'll also give us some time to think through this decision some more, in case people are getting worried about the simple majority thing.
Can someone else update the PDEP?
Closing since this already happened