
VOTE: Core Team Vote on Keeping/Removing pyarrow warning

Open Dr-Irv opened this issue 1 year ago • 19 comments

@pandas-dev/pandas-core

At the development meeting on February 14, we agreed to take a vote on whether to remove, in version 2.2.1, the DeprecationWarning about pyarrow becoming a required dependency. We also agreed to defer the decision about whether pyarrow will still be required in version 3.0.

Core team should vote below on one of these 3 options:

  • OPTION 1: Keep the DeprecationWarning in Version 2.2.1
  • OPTION 2: Remove the DeprecationWarning in Version 2.2.1
  • OPTION 3: Indifferent (equivalent to a +0 on up/down vote issues)

Voting will close at Noon Eastern Time on February 20, 2024. In the comments, choose OPTION 1 or OPTION 2 or OPTION 3. The decision will be based on which option receives the most votes. If OPTION 3 receives the most votes, then either OPTION 1 or OPTION 2 will be chosen based on which has the most votes. If both of those receive the same number of votes, I don't know what we will do!
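As a rough illustration of the counting rule described above, here is a minimal sketch. The function name `decide` and the vote counts in the usage lines are hypothetical and are not part of the issue. Because OPTION 3 can never be the outcome, the rule reduces to comparing OPTION 1 against OPTION 2, with a tie left undecided.

```python
from typing import Optional


def decide(option1: int, option2: int, option3: int) -> Optional[int]:
    """Return 1 or 2 for the winning option, or None if they tie.

    Hypothetical helper: `option3` is accepted only to mirror the ballot.
    Indifferent votes never decide the outcome, even if they are the
    largest group, so they are not used in the comparison.
    """
    if option1 > option2:
        return 1
    if option2 > option1:
        return 2
    # The issue text leaves a tie between OPTION 1 and OPTION 2 undefined.
    return None


# Made-up counts, for illustration only.
print(decide(5, 4, 7))  # OPTION 3 is largest, OPTION 1 still wins
print(decide(4, 4, 2))  # tie: no rule was defined for this case
```

Treating OPTION 3 as an abstention keeps the rule to a single comparison. The tie case returns `None` rather than guessing a tie-breaker, since none was specified.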

For reference: Current warning that users see when importing pandas in version 2.2.0:

Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466

GitHub issue with feedback: https://github.com/pandas-dev/pandas/issues/54466

GitHub issue with discussion about not requiring pyarrow: https://github.com/pandas-dev/pandas/issues/57073

I'll list the reasons for keeping/removing the warning here, based on my recall of the discussion. Others can feel free to add additional reasons in the comments, or correct my memory.

Reasons for keeping the warning:

  • pandas 2.2.0 has only been out for 1 month, so we may obtain more feedback, as many people may have not upgraded yet
  • There may be additional reasons for not requiring pyarrow that we have not considered
  • If we remove the warning, then users might infer that we have decided to not require pyarrow in version 3.0

Reasons for removing the warning:

  • Too many people who are not affected by requiring pyarrow are confused by the warning
  • We have enough feedback already to make a decision
  • It's too noisy for a variety of use cases

Dr-Irv avatar Feb 14 '24 19:02 Dr-Irv

Option 2

phofl avatar Feb 14 '24 19:02 phofl

2, I think enough feedback has been collected

MarcoGorelli avatar Feb 14 '24 19:02 MarcoGorelli

Option 1

WillAyd avatar Feb 14 '24 19:02 WillAyd

Option 1

Dr-Irv avatar Feb 14 '24 19:02 Dr-Irv

Option 2

(Side Note: I'm not sure the size and capabilities of pyarrow-core and whether Option 1 with an updated message about the dependency would change the feedback received)

simonjayhawkins avatar Feb 14 '24 19:02 simonjayhawkins

Option 1, keep the warning

datapythonista avatar Feb 14 '24 19:02 datapythonista

Option 2

mroeschke avatar Feb 14 '24 19:02 mroeschke

Option 2

jorisvandenbossche avatar Feb 14 '24 19:02 jorisvandenbossche

Option 1, retain the warning.

bashtage avatar Feb 14 '24 20:02 bashtage

Option 2

gfyoung avatar Feb 14 '24 21:02 gfyoung

Option 3

(it's an annoying warning but also a major change~~and DeprecationWarning are not printed by default~~)

twoertwein avatar Feb 15 '24 01:02 twoertwein

Option 1

people complain about everything - it's a good warning and useful

jreback avatar Feb 15 '24 03:02 jreback

Option 1. It's annoying, and I think it should be removed for the final 2.2.x release, but for now it's only been out for 1 month, so keep it.

attack68 avatar Feb 15 '24 16:02 attack68

~Option 2. Lots of users are affected by the warning, whether or not they rely on pandas directly. I don't think it's the end user's responsibility to suppress these warnings.~

Edit (@phofl): This is a core team vote, so please refrain from commenting here

ziyixi avatar Feb 15 '24 22:02 ziyixi

Option 1

fangchenli avatar Feb 15 '24 22:02 fangchenli

Option 1

alimcmaster1 avatar Feb 17 '24 00:02 alimcmaster1

Option 2

rhshadrach avatar Feb 17 '24 12:02 rhshadrach

@twoertwein

> (it's an annoying warning but also a major change and DeprecationWarning are not printed by default)

Not trying to sway anyone's vote, but I do think it's important to know that this is not always true. It will print if you run python foo.py and foo.py imports pandas directly prior to importing any other module that imports pandas. Likewise, it will print when you import pandas directly in a jupyter notebook - again only if prior to importing any other module that imports pandas.

If you first import another module that imports pandas, you will not see the warning by default.
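The behaviour described above follows from Python's default warning filters, which show a DeprecationWarning only when it is attributed to `__main__`. Below is a minimal sketch that imitates those filters without importing pandas. The helper name `is_shown`, the placeholder message, and the module name `some_library` are assumptions made for illustration.

```python
import warnings


def is_shown(module_name: str) -> bool:
    """Report whether a DeprecationWarning attributed to `module_name`
    gets through filters that imitate CPython's defaults."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()
        # filterwarnings() inserts at the front of the filter list, so the
        # broad "ignore" rule is added first and the "__main__" rule second.
        warnings.filterwarnings("ignore", category=DeprecationWarning)
        warnings.filterwarnings(
            "default", category=DeprecationWarning, module="__main__"
        )
        # warn_explicit() lets us choose the module the warning is
        # attributed to, standing in for "whoever imported the library".
        warnings.warn_explicit(
            "placeholder deprecation message",
            DeprecationWarning,
            filename="<demo>",
            lineno=1,
            module=module_name,
            registry={},
        )
        return len(caught) == 1


print(is_shown("__main__"))      # attributed to the script or notebook itself
print(is_shown("some_library"))  # attributed to an intermediate package
```

This only models the filter lookup. Which module a real warning is attributed to depends on the `stacklevel` used where the warning is raised, which this sketch does not inspect.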

rhshadrach avatar Feb 17 '24 12:02 rhshadrach

You are right! (For some reason, I get a different DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated on python 3.12 only when changing the warning level).

This is probably not the place for a new option, but changing it to an ImportWarning (which is not printed by default) might be a nice middle ground.

I changed my vote to "Option 3" (indifferent). Option 2 doesn't make sense to me as a large percentage of "users" (downloads) is not yet using 2.2: https://www.pepy.tech/projects/pandas

twoertwein avatar Feb 17 '24 15:02 twoertwein

Option 2

jbrockmendel avatar Feb 19 '24 18:02 jbrockmendel

Option 2

conditional on backing out making pyarrow required as per PDEP 10.

lithomas1 avatar Feb 20 '24 14:02 lithomas1

> Option 2
>
> conditional on backing out making pyarrow required as per PDEP 10.

@lithomas1 are you saying that you only support option 2 if pyarrow is no longer required? But given that we have not yet decided whether to reverse PDEP-10, does that change your vote?

Dr-Irv avatar Feb 20 '24 17:02 Dr-Irv

The final tally is

  • Option 1: 8
  • Option 2: 9
  • Option 3: 1

Since I did say this would be a majority vote, we should remove the warning for 2.2.1.

Having said that, @jorisvandenbossche and I have discussed that we really don't have a process for revoking parts of a PDEP. In other words, PDEP-10 says a warning would be issued from 2.2 onwards. By removing the warning, we are changing the outcome of the PDEP via an ad-hoc voting process created to resolve this particular issue. So I'm not entirely comfortable with making this decision based on a difference of 1 vote. I'm not sure how others feel about the procedural aspect of this decision, where a simple majority determines the revocation of part of a PDEP.

Dr-Irv avatar Feb 20 '24 21:02 Dr-Irv

I'll make the PR to remove the warning. I didn't have time yesterday, so will put off the release until Friday.

lithomas1 avatar Feb 21 '24 15:02 lithomas1

This'll also give us some time to think through this decision some more, in case people are getting worried about the simple majority thing.

Can someone else update the PDEP?

lithomas1 avatar Feb 21 '24 15:02 lithomas1

Closing since this already happened

mroeschke avatar May 31 '24 22:05 mroeschke