ianvs
ianvs copied to clipboard
Deprecate outdated interfaces of pandas used in rank.py
What would you like to be added/modified:
Some outdated interfaces of pandas
are used in in rank.py
.
We can deprecate them by:
- Implement
__get_all()
using a more elegant interface. - using
pd.concat()
to merge old DataFrame and new DataFrame.
The modified method should remain compatible with the version of pandas corresponding to Python 3.6.
Why is this needed:
Ianvs was originally designed for pandas==1.1.5
, but the latest version is now pandas==2.2.2
.Due to a major version update, some interfaces of pandas have been deprecated in the new version.
Continuing to use these old interfaces will encounter errors on Python>=3.8
.
-
pd.np
has been deprecated inpandas>=2.0.0
: https://github.com/kubeedge/ianvs/blob/f2352ce018f04f398b1be0f37d0fa3cd11476626/core/storymanager/rank/rank.py#L208
AttributeError: module 'pandas' has no attribute 'np'
-
append
has been deprecated inpandas>=2.0.0
: https://github.com/kubeedge/ianvs/blob/f2352ce018f04f398b1be0f37d0fa3cd11476626/core/storymanager/rank/rank.py#L171
AttributeError: 'DataFrame' object has no attribute 'append'
- initializing
all_df
withnp.NAN
will causestr
data missing: https://github.com/kubeedge/ianvs/blob/f2352ce018f04f398b1be0f37d0fa3cd11476626/core/storymanager/rank/rank.py#L145
+------+-----------+-----+-----------+----------+-----------+----------+---------------------+----------------------+-------------------+------------------------+---------------------+------+-----+
| rank | algorithm | acc | edge-rate | paradigm | basemodel | apimodel | hard_example_mining | basemodel-model_name | basemodel-backend | basemodel-quantization | apimodel-model_name | time | url |
+------+-----------+-----+-----------+----------+-----------+----------+---------------------+----------------------+-------------------+------------------------+---------------------+------+-----+
| 1 | | 0.6 | 0.6 | | | | | | | | | | |
+------+-----------+-----+-----------+----------+-----------+----------+---------------------+----------------------+-------------------+------------------------+---------------------+------+-----+
- Setting value by
df.[row_index][column]
will causeSettingWithCopyWarning
. This interface will be deprecated inpandas>=3.0
in the future. https://github.com/kubeedge/ianvs/blob/f2352ce018f04f398b1be0f37d0fa3cd11476626/core/storymanager/rank/rank.py#L148
Line 151, 154, 158, 165, 167 have the same issue, too.
./ianvs/core/storymanager/rank/rank.py:148: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:
df["col"][row_indexer] = value
Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
all_df.loc[i][0] = algorithm.name
-
Assigning values one by one reduces code readability and can be simplified. https://github.com/kubeedge/ianvs/blob/f2352ce018f04f398b1be0f37d0fa3cd11476626/core/storymanager/rank/rank.py#L145-L167
-
Using whitspace as seperator in a CSV(Comma-Separated Values) file is weird. https://github.com/kubeedge/ianvs/blob/f2352ce018f04f398b1be0f37d0fa3cd11476626/core/storymanager/rank/rank.py#L179