modin
FIX-#4660: Fix `fillna` when Modin series object is an argument
Signed-off-by: Myachev [email protected]
What do these changes do?
- [x] commit message follows format outlined here
- [x] passes `flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py`
- [x] passes `black --check modin/ asv_bench/benchmarks scripts/doc_checker.py`
- [x] signed commit with `git commit -s`
- [x] Resolves #4660
- [x] tests added and passing
- [x] module layout described at `docs/development/architecture.rst` is up-to-date
- [x] added (Issue Number: PR title (PR Number)) and github username to release notes for next major release
Codecov Report
Merging #4674 (6ea4940) into master (9b33451) will increase coverage by 3.09%. The diff coverage is 100.00%.
@@ Coverage Diff @@
## master #4674 +/- ##
==========================================
+ Coverage 86.57% 89.67% +3.09%
==========================================
Files 230 231 +1
Lines 18467 18849 +382
==========================================
+ Hits 15988 16903 +915
+ Misses 2479 1946 -533
| Impacted Files | Coverage Δ | |
|---|---|---|
| modin/core/dataframe/pandas/dataframe/dataframe.py | 95.20% <100.00%> (+0.01%) | :arrow_up: |
| ...odin/core/storage_formats/pandas/query_compiler.py | 96.09% <100.00%> (+<0.01%) | :arrow_up: |
| ...s/pandas_on_dask/partitioning/virtual_partition.py | 85.98% <0.00%> (-8.90%) | :arrow_down: |
| ...lementations/pandas_on_dask/dataframe/dataframe.py | 95.83% <0.00%> (-4.17%) | :arrow_down: |
| modin/pandas/utils.py | 92.40% <0.00%> (-1.94%) | :arrow_down: |
| modin/core/execution/ray/common/utils.py | 95.23% <0.00%> (-1.64%) | :arrow_down: |
| ...ns/pandas_on_ray/partitioning/partition_manager.py | 82.19% <0.00%> (-1.30%) | :arrow_down: |
| ...s/pandas_on_dask/partitioning/partition_manager.py | 100.00% <0.00%> (ø) | |
| modin/experimental/batch/test/test_pipeline.py | 100.00% <0.00%> (ø) | |
| ... and 23 more | | |
I now wonder what the reason is for not reindexing in the case of `df1["c"].fillna(df2)`. What if the partition boundaries of `df1` and `df2` do not match? And if `df1.index` does not match `df2.index`, how would `.fillna()` work in that case?
The case `df1["c"].fillna(df2)` is not possible, because the `value` argument of `Series.fillna` must be a scalar, dict or `Series`, not a `DataFrame`:
import pandas as pd
# import modin.pandas as pd
df = pd.DataFrame({'a': ['a'], 'b': ['b'],}, index=['row1'])
df['c'] = pd.NA
df2_0 = pd.DataFrame({"a": [0], "b": [5]}, index=['row1'])
df2_1 = pd.DataFrame({"c": ["c"]}, index=['row1'])
df2 = pd.concat([df2_0, df2_1], axis=1)
df = df["c"].fillna(df2)
print(df)
Traceback (most recent call last):
File "test_fillna.py", line 12, in <module>
df = df["c"].fillna(df2)
File "C:\Users\79049\.conda\envs\modin\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "C:\Users\79049\.conda\envs\modin\lib\site-packages\pandas\core\series.py", line 4908, in fillna
return super().fillna(
File "C:\Users\79049\.conda\envs\modin\lib\site-packages\pandas\core\generic.py", line 6461, in fillna
raise TypeError(
TypeError: "value" parameter must be a scalar, dict or Series, but you passed a "DataFrame"
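For comparison, here is a minimal sketch of the valid case, where `value` is a `Series` whose index does not match the caller's index. It assumes plain pandas rather than Modin, and the names `s`, `filler` and `filled` are purely illustrative. As far as I can tell, pandas aligns `value` to the caller's index by label before filling, so ordering does not matter and labels missing from `value` stay unfilled:

```python
import numpy as np
import pandas as pd

# Series with gaps at row1, row3 and row4 (illustrative data).
s = pd.Series(
    [np.nan, 2.0, np.nan, np.nan],
    index=["row1", "row2", "row3", "row4"],
)

# Filler Series: different order, one extra label ("other"),
# and no entry for row4.
filler = pd.Series([30.0, 10.0, 99.0], index=["row3", "row1", "other"])

# Values are matched by index label, not by position.
filled = s.fillna(filler)
print(filled)

# Passing a DataFrame as `value` to Series.fillna is rejected,
# as in the traceback above.
try:
    s.fillna(pd.DataFrame({"c": [1.0]}, index=["row1"]))
    raised = False
except TypeError:
    raised = True
print("TypeError raised for DataFrame value:", raised)
```

If Modin is meant to mirror this behavior, the `Series` passed as `value` would need the same label-based alignment regardless of how either object is partitioned.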