pandas BUG: In `main`, using `resample().interpolate(inplace=True)` raises an exception

Pandas version checks

[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
from pandas import date_range, DataFrame
DR = date_range("1999-1-1", periods=365, freq="D")
DF_ = DataFrame(np.random.standard_normal((365, 1)), index=DR)
S = DF_.iloc[:, 0]
DF = DataFrame({"col1": S, "col2": S})

DF.resample("ME").interpolate(inplace=True)

Issue Description

Code raises an exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda3\envs\pandasstubs\lib\site-packages\pandas\core\resample.py", line 958, in interpolate
    result_interpolated = result_interpolated.loc[final_index]
AttributeError: 'NoneType' object has no attribute 'loc'

Expected Behavior

No exception. Works fine in pandas 2.2.2

Note: Seems to be introduced by https://github.com/pandas-dev/pandas/pull/56515 by @cbpygit , review by @MarcoGorelli and @jbrockmendel

Installed Versions

INSTALLED VERSIONS

commit : 34177d6b20896bf101590306e3f0e1e0a225c5fb python : 3.9.16.final.0 python-bits : 64 OS : Windows OS-release : 10 Version : 10.0.19045 machine : AMD64 processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel byteorder : little LC_ALL : None LANG : None LOCALE : English_United States.1252

pandas : 3.0.0.dev0+943.g34177d6b20 numpy : 1.26.4 pytz : 2024.1 dateutil : 2.9.0.post0 setuptools : 68.2.2 pip : 22.3.1 Cython : None pytest : 8.2.0 hypothesis : None sphinx : None blosc : None feather : None xlsxwriter : 3.2.0 lxml.etree : 5.2.1 html5lib : 1.1 pymysql : None psycopg2 : None jinja2 : 3.1.4 IPython : None pandas_datareader : None adbc-driver-postgresql: None adbc-driver-sqlite : None bs4 : 4.12.3 bottleneck : None fastparquet : None fsspec : None gcsfs : None matplotlib : 3.8.4 numba : None numexpr : 2.10.0 odfpy : None openpyxl : 3.1.2 pyarrow : 16.0.0 pyreadstat : 1.2.7 python-calamine : None pyxlsb : 1.0.10 s3fs : None scipy : 1.13.0 sqlalchemy : 2.0.30 tables : 3.9.2 tabulate : 0.9.0 xarray : 2024.3.0 xlrd : 2.0.1 zstandard : 0.22.0 tzdata : 2024.1 qtpy : None pyqt5 : None

May 12 '24 19:05 Dr-Irv

@Dr-Irv Thank you for reporting this problem! I am able to reproduce this. The problem is purely related to the inplace=True option. The interpolation result here becomes None and internally the _update_inplace method call here does not yet possess the final result. There is still the step of index correction missing, which is only applied afterwards here.

The problem is that the context in pandas.core.generic.NDFrame.interpolate does not know about the correct final index. So far, I am unable to figure out how I can update the state of the parent data frame that led to the pandas.core.resample.Resampler.interpolate call. I tried to manipulate self._selected_obj but it does not propagate. If somebody can point me at the object that needs to be updated I can easily fix this.

May 13 '24 10:05 cbpygit

I could use some input here @MarcoGorelli. As we are introducing a breaking change anyway, we could think about removing the inplace option. I think it is counter-intuitive that the second method in a chain of methods modifies the original data frame "in place". Just think about the completely valid option to split the calls like this:

df = ...
resampler = df.resample("ME")   # creates a resampler instance, does not modify df
resampler.interpolate(inplace=True)  # modifies df, returns None

May 22 '24 07:05 cbpygit

we could think about removing the inplace option.

yup - this looks like one of the places where inplace was a lie anyway? Agree, "in for a penny, in for a pound" - if we gonna break, let's break. It's for the better anyway

May 22 '24 08:05 MarcoGorelli

@MarcoGorelli 🙌 I'll take care of it and ping you in the PR.

May 22 '24 15:05 cbpygit

@MarcoGorelli I looked into this but it is extremely convoluted 😢 The inplace option is passed on in multiple steps, partially to overloaded methods or abstract methods with many implementations. There are dozens of tests that use inplace=True or even test this in particular.

Before I spend a lot of time on this, could you indicate a safe way to approach it? Shall I really drop the inplace argument in all those methods, delete all the tests against the feature and switch to not-inplace in the tests that use it?

May 27 '24 08:05 cbpygit

yeah i'd be ok with that

May 27 '24 08:05 MarcoGorelli

as in, remove it from interpolate. if any intermediate function (which is also used in other places) requires inplace, just pass inplace=False

May 27 '24 08:05 MarcoGorelli

@cbpygit is this a good place to ask about dtype downcasting deprecation FutureWarning? i see you are responsible for fixing .resample()...

.resample().interpolate() now produces FutureWarning: DataFrame.interpolate with object dtype is deprecated and will raise in a future version. Call obj.infer_objects(copy=False) before interpolating instead. but .resample() doesn't have infer_objects()

there is also an interesting question outlined in here that is think is related: https://github.com/pandas-dev/pandas/issues/53631#issuecomment-1853635791

should we continue in this issue?

Sep 12 '24 10:09 spawn-guy