DataFrame.update silently does nothing when indices are of differing type

Open birdcolour opened this issue 7 years ago • 5 comments

Code to reproduce

import numpy as np
import pandas as pd

df_int = pd.DataFrame(
    {'col': ['foo', 'bar', np.nan]},
    index=[1,2,3]
)
df_obj = pd.DataFrame(
    {'col': [np.nan, np.nan, 'baz']},
    index=['1', '2', '3']
)

print(df_int)
print(df_obj)

# >>>
#    col
# 1  foo
# 2  bar
# 3  NaN
#    col
# 1  NaN
# 2  NaN
# 3  baz

# Note that the indices print identically but hold different types (integers vs. strings)

df_int.update(df_obj)
print(df_int)

# Intended output
# >>>
#    col
# 1  foo
# 2  bar
# 3  baz

# Actual output
# >>>
#    col
# 1  foo
# 2  bar
# 3  NaN

Problem description

Since update matches rows by comparing index values, two dataframes whose indices have different dtypes may produce no matches at all, even though the user expects them to match, and nothing tells the user that this has happened. This is particularly surprising when the indices print identically, as highlighted above. A warning should be raised that either:

  • tells the user that the indices are not the same type, which may produce some unintended results.
  • states that a type comparison is taking place that will never produce any matches.
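
Until such a warning exists, one possible workaround is to cast the string labels explicitly and check for shared labels before calling update. This is only a sketch: it assumes every string label parses as an integer, and `update_with_int_index` is a hypothetical helper name, not a pandas API.

```python
import numpy as np
import pandas as pd


def update_with_int_index(target, other):
    """Update `target` in place from `other` after casting `other`'s labels to int.

    Hypothetical helper; assumes every label in `other.index` parses as an integer.
    """
    other_cast = other.copy()
    other_cast.index = other_cast.index.map(int)
    # Refuse to proceed if the two frames still share no labels.
    if target.index.intersection(other_cast.index).empty:
        raise ValueError("no index labels in common after casting")
    target.update(other_cast)


df_int = pd.DataFrame({"col": ["foo", "bar", np.nan]}, index=[1, 2, 3])
df_obj = pd.DataFrame({"col": [np.nan, np.nan, "baz"]}, index=["1", "2", "3"])

# The integer 1 and the string '1' are different labels, so nothing overlaps.
print(df_int.index.intersection(df_obj.index).empty)

update_with_int_index(df_int, df_obj)
print(df_int)
```

Casting on a copy leaves df_obj itself untouched, so the caller decides whether the string labels were a mistake or intentional.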

Feb 26 '18 13:02 birdcolour

Related to #4094

Feb 26 '18 13:02 birdcolour

"states that a type comparison is taking place that will never produce any matches."

This is hard to do in general. In your case, a regular Index can contain objects with any type, including the same type as the Int64Index.
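
To illustrate, here is a small sketch (the index values are made up for this example) in which the dtypes differ but two of the labels still match, so a dtype check alone cannot prove that no matches will occur:

```python
import pandas as pd

int_index = pd.Index([1, 2, 3])
# An object-dtype Index can mix integers and strings.
mixed_index = pd.Index([1, "2", 3], dtype=object)

# The two indices do not share a dtype.
print(int_index.dtype, mixed_index.dtype)

# Element-wise membership: the labels 1 and 3 match,
# while the string "2" does not match the integer 2.
matches = int_index.isin(mixed_index)
print(matches.tolist())
```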

Feb 26 '18 14:02 TomAugspurger

@TomAugspurger: That being said, we shouldn't be aligning on indices that are clearly not equal (i.e. string and numeric), so this is still a bug IMO.

Mar 02 '18 20:03 gfyoung

On main, the df_int.update(df_obj) call now correctly raises "ValueError: Update not allowed when the index on other has no intersection with this dataframe". Could use a test (first check to see if one already exists).
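
A sketch of how the two behaviours can be told apart from user code, assuming the exception type quoted above. `try_update` is a hypothetical helper, and on releases that predate the change the call simply returns without raising:

```python
import numpy as np
import pandas as pd


def try_update(target, other):
    """Call target.update(other); return the error message if pandas refuses."""
    try:
        target.update(other)
    except ValueError as err:
        return str(err)
    return None


df_int = pd.DataFrame({"col": ["foo", "bar", np.nan]}, index=[1, 2, 3])
df_obj = pd.DataFrame({"col": [np.nan, np.nan, "baz"]}, index=["1", "2", "3"])

message = try_update(df_int, df_obj)
# Either way, the mismatched labels mean df_int is left unchanged.
print(message)
print(df_int)
```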

Oct 31 '25 01:10 jbrockmendel

Looks like a test already exists: https://github.com/pandas-dev/pandas/blob/25934d6755dda44f192e4893106316bc4dce5382/pandas/tests/frame/methods/test_update.py#L218

Dec 08 '25 13:12 mebaier