Update code to remove pandas `FutureWarning` messages that are displayed for each row during conditional sampling

Open srinify opened this issue 1 year ago • 0 comments

Environment Details

SDV 1.12

Problem Description

In our current implementation, conditionally sampling thousands of rows generates thousands of pandas FutureWarning messages, one per row. This can sometimes crash Jupyter Notebook / Lab browser tabs (as I've experienced).

FutureWarning: The behavior of Series.replace (and DataFrame.replace) with CategoricalDtype is deprecated. 
In a future version, replace will only be used for cases that preserve the categories. 
To change the categories, use ser.cat.rename_categories instead.

result = result.replace(nan_name, np.nan)

Expected behavior

We should update the pandas methods we're using so that these warnings aren't displayed.
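For illustration only, here is a minimal sketch of one way the `result.replace(nan_name, np.nan)` call from the warning output could avoid the deprecated code path. It assumes `result` is a pandas Series and `nan_name` is the placeholder category that stands in for missing values; the helper name `replace_placeholder_with_nan` is hypothetical and this is not necessarily the fix the project will adopt. For categorical data it uses `Series.cat.remove_categories`, which sets values belonging to a removed category to NaN.

```python
import numpy as np
import pandas as pd


def replace_placeholder_with_nan(result: pd.Series, nan_name: str) -> pd.Series:
    """Hypothetical helper: turn the `nan_name` placeholder back into NaN."""
    if isinstance(result.dtype, pd.CategoricalDtype):
        # Dropping the placeholder category sets its values to NaN without
        # calling Series.replace on categorical data.
        if nan_name in result.cat.categories:
            return result.cat.remove_categories([nan_name])
        return result

    # Non-categorical data does not hit the deprecated code path.
    return result.replace(nan_name, np.nan)


# Assumed example data; '__nan__' is a made-up placeholder name.
series = pd.Series(['a', 'b', '__nan__', 'a'], dtype='category')
cleaned = replace_placeholder_with_nan(series, '__nan__')
print(cleaned.isna().tolist())
```

The categorical branch keeps the dtype categorical and only drops the placeholder category, so the remaining categories are left unchanged.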

Steps to Reproduce

import pandas as pd
from sdv.single_table import GaussianCopulaSynthesizer
from sdv.datasets.demo import download_demo

data, metadata = download_demo(
    modality='single_table',
    dataset_name='census_extended'
)

synthesizer = GaussianCopulaSynthesizer(metadata)
synthesizer.fit(data)
synthetic_data = synthesizer.sample_remaining_columns(data[['sex', 'income']].head(10))

Workaround

As an SDV user, you can suppress these warnings for now while we resolve the underlying issue:

# Run before importing pandas
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

import pandas as pd
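If silencing every FutureWarning in the session feels too broad, a narrower variant is possible. The sketch below filters only the categorical `replace` message quoted above, and only while the sampling call runs. The helper name `sample_quietly` and the constant `CATEGORICAL_REPLACE_WARNING` are hypothetical, not part of SDV; the approach relies on the standard library's `warnings.catch_warnings` and `warnings.filterwarnings`.

```python
import warnings

# Regex matched against the start of the warning message (hypothetical constant).
CATEGORICAL_REPLACE_WARNING = (
    r'The behavior of Series\.replace \(and DataFrame\.replace\) with CategoricalDtype'
)


def sample_quietly(sample_fn, *args, **kwargs):
    """Hypothetical helper: call `sample_fn` with only this one warning ignored."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            'ignore',
            message=CATEGORICAL_REPLACE_WARNING,
            category=FutureWarning,
        )
        return sample_fn(*args, **kwargs)
```

With the synthesizer from the reproduction steps, this could be called as `sample_quietly(synthesizer.sample_remaining_columns, data[['sex', 'income']].head(10))`. Other FutureWarning messages still surface, and the filter is restored when the call returns.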

srinify • May 07 '24 14:05