Force removal of Woodwork schema before allowing re-initialization of Woodwork

Open tamargrey opened this issue 3 years ago • 0 comments

  • As a user, I wish Woodwork would not automatically overwrite a DataFrame's typing information if I try reinitializing Woodwork on it.

Currently, calling df.ww.init on a DataFrame that already has Woodwork typing information will overwrite that typing information. To avoid that, users need to remember to do the following check:

if df.ww.schema is None:
    df.ww.init()

This means that it is easy to accidentally overwrite Woodwork typing info. In places where Woodwork is used extensively, this can become difficult to keep track of.
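For illustration, the check above could be wrapped in a small helper so that it only has to be written once. This is a rough sketch rather than anything Woodwork ships: init_if_unset is a hypothetical name, and the helper relies only on the df.ww.schema and df.ww.init calls already shown.

```python
def init_if_unset(df, **init_kwargs):
    """Initialize Woodwork on df only if it has no schema yet.

    Hypothetical helper, not part of Woodwork. Returns True when init ran
    and False when existing typing information was left untouched.
    """
    if df.ww.schema is None:
        # No typing info yet, so initializing cannot overwrite anything.
        df.ww.init(**init_kwargs)
        return True
    return False
```

Every call site still has to remember to use the helper instead of df.ww.init, so this only moves the problem rather than removing it.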

If Woodwork did not allow init to be called on an object that already has a Woodwork schema, this problem would be solved. Users who specifically do want to reinitialize Woodwork could choose to remove the Woodwork typing info prior to initialization:

if df.ww.schema is not None:
    del df.ww
df.ww.init()

If init is called without the schema having been removed first, we could raise a WoodworkAlreadyInitError or something of the sort.
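As a rough sketch of the proposed behavior, here is a toy stand-in for the accessor. Nothing here is Woodwork's actual implementation: ToyAccessor and its clear method are hypothetical (clear stands in for del df.ww), and WoodworkAlreadyInitError is just the placeholder name suggested above.

```python
class WoodworkAlreadyInitError(Exception):
    """Hypothetical error: init was called while a schema already exists."""


class ToyAccessor:
    """Toy stand-in for the ww accessor, used only to show the guard."""

    def __init__(self):
        self.schema = None

    def init(self, **schema_kwargs):
        # Proposed guard: refuse to silently overwrite existing typing info.
        if self.schema is not None:
            raise WoodworkAlreadyInitError(
                "Woodwork is already initialized; remove the schema "
                "before calling init again."
            )
        # The real accessor would infer and validate types here.
        self.schema = dict(schema_kwargs)

    def clear(self):
        # Explicit opt-in step before re-initialization.
        self.schema = None


ww = ToyAccessor()
ww.init(index="id")
try:
    ww.init(index="other_id")  # second init without removal is rejected
except WoodworkAlreadyInitError:
    ww.clear()                 # opt in to re-initialization
    ww.init(index="other_id")
```

With a guard like this, the accidental overwrite becomes a loud failure, and the deliberate re-init costs one extra line.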

This change would make Woodwork re-initialization an opt-in practice rather than one you need to opt out of. However, as this change impacts libraries that depend on Woodwork, like Featuretools and EvalML, we should confirm that this behavior wouldn't create undue complexity in those downstream libraries.

tamargrey · Nov 10 '21 22:11