woodwork issues

Results 131 woodwork issues

Change pivot point of 2-digit year in datetime inference

Use a different pivot point for 2-digit years than the default that pandas datetime parsing uses. The pivot is set to 2030, but it can easily be changed.

bchen1116
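To make the pivot idea concrete, here is a minimal sketch using only the Python standard library. It is not the code from the pull request: the function name `parse_with_pivot`, its parameters, and the choice to treat the pivot year itself as belonging to the current century are all assumptions made for illustration. It relies on the documented `strptime` behavior that `%y` maps 69-99 to 1969-1999 and 0-68 to 2000-2068.

```python
from datetime import datetime


def parse_with_pivot(text, fmt="%m/%d/%y", pivot_year=2030):
    """Parse a date with a 2-digit year, then re-pivot the century.

    Hypothetical helper: years that the default parsing places after
    `pivot_year` are moved back by 100 years.
    """
    parsed = datetime.strptime(text, fmt)
    if parsed.year > pivot_year:
        # Assumption: anything later than the pivot belongs to the previous century.
        parsed = parsed.replace(year=parsed.year - 100)
    return parsed


print(parse_with_pivot("01/01/25").year)  # 2025 (not after the pivot, unchanged)
print(parse_with_pivot("01/01/45").year)  # 1945 (default parsing gives 2045)
```

Passing a different `pivot_year` changes the cutoff without touching the parsing step, which is the kind of adjustability the summary describes.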

Fixed treatment of Boolean coercion for boolean column with nulls

Fixes #1486

ParthivNaresh

Allow selective transformation

`ignore_columns` has been added as an argument when initializing a dataframe with woodwork. Any column name(s) that are passed in to `df.ww.init(ignore_columns=[col_1, col_2, etc])` will be ignored during initialization, as...

ParthivNaresh

Add a logical type Currency

Currently woodwork will detect data of the form ``` $1.234 $5.678 ... ``` as "Natural Language". It would be helpful if we created a currency type so that this sort...

dsherry

needs design

new feature

evalml

spike
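As a rough illustration of what detection for the proposed Currency type might look like, the sketch below checks whether most values in a column match a dollar-amount pattern. This is an assumption-laden example, not Woodwork's inference API: `CURRENCY_PATTERN`, `looks_like_currency`, and the 0.95 threshold are hypothetical names and values, and only the `$` symbol from the issue's example is handled.

```python
import re

# Assumption: a currency value is "$" followed by digits, optional thousands
# separators, and an optional decimal part. Other symbols would need extending.
CURRENCY_PATTERN = re.compile(r"^\s*\$\s*\d[\d,]*(\.\d+)?\s*$")


def looks_like_currency(values, threshold=0.95):
    """Return True if at least `threshold` of the non-null values look like currency."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return False
    matches = sum(
        1 for v in non_null if isinstance(v, str) and CURRENCY_PATTERN.match(v)
    )
    return matches / len(non_null) >= threshold


print(looks_like_currency(["$1.234", "$5.678"]))       # True
print(looks_like_currency(["the quick fox", "$5.00"]))  # False
```

A threshold below 1.0 tolerates a few malformed entries; whether that is desirable for a real logical type is a design question the issue leaves open.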

ColumnSchema utils `is_numeric` and `is_categorical` only work if logical type is set

Currently the ColumnSchema utils `is_numeric` and `is_categorical` only look at the logical type when determining if a column schema is numeric or categorical in nature. In Featuretools, we often set...

tamargrey

bug

Update arrow serialization to use a single file for data and typing info in Woodwork

- As a user, I wish Woodwork would store all data and typing information in a single arrow file when serializing, rather than writing a data file and a...

thehomebrewnerd

new feature

Woodwork should not strip timezone information from datetime values upon initialization.

Currently if users have a datetime column in which the input values are timezone aware, the timezone information is removed from the datetime objects after Woodwork initialization. This information should...

thehomebrewnerd
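The following standard-library sketch illustrates why losing timezone information matters. It does not reproduce Woodwork's initialization code; `strip_timezone` is a hypothetical helper, and dropping the offset with `replace(tzinfo=None)` is an assumption about how the stripping might happen.

```python
from datetime import datetime, timedelta, timezone

# Two timezone-aware values with the same wall-clock time but different offsets.
utc_minus_5 = timezone(timedelta(hours=-5))
noon_utc = datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc)
noon_minus_5 = datetime(2022, 1, 1, 12, 0, tzinfo=utc_minus_5)


def strip_timezone(value):
    """Hypothetical stand-in for the stripping step: drop the offset, keep the wall clock."""
    return value.replace(tzinfo=None)


print(noon_utc == noon_minus_5)                                  # False
print(noon_minus_5 - noon_utc)                                   # 5:00:00
print(strip_timezone(noon_utc) == strip_timezone(noon_minus_5))  # True
```

Once the offsets are gone, two distinct instants become indistinguishable, which is the information loss the issue asks to avoid.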

Stop converting ['1', '2', '3'] to string dtype before initializing as a Double column

A pandas bug was stopping us from initializing Woodwork on a column of numeric strings, so we had to convert to the `string` dtype first. https://github.com/alteryx/woodwork/pull/755/files#diff-75cb54847db4ed09b32148b93f1f543df16a421473154cb17085ff4932277ba8L402 This should...

tamargrey

Manipulating pandas dataframe doubles length of pandas column in Woodwork 0.16.0 (with string[arrow] dtype)

In the latest Woodwork release, manipulating the top-level pandas dataframe after initializing Woodwork duplicates the column, so the column doubles in length. See [here](https://github.com/alteryx/evalml/runs/6124456856) and specifically `test_simple_imputer_ignores_natural_language` to see...

jeremyliweishih

bug

Use string[arrow] dtype for all Logical Types that use string dtype

- pandas 1.3 has a new string[arrow] dtype that saves on memory and improves speed
- https://pythonspeed.com/articles/pandas-string-dtype-memory/
- We should use it, and verify that all subsequent calls on this...

gsheni
