
Woodwork is a Python library that provides robust methods for managing and communicating data typing information.

Results 131 woodwork issues

Choose a different pivot point for 2-digit years than the default that pandas datetime uses. 2030 was chosen as the pivot date, but it can easily be changed to another
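To illustrate what a pivot point means for 2-digit years, here is a minimal sketch in plain Python. It is not Woodwork's implementation: the function name `expand_two_digit_year` and the choice to treat the pivot year itself as inclusive are assumptions made for illustration.

```python
def expand_two_digit_year(yy: int, pivot: int = 2030) -> int:
    """Map a 2-digit year to the latest 4-digit year that is <= pivot.

    Hypothetical helper; the inclusive pivot is an assumption.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a 2-digit year in the range 0-99")
    # Start in the pivot's century, then step back 100 years if that overshoots.
    century = pivot - pivot % 100
    year = century + yy
    if year > pivot:
        year -= 100
    return year


if __name__ == "__main__":
    for yy in (0, 29, 30, 31, 99):
        print(f"{yy:02d} -> {expand_two_digit_year(yy)}")
```

Under these assumptions, a pivot of 2030 sends `31` to 1931 rather than 2031. Passing `pivot=2068` reproduces the POSIX `%y` convention that Python's `datetime.strptime` follows (69-99 map to 1969-1999, 00-68 to 2000-2068).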

`ignore_columns` has been added as an argument when initializing a dataframe with woodwork. Any column name(s) that are passed in to `df.ww.init(ignore_columns=[col_1, col_2, etc])` will be ignored during initialization, as...

Currently woodwork will detect data of the form `$1.234`, `$5.678`, ... as "Natural Language". It would be helpful if we created a currency type so that this sort...

needs design
new feature
evalml
spike
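One way such a currency check could work is a pattern match over the column's values. The sketch below uses only the standard library and is purely illustrative: the names `CURRENCY_RE` and `looks_like_currency`, the set of accepted symbols, and the 95% threshold are all assumptions, not part of Woodwork.

```python
import re

# Hypothetical pattern: optional sign, one currency symbol, an optional space,
# digits (plain or with comma thousands separators), and an optional decimal part.
CURRENCY_RE = re.compile(r"[-+]?[$€£¥]\s?(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?")


def looks_like_currency(values, threshold=0.95):
    """Return True if at least `threshold` of the non-null values match the pattern."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return False
    matches = sum(
        CURRENCY_RE.fullmatch(str(v).strip()) is not None for v in non_null
    )
    return matches / len(non_null) >= threshold


if __name__ == "__main__":
    print(looks_like_currency(["$1.234", "$5.678"]))
    print(looks_like_currency(["a short sentence", "$5.00"]))
```

A threshold below 1.0 tolerates a few malformed entries. A real inference function would also need to decide how to handle locale-specific separators, which this sketch ignores.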

Currently the ColumnSchema utils `is_numeric` and `is_categorical` only look at the logical type when determining if a column schema is numeric or categorical in nature. In Featuretools, we often set...

bug

- As a user, I wish Woodwork would store all data and typing information in a single arrow file when serializing, rather than writing a data file and a...

new feature

Currently if users have a datetime column in which the input values are timezone aware, the timezone information is removed from the datetime objects after Woodwork initialization. This information should...
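To show why dropping timezone information matters, here is a small standard-library sketch. It does not reproduce Woodwork's behavior; the fixed UTC-05:00 offset and the example timestamp are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-05:00 offset, used here only as an example timezone.
eastern = timezone(timedelta(hours=-5))
aware = datetime(2022, 1, 1, 9, 0, tzinfo=eastern)

# Dropping tzinfo keeps the wall-clock time but forgets the offset,
# so the value no longer identifies a single instant.
naive = aware.replace(tzinfo=None)

# The aware value can still be converted unambiguously to UTC.
as_utc = aware.astimezone(timezone.utc)

if __name__ == "__main__":
    print(aware.isoformat())
    print(naive.isoformat())
    print(as_utc.isoformat())
```

Once the offset is gone, two values recorded in different timezones can no longer be compared reliably, which is the kind of information loss the issue describes.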

A pandas bug was stopping us from being able to initialize Woodwork on a column of numeric strings, so we had to convert to the `string` dtype first. https://github.com/alteryx/woodwork/pull/755/files#diff-75cb54847db4ed09b32148b93f1f543df16a421473154cb17085ff4932277ba8L402 This should...

In the latest Woodwork release, manipulating the top-level pandas dataframe after initializing Woodwork duplicates the column, and the column doubles in length. See [here](https://github.com/alteryx/evalml/runs/6124456856) and specifically `test_simple_imputer_ignores_natural_language` to see...

bug

- pandas 1.3 has a new `string[pyarrow]` dtype that saves on memory and improves speed
- https://pythonspeed.com/articles/pandas-string-dtype-memory/
- We should use it, and verify that all subsequent calls on this...