
A Python package for manipulating 2-dimensional tabular data structures

Results 151 datatable issues

I encountered unexpected behavior in `to_csv()` and `fread()` regarding their handling of the 'NA' string. When I ran the following code, ```python import datatable as dt data = dt.Frame(['a', 'NA'])...

improve
Beginner task

A feature often used in pandas is `apply` (or `aggregate` or `transform`), which allows a mapping, an aggregation, or even a partial reduction operation. [PySpark](https://databricks.com/blog/2017/10/30/introducing-vectorized-udfs-for-pyspark.html) basically...

design-doc
new feature

- `dt.isna()`, `dt.abs()`, `dt.exp()`, `dt.log()`, `dt.log10()` have long been marked internally as [deprecated](https://github.com/h2oai/datatable/blob/main/src/datatable/expr/math.py#L21-L26), so they should be replaced with the corresponding `dt.math.*()` functions;
- `dt.len()` should be deprecated...

The structure of the tests should be made approximately the same as the structure of API docs, with a dedicated test file for every function. These test files should then...

test
refactor
EPIC :star:

The output for statistical reducers when used in the square brackets syntax does not match that of Frame methods. ```python srcs = ["a", "bc", "def", None, -2.5, 3.7] / dt.obj64...

Our current approach for creating new `Expr`s is overly complicated, as outlined here: https://github.com/h2oai/datatable/blob/master/src/core/expr/!readme.md. This complexity stems mostly from the fact that the `Expr` class which performs arithmetic on `f-expressions`...

refactor
EPIC :star:

- [x] Suppose a column stores time as UNIX epoch values of integer type; provide a function that converts this column to time64, similar to `pandas.to_datetime`
- [ ]...
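
The conversion requested in the first item can be illustrated without datatable. The following is a minimal sketch using only the Python standard library; the function name `epoch_to_datetimes` and its `unit` parameter are hypothetical and are not part of datatable's API.

```python
from datetime import datetime, timedelta, timezone

# Reference point for UNIX time, as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical unit codes mapped to the matching timedelta keyword.
_UNITS = {"s": "seconds", "ms": "milliseconds", "us": "microseconds"}


def epoch_to_datetimes(values, unit="s"):
    """Convert integer UNIX epoch values to UTC datetimes.

    Missing values (None) are passed through unchanged.
    """
    if unit not in _UNITS:
        raise ValueError(f"unsupported unit: {unit!r}")
    keyword = _UNITS[unit]
    return [
        None if v is None else EPOCH + timedelta(**{keyword: v})
        for v in values
    ]


converted = epoch_to_datetimes([0, 86400, None])
```

Adding a `timedelta` to a fixed epoch keeps the arithmetic in integers, which avoids the rounding that can occur when dividing large millisecond or microsecond counts into floating-point seconds.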

Currently when a link in the lhs menu is clicked, that menu is scrolled to the top as the new page loads. This makes the navigation very inconvenient. A better...

improve
documentation

- Convert data from wide to long, and vice versa (similar to `melt`/`dcast` in `rdatatable`, or `melt`/`pivot`/`pd.wide_to_long` in `pandas`) # Example: df = dt.Frame({"A":['a','b','c'], "B":[1,3,5],"C":[2,4,6]}) Transform from wide to...
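
To make the wide-to-long request concrete, here is a minimal sketch in plain Python that works on a dict of columns with the same data as the example frame. The `melt` helper and its `id_vars`, `var_name`, and `value_name` parameters are hypothetical names borrowed from pandas for illustration; this is not a datatable function.

```python
def melt(columns, id_vars, var_name="variable", value_name="value"):
    """Reshape a dict of equal-length columns from wide to long form.

    Columns listed in id_vars are repeated for every remaining column;
    the remaining column names go to var_name and their cells to value_name.
    """
    value_vars = [name for name in columns if name not in id_vars]
    nrows = len(next(iter(columns.values()))) if columns else 0

    long = {name: [] for name in id_vars}
    long[var_name] = []
    long[value_name] = []

    for var in value_vars:
        for i in range(nrows):
            for name in id_vars:
                long[name].append(columns[name][i])
            long[var_name].append(var)
            long[value_name].append(columns[var][i])
    return long


wide = {"A": ["a", "b", "c"], "B": [1, 3, 5], "C": [2, 4, 6]}
long = melt(wide, id_vars=["A"])
```

Each identifier value in `A` appears once per melted column, so the three-row, three-column input becomes six rows with columns `A`, `variable`, and `value`.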

- Instead of a default ``C0``, it would be nice to have some relevant name ## Example: ``` from datatable import dt, f, by grades = [48, 99, 75, 80,...

improve
groupby