datatable

A Python package for manipulating 2-dimensional tabular data structures

Results 151 datatable issues

```sh
wget https://raw.githubusercontent.com/h2oai/db-benchmark/cf255c174647ac437aa7a85751f6e65732a3cb9a/_data/groupby-datagen.R
Rscript groupby-datagen.R 1e9 1e2 5 0
## activate your pydt env
source ~/git/db-benchmark/pydatatable/py-pydatatable/bin/activate
python
import datatable as dt
from datatable import f, count
x = dt.fread('G1_1e9_1e2_5_0.csv', na_strings=[''])
```
...

bug
fread

- Did you find a bug in datatable, or maybe the bug found you?
  I found a bug.
- How to reproduce the bug?
  ```sh
  wget https://raw.githubusercontent.com/h2oai/db-benchmark/cf255c174647ac437aa7a85751f6e65732a3cb9a/_data/groupby-datagen.R
  Rscript groupby-datagen.R 1e8...
  ```

segfault
sort

`dt.Frame` raises an error when importing a pandas frame whose columns are of type `Int32`, a type used so that the columns can hold missing values.
```py
import pandas as pd
import...
```

new feature

This is a follow-up to our Slack conversation. The feature request is for an API that forces materialization of computation results that might not have been materialized yet, simply because...

Currently the default value for `fill` is `False`. However, when `sep = ' '` we change `fill` to `True`. This shouldn't happen if the user explicitly passed `fill=False`....

improve
fread
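
One common way to tell "the user passed `fill=False`" apart from "the user passed nothing" is a sentinel default. The sketch below only illustrates that idea; `resolve_fill` and `_UNSET` are hypothetical names, and this is not datatable's actual `fread` code.

```python
# Hypothetical sketch; not datatable's actual fread implementation.
_UNSET = object()  # sentinel meaning "the caller did not pass fill"


def resolve_fill(sep, fill=_UNSET):
    """Return the effective fill setting for a given separator."""
    if fill is _UNSET:
        # No explicit choice: apply the separator-dependent default
        # described in the issue (True only when sep is a single space).
        return sep == " "
    # An explicit choice always wins, including fill=False with sep=" ".
    return bool(fill)
```

With this shape, `resolve_fill(" ")` still turns filling on, while `resolve_fill(" ", fill=False)` respects the explicit request.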

Python 3.8 added pickle protocol version 5, which allows avoiding excessive memory copies of serialized objects. We should make use of this feature for faster inter-process data exchange....

improve
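
As a rough illustration of what protocol 5 offers, the sketch below follows the out-of-band buffer pattern from the standard library's `pickle` documentation. `ColumnBuffer` is a hypothetical stand-in for a column's raw data; nothing here reflects how datatable itself serializes frames.

```python
import pickle
from pickle import PickleBuffer


class ColumnBuffer(bytearray):
    """Hypothetical stand-in for a column's raw data buffer."""

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            # Hand the memory to pickle as a buffer; it may travel out-of-band.
            return type(self)._rebuild, (PickleBuffer(self),), None
        # Older protocols: fall back to an ordinary in-band copy.
        return type(self)._rebuild, (bytearray(self),)

    @classmethod
    def _rebuild(cls, obj):
        with memoryview(obj) as m:
            original = m.obj  # the object that owns the underlying memory
            if type(original) is cls:
                return original  # out-of-band: reuse it, no copy
            return cls(original)  # in-band: build a new object from the bytes


def roundtrip(col):
    """Pickle `col` with out-of-band buffers and load it back."""
    buffers = []
    payload = pickle.dumps(col, protocol=5, buffer_callback=buffers.append)
    return pickle.loads(payload, buffers=buffers), buffers
```

In a single process the restored object is the original one, which shows that no copy of the data was made; across processes the collected buffers would still have to be transported by some mechanism of the caller's choosing.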

- Did you find a bug in datatable, or maybe the bug found you? Loss of column names during some operations. What determines how a column name is changed? What...

documentation

https://archive.ics.uci.edu/ml/machine-learning-databases/badges/badges.data This seems like a reasonable data set that needs better whitespace detection. As with datetime, the field here is firstnameinitiallastname, and dt gets confused when the name format changes slightly. As you...

bug
improve
fread
low priority

This is a proposal for implementing a new function `xread()`, which would be conceptually similar to `fread()`, but much lazier. In particular, `xread()` would parse only the first `n_sample_lines=100` lines...

fread
design-doc
cust-goldmansachs

I'd like to be able to compute different rolling aggregations of my dataset based on order and grouping columns. Pandas has robust support for this type of operation. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rolling.html....

new feature
cust-goldmansachs
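
To make the request concrete, here is a plain-Python sketch of a rolling mean computed per group and ordered by an order column. `grouped_rolling_mean` is a hypothetical function for illustration; it is neither a datatable API nor the pandas implementation. It returns `None` until a window is full, which mirrors the pandas default for fixed-size windows.

```python
from collections import defaultdict, deque


def grouped_rolling_mean(rows, window):
    """Rolling mean of `value` within each group, ordered by `order`.

    rows: iterable of (group, order, value) tuples (hypothetical layout).
    Returns a list of (group, order, mean) tuples; mean is None until
    the group's window holds `window` values.
    """
    windows = defaultdict(lambda: deque(maxlen=window))
    result = []
    for group, order, value in sorted(rows, key=lambda r: (r[0], r[1])):
        recent = windows[group]
        recent.append(value)  # deque drops the oldest value automatically
        mean = sum(recent) / window if len(recent) == window else None
        result.append((group, order, mean))
    return result
```

For example, with a window of 2 the group `"a"` values 1.0, 3.0, 5.0 (in order) give `None`, 2.0, 4.0.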