yohplala
When doing a row-wise filtering (not row-group-wise filtering) with filters '>', '>=', '=', '
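The distinction between row-group-wise and row-wise filtering mentioned above can be illustrated with a small conceptual sketch. It uses plain Python lists as stand-ins for Parquet row groups; the helper names `row_group_filter` and `row_filter` are hypothetical and are not fastparquet functions, and this is not how fastparquet implements filtering.

```python
# Conceptual sketch only: plain lists stand in for Parquet row groups.
# Helper names are hypothetical; this is not fastparquet's implementation.

def row_group_filter(row_groups, threshold):
    """Keep every row group whose maximum could satisfy `value > threshold`.

    Whole groups are kept or dropped, so rows that fail the
    predicate can still be present in the result.
    """
    return [rg for rg in row_groups if max(rg) > threshold]


def row_filter(row_groups, threshold):
    """Keep only the individual rows that satisfy `value > threshold`."""
    return [v for rg in row_groups for v in rg if v > threshold]


row_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

kept_groups = row_group_filter(row_groups, 4)  # drops only the first group
kept_rows = row_filter(row_groups, 4)          # drops every row <= 4
```

With a threshold of 4, the row-group-wise result still contains the value 4, because the second group is kept as a whole, while the row-wise result does not.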
Hi, do you think we could record the size of each row group in the metadata when writing them? This data can be obtained when the dataframe is split in row_group...
When using `read_row_group_file()`: - it is compulsory to provide columns and categories. - also, the way to specify loading all partitions, or only some of them is...
**What happened**: Test case broken with new version of fastparquet. **Minimal Complete Verifiable Example**: Here is an example showing the trouble: ```python import os import pandas as pd import fastparquet...
Hi, the configuration that triggers this bug was difficult to formalize; it took me a while. I noticed that `filters` in `to_pandas` is ineffective in case the parquet...
Recording a DataFrame with a `pandas.DatetimeIndex` works. ```python import pandas as pd import fastparquet import os path = os.path.expanduser('~/Documents/code/draft/data/') file = path + 'weather_data' datetime_index = pd.date_range(start = pd.Timestamp('2020/01/02 01:00:00'),...
Bug under investigation when combining row filtering and nulls in a DataFrame. Description is in ticket #957
```python """Test categorical data with nulls and read with filters""" fn = os.path.join(str(tempdir), 'test.parquet') # Create DataFrame with categorical and nullable columns df = pd.DataFrame({ 'cat_col': ['A', 'B', None, 'C']...
Replaces PR #953. I am very sorry, I made a mess of the commit history, so I restarted from scratch. Text from PR #953 applies: PR aiming to solve #949 The...
Hello, could we have a linting tool set up (maybe through pre-commit) in fastparquet? We tackled this topic in issue #720. At the time, @martindurant, you mentioned: > It...