fastparquet

Python implementation of the Parquet columnar file format.

Results 89 fastparquet issues

While trying to optimize the speed of a serverless/Lambda deployment, I found that the fastparquet wheel contains a test folder of ~80 MB. Could this be excluded from the distribution...
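One way to check a report like this is to total the uncompressed file sizes inside the wheel per directory, since a wheel is an ordinary zip archive. The sketch below uses only the standard library; the function name `size_by_directory` and the in-memory demo archive (including its `fastparquet/test` path) are illustrative assumptions, not anything taken from the fastparquet repository.

```python
import io
import zipfile
from collections import Counter


def size_by_directory(wheel, depth=2):
    """Sum uncompressed sizes of archive entries, grouped by their leading directories.

    `wheel` may be a path to a .whl file or a file-like object.
    `depth` is how many leading path components to group by.
    """
    totals = Counter()
    with zipfile.ZipFile(wheel) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            parts = info.filename.split("/")
            # Drop the file name, keep at most `depth` directory components.
            key = "/".join(parts[:-1][:depth]) or "."
            totals[key] += info.file_size
    return totals


# Demo on a small in-memory archive (hypothetical layout, not the real wheel).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("fastparquet/api.py", b"x" * 10)
    demo.writestr("fastparquet/test/data.parquet", b"x" * 1000)
    demo.writestr("fastparquet/test/sub/more.parquet", b"x" * 500)
buffer.seek(0)

for directory, size in size_by_directory(buffer).most_common():
    print(f"{directory}: {size} bytes")
```

For a real check, pass the path of the downloaded `.whl` file instead of the in-memory buffer.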

This warning message is not understandable. I searched the net and found some references to it but have not been able to understand whether it is important or not. Using...

**Code:**
```
from fastparquet import ParquetFile
pf = ParquetFile('/path/file.parquet')
df = pf.to_pandas()
```
**Error:**
```
File "/home/.../venv/lib64/python3.7/site-packages/fastparquet/core.py", line 112, in read_data_page
    nval = daph.num_values - num_nulls
AttributeError: 'NoneType' object has...
```

I'm looking for a way to dump a ParquetFile to an actual file. I've tried the [write method](https://fastparquet.readthedocs.io/en/latest/api.html#fastparquet.write), passing a ParquetFile object in place of the pandas data, but I receive the following error: `TypeError:...

It looks like the `packaging` package was added to requirements.txt and merged to master, but the 0.4.0 tag still fails when installing with pip. As a workaround, I need...

Hello, In the file `API.py`, the function `filter_out_stats` uses `min` and `max` statistic fields, but they are marked as deprecated in the parquet thrift specification. https://github.com/apache/arrow/blob/master/cpp/src/parquet/parquet.thrift line: 201 As some...
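To illustrate the concern, here is a minimal sketch of statistics-based row-group filtering that prefers the newer `min_value`/`max_value` fields from parquet.thrift and falls back to the deprecated `min`/`max` only when the newer ones are absent. This is not fastparquet's `filter_out_stats`; plain dicts stand in for the thrift `Statistics` struct, and the helper names `stat_bounds` and `may_contain` are hypothetical.

```python
def stat_bounds(stats):
    """Return (lower, upper) bounds from a statistics mapping, or None where unknown.

    Prefers `min_value`/`max_value`; falls back to the deprecated `min`/`max`.
    """
    lower = stats.get("min_value")
    if lower is None:
        lower = stats.get("min")
    upper = stats.get("max_value")
    if upper is None:
        upper = stats.get("max")
    return lower, upper


def may_contain(stats, op, value):
    """Return False only when the statistics prove no row can satisfy `column op value`."""
    lower, upper = stat_bounds(stats)
    if lower is None or upper is None:
        return True  # no usable statistics, so the row group cannot be skipped
    if op == "==":
        return lower <= value <= upper
    if op == ">":
        return upper > value
    if op == ">=":
        return upper >= value
    if op == "<":
        return lower < value
    if op == "<=":
        return lower <= value
    return True  # unknown operator: be conservative and keep the row group


# A row group written with only the deprecated fields is still filterable.
print(may_contain({"min": 1, "max": 5}, ">", 10))
# A row group without statistics is always kept.
print(may_contain({}, "==", 3))
```

The conservative default (keep the row group whenever the statistics are missing or the operator is unrecognized) means a filter of this kind can only skip data that is provably irrelevant.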

python version: `Python 3.8.2`
command: `pip install fastparquet`
error: `Building wheels for collected packages: fastparquet Building wheel for fastparquet (setup.py) ... error` ` building 'fastparquet.speedups' extension error: Microsoft Visual C++...

I am trying to write a Parquet file with Dask and fastparquet from a DataFrame that has a column of type 'Int64' (https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html). Unfortunately, I got the following error:...

```
lib\fastparquet\writer.py:655: FutureWarning: RangeIndex._start is deprecated and will be removed in a future version. Use RangeIndex.start instead
  index_cols = [{'name': index_cols.name, 'start': index_cols._start,
lib\fastparquet\writer.py:656: FutureWarning: RangeIndex._step is deprecated and will...
```

Fastparquet does not appear to support writing Dask dataframes with Pandas SparseArray columns. Doing so fails with:
```
AttributeError: 'SparseDtype' object has no attribute 'itemsize'
```
Pandas: 0.25.1
Dask: 2.4.0...