fastparquet
Python implementation of the Parquet columnar file format.
**What happened**: When trying to write a large pandas DataFrame to S3 using fastparquet, I get an error. This doesn't happen with smaller DataFrames, or if writing the big DataFrame...
I think there's an issue with the conda-forge package for NumPy 1.15.x:

```
conda create -n tst -c conda-forge --force python=3.6.7 fastparquet numpy=1.15.4
```

```
bash-5.0$ python -c "import fastparquet"...
```
`test_import_without_warning` does this:

```
subprocess.check_call([sys.executable, "-Werror", "-c", "import fastparquet"])
```

But with packaging >= 20.5 this raises a DeprecationWarning:

```
[ 95s] E subprocess.CalledProcessError: Command '['/usr/bin/python3.6', '-Werror', '-c', 'import fastparquet']'...
```
Hi, the configuration needed for this bug to happen has been difficult to formalize; it took me a while. I noticed that `filters` in `to_pandas` is ineffective in case the parquet...
Getting Error FileNotFoundError: [Errno 2] No such file or directory: 'llvm-config': 'llvm-config'
I am getting an error with my Dockerfile running on Python 3.7 Alpine. I have successfully installed pip-21.0. Here is the Dockerfile:

```
FROM python:3.7-alpine
COPY . /DataUtil/
WORKDIR /DataUtil...
```
Recording a DataFrame with a `pandas.DatetimeIndex` works.

```python
import pandas as pd
import fastparquet
import os

path = os.path.expanduser('~/Documents/code/draft/data/')
file = path + 'weather_data'
datetime_index = pd.date_range(start = pd.Timestamp('2020/01/02 01:00:00'),...
```
My understanding of the pandas library is that it requires loading the entire dataset into memory. Is there any way to avoid this requirement and write data from a stream or...
Hi all, I have tried to write a pandas DataFrame as a parquet file. My DataFrame has some columns containing lists or tuples as objects. If I try to...
Test failure with 0.4.1 (and 0.4.0) cloned from this repo, with Python 3.8.

```
=================================== FAILURES ===================================
_ test_frame_write_read_verify[input_symbols8-10-hive-2-partitions8-filters8] __
tempdir = '/build/tmpighy8d7p', input_symbols = ['NOW', 'SPY', 'VIX']
input_days =...
```
The issue can be reproduced as follows:

```
import pandas as pd
df = pd.DataFrame([
    [1, 'DE', 2.3],
    [2, 'BE', 4.5],
    [3, 'DE', 7.6],
    [4, 'DE', 4.8]
], columns=['id', 'country',...
```