parquet-python

Python implementation of the parquet columnar file format.

Results 16 parquet-python issues

`pip install parquet` fails on Python 3.11 because [thriftpy2 won't build](https://github.com/Thriftpy/thriftpy2/issues/192) on Python 3.11

There seems to be a mishandling of MAP columns since those columns contain groups named key_value with elements key and value, and those are considered already seen which leads to...

Hi, would you consider adding parquet-python to [conda-forge](https://conda-forge.org/docs/maintainer/adding_pkgs.html)? Right now it seems to be the only tool out there that can read parquet data as a bytestream and doesn't require...

Hi Joe and others, I am trying to use your module to read a parquet file, and I ran into a problem here: schema.py, line 21: assert len(self.schema_elements) == len(self.schema_elements_by_name)...

The deprecated aliases have been removed in python/cpython#28268

I have the following parquet schema: field4: BINARY UNCOMPRESSED DO:0 FPO:170 SZ:58/58/1.00 VC:1 ENC:PLAIN,BIT_PACKED ST:[min: 32505002.09, max: 32505002.09, num_nulls: 0] json: {"field4":"32505002.09"} However, if I try to read it I...

I'm trying to use `parquet.reader(file_obj)`, but when I do on my parquet I find this error: ```zsh for row in parquet.reader(fo): File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/__init__.py", line 472, in reader dict_items = _read_dictionary_page(file_obj,...

I had an error when trying to open a parquet file: Traceback (most recent call last): File "/local/workplace/lib/python3.6/site-packages/lambda_handlers/parquet_test.py", line 57, in lambda_handler for row in parquet.reader(fin): File "/local/workplace/lib/python3.6/site-packages/parquet/__init__.py", line 470,...

The following instances should use `tostring` on Python 2 and `tobytes` on Python 3.

```
test/test_encoding.py
114: encoded_bitstring = array.array('B', raw_data_in).tostring()
134: 'B', [0b00000101, 0b00111001, 0b01110111]).tostring()
```
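One way to handle the `tostring`/`tobytes` difference is a small compatibility helper. The sketch below is only an illustration, not the project's actual fix; the helper name `array_to_bytes` is hypothetical. It relies on `array.array.tobytes()` being available on Python 3.2+ and falls back to `tostring()` on older interpreters.

```python
import array


def array_to_bytes(arr):
    """Return the raw bytes of an array.array on Python 2 and Python 3.

    Hypothetical helper for illustration; not part of parquet-python.
    """
    # tobytes() exists on Python 3.2+; tostring() was removed in Python 3.9.
    if hasattr(arr, "tobytes"):
        return arr.tobytes()
    # Python 2 fallback, where only tostring() is available.
    return arr.tostring()


# Same byte values as the second instance quoted in the issue.
encoded = array_to_bytes(array.array("B", [0b00000101, 0b00111001, 0b01110111]))
```

Feature detection with `hasattr` avoids comparing version numbers and keeps the call sites identical on both interpreters.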

Couldn't read booleans because they were set to 0 in the thrift file which would then evaluate to False when checking their truth value. Changed it so it specifically checks...
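The snippet above is truncated, so the exact check that was introduced is not shown. The sketch below only illustrates the general pitfall it describes: a field whose legitimate value is 0 is treated as missing when its truthiness is tested. The `FakeStruct` class and its `value` field are hypothetical stand-ins for a thrift-generated struct, and the comparison against `None` is an assumption about what an explicit check might look like.

```python
class FakeStruct(object):
    """Hypothetical stand-in for a thrift struct with one optional field."""

    def __init__(self, value=None):
        # None means "field not set"; 0 is a valid, present value.
        self.value = value


def is_set_truthy(struct):
    # Pitfall: a present value of 0 (or False) is reported as missing.
    return bool(struct.value)


def is_set_explicit(struct):
    # Assumed alternative: only None counts as "not set".
    return struct.value is not None


zero = FakeStruct(0)
unset = FakeStruct()
print(is_set_truthy(zero), is_set_explicit(zero), is_set_explicit(unset))
```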