parquet-python
ValueError ordinal must be >= 1
I'm trying to use parquet.reader(file_obj), but when I run it on my Parquet file I get this error:
    for row in parquet.reader(fo):
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/__init__.py", line 472, in reader
    dict_items = _read_dictionary_page(file_obj, schema_helper, page_header, cmd)
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/__init__.py", line 395, in _read_dictionary_page
    return convert_column(values, schema_element) if schema_element.converted_type is not None else values
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/converted_types.py", line 68, in convert_column
    return [datetime.date.fromordinal(d) for d in data]
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/converted_types.py", line 68, in <listcomp>
    return [datetime.date.fromordinal(d) for d in data]
What can I do?
Hi, did you open the file in binary mode? We recently updated the example in the README: https://github.com/jcrobak/parquet-python#example
The error remains:
>>> import parquet
>>> with open("victimas_union_recat.parquet", "rb") as fo:
...     for row in parquet.reader(fo):
...         pass
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/__init__.py", line 472, in reader
    dict_items = _read_dictionary_page(file_obj, schema_helper, page_header, cmd)
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/__init__.py", line 395, in _read_dictionary_page
    return convert_column(values, schema_element) if schema_element.converted_type is not None else values
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/converted_types.py", line 68, in convert_column
    return [datetime.date.fromordinal(d) for d in data]
  File "/home/vladimir/.local/share/virtualenvs/ComisionDeLaVerdad-FivqEOe7/lib/python3.7/site-packages/parquet/converted_types.py", line 68, in <listcomp>
    return [datetime.date.fromordinal(d) for d in data]
ValueError: ordinal must be >= 1
I ended up using pyarrow instead (as recommended by the pandas.read_parquet method).
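For anyone who lands here with the same traceback, here is a possible explanation. It is an assumption; nobody in this thread confirmed it. The failing line calls datetime.date.fromordinal(d), which counts days from 0001-01-01 and rejects any value below 1. The Parquet format stores DATE values as days since 1970-01-01, so a column containing 1970-01-01 itself (stored as 0) or any earlier date (stored as a negative number) would raise exactly this ValueError. The sketch below uses only the standard library to reproduce the error and show an epoch-based conversion. The helper name days_since_epoch_to_date is hypothetical and is not part of parquet-python.

```python
import datetime

# Parquet DATE values count days from the Unix epoch.
EPOCH = datetime.date(1970, 1, 1)


def days_since_epoch_to_date(days):
    """Convert a days-since-1970-01-01 integer to a datetime.date.

    Hypothetical helper for illustration; not a parquet-python function.
    """
    return EPOCH + datetime.timedelta(days=days)


# Reproduce the failure from the traceback: fromordinal() rejects values < 1.
try:
    datetime.date.fromordinal(0)
except ValueError as exc:
    print("fromordinal failed:", exc)

# The epoch-based conversion handles zero and negative day counts.
print(days_since_epoch_to_date(0))     # 1970-01-01
print(days_since_epoch_to_date(-365))  # 1969-01-01
```

If that is what is happening here, switching to another reader such as pyarrow sidesteps the problem rather than fixing the conversion in parquet-python.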