add crc=True|False parameter to HDFileSystem(...)

Open sk1p opened this issue 6 years ago • 5 comments

Title says it all. I also added instructions for testing on Python 2.7 to the CI README. The test is a bit long-winded; comments and improvements are welcome.

sk1p avatar May 18 '18 09:05 sk1p

I am stumped! I have no idea why the compression should matter, since the values encoded in the path by partition_on are not compressed at all. That it might depend on the type of the value is less surprising: fastparquet attempts to convert the (string) values encoded in the path back into whatever the original pandas type was, so for an `in` filter the types would need to match for a value to pass. You can check what was inferred with

pf = fastparquet.ParquetFile(..)
pf.cats

See the function filter_out_cats for how the values get used for comparison.
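
To illustrate why the type matters for an `in` filter, here is a minimal sketch in plain Python. It is not fastparquet's code: `infer_partition_value` and `passes_in_filter` are hypothetical helpers, and the int-then-float conversion order is an assumption made only for the example.

```python
def infer_partition_value(text):
    """Hypothetical stand-in: turn a path-encoded string into a typed value.

    Tries int, then float, and falls back to the original string.
    """
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def passes_in_filter(path_value, allowed):
    """Return True if the typed partition value is among the allowed values."""
    return infer_partition_value(path_value) in allowed


# A partition directory such as "year=2020" carries the string "2020".
print(passes_in_filter("2020", [2020, 2021]))      # filter values are ints
print(passes_in_filter("2020", ["2020", "2021"]))  # filter values are strings
```

Under these assumptions, the first call matches because "2020" is converted to the integer 2020, while the second does not, because the integer 2020 is never equal to the string "2020".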

martindurant avatar Feb 01 '21 16:02 martindurant

Hi @martindurant, thanks for the feedback. At the moment I am only reporting the bug, as I am focused on other topics ;). Best,

yohplala avatar Feb 02 '21 08:02 yohplala