Volker L.
Hi Johann, my mistake was to think of every timestamp as a new sample and every feature as a different measurement (e.g. temperature and pressure). But this is only true,...
Hope this is not off-topic, but you can leverage `duckdb` or `polars` for this.

```python
import duckdb
import pyarrow.dataset as ds
import polars as pl

dset = ds.dataset('path/to/data')
# duckdb...
```
What is the size of the dataset and where is it stored? In an S3 bucket? If so, this could be interesting for you: https://github.com/apache/arrow/issues/14336
> > Thank you @legout. Duckdb works really well, but polars is struggling. Maybe I am doing something wrong. > > But anyway here is how it worked for me...
Any news here?
I am looking forward to a polars-native solution. My current workaround is the following: ```python def read_parquet_dataset(path, partitioning=None, filter_=None, with_columns=None, storage_options=None): if storage_options is not None: fs=s3fs.S3FileSystem(**storage_options) files=["s3://+f...
I'll add my workaround for a similar problem: unnesting dataframes with (deeply) nested structs. ## Example data ```python data = [ {"a": {"b": 1, "c": {"d":2}}, "b": {"b": 1},...
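
For anyone who only wants the idea behind unnesting, here is a minimal sketch in plain Python. It is not the polars-based workaround from the comment above. The helper name `flatten_record`, the `.` separator, and the sample rows are assumptions made for illustration; the rows are only modeled on the shape of the example data.

```python
def flatten_record(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining key paths with `sep`."""
    flat = {}
    for key, value in record.items():
        # Build the full path of the current key, e.g. "a" + "." + "c".
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested structs and merge their flattened fields.
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical sample rows with two levels of nesting under "a".
rows = [
    {"a": {"b": 1, "c": {"d": 2}}, "b": {"b": 1}},
    {"a": {"b": 3, "c": {"d": 4}}, "b": {"b": 5}},
]

flat_rows = [flatten_record(row) for row in rows]
print(flat_rows[0])
```

Each flattened row has one key per leaf field, so the result can be handed to a dataframe constructor as ordinary flat columns. Joining the key path into the column name keeps fields such as `a.b` and `b.b` from colliding.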
Thanks for the fast response. Unfortunately, the full image is not available for arm64.
Is there a specific reason why the full version is only available for amd64?
Hi, I found the same behavior and wonder if there is a workaround or solution available.