Anatoly Myachev

195 comments by Anatoly Myachev

@zaichang it's strange, because `storage_options` are supported. Maybe these are old logs?

> It was strange then to see a note on how to enable object spilling in multi-node clusters: https://docs.ray.io/en/latest/ray-core/objects/object-spilling.html#cluster-mode without noting that it is actually disabled by default. @zaichang you...
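For context on the spilling setup discussed here, a minimal sketch of how an object spilling configuration is usually built for Ray, following the pattern in the linked Ray docs. The helper name `build_spilling_system_config` and the `/tmp/spill` directory are assumptions for illustration, and the snippet only builds the configuration without starting Ray.

```python
import json
import shlex


def build_spilling_system_config(directory_path: str) -> dict:
    """Return a Ray `_system_config` dict that spills objects to a local directory."""
    spilling = {"type": "filesystem", "params": {"directory_path": directory_path}}
    # Ray expects the spilling config itself as a JSON-encoded string.
    return {"object_spilling_config": json.dumps(spilling)}


system_config = build_spilling_system_config("/tmp/spill")  # hypothetical path

# Single node (requires Ray): ray.init(_system_config=system_config)
# Cluster head node: pass the same dict as JSON on the command line.
print("ray start --head --system-config=" + shlex.quote(json.dumps(system_config)))
```

Whether spilling is on by default depends on the Ray version and deployment mode, so it is worth checking the docs for the version in use.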

> With object spilling enabled, I expect the cluster to complete reading the large parquet to work even if the dataset does not fit in an instance's memory, and surprised...

> On a more general note if I have to use a 32GB instance to read a 10GB parquet file, wouldn't that mean we actually can't work with datasets that...

> @anmyachev I checked my logs and yes there were these lines:
> UserWarning: Defaulting to pandas implementation.
> Please refer to https://modin.readthedocs.io/en/stable/supported_apis/defaulting_to_pandas.html for explanation.
> Reason: Parquet options that...
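Rather than searching logs by hand, a fallback like this can be detected programmatically. Below is a rough sketch using only the standard `warnings` module. `fake_read_parquet` is a hypothetical stand-in that emits the same first warning line as the log above, and `defaulted_to_pandas` is an illustrative helper, not part of Modin.

```python
import warnings


def fake_read_parquet(path, **kwargs):
    # Hypothetical stand-in for a reader that falls back to pandas.
    warnings.warn("Defaulting to pandas implementation.", UserWarning)
    return path


def defaulted_to_pandas(func, *args, **kwargs):
    """Call func and report whether it emitted a 'Defaulting to pandas' UserWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure repeated warnings are recorded
        result = func(*args, **kwargs)
    fell_back = any(
        issubclass(w.category, UserWarning) and "Defaulting to pandas" in str(w.message)
        for w in caught
    )
    return result, fell_back


_, fell_back = defaulted_to_pandas(fake_read_parquet, "data.parquet")
print(fell_back)
```

In a real test, the stand-in would be replaced by the actual call under investigation.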

> @anmyachev Thanks for checking in. I recently re-checked my test and indeed I am not seeing the `read_parquet` unsupported options message any more. Internally we had unrelated code that...

> @dchigarev Why we're merging this to our repo instead of https://github.com/data-apis/dataframe-api-compat?

I consider this change as temporary in order to be in time for the 0.27 release.

> Is...

> > I consider this change as temporary in order to be in time for the 0.27 release.
>
> why do we want this to be in 0.27 release?...

It is also worth noting that: 1. The protocol is still in beta and may change significantly. However, almost all the code can be reused from Pandas, given that the...

Closed in favour of https://github.com/modin-project/modin/pull/7196