Iaroslav Igoshev
@abudden, thanks for the update. Yes, we should probably add a note to the docs regarding the engine initialization time. @mvashishtha, can you take this on?
Is this PR ready for review?
> switch to using the basic package itself (ray), but this will not work for fairly new versions of Ray, since they don't include the grpcio dependency. Why did they...
ray.get() can deserialize data with zero-copy for primitive data types (if an object supports pickle protocol 5). Then, we intentionally make a copy of the data so the user can...
Hi @Abhi5h3k, thanks for opening this question! I am afraid it is not possible to use Modin with an engine like Ray, Dask or MPI inside a celery task. Why...
@rootsmusic, thanks for letting us know about this! When pandas starts supporting Python 3.13, we should also try to support it. However, as long as a Modin engine (Ray, Dask or...
BUG: Excessive log file generation when using Modin[ray] with Parquet files and DataFrame operations
Hi @xixibaobei, thanks for opening this issue! Could you share more details on which Modin, Ray, and Python versions you are using? What OS are you running on? How much memory...
Also, which log file do you mean? The one that Ray generates in the /tmp/ray folder?
Thanks for providing the details. I am not sure whether it is possible to completely disable Ray logging to those files. @rkooo567, could you shed some light on this?
@xixibaobei, btw, did you try setting RAY_verbose_spill_logs=0? Does it reduce the size of the log files?
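For anyone who wants to try this from Python rather than the shell, here is a minimal sketch. It assumes the variable has to be in the environment before Ray starts; the helper name `configure_ray_spill_logging` is hypothetical and not part of Modin or Ray. Whether this actually reduces the log volume in this case is not confirmed.

```python
import os


def configure_ray_spill_logging(verbose: bool = False) -> None:
    """Hypothetical helper: set RAY_verbose_spill_logs for this process.

    Assumption: this runs before Ray is initialized, so that the Ray
    processes started afterwards inherit the value.
    """
    os.environ["RAY_verbose_spill_logs"] = "1" if verbose else "0"


# Ask Ray not to emit verbose object-spilling messages.
configure_ray_spill_logging(verbose=False)

# Afterwards, initialize Ray and Modin as usual, for example:
#   import ray
#   ray.init()
#   import modin.pandas as pd
```

The same effect should be achievable with `export RAY_verbose_spill_logs=0` in the shell before launching the script.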