Iaroslav Igoshev

Results 189 comments of Iaroslav Igoshev

@SiRumCz, could you try to execute `ray.init()` and `importlib.reload(pd)` before `run_my_tasks(xxx)`, where `pd` is `import modin.pandas as pd`?
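
For reference, the ordering suggested above could be wrapped roughly as follows. This is only a sketch: `init_then_reload` is a hypothetical helper name, not part of Modin or Ray, and it assumes that reloading `modin.pandas` after `ray.init()` is enough for Modin to pick up the fresh Ray session.

```python
import importlib


def init_then_reload(init_engine, module):
    """Call the engine initializer first, then reload `module` and return it.

    `init_engine` is any zero-argument callable (for example `ray.init`);
    `module` is an already imported module (for example `modin.pandas`).
    """
    init_engine()
    return importlib.reload(module)


# Intended use (requires modin and ray to be installed):
#   import ray
#   import modin.pandas as pd
#   pd = init_then_reload(ray.init, pd)
#   run_my_tasks(xxx)
```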

@SiRumCz, I opened https://github.com/modin-project/modin/pull/7280, which adds a `reload_modin` function. I tested it on the following example and it passed for me.
```python
import modin.pandas as pd
from modin.utils import reload_modin
import ray

ray.init(num_cpus=16)
...
```

Your understanding is correct: `ray.shutdown()` kills all Ray processes. If we are talking about your workaround,
```
I ended up using a Process to wrap my task into a new...
```
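
A workaround of the kind quoted above, running each task in its own process, might look roughly like this. It is a sketch under assumptions rather than the code from the issue: `run_in_fresh_process` and `_child` are hypothetical names, and the `"fork"` start method (POSIX only) is used just to keep the example short. With `"spawn"`, the task must be picklable and the call must sit under an `if __name__ == "__main__":` guard.

```python
import multiprocessing as mp


def _child(conn, task, args):
    # Runs inside the child process. A real task would call ray.init() at
    # the start and ray.shutdown() at the end (assumption, not from the issue).
    try:
        conn.send(("ok", task(*args)))
    except Exception as exc:
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_in_fresh_process(task, *args):
    """Run task(*args) in a separate process and return its result."""
    ctx = mp.get_context("fork")  # POSIX only; see the note above
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_child, args=(child_conn, task, args))
    proc.start()
    child_conn.close()  # keep only the child's copy of the sending end open
    status, payload = parent_conn.recv()
    proc.join()
    if status == "error":
        raise RuntimeError(payload)
    return payload
```

The result travels back over a pipe, so it has to be picklable; any state the task created in the child goes away when the child exits.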

How much memory do you have on the system? What data sizes do you want to process?

32GB might be insufficient, but Ray should start spilling objects to disk once available memory is depleted, and the flow should still finish. Do you encounter an OOM error?

@SiRumCz, let's keep track of the issue in Ray. Also, we merged the `reload_modin` feature into main, so you can check it out.

@Retribution98, could you also check performance for dtypes, which is part of https://github.com/modin-project/modin/issues/2751?

I think the first approach ("complex") is clear and reasonable. I wonder why the "easy" approach is faster than the "complex" one in your measurements?

Ah, I see why the "easy" approach is faster. I originally thought that you proposed to first make a repartitioning in one remote call and then perform "map" in other...

Reopening since #7136 covered Map operator only.