mars [WIP] use direct async call for in-process rpc

What do these changes do?

This PR uses direct async function calls to speed up in-process actor calls. In the benchmark below, this gives a speedup of about 2.67 times.
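To illustrate the idea only, here is a minimal sketch of the difference between a message-style call path and a direct async call. It does not use Mars or oscar internals: `Counter`, `call_via_message`, `call_direct`, `run` and `compare` are hypothetical names, and the pickle round trip plus the extra task are just assumed stand-ins for the overhead a message-based in-process call might carry.

```python
import asyncio
import pickle
import time


class Counter:
    """Toy stand-in for an actor; this is not a Mars class."""

    def __init__(self, value):
        self.value = value

    async def inc(self, delta):
        self.value += delta
        return self.value


async def call_via_message(actor, method, *args):
    # Assumed message-style path: serialize the request, handle it in a
    # separate task, and pass the serialized result back through a future.
    loop = asyncio.get_running_loop()
    payload = pickle.dumps((method, args))
    future = loop.create_future()

    async def handle():
        name, call_args = pickle.loads(payload)
        result = await getattr(actor, name)(*call_args)
        future.set_result(pickle.dumps(result))

    task = asyncio.create_task(handle())
    result = pickle.loads(await future)
    await task
    return result


async def call_direct(actor, method, *args):
    # Direct path: look up the coroutine method and await it in place.
    return await getattr(actor, method)(*args)


async def run(call, n):
    # Time n sequential calls through the given call path.
    actor = Counter(0)
    value = None
    start = time.perf_counter()
    for _ in range(n):
        value = await call(actor, "inc", 2)
    return value, time.perf_counter() - start


async def compare(n=10_000):
    msg_value, msg_time = await run(call_via_message, n)
    direct_value, direct_time = await run(call_direct, n)
    return msg_value, direct_value, msg_time, direct_time


if __name__ == "__main__":
    msg_value, direct_value, msg_time, direct_time = asyncio.run(compare())
    print(f"message path: {msg_time:.4f}s, direct path: {direct_time:.4f}s")
```

Both paths return the same value; only the per-call overhead differs. The timings from this toy are not comparable to the Mars numbers reported in this PR.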

Benchmark code:

import asyncio
from contextlib import asynccontextmanager
import datetime
import os
import sys
import time

import mars.oscar as mo


class BenchmarkActor(mo.Actor):

    def __init__(self, value):
        super().__init__()
        self.value = value

    async def send(self, uid, method, iternum, *args):
        actor_ref = await mo.actor_ref(uid, address=self.address)
        value = None
        for _ in range(iternum):
            value = await getattr(actor_ref, method)(*args)
        return value

    async def inc(self, delta):
        self.value += delta
        return self.value

    def get_value(self):
        return self.value


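@asynccontextmanager
async def actor_pool_context():
    # Use "forkserver" unless POOL_START_METHOD overrides it;
    # no explicit start method is passed on Windows.
    start_method = (
        os.environ.get("POOL_START_METHOD", "forkserver")
        if sys.platform != "win32"
        else None
    )
    pool = await mo.create_actor_pool(
        "127.0.0.1", n_process=2, subprocess_start_method=start_method
    )
    await pool.start()
    try:
        yield pool
    finally:
        # Stop the pool even if the benchmark raises.
        await pool.stop()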


async def test_dummy_call_benchmark(pool):
    ref1 = await mo.create_actor(BenchmarkActor, 1, address=pool.external_address)
    ref2 = await mo.create_actor(BenchmarkActor, 2, address=pool.external_address)
    await ref1.send(ref2, "inc", 2, 2)
    iternum = 100000
    expect = await ref2.get_value() + iternum * 2
    start = time.time()
    print(f"Start with iternum {iternum} at {datetime.datetime.now()}")
    assert await ref1.send(ref2, "inc", iternum, 2) == expect
    print(f"End with iternum {iternum} at {datetime.datetime.now()}, took {time.time() - start} seconds.")


async def main():
    async with actor_pool_context() as ctx:
        await test_dummy_call_benchmark(ctx)


if __name__ == '__main__':
    asyncio.run(main())

Benchmark Env

2.2 GHz 6-Core Intel Core i7
16 GB 2400 MHz DDR4
OS: macOS Big Sur 11.5.2 (20G95)

Benchmark result

Without this PR: 100000 actor calls in 21.88 seconds
With this PR: 100000 actor calls in 8.19 seconds
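For reference, the speedup and the per-call latency can be worked out from the two timings above. This is plain arithmetic on the reported numbers; the variable names are only for illustration.

```python
calls = 100_000
before_s = 21.88  # without this PR
after_s = 8.19    # with this PR

speedup = before_s / after_s
per_call_before_us = before_s / calls * 1e6
per_call_after_us = after_s / calls * 1e6

print(f"speedup: {speedup:.2f}x")
# prints "speedup: 2.67x"
print(f"per call: {per_call_before_us:.1f} us -> {per_call_after_us:.1f} us")
# prints "per call: 218.8 us -> 81.9 us"
```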

Related issue number

Closes #2691

Check code requirements

[ ] tests added / passed (if needed)
[ ] Ensure all linting tests pass, see here for how to run them

Feb 09 '22 05:02 chaokunyang

Please open a new issue describing your changes. Also, are there any benchmarks for this?

Feb 09 '22 05:02 wjsi

> Please open a new issue describing your changes. Also, are there any benchmarks for this?

@wjsi I will open an issue after I finish the benchmark.

Feb 09 '22 07:02 chaokunyang

Removing the in-process actor would give a bigger performance boost, but that would mean changing a lot of code.

Feb 09 '22 08:02 fyrestone

> Removing the in-process actor would give a bigger performance boost, but that would mean changing a lot of code.

Yes, removing the in-process actor would give a bigger performance boost. In fact, after switching to direct calls for in-process calls, the main cost is oscar itself. This is clear from the dumped flame graph: profile

Feb 09 '22 08:02 chaokunyang

@chaokunyang any updates about this PR?

Apr 06 '22 03:04 wjsi

Feb 20 '23 09:02 chaokunyang
