
Python binary module runs slowly in Apollo

Open algo-scope opened this issue 3 years ago • 4 comments

I have a module written in Python and built as a py_binary, and I wrote a launch file for it:

<cyber>
    <module>
        <name>my_py</name>
        <dag_conf></dag_conf>
        <type>binary</type>
        <process_name>
           /apollo/bazel-bin/modules/my_module/my_py
        </process_name>
    </module>
</cyber>

I run cyber_launch start xx.launch after starting all the other modules on my vehicle. The module then runs much slower than when it runs on its own, whether I start it with cyber_launch or with python xx.py directly. Does the CyberRT scheduler apply to the binary process type, or is it only useful for components with a .dag file? Can I change my module's priority to make it faster?

algo-scope avatar Aug 16 '22 02:08 algo-scope

I use the Cyber Python API in my module as follows:

if __name__ == "__main__":
    cyber.init()

    if not cyber.ok():
        print('Well, something went wrong.')
        sys.exit(1)

    test_node = cyber.Node('listener')
    writer = test_node.create_writer('/apollo/prediction', PredictionObstacles)
    test_node.create_reader('/apollo/perception/obstacles', PerceptionObstacles, callback, args={'writer': writer})
    test_node.create_reader('/apollo/localization/pose', LocalizationEstimate, ego_callback)
    
    test_node.spin()
    
    cyber.shutdown()

I also changed my module's priority in /apollo/cyber/conf/compute_sched.conf, but it is still slow.

algo-scope avatar Aug 17 '22 02:08 algo-scope

@zhaowenyi7 Slowness always means insufficient system resources; all processes can be scheduled when system resources are sufficient.

If your program is significantly slow, look at the CPU resources first. If your program is not real-time, you can modify the configuration. Linux threads still get scheduled even at low priority, so we first have to figure out which case applies.
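One way to tell the two cases apart is to compare wall-clock time with CPU time inside each callback. The sketch below is only an illustration: `timed` and `summarize` are hypothetical helper names, not part of the Cyber API. A low CPU-to-wall ratio suggests the callback thread is mostly waiting (descheduled or blocked), while a ratio near 1 suggests the callback itself is compute-bound.

```python
import time
from functools import wraps


def timed(stats):
    """Decorator factory: append (wall_seconds, cpu_seconds) per call to `stats`."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            wall_start = time.perf_counter()   # elapsed real time
            cpu_start = time.thread_time()     # CPU time used by this thread only
            try:
                return fn(*args, **kwargs)
            finally:
                stats.append((time.perf_counter() - wall_start,
                              time.thread_time() - cpu_start))
        return wrapper
    return decorate


def summarize(stats):
    """Aggregate the recorded samples into totals and a CPU/wall ratio."""
    wall = sum(w for w, _ in stats)
    cpu = sum(c for _, c in stats)
    return {
        "calls": len(stats),
        "wall_s": wall,
        "cpu_s": cpu,
        "cpu_ratio": cpu / wall if wall > 0 else 0.0,
    }
```

Decorating the reader callbacks with `@timed(stats)` and printing `summarize(stats)` after a test run gives a rough picture of where the time goes.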

daohu527 avatar Aug 17 '22 15:08 daohu527

With all the other modules launched, each of the 16 logical cores of the i9-9900K was at about 60% usage. I tested it in 2 ways:

  1. Use the Cyber Python API to read, compute, and write. Usage on every core rises somewhat but stays below 70%. The CPU still has plenty of headroom, yet the module runs slowly.
  2. Parse a history record into a CSV, then use Python without Cyber to read the CSV, without writing to a channel. The Python script ends up running on 1 core at 100% usage, and it is even slower than test 1. It seems that Cyber does schedule my Python module, but I can't change its priority.

If I reimplement my module in C++, inheriting from Component and using Init() and Proc(), would it be better?

algo-scope avatar Aug 18 '22 02:08 algo-scope

Both ways are the same. If you use a dag file, the component integrates the readers; your Python script creates 3 readers.

There is a trick for setting priorities: add the config below to cyber/conf/compute_sched_classic.conf:

{
	name: "listener_/apollo/prediction"
	prio: 19   # max to be 20
},
{
	name: "listener_/apollo/perception/obstacles"
	prio: 19   # max to be 20
},
{
	name: "listener_/apollo/localization/pose"
	prio: 19   # max to be 20
},

If you have I/O operations in your callbacks, such as reading and writing files, setting the priority may not help much.
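A common way to keep file I/O from stalling a callback is to hand the work to a background thread through a queue, so the callback returns right away. This is a minimal sketch under that assumption, not something Cyber provides: `BackgroundWriter` and its `sink` parameter are hypothetical names, and `sink` stands for whatever slow operation the callback would otherwise do inline.

```python
import queue
import threading


class BackgroundWriter:
    """Runs a slow sink (e.g. a file write) on a worker thread."""

    def __init__(self, sink, maxsize=1000):
        self._sink = sink
        self._queue = queue.Queue(maxsize=maxsize)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, item):
        """Call from the callback. Never blocks; returns False if the queue is full.

        `None` is reserved as the stop sentinel and must not be submitted.
        """
        try:
            self._queue.put_nowait(item)
            return True
        except queue.Full:
            return False

    def _run(self):
        # Drain items in FIFO order until the sentinel arrives.
        while True:
            item = self._queue.get()
            if item is None:
                break
            self._sink(item)

    def close(self):
        """Flush everything queued so far, then stop the worker."""
        self._queue.put(None)
        self._thread.join()
```

In a callback, `writer.submit(data)` would replace the direct file write, and `writer.close()` would be called before `cyber.shutdown()`. Dropping items when the queue is full is a design choice here; blocking in `submit` instead would bring the stall back into the callback.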

daohu527 avatar Aug 18 '22 08:08 daohu527