backend.ai
backend.ai copied to clipboard
fix: Replace `roundrobin` flag with `AgentSelectionStrategy.RoundRobin` strategy
Checklist: (if applicable)
- [x] Milestone metadata specifying the target backport version
- [x] Mention to the original issue
Follow up PR of #1405.
This PR performs overall revamps related AgentSelectionStrategy things.
Details:
- Replace the
roundrobinflag with AgentSelectionStrategy.RoundRobin strategy - Simplify the roundrobin agent selection logic when joined agent list changes.
- Move
agent_selection_resource_priorityintoAbstractSchedulerto avoid passing it all related functions each time. - Fix wrong type declaration (Replace
KernelInfowithKernelRowofassign_agent_for_kernel) - Abstract agent selection logic into
AbstractScheduler.
How to test
To test the RoundRobin strategy,
- To use the RoundRobin policy for
agent_selection_strategy, change theagent_selection_strategyinscaling_groups.scheduler_optstoroundrobin. - Update the redis address in etcd by
./backend.ai mgr etcd put config/redis/addr <manager_addr>:8111 - Update the
rpc-listen-addr,idof theagentsection in each agent'sagent.toml. - Checks if all agents are properly connected to the manager in the Backend.AI web UI.
- Create sessions to verify that sessions are evenly distributed among the agents and that the
next_indexforround_robinstate is correctly updated in etcd.