milvus [Bug]: Should ban auto balance channel

Is there an existing issue for this?

[X] I have searched the existing issues

Environment

- Milvus version:
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Currently, while a channel is being balanced, two shard leaders can exist on the same channel. We cannot prevent a release from happening on one shard while a search runs on the other, so the search fails with an error.

We should ban automatic channel balancing until we can handle events that may occur on both shards.
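
The failure described above can be illustrated with a minimal Go sketch. It is not Milvus code: `worker`, `leader`, `release`, `search`, and `simulate` are hypothetical names, and the model assumes each shard leader keeps its own view of the segments while the segments themselves live on a shared worker.

```go
package main

import (
	"errors"
	"fmt"
)

// errSegmentNotLoaded marks a search that hit a segment already released.
var errSegmentNotLoaded = errors.New("segment not loaded")

// worker holds the segments that are actually loaded (hypothetical model).
type worker struct{ loaded map[int64]bool }

// leader is a hypothetical shard leader with its own view of the segments.
type leader struct {
	name string
	view map[int64]bool
	w    *worker
}

// release drops the segment from the worker and from this leader's view only.
func (l *leader) release(id int64) {
	delete(l.view, id)
	delete(l.w.loaded, id)
}

// search fails if this leader's view references a segment the worker lost.
func (l *leader) search() error {
	for id := range l.view {
		if !l.w.loaded[id] {
			return fmt.Errorf("%s: segment %d: %w", l.name, id, errSegmentNotLoaded)
		}
	}
	return nil
}

// simulate releases segment 2 through the new leader, then searches via both.
func simulate() (oldErr, newErr error) {
	w := &worker{loaded: map[int64]bool{1: true, 2: true}}
	oldLeader := &leader{name: "old", view: map[int64]bool{1: true, 2: true}, w: w}
	newLeader := &leader{name: "new", view: map[int64]bool{1: true, 2: true}, w: w}
	newLeader.release(2)
	return oldLeader.search(), newLeader.search()
}

func main() {
	oldErr, newErr := simulate()
	fmt.Println("search via old leader:", oldErr)
	fmt.Println("search via new leader:", newErr)
}
```

In this model the leader that did not see the release still routes the search to the released segment, which is the error case the report is about.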

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

Apr 26 '23 04:04 weiliu1031

Should we just release the old channel before bringing up the new one? That should solve the problem, right?

Apr 26 '23 05:04 xiaofan-luan

> Should we just release the old channel before bringing up the new one? That should solve the problem, right?

If we release before subscribing to the channel, there will be a period with no available shard.

Apr 26 '23 06:04 weiliu1031

> > Should we just release the old channel before bringing up the new one? That should solve the problem, right?
>
> If we release before subscribing to the channel, there will be a period with no available shard.

Yep, but there has to be only one leader.
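
The trade-off in this exchange can be made concrete with a small Go sketch. It is only an illustration under simple assumptions: `step`, `leaderRange`, `releaseFirst`, and `subscribeFirst` are hypothetical names, and each handoff is reduced to two steps that add or remove one serving leader.

```go
package main

import "fmt"

// step is one action in a hypothetical channel handoff; delta is the change
// in the number of shard leaders serving the channel.
type step struct {
	name  string
	delta int
}

// leaderRange replays the steps starting from one serving leader and returns
// the fewest and most leaders observed at any point, including the start.
func leaderRange(steps []step) (fewest, most int) {
	leaders := 1
	fewest, most = leaders, leaders
	for _, s := range steps {
		leaders += s.delta
		if leaders < fewest {
			fewest = leaders
		}
		if leaders > most {
			most = leaders
		}
	}
	return fewest, most
}

var (
	// Release the old leader first: no leader serves in between.
	releaseFirst = []step{{"release old leader", -1}, {"subscribe new leader", +1}}
	// Subscribe the new leader first: two leaders serve in between.
	subscribeFirst = []step{{"subscribe new leader", +1}, {"release old leader", -1}}
)

func main() {
	fewest, most := leaderRange(releaseFirst)
	fmt.Printf("release first:   fewest=%d most=%d\n", fewest, most)
	fewest, most = leaderRange(subscribeFirst)
	fmt.Printf("subscribe first: fewest=%d most=%d\n", fewest, most)
}
```

Releasing first drops to zero leaders (the unavailable period), while subscribing first briefly has two leaders, which violates the single-leader requirement.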

Apr 26 '23 06:04 xiaofan-luan

We could let the system have two delegators at the same time:

- Make the old delegator work as a sub node during the balance.
- All load/release operations dispatched by the new delegator shall be forwarded by the old one.
- After the new delegator becomes workable, de-register the old delegator.
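
A rough Go sketch of these three steps follows. It rests on one reading of the proposal, namely that load/release operations from the new delegator are routed through the old one so that both views change together. All names (`worker`, `delegator`, `relay`, `dispatch`, `handoff`) are hypothetical and are not taken from the Milvus code base.

```go
package main

import "fmt"

// worker holds the segments that are actually loaded (hypothetical model).
type worker struct{ loaded map[int64]bool }

// delegator is a hypothetical shard delegator with its own segment view.
type delegator struct {
	name  string
	view  map[int64]bool
	w     *worker
	relay *delegator // old delegator acting as a sub node; nil outside a balance
}

// apply performs a load or release on the worker and updates the local view.
func (d *delegator) apply(load bool, id int64) {
	if load {
		d.view[id], d.w.loaded[id] = true, true
		return
	}
	delete(d.view, id)
	delete(d.w.loaded, id)
}

// dispatch handles load/release. During a balance the new delegator routes
// the operation through the old one, so both views and the worker agree.
func (d *delegator) dispatch(load bool, id int64) {
	if d.relay == nil {
		d.apply(load, id)
		return
	}
	d.relay.apply(load, id)
	if load {
		d.view[id] = true
	} else {
		delete(d.view, id)
	}
}

// search fails if the view references a segment the worker no longer holds.
func (d *delegator) search() error {
	for id := range d.view {
		if !d.w.loaded[id] {
			return fmt.Errorf("%s: segment %d not loaded", d.name, id)
		}
	}
	return nil
}

// handoff runs the three steps and reports the delegators still registered
// afterwards plus any search errors seen in the middle of the balance.
func handoff() (registered []string, oldErr, newErr error) {
	w := &worker{loaded: map[int64]bool{1: true, 2: true}}
	oldD := &delegator{name: "old", view: map[int64]bool{1: true, 2: true}, w: w}
	registry := map[string]*delegator{"old": oldD}

	// Step 1: the new delegator comes up with the old one as its sub node.
	newD := &delegator{name: "new", view: map[int64]bool{1: true, 2: true}, w: w, relay: oldD}
	registry["new"] = newD

	// Step 2: a release dispatched by the new delegator goes through the old one.
	newD.dispatch(false, 2)
	oldErr, newErr = oldD.search(), newD.search()

	// Step 3: the new delegator is workable; detach and de-register the old one.
	newD.relay = nil
	delete(registry, "old")
	for name := range registry {
		registered = append(registered, name)
	}
	return registered, oldErr, newErr
}

func main() {
	registered, oldErr, newErr := handoff()
	fmt.Println(registered, oldErr, newErr)
}
```

Because the release reaches both views, a search through either delegator during the balance succeeds in this model, and only the new delegator remains registered at the end.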

Apr 26 '23 08:04 congqixia

/assign @weiliu1031 /unassign

Apr 26 '23 09:04 yanliang567

> We could let the system have two delegators at the same time:
>
> - Make the old delegator work as a sub node during the balance.
> - All load/release operations dispatched by the new delegator shall be forwarded by the old one.
> - After the new delegator becomes workable, de-register the old delegator.

It might be doable if the querynode accepts duplicated inserts. And the msgstream would have to change to shared mode, I guess.

Apr 26 '23 17:04 xiaofan-luan

We will ban channel balancing for the short term, until we can handle two shards being online for the same channel in a replica.
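
One way such a short-term ban could look is a switch that filters channel plans out of the balancer's output. The Go sketch below is purely illustrative: `balancePlan`, `filterPlans`, and the `enableChannelBalance` flag are hypothetical and do not name a confirmed Milvus type or configuration option.

```go
package main

import "fmt"

// balancePlan is a hypothetical unit of balancing work.
type balancePlan struct {
	kind   string // "segment" or "channel"
	target string
}

// filterPlans drops channel plans while channel balancing is banned.
// enableChannelBalance is a hypothetical switch, not a confirmed Milvus option.
func filterPlans(plans []balancePlan, enableChannelBalance bool) []balancePlan {
	kept := make([]balancePlan, 0, len(plans))
	for _, p := range plans {
		if p.kind == "channel" && !enableChannelBalance {
			continue
		}
		kept = append(kept, p)
	}
	return kept
}

func main() {
	plans := []balancePlan{{"segment", "seg-1"}, {"channel", "ch-0"}, {"segment", "seg-2"}}
	fmt.Println("banned: ", filterPlans(plans, false))
	fmt.Println("allowed:", filterPlans(plans, true))
}
```

Segment balancing keeps working in this sketch; only plans that would move a channel are skipped until the switch is turned back on.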

Apr 28 '23 06:04 weiliu1031

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

May 31 '23 20:05 stale[bot]

@weiliu1031 shall we keep this open?

Jun 09 '23 01:06 yanliang567

Channel balancing should be the priority for 2.3.

Jun 09 '23 07:06 xiaofan-luan

/reopen

Jun 09 '23 09:06 weiliu1031

@weiliu1031: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Jun 09 '23 09:06 sre-ci-robot

> Channel balancing should be the priority for 2.3.

@yah01 is working on supporting two shards on the same channel existing at the same time. After that, we will re-enable channel balancing.

Jun 09 '23 09:06 weiliu1031

/assign working on it

Jun 12 '23 08:06 yah01

Fixed on master by #24849.

Jun 19 '23 03:06 yah01

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

Jul 19 '23 06:07 stale[bot]

/reopen

Sep 05 '23 01:09 jiaoew1991

@jiaoew1991: Reopened this issue.

In response to this:

/reopen

Sep 05 '23 01:09 sre-ci-robot

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

Oct 05 '23 01:10 stale[bot]

/reopen

Nov 15 '23 11:11 weiliu1031

@weiliu1031: Reopened this issue.

In response to this:

/reopen

Nov 15 '23 11:11 sre-ci-robot

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

Dec 15 '23 20:12 stale[bot]

/unassign

Dec 20 '23 06:12 yah01
