[Bug]: Should ban auto balance channel
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version:
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
For now, during channel balance, two shard leaders may come online for the same channel. We can't prevent a release from happening on one shard leader while a search runs on the other, so the search exits with an error.
We should ban auto balance of channels until we can handle events that may happen across two shard leaders.
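A minimal sketch of the race described above, in illustrative Go only (the types and channel name are assumptions, not Milvus's actual code): during balance, two leaders are registered for one channel, a search is routed to one of them, and a concurrent release of that leader makes the in-flight search fail.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type leader struct {
	node     int64
	released bool
}

type shardTable struct {
	mu      sync.Mutex
	leaders map[string][]*leader // channel -> leaders (two during balance)
}

// pick routes a search to any leader that is not yet released.
func (t *shardTable) pick(channel string) (*leader, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for _, l := range t.leaders[channel] {
		if !l.released {
			return l, nil
		}
	}
	return nil, errors.New("no serviceable shard leader")
}

// release marks a leader as released; nothing stops this from racing with a
// search that was already routed to the same leader.
func (t *shardTable) release(channel string, node int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for _, l := range t.leaders[channel] {
		if l.node == node {
			l.released = true
		}
	}
}

func main() {
	t := &shardTable{leaders: map[string][]*leader{
		"by-dev-dml_0": {{node: 1}, {node: 2}}, // two leaders during balance
	}}
	l, _ := t.pick("by-dev-dml_0")     // search routed to node 1
	t.release("by-dev-dml_0", l.node)  // balance releases node 1 concurrently
	fmt.Println("search was routed to node", l.node, "which was released mid-query")
}
```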
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response
Should we just release the old channel before we have the new one? That should solve the problem, right?
> Should we just release the old channel before we have the new one? That should solve the problem, right?

If we release before subscribing to the new channel, there will be a period with no available shard.
> Should we just release the old channel before we have the new one? That should solve the problem, right?
>
> If we release before subscribing to the new channel, there will be a period with no available shard.

Yep, but there has to be only one leader.
We could have the system run two delegators at the same time (see the sketch after this list):
- Make the old delegator work as a sub node during the balance
- All load/release operations dispatched by the new delegator shall also be forwarded by the old one
- After the new delegator becomes workable, de-register the old delegator
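A hedged sketch of one possible reading of this proposal, in illustrative Go (the `delegator` type and method names are assumptions, not the actual Milvus implementation): the old delegator acts as a sub node, load/release issued through the new delegator are forwarded to it, and it is de-registered once the new delegator is workable.

```go
package main

import "fmt"

type delegator struct {
	name string
	sub  *delegator // old delegator acting as sub node during balance
}

func (d *delegator) loadSegment(id int64) {
	fmt.Printf("%s loads segment %d\n", d.name, id)
	if d.sub != nil {
		d.sub.loadSegment(id) // forward so both delegators stay consistent
	}
}

func (d *delegator) releaseSegment(id int64) {
	fmt.Printf("%s releases segment %d\n", d.name, id)
	if d.sub != nil {
		d.sub.releaseSegment(id)
	}
}

func main() {
	oldDelegator := &delegator{name: "old"}
	newDelegator := &delegator{name: "new", sub: oldDelegator}

	newDelegator.loadSegment(42) // applied on both during the transition
	newDelegator.releaseSegment(42)

	newDelegator.sub = nil // new delegator is workable: de-register the old one
}
```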
/assign @weiliu1031 /unassign
> We could have the system run two delegators at the same time:
> - Make the old delegator work as a sub node during the balance
> - All load/release operations dispatched by the new delegator shall also be forwarded by the old one
> - After the new delegator becomes workable, de-register the old delegator
It might be doable, if querynode accepts duplicated inserts. And the msgstream would have to change to shared mode, I guess.
We will ban channel balance in the short term, until we can handle two shard leaders being online for the same channel in a replica.
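A short sketch of what such a short-term ban could look like, assuming a hypothetical balancer with an `enableChannelBalance` switch (the names are illustrative, not Milvus's actual config or types): segment plans keep flowing, channel plans are dropped.

```go
package main

import "fmt"

type segmentPlan struct{ segmentID, from, to int64 }

type channelPlan struct {
	channel  string
	from, to int64
}

type balancer struct {
	enableChannelBalance bool // kept false while the ban is in effect
}

// balance still produces segment plans, but suppresses channel plans
// until two leaders per channel can be handled safely.
func (b *balancer) balance() ([]segmentPlan, []channelPlan) {
	segPlans := []segmentPlan{{segmentID: 7, from: 1, to: 2}}
	if !b.enableChannelBalance {
		return segPlans, nil // channel balance is banned
	}
	return segPlans, []channelPlan{{channel: "by-dev-dml_0", from: 1, to: 2}}
}

func main() {
	b := &balancer{enableChannelBalance: false}
	seg, ch := b.balance()
	fmt.Printf("segment plans: %d, channel plans: %d\n", len(seg), len(ch))
}
```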
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.
@weiliu1031 shall we keep this open?
Channel balance should be a priority for 2.3.
/reopen
@weiliu1031: Reopened this issue.
In response to this:

> /reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
> Channel balance should be a priority for 2.3.

@yah01 is working on supporting two shard leaders for the same channel existing at the same time. After that, we will re-enable channel balance.
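To contrast with the failing case above, here is a hedged sketch of how search could tolerate two coexisting leaders (illustrative Go only; the `query`/`search` helpers are assumptions, not the actual fix): if one leader has already been released, the search retries on the other instead of erroring out.

```go
package main

import (
	"errors"
	"fmt"
)

type shardLeader struct {
	node     int64
	released bool
}

// query fails if the targeted leader has already been released.
func query(l *shardLeader) error {
	if l.released {
		return errors.New("shard leader released")
	}
	return nil
}

// search tries every leader registered for the channel and succeeds as long
// as at least one of them is still serviceable.
func search(leaders []*shardLeader) (int64, error) {
	var lastErr error
	for _, l := range leaders {
		if err := query(l); err != nil {
			lastErr = err
			continue // retry on the other leader instead of failing the search
		}
		return l.node, nil
	}
	return 0, lastErr
}

func main() {
	// two leaders coexist for the same channel during balance; node 1 was released
	leaders := []*shardLeader{{node: 1, released: true}, {node: 2}}
	node, err := search(leaders)
	if err != nil {
		fmt.Println("search failed:", err)
		return
	}
	fmt.Println("search served by node", node) // node 2 still serves
}
```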
/assign working on it
Fixed on master by #24849.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.
/reopen
@jiaoew1991: Reopened this issue.
In response to this:

> /reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.
/reopen
@weiliu1031: Reopened this issue.
In response to this:

> /reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.
/unassign