[Bug]: Create collection failed reporting "dmlChanName by-dev-rootcoord-dml_2 and deltaChanName by-dev-rootcoord-delta_1 mis-match"

Open binbinlv opened this issue 2 years ago • 31 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version: master-20220225-adca79fa
- Deployment mode(standalone or cluster): cluster 
- SDK version(e.g. pymilvus v2.0.0rc2):pymilvus==2.0.1.dev3
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Creating a collection fails with the error "dmlChanName by-dev-rootcoord-dml_2 and deltaChanName by-dev-rootcoord-delta_1 mis-match".

Expected Behavior

The collection is created successfully.

Steps To Reproduce

Running weekly nightly:
https://ci.milvus.io:18080/jenkins/blue/organizations/jenkins/weekly-milvus-nightly-ci%2Fweekly-milvus-ci-test/detail/weekly-milvus-ci-test/57/pipeline

Anything else?

1 log: milvus-distributed-57-pymilvus-e2e-logs.tar.gz

  1. failed timeline: [2022-02-27T02:25:40.279Z] [gw0] [ 0%] FAILED testcases/test_alias.py::TestAliasParamsInvalid::test_alias_create_alias_with_invalid_name[12-s] [2022-02-27T02:25:40.279Z] [gw3] [ 0%] FAILED testcases/test_alias.py::TestAliasParamsInvalid::test_alias_create_alias_with_invalid_name[\u4e2d\u6587]

  2. test error trace:

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - INFO - ci_test]: ################################################################################ (conftest.py:166)

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - INFO - ci_test]: [initialize_milvus] Log cleaned up, start testing... (conftest.py:167)

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - INFO - ci_test]: [setup_class] Start setup class... (client_base.py:45)

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - INFO - ci_test]: *********************************** setup *********************************** (client_base.py:51)

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - INFO - ci_test]: [setup_method] Start setup test case test_alias_create_alias_with_invalid_name. (client_base.py:52)

[2022-02-27T03:38:44.121Z] ------------------------------ Captured log call -------------------------------

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - DEBUG - ci_test]: (api_request)  : [Connections.connect] args: ['default'], kwargs: {'host': 'md-weekly-57-n-milvus.milvus-ci', 'port': '19530'} (api_request.py:55)

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - DEBUG - ci_test]: (api_response) : None  (api_request.py:27)

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - DEBUG - ci_test]: (api_request)  : [Connections.has_connection] args: ['default'], kwargs: {} (api_request.py:55)

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - DEBUG - ci_test]: (api_response) : True  (api_request.py:27)

[2022-02-27T03:38:44.121Z] [2022-02-27 02:25:35 - DEBUG - ci_test]: (api_request)  : [Collection] args: ['collection_9LOQlelU', {

[2022-02-27T03:38:44.121Z]   auto_id: False

[2022-02-27T03:38:44.121Z]   description: 

[2022-02-27T03:38:44.121Z]   fields: [{

[2022-02-27T03:38:44.121Z]     name: int64

[2022-02-27T03:38:44.121Z]     description: 

[2022-02-27T03:38:44.121Z]     type: 5

[2022-02-27T03:38:44.121Z]     is_primary: True

[2022-02-27T03:38:44.121Z]     auto_id: False

[2022-02-27T03:38:44.121Z]   }, {

[2022-02-27T03:38:44.121Z]     name: float

[2022-02-27T03:38:44.121Z]     description: 

[2022-02-27T03:38:44.121Z]     type: 10

[2022-02-27T03:38:44.121Z]   }, {

[2022-02-27T03:38:44.121Z]     name: float_vector

[2022-02-27T03:38:44.121Z]     description: 

[2022-02-27T03:38:44.121Z]     type: 101

[2022-02-27T03:38:44.122Z]     params: {'dim': 128}

[2022-02-27T03:38:44.122Z]  ......, kwargs: {'consistency_level': 'Strong'} (api_request.py:55)

[2022-02-27T03:38:44.122Z] [2022-02-27 02:25:35 - ERROR - pymilvus.client.grpc_handler]: error_code: UnexpectedError

[2022-02-27T03:38:44.122Z] reason: "CreateCollection failed: dmlChanName by-dev-rootcoord-dml_1 and deltaChanName by-dev-rootcoord-delta_2 mis-match"

[2022-02-27T03:38:44.122Z]  (grpc_handler.py:131)

[2022-02-27T03:38:44.122Z] [2022-02-27 02:25:35 - ERROR - pymilvus.decorators]: RPC error: [create_collection], <BaseException: (code=1, message=CreateCollection failed: dmlChanName by-dev-rootcoord-dml_1 and deltaChanName by-dev-rootcoord-delta_2 mis-match)>, <Time:{'RPC start': '2022-02-27 02:25:35.728042', 'RPC error': '2022-02-27 02:25:35.735496'}> (decorators.py:73)

[2022-02-27T03:38:44.122Z] [2022-02-27 02:25:35 - ERROR - ci_test]: Traceback (most recent call last):

[2022-02-27T03:38:44.122Z]   File "/home/jenkins/agent/workspace/tests/python_client/utils/api_request.py", line 22, in inner_wrapper

[2022-02-27T03:38:44.122Z]     res = func(*args, **kwargs)

[2022-02-27T03:38:44.122Z]   File "/home/jenkins/agent/workspace/tests/python_client/utils/api_request.py", line 56, in api_request

[2022-02-27T03:38:44.122Z]     return func(*arg, **kwargs)

[2022-02-27T03:38:44.122Z]   File "/usr/local/lib/python3.6/site-packages/pymilvus/orm/collection.py", line 145, in __init__

[2022-02-27T03:38:44.122Z]     consistency_level=consistency_level)

[2022-02-27T03:38:44.122Z]   File "/usr/local/lib/python3.6/site-packages/pymilvus/decorators.py", line 56, in handler

[2022-02-27T03:38:44.122Z]     raise e

[2022-02-27T03:38:44.122Z]   File "/usr/local/lib/python3.6/site-packages/pymilvus/decorators.py", line 41, in handler

[2022-02-27T03:38:44.122Z]     return func(self, *args, **kwargs)

[2022-02-27T03:38:44.122Z]   File "/usr/local/lib/python3.6/site-packages/pymilvus/decorators.py", line 74, in handler

[2022-02-27T03:38:44.122Z]     raise e

[2022-02-27T03:38:44.122Z]   File "/usr/local/lib/python3.6/site-packages/pymilvus/decorators.py", line 70, in handler

[2022-02-27T03:38:44.122Z]     return func(*args, **kwargs)

[2022-02-27T03:38:44.122Z]   File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 132, in create_collection

[2022-02-27T03:38:44.122Z]     raise BaseException(status.error_code, status.reason)

[2022-02-27T03:38:44.122Z] pymilvus.client.exceptions.BaseException: <BaseException: (code=1, message=CreateCollection failed: dmlChanName by-dev-rootcoord-dml_1 and deltaChanName by-dev-rootcoord-delta_2 mis-match)>

[2022-02-27T03:38:44.122Z]  (api_request.py:35)

[2022-02-27T03:38:44.122Z] [2022-02-27 02:25:35 - ERROR - ci_test]: (api_response) : <BaseException: (code=1, message=CreateCollection failed: dmlChanName by-dev-rootcoord-dml_1 and deltaChanName by-dev-rootcoord-delta_2 mis-match)> (api_request.py:36)

binbinlv avatar Feb 28 '22 09:02 binbinlv

@LoveEachDay could you help check whether the error occurred before Milvus had finished starting?

Thanks.

binbinlv avatar Feb 28 '22 09:02 binbinlv

Milvus had been running normally since 2022-02-27T02:23:56.

LoveEachDay avatar Feb 28 '22 10:02 LoveEachDay

@czs007

could you help to find someone to check this issue?

Thanks.

binbinlv avatar Feb 28 '22 10:02 binbinlv

/unassign

LoveEachDay avatar Feb 28 '22 10:02 LoveEachDay

@soothing-rain could you please help with this issue? It has reproduced recently in the CI nightly.

/assign @soothing-rain

/unassign @czs007

yanliang567 avatar Feb 28 '22 11:02 yanliang567

Reproduced again in the following nightly:

1 log: milvus-distributed-89-pymilvus-e2e-logs.tar.gz

2 Test case failed time: at the beginning of the tests [2022-03-12T06:26:45.816Z] [gw0] [ 0%] FAILED testcases/test_alias.py::TestAliasParamsInvalid::test_alias_create_alias_with_invalid_name[12-s] [2022-03-12T06:26:46.102Z] [gw3] [ 0%] FAILED testcases/test_alias.py::TestAliasParamsInvalid::test_alias_create_alias_with_invalid_name[\u4e2d\u6587]

binbinlv avatar Mar 14 '22 04:03 binbinlv

@soothing-rain

Any progress on this issue? Thanks.

binbinlv avatar Mar 14 '22 04:03 binbinlv

Not yet, as we were told it only happened once last time :)

Will take a look now that it has reproduced.

soothing-rain avatar Mar 14 '22 06:03 soothing-rain

Same issue appears again: https://ci.milvus.io:18080/jenkins/blue/rest/organizations/jenkins/pipelines/milvus-nightly-ci/branches/master/runs/468/nodes/63/steps/211/log/?start=0

milvus logs:

artifacts-milvus-distributed-pulsar-master-468-pymilvus-e2e-logs.tar.gz

binbinlv avatar Apr 18 '22 03:04 binbinlv

/assign @wayblink

soothing-rain avatar Apr 18 '22 03:04 soothing-rain

@soothing-rain: GitHub didn't allow me to assign the following users: wayblink.

Note that only milvus-io members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide

In response to this:

/assign @wayblink

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sre-ci-robot avatar Apr 18 '22 03:04 sre-ci-robot

/assign @wayblink

yanliang567 avatar Apr 18 '22 05:04 yanliang567

@yanliang567: GitHub didn't allow me to assign the following users: wayblink.

Note that only milvus-io members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide

In response to this:

/assign @wayblink

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sre-ci-robot avatar Apr 18 '22 05:04 sre-ci-robot

Same issue occurs in the following nightly: https://ci.milvus.io:18080/jenkins/blue/rest/organizations/jenkins/pipelines/milvus-nightly-ci/branches/master/runs/483/nodes/63/steps/211/log/?start=0

  1. milvus logs: artifacts-milvus-distributed-pulsar-master-483-pymilvus-e2e-logs.tar.gz

  2. collection name: search_collection_ouB7uvlt

  3. Failed timeline: [2022-04-21T19:16:14.996Z] [gw2] [ 52%] FAILED testcases/test_search_20.py::TestCollectionSearch::test_search_after_different_index_with_params[128-False-False-IVF_FLAT-params1]

  4. test logs:

[2022-04-21T20:13:36.402Z] [2022-04-21 19:16:13 - INFO - ci_test]: init_collection_general: collection creation (client_base.py:162)
[2022-04-21T20:13:36.402Z] [2022-04-21 19:16:13 - DEBUG - ci_test]: (api_request)  : [Connections.has_connection] args: ['default'], kwargs: {} (api_request.py:55)
[2022-04-21T20:13:36.402Z] [2022-04-21 19:16:13 - DEBUG - ci_test]: (api_response) : True  (api_request.py:27)
[2022-04-21T20:13:36.402Z] [2022-04-21 19:16:13 - DEBUG - ci_test]: (api_request)  : [Collection] args: ['search_collection_ouB7uvlt', {
[2022-04-21T20:13:36.402Z]   auto_id: False
[2022-04-21T20:13:36.402Z]   description: 
[2022-04-21T20:13:36.402Z]   fields: [{
[2022-04-21T20:13:36.402Z]     name: int64
[2022-04-21T20:13:36.402Z]     description: 
[2022-04-21T20:13:36.402Z]     type: 5
[2022-04-21T20:13:36.402Z]     is_primary: True
[2022-04-21T20:13:36.402Z]     auto_id: False
[2022-04-21T20:13:36.402Z]   }, {
[2022-04-21T20:13:36.402Z]     name: float
[2022-04-21T20:13:36.402Z]     description: 
[2022-04-21T20:13:36.402Z]     type: 10
[2022-04-21T20:13:36.402Z]   }, {
[2022-04-21T20:13:36.402Z]     name: float_vector
[2022-04-21T20:13:36.402Z]     description: 
[2022-04-21T20:13:36.402Z]     type: 101
[2022-04-21T20:13:36.403Z]     params: {'dim':......, kwargs: {'consistency_level': 'Strong'} (api_request.py:55)
[2022-04-21T20:13:36.403Z] [2022-04-21 19:16:13 - ERROR - pymilvus.client.grpc_handler]: error_code: UnexpectedError
[2022-04-21T20:13:36.403Z] reason: "CreateCollection failed: dmlChanName by-dev-rootcoord-dml_190 and deltaChanName by-dev-rootcoord-delta_191 mis-match"
[2022-04-21T20:13:36.403Z]  (grpc_handler.py:167)
[2022-04-21T20:13:36.403Z] [2022-04-21 19:16:13 - ERROR - pymilvus.decorators]: RPC error: [create_collection], <MilvusException: (code=1, message=CreateCollection failed: dmlChanName by-dev-rootcoord-dml_190 and deltaChanName by-dev-rootcoord-delta_191 mis-match)>, <Time:{'RPC start': '2022-04-21 19:16:13.675529', 'RPC error': '2022-04-21 

binbinlv avatar Apr 22 '22 03:04 binbinlv

/assign @wayblink

Another non-P0 issue.

soothing-rain avatar May 20 '22 07:05 soothing-rain

It appears again in the following nightly: https://ci.milvus.io:18080/jenkins/blue/rest/organizations/jenkins/pipelines/milvus-nightly-ci/branches/master/runs/561/nodes/64/steps/178/log/?start=0

  1. milvus logs: artifacts-milvus-distributed-kafka-master-561-pymilvus-e2e-logs.tar.gz

  2. collection name: collection_count_j0RqmfpU

  3. errors: [2022-05-30T16:29:32.910Z] [2022-05-30 15:04:50 - ERROR - pymilvus.client.grpc_handler]: error_code: UnexpectedError [2022-05-30T16:29:32.910Z] reason: "CreateCollection failed: dmlChanName by-dev-rootcoord-dml_42 and deltaChanName by-dev-rootcoord-delta_41 mis-match" [2022-05-30T16:29:32.910Z] (grpc_handler.py:187) [2022-05-30T16:29:32.910Z] [2022-05-30 15:04:50 - ERROR - pymilvus.decorators]: RPC error: [create_collection], <MilvusException: (code=1, message=CreateCollection failed: dmlChanName by-dev-rootcoord-dml_42 and deltaChanName by-dev-rootcoord-delta_41 mis-match)>, <Time:{'RPC start': '2022-05-30 15:04:50.848032', 'RPC error': '2022-05-30 15:04:50.849877'}>

  4. Failed timeline: [2022-05-30T15:04:51.847Z] [gw4] [ 9%] FAILED testcases/test_collection.py::TestCollectionMultiCollections::test_collection_count_multi_collections_l2[2001] [2022-05-30T15:04:52.414Z] [gw3] [ 9%] FAILED testcases/test_collection.py::TestCollectionMultiCollections::test_collection_count_multi_collections_l2[1]

binbinlv avatar May 31 '22 02:05 binbinlv

@wayblink Could you please investigate? Thanks.

binbinlv avatar May 31 '22 03:05 binbinlv

OK, I will look into it

wayblink avatar May 31 '22 04:05 wayblink

This seems to have reproduced in https://ci.milvus.io:18080/jenkins/blue/organizations/jenkins/milvus-ha-ci/detail/PR-17341/4/pipeline

bigsheeper avatar Jun 02 '22 10:06 bigsheeper

This is a channel-prefix compatibility issue that has been fixed.

xiaofan-luan avatar Jun 20 '22 03:06 xiaofan-luan

This issue has been reproduced on both the master and 2.1.0 branches.

master: https://ci.milvus.io:18080/jenkins/blue/organizations/jenkins/milvus-nightly-ci/detail/master/667/pipeline/190

  1. milvus log: artifacts-milvus-distributed-pulsar-master-667-pymilvus-e2e-logs.tar.gz
  2. collection name: search_collection_FumeaLdb
  3. Failed time: [2022-07-21T15:25:03.223Z] [gw1] [ 68%] FAILED testcases/test_search.py::TestCollectionSearch::test_search_with_expression_auto_id[8-True-500 <= float <= 1000]

2.1.0 branch:

  1. milvus log: milvus-distributed-kafka-167-pymilvus-e2e-logs.tar.gz
  2. collection name: collection_XkdnMoEf
  3. Failed time: [2022-07-21T20:44:03.435Z] [gw3] [ 0%] FAILED testcases/test_alias.py::TestAliasParamsInvalid::test_alias_create_alias_with_invalid_name[%$#]

binbinlv avatar Jul 22 '22 02:07 binbinlv

Reopening this issue.

binbinlv avatar Jul 22 '22 02:07 binbinlv

Since it reproduces only intermittently on both the master and 2.1.0 branches, it is not marked as urgent.

binbinlv avatar Jul 22 '22 02:07 binbinlv

Reproduced in the following 2.1.0 branch nightly: https://ci.milvus.io:18080/jenkins/blue/organizations/jenkins/milvus-release-nightly/detail/milvus-release-nightly/172/pipeline/155

  1. log: milvus-distributed-kafka-172-pymilvus-e2e-logs.tar.gz
  2. collection name: collection_zMbIB0SE
  3. Failed time: [2022-07-23T02:09:59.487Z] [gw4] [ 6%] FAILED testcases/test_collection.py::TestCollectionParams::test_collection_shards_num_with_not_default_value[256]

binbinlv avatar Jul 25 '22 02:07 binbinlv

Same issue in the following nightly: https://ci.milvus.io:18080/jenkins/blue/organizations/jenkins/milvus-nightly-ci/detail/master/675/pipeline/190

  1. milvus log: artifacts-milvus-distributed-pulsar-master-675-pymilvus-e2e-logs.tar.gz
  2. collection name: collection_0WV7fmlw
  3. Failed time: [2022-07-25T15:04:33.391Z] [gw2] [ 0%] FAILED testcases/test_alias.py::TestAliasParamsInvalid::test_alias_create_alias_with_invalid_name[(mn)]

binbinlv avatar Jul 26 '22 02:07 binbinlv

I'll take a look

wayblink avatar Jul 26 '22 09:07 wayblink

Same issue in the following nightly: https://jenkins.milvus.io:18080/blue/organizations/jenkins/Milvus%20Nightly%20CI/detail/master/11/pipeline

  1. milvus logs: artifacts-milvus-distributed-kafka-nightly-11-pymilvus-e2e-logs.tar.gz
  2. collection name: test_GQu207FX (just one example for "test_release_collection_not_existed")
  3. Failed time: [2022-08-02T15:05:17.249Z] [gw0] [ 9%] FAILED testcases/test_collection.py::TestLoadCollection::test_load_collection_after_index [2022-08-02T15:05:17.249Z] testcases/test_collection.py::TestLoadCollection::test_load_collection_after_index_binary [2022-08-02T15:05:17.505Z] testcases/test_collection.py::TestLoadCollection::test_release_collection_not_existed

binbinlv avatar Aug 03 '22 02:08 binbinlv

2022-08-02T23:05:12.646719703+08:00 stdout F [2022/08/02 15:05:12.646 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_182_435015571895222273v0] [phsicalChanName=by-dev-rootcoord-dml_182] [deltaChanName=by-dev-rootcoord-delta_200] [converted_deltaChanName=by-dev-rootcoord-delta_182] [err=]
2022-08-02T23:05:16.152510739+08:00 stdout F [2022/08/02 15:05:16.152 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_183_435015572812726273v0] [phsicalChanName=by-dev-rootcoord-dml_183] [deltaChanName=by-dev-rootcoord-delta_182] [converted_deltaChanName=by-dev-rootcoord-delta_183] [err=]
2022-08-02T23:05:16.16297721+08:00 stdout F [2022/08/02 15:05:16.162 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_186_435015572825571329v0] [phsicalChanName=by-dev-rootcoord-dml_186] [deltaChanName=by-dev-rootcoord-delta_183] [converted_deltaChanName=by-dev-rootcoord-delta_186] [err=]
2022-08-02T23:05:16.623740304+08:00 stdout F [2022/08/02 15:05:16.623 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_187_435015572943536129v0] [phsicalChanName=by-dev-rootcoord-dml_187] [deltaChanName=by-dev-rootcoord-delta_186] [converted_deltaChanName=by-dev-rootcoord-delta_187] [err=]
2022-08-02T23:05:17.038012466+08:00 stdout F [2022/08/02 15:05:17.037 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_190_435015573048655873v0] [phsicalChanName=by-dev-rootcoord-dml_190] [deltaChanName=by-dev-rootcoord-delta_187] [converted_deltaChanName=by-dev-rootcoord-delta_190] [err=]
2022-08-02T23:05:17.384730188+08:00 stdout F [2022/08/02 15:05:17.384 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_191_435015573140668417v0] [phsicalChanName=by-dev-rootcoord-dml_191] [deltaChanName=by-dev-rootcoord-delta_190] [converted_deltaChanName=by-dev-rootcoord-delta_191] [err=]
2022-08-02T23:05:17.385215049+08:00 stdout F [2022/08/02 15:05:17.385 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_194_435015573140668419v0] [phsicalChanName=by-dev-rootcoord-dml_194] [deltaChanName=by-dev-rootcoord-delta_191] [converted_deltaChanName=by-dev-rootcoord-delta_194] [err=]
2022-08-02T23:05:17.492200633+08:00 stdout F [2022/08/02 15:05:17.492 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_195_435015573166620697v0] [phsicalChanName=by-dev-rootcoord-dml_195] [deltaChanName=by-dev-rootcoord-delta_194] [converted_deltaChanName=by-dev-rootcoord-delta_195] [err=]
2022-08-02T23:05:17.701995468+08:00 stdout F [2022/08/02 15:05:17.701 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_198_435015573219049479v0] [phsicalChanName=by-dev-rootcoord-dml_198] [deltaChanName=by-dev-rootcoord-delta_195] [converted_deltaChanName=by-dev-rootcoord-delta_198] [err=]
2022-08-02T23:05:18.154396322+08:00 stdout F [2022/08/02 15:05:18.154 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_199_435015573337014273v0] [phsicalChanName=by-dev-rootcoord-dml_199] [deltaChanName=by-dev-rootcoord-delta_198] [converted_deltaChanName=by-dev-rootcoord-delta_199] [err=]
2022-08-02T23:05:18.161079874+08:00 stdout F [2022/08/02 15:05:18.161 +00:00] [DEBUG] [rootcoord/task.go:173] ["dmlChanName deltaChanName mismatch detail"] [i=0] [vchanName=by-dev-rootcoord-dml_200_435015573337014275v0] [phsicalChanName=by-dev-rootcoord-dml_200] [deltaChanName=by-dev-rootcoord-delta_199] [converted_deltaChanName=by-dev-rootcoord-delta_200] [err=]
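
To compare the allocated and expected delta channel in each of these entries, the `[key=value]` fields can be pulled out with a small script. This is only a sketch: the regular expression and the `parse_mismatch` helper are assumptions made for illustration, not part of Milvus, and the sample line is the first entry from the log above.

```python
import re

# Matches bracketed key=value fields such as [deltaChanName=...] or [err=].
# Hypothetical helper pattern; it skips fields without "=" like [DEBUG].
FIELD = re.compile(r"\[(\w+)=([^\]]*)\]")


def parse_mismatch(line):
    """Return the key=value fields of one rootcoord debug line as a dict."""
    return dict(FIELD.findall(line))


sample = (
    '[2022/08/02 15:05:12.646 +00:00] [DEBUG] [rootcoord/task.go:173] '
    '["dmlChanName deltaChanName mismatch detail"] [i=0] '
    '[vchanName=by-dev-rootcoord-dml_182_435015571895222273v0] '
    '[phsicalChanName=by-dev-rootcoord-dml_182] '
    '[deltaChanName=by-dev-rootcoord-delta_200] '
    '[converted_deltaChanName=by-dev-rootcoord-delta_182] [err=]'
)

fields = parse_mismatch(sample)
print("allocated:", fields["deltaChanName"])
print("expected: ", fields["converted_deltaChanName"])
```

In every entry above, `deltaChanName` differs from `converted_deltaChanName`, which is the name derived from the physical DML channel.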

wayblink avatar Aug 08 '22 03:08 wayblink

https://github.com/milvus-io/milvus/blob/master/internal/rootcoord/task.go#L162~L181

I think this issue is caused by concurrent create_collection calls. The internal logic can't guarantee that a dmlChannel and a deltaChannel are allocated with the same ID under concurrency. For example, suppose task1 and task2 are two create_collection tasks. A mismatch happens if they execute in the following order:

  1. task1 getDmlChannelName <- dml_1
  2. task2 getDmlChannelName <- dml_2
  3. task2 getDeltaChannelName <- delta_1
  4. task1 getDeltaChannelName <- delta_2
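
A deterministic way to see this is to replay one such interleaving by hand. The sketch below is a simplified model, not the rootcoord code: `ChannelNames` and `channel_id` are hypothetical names, and the two counters stand in for DML and delta names being handed out independently of each other.

```python
from itertools import count


class ChannelNames:
    """Hypothetical stand-in: two counters that advance independently."""

    def __init__(self, prefix="by-dev-rootcoord"):
        self.prefix = prefix
        self._dml_ids = count(1)
        self._delta_ids = count(1)

    def get_dml_channel_name(self):
        return f"{self.prefix}-dml_{next(self._dml_ids)}"

    def get_delta_channel_name(self):
        return f"{self.prefix}-delta_{next(self._delta_ids)}"


def channel_id(name):
    """Return the numeric suffix after the last underscore."""
    return int(name.rsplit("_", 1)[1])


names = ChannelNames()

# Replay an interleaving of two create_collection tasks step by step.
task1_dml = names.get_dml_channel_name()      # task1 gets dml_1
task2_dml = names.get_dml_channel_name()      # task2 gets dml_2
task2_delta = names.get_delta_channel_name()  # task2 gets delta_1
task1_delta = names.get_delta_channel_name()  # task1 gets delta_2

for task, dml, delta in (("task1", task1_dml, task1_delta),
                         ("task2", task2_dml, task2_delta)):
    verdict = "ok" if channel_id(dml) == channel_id(delta) else "mis-match"
    print(f"{task}: {dml} / {delta} -> {verdict}")
```

In this model task1 ends up with dml_1 and delta_2, the pair reported in the test error trace, and task2 ends up with dml_2 and delta_1, the pair in the issue title.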

wayblink avatar Aug 08 '22 03:08 wayblink

How about we:

  1. get the dmlChannel from timetickSync,
  2. use the deltaChannel with the corresponding name, and
  3. tell timetickSync that the deltaChannel is in use?
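
As a rough sketch of that idea, assuming the delta name differs from the DML name only in the `dml`/`delta` token (as `converted_deltaChanName` in the logs suggests), the delta channel could be derived from the DML channel instead of being allocated separately. `to_delta_channel_name` and `PairedChannelAllocator` are hypothetical names for illustration and do not reflect the actual timetickSync API.

```python
from itertools import count


def to_delta_channel_name(dml_name, dml_token="dml", delta_token="delta"):
    """Derive the delta channel name that carries the same ID as dml_name."""
    head, sep, suffix = dml_name.rpartition(f"-{dml_token}_")
    if not sep:
        raise ValueError(f"not a DML channel name: {dml_name!r}")
    return f"{head}-{delta_token}_{suffix}"


class PairedChannelAllocator:
    """Hypothetical allocator: one counter, delta derived from the DML name."""

    def __init__(self, prefix="by-dev-rootcoord"):
        self.prefix = prefix
        self._ids = count(1)
        self.used_delta = set()

    def allocate(self):
        dml = f"{self.prefix}-dml_{next(self._ids)}"  # step 1: get the DML channel
        delta = to_delta_channel_name(dml)            # step 2: matching delta name
        self.used_delta.add(delta)                    # step 3: record it as used
        return dml, delta


allocator = PairedChannelAllocator()
pairs = [allocator.allocate() for _ in range(3)]
for dml, delta in pairs:
    print(dml, delta)
```

Because only the DML side is allocated, the order in which concurrent tasks run no longer affects whether the two IDs match. A real implementation would still need the allocation itself to be safe under concurrency.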

@congqixia Could you help to take a look at this issue? Thx~

wayblink avatar Aug 08 '22 04:08 wayblink