
[Bug]: The created collection cannot be loaded and cannot be written to.

Open waitwindy opened this issue 1 year ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version: 2.3.5
- Deployment mode (standalone or cluster): cluster
- MQ type (rocksmq, pulsar or kafka): kafka
- SDK version (e.g. pymilvus v2.0.0rc2): pymilvus 2.4
- OS (Ubuntu or CentOS): CentOS
- CPU/Memory:
- GPU:
- Others:

Current Behavior

When I created a collection with the Attu tool, I could not perform the load operation, and inserting the sample data failed with the error "deny to write the message to mq". Checking etcd shows that some previously deleted collections are still stored in the meta files.
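
One way to reproduce the etcd check mentioned above is sketched below, using the python `etcd3` client. The `by-dev` rootPath and the `meta/root-coord/` key prefix are assumptions based on Milvus defaults; substitute the `etcd.rootPath` from your milvus.yaml and your actual etcd endpoint.

```python
import etcd3

# Hypothetical endpoint; point this at the etcd used by the Milvus cluster.
client = etcd3.client(host="etcd-host", port=2379)

# List collection meta keys; collections dropped long ago should no
# longer appear under this prefix.
for _value, meta in client.get_prefix("by-dev/meta/root-coord/"):
    print(meta.key.decode())
```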

Expected Behavior

No response

Steps To Reproduce

1. Attu creates a new collection
2. Attu writes the sample data (a pymilvus equivalent of both steps is sketched below)
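
A hedged pymilvus equivalent of those two steps (host, collection name, and schema are hypothetical; the insert is where the "deny to write the message to mq" error surfaced):

```python
import random
from pymilvus import (
    connections, Collection, CollectionSchema, FieldSchema, DataType,
)

connections.connect(host="localhost", port="19530")

# Step 1: create a new collection (Attu did this through the UI).
schema = CollectionSchema([
    FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="vec", dtype=DataType.FLOAT_VECTOR, dim=8),
])
coll = Collection("repro_demo", schema)

# Step 2: write sample data; this is the call that failed with
# "deny to write the message to mq".
vectors = [[random.random() for _ in range(8)] for _ in range(10)]
coll.insert([vectors])
```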

Milvus Log

[2024/05/23 10:00:01.916 +00:00] [ERROR] [rootcoord/dml_channels.go:282] ["Broadcast failed"] [error="deny to write the message to mq"] [chanName=kfk-topic-2-rootcoord-dml_5] [stack="github.com/milvus-io/milvus/internal/rootcoord.(*dmlChannels).broadcast\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/dml_channels.go:282\ngithub.com/milvus-io/milvus/internal/rootcoord.(*timetickSync).broadcastDmlChannels\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/timeticksync.go:392\ngithub.com/milvus-io/milvus/internal/rootcoord.(*bgGarbageCollector).notifyCollectionGc\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/garbage_collector.go:190\ngithub.com/milvus-io/milvus/internal/rootcoord.(*bgGarbageCollector).GcCollectionData\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/garbage_collector.go:236\ngithub.com/milvus-io/milvus/internal/rootcoord.(*deleteCollectionDataStep).Execute\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/step.go:197\ngithub.com/milvus-io/milvus/internal/rootcoord.(*stepStack).Execute\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/step_executor.go:59\ngithub.com/milvus-io/milvus/internal/rootcoord.(*bgStepExecutor).process.func1\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/step_executor.go:201"]

[2024/05/23 10:02:00.675 +00:00] [WARN] [datacoord/index_service.go:264] ["there are multiple indexes, please specify the index_name"] [traceID=afdcb4804255e2e0fd1f9ee7b48c14bc] [collectionID=448145066652305196] [indexName=]

[2024/05/23 09:59:00.376 +00:00] [WARN] [kafka/kafka_consumer.go:138] ["consume msg failed"] [topic=kfk-topic-2-rootcoord-dml_6] [groupID=datanode-147-kfk-topic-2-rootcoord-dml_6_448145066652305348v0-true] [error="Local: Timed out"]

error.log

Anything else?

No response

waitwindy avatar May 24 '24 08:05 waitwindy

[2024/05/23 09:58:36.142 +00:00] [WARN] [timerecord/time_recorder.go:134] ["RootCoord haven't synchronized the time tick for 2.000000 minutes"]

This warning indicates that etcd or the message queue isn't working. Double-check the etcd state (logs) and the disk space.
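
A minimal sketch of those two checks with the python `etcd3` client; the endpoint and the etcd data directory are assumptions:

```python
import shutil
import etcd3

client = etcd3.client(host="etcd-host", port=2379)  # hypothetical endpoint

# etcd liveness and database size.
status = client.status()
print("etcd version:", status.version, "| db size (bytes):", status.db_size)

# Free space on the etcd data disk ("/var/lib/etcd" is an assumed path).
total, used, free = shutil.disk_usage("/var/lib/etcd")
print(f"disk free: {free / 1e9:.1f} GB")
```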

yhmo avatar May 24 '24 10:05 yhmo

/assign @waitwindy /unassign

yanliang567 avatar May 24 '24 10:05 yanliang567

But etcd is currently readable by the client.

waitwindy avatar May 24 '24 10:05 waitwindy

It seems to be a Kafka error, not an etcd issue.

xiaofan-luan avatar May 25 '24 14:05 xiaofan-luan

But there is another milvus cluster that works fine with this Kafka cluster.

waitwindy avatar May 26 '24 02:05 waitwindy

Are they using different topic names?

xiaofan-luan avatar May 26 '24 02:05 xiaofan-luan

From the error log, Milvus failed to consume a Kafka message. If two Milvus clusters share one Kafka, they need to use different topic prefixes.
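
One way to double-check that, sketched under the assumption that both deployments use the Milvus 2.3 milvus.yaml layout (`msgChannel.chanNamePrefix.cluster`); the file paths are hypothetical:

```python
import yaml

def chan_prefix(path):
    # Read the cluster channel-name prefix from a milvus.yaml file.
    with open(path) as f:
        return yaml.safe_load(f)["msgChannel"]["chanNamePrefix"]["cluster"]

a = chan_prefix("/configs/milvus-a/milvus.yaml")
b = chan_prefix("/configs/milvus-b/milvus.yaml")
print("prefixes differ:" if a != b else "CONFLICT, same prefix:", a, b)
```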

xiaofan-luan avatar May 26 '24 02:05 xiaofan-luan

Yes, their prefixes are different (screenshot attached).

waitwindy avatar May 27 '24 05:05 waitwindy

Could this be a config issue? You have to figure out why Kafka times out.
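
A minimal produce/consume round trip against the same Kafka cluster can show whether the "Local: Timed out" error reproduces outside Milvus; this sketch uses kafka-python, and the broker address and probe topic are hypothetical:

```python
from kafka import KafkaProducer, KafkaConsumer

BROKER = "kafka-host:9092"  # hypothetical broker address

# Produce one message and block until it is acknowledged.
producer = KafkaProducer(bootstrap_servers=BROKER)
producer.send("milvus-connectivity-probe", b"ping").get(timeout=10)

# Consume it back; a timeout here points at broker/network config.
consumer = KafkaConsumer(
    "milvus-connectivity-probe",
    bootstrap_servers=BROKER,
    auto_offset_reset="earliest",
    consumer_timeout_ms=10_000,
)
for msg in consumer:
    print("round trip ok:", msg.value)
    break
```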

xiaofan-luan avatar May 27 '24 05:05 xiaofan-luan

Maybe Kafka's consumer group can't be found. I found that the Kafka topic exists, but the consumer group can't be found with the existing tools.
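
A hedged sketch of that consumer-group check with kafka-python; the broker address is hypothetical and the group id is copied from the DataNode warning log above:

```python
from kafka import KafkaAdminClient

admin = KafkaAdminClient(bootstrap_servers="kafka-host:9092")

# list_consumer_groups() returns (group_id, protocol_type) tuples.
groups = {group for group, _protocol in admin.list_consumer_groups()}

target = "datanode-147-kfk-topic-2-rootcoord-dml_6_448145066652305348v0-true"
print("found:" if target in groups else "missing:", target)
```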

waitwindy avatar May 27 '24 10:05 waitwindy

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale[bot] avatar Jun 26 '24 17:06 stale[bot]