[Bug]: [laion1b-test] GC has not been triggered for a long time
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version: 2.3-20231229-7a192da8-amd64
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
- Ran concurrent insert, delete, flush, query, and search against a collection with 50 million entities. The number of segments in the collection increased sharply and then stopped growing. laion1b-test-read-write-new-8a
- After a period of index building and compaction, the number of segments in use went down, but the number of dropped segments has not decreased. I guess GC is blocked by something.
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response
@ThreadDao You can try building an image from the latest 2.3 branch; it should have been fixed there. Related PR: https://github.com/milvus-io/milvus/pull/29557
/assign @ThreadDao please help to verify /unassign
@SimFG I thought the problem with this issue was that something might be blocking the GC rather than the lack of concurrency in the GC?
@ThreadDao This is caused by too many dropped segments: cleanup is relatively slow, and the backlog blocks the GC.
@ThreadDao @chyezh From the server log, I found that it may be blocked at the recycle-unused-indexes step. loglink
The GC task blocks while scanning (listing) MinIO objects.
goroutine 7744 [chan receive]:
github.com/milvus-io/milvus/internal/storage.(*MinioObjectStorage).ListObjects(0xc0008dc280, {0x555e678, 0xc02db23720}, {0xc000418770, 0xd}, {0xc02d1a6be0, 0x10}, 0x1)
/go/src/github.com/milvus-io/milvus/internal/storage/minio_object_storage.go:190 +0x525
github.com/milvus-io/milvus/internal/storage.(*RemoteChunkManager).listObjects(0xc0008eb530, {0x555e678, 0xc02db23720}, {0xc000418770, 0xd}, {0xc02d1a6be0, 0x10}, 0x37?)
/go/src/github.com/milvus-io/milvus/internal/storage/remote_chunk_manager.go:393 +0xca
github.com/milvus-io/milvus/internal/storage.(*RemoteChunkManager).ListWithPrefix(0xc079463768?, {0x555e678?, 0xc02db23720?}, {0xc02d1a6be0?, 0xc02be38f30?}, 0x2c?)
/go/src/github.com/milvus-io/milvus/internal/storage/remote_chunk_manager.go:321 +0x3f
github.com/milvus-io/milvus/internal/datacoord.(*garbageCollector).scan(0xc059518fc0)
/go/src/github.com/milvus-io/milvus/internal/datacoord/garbage_collector.go:218 +0x39c
github.com/milvus-io/milvus/internal/datacoord.(*garbageCollector).work(0xc059518fc0)
/go/src/github.com/milvus-io/milvus/internal/datacoord/garbage_collector.go:153 +0xad9
created by github.com/milvus-io/milvus/internal/datacoord.(*garbageCollector).start.func1
/go/src/github.com/milvus-io/milvus/internal/datacoord/garbage_collector.go:96 +0x6d
Listing all objects of a huge bucket in MinIO takes too much time. We need to paginate the listing to avoid this.
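To illustrate the pagination idea, here is a minimal sketch. It is not the actual Milvus fix: `pageLister`, `memStore`, and `walkWithPrefix` are hypothetical names, and the in-memory store only stands in for an object store that returns one bounded page of keys per call. The caller handles each page as it arrives and can stop early, so no single call has to return the whole bucket.

```go
package main

import (
	"errors"
	"sort"
	"strings"
)

// pageLister is a hypothetical interface: one call returns at most `limit`
// keys that sort after `startAfter`, plus a flag saying whether more remain.
type pageLister interface {
	ListPage(prefix, startAfter string, limit int) (keys []string, more bool, err error)
}

// memStore is an in-memory stand-in for an object store (assumption for the
// sketch only). Keys are kept sorted, as object listings usually are.
type memStore struct{ keys []string }

func newMemStore(keys ...string) *memStore {
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)
	return &memStore{keys: sorted}
}

func (s *memStore) ListPage(prefix, startAfter string, limit int) ([]string, bool, error) {
	if limit <= 0 {
		return nil, false, errors.New("limit must be positive")
	}
	// Find the first key strictly greater than startAfter.
	i := sort.Search(len(s.keys), func(i int) bool { return s.keys[i] > startAfter })
	var page []string
	for ; i < len(s.keys); i++ {
		if !strings.HasPrefix(s.keys[i], prefix) {
			continue
		}
		if len(page) == limit {
			// Page is full and another matching key exists.
			return page, true, nil
		}
		page = append(page, s.keys[i])
	}
	return page, false, nil
}

// walkWithPrefix visits matching keys one page at a time. fn returns false to
// stop early, so the caller never has to wait for a full bucket listing.
func walkWithPrefix(l pageLister, prefix string, pageSize int, fn func(key string) bool) error {
	startAfter := ""
	for {
		keys, more, err := l.ListPage(prefix, startAfter, pageSize)
		if err != nil {
			return err
		}
		for _, k := range keys {
			if !fn(k) {
				return nil
			}
		}
		if !more || len(keys) == 0 {
			return nil
		}
		// Resume the next page after the last key already handled.
		startAfter = keys[len(keys)-1]
	}
}
```

With a real S3-compatible backend, the page boundary would presumably be a continuation token or start-after key from the listing API rather than this in-memory search. The GC loop could then check for cancellation or delete garbage between pages instead of blocking on one long listing.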
Let's address this in the newer fix.
New fix has been merged into master. @ThreadDao Please verify it.