
[BUG] Inefficient filtering in milvus_client.py

Open johnson7788 opened this issue 1 year ago • 0 comments

Is there an existing issue / discussion for this?

  • [X] I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • [X] I have searched FAQ

Current Behavior

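In qanything_kernel/connector/database/milvus/milvus_client.py, when looking up the chunks that precede and follow a matched passage, there is no need to search for any ID where current_chunk_id - 200 + offset is below 0, because the chunks produced by splitting a file can never have a negative index. For a hit near the start of a file (for example chunk 0), range(current_chunk_id - 200, current_chunk_id + 200) generates up to 200 negative IDs that can never match.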

    def process_group(self, group):
        new_cands = []
        group.sort(key=lambda x: int(x.metadata['chunk_id'].split('_')[-1]))
        id_set = set()
        file_id = group[0].metadata['file_id']
        file_name = group[0].metadata['file_name']
        group_scores_map = {}
        # First collect every chunk_id of this file that needs to be searched
        cand_chunks = []
        for cand_doc in group:
            current_chunk_id = int(cand_doc.metadata['chunk_id'].split('_')[-1])
            group_scores_map[current_chunk_id] = cand_doc.metadata['score']
            for i in range(current_chunk_id - 200, current_chunk_id + 200):
                need_search_id = file_id + '_' + str(i)
                if need_search_id not in cand_chunks:
                    cand_chunks.append(need_search_id)

Expected Behavior

Change the code as follows, adding an i >= 0 check:

                if need_search_id not in cand_chunks and i >= 0:
                    cand_chunks.append(need_search_id)
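
An alternative to filtering inside the loop is to clamp the start of the range so that negative indices are never generated at all. The sketch below is only an illustration, not the project's code: the function name collect_candidate_chunk_ids and the window parameter are hypothetical, and the helper takes plain integers instead of the document objects used in process_group. It also tracks seen IDs in a set, since the original `not in cand_chunks` test scans a list on every iteration.

```python
def collect_candidate_chunk_ids(file_id, chunk_indices, window=200):
    """Build the list of chunk IDs to look up around each hit.

    Hypothetical helper for illustration; not part of QAnything.
    Negative indices are skipped by clamping the range start to 0.
    """
    seen = set()        # O(1) membership test instead of scanning the list
    cand_chunks = []    # keeps first-seen order, like the original list
    for current_chunk_id in chunk_indices:
        start = max(0, current_chunk_id - window)
        for i in range(start, current_chunk_id + window):
            need_search_id = f"{file_id}_{i}"
            if need_search_id not in seen:
                seen.add(need_search_id)
                cand_chunks.append(need_search_id)
    return cand_chunks


if __name__ == "__main__":
    # A hit at chunk 1 with a small window: no negative IDs are produced.
    print(collect_candidate_chunk_ids("file", [1], window=3))
```

Whether the clamp or the i >= 0 check reads better is a matter of taste; both produce the same set of IDs as the proposed fix.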

Environment

- OS:
- NVIDIA Driver:
- CUDA:
- Docker Compose:
- NVIDIA GPU Memory:

QAnything logs

No response

Steps To Reproduce

No response

Anything else?

No response

johnson7788, Feb 19 '24 09:02