
[Bug]

Open dab512 opened this issue 1 year ago • 2 comments

📦 Platform

Other

📦 Deployment mode

server db(lobe-chat-database image)

📌 Version

Latest

💻 Operating System

Windows

🌐 Browser

Chrome

🐛 Bug Description

When I chunk files, they are not split into semantic blocks. Every time, I get chunks of equal length.

However, a recursive text splitter or Markdown splitter should split by headers or lines.

Maybe I am missing a setting, but I can't find anything like this.
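
To make the difference concrete, here is a minimal sketch in Python. It is not lobe-chat's code and does not use any splitter library; `split_fixed`, `split_by_headers`, and the sample `doc` are hypothetical names made up for illustration. It contrasts fixed-length chunking (the behavior observed) with a simple header-based split (roughly the behavior expected from a Markdown splitter).

```python
import re


def split_fixed(text: str, size: int) -> list[str]:
    """Fixed-length chunking: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def split_by_headers(text: str) -> list[str]:
    """Header-based chunking: start a new chunk at each Markdown heading line.

    Simplified on purpose: it does not skip '#' lines inside fenced code
    blocks and does not enforce a maximum chunk size.
    """
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        # A heading is 1-6 '#' characters followed by whitespace.
        if re.match(r"#{1,6}\s", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


# Hypothetical sample document used only for this illustration.
doc = (
    "# Intro\nShort text.\n\n"
    "## Setup\nInstall it.\nThen run it.\n\n"
    "## Usage\nCall the API."
)

fixed_chunks = split_fixed(doc, 20)      # cuts mid-sentence and mid-heading
header_chunks = split_by_headers(doc)    # one chunk per section

for chunk in header_chunks:
    print(repr(chunk))
```

With the fixed-length version, every chunk except the last has the same length regardless of where headings fall, which matches what I am seeing. With the header-based version, each chunk is one section.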

📷 Recurrence Steps

No response

🚦 Expected Behavior

No response

📝 Additional Information

No response

dab512 avatar Mar 01 '25 09:03 dab512

👀 @dab512

Thank you for raising an issue. We will look into the matter and get back to you as soon as possible. Please make sure you have given us as much context as possible.

lobehubbot avatar Mar 01 '25 09:03 lobehubbot

The behavior you describe, where files are split into equal-length chunks rather than semantic blocks, may stem from current limitations or settings in lobe-chat. At present, the repository has no specific documentation or configuration settings that control whether files are chunked by semantic blocks or headers. The default behavior appears to produce equal-length chunks, which does not match the semantic or header-based splitting you expected.

There is a known limitation where the chatbot processes only a limited portion of the chunks, which may affect how the data is handled [1]. Additionally, the loader logic for the different file types is implemented in the ChunkingLoader class, but settings specific to semantic-block chunking are not detailed there [2].

If you are looking for specific settings or configurations, they might not be explicitly defined in the current version. You might need to explore the codebase further or wait for future updates that address these limitations.


dosubot[bot] avatar Mar 01 '25 09:03 dosubot[bot]