gpt4all
Indexing gets stuck if filenames have square brackets
Bug Report
Indexing gets stuck if filenames have square brackets.
As reported on Discord by the user "Synerdata": "When indexing text files with square brackets in their title it seems to clog the embedder which gets stuck on it and returns to 0% until the square brackets are changed to curved ones or removed." and "It was just stuck on them saying it was embedding but was not, and then when I changed the square brackets to curved in the filename it proceeded normally and embedded them."
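The workaround the reporter describes (changing square brackets to round ones in the filename) can be scripted for a whole folder. This is a minimal sketch, not part of GPT4All: the function name `rename_bracketed_files` and its `dry_run` parameter are hypothetical, and whether renaming actually addresses the underlying cause is unconfirmed.

```python
from pathlib import Path

# Map square brackets to parentheses, as the reporter did by hand.
BRACKET_TO_PAREN = str.maketrans({"[": "(", "]": ")"})


def rename_bracketed_files(folder: Path, dry_run: bool = True) -> list[tuple[Path, Path]]:
    """Find files whose names contain [ or ] and rename them to use ( and ).

    Hypothetical helper for illustration. With dry_run=True nothing is renamed;
    the planned (old, new) pairs are only returned.
    """
    renames = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        new_name = path.name.translate(BRACKET_TO_PAREN)
        if new_name == path.name:
            continue  # no square brackets in this filename
        target = path.with_name(new_name)
        if target.exists():
            continue  # skip rather than overwrite an existing file
        renames.append((path, target))
        if not dry_run:
            path.rename(target)
    return renames
```

Calling it first with `dry_run=True` lists the planned renames without touching any files; only directory names are left unchanged, since the report mentions filenames only.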
Steps to Reproduce
- Try to use the LocalDocs feature and embed local files
- Have files with a filename that contains square brackets, such as [ or ]
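To make the reproduction steps easier to repeat, test files with square brackets in their names can be generated with a short script. This is a sketch under assumptions: the function name `make_bracket_test_files`, the filenames, and the file contents are all made up for illustration, and the resulting folder still has to be added as a LocalDocs collection by hand in the GPT4All UI.

```python
import tempfile
from pathlib import Path


def make_bracket_test_files(folder: Path, count: int = 2) -> list[Path]:
    """Create small .txt files whose filenames contain square brackets.

    Hypothetical helper for illustration; names and contents are arbitrary.
    """
    folder.mkdir(parents=True, exist_ok=True)
    created = []
    for i in range(1, count + 1):
        path = folder / f"notes [draft {i}].txt"
        path.write_text(f"Sample document {i} for LocalDocs indexing.\n", encoding="utf-8")
        created.append(path)
    return created


if __name__ == "__main__":
    # Write the files to a fresh temporary folder and print their paths.
    target = Path(tempfile.mkdtemp(prefix="localdocs_brackets_"))
    for p in make_bracket_test_files(target):
        print(p)
```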
Expected Behavior
Indexing should not get stuck. The file should get indexed.
Your Environment
- GPT4All version: 3.0
- Operating System: Unknown
- Chat model used (if applicable): Unknown
I've just tried that, and it created embeddings for me. This was with GPT4All v3.0.1-dev0 (current main: 6e0c0660), built locally on Windows 10.
I simply added square brackets to the filenames of two test .txt files and made a collection from them. It created the database just fine.
Is there something else to consider?
Maybe simply renaming the file in itself solved the "being stuck at indexing" issue. We do not know why it got stuck. We would need more feedback from the user, such as sample documents or error messages. Closing, because I personally am not willing to invest more time into this. Since the original user is not willing to create a GitHub account, it is hard to follow up.