private-gpt
CHM file support
Is your feature request related to a problem? Please describe. Well, I think the ability to work with CHM files would be a great addition, since the format is commonly used for software documentation.
Describe the solution you'd like I expect it to be able to read CHM files, answer questions about them, and point out which parts of the file the content was found in.
Describe alternatives you've considered Converting to other formats, although that might degrade the content and would be tedious, considering I have thousands of CHMs...
Additional context I guess I've explained it enough, thank you!
This is not possible right now, since langchain does not support it; see https://python.langchain.com/en/latest/reference/modules/document_loaders.html.
That's unfortunate :/ CHMs are internally very similar to HTML though, they actually have HTML files inside, so perhaps it might be possible later on... Although I truly suggest that you guys consider it, even if it means doing it custom, since this would add a lot more power to PrivateGPT.
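To illustrate the "HTML files inside" point, here is a minimal sketch using only the Python standard library. It assumes the CHM has already been unpacked into a directory with some external tool (7-Zip can extract CHM archives, for example), and that the pages are readable as UTF-8, which may not hold for older CHMs that use Windows code pages. The names `html_to_text` and `load_unpacked_chm` are made up for this example and are not part of PrivateGPT or langchain.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Reduce one HTML page to a single whitespace-joined text string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def load_unpacked_chm(root):
    """Return (relative_path, text) for every .htm/.html page under root.

    Assumption: `root` is a directory produced by unpacking a CHM beforehand.
    """
    root = Path(root)
    docs = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in (".htm", ".html"):
            # Assumption: UTF-8; undecodable bytes are replaced, not fatal.
            html = path.read_text(encoding="utf-8", errors="replace")
            docs.append((path.relative_to(root).as_posix(), html_to_text(html)))
    return docs
```

Keeping the relative path next to each page's text is what would let an answer point back to the topic it came from. The unpacking step itself is not covered here.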
After spending a day wrestling with creating a pipeline for converting CHM files to something useful (without resorting to online converters, due to confidentiality), I'm very interested in this feature.
I've created a feature request over on langchain to add support for CHM files: https://github.com/langchain-ai/langchain/issues/15469
I'll have to see whether I'll be allowed to dedicate some work hours to contributing to the project.
CHM support has been added to langchain with https://github.com/langchain-ai/langchain/pull/15519