feat: How About Adding Support for Loading Pickle Files?
open-webui: 0.4.7, ollama: 0.4.0, python: 3.11
I am a dedicated user of Open WebUI and find it extremely useful.
I especially rely on the RAG feature, which has been incredibly helpful.
I would like to suggest adding support for the pickle format, as it could make the platform even more versatile and useful.
The following is the code I modified and tested. I have confirmed that it works correctly in the described environment.
apps
└── retrieval/
    └── loaders/
        └── main.py
# existing code
import pickle

# existing code
class PickleLoader:
    def __init__(self, file_path):
        self.file_path = file_path

    def load(self) -> list[Document]:
        with open(self.file_path, 'rb') as f:
            data = pickle.load(f)
        if isinstance(data, str):
            return [Document(page_content=data, metadata={"source": self.file_path})]
        elif isinstance(data, (list, tuple)):
            return [Document(page_content=str(item), metadata={"source": self.file_path}) for item in data]
        else:
            return [Document(page_content=str(data), metadata={"source": self.file_path})]

# existing code
    def _get_loader(self, filename: str, file_content_type: str, file_path: str):
        # existing code
        elif file_ext == "pkl" or file_ext == "pickle":
            loader = PickleLoader(file_path)
        # existing code
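For anyone trying the proposal locally, here is a minimal, self-contained sketch of how the three branches of `PickleLoader.load` behave. The `Document` dataclass is a hypothetical stand-in for the class the real loader imports, and `dump_and_load` is a helper invented for this illustration; neither is part of Open WebUI.

```python
import pickle
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """Hypothetical stand-in for the Document class used by the real loader."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class PickleLoader:
    """Same logic as the proposed loader, repeated so the sketch runs alone."""

    def __init__(self, file_path):
        self.file_path = file_path

    def load(self) -> list[Document]:
        # pickle.load must only be used on files from a trusted source.
        with open(self.file_path, "rb") as f:
            data = pickle.load(f)
        if isinstance(data, str):
            items = [data]
        elif isinstance(data, (list, tuple)):
            items = [str(item) for item in data]
        else:
            items = [str(data)]
        return [
            Document(page_content=item, metadata={"source": self.file_path})
            for item in items
        ]


def dump_and_load(obj) -> list[Document]:
    """Illustrative helper: pickle obj to a temporary file, then load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = str(Path(tmp) / "sample.pkl")
        with open(path, "wb") as f:
            pickle.dump(obj, f)
        return PickleLoader(path).load()


string_docs = dump_and_load("hello")        # str: one document
list_docs = dump_and_load(["a", 1])         # list/tuple: one document per item
other_docs = dump_and_load({"k": "v"})      # anything else: str() of the object
```

A string yields a single document, a list or tuple yields one document per item, and any other object is stored as its `str()` representation in a single document.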
PR welcome!
Closing due to security concerns raised by experts in the field. A few sources:
- https://blog.trailofbits.com/2024/06/11/exploiting-ml-models-with-pickle-file-attacks-part-1/
- https://blog.trailofbits.com/2024/06/11/exploiting-ml-models-with-pickle-file-attacks-part-2/
- https://www.sisainfosec.com/weekly-threat-watch/new-sleepy-pickle-exploit-puts-ml-models-at-risk/
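To make the concern concrete for readers of this thread, the sketch below shows why calling `pickle.load` on an uploaded file is risky: an object's `__reduce__` method can name any callable, and the unpickler calls it while loading. The `Payload` class and the harmless `eval("6 * 7")` are illustrative assumptions standing in for a real attack. The second half follows the "restricting globals" pattern from the Python `pickle` documentation; `PlainDataUnpickler` and `restricted_loads` are hypothetical names, and this approach narrows the attack surface rather than making untrusted pickles safe.

```python
import io
import pickle


class Payload:
    """Harmless stand-in for an attacker-controlled object."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        # A real attacker would name a far more damaging callable.
        return (eval, ("6 * 7",))


malicious_bytes = pickle.dumps(Payload())
# eval runs during loading; the caller never gets a Payload instance back.
result = pickle.loads(malicious_bytes)


class PlainDataUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so only built-in scalars and containers load."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Unpickle data while rejecting any reference to a class or function."""
    return PlainDataUnpickler(io.BytesIO(data)).load()


plain = restricted_loads(pickle.dumps(["a", {"b": 1}]))  # plain data still loads

try:
    restricted_loads(malicious_bytes)
    blocked = False
except pickle.UnpicklingError:
    blocked = True  # the reference to eval was rejected before it could run
```

Even with such a restriction in place, the Python documentation still advises unpickling only data you trust, which is consistent with the decision to close this request.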