Batch upload of documents
Description
This pull request adds a new endpoint for batch uploading files. Users can upload multiple files in a single request, and the response is a dictionary containing the status of each uploaded file. Metadata for each file is passed as a JSON string alongside the form data. The new endpoint does not affect the existing upload endpoint or any other current functionality.
Example usage:
import json

import requests

# Map each file name to its MIME type
files_to_upload = {"sample.pdf": "application/pdf", "sample.txt": "text/plain"}

# Build the multipart entries; every file is sent under the same "files" field
files = []
for file_name, content_type in files_to_upload.items():
    file_path = f"tests/mocks/{file_name}"
    files.append(("files", (file_name, open(file_path, "rb"), content_type)))

# Per-file metadata, keyed by file name
metadata = {
    "sample.pdf": {
        "source": "sample.pdf",
        "title": "Test title",
        "author": "Test author",
        "year": 2020
    },
    "sample.txt": {
        "source": "sample.txt",
        "title": "Test title",
        "author": "Test author",
        "year": 2021
    }
}

# The upload endpoint only accepts form-encoded data,
# so the metadata is serialized to a JSON string
payload = {
    "chunk_size": 128,
    "metadata": json.dumps(metadata)
}

response = requests.post(
    "http://localhost:1865/rabbithole/batch",
    files=files,
    data=payload
)
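The example above leaves the file handles open after the request. A minimal sketch of one way to close them reliably is shown below. The helper name `build_batch_request` and its `base_dir` and `stack` parameters are hypothetical and not part of this PR. It only assembles the `files` and `data` arguments in the same shape as the example.

```python
import json
from contextlib import ExitStack
from pathlib import Path


def build_batch_request(files_to_upload, metadata, base_dir, stack, chunk_size=128):
    """Hypothetical helper: build the (files, data) pair for the batch endpoint.

    Every file is opened through `stack`, so all handles are closed
    when the surrounding ExitStack exits.
    """
    files = []
    for file_name, content_type in files_to_upload.items():
        handle = stack.enter_context(open(Path(base_dir) / file_name, "rb"))
        # Same tuple layout as in the example: (field name, (name, file object, MIME type))
        files.append(("files", (file_name, handle, content_type)))
    # Metadata travels as a JSON string inside the form data
    data = {"chunk_size": chunk_size, "metadata": json.dumps(metadata)}
    return files, data


# Usage sketch (assumes `requests` is installed and the server is running locally):
#
# with ExitStack() as stack:
#     files, data = build_batch_request(files_to_upload, metadata, "tests/mocks", stack)
#     response = requests.post(
#         "http://localhost:1865/rabbithole/batch", files=files, data=data
#     )
```

The request has to be sent inside the `with` block, since the handles are closed as soon as the block exits.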
Related to issue #871
Type of change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] This change requires a documentation update
Checklist:
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas