pyDataverse
upload_datafile: handling of the content (mime) type
Any change needs to be discussed before proceeding. Failure to do so may result in the rejection of the pull request.
All Submissions
Describe your environment
- [x] OS: MacOS X 11.4
- [x] pyDataverse: 0.3.1
- [x] Python: 3.8.8
- [x] Dataverse: 5.9
Follow best practices
- [x] Have you checked to ensure there aren't other open Pull Requests for the same update/change?
- [x] Have you followed the guidelines in our Contribution Guide?
- [x] Have you read the Code of Conduct?
- [x] Do your changes in a separate branch. Branches MUST have descriptive names.
- [x] Have you merged the latest changes from upstream to your branch?
Describe the PR
There is currently no way to pass the content (mime) type to `upload_datafile()` (see #118). Also, when the multi-part POST form is created inside the method, no content type is specified for the upload. This apparently causes Dataverse to default to "text/plain" without attempting its normal type detection. In other words, in its current form, every file uploaded via pyDataverse ends up with the content type "text/plain", even when it is of a type normally recognized by Dataverse (popular image types, etc.). This defaulting behavior can and should be addressed on the Dataverse side, but it would be a good idea to fix it on the pyDataverse side as well. So this PR does 2 things:
- Provides a way to supply the mime type explicitly; and
- Makes it default to the standard `application/octet-stream` (a polite way to say "type unknown") when creating a multi-part POST entry, like curl does, which then prompts Dataverse to at least attempt to identify the file more accurately. This is achieved by switching to the long notation of passing the file to the `requests.post` method: from `{"file": open(filename, "rb")}` to `{"file": (filename, open(filename, "rb"), content_type)}`.
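
The switch to the long notation can be sketched roughly as follows. This is a minimal illustration, not the actual pyDataverse code: `build_file_field` and `post_file` are hypothetical helper names, and `url` stands in for whatever upload endpoint the caller uses.

```python
from typing import BinaryIO, Dict, Tuple

# Generic "type unknown" default, as described in this PR.
DEFAULT_CONTENT_TYPE = "application/octet-stream"


def build_file_field(
    filename: str,
    fileobj: BinaryIO,
    content_type: str = DEFAULT_CONTENT_TYPE,
) -> Dict[str, Tuple[str, BinaryIO, str]]:
    """Hypothetical helper: build the `files` mapping in the long
    (filename, file object, content type) notation accepted by requests."""
    return {"file": (filename, fileobj, content_type)}


def post_file(url: str, filename: str, content_type: str = DEFAULT_CONTENT_TYPE):
    """Hypothetical helper: POST one file as a multi-part form entry."""
    # Third-party dependency; imported here so the builder above stays
    # usable (and testable) without it.
    import requests

    with open(filename, "rb") as fileobj:
        files = build_file_field(filename, fileobj, content_type)
        return requests.post(url, files=files)
```

A caller who knows the type would pass it explicitly, for example `post_file(url, "photo.png", "image/png")`; omitting it falls back to `application/octet-stream`.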
On the Dataverse side this is tracked in https://github.com/IQSS/dataverse/issues/8344
- [x] What kind of change does this PR introduce?
- bug fix/improvement
- [x] Why is this change required? What problem does it solve?
- see the description above and the discussion in the linked issues
- [ ] Screenshots (if appropriate)
- [x] Put `Closes #ISSUE_NUMBER` to the end of this pull request
Testing
- [ ] Have you used tox and/or pytest for testing the changes?
- [ ] Did the local testing run successfully?
- [ ] Did the Continuous Integration testing (Travis-CI) run successfully?
Commits
- [ ] Have descriptive commit messages with a short title (first line).
- [ ] Use the commit message template
- [ ] Put `Closes #ISSUE_NUMBER` in your commit messages to auto-close the issue that it fixes (if such).
Others
- [ ] Is there anything you need from someone else?
Documentation contribution
- [ ] Have you followed NumPy Docstring standard?
Code contribution
- [ ] Have you used pre-commit?
- [ ] Have you formatted your code with black prior to submission (e. g. via pre-commit)?
- [ ] Have you written new tests for your changes?
- [ ] Have you run mypy on your changes successfully?
- [ ] Have you documented your update (Docstrings and/or Docs)?
- [ ] Do your changes require additional changes to the documentation?
- Closes #118