Rework auth/api_token parameters
Describe your environment
On host:
- [x] OS: macOS, Sonoma 14.5, M1
- [x] pyDataverse: this PR (i.e., main branch + patches)
- [x] Python: 3.12
- [x] Dataverse: 6.3 (local/container), 6.2 (demo.dataverse.org)
Inside container:
- [x] OS: I think the python images use ubuntu, but didn't check
- [x] pyDataverse: this PR (i.e., main branch + patches)
- [x] Python: 3.11
- [x] Dataverse: 6.3 (!) -- had to override DV_VERSION in docker-compose-test-all.yml to make tests pass, as three tests check for the version. See also #197 .
Follow best practices
- [x] Have you checked to ensure there aren't other open Pull Requests for the same update/change? #146 is partially affected by this as it changes the params argument of
put_request. - [x] Have you followed the guidelines in our Contribution Guide? Yes, see #193 .
- [x] Have you read the Code of Conduct? Yes.
- [x] Do your changes in a separate branch. Branches MUST have descriptive names.
- [x] Have you merged the latest changes from upstream to your branch? Yes
Describe the PR
- [x] What kind of change does this PR introduce?
- This PR introduces custom authentication implementation to allow for API token and Bearer token-based authentication.
- Sword requires BasicAuth, which this PR also addresses (see https://guides.dataverse.org/en/latest/api/sword.html#sword-auth)
- An Auth implementation must be passed as a keyword-argument to the init function. We can discuss this requirement, but it felt better to be explicit about it and make sure implementers actually pass it correctly rather than accidentally passing it as the api_token or something.
- I replaced some
isinstance(..., ("".__class__, "".__class__))with the much more legible and equivalent (as far as I can tell)isinstance(..., str). - It also deprecates the use of the previously ignored
authon individual functions in favor of generating multiple API objects with different auth methods.- For the deprecation, I introduced a DEPRECATION_GUARD to warn people who are trying to pass something to the functions. Initially I wanted to check for
False, but there was one call defaulting toTrue(get_access) and this would not catch code explicitly settingauth=Falseon the callsite. So I decided to introduce a new default argument and show a warning if it is overwritten. Happy to discuss this decision – I have never done such an orderly deprecation before, so this is my first try :)
- For the deprecation, I introduced a DEPRECATION_GUARD to warn people who are trying to pass something to the functions. Initially I wanted to check for
- I updated the documentation.
- Oh, and I smuggled a change to the
Api.__str__methods into the PR. As I scrolled by, I saw that they essentially could be removed and just theApi.__str__method itself should useself.__class__.__name__to get a similar result. However, the DataAccessApi will not printDataAccessApi: ...instead of the previousData Access API: ...– the same applies for the others. However, this felt to be more in spirit with the docstring (which I then also updated slightly). If that's not so nice, we can move that to its own PR or even drop that commit entirely. - I had to adapt three unrelated things (conf.py, the
self.base_url = Noneassignment in api.py, and the pyproject.toml) for mypy. - This PR almost accidentally also allows to pass
paramsproperly due to the name change of params -> headers.
- [x] Why is this change required? What problem does it solve?
- First and foremost, the issue it resolves is that tokens could potentially be leaked when sharing log messages, as the error response from dataverse includes the request URL, which, if the token is provided as ?key, contains the key.
- The PR also lays the groundwork to allow for different authentication schemas in the future and even to configure it from the outside, favoring composition over inheritance. This will hopefully make maintenance and the refactoring in #189 easier.
- By deprecating the auth on individual methods, it will eventually be possible to remove this unused argument and make the API a little bit leaner and reduce confusing behavior.
- [ ] Screenshots (if appropriate)
- [x] Put
Closes #ISSUE_NUMBERto the end of this pull request
Testing
- [x] Have you used tox and/or pytest for testing the changes? pytest
- [x] Did the local testing ran successfully? yes, if applying #197. For demo.dataverse.org, I could not run test_upload.py successfully, as I was unauthorized to perform those uploads (401). However, locally (outside the containers) I was able to run the asyncio tests with pytest-asyncio. The containers seem to miss this.
- [ ] Did the Continuous Integration testing (Travis-CI) ran successfully?
Commits
- [x] Have descriptive commit messages with a short title (first line).
- [x] Use the commit message template
- [x] Put
Closes #ISSUE_NUMBERin your commit messages to auto-close the issue that it fixes (if such).- I have only done this for 3e73fbf, as that is arguably the "essence" of this PR.
Others
- [x] Is there anything you need from someone else? Take your time to review these changes. While some are quite repetitive and I already split some parts off into other PRs, this is a massive changeset due to the deprecation. In fact, maybe the deprecation should be its own PR? What do you think?
Documentation contribution
- [?] Have you followed NumPy Docstring standard? I hope so
Code contribution
- [x] Have you used pre-commit? Yes, see also #196.
- [x] Have you formatted your code with black prior to submission (e. g. via pre-commit)? Yes
- [x] Have you written new tests for your changes? Yes
- [x] Have you ran mypy on your changes successfully? Yes, but I had to add types-jsonschema to the lint dependencies and in the conf.py. I haven't tried --strict.
- [x] Have you documented your update (Docstrings and/or Docs)? yes
- [x] Do your changes require additional changes to the documentation? No, but m
Note that this PR depends on other PRs which should be reviewed and decided/acted upon first, I can also rebase/merge the changes into this branch afterwards. In particular:
- #196
- #197
Closes #192 .
Thanks, @shoeffner, that looks great! I agree with the changes made to allow passing either 'auth' or 'api_token'. I just tested it locally, and it works perfectly 🙌
@shoeffner, due to the merge of the jsonData pull request, this one has a couple of merge conflicts. I am happy to take care of those if you want.
I'll take a look, thanks for letting me know!
Alright, I rebased everything and resolved one incompatibility with #203, which I didn't pay attention to during the rebase. Should work now.
@shoeffner, thanks a lot - Merging!