pyDataverse icon indicating copy to clipboard operation
pyDataverse copied to clipboard

Rework auth/api_token parameters

Open shoeffner opened this issue 1 year ago • 1 comments

Describe your environment

On host:

  • [x] OS: macOS, Sonoma 14.5, M1
  • [x] pyDataverse: this PR (i.e., main branch + patches)
  • [x] Python: 3.12
  • [x] Dataverse: 6.3 (local/container), 6.2 (demo.dataverse.org)

Inside container:

  • [x] OS: I think the python images use ubuntu, but didn't check
  • [x] pyDataverse: this PR (i.e., main branch + patches)
  • [x] Python: 3.11
  • [x] Dataverse: 6.3 (!) -- had to override DV_VERSION in docker-compose-test-all.yml to make tests pass, as three tests check for the version. See also #197 .

Follow best practices

  • [x] Have you checked to ensure there aren't other open Pull Requests for the same update/change? #146 is partially affected by this as it changes the params argument of put_request.
  • [x] Have you followed the guidelines in our Contribution Guide? Yes, see #193 .
  • [x] Have you read the Code of Conduct? Yes.
  • [x] Do your changes in a separate branch. Branches MUST have descriptive names.
  • [x] Have you merged the latest changes from upstream to your branch? Yes

Describe the PR

  • [x] What kind of change does this PR introduce?
    • This PR introduces custom authentication implementation to allow for API token and Bearer token-based authentication.
    • Sword requires BasicAuth, which this PR also addresses (see https://guides.dataverse.org/en/latest/api/sword.html#sword-auth)
    • An Auth implementation must be passed as a keyword-argument to the init function. We can discuss this requirement, but it felt better to be explicit about it and make sure implementers actually pass it correctly rather than accidentally passing it as the api_token or something.
    • I replaced some isinstance(..., ("".__class__, "".__class__)) with the much more legible and equivalent (as far as I can tell) isinstance(..., str).
    • It also deprecates the use of the previously ignored auth on individual functions in favor of generating multiple API objects with different auth methods.
      • For the deprecation, I introduced a DEPRECATION_GUARD to warn people who are trying to pass something to the functions. Initially I wanted to check for False, but there was one call defaulting to True (get_access) and this would not catch code explicitly setting auth=False on the callsite. So I decided to introduce a new default argument and show a warning if it is overwritten. Happy to discuss this decision – I have never done such an orderly deprecation before, so this is my first try :)
    • I updated the documentation.
    • Oh, and I smuggled a change to the Api.__str__ methods into the PR. As I scrolled by, I saw that they essentially could be removed and just the Api.__str__ method itself should use self.__class__.__name__ to get a similar result. However, the DataAccessApi will not print DataAccessApi: ... instead of the previous Data Access API: ... – the same applies for the others. However, this felt to be more in spirit with the docstring (which I then also updated slightly). If that's not so nice, we can move that to its own PR or even drop that commit entirely.
    • I had to adapt three unrelated things (conf.py, the self.base_url = None assignment in api.py, and the pyproject.toml) for mypy.
    • This PR almost accidentally also allows to pass params properly due to the name change of params -> headers.
  • [x] Why is this change required? What problem does it solve?
    • First and foremost, the issue it resolves is that tokens could potentially be leaked when sharing log messages, as the error response from dataverse includes the request URL, which, if the token is provided as ?key, contains the key.
    • The PR also lays the groundwork to allow for different authentication schemas in the future and even to configure it from the outside, favoring composition over inheritance. This will hopefully make maintenance and the refactoring in #189 easier.
    • By deprecating the auth on individual methods, it will eventually be possible to remove this unused argument and make the API a little bit leaner and reduce confusing behavior.
  • [ ] Screenshots (if appropriate)
  • [x] Put Closes #ISSUE_NUMBER to the end of this pull request

Testing

  • [x] Have you used tox and/or pytest for testing the changes? pytest
  • [x] Did the local testing ran successfully? yes, if applying #197. For demo.dataverse.org, I could not run test_upload.py successfully, as I was unauthorized to perform those uploads (401). However, locally (outside the containers) I was able to run the asyncio tests with pytest-asyncio. The containers seem to miss this.
  • [ ] Did the Continuous Integration testing (Travis-CI) ran successfully?

Commits

  • [x] Have descriptive commit messages with a short title (first line).
  • [x] Use the commit message template
  • [x] Put Closes #ISSUE_NUMBER in your commit messages to auto-close the issue that it fixes (if such).
    • I have only done this for 3e73fbf, as that is arguably the "essence" of this PR.

Others

  • [x] Is there anything you need from someone else? Take your time to review these changes. While some are quite repetitive and I already split some parts off into other PRs, this is a massive changeset due to the deprecation. In fact, maybe the deprecation should be its own PR? What do you think?

Documentation contribution

  • [?] Have you followed NumPy Docstring standard? I hope so

Code contribution

  • [x] Have you used pre-commit? Yes, see also #196.
  • [x] Have you formatted your code with black prior to submission (e. g. via pre-commit)? Yes
  • [x] Have you written new tests for your changes? Yes
  • [x] Have you ran mypy on your changes successfully? Yes, but I had to add types-jsonschema to the lint dependencies and in the conf.py. I haven't tried --strict.
  • [x] Have you documented your update (Docstrings and/or Docs)? yes
  • [x] Do your changes require additional changes to the documentation? No, but m

Note that this PR depends on other PRs which should be reviewed and decided/acted upon first, I can also rebase/merge the changes into this branch afterwards. In particular:

  • #196
  • #197

Closes #192 .

shoeffner avatar Jul 19 '24 00:07 shoeffner

Thanks, @shoeffner, that looks great! I agree with the changes made to allow passing either 'auth' or 'api_token'. I just tested it locally, and it works perfectly 🙌

image

JR-1991 avatar Jul 19 '24 08:07 JR-1991

@shoeffner, due to the merge of the jsonData pull request, this one has a couple of merge conflicts. I am happy to take care of those if you want.

JR-1991 avatar Aug 23 '24 08:08 JR-1991

I'll take a look, thanks for letting me know!

shoeffner avatar Aug 23 '24 08:08 shoeffner

Alright, I rebased everything and resolved one incompatibility with #203, which I didn't pay attention to during the rebase. Should work now.

shoeffner avatar Aug 24 '24 15:08 shoeffner

@shoeffner, thanks a lot - Merging!

JR-1991 avatar Aug 26 '24 06:08 JR-1991