Add Pandas integration for Datafiles

Open • skasberger opened this issue 4 years ago • 1 comment

Add a pandas integration for spreadsheet datafiles. There are two options to do this:

  1. Create an API request with DataAccess.get_datafile() that returns a pandas DataFrame instead of the requests.Response object.
  2. Create an API request with DataAccess.get_datafile() that returns a models.Datafile() object with the data stored inside, which offers a .to_df() method to get the data as a pandas DataFrame.
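
To make the two options easier to compare, here is a rough sketch of what each could look like. It is not the pyDataverse implementation: `content_to_df` and `DatafileSketch` are hypothetical names made up for illustration, the sample bytes stand in for the `content` of a `requests.Response`, and the tab separator is an assumption about how the tabular file is served (hence the `sep` parameter).

```python
import io
from dataclasses import dataclass

import pandas as pd


def content_to_df(content: bytes, sep: str = "\t") -> pd.DataFrame:
    """Option 1 (hypothetical helper): parse raw response bytes into a DataFrame."""
    return pd.read_csv(io.BytesIO(content), sep=sep)


@dataclass
class DatafileSketch:
    """Option 2 (hypothetical stand-in for models.Datafile): keep the data, convert on demand."""

    content: bytes
    sep: str = "\t"  # assumed default separator, not confirmed by the issue

    def to_df(self) -> pd.DataFrame:
        # Reuse the same parsing logic as option 1.
        return content_to_df(self.content, sep=self.sep)


# Stand-in for the bytes a real request would return.
sample = b"id\tname\n1\talpha\n2\tbeta\n"

df_direct = content_to_df(sample)               # option 1
df_from_model = DatafileSketch(sample).to_df()  # option 2
```

In this sketch, option 2 keeps the raw bytes around, so the caller could still write the file to disk or re-parse it with different settings, while option 1 is shorter but discards everything except the parsed table.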

Idea coming from https://github.com/gdcc/pyDataverse/issues/80.

Prepare

  • [ ] Research both ways
    • [ ] Talks with users

Implementation

  • [ ] Write tests
  • [ ] Write code
  • [ ] Update Docs
    • [ ] Write tutorial
  • [ ] Update Docstrings
  • [ ] Run pytest
  • [ ] Run tox
  • [ ] Run pylint
  • [ ] Run mypy

Review

  • [ ] Docs

Follow-Ups

skasberger, Feb 15 '21 19:02

As discussed during the 2024-02-14 meeting of the pyDataverse working group, we are closing old milestones in favor of a new project board at https://github.com/orgs/gdcc/projects/1 and removing issues (like this one) from those old milestones. Please feel free to join the working group! You can find us at https://py.gdcc.io and https://dataverse.zulipchat.com/#narrow/stream/377090-python

pdurbin, Mar 04 '24 16:03