[good first issue - beginner] Pandas Readers & Writers

Open skrawcz opened this issue 1 year ago • 26 comments

Is your feature request related to a problem? Please describe.

We need to add more Pandas Readers & Writers (Savers & Loaders in our internal parlance).

Describe the solution you'd like

We need to have readers & if appropriate, writers, covering:

  • [x] #409
  • [x] #292 (assigned to @benhhack)
  • [x] #407 (assigned to @bengineerdavis)
  • [ ] pandas fwf
  • [x] #342 ()
  • [x] #341 (assigned to @bryangalindo)
  • [x] #369 (assigned to @JoJo10Smith)
  • [x] #352 (assigned to @JoJo10Smith)
  • [ ] pandas hdf
  • [x] #384 (assigned to @JoJo10Smith)
  • [x] #406
  • [x] pandas orc (assigned to @JoJo10Smith)
  • [ ] pandas sas
  • [x] pandas spss
  • [x] #355 (assigned to @bryangalindo)
  • [ ] #375
  • [x] #377 (assigned to @JoJo10Smith)
  • [ ] Latex writer

We should cover I/O as listed here: https://pandas.pydata.org/docs/reference/io.html

Additional context

We need to start building wrappers around the common ways people will want to save/load data. That way they'll have off-the-shelf ways to get onto Hamilton easily.

If you're interested in contributing

If you are interested in contributing, picking up one of the above should be straightforward.

  1. Ask for one, and we'll assign it.
  2. We'll create an issue for you.
  3. We'll then work with you on that issue.

In terms of effort, for an example of a desired class, see this code. It basically involves:

  1. Reading the subsequent documentation.
  2. Creating the right class.
  3. Creating some tests for it.
  4. Creating an example to put into our examples repository.
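As a rough illustration of step 2, here is a minimal sketch of what a reader class for the still-open "pandas fwf" item might look like. It deliberately does not subclass Hamilton's real base classes; the class name `FwfReaderSketch` and its methods (`loading_kwargs`, `load_data`, `name`) are hypothetical and only show the general shape: a dataclass that stores the options and forwards the ones that were set to `pandas.read_fwf`.

```python
import dataclasses
from typing import Any, Dict, List, Optional, Tuple, Union


@dataclasses.dataclass
class FwfReaderSketch:
    """Hypothetical reader wrapping pandas.read_fwf; not Hamilton's actual class."""

    path: str
    # Optional arguments, forwarded to pandas.read_fwf only when set.
    colspecs: Optional[Union[str, List[Tuple[int, int]]]] = None
    widths: Optional[List[int]] = None
    infer_nrows: Optional[int] = None

    def loading_kwargs(self) -> Dict[str, Any]:
        """Collect only the options the user set, so pandas defaults still apply."""
        kwargs: Dict[str, Any] = {}
        for field in dataclasses.fields(self):
            if field.name == "path":
                continue
            value = getattr(self, field.name)
            if value is not None:
                kwargs[field.name] = value
        return kwargs

    def load_data(self) -> Tuple[Any, Dict[str, Any]]:
        """Return the DataFrame plus a small metadata dict."""
        import pandas as pd  # imported lazily so the class can be built without pandas

        df = pd.read_fwf(self.path, **self.loading_kwargs())
        return df, {"path": self.path}

    @classmethod
    def name(cls) -> str:
        """Short key a registry could use to find this reader (assumed convention)."""
        return "fwf"
```

For step 3, a first test could check the keyword handling without touching the filesystem: `FwfReaderSketch(path="data.txt", widths=[3, 5]).loading_kwargs()` returns `{"widths": [3, 5]}`.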

skrawcz avatar Aug 21 '23 04:08 skrawcz

Hey there, I would love to give this a go with one of the pandas i/o methods. I'm new to contributing on GitHub so I appreciate the cooperative work.

benhhack avatar Aug 21 '23 11:08 benhhack

@benhhack

That would be great! Hopefully there are enough examples to get you started -- let us know what you need above that. No judgement if you use gpt-* to help you out as well -- I've found it's helpful for translation/repetitive tasks like this.

elijahbenizzy avatar Aug 21 '23 16:08 elijahbenizzy

@benhhack yeah thanks for offering to help! Just to make sure this indeed is the right issue to get started with, what's your comfort level with python & pandas?

skrawcz avatar Aug 22 '23 04:08 skrawcz

Quite familiar with using both, you can check out my repos to see my level. Never really done anything like this before though, so I'm quite interested to see how it goes.

benhhack avatar Aug 22 '23 07:08 benhhack

Cool. I would take a look at https://pandas.pydata.org/docs/reference/io.html, pick one, and then we can claim that here and create an issue to move discussion to. Which one would you like?

skrawcz avatar Aug 22 '23 15:08 skrawcz

Looking at that they all seem equally vague, haha. You can assign me to whichever you'd feel is most appropriate/best to start on.

benhhack avatar Aug 23 '23 07:08 benhhack

Sure. @benhhack mind commenting on #292 so I can assign it to you?

skrawcz avatar Aug 24 '23 05:08 skrawcz

Have commented :))

benhhack avatar Aug 24 '23 08:08 benhhack

@benhhack https://github.com/DAGWorks-Inc/hamilton/issues/342 is open for you, if you wanted to comment on it.

skrawcz avatar Sep 11 '23 17:09 skrawcz

@skrawcz I've taken a look at the other tickets and I could try the XML read and write class. The classes shouldn't be too difficult but I will reach out if I need help with the testing.

Thanks Jordan

JoJo10Smith avatar Sep 15 '23 23:09 JoJo10Smith

@JoJo10Smith if you wanted to comment on https://github.com/DAGWorks-Inc/hamilton/issues/352 I can assign it to you. Thanks!

skrawcz avatar Sep 16 '23 03:09 skrawcz

@bryangalindo mind commenting on https://github.com/DAGWorks-Inc/hamilton/issues/355 so I can assign it to you? I missed doing that earlier, sorry about that.

skrawcz avatar Sep 16 '23 20:09 skrawcz

@skrawcz I could take the HTML read and write class next.

Thanks Jordan

JoJo10Smith avatar Sep 21 '23 00:09 JoJo10Smith

@JoJo10Smith please comment on #369 :)

skrawcz avatar Sep 21 '23 03:09 skrawcz

@skrawcz i can start working on pandas gbq!

bryangalindo avatar Sep 22 '23 02:09 bryangalindo

Please comment on https://github.com/DAGWorks-Inc/hamilton/issues/375 -- note this one will require a GCP account I think.

skrawcz avatar Sep 22 '23 20:09 skrawcz

@skrawcz I can take on pandas Stata next.

JoJo10Smith avatar Sep 23 '23 02:09 JoJo10Smith

@JoJo10Smith please comment on #377

skrawcz avatar Sep 23 '23 06:09 skrawcz

@skrawcz I'll take Feather next.

Thanks Jordan

JoJo10Smith avatar Sep 25 '23 23:09 JoJo10Smith

https://github.com/DAGWorks-Inc/hamilton/issues/384 🙇 .

skrawcz avatar Sep 26 '23 04:09 skrawcz

Hey skrawcz! I would like to work on the Pandas Table. I have already made many pandas dataframes and would like to work on this project.

149189 avatar Sep 29 '23 10:09 149189

@skrawcz can I take 'pandas parquet'?

flaviassantos avatar Sep 29 '23 19:09 flaviassantos

thanks @flaviassantos you should be all set. If you have questions put them in issue https://github.com/DAGWorks-Inc/hamilton/issues/406.

@149189 please comment on https://github.com/DAGWorks-Inc/hamilton/issues/407 to have me assign that to you. If you have questions we can have the conversation there.

skrawcz avatar Sep 29 '23 20:09 skrawcz

@skrawcz I'll take csv next

Thanks Jordan

JoJo10Smith avatar Sep 30 '23 02:09 JoJo10Smith

@JoJo10Smith https://github.com/DAGWorks-Inc/hamilton/issues/409 is up. thanks!

skrawcz avatar Sep 30 '23 22:09 skrawcz

Created Spss for https://github.com/DAGWorks-Inc/hamilton/issues/813

swapdewalkar avatar Apr 09 '24 20:04 swapdewalkar