sixtyfour bucket and files fxns should return data frames

... and we should get really opinionated about what the columns of those data frames should be.

Originally via @seankross in https://github.com/fhdsl/sixtyfour/pull/3#pullrequestreview-1680656213

Oct 17 '23 04:10 sckott

@seankross With what's on the main branch right now, all the inputs to our file fxns are vectorized. However, this issue is about returning data frames from file and bucket fxns. The vectorized nature of the file fxns makes it easy, as pointed out in the s3fs docs, to pipe these fxns together. If we output tibbles we won't be able to do that so easily (though it could still be done, I guess). One of the file fxns returns a tibble right now, whereas the others return vectors. Thoughts? If we returned dfs, I guess we could always run a fxn to get back the contents of the bucket to return?

I think it's easier to think about always returning dfs from the bucket fxns.

Oct 19 '23 23:10 sckott

For the file fxns I think it's mostly okay to be dealing in vectors, because ultimately you're acting on and pushing around paths. Whereas in the case of aws_file_attr you're going to get multiple variables returned for every one file. I'll try to think of a better heuristic, but it's something like: no one-column data frames.

This is a poorly formed thought: I am struggling to think of a function where I would want a data frame as input, but I like data frames as outputs when multiple variables are returned per function input, because then you can do tidyverse stuff to that data frame and grab the columns you need for the next function in the pipeline.
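
A rough sketch of that heuristic in base R, in case it helps. The two functions below are hypothetical stand-ins, not sixtyfour's actual API, and the `size` and `type` columns are placeholder values rather than real file attributes: one value per path comes back as a vector, several values per path come back as a data frame with one row per path.

```r
# Hypothetical stand-ins for illustration only; not the sixtyfour API.

# One value per input path: a plain vector is enough.
file_exists_sketch <- function(paths) {
  startsWith(paths, "s3://")
}

# Several variables per input path: return a data frame, one row per path.
file_attr_sketch <- function(paths) {
  data.frame(
    path = paths,
    size = nchar(paths), # placeholder value, not a real file size
    type = ifelse(endsWith(paths, "/"), "directory", "file"),
    stringsAsFactors = FALSE
  )
}

paths <- c("s3://my-bucket/a.csv", "s3://my-bucket/data/")

# Work on the data frame, then pull out the one column the next function needs.
attrs <- file_attr_sketch(paths)
files_only <- attrs$path[attrs$type == "file"]
file_exists_sketch(files_only)
```

The data frame is only ever an output here; the next function still takes a character vector of paths.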

Oct 20 '23 00:10 seankross

Thanks for your feedback.

On your last thought, that makes sense. Fxns in this package may not be piped together themselves; more likely the output of a fxn in this pkg will go into a tidyverse pipeline.

Oct 20 '23 16:10 sckott