Marc Garcia comments

Results: 166 comments of Marc Garcia

Dataframe namespaces

Some comments made in the meeting:

- Discoverability is important
- More +1's in method chaining
- Several conflicts in naming pointed out: `replace`, `count`... which are different for `.str`...

Study on the pandas API: What is the most commonly used?

This is really cool, thanks a lot for sharing!

Study on the pandas API: What is the most commonly used?

I use `values` to "export" pandas data to numpy to train scikit-learn models. Not sure if that's the reason, but it doesn't surprise me. I guess it's as `read_csv`, biased...

Study on the pandas API: What is the most commonly used?

I agree with both points (I think they meant `DataFrame.array`, which is somewhat recent). But I think that's still the reason why the notebooks use `.values` so frequently.
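
To make the "export to numpy" step concrete, here is a minimal sketch, assuming a small made-up feature table (the column names and values are hypothetical). It compares the `.values` attribute seen in many notebooks with the explicit `DataFrame.to_numpy()` method, which has been available since pandas 0.24.

```python
import numpy as np
import pandas as pd

# Hypothetical feature table; column names and values are made up.
df = pd.DataFrame({"x1": [1.0, 2.0, 3.0], "x2": [4.0, 5.0, 6.0]})

# Attribute form that shows up in many notebooks.
features_values = df.values

# Explicit method form, available since pandas 0.24.
features_numpy = df.to_numpy()

# Both give a 2-D NumPy array that a scikit-learn estimator could consume.
print(type(features_values).__name__, features_numpy.shape)
```

For a homogeneous float table like this one, both forms return the same array, so the choice is mostly about readability.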

Implicit alignment in operations

Trying to structure the discussion a bit, this is how I see the different components of what is being discussed here (with an example): ```python >>> import pandas >>> df...
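
For readers less familiar with the behaviour this thread is about, the following is a minimal sketch of implicit alignment in pandas. The labels and values are made up and are not taken from the comment's own example.

```python
import pandas as pd

# Two series whose index labels only partly overlap (made-up data).
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Addition pairs values by index label, not by position.
# Labels present in only one operand produce missing values.
total = s1 + s2
print(total)
```

The result is indexed by the union of the labels, and only `b` and `c` carry a numeric sum, which is the implicit step the discussion is trying to make explicit.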

Data types to support

Thanks Tom, those are great points.

> What sort of coverage for these data types do other dataframe libraries have?

I assume Dask provides the same as pandas. For Vaex...

Data types to support

> Another question taking one step back: which "aspects" of the data types do we want to standardize?

Good question. In a first stage we care about interchanging data (#25)....

Data types to support

> 1. For categorical, there's some debate about whether the categories (set of valid values) are part of the array or data type. In pandas it's part of the dtype....
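
A small sketch of what "part of the dtype" means in pandas, using made-up category labels: two series holding the same values but declared with different category sets end up with different dtypes.

```python
import pandas as pd

# Same observed values, different sets of valid categories (made-up labels).
three_levels = pd.Series(
    pd.Categorical(["low", "high", "low"], categories=["low", "medium", "high"])
)
two_levels = pd.Series(
    pd.Categorical(["low", "high", "low"], categories=["low", "high"])
)

# The categories live on the dtype object, not only on the array.
print(list(three_levels.dtype.categories))

# Because the category sets differ, the dtypes compare as different.
dtypes_match = three_levels.dtype == two_levels.dtype
print(dtypes_match)
```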

APIs for both building pipelines and data analysis

I mostly agree, but I think the API may be affected. With an example:

```python
import sql

query = sql.connect('foo.db')
query = query.select('*')
query = query.from_('table1')  # `from` is a reserved word in Python
query.execute()
```

This is...
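
One way to picture the builder-style API in that example is the sketch below. The `Query` class is entirely hypothetical and not part of any real `sql` package. Its methods only record what was asked for, and `execute()` stands in for the point where work would actually happen. The method is spelled `from_` because `from` is a reserved word in Python.

```python
class Query:
    """Hypothetical lazy query builder: nothing runs until execute()."""

    def __init__(self, source):
        self.source = source
        self.columns = "*"
        self.table = None

    def select(self, columns):
        # Only records the requested columns.
        self.columns = columns
        return self

    def from_(self, table):
        # Only records the table name.
        self.table = table
        return self

    def execute(self):
        # Stand-in for sending the query to a database:
        # here it just renders the SQL text that would be sent.
        return f"SELECT {self.columns} FROM {self.table}"


query = Query("foo.db").select("*").from_("table1")
print(query.execute())
```

Because each method returns the builder itself, the same object supports both step-by-step reassignment, as in the comment, and method chaining.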

APIs for both building pipelines and data analysis

Thanks @maartenbreddels, that's a very good point. I clarified my comment above to note that. My main point is that there may be API implications on the decision made regarding...