Marc Garcia

166 comments by Marc Garcia

Some comments made in the meeting:

- Discoverability is important
- More +1s for method chaining
- Several naming conflicts were pointed out: `replace`, `count`... which are different for `.str`...

This is really cool, thanks a lot for sharing!

I use `values` to "export" pandas data to numpy to train scikit-learn models. Not sure if that's the reason, but it doesn't surprise me. I guess it's as `read_csv`, biased...

I agree with both points (I think they meant `DataFrame.array`, which is somewhat recent). But I think that's still the reason why the notebooks use `.values` so frequently.
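As a rough illustration of the "export to numpy" pattern mentioned in the comments above, here is a minimal sketch. The frame and the column names `x` and `y` are made up for the example; `to_numpy()` is the accessor the pandas documentation recommends over `.values`, and both return the same array for this simple numeric case.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [0, 1, 0]})

# Older notebook style: the .values attribute.
features_legacy = df[["x"]].values

# Newer, more explicit style: the to_numpy() method.
features = df[["x"]].to_numpy()
target = df["y"].to_numpy()

# For plain numeric columns both approaches give identical NumPy arrays.
same_result = np.array_equal(features, features_legacy)
```

Arrays like `features` and `target` are what would typically be passed on to a model's `fit` method.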

Trying to structure the discussion a bit, this is how I see the different components of what is being discussed here (with an example): ```python >>> import pandas >>> df...

Thanks Tom, those are great points.

> What sort of coverage for these data types do other dataframe libraries have?

I assume Dask provides the same as pandas. For Vaex...

> Another question taking one step back: which "aspects" of the data types do we want to standardize?

Good question. In a first stage we care about interchanging data (#25)....

> 1. For categorical, there's some debate about whether the categories (set of valid values) are part of the array or data type. In pandas it's part of the dtype....
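To make the "part of the dtype" point concrete, here is a small sketch of how pandas attaches the set of valid categories to the data type rather than to the array. The category labels are invented for the example.

```python
import pandas as pd

# The categories live on the dtype object itself (labels are illustrative).
size_dtype = pd.CategoricalDtype(categories=["small", "large"], ordered=True)

sizes = pd.Series(["small", "large", "small"], dtype=size_dtype)

# The valid values are reachable through the dtype, not only through the data.
categories = list(sizes.dtype.categories)

# A second series built with the same dtype shares the same set of categories,
# even though "large" never appears in its values.
other = pd.Series(["small", "small"], dtype=size_dtype)
same_dtype = sizes.dtype == other.dtype
```

One consequence of this design is that two categorical columns with different category sets have different dtypes, even if their stored values happen to match.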

I mostly agree, but I think the API may be affected. With an example:

```python
import sql

query = sql.connect('foo.db')
query = query.select('*')
query = query.from('table1')
query.execute()
```

This is...
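The `sql` module in the example above is illustrative rather than a real package. A runnable sketch of the same chained style, under stated assumptions, could look like the following: the `Query` class is hypothetical, it is backed by the standard-library `sqlite3` module with an in-memory database, and the method is named `from_` because `from` is a reserved word in Python.

```python
import sqlite3


class Query:
    """Hypothetical query builder: nothing is sent to the database until execute()."""

    def __init__(self, connection, columns="*", table=None):
        self._connection = connection
        self._columns = columns
        self._table = table

    def select(self, columns):
        # Return a new Query so each step can be reassigned, as in the example.
        return Query(self._connection, columns, self._table)

    def from_(self, table):
        # Named from_ because `from` is a Python keyword.
        return Query(self._connection, self._columns, table)

    def execute(self):
        if self._table is None:
            raise ValueError("no table selected")
        # Identifiers are interpolated directly: fine for a sketch,
        # not safe for untrusted input.
        statement = f"SELECT {self._columns} FROM {self._table}"
        return self._connection.execute(statement).fetchall()


# Set up a throwaway in-memory table to query.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE table1 (a INTEGER, b TEXT)")
connection.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "x"), (2, "y")])

query = Query(connection)
query = query.select("*")
query = query.from_("table1")
rows = query.execute()
```

In this sketch the `select` and `from_` calls only record what was asked for; the SQL statement is assembled and run when `execute()` is called.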

Thanks @maartenbreddels, that's a very good point. I clarified my comment above to note that. My main point is that there may be API implications on the decision made regarding...