Stijn de Gooijer

620 comments by Stijn de Gooijer

> updating all libraries' from_dataframe function to handle both ways of specifying the buffers' dtypes?

Implementations of `from_dataframe` should just disregard the data buffer dtype entirely. `column.dtype` already tells you...

> > Implementations of `from_dataframe` should just disregard the data buffer dtype entirely. `column.dtype` already tells you what to expect in the data buffer (e.g. dtype `STRING` will mean an...

> I think we should still take some care though

For sure! Let's first get the `from_dataframe` implementations fixed, then we can update the data buffer dtype whenever we feel...

> Well, if you have a DATETIME column, for example, what is the implied dtype for the data buffer? It might be spelled out in the spec, but I'm certainly...

> > Implementations of `from_dataframe` should just disregard the data buffer dtype entirely. `column.dtype` already tells you what to expect in the data buffer
>
> Thinking further on this...

When implementing this for Polars, I had some trouble wrapping my head around the `offset` on Columns and `bufsize` on Buffers, specifically when it comes to bitmasks. You can figure...
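To make the `offset` versus `bufsize` question concrete, here is a minimal sketch of how a bit-packed validity mask could be read. It assumes that `offset` counts elements (so one bit each for a bitmask), that `bufsize` counts bytes, and that bits are packed least-significant-bit first as in Arrow. The helper names `min_bitmask_bufsize` and `read_validity_bitmask` are hypothetical and not part of Polars or the interchange protocol.

```python
def min_bitmask_bufsize(offset: int, length: int) -> int:
    """Smallest buffer size in bytes that covers bits [offset, offset + length)."""
    return (offset + length + 7) // 8


def read_validity_bitmask(buffer: bytes, offset: int, length: int) -> list[bool]:
    """Decode `length` validity bits starting at bit index `offset`.

    Assumes LSB-first bit packing (Arrow-style): bit i lives in
    byte i // 8 at position i % 8.
    """
    if len(buffer) < min_bitmask_bufsize(offset, length):
        raise ValueError("buffer is too small for the requested offset and length")
    return [
        bool((buffer[i // 8] >> (i % 8)) & 1)
        for i in range(offset, offset + length)
    ]


# One byte holding eight validity bits; read LSB-first it is 1,0,1,1,0,1,0,0.
mask = bytes([0b00101101])

# Skip the first two elements, then read four of them (bits 2..5).
validity = read_validity_bitmask(mask, offset=2, length=4)
print(validity)  # [True, True, False, True]
```

Under these assumptions the offset is applied in bits rather than bytes, which is why a column with `offset=6` and four elements needs a two-byte mask even though four bits would fit in one byte.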

It looks good, but the Array repr has been updated. I'll send an update and this can be merged.

Was this closed by https://github.com/pola-rs/polars/pull/16268?

@r-brink Did you mean to keep this on draft or is this ready to be reviewed?

Do you see the same results if you run `df.to_numpy(use_pyarrow=False)`? PyArrow is still the default engine, and it rechunks when converting to PyArrow.