Antoine Pitrou

823 comments by Antoine Pitrou

I'm really -1 on the current testing approach where everything is disabled by default and tests have to be whitelisted **twice** to be executed.

@github-actions crossbow submit -g wheel -g python

This bug can still be reproduced. @raulcd @AlenkaF

> Note that Arrow is somewhat different than Parquet in that most of the Arrow implementations are maintained by the Apache Arrow project itself. In comparison, I believe most of...

@alkis

> Cons:
>
> * `carpenter` has a bit of complexity - it needs to be able to decode a subset of parquet to verify equivalence
> * drivers...

> Second best would be option 3, but I'm curious how often an implementation would be expected to provide files? The full set for each release, or just one for...

> > Need to host all important implementations under a single CI job (including closed-source ones? including GPU ones?).
>
> This is a good point. Does it apply to...

@zanmato1984 In case you want to chime in.

Normally, dataset tries to normalize schemas when reading the files in a dataset. Apparently that doesn't work for dictionary types; we should fix this.
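For readers unfamiliar with the idea, here is a rough conceptual sketch of what normalizing schemas across files could mean. It is not Arrow's implementation: the `unify_schemas` helper, the dict-based schema representation, and the type-name strings are all hypothetical. The dictionary case at the end is only one guess at how dictionary types might trip a naive merge; the comment above does not say what the actual failure is.

```python
# Conceptual sketch only: this is NOT how Arrow implements schema unification.
# A "schema" here is a hypothetical dict mapping field name -> type name.

def unify_schemas(schemas):
    """Merge per-file schemas into one, keeping first-seen field order."""
    unified = {}
    for schema in schemas:
        for name, type_name in schema.items():
            # Keep the first type seen for a field; reject any later mismatch.
            seen = unified.setdefault(name, type_name)
            if seen != type_name:
                raise TypeError(
                    f"field {name!r} has conflicting types: {seen} vs {type_name}"
                )
    return unified


# Two files with partly overlapping fields merge into one schema.
file_a = {"id": "int64", "city": "string"}
file_b = {"id": "int64", "score": "float64"}
merged = unify_schemas([file_a, file_b])

# Assumption for illustration: two dictionary-encoded columns that differ only
# in index type look like a conflict to a naive merge such as this one.
file_c = {"city": "dictionary<int8, string>"}
file_d = {"city": "dictionary<int32, string>"}
try:
    unify_schemas([file_c, file_d])
    conflict = None
except TypeError as exc:
    conflict = str(exc)
```

A merge that compares types for strict equality, as above, has no way to reconcile two encodings of the same logical column; a real implementation would need an explicit rule for such cases.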

> Or are you talking about the call to `UnifySchemas` in `DatasetFactory::Inspect`?

It should be this, indeed.

> It seems like doing it this way in Python is currently impossible...