Results: 144 comments of DanCardin

Perhaps relatedly, a query like so:

```sql
SELECT *
FROM read_parquet('s3://bucket/**/*.parquet', hive_partitioning=true, union_by_name=true)
WHERE date = (
    SELECT max(date)
    FROM read_parquet('s3://bucket/**/*.parquet', hive_partitioning=true, union_by_name=true)
)
```

This seems to similarly be scanning all...
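For what it's worth, a two-step sketch of the workaround I'd expect to prune properly (the bucket path is illustrative, and this assumes `date` is a hive partition key): resolve the max partition value first, then filter on the literal so the constant predicate can prune the scan to a single partition.

```sql
-- Step 1: resolve the newest partition value on its own.
SELECT max(date)
FROM read_parquet('s3://bucket/**/*.parquet', hive_partitioning=true);

-- Step 2: filter on the literal result (say it was 2024-01-31), so the
-- constant predicate can prune the scan to just that partition.
SELECT *
FROM read_parquet('s3://bucket/**/*.parquet', hive_partitioning=true, union_by_name=true)
WHERE date = '2024-01-31';
```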

I kind of figured there would be **some** sort of unavoidable slowdown due to union_by_name...but!

* Even if I were willing to accept the slowdown, I seem to be unable...
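For context, a minimal self-contained illustration of what `union_by_name` does (file names here are made up), runnable in a local DuckDB shell:

```sql
-- Write two tiny parquet files with differing schemas.
COPY (SELECT 1 AS a) TO 'a.parquet' (FORMAT parquet);
COPY (SELECT 2 AS a, 'x' AS b) TO 'b.parquet' (FORMAT parquet);

-- union_by_name aligns columns by name across files; rows from
-- a.parquet get NULL for the missing column b.
SELECT * FROM read_parquet(['a.parquet', 'b.parquet'], union_by_name=true);
```

The cost comes from having to inspect every file's schema up front to build the unified one, rather than assuming the first file's schema applies to all.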

Is there a world where some additional `read_parquet(...)` option or `SET` mode or something could exist? For my particular example, I'd be happy to accept the first-found type for a...

The first option would work for my uses, for whatever that's worth. No idea if it's possible/realistic, but it **seems** like you could (perhaps optionally) infer the schema from any...

I guess I should also say again: in a bunch of cases for us, the only referenced columns are hive partition columns, which (as far as I know) are necessarily...
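To make that concrete, a sketch (illustrative bucket path) of the kind of query I mean, where the only referenced column is the hive partition key, whose values come from directory names rather than file contents:

```sql
-- Only the partition column is referenced; its values are parsed from
-- the s3://bucket/date=.../ directory names, not from the files,
-- so per-file schema unification shouldn't (in principle) be needed.
SELECT DISTINCT date
FROM read_parquet('s3://bucket/**/*.parquet', hive_partitioning=true);
```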

At least for my original use case, it will need to support hive partitioning. But once it does, I think I'll be happy to swap and close this issue, assuming `union_by_name`...

As I said above, once schemas support hive partitioning, I'm happy for this to be closed; until then it's not stale.

This stale bot is honestly just obnoxious... lack of activity doesn't determine whether an issue is solved.

For what it's worth, I ran into the relative complexity caused by CronWorkflow subclassing Workflow in https://github.com/argoproj-labs/hera/pull/942. IMO, while subclassing Workflow for the DRYness of its fields **is** convenient, the...

🥹 there's a whole separate `cron_suspend`?! Gah. Well, that probably solves this, bug-wise... I still think something about this issue suggests something ought to change...