Han Wang

63 comments by Han Wang

@shchur I apologize for the delay. Could you try fugue 0.9.0.dev3? I think we will release 0.9.0 soon, but if you could try the latest release, it would be helpful...

@shchur we will release Fugue 0.9.0 in two weeks. Thanks for the confirmation.

@lukeb88 I am very sorry about the delay. Yes, this sounds like a good idea. We will look into it and see if it can be integrated. Thanks!

@jstammers thanks for reporting. What DuckDB version are you using? I remember in earlier DuckDB versions (

One problem I saw in the DuckDB unit tests is that they can behave unpredictably because the DuckDB connections are not properly closed at certain steps, so the following...

https://github.com/fugue-project/fugue/blob/49d37249fd74dddddc94795a5474a31f3fbfdd45/fugue/dataframe/dataframe.py#L25

https://github.com/fugue-project/fugue/blob/49d37249fd74dddddc94795a5474a31f3fbfdd45/fugue/execution/execution_engine.py#L40

Fugue doesn't support running the main (driver) logic in different processes, so the behavior is undefined.

OK, I see there are a couple of issues. Look at this code:

```python
def read_text_file(filepath: str) -> DataFrame:
    headers = read_header(filepath)
    engine = DuckExecutionEngine()
    return engine.load_df(csv_filepath, skip=2, columns=headers)

csv_filepath...
```

This code has multiple issues:

```python
# ... I can't easily use `fsql` instead as `columns` clashes ...
def read_text_file(filepath: str) -> DataFrame:
    headers = read_header(filepath)
    return fsql(f"LOAD '{filepath}' (skip=2,...
```

> I think he had to instantiate the engine because there was a bit of inconsistent behavior between the Pandas and DuckDB engines when reading multi-header CSVs. In this issue...