dominik
I also have a problem with timestamps when reading Delta tables. {"code":400,"error":"query_execution","message":"Failed to execute query: Arrow error: External error: Execution error: Failed to map column projection for field XY. Incompatible data...
Yes. This gives me the same error: {"code":400,"error":"query_execution","message":"Failed to execute query: Arrow error: External error: Execution error: Failed to map column projection for field XY. Incompatible data types Timestamp(Nanosecond, None)...
I tried to reproduce the error with some sample data. This works: import pandas as pd from deltalake.writer import write_deltalake import pyarrow as pa from typing import Tuple def delta_arrow_schema_from_pandas(...
> Maybe we should just merge this, since it already fixes all `read_delta` issues. I will take a closer look at the `scan_delta` when I have time; maybe I am...
We have this requirement as well. Currently, we convert the Polars dataframe to Pandas and export it from there. Would be great, of course, if we didn't have to take...
Could we help here somehow? I think the issue is not in Python, but in the Rust backend. But we don't really know where to start debugging. We cannot not...
> > @stinodego that would explain why partitioning must be set in scan_delta.
>
> That is a different issue. `scan_delta` does the following:
>
> 1. Map the file...
> > Any subsequent filters are pushed to scan_ds just fine. The problem is that these filters are not pushed further to step 2, where the conversion happens from Delta...
> I did some tests on my side on the dataset we are struggling with:
>
> ```
> file_url = "data/delta/item"
>
> @memory_usage
> def scan():
>     df = pl.scan_delta(file_url)
>     print(df.filter(pl.col("item") == "00009501").fetch(5))
>
> @memory_usage
> def...
> ```
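
The `memory_usage` decorator is not shown in the quoted snippet, so its implementation is unknown. For anyone who wants to run a similar comparison, here is a rough guess at what such a helper could look like. It is a hypothetical stand-in, not the original: it reads the process's peak resident set size from the standard-library `resource` module (Unix only), which also counts native allocations made outside the Python heap. The `allocate` function is just a placeholder workload.

```python
import functools
import resource
import sys


def peak_rss_mib() -> float:
    """Return the process's peak resident set size in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def memory_usage(func):
    """Hypothetical stand-in for the decorator used in the quoted snippet."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        before = peak_rss_mib()
        result = func(*args, **kwargs)
        after = peak_rss_mib()
        # Peak RSS is a high-water mark, so the difference is never negative.
        print(f"{func.__name__}: peak RSS {after:.1f} MiB (+{after - before:.1f} MiB)")
        return result

    return wrapper


@memory_usage
def allocate() -> int:
    # Placeholder workload; swap in the scan_delta / scan_parquet call to compare.
    data = bytearray(50 * 1024 * 1024)
    return len(data)
```

Because peak RSS never goes down within a process, each variant is best measured in a fresh interpreter; otherwise the second measurement only shows growth beyond the first one's peak.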
```
times = []
for i in range(10):
    start = datetime.now()
    dt = DeltaTable(file_url)
    dfs = pl.concat([pl.scan_parquet(i) for i in dt.file_uris()], rechunk=False)
    dfs.filter((pl.col("item") == "01523048")).collect(streaming=True)
    end = datetime.now()
    delta =...
```