Daft
df.write_lance doesn't support writing to Alibaba OSS
Describe the bug
We use df.write_lance to write to Alibaba OSS and found that it is not supported yet.
df.write_lance(
    lance_out,
    io_config=io_conf,
    mode="create",
    max_bytes_per_file=1024,
)
Traceback (most recent call last):
  File "/home/ray/daft/test_picture_oss.py", line 205, in <module>
    df_lance_back = write_lance_with_retry()
  File "/home/ray/daft/test_picture_oss.py", line 183, in write_lance_with_retry
    df.write_lance(
  File "/home/ray/anaconda3/lib/python3.10/site-packages/daft/api_annotations.py", line 32, in _wrap
    return func(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/daft/dataframe/dataframe.py", line 1514, in write_lance
    return self.write_sink(sink)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/daft/api_annotations.py", line 32, in _wrap
    return func(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.10/site-packages/daft/dataframe/dataframe.py", line 1426, in write_sink
    micropartition = sink.finalize(results["write_results"])
  File "/home/ray/anaconda3/lib/python3.10/site-packages/daft/io/lance/lance_data_sink.py", line 177, in finalize
    dataset = lance.LanceDataset.commit(
  File "/home/ray/anaconda3/lib/python3.10/site-packages/lance/dataset.py", line 3200, in commit
    new_ds = _Dataset.commit(
OSError: LanceError(IO): Operation not yet implemented., /home/runner/work/lance/lance/rust/lance-table/src/io/commit.rs:1044:50
To Reproduce
No response
Expected behavior
No response
Component(s)
Native Runner
Additional context
No response
@weixiuli
Lance itself supports the OSS protocol. You could extend the code linked below to handle OSS and try it out. Contributions are welcome.
https://github.com/Eventual-Inc/Daft/blob/ec202771d0b031eee2b3538cfcd15ad6f4aead38/daft/io/object_store_options.py#L10
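For anyone picking this up, here is a rough, standalone sketch of the kind of change being suggested: pick the storage options based on the URI scheme and add a branch for oss. Everything in it is hypothetical and only for illustration. OssConfig, oss_config_to_storage_options and storage_options_for_uri are not Daft or Lance APIs, and the option key names are assumptions that would need to be checked against the Lance documentation before use.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class OssConfig:
    """Hypothetical stand-in for OSS credentials; not a real Daft class."""

    endpoint: Optional[str] = None
    access_key_id: Optional[str] = None
    access_key_secret: Optional[str] = None


def oss_config_to_storage_options(cfg: OssConfig) -> dict[str, str]:
    # ASSUMPTION: these key names are illustrative only; confirm the
    # exact keys Lance expects for OSS before relying on them.
    candidates = {
        "oss_endpoint": cfg.endpoint,
        "oss_access_key_id": cfg.access_key_id,
        "oss_secret_access_key": cfg.access_key_secret,
    }
    # Drop unset values so only explicitly configured options are passed on.
    return {key: value for key, value in candidates.items() if value is not None}


def storage_options_for_uri(uri: str, oss: OssConfig) -> Optional[dict[str, str]]:
    # Dispatch on the URI scheme; only the hypothetical oss branch is shown.
    scheme = urlparse(uri).scheme
    if scheme == "oss":
        return oss_config_to_storage_options(oss)
    # Other schemes are out of scope for this sketch.
    return None
```

With this shape, a URI such as oss://my-bucket/data.lance would yield a dictionary of options, while any scheme the sketch does not handle yields None.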
@weixiuli any questions or updates?