
`filter` with function call is broken

Open shcheklein opened this issue 8 months ago • 3 comments

This doesn't work:

        (
            read_storage("...")
            .filter(file_stem("file.path") == "file.parquet")
            .save("index")
        )

shcheklein avatar Apr 20 '25 17:04 shcheklein

How were you importing `file_stem`: from `datachain.func.path.file_stem` or from `datachain.sql.functions.path.file_stem`? If the latter, that is not supported; you have to use the former.

I tested with the following, which seems to work:

        import datachain as dc
        from datachain.func import file_stem

        chain = dc.read_storage("file:///Users/user/Projects/iterative/datachain/")
        chain.filter(file_stem("file.path") == "file.path").show()
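When reproducing this, it may help to check what value the comparison is made against. The sketch below is plain Python, not datachain: it assumes (this thread does not confirm it) that `file_stem` follows the usual `pathlib` convention of dropping the last extension. `stem_of` and the sample paths are hypothetical names used only for illustration.

```python
from pathlib import PurePosixPath


def stem_of(path: str) -> str:
    """Return the final path component without its last extension."""
    return PurePosixPath(path).stem


# Hypothetical sample paths, standing in for values of "file.path".
paths = ["data/file.parquet", "data/file.path", "data/notes.txt"]

# Comparing the stem against a full file name (with extension) matches nothing.
matches_full_name = [p for p in paths if stem_of(p) == "file.parquet"]

# Comparing against the bare stem matches both "file.*" entries.
matches_stem = [p for p in paths if stem_of(p) == "file"]
```

If that assumption holds, a filter comparing the stem to a name that still carries its extension would return no rows even when the function call itself is supported.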

skshetry avatar Apr 24 '25 00:04 skshetry

Hmm, okay, what is the difference? Is func.path deprecated now? (We should probably clean up the docs then.)

shcheklein avatar Apr 24 '25 00:04 shcheklein

> Is func.path deprecated now?

I think so, but I am not the best person to answer that. cc @dreadatour @ilongin.

skshetry avatar Apr 24 '25 00:04 skshetry