
select_except with inner signals

Open dmpetrov opened this issue 9 months ago • 6 comments

Description

import datachain as dc

d = dc.read_parquet("example.parquet").save()

# Works:
d.select("source.kwargs").show()

# This doesn't work:
d.select_except("source.kwargs").show()
# SignalResolvingError: cannot resolve signal name 'source.kwargs': select_except() error - the feature name does not exist or inside of feature (not supported)

Version Info


dmpetrov avatar Mar 30 '25 00:03 dmpetrov

Looks like this was explicitly not supported for some reason, as can be seen in the error message.

ilongin avatar Mar 31 '25 23:03 ilongin

@ilongin it looks like legacy logic. At some point it worked this way, but not with the latest changes.

dmpetrov avatar Apr 01 '25 04:04 dmpetrov

Could it be because it would require creating a "partial" model (without certain fields)? (We could, though, use the same semantics as in select: probably just flatten the columns that are left?)

shcheklein avatar Apr 01 '25 04:04 shcheklein

yes, that's the problem 🙂

dmpetrov avatar Apr 01 '25 05:04 dmpetrov

Yes, the only solution is to flatten the top-level column. As I discussed with Dmitry, this is not that high a priority, so I'm removing myself from the assignment. I added quick, better error handling though: https://github.com/iterative/datachain/pull/1020

ilongin avatar Apr 01 '25 14:04 ilongin

@ilongin looks good. please close this issue once #1020 is closed

dmpetrov avatar Apr 01 '25 20:04 dmpetrov
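
For readers following the thread, the "flatten the top-level column" idea can be illustrated with a minimal sketch in plain Python. This is not datachain's implementation and uses no datachain API. The nested schema, the `source.file` field, and the helper names `flatten` and `select_except_flat` are all hypothetical, chosen only to show why excluding an inner signal is easy once the parent is flattened into dotted column names, with no "partial" model needed.

```python
def flatten(schema, prefix=""):
    """Flatten a nested {name: type-or-dict} schema into dotted column names."""
    flat = {}
    for name, value in schema.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # Nested object: recurse and keep the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def select_except_flat(schema, *excluded):
    """Hypothetical helper: drop the excluded signals after flattening."""
    flat = flatten(schema)

    def matches(column, signal):
        # A signal matches itself and every column nested under it.
        return column == signal or column.startswith(signal + ".")

    unknown = [s for s in excluded if not any(matches(c, s) for c in flat)]
    if unknown:
        raise KeyError(f"cannot resolve signal name(s): {unknown}")

    return {
        column: col_type
        for column, col_type in flat.items()
        if not any(matches(column, s) for s in excluded)
    }


# Assumed example schema; only "source.kwargs" appears in the issue itself.
schema = {"source": {"file": str, "kwargs": str}, "id": int}
remaining = select_except_flat(schema, "source.kwargs")
print(list(remaining))  # ['source.file', 'id']
```

The trade-off in this sketch is that `source` stops being a single object and becomes separate flat columns, which matches the flattening semantics mentioned above for `select`.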