`filter` doesn't work on top of group by results

Open shcheklein opened this issue 1 month ago • 0 comments

A query like this fails on SQLite unless `.persist()` is called before `.filter`:

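from datachain import C, func, read_dataset

(
    read_dataset("test")
    .distinct("file.path")
    .group_by(cnt=func.count(), files=func.collect("file.path"), partition_by=("session_id", "position"))
    .persist()  # workaround: without this call, the filter below fails on SQLite
    .filter(C("cnt") > 1)
)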

Without `.persist()`, it raises:

in/data_storage/sqlite.py", line 242, in execute
    result = self.db.execute(*self.compile_to_args(query))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: misuse of aggregate: count()
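
The error is consistent with the `count()` aggregate ending up in a row-level clause of the same SELECT that does the grouping. That is only a guess: the issue does not show the compiled SQL. Below is a minimal sketch with the standard `sqlite3` module, using a hypothetical `files` table that is not from the dataset above. It shows SQLite rejecting an aggregate in `WHERE` while accepting the same filter applied to a subquery, which is roughly what materializing the grouped result achieves. The exact error text may differ between SQLite versions.

```python
import sqlite3


def run_demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for the grouped dataset.
    conn.execute("CREATE TABLE files (session_id TEXT, position INTEGER, path TEXT)")
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?)",
        [("s1", 0, "a.jpg"), ("s1", 0, "b.jpg"), ("s2", 1, "c.jpg")],
    )

    # Aggregate used in WHERE at the same level as GROUP BY: SQLite rejects it.
    flat_error = None
    try:
        conn.execute(
            "SELECT session_id, position, count(*) AS cnt FROM files "
            "WHERE count(*) > 1 GROUP BY session_id, position"
        )
    except sqlite3.OperationalError as exc:
        flat_error = exc

    # Grouped result wrapped in a subquery, then filtered: SQLite accepts it.
    rows = conn.execute(
        "SELECT session_id, position, cnt FROM ("
        "  SELECT session_id, position, count(*) AS cnt FROM files "
        "  GROUP BY session_id, position"
        ") WHERE cnt > 1"
    ).fetchall()
    conn.close()
    return flat_error, rows


flat_error, rows = run_demo()
print(type(flat_error).__name__, flat_error)
print(rows)
```

If this reading is right, a fix on the datachain side would presumably wrap the `group_by` step in a subquery (or emit `HAVING`) before applying `filter`.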

shcheklein, Nov 26 '25 04:11