Vladimir Rudnykh

Results 22 issues of Vladimir Rudnykh

"Clear Buffer" clears all output except the last line, which is supposed to be a command prompt. But the prompt can consist of more than one line. It would be great to set...

The last fix (0.5.7) breaks datetimes like "2014-10-09T10:00:00Z". More info in ISO 8601: http://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators Only the first commit (8bad0cd) is necessary; the second one (1b67651) just speeds up and simplifies the regexes, and the third one is...
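
For reference, ISO 8601 allows a time zone designator of either `Z` (UTC) or a numeric offset written as `±hh`, `±hhmm`, or `±hh:mm`. The following is a minimal sketch of a pattern that accepts those forms. It is not the regex from the commits mentioned above; the names `ISO8601_DATETIME` and `timezone_designator` are illustrative assumptions, and the pattern does not range-check the individual fields.

```python
import re

# Illustrative pattern only (an assumption, not the project's actual regex).
# Matches YYYY-MM-DDThh:mm:ss with an optional time zone designator:
# "Z" for UTC, or a numeric offset such as +02, -0500, or +02:00.
ISO8601_DATETIME = re.compile(
    r"\d{4}-\d{2}-\d{2}"
    r"T\d{2}:\d{2}:\d{2}"
    r"(?P<tz>Z|[+-]\d{2}(?::?\d{2})?)?"
)


def timezone_designator(value):
    """Return the designator, "" if there is none, or None if value does not match."""
    match = ISO8601_DATETIME.fullmatch(value)
    if match is None:
        return None
    return match.group("tz") or ""
```

With this sketch, `timezone_designator("2014-10-09T10:00:00Z")` yields `"Z"`, so a datetime with the UTC designator is accepted rather than rejected.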

Implement chain `group_by`:

`group_by.py`:

```python
from datachain import C, DataChain
from datachain.lib import func
from datachain.sql.functions.path import file_ext

res = (
    DataChain.from_storage("s3://dql-50k-laion-files/")
    .group_by(
        cnt=func.count(),
        total_size=func.sum("file.size"),
        avg_size=func.avg("file.size"),
        partition_by=file_ext(C("file__path")),
    )
)
res.show()
```

...

Tiny fix for unused code found while working on another PR.

Follow-up for https://github.com/iterative/datachain/issues/327. Sometimes it is useful to save intermediate chain state: because operations are lazy, chains are not executed immediately and intermediate results are not stored. For example,...

enhancement

Fix for https://github.com/iterative/datachain/issues/1125

```
  File "/Users/vlad/.virtualenvs/datachain/lib/python3.13/site-packages/multiprocess/queues.py", line 138, in get_nowait
    return self.get(False)
           ~~~~~~~~^^^^^^^
  File "/Users/vlad/.virtualenvs/datachain/lib/python3.13/site-packages/multiprocess/queues.py", line 125, in get
    return _ForkingPickler.loads(res)
           ~~~~~~~~~~~~~~~~~~~~~^^^^^
  File "/Users/vlad/.virtualenvs/datachain/lib/python3.13/site-packages/dill/_dill.py", line 303, in loads
    return load(file,...
```

When a datachain query runs in parallel mode and a file operation (prefetch/download/cache) fails, the resulting exception then fails to pickle/unpickle:

```
  File "/Users/vlad/.virtualenvs/datachain/lib/python3.13/site-packages/multiprocess/queues.py", line 138, in get_nowait
    return...
```
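
One common way an exception can fail to cross a process boundary is sketched below. The snippet does not confirm the cause of this particular issue. It uses the standard `pickle` module as a stand-in for `dill`, and `FileError`, `PicklableFileError`, and `round_trip` are hypothetical names, not datachain classes. By default an exception is rebuilt from `self.args`, so a constructor that takes two arguments but passes a single formatted string to `Exception.__init__` cannot be reconstructed; defining `__reduce__` is one way around that.

```python
import pickle


class FileError(Exception):
    """Hypothetical exception with two required constructor arguments."""

    def __init__(self, path, reason):
        self.path = path
        self.reason = reason
        # Only one formatted string ends up in Exception.args, so the
        # default unpickling call FileError(*args) is missing an argument.
        super().__init__(f"{path}: {reason}")


class PicklableFileError(FileError):
    """Same exception, but it tells pickle how to rebuild itself."""

    def __reduce__(self):
        # Recreate the instance from the original constructor arguments.
        return (type(self), (self.path, self.reason))


def round_trip(exc):
    """Serialize and deserialize an exception, as a worker queue would."""
    return pickle.loads(pickle.dumps(exc))
```

Here `round_trip(FileError("a.txt", "not found"))` raises `TypeError` during loading, while the `PicklableFileError` variant comes back with `path` and `reason` intact.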

Validate `File.path` on usage (caching, download). More validation cases [in test](https://github.com/iterative/datachain/pull/1110/files#diff-2d6a9fd7e0eb8ccf0311704b0853ab94c27e1282394791e9643f317e5366c08fR380-R411).