skshetry

Results: 297 comments by skshetry

DiskCache is fast for web applications, such as Django apps, where it serves as a replacement for Redis. Those web applications have a very different workload, where you are unlikely to...

> Is it `hashfile.build._build_tree` then? I saw `index.save.build_tree` initially, but that doesn't make sense to me.

Sorry, yes it is `_build_tree`. https://github.com/iterative/dvc-data/blob/ffa6839e35cdd193da469605978eb8e0946433ee/src/dvc_data/hashfile/build.py#L90

> (though we might lose a bit since it will be serial: first read, then md5, then write) vs. all in parallel.

Parallelizing a large function that does a...
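One possible reading of the trade-off quoted above, as a minimal sketch: each file goes through read, then md5, then write strictly in order, while separate files are handled concurrently by a thread pool. The names `transfer_one` and `transfer_all`, the `jobs` parameter, and the hash-named destination layout are hypothetical and are not taken from dvc or dvc-data.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def transfer_one(src: Path, dest_dir: Path) -> str:
    """Hypothetical helper: process one file with three strictly serial steps."""
    data = src.read_bytes()                    # 1. read
    digest = hashlib.md5(data).hexdigest()     # 2. md5
    (dest_dir / digest).write_bytes(data)      # 3. write, named by its hash
    return digest


def transfer_all(paths, dest_dir: Path, jobs: int = 4) -> list:
    """Steps stay serial per file; different files run in parallel threads."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # pool.map preserves input order in the returned digests.
        return list(pool.map(lambda p: transfer_one(p, dest_dir), paths))
```

In this layout the parallelism comes from running the whole per-file function many times at once, rather than from overlapping the read, hash, and write stages of a single file.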

> Thanks @skshetry! Do you have any examples or benchmarks to show the overall improvement and in what scenarios it should be better?

See the description above (at the very...

`3.13.3` is a very old dvc version. Could you please try with the latest version?

`dvcfs.repo` is an internal detail of `DVCFileSystem`, so unfortunately I cannot help with it. I looked into `DVCFileSystem`, and the fact that files are not cached is expected, since the...

Could be related to https://github.com/fsspec/ossfs/pull/129. Please file a bug upstream.

Can you try removing `hardlink` and `symlink` from the `cache.type` config? You can remove the `cache.type` config entirely, as `reflink,copy` is the default. It'd be great if you could...

I think this is due to a relink optimization that I did recently for `checkout` (which is used during repro): https://github.com/iterative/dvc-data/pull/548. DVC looks at the file in the workspace, and...

I may be open to some config option to force a relink. Any thoughts @dberenbaum, @shcheklein?