skshetry

Results 304 comments of skshetry

@dberenbaum, it does not hash; it goes through the directory and checks the state db to see whether we already have the hashes. And it does that for each item,...

We don't have plans to add a new remote implementation ourselves. Of course, contributions are always welcome. In the future, we plan to offer a plugin system so that we can...

It's going to be hard to write a fair benchmark for `dvc repro`. I understand the motivation, but what you are testing here is `dvc commit` and that too is...

@grahameth, can you please share the cprofile data?

Do streaming datasets affect us in any way? I assume that'll be in the user's code rather than in dvc (and cached by dvcx). Is that just dependency support without materializing?

Okay, looks like for streaming datasets, we will need some APIs to redirect a tracked dataset in dvc to the appropriate dataset in dql.

Discussed this with @dmpetrov. He does not want dvc to compute md5 hashes for pipeline checksums, and wants that to be handled by dql itself, as in ask `dql`...

Thinking about another alternative: maybe we should create a top-level `datasets` section in dvc.yaml where users register their versions/datasets, which they can then reference in `dvc stage add`? E.g.: ```console...

In the second diagram, dvc will have to act as an intermediary. The dataset info needs to be tracked in dvc's metadata, not in the source code, so that when...