Jinjing Zhou
For data, ideally the user first registers the dataset details with the envd-server, then declares `io.mount(src=dataset("imagenet"), dst="/data")` in `build.envd`. However, this is a bit complex; an easier way...
For the initial version, we can start with `info.git_repo = "https://github.com/tensorchord/envd"` and clone it into the k8s pod. Syncthing support will also be implemented later.
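To make the "clone it into the k8s pod" idea concrete, here is a minimal sketch of an init container that clones the repo before the main container starts. This is only an illustration, not how envd does it: the function name `clone_init_container`, the volume name `source`, the `/workspace` path, and the choice of the `alpine/git` image (whose entrypoint is assumed to be `git`) are all assumptions.

```python
def clone_init_container(git_repo: str, workdir: str = "/workspace") -> dict:
    """Return a pod init-container spec (as a plain dict) that clones git_repo.

    Hypothetical helper: the container name, image, and volume name are
    illustrative assumptions, not part of envd.
    """
    return {
        "name": "clone-source",
        # Assumption: this image's entrypoint is `git`, so args start at "clone".
        "image": "alpine/git",
        "args": ["clone", "--depth", "1", git_repo, workdir],
        # The main container would mount the same volume to see the checkout.
        "volumeMounts": [{"name": "source", "mountPath": workdir}],
    }


spec = clone_init_container("https://github.com/tensorchord/envd")
print(spec["args"])
```

The main container would share the `source` volume (for example an `emptyDir`), so the cloned code is visible at the same path.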
Thanks @aseaday and @zwpaper. My core consideration here is also to unify the build function for both k8s and docker. The key design here is to support ad hoc...
After team discussion:
- We think it's better to use the same `build` function and provide a way for users to identify the current context.
- Use a function such as `config.info(repo="https://github.com/tensorchord/envd")`...
Generally, I think PVC is a good abstraction that fits all data sources (NFS, OSS, Lustre, and so on) in enterprise-level usage. Therefore, what we need to do here is just...
@kemingy It's the same; I just picked a name at random. `data.dataset` will only add a label to the image, and the runner will decide how to handle it. Therefore, the local_docker runner...
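A rough sketch of the "label at build time, runner decides" split described above. Everything here is hypothetical: the label key `example.dataset`, the `Image` class, both plan functions, and the specific handling chosen for each runner (a bind mount locally, a PVC reference on k8s) are assumptions for illustration, not envd's actual behavior.

```python
from dataclasses import dataclass, field

# Hypothetical label key; the key envd actually uses may differ.
DATASET_LABEL = "example.dataset"


@dataclass
class Image:
    name: str
    labels: dict = field(default_factory=dict)


def declare_dataset(image: Image, dataset: str) -> Image:
    """Build step: only record the dataset name as an image label."""
    image.labels[DATASET_LABEL] = dataset
    return image


def local_docker_plan(image: Image, host_root: str = "/var/lib/datasets") -> list:
    """Assumed local handling: bind-mount a host directory named after the dataset."""
    dataset = image.labels.get(DATASET_LABEL)
    if dataset is None:
        return []
    return [{"type": "bind", "source": f"{host_root}/{dataset}", "target": "/data"}]


def k8s_plan(image: Image) -> list:
    """Assumed k8s handling: reference a PVC named after the dataset."""
    dataset = image.labels.get(DATASET_LABEL)
    if dataset is None:
        return []
    return [{"type": "pvc", "claimName": dataset, "mountPath": "/data"}]


img = declare_dataset(Image("demo"), "imagenet")
print(local_docker_plan(img))
print(k8s_plan(img))
```

The point of the split is that the build output stays identical for every runner; only the runner-side plan differs.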
Done in https://github.com/tensorchord/envd-docs/pull/155
Another solution is to support the eStargz format with lazy loading. However, I'm not sure how to make this work with the build stage.
Not viable now. GitHub requires language popularity (the grammar must be used in hundreds of repos held by different people), which envd doesn't satisfy yet.
We can add CI for ARM using QEMU: https://github.com/docker/setup-qemu-action. Although I heard the performance is very bad.