feat: Add Kubernetes runtime proposal
Ref #179
Proposal preview: https://github.com/gaocegege/envd/blob/proposal/docs/proposals/20220603-kubernetes-vendor.md
/cc @Xuanwo @hezhizhen @knight42
Other options for file syncing:
- reverse sshfs https://github.com/lima-vm/sshocker
- ksync https://github.com/ksync/ksync
One thing I am wondering about is whether syncing is needed for the MVP. I think the code should live on a PV-like thing, and people usually work with git.
My 2 cents: syncing might be needed ultimately, but perhaps in different ways. Say we are working with git and push some new commits to a branch; I think it would be better if envd could pull the new commits automatically to simulate the local development experience.
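A minimal sketch of what such automatic pulling could look like, purely as an illustration; the interval, the workdir path, and the `--ff-only` choice are all assumptions, not part of the proposal:

```go
// Hypothetical auto-pull loop inside the remote environment: every 30s,
// fast-forward the checked-out branch so new commits show up automatically.
package main

import (
	"log"
	"os/exec"
	"time"
)

func main() {
	for range time.Tick(30 * time.Second) {
		// --ff-only avoids surprise merge commits in the dev environment.
		cmd := exec.Command("git", "-C", "/home/envd/workdir", "pull", "--ff-only")
		if out, err := cmd.CombinedOutput(); err != nil {
			log.Printf("auto-pull failed: %v: %s", err, out)
		}
	}
}
```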
> My 2 cents: syncing might be needed ultimately, but perhaps in different ways. Say we are working with git and push some new commits to a branch; I think it would be better if envd could pull the new commits automatically to simulate the local development experience.
Thanks for the advice! Automatic push/pull looks like magic to me, and it is complex. If the container crashes, we may also lose commits that have not been pushed.
As discussed with some infra engineers interested in envd, port-forwarding may consume many API server CPUs. And tools like virtual kubelet do not support port forwarding.
> port-forwarding may consume many API server CPUs.
Would you mind elaborating? AFAIK, if there is not much traffic, port-forwarding should not consume too many CPU resources, as it is simply a SPDY connection under the hood.
> virtual kubelet does not support port forwarding
Indeed. But what are we going to do to access the services inside the container without port-forwarding?
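On the SPDY point above, this is roughly what a single forward looks like with client-go; a minimal sketch, where the namespace, pod name (`envd-dev`), and port 8888 are made-up values:

```go
// Minimal client-go port-forward: one SPDY connection to the API server,
// multiplexing streams for the forwarded port.
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"

	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/tools/portforward"
	"k8s.io/client-go/transport/spdy"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	// The pod's portforward subresource, served by the API server.
	reqURL, err := url.Parse(config.Host + "/api/v1/namespaces/default/pods/envd-dev/portforward")
	if err != nil {
		panic(err)
	}
	// Upgrade a plain HTTP request to a SPDY connection.
	transport, upgrader, err := spdy.RoundTripperFor(config)
	if err != nil {
		panic(err)
	}
	dialer := spdy.NewDialer(upgrader, &http.Client{Transport: transport}, http.MethodPost, reqURL)

	stopCh, readyCh := make(chan struct{}), make(chan struct{})
	fw, err := portforward.New(dialer, []string{"8888:8888"}, stopCh, readyCh, os.Stdout, os.Stderr)
	if err != nil {
		panic(err)
	}
	fmt.Println("forwarding localhost:8888 -> pod:8888")
	if err := fw.ForwardPorts(); err != nil { // blocks until stopCh is closed
		panic(err)
	}
}
```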
> Thanks for the advice! Automatic push/pull looks like magic to me, and it is complex. If the container crashes, we may also lose commits that have not been pushed.
Or, we can forget about syncing altogether. We request users to develop on the remote container instead of the host. The build.envd may look like this:
```
def build():
    base(os="ubuntu20.04", language="python3")
    install.vscode_extensions([
        "ms-python.python",
    ])
    # config.pip_index(url="https://pypi.tuna.tsinghua.edu.cn/simple")
    install.python_packages([
        "tensorflow",
        "numpy",
    ])
    shell("zsh")
    config.jupyter(password="", port=8888)
+   config.working_dir(local=".", remote="https://github.com/tensorchord/envd.git")
```
envd mounts the local dir with the Docker runner and clones the repo with the Kubernetes runner.
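Purely as an illustration of that sentence, a sketch of how the two runners could interpret config.working_dir; the `Runner`/`WorkingDir` names and the `/home/envd/workdir` path are assumptions:

```go
// Sketch: resolve config.working_dir differently per runtime.
package workdir

import "fmt"

type Runner int

const (
	Docker Runner = iota
	Kubernetes
)

type WorkingDir struct {
	Local  string // e.g. "."
	Remote string // e.g. "https://github.com/tensorchord/envd.git"
}

// Prepare describes what each runner would do with the working dir.
func Prepare(r Runner, wd WorkingDir) string {
	switch r {
	case Docker:
		// Docker runner: bind-mount the local directory.
		return fmt.Sprintf("mount %s -> /home/envd/workdir", wd.Local)
	case Kubernetes:
		// Kubernetes runner: clone the repository inside the pod.
		return fmt.Sprintf("git clone %s /home/envd/workdir", wd.Remote)
	}
	return ""
}
```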
> Would you mind elaborating? AFAIK, if there is not much traffic, port-forwarding should not consume too many CPU resources, as it is simply a SPDY connection under the hood.
There should not be huge traffic by design. But algorithm engineers may use it to copy data:
```
scp <10G-file> container:~
```
> But what are we going to do to access the services inside the container without port-forwarding?
They may use service and ingress to achieve this. Thus we may need a mechanism to support customization here.
Maybe, just like the design of the device plugin, we provide an interface to communicate between envd and a CLI shim. The shim does the critical logic like port forwarding, while envd just communicates with the shim and shows information to users.
Port forwarding can be used in our default shim, while users can write their own shim to customize, e.g. using service and ingress.
> But what are we going to do to access the services inside the container without port-forwarding?
>
> They may use service and ingress to achieve this. Thus we may need a mechanism to support customization here.
>
> Maybe, just like the design of the device plugin, we provide an interface to communicate between envd and a CLI shim. The shim does the critical logic like port forwarding, while envd just communicates with the shim and shows information to users.
>
> Port forwarding can be used in our default shim, while users can write their own shim to customize, e.g. using service and ingress.
The device plugin is a not-so-good comparison; the kubectl plugin mechanism is a better analogy. A shim maintained by users can be integrated into envd.
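To make the shim idea concrete, here is one possible shape for the interface; a sketch only, and every name in it is an assumption rather than a settled API:

```go
// Sketch: envd discovers an envd-shim-* binary on PATH (kubectl-plugin
// style) and delegates service exposure to it.
package shim

import "context"

// Forwarder is what envd would call. The default shim implements it with
// API-server port-forwarding; a user-maintained shim could instead return
// a Service/Ingress address.
type Forwarder interface {
	// Expose makes containerPort of environment env reachable and returns
	// the address, e.g. "localhost:8888" or "jupyter.example.com:443".
	Expose(ctx context.Context, env string, containerPort int) (addr string, err error)
	// Close tears down whatever Expose set up.
	Close(ctx context.Context) error
}
```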
> `config.working_dir(local=".", remote="https://github.com/tensorchord/envd.git")`
My concern is that the working dir seems more like a command-line argument to me; otherwise it might prevent reusing build.envd in different working dirs, just like we don't specify the build context in a Dockerfile. Besides, if we need to specify the repo address, should we specify the branch as well?
> algorithm engineers may use it to copy data:
Got it 👌 If we transfer such a huge file via port-forwarding without rate limiting, the functionality of the apiserver might be affected.
> while users can write their own shim to customize, e.g. using service and ingress.
I think it makes sense 👍
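On the rate-limiting point above, a sketch of how a transfer tunneled through the apiserver could be capped, using golang.org/x/time/rate; the package name, limit, and burst values are assumptions:

```go
// Sketch: wrap any io.Reader so a 10G copy cannot saturate the apiserver.
package transfer

import (
	"context"
	"io"

	"golang.org/x/time/rate"
)

type limitedReader struct {
	r   io.Reader
	lim *rate.Limiter
}

func (l *limitedReader) Read(p []byte) (int, error) {
	// Never read more than one burst, then wait for that many tokens.
	if len(p) > l.lim.Burst() {
		p = p[:l.lim.Burst()]
	}
	n, err := l.r.Read(p)
	if n > 0 {
		if werr := l.lim.WaitN(context.Background(), n); werr != nil {
			return n, werr
		}
	}
	return n, err
}

// NewLimitedReader caps r at roughly bytesPerSec, with a 64 KiB burst.
func NewLimitedReader(r io.Reader, bytesPerSec int) io.Reader {
	return &limitedReader{r: r, lim: rate.NewLimiter(rate.Limit(bytesPerSec), 64*1024)}
}
```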
> My concern is that the working dir seems more like a command-line argument to me; otherwise it might prevent reusing build.envd in different working dirs, just like we don't specify the build context in a Dockerfile. Besides, if we need to specify the repo address, should we specify the branch as well?
Sounds reasonable. It should be a runtime argument instead of a build-time one.
@VoVAllen Do you have opinion on it?
Agreed, it is better as a runtime option. If using git, the user should handle the sync-related things themselves (cloning the repo and running git pull/push). Therefore config.working_dir might not be needed.
Things to be decided:
- [ ] How to support the code repository: sync or git
- [ ] How to expose services to end users: port-forward, NodePort, or Ingress/Service
- [ ] How to allow users to customize the logic without maintaining a fork of envd (plugin system)
Found a new syncing tool: https://github.com/mutagen-io/mutagen
How is it going now?
I am still working on #261. I have no bandwidth for this at the moment.
I have done some experiments on K8s sync. Maybe I could help you, or we could discuss more details if you want.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: gaocegege
I will update the Kubernetes design proposal soon. It should be elegant and fancy.
> We request users to develop on the remote container instead of the host.
We should not assume where the user works.
In my practice, the usual scenarios are working in a remote Jupyter notebook, working in local VS Code connected to a remote Jupyter kernel, or working in local VS Code with the Remote Development kit.
On the other hand, a lot of algorithm engineers do not use git as their source code management tool: they just write the code, produce some summary, and then the work is over; the source code does not need to be managed.
File syncing by envd can be one choice for the user, but it cannot be the only one.
How about letting users choose how to set up their work style?
```
config.jupyter(password="", port=8888)
# if they want to work on a Jupyter notebook
config.vscodeserver()
# or if they want to work in VS Code
config.sync(local='.', runtime='/workdir')
# or we sync their source code and datasets;
# under the hood this can be a volume (if the runtime is a local container),
# or port-forward, NodePort, Ingress, etc. if the runtime is Kubernetes

# or maybe
envd up --sync
# but `--sync` should be the default behavior, so
envd up -d --no-sync
# can be more practical

envd context ls
# then just print the context information;
# if the runtime is a remote one,
# tell users that the VS Code Remote Development kit can help with their work
```
The proposal has been updated with significant changes. PTAL.
PTAL
@Xuanwo Thanks for your fix!
> @Xuanwo Thanks for your fix!
Probably you should try Grammarly. There are still some syntax errors.
> Probably you should try Grammarly. There are still some syntax errors.
It should be fixed. PTAL.
I am merging it since we are already starting the development of envd-server.