OpenAdapt
Implement client/server functionality
Feature request
To support https://github.com/openadaptai/SoM we need to implement a client.py with https://www.gradio.app/docs/client. See:
- client.py: https://github.com/microsoft/SoM/pull/19/files#diff-1ebfaf6cb3592166b73835fa82333cb7109e7c624865c0039a7b22ff34aa27fa
- deploy.py: https://github.com/microsoft/SoM/pull/19/files#diff-5c9ed18af9a5f902219d12c3044ccb193c2c304a3748d02702889c2ca5703978
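A minimal sketch of what the proposed client.py could look like, assuming the SoM server exposes a standard Gradio endpoint. The names `SOM_SERVER_URL` and `get_marked_screenshot`, and the `api_name`, are illustrative assumptions, not taken from the PR:

```python
# Hypothetical sketch of openadapt/adapters/som/client.py.
# Assumes the remote SoM server is a standard Gradio app.

SOM_SERVER_URL = "http://localhost:7860"  # e.g. the EC2 instance address

def get_marked_screenshot(image_path: str, server_url: str = SOM_SERVER_URL) -> str:
    """Send a screenshot to the SoM Gradio server and return the path
    to the marked (annotated) image it produces."""
    from gradio_client import Client  # deferred: pip install gradio_client

    client = Client(server_url)
    # The api_name and argument order depend on how the server's
    # gr.Interface/Blocks is defined; adjust to match the actual app.
    return client.predict(image_path, api_name="/predict")
```

Deferring the `gradio_client` import keeps the module importable even when the dependency is absent (e.g. on machines that never talk to the server).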
Motivation
https://github.com/openadaptai/SoM is state-of-the-art for visual understanding, but it only runs on Linux / CUDA
Refer to system diagram:
Inference (SoM/SAM) must be done remotely.
We wish to implement:
- openadapt/adapters/som/client.py: a modified version of client.py in https://github.com/microsoft/SoM/pull/19 to support getting marked screenshots during analysis (visualization) and replay
- openadapt/adapters/som/server, which can be a git submodule containing https://github.com/OpenAdaptAI/SoM/
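If the server is vendored as a git submodule, the wiring might look like the following .gitmodules entry (a sketch; the path is an assumption based on the layout above):

```ini
; Hypothetical .gitmodules entry, e.g. as produced by
; `git submodule add https://github.com/OpenAdaptAI/SoM.git openadapt/adapters/som/server`
[submodule "openadapt/adapters/som/server"]
	path = openadapt/adapters/som/server
	url = https://github.com/OpenAdaptAI/SoM.git
```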
@FFFiend thoughts? 🙏 😄
I took a look at the client.py code as well as the docs, so from my understanding:
Instead of using a gradio app url or HuggingFace space, we would like to create an entry point to the EC2 SoM instance we have available and return a marked screenshot as the output of predict
Bit confused on how a marked screenshot is defined though 😄
@FFFiend thanks for your quick response!
Instead of using a gradio app url or HuggingFace space, we would like to create an entry point to the EC2 SoM instance we have available and return a marked screenshot as the output of predict
Exactly right! We would need to integrate deploy.py and a variation of client.py, which would both be called from elsewhere in openadapt (e.g. visualize.py, replay.py).
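One way callers like visualize.py or replay.py could consume the adapter is by injecting the marking function, so the caller stays testable without a live server. This is an illustrative sketch; `visualize_with_marks` and its fallback behavior are assumptions, not existing OpenAdapt API:

```python
# Hypothetical caller-side helper; the real SoM client function would
# be passed in as mark_fn (dependency injection for testability).
from typing import Callable

def visualize_with_marks(
    screenshot_path: str,
    mark_fn: Callable[[str], str],
) -> str:
    """Run the (injected) SoM client on a screenshot and return the
    marked image path, falling back to the original on failure."""
    try:
        return mark_fn(screenshot_path)
    except Exception:
        # Server unreachable, timeout, etc. -- fall back to the
        # unmarked screenshot rather than crashing the caller.
        return screenshot_path
```

In practice `mark_fn` would be the Gradio-backed client function; in tests it can be a stub.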
Bit confused on how a marked screenshot is defined though 😄
No worries! You can see the marked screenshot in the PR description, reproduced here:
The original screenshot is on the left, the marked screenshot is on the right.
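To make "marked screenshot" concrete: it is the original image with numbered labels overlaid on detected regions. A toy sketch with Pillow, where the region centers are made up (SoM derives them from SAM masks):

```python
# Toy illustration of marking: draw a numbered label box at each
# region center. Coordinates are placeholders, not real SAM output.
from PIL import Image, ImageDraw

def draw_marks(image: Image.Image, centers: list[tuple[int, int]]) -> Image.Image:
    """Return a copy of `image` with numbered marks at `centers`."""
    marked = image.copy()
    draw = ImageDraw.Draw(marked)
    for i, (x, y) in enumerate(centers, start=1):
        draw.rectangle([x - 8, y - 8, x + 8, y + 8], fill="black")
        draw.text((x - 4, y - 6), str(i), fill="white")
    return marked
```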
Awesome, so for client.py I'm envisioning the client to work as follows:
- Use start and stop from the Deploy class for instantiating and then closing the instance.
- Use either paramiko (https://www.paramiko.org/) or pexpect (https://pexpect.readthedocs.io/en/stable/) to have a runner for functions within the instance.
- Write up or reuse SoM logic from one of the existing demos (demo_som.py for example) into a function and then plug that into the runner above, and inference is done.
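The SSH-runner idea above could be sketched as follows, assuming paramiko. The host, username, key path, and remote script are placeholders; the command-building helper is split out so it can be tested without a connection:

```python
# Sketch of the proposed SSH runner (assumes: pip install paramiko).
# build_remote_command is pure and testable; run_on_instance needs
# a reachable host and is illustrative only.
import shlex

def build_remote_command(script: str, args: list[str]) -> str:
    """Build the shell command to run a SoM demo script remotely;
    shlex.quote guards against spaces in arguments."""
    quoted = " ".join(shlex.quote(a) for a in args)
    return f"python {shlex.quote(script)} {quoted}"

def run_on_instance(host: str, key_path: str, command: str) -> str:
    """Execute `command` over SSH and return its stdout."""
    import paramiko  # deferred import

    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username="ubuntu", key_filename=key_path)
    try:
        _, stdout, _ = ssh.exec_command(command)
        return stdout.read().decode()
    finally:
        ssh.close()
```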
The original repo doesn't have any architecture or heavy ML code laid out, so I'm guessing the meat and potatoes is within the demo files, but I could be wrong.
@FFFiend thanks for your patience! Just saw this 😅
- Use start and stop from the Deploy class for instantiating and then closing the instance.
Agreed.
- Use either paramiko https://www.paramiko.org/ or https://pexpect.readthedocs.io/en/stable/ to have a runner for functions within the instance.
This may be unnecessary. https://github.com/microsoft/SoM includes a client.py which uses the Gradio API -- our client.py should look similar.
- Write up or reuse SOM logic from one of the existing demos (demo_som.py for example) into a function and then plug that into the runner above, and inference is done.
Bingo! This should all go in client.py.