self-operating-computer
self-operating-computer copied to clipboard
A framework to enable multimodal models to operate a computer.
Changed numpy version ## What does this PR do? Fixes # (issue) ## Requirement/Documentation - If there is a requirement document, please, share it here. ## Type of change -...
### Is your feature request related to a problem? Please describe. Would be nice if we could any multimodal model from ollama, especially models with more parameters. Llava 7b is...
## What does this PR do? Fixes # (issue) ## Requirement/Documentation - If there is a requirement document, please, share it here. ## Type of change - [ ] Bug...
## What does this PR do? - support OmniparserV2 official api - integrate Omniparser with Qwen ## Requirement/Documentation - Omniparser api: https://github.com/microsoft/OmniParser/blob/master/omnitool/omniparserserver/omniparserserver.py ## Type of change - [x] New feature...
## What does this PR do? When a website or the screen doesn't load quickly (in raspberry pi for example). The SOC now is capable of waiting for some time...
## What does this PR do? This fixes the ollama client code that is supposed to use the configured client vs `ollama.chat` directly. ## Requirement/Documentation Use [client code](https://github.com/ollama/ollama-python?tab=readme-ov-file#custom-client) vs [direct](https://github.com/ollama/ollama-python?tab=readme-ov-file#usage)...
Hey @joshbickett, I’m a college student exploring this repository and would like to do some research on it. I have a few questions: - Can this be used in headless...
## What does this PR do? Allows Gemini to work again and reduce its failure rate. - Fixes Gemini model name in config.py - Improve success rate of Gemini tasks...
When a webpage or else lasts longer to appear, the AI thinks it would be good to wait for a bit to the page to load, but theres no such...
## What does this PR do? This PR simplifies the screenshot code by removing the need to create the screenshots dir by adding a `.keep` file and retaining the directory....