self-operating-computer

A framework to enable multimodal models to operate a computer.

Results 130 self-operating-computer issues

I just noticed something interesting. I use a French keyboard ("AZERTY") and when the system "searches", it opens Spotlight and writes "Google Chro,e" as if typing on a US keyboard,...
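The symptom in this report can be reproduced on paper. The sketch below is an illustration, not code from the project: it assumes keystrokes are sent as US-QWERTY key positions while the OS interprets them with a French AZERTY layout. The `QWERTY_TO_AZERTY` table and the `type_as_us_layout` helper are hypothetical names, and the table is deliberately partial, covering only the keys that differ among letters and the `m` / `,` / `;` keys.

```python
# Partial map: the character a US-QWERTY key position produces on French AZERTY.
# Hypothetical illustration only; digits and most punctuation are not modeled.
QWERTY_TO_AZERTY = {
    "q": "a", "w": "z", "a": "q", "z": "w",
    "Q": "A", "W": "Z", "A": "Q", "Z": "W",
    "m": ",",  # the key right of "n" types a comma on AZERTY
    "M": "?",  # the same key with Shift held
    ";": "m",  # AZERTY puts "m" where QWERTY has ";"
}


def type_as_us_layout(text: str) -> str:
    """Return what appears on an AZERTY system when `text` is typed
    by pressing the US-QWERTY key positions for each character."""
    return "".join(QWERTY_TO_AZERTY.get(ch, ch) for ch in text)


print(type_as_us_layout("Google Chrome"))
```

Only the `m` in "Google Chrome" sits on a key that differs between the two layouts, which is consistent with the "Google Chro,e" reported above.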

I'm a newbie, please help : ) and I'm really interested in this project :D

Maybe a YOLO object detection model trained on basic things to get coordinates? Or something like SAM? I mean, as soon as there is a small model, GPT-4 can check...

The X/Y coordinates inferred by the model are always off. It can't even select the address bar correctly.

Request: `Open calculator and add 23 + 35` Output: Error parsing JSON: 403 Request had insufficient authentication scopes. [ reason: "ACCESS_TOKEN_SCOPE_INSUFFICIENT" domain: "googleapis.com" metadata { key: "service" value: "generativelanguage.googleapis.com"...

I've been reviewing the project's codebase and noticed that all the logic and functions are currently contained within a single file. This structure, while functional, can make the code challenging...

**This PR is a work-in-progress.** The goal of this PR is to add support for LLaVA through Ollama. Todo: - [x] Successfully send prompt + image to LLaVA and get...
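For context, sending a prompt plus an image to LLaVA through a local Ollama server could look roughly like the sketch below. This is not the PR's code. It assumes Ollama is running on its default local port with a model pulled under the name `llava`; `build_llava_request` and `ask_llava` are hypothetical helper names introduced here for illustration.

```python
import base64
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_llava_request(prompt: str, image_bytes: bytes, model: str = "llava") -> dict:
    """Build the JSON payload: images are passed as base64-encoded strings."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def ask_llava(prompt: str, image_bytes: bytes, url: str = OLLAMA_URL) -> str:
    """POST the payload and return the model's text reply."""
    body = json.dumps(build_llava_request(prompt, image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

A caller would read a screenshot from disk as bytes and pass it to `ask_llava` together with the instruction text; the network call only succeeds if the server and model are actually available.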

enhancement

**This PR aims to achieve two primary objectives:** 1. Support vertical mouse-wheel scrolling to let the model access UI elements which currently aren't on the screen. 2. Replace the CLICK...

[Self-Operating Computer] Hello, I can help you with anything. What would you like done? [User] google the word HI Error parsing JSON: X get_image failed: error 8 (73, 0, 967)...

bug

🚀 **PR Summary:** Adds a Dockerfile to support containerization as part of https://github.com/OthersideAI/self-operating-computer/issues/36. 🛠️ **Changes Made:** - Included Dockerfile for containerization. - Used Python:3.11-slim as the base image (Considering it's...