self-operating-computer

A framework to enable multimodal models to operate a computer.

Results 130 self-operating-computer issues

I've optimized the code by consolidating exception handling into a single approach using a dictionary for better efficiency and readability. Redundant code segments have been refactored to avoid repetition, ensuring...
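
The dictionary-based approach this issue mentions could look roughly like the sketch below. It is only an illustration: the handler functions, `EXCEPTION_HANDLERS`, and `run_safely` are hypothetical names, not taken from the project's code.

```python
# Hypothetical handlers; none of these names come from the project.
def handle_missing_file(exc):
    return f"file not found: {exc}"


def handle_bad_value(exc):
    return f"invalid value: {exc}"


# Map each exception type to the function that handles it.
EXCEPTION_HANDLERS = {
    FileNotFoundError: handle_missing_file,
    ValueError: handle_bad_value,
}


def run_safely(action, *args):
    """Run action and route any registered exception through the table."""
    try:
        return action(*args)
    except tuple(EXCEPTION_HANDLERS) as exc:
        # Walk the table so subclasses of a registered type match too.
        for exc_type, handler in EXCEPTION_HANDLERS.items():
            if isinstance(exc, exc_type):
                return handler(exc)
        raise
```

Exceptions that are not in the table propagate unchanged, so adding a new case means adding one dictionary entry rather than another `except` branch.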

### Problem Currently, the application is prompted to use Google Chrome by default, limiting accessibility and user experience for individuals using alternative browsers. This monolithic approach excludes a significant user...

enhancement
question

I installed SOC on my computer, a MacBook Pro with an M1 Pro ARM chip. Today was my first attempt to use SOC, but it repeatedly opens my browser and goes no further....

question

As mentioned in the `Readme`, `cmd + L` would probably be a better way to navigate to the search bar. I also faced the issue of navigating to the search...

wontfix

I just noticed that the model doesn't have access to scrolling up and down. Is this difficult to implement generally (asking mostly for Linux, but of course interested in Mac,...

The feature request is this. - 🔊 Utilize speech synthesis to narrate actions before execution. - This will enhance user experience by providing audio cues. - Make the bot more...

Since it misclicks a lot, you could either train a model to do image segmentation or, with clever prompt engineering, add a barebones grid asking to...

Would it be advantageous to keep a collage of downsampled previous images, maybe to 160px x 90px and just stack them in a line left to right, one after another,...

enhancement
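
The collage idea in the issue above (downsample earlier screenshots to roughly 160 x 90 and stack them left to right) can be sketched in plain Python. This is a minimal illustration under stated assumptions: images are represented as lists of pixel rows, and `downsample` and `build_collage` are hypothetical names, not functions from the project.

```python
def downsample(image, width=160, height=90):
    """Nearest-neighbour resize of an image stored as a list of pixel rows."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def build_collage(frames, width=160, height=90):
    """Downsample each frame and join the results left to right, oldest first."""
    thumbs = [downsample(frame, width, height) for frame in frames]
    return [
        [pixel for thumb in thumbs for pixel in thumb[row]]
        for row in range(height)
    ]
```

Three previous frames would yield a single 480 x 90 strip. A real implementation would more likely use an imaging library for resizing; the nested lists here only keep the sketch dependency-free.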

I have three 32" 4K monitors for my Mac Studio and keep getting this error for any command. I'm curious which monitor it selects for the screenshot. I can hear...

bug