Implement OmniParser

Open abrichr opened this issue 1 year ago • 0 comments

Feature request

We want to implement https://huggingface.co/microsoft/OmniParser in a ReplayStrategy (e.g. https://github.com/OpenAdaptAI/OpenAdapt/pull/888)

Motivation

OmniParser is designed to convert an unstructured screenshot image into a structured list of elements, including the locations of interactable regions and captions describing each icon's potential functionality. It is intended for settings where users are already trained in responsible analytic approaches and critical reasoning is expected. OmniParser can extract information from a screenshot, but human judgement is still needed to assess its output. It is intended for use on a variety of screenshots, from both PC and phone, and across a variety of applications.
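
To make the "structured list of elements" concrete, here is a rough sketch of how a ReplayStrategy might consume that kind of output. Everything in it is an assumption for illustration: `ParsedElement`, `center`, and `find_interactable` are hypothetical names, and the field layout is not OmniParser's actual output schema or any existing OpenAdapt API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ParsedElement:
    """Hypothetical record for one screen element (not OmniParser's real schema)."""

    bbox: Tuple[float, float, float, float]  # assumed (left, top, right, bottom) in pixels
    caption: str  # text describing the element's likely function
    interactable: bool  # whether the region is expected to accept input


def center(element: ParsedElement) -> Tuple[float, float]:
    """Return the midpoint of the element's bounding box, e.g. as a click target."""
    left, top, right, bottom = element.bbox
    return ((left + right) / 2, (top + bottom) / 2)


def find_interactable(
    elements: List[ParsedElement], query: str
) -> Optional[ParsedElement]:
    """Return the first interactable element whose caption contains the query.

    Matching is a case-insensitive substring test; returns None if nothing matches.
    """
    needle = query.lower()
    for element in elements:
        if element.interactable and needle in element.caption.lower():
            return element
    return None
```

With a list like this, a strategy could look up a target by caption and use the box center as the click position. Since human judgement is still needed for the output, a real implementation would probably want some confirmation or fallback step before acting on a match.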

abrichr, Oct 26 '24 02:10