ahsin-s

3 comments by ahsin-s

There are demo videos showing this working with GPT-4, and they seemed to at least get the model to click the address bar, but I'm not sure if that was...

That is the crux of the issue. Project maintainers don't seem to point out that GPT-4V is NOT leveraging the grid overlay to estimate coordinates and is instead relying on heuristics...

Any update on this functionality? I recently came across the need for it, and it would be a big help.