Suggestion: Add terminal operator capability

Open James4Ever0 opened this issue 1 year ago • 0 comments

Although academia has produced many web agents, few projects have built a terminal agent, even though a terminal is pure text.

Devin, a closed-source coding agent, can operate within a terminal. OpenDevin, for its part, recently announced a milestone toward this capability.

I have put some effort into such an agent by grounding the terminal environment with a markup language.

You can see the position of the cursor and the range of the selected text.

tmux_show_1
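
To illustrate the idea of grounding a terminal with markup, here is a minimal sketch. It is not the code from my repository: the function name `annotate_screen`, the `<cursor>` and `<selection>` tag names, and the zero-based `(row, col)` convention are all assumptions made for this example. It also does not escape `<` or `>` that already appear in the terminal text.

```python
from typing import List, Optional, Tuple

Pos = Tuple[int, int]  # hypothetical convention: zero-based (row, col)


def annotate_screen(
    lines: List[str],
    cursor: Pos,
    selection: Optional[Tuple[Pos, Pos]] = None,
) -> str:
    """Return the screen text with hypothetical <cursor>/<selection> tags.

    The selection is (start, end) with an inclusive start and exclusive end.
    """
    crow, ccol = cursor
    out = list(lines)
    # Pad the cursor row so the cursor always wraps exactly one character,
    # even when it sits past the end of the line.
    out[crow] = out[crow].ljust(ccol + 1)

    # Each mark is (row, col, priority, tag). At the same position, closing
    # tags must come before opening tags so the markup nests properly.
    marks = [
        (crow, ccol, 3, "<cursor>"),
        (crow, ccol + 1, 0, "</cursor>"),
    ]
    if selection is not None:
        (srow, scol), (erow, ecol) = selection
        marks.append((srow, scol, 2, "<selection>"))
        marks.append((erow, ecol, 1, "</selection>"))

    # Insert from the end of the screen backwards so that earlier
    # offsets stay valid while tags are being added.
    for row, col, _priority, tag in sorted(marks, reverse=True):
        out[row] = out[row][:col] + tag + out[row][col:]
    return "\n".join(out)


if __name__ == "__main__":
    screen = ["$ ls", "foo bar"]
    print(annotate_screen(screen, cursor=(1, 4), selection=((1, 0), (1, 3))))
```

An agent that receives this text can read the cursor position and the selected range directly from the tags instead of inferring them from terminal escape sequences.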

You can also capture a screenshot of the terminal with the cursor marked in red.

vim_edit_tmux_screenshot

Rendering the rest of the terminal in grayscale gives high contrast to the red cursor, making it easier for the agent to locate.

grayscale_dark_tmux
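
The grayscale augmentation can be sketched as follows. This is only an illustration under stated assumptions, not the actual implementation: the screenshot is modeled as a nested list of `(r, g, b)` tuples, the function name `grayscale_with_red_cursor` and the `cursor_box` parameter are hypothetical, and the gray value uses the common ITU-R BT.601 luma weights.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]          # (r, g, b), each 0..255
Box = Tuple[int, int, int, int]       # (left, top, right, bottom), right/bottom exclusive

RED: Pixel = (255, 0, 0)


def grayscale_with_red_cursor(pixels: List[List[Pixel]], cursor_box: Box) -> List[List[Pixel]]:
    """Convert every pixel to gray, except the cursor box, which is painted red."""
    left, top, right, bottom = cursor_box
    result = []
    for y, row in enumerate(pixels):
        new_row = []
        for x, (r, g, b) in enumerate(row):
            if left <= x < right and top <= y < bottom:
                # Inside the (assumed known) cursor cell: keep it pure red.
                new_row.append(RED)
            else:
                # BT.601 luma; clamp to guard against float rounding above 255.
                luma = min(255, round(0.299 * r + 0.587 * g + 0.114 * b))
                new_row.append((luma, luma, luma))
        result.append(new_row)
    return result


if __name__ == "__main__":
    image = [
        [(0, 255, 0), (255, 255, 255)],
        [(0, 0, 0), (10, 20, 30)],
    ]
    for row in grayscale_with_red_cursor(image, cursor_box=(1, 0, 2, 1)):
        print(row)
```

Because every pixel outside the cursor box has equal red, green, and blue values, the cursor becomes the only saturated region in the image, which is what makes it easy to find.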

I believe this is the future, where AI agents become inseparable from operating systems. So, will SeeAct adopt my code and push the terminal agent to the next level, or even contribute to OpenDevin?

Aug 09 '24 16:08 James4Ever0