Image capture with !look(x, y, z)
There is a way to take simple screenshots: https://github.com/PrismarineJS/mineflayer/blob/master/examples/screenshot-with-node-canvas-webgl/screenshot.js
They will be deeply flawed, but better than nothing. Some thought must go into the command itself. I like !look(x, y, z) to look at a specific block, as opposed to roll, pitch, yaw.
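For reference, here is a rough sketch of what a coordinate-based look has to compute under the hood. The function name `lookAngles` is made up for illustration, and the sign convention (yaw 0 facing -Z, positive pitch looking up) is my assumption about how mineflayer orients the bot, so double-check it before relying on it.

```javascript
// Sketch: turn "look at this point" into yaw/pitch angles in radians.
// Assumption: yaw 0 faces -Z and positive pitch looks up; verify against
// the mineflayer version you are using.
function lookAngles (eye, target) {
  const dx = target.x - eye.x
  const dy = target.y - eye.y
  const dz = target.z - eye.z
  const yaw = Math.atan2(-dx, -dz) // rotation around the vertical axis
  const pitch = Math.atan2(dy, Math.hypot(dx, dz)) // elevation above the horizon
  return { yaw, pitch }
}

// Example: eye at the origin, target one block up and five blocks toward -Z.
const angles = lookAngles({ x: 0, y: 0, z: 0 }, { x: 0, y: 1, z: -5 })
console.log(angles.yaw.toFixed(3), angles.pitch.toFixed(3))
```

In a real bot you probably would not need this at all, since `bot.lookAt(point)` takes a position directly. That is part of why !look(x, y, z) seems like the simpler command to expose.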
Also, adding text info about the screenshot would be good. For example, raycast to the block in the middle of the bot's view, get its coords and type, and send that description along with the image. Some x, y, z coords and related info would help ground the image in the game world.
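A minimal sketch of that caption idea, assuming `bot` is a mineflayer bot. `describeView` is a hypothetical helper name and the 64-block range is arbitrary; the only mineflayer call used is `bot.blockAtCursor(maxDistance)`, which returns the block under the crosshair or null.

```javascript
// Sketch: build a short text caption to send along with a screenshot.
// `describeView` and the default range of 64 are illustrative choices.
function describeView (bot, maxDistance = 64) {
  const p = bot.entity.position
  const from = `Bot at (${Math.floor(p.x)}, ${Math.floor(p.y)}, ${Math.floor(p.z)})`
  const block = bot.blockAtCursor(maxDistance) // block at the view center, or null
  if (!block) {
    return `${from}. No block within ${maxDistance} blocks of the view center.`
  }
  const b = block.position
  return `${from}. View center: ${block.name} at (${b.x}, ${b.y}, ${b.z}).`
}

// Usage with a stand-in object so the sketch runs without a server.
const fakeBot = {
  entity: { position: { x: 1.5, y: 64, z: -2.3 } },
  blockAtCursor: () => ({ name: 'oak_log', position: { x: 4, y: 65, z: -7 } })
}
console.log(describeView(fakeBot))
```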
You can actually send the screenshot to OpenAI's newer models as part of the conversation, so this could improve the bot experience a lot.
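Roughly, the image goes into a normal chat message as a base64 data URL. Below is a sketch of building that message in the Chat Completions content format. `buildVisionMessage` is a hypothetical helper, and actually sending it (client setup, model choice) is left out.

```javascript
// Sketch: wrap a screenshot buffer and a caption into one user message.
// Assumes the screenshot is already encoded (JPEG by default) in a Buffer.
function buildVisionMessage (imageBuffer, caption, mimeType = 'image/jpeg') {
  const base64 = imageBuffer.toString('base64')
  return {
    role: 'user',
    content: [
      { type: 'text', text: caption },
      { type: 'image_url', image_url: { url: `data:${mimeType};base64,${base64}` } }
    ]
  }
}

// Usage: the result can be appended to the `messages` array of a chat request.
const msg = buildVisionMessage(Buffer.from('hi'), 'What do you see?')
console.log(msg.content[1].image_url.url)
```

Keeping the text part and the image part in the same message means the model sees the coords and block info right next to the picture.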
Maybe some sort of info about biomes/structures in the direction of the image? I think the biggest problem is that it needs to accomplish 2 very difficult tasks.
- Identify points of interest in the image (Hopefully advanced models will be capable of this)
- Figure out where these are in the game world/how to get there (I'm not sure how to solve this one)
@DerJanniku maybe you could submit a PR?
@Ninot1Quyi sure
I'll submit the PR.
Yeah, don't post all the code here.
I have this working and would like to take over this issue please.
I’ve started working on this and integrated it with GPT-4o for generating descriptions. It seems to be working well so far! Once I refine the workflow and improve the integration, I’ll share a more detailed update.
demo video: https://www.youtube.com/watch?v=0B7urBGJsKw
This has been implemented.