
Support for Images

Open JackHopkins opened this issue 9 months ago • 3 comments

Currently, agents only observe token streams from the output of their programs.

We could create a tool render(*entities) that converts the entity objects into an image that can be passed to a visual model.

Requirements:

  1. Implement an entities -> image renderer.
  2. Implement a render tool that invokes the renderer, along with its manual, etc.
  3. Add a post-tool hook to the render tool in the agent definition that passes the image into the agent.
  4. Send the image and the program response to the LLM.
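
To make the requirements above concrete, here is a minimal sketch of steps 1 and 4 using only the Python standard library. Everything in it is an assumption for illustration: the `Entity` dataclass and its fields, the colour table, the `tile_px` parameter, and the provider-neutral message shape returned by `build_observation` are hypothetical and do not reflect the project's actual entity objects or any specific LLM API. It draws each entity as a flat coloured rectangle on a tile grid and encodes the result as a PNG.

```python
import base64
import struct
import zlib
from dataclasses import dataclass


@dataclass
class Entity:
    # Hypothetical stand-in for an entity object; field names are assumptions.
    name: str
    x: int  # tile column of the top-left corner
    y: int  # tile row of the top-left corner
    width: int = 1
    height: int = 1


# Hypothetical colour table: entity name -> RGB.
COLOURS = {
    "transport-belt": (230, 200, 60),
    "assembling-machine-1": (90, 140, 220),
}
DEFAULT_COLOUR = (200, 200, 200)
BACKGROUND = (30, 30, 30)


def _png_chunk(tag: bytes, data: bytes) -> bytes:
    # A PNG chunk is: length, tag, data, CRC32 over tag + data.
    body = tag + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))


def render(*entities: Entity, tile_px: int = 8) -> bytes:
    """Rasterise entities onto a tile grid and return PNG bytes."""
    if not entities:
        raise ValueError("render() needs at least one entity")
    min_x = min(e.x for e in entities)
    min_y = min(e.y for e in entities)
    cols = max(e.x + e.width for e in entities) - min_x
    rows = max(e.y + e.height for e in entities) - min_y

    # One colour per tile; later entities overwrite earlier ones.
    grid = [[BACKGROUND] * cols for _ in range(rows)]
    for e in entities:
        colour = COLOURS.get(e.name, DEFAULT_COLOUR)
        for ty in range(e.y - min_y, e.y - min_y + e.height):
            for tx in range(e.x - min_x, e.x - min_x + e.width):
                grid[ty][tx] = colour

    # Expand each tile to tile_px x tile_px pixels of 8-bit RGB.
    raw = bytearray()
    for row in grid:
        line = b"".join(bytes(c) * tile_px for c in row)
        raw += (b"\x00" + line) * tile_px  # filter type 0 on every scanline

    width, height = cols * tile_px, rows * tile_px
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _png_chunk(b"IHDR", ihdr)
        + _png_chunk(b"IDAT", zlib.compress(bytes(raw)))
        + _png_chunk(b"IEND", b"")
    )


def build_observation(program_output: str, png: bytes) -> list[dict]:
    # Hypothetical message parts pairing the program response with the image.
    return [
        {"type": "text", "text": program_output},
        {
            "type": "image",
            "media_type": "image/png",
            "data": base64.b64encode(png).decode("ascii"),
        },
    ]
```

A post-tool hook (step 3) could then call something like `build_observation(response, render(*entities))` and hand the result to the agent. A real renderer would presumably use sprites and include nearby terrain rather than flat rectangles.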

JackHopkins, Mar 11 '25 17:03

A thought here: if you implement integration with https://github.com/redruin1/factorio-draftsman, you can convert the entity objects into a blueprint string and then take advantage of existing blueprint renderers. It wouldn't capture the nearby environment, though.

A-Vaillant, Mar 12 '25 14:03

I was reading the draftsman documentation, and it mentions that (of course) it uses blueprint strings, as does everything in the wider ecosystem. I propose that a better first step is entity objects -> blueprint strings. That opens up a lot of possibilities.
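
As a rough illustration of that conversion, the sketch below builds a blueprint string by hand: a version character `0` followed by base64-encoded, zlib-compressed JSON. The input shape (plain dicts with `name`, `x`, `y`) and both function names are assumptions made for this example, not the project's entity objects or the draftsman API. A real export also carries additional fields (such as a version number) that are omitted here, so importing this output into the game is untested.

```python
import base64
import json
import zlib


def entities_to_blueprint_string(entities: list[dict]) -> str:
    # Hypothetical input: dicts with "name", "x", "y" keys.
    blueprint = {
        "blueprint": {
            "item": "blueprint",
            "entities": [
                {
                    "entity_number": i,  # 1-based index within the blueprint
                    "name": e["name"],
                    "position": {"x": e["x"], "y": e["y"]},
                }
                for i, e in enumerate(entities, start=1)
            ],
        }
    }
    payload = json.dumps(blueprint, separators=(",", ":")).encode("utf-8")
    # Version character "0", then base64 of the zlib-compressed JSON.
    return "0" + base64.b64encode(zlib.compress(payload, 9)).decode("ascii")


def blueprint_string_to_dict(blueprint_string: str) -> dict:
    # Inverse of the above: drop the version character, decode, decompress.
    return json.loads(zlib.decompress(base64.b64decode(blueprint_string[1:])))
```

Round-tripping with `blueprint_string_to_dict(entities_to_blueprint_string(...))` returns the same structure, which makes it easy to inspect exactly which fields survive the conversion.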

vessenes, Mar 14 '25 14:03

We decided against using blueprint strings, because AFAIK they don't contain any stateful information (such as inventories) or set recipes, and therefore don't capture enough information about 'live' factories.

However, I think blueprints would be ideal for specifying lab scenarios (e.g., 'fix this broken factory'), which we should definitely aim to support.

JackHopkins, Mar 14 '25 14:03