Plinf.jl
Block-words domain, multiple goals with overlapping subtasks
Hi,
My research lab has been running experiments in the block-words domain, including test cases in which the word CAPSTONE is stacked. In our experiments, the possible goals are ROTE, TONE, TUNE, HAWK, and CAPSTONE. Since TONE and CAPSTONE overlap in their subtasks, we are getting unusual results in our CAPSTONE tests: the model is unable to switch its goal belief from TONE to CAPSTONE after block S is stacked. Is this case of overlapping subtasks something that the block-words domain currently supports?
Thank you for your help.
Hi Naveen,
Sorry to hear you're running into this issue! It'd be helpful if you could share what you mean by unusual results, and how you've configured the model, because performance is going to be sensitive to the model configuration.
The model should be able to handle goals with overlapping subgoals/tasks. I think what is most likely happening here is that the inference algorithm becomes overconfident in the "TONE" hypothesis too quickly, and ends up discarding all samples with "CAPSTONE" as the goal hypothesis, so that hypothesis can never be recovered. To address this, I would try the following:
- Increase the number of particles/samples (to e.g. 500)
- Turn off resampling in the particle filtering algorithm (set resample to false)
Inference will end up being slower, but hopefully more stable. If the above doesn't work, there are other steps I can suggest.
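In code, those two changes might look roughly like the sketch below. This is only an illustration: the world_particle_filter name, argument order, and keyword spelling are my recollection of the example scripts, and the surrounding objects (world_init, world_config, traj, obs_terms) are placeholders for whatever your existing script already constructs, so adapt it to match your setup.

```julia
# Sketch only -- function and variable names are assumptions, not a verbatim API.
n_particles = 500   # more samples keep low-probability goals like CAPSTONE alive

traces, weights = world_particle_filter(
    world_init, world_config,   # placeholders for your existing model setup
    traj, obs_terms,            # observed trajectory and observation terms
    n_particles;
    resample=false              # no resampling, so hypotheses aren't discarded early
)
```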
Hello ztangent,
Thank you so much for your reply and suggestions. I am working with Naveen on this issue.
We have increased the number of particles and set resampling to false. The code can now recognize overlapping subgoals such as TONE and CAPSTONE. However, the algorithm fails to detect a new goal when the agent completes goal 1 (stacking TONE) and moves on to goal 2 (stacking HAWK) within the same episode/observation trajectory. The algorithm remains too confident in goal 1 even after the agent has moved on to goal 2. Is the algorithm equipped to handle such situations?
Also, we are trying to plot the accuracy of your algorithm across various sensor noise levels. For this, we are passing each sensor_noise value to the setup_model() function. Let us know if anything else needs to be changed for this.
However, the algorithm fails to detect a new goal when the agent completes goal 1 (stacking TONE) and moves on to goal 2 (stacking HAWK) within the same episode/observation trajectory. The algorithm remains too confident in goal 1 even after the agent has moved on to goal 2. Is the algorithm equipped to handle such situations?
The algorithm is indeed not equipped to handle such situations by default. This is because the default model configuration doesn't allow for the possibility that the agent might switch goals over time, and instead just models the agent as having a single initial goal, which it never changes.
To handle switching goals, you need to adjust the model configuration to provide both a prior over initial goals and a transition function that specifies how goals might change over time. To do this, you can pass a goal_step generative function to the constructor for AgentConfig:
https://github.com/ztangent/Plinf.jl/blob/3f6bc1119055790163a5f21ac63e6b0d73cf4d8a/src/agents.jl#L18-L28
The goal_step argument is expected to be a generative function that looks something like this:
https://github.com/ztangent/Plinf.jl/blob/3f6bc1119055790163a5f21ac63e6b0d73cf4d8a/src/specifications/specifications.jl#L80-L82
The above is an example of a goal transition function that does nothing -- it simply keeps the goal fixed across timesteps (hence the name static_goal_step). To handle changing goals, you probably want a function that adds some probability of changing the goal at every timestep. Here's an example from the refactored version of Plinf.jl that I'm currently working on:
https://github.com/ztangent/Plinf.jl/blob/68c8e87d54b8aaf7d0f9025f01cdb209d675e700/src/modeling/agents/goals.jl#L77-L91
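For a concrete feel, a goal transition step in that spirit might look roughly like the sketch below, written against Gen.jl. The function name, argument list, and the representation of goals as a vector of candidate specifications are assumptions for illustration, not the exact Plinf.jl interface, so adapt them to the goal_step signature in the version you're using.

```julia
using Gen

# Hypothetical sketch: with a small per-timestep probability, resample the goal
# uniformly from the candidate goal specifications; otherwise keep the current goal.
@gen function switching_goal_step(t, goal, goal_specs)
    switch ~ bernoulli(0.05)    # probability of abandoning the current goal this step
    if switch
        idx ~ uniform_discrete(1, length(goal_specs))
        return goal_specs[idx]  # adopt a new goal
    else
        return goal             # keep the existing goal
    end
end
```

You would then pass your version of such a function as the goal_step when constructing the AgentConfig, in place of static_goal_step.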
You will also have to change some of the inference algorithm code, so that the goal probabilities the algorithm returns track the probability of the agent's most recent goal $g_t$, rather than the goal $g_0$ it had at the start.
All of the above is a bit complicated, so if you're looking for an easier workaround, you could try just changing the goal specification so that instead of modeling the agent as switching goals, you model the agent as possibly having a conjunctive goal: building both the TONE tower and the HAWK tower, in either order. This way, the algorithm should infer the joint goal "TONE + HAWK" when it sees the agent build both towers.
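As a rough illustration of that workaround (the predicate names below are assumptions -- match them to however the block-words domain you're using defines its stacking predicates):

```julia
using PDDL

# Hypothetical sketch of a conjunctive goal: both the TONE and HAWK towers
# must be built, with no constraint on the order of construction.
joint_goal = pddl"(and (on t o) (on o n) (on n e) (on h a) (on a w) (on w k))"
```

You could then include this conjunctive goal in the set of candidate goals alongside the single-word goals, with whatever prior weights make sense for your experiments.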
I hope this helps! As a related note, I'm currently refactoring the Plinf.jl library so that modifications like the above will hopefully be easier and better documented. This may not happen quickly enough for your project, but I figured I'd give a heads-up anyway, because at some point I'll be merging those changes into the master branch, and I don't want that to catch anyone off guard.
Hello Tan,
I hope you are doing well. Just wanted to check in and see if the code has been refactored. Thank you in advance! I hope your research is going well.
Hello! It is basically ready apart from the example and visualization code for the block-words domain (which I will probably only be able to get to in September). If you want to try the refactored branch, here it is: https://github.com/ztangent/Plinf.jl/tree/refactor