Add Async Multiparty Chat to ParlAI

Open chiehminwei opened this issue 3 years ago • 8 comments

Patch description This patch adds basic async multiparty chat functionality to ParlAI, following the specification outlined in the design document.

During training, we can still use a two-agent world, even for multiparty chat. This is achieved by making one agent the teacher, one agent the student, and modifying the relevant Teacher to format the context correctly.
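A rough sketch of what that Teacher-side formatting might look like (the function name and format are hypothetical, not the actual ParlAI Teacher API): multiparty turns get flattened into the single context string a two-agent world expects, with each line tagged by its speaker.

```python
# Hypothetical sketch: flatten a multiparty conversation into the
# single-context format a two-agent (teacher/student) world expects.
def format_multiparty_context(turns):
    """turns: list of (speaker, text) pairs -> newline-joined context."""
    return '\n'.join(f'{speaker}: {text}' for speaker, text in turns)

ctx = format_multiparty_context([
    ('Groot', 'I am Groot.'),
    ('Rocket', 'We know.'),
])
```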

To support multiparty chat, though, we extend the TorchAgent class to have the ability to handle more than two speakers, and enable it to have a notion of time and a notion of silence.

These are enabled by setting the following CLI flags:

  1. --timestep-field
  2. --speaker-field
  3. --record-silence
  4. --silence-token

For example, for an observation:

observation = {
  'speaker': 'Groot',
  'text': 'I am Groot.',
  'timestep': '00:00:00',
}

The --timestep-field would be timestep, and the --speaker-field would be speaker. The timestep would get saved in the History class in new fields related to timesteps, and speaker would replace the __P1__ token to get saved in the conversation history.
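To illustrate the field lookup described above, here is a minimal sketch (the helper name and return shape are hypothetical; the real History class stores these differently): the configured field names are used to pull the speaker and timestep out of a raw observation, and the speaker name is prepended to the text in place of the generic __P1__ token.

```python
# Hypothetical sketch: pull the configured fields out of an observation.
def extract_fields(observation, timestep_field='timestep', speaker_field='speaker'):
    """Return (history_line, timestep) from a raw observation dict."""
    speaker = observation.get(speaker_field)
    timestep = observation.get(timestep_field)
    text = observation.get('text', '')
    # The speaker name replaces the generic __P1__ token in the history.
    history_line = f'{speaker}: {text}' if speaker else text
    return history_line, timestep

line, ts = extract_fields(
    {'speaker': 'Groot', 'text': 'I am Groot.', 'timestep': '00:00:00'}
)
```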

The --record-silence flag controls whether the agent records silence in its own output; when recording is disabled, the agent skips the silence token during its call to self_observe.
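As a toy illustration of that self_observe filtering (the class and its attributes are hypothetical stand-ins, not the actual TorchAgent implementation; I assume here that record_silence=False means the silence token is skipped):

```python
# Hypothetical sketch of filtering silence in self_observe.
SILENCE = '__SILENCE__'

class AgentSketch:
    def __init__(self, record_silence=True, silence_token=SILENCE):
        self.record_silence = record_silence
        self.silence_token = silence_token
        self.history = []

    def self_observe(self, self_message):
        text = self_message.get('text', '')
        # Skip recording the agent's own silence when configured to do so.
        if text == self.silence_token and not self.record_silence:
            return
        self.history.append(text)
```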

Testing steps Additional unit tests have been added to the test suite. Run pytest tests/test_torch_agent.py

chiehminwei avatar Jul 06 '22 21:07 chiehminwei

Looks great. You added the minimal amount of code needed and refactored well. This is a good example of building on top of high-level code. Just make sure that you have enough test coverage for backward compatibility. I am guessing some of the GPU test failures might be related to the changes to the TorchAgent.

Awesome, I'll add unit tests once I figure out silence and teachers, etc.

I took a look and it seems like the GPU test failures are from broken main branch, not from this PR: https://fb.workplace.com/groups/770643789812374/permalink/1997298873813520/

chiehminwei avatar Jul 08 '22 17:07 chiehminwei

Another question I had after reading through the code again is whether we want to modify how we handle observations in TorchAgent.

Currently, act calls batch_act with the observation self.observation that was stored from the last call to observe(observation). https://github.com/facebookresearch/ParlAI/blob/dff9aabb5024c30c81e146cebffbc88bc6431b61/parlai/core/torch_agent.py#L1822

https://github.com/facebookresearch/ParlAI/blob/dff9aabb5024c30c81e146cebffbc88bc6431b61/parlai/core/torch_agent.py#L1842-L1843

https://github.com/facebookresearch/ParlAI/blob/dff9aabb5024c30c81e146cebffbc88bc6431b61/parlai/core/torch_agent.py#L2142-L2148

This observation (or batch of observations) is then passed to the train_step (or eval_step). https://github.com/facebookresearch/ParlAI/blob/dff9aabb5024c30c81e146cebffbc88bc6431b61/parlai/core/torch_agent.py#L2152 https://github.com/facebookresearch/ParlAI/blob/dff9aabb5024c30c81e146cebffbc88bc6431b61/parlai/core/torch_agent.py#L2169 https://github.com/facebookresearch/ParlAI/blob/dff9aabb5024c30c81e146cebffbc88bc6431b61/parlai/core/torch_agent.py#L2239

This behavior is fine if we only have two agents parley to each other. The agent is always in an observe-act loop so self.observation is always fresh.

Things are different if we introduce a tick-based system and allow multiple calls to observe before calling act. When the agent acts, only the latest observation is stored in self.observation and sent to train_step.

For example, in a world with 3 agents, for Agent A:

agentA.observe(action from Agent B)
agentA.observe(action from Agent C)
agentA.act()

Although both the action from Agent B and the action from Agent C will get stored in the history (by appending to the self.history.history_vecs array, etc.) when observe is called, only the action from Agent C will be passed to train_step.

This may be fine since previous observations are all stored in the history, but I wonder if all observations from this time tick should be somehow grouped together and sent to batch_act and then to train_step(batch), maybe by modifying batch so that it's not just a list of Message, but rather a list of list of Message, e.g.:

# batch of size 2
batch = [
  [Message_from_A, Message_from_B, Message_from_C],
  [Message_from_A, Message_from_B, Message_from_C],
]

This would require changes in a lot of places, though (the validate function in worlds.py, plus changing the function signatures of batch_act, train_step, and eval_step), so I'm not sure it's worth doing.
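The staleness issue described above can be shown with a minimal toy class (hypothetical, not the actual TorchAgent): with multiple observe() calls per tick, every observation reaches the history, but only the last one survives in self.observation, which is all that batch_act/train_step would ever see.

```python
# Minimal sketch of the stale-observation problem under a tick system.
class TickAgentSketch:
    def __init__(self):
        self.observation = None
        self.history_vecs = []

    def observe(self, observation):
        self.history_vecs.append(observation['text'])
        self.observation = observation  # overwrites the previous one

    def act(self):
        # batch_act / train_step would only ever see self.observation
        return self.observation

a = TickAgentSketch()
a.observe({'text': 'action from Agent B'})
a.observe({'text': 'action from Agent C'})
latest = a.act()
```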

chiehminwei avatar Jul 08 '22 18:07 chiehminwei

After the Jul 11 Multiparty Chat meeting, we decided against allowing multiple "observe" calls in a row. The interactive world will handle this, making sure all agents act in an observe-act loop (even when a new observation comes in the middle of an act, etc.)

chiehminwei avatar Jul 11 '22 23:07 chiehminwei

From the failing tests, it looks like adding silence_token to the dictionary will break models trained without silence_token when they restore from checkpoints, due to a mismatch in embedding size.

We could just remove silence_token from the dict.py and instead use a command line flag that saves to self.silence_token. The only reason we need a silence_token is so that we can skip it during self_observe.

But this feels a little unclean. Let me know what you think.
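To spell out why the checkpoints break: adding silence_token grows the vocabulary by one, so the new model's embedding matrix has one more row than the one saved in the checkpoint, and state-dict loading requires identical tensor shapes. A dependency-free sketch (the names here are illustrative, not ParlAI's actual dictionary code):

```python
# Sketch: growing the dictionary makes the embedding matrix shape
# differ from the one saved in an old checkpoint.
old_vocab = ['__null__', '__start__', '__end__', 'hello']
new_vocab = old_vocab + ['__SILENCE__']

EMBED_DIM = 4
saved_embedding = [[0.0] * EMBED_DIM for _ in old_vocab]   # from checkpoint
fresh_embedding = [[0.0] * EMBED_DIM for _ in new_vocab]   # new model

def can_load(saved, model):
    # state_dict loading requires identical tensor shapes
    return len(saved) == len(model)
```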

chiehminwei avatar Jul 12 '22 22:07 chiehminwei

I went ahead and switched to using the command-line flag --silence-token for TorchAgent. This still feels a little hacky to me, but I guess it doesn't make sense to keep an unused silence token in the dictionary and grow the embedding size by one for agents that don't care about it.

chiehminwei avatar Jul 13 '22 22:07 chiehminwei

Looks good, thank you. I had a few small comments on the code. Also one major comment about the tests: it looks like a lot of your tests are combinations of multiple tests with many resets and repeats. This makes them very long and hard to debug. If possible, break your tests into multiple smaller ones. I see you have commented on the test parts; why not turn those parts into functions with proper names derived from those comments?

I wrote the tests based on the existing one that we had and thought that was the style we were going for. Should I break things up?

chiehminwei avatar Jul 25 '22 22:07 chiehminwei

Hi @chiehminwei!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

facebook-github-bot avatar Aug 20 '22 07:08 facebook-github-bot

This PR has not had activity in 30 days. Closing due to staleness.

github-actions[bot] avatar Sep 20 '22 00:09 github-actions[bot]