imitation Inconsistent naming of expert demonstrations

The expert demonstrations are referred to in the code base as:

episodes
trajectories
lists of transitions
rollouts
demonstrations

I think we should clarify in the documentation what we mean by them and maybe get rid of some of the terms. What are your thoughts?

Feb 17 '22 11:02 ernestum

Not sure how I missed this the first time around. I agree we should standardize, but some care is needed here, as each of these names means a different thing or at least has different connotations.

For example, lists of transitions and trajectories are not the same data type. You can get lists of transitions from trajectories, but not the other way around. Some algorithms like BC can just learn from sequences of transitions; other algorithms (like MCE IRL IIRC) need whole trajectories.
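To make the one-way conversion concrete, here is a minimal sketch. The `Transition` and `Trajectory` dataclasses and the `flatten_to_transitions` helper are hypothetical stand-ins written for this comment, not the library's actual types or API, and the sketch assumes every trajectory is a complete episode (so the last step is marked `done`).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Transition:
    """One environment step (hypothetical type, for illustration only)."""

    obs: int
    act: int
    next_obs: int
    done: bool


@dataclass(frozen=True)
class Trajectory:
    """A whole episode with len(obs) == len(acts) + 1 (hypothetical type)."""

    obs: Sequence[int]
    acts: Sequence[int]


def flatten_to_transitions(trajectories: Sequence[Trajectory]) -> List[Transition]:
    """Turn whole trajectories into a flat list of single-step transitions."""
    transitions: List[Transition] = []
    for traj in trajectories:
        assert len(traj.obs) == len(traj.acts) + 1, "need one more obs than acts"
        last_step = len(traj.acts) - 1
        for t, act in enumerate(traj.acts):
            transitions.append(
                Transition(
                    obs=traj.obs[t],
                    act=act,
                    next_obs=traj.obs[t + 1],
                    # Assumption: each trajectory ends its episode.
                    done=(t == last_step),
                )
            )
    return transitions


demos = [Trajectory(obs=[0, 1, 2], acts=[10, 11]), Trajectory(obs=[5, 6], acts=[12])]
flat = flatten_to_transitions(demos)
```

Going the other way is not generally possible: once the flat list is shuffled or subsampled, the ordering and the episode boundaries that a trajectory-based algorithm relies on are gone.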

I tend to think of demonstrations as rollouts from the expert, whereas rollouts can also come from other policies. Episodes is pretty redundant with rollouts and is probably best avoided, except in phrases like "number of episodes".

Oct 22 '22 02:10 AdamGleave

So I propose:

Decide if we want to call it episodes, trajectories or rollouts? Maybe look at the naming conventions of SB3.
Demonstrations should only be used as a parameter name but with the qualification of the type. So use demonstration_transitions and demonstration_trajectories/demonstration_rollouts ...

Nov 14 '22 12:11 ernestum

So I propose:

Decide if we want to call it episodes, trajectories or rollouts? Maybe look at the naming conventions of SB3.

OTOH I'd vote for calling them trajectories, although I'm happy to stick with an SB3 naming convention if it exists. Trajectories feels like the most general one. I don't think "rollouts from a human" makes much sense, for example. And in theory we might not even always have an episodic environment (https://github.com/HumanCompatibleAI/imitation/issues/575), so calling them episodes is odd.

Demonstrations should only be used as a parameter name but with the qualification of the type. So use demonstration_transitions and demonstration_trajectories/demonstration_rollouts ...

I'd personally lean towards just calling them demonstrations, since the type annotation usually makes it clear what types they can accept. Also, any algo that can take transitions can also take trajectories (as we can flatten trajectories into transitions), and it feels a bit odd to say func(demonstrations_transitions=expert_trajectories).
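As a rough sketch of what I mean, a single `demonstrations` parameter can carry a union type annotation and be normalized internally. Everything named here (`Trajectory`, `Transition`, `Demonstrations`, `as_transitions`, `train_bc`) is hypothetical and only illustrates the naming idea; it is not the current API.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union

# Hypothetical single-step type: (obs, act, next_obs).
Transition = Tuple[int, int, int]


@dataclass(frozen=True)
class Trajectory:
    """A whole episode with len(obs) == len(acts) + 1 (hypothetical type)."""

    obs: Sequence[int]
    acts: Sequence[int]


# The annotation, not the parameter name, says which data types are accepted.
Demonstrations = Union[Sequence[Trajectory], Sequence[Transition]]


def as_transitions(demonstrations: Demonstrations) -> List[Transition]:
    """Flatten any trajectories; pass existing transitions through unchanged."""
    out: List[Transition] = []
    for demo in demonstrations:
        if isinstance(demo, Trajectory):
            out.extend(zip(demo.obs[:-1], demo.acts, demo.obs[1:]))
        else:
            out.append(demo)
    return out


def train_bc(demonstrations: Demonstrations) -> int:
    """Stand-in for a transition-based learner; returns the sample count."""
    return len(as_transitions(demonstrations))


expert_trajectories = [Trajectory(obs=[0, 1, 2], acts=[7, 8])]
n_samples = train_bc(demonstrations=expert_trajectories)
```

With this shape the call site reads `train_bc(demonstrations=expert_trajectories)`, which avoids the mismatch between the keyword and the argument.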

Nov 14 '22 21:11 AdamGleave

SB3 uses the terms trajectory and rollout, but a rollout sometimes refers to something generated by a learned model. So I think trajectory is less ambiguous.

I agree that type annotations should be enough to distinguish between different types of demonstrations.

Nov 16 '22 11:11 ernestum