InputSystem [API Proposal] Global Input Actions and wrapper classes

Background and Motivation

This API proposal adds a simplification layer around Input Actions and introduces the concept of Global Actions, an input actions asset that comes pre-built with the package and sits in the InputManager asset in Project Settings (bit of an implementation detail but that seems the best place to put it). Global actions are enabled by default and remove the need for the user to a) remember to call Enable and b) create a new Input Action Asset before they can even get started with input (along with the requirement to understand all the concepts around Input Actions that go with that).

The API favours the polling approach to input by default and funnels users towards that type of usage. While the original Input Actions API and all their events are always accessible, they are another layer down, and none of the types in this proposal directly expose any events.

It also tries to make a distinction between the concept of Input Actions as an abstraction over a set of controls, and the interaction model of started -> performed -> cancelled, because this often seems to be a source of confusion.

Finally, this API removes the need for users to know what action type an action has. In the existing API, users must call the ReadValue<T> method to retrieve the value of the action, and if the wrong type is provided, a runtime exception is thrown. This is an easy way for uncaught errors to sneak into a product, as the type of an action can be changed in the asset at any time. Using source generators, this API proposes building strongly-typed wrappers around actions.
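
To make the failure mode concrete, here is a minimal sketch in plain C#. `UntypedAction` and `TypedInput<T>` are hypothetical stand-ins written for illustration, not Input System types. The stand-in only mimics the behaviour described above, where a read with the wrong type argument fails at run time rather than at compile time.

```csharp
using System;

// Hypothetical stand-in for an action whose value type is only known at run time.
public sealed class UntypedAction
{
    private readonly object m_Value;

    public UntypedAction(object value)
    {
        m_Value = value ?? throw new ArgumentNullException(nameof(value));
    }

    // The caller chooses T; a wrong choice is only detected when this runs.
    public T ReadValue<T>() where T : struct
    {
        if (m_Value is T typed)
            return typed;

        throw new InvalidOperationException(
            $"Cannot read a {typeof(T).Name} from an action of type {m_Value.GetType().Name}");
    }
}

// Hypothetical wrapper: T is fixed once, where the wrapper is declared,
// so call sites never spell out a type argument.
public sealed class TypedInput<T> where T : struct
{
    private readonly UntypedAction m_Action;

    public TypedInput(UntypedAction action)
    {
        m_Action = action ?? throw new ArgumentNullException(nameof(action));
    }

    public T value => m_Action.ReadValue<T>();
}
```

If a generator emitted the wrapper declaration from the asset, a change of action type in the asset would change `T` in generated code, and a mismatched call site would presumably surface as a compile error instead of an exception.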

Proposed API

namespace UnityEngine.InputSystem.HighLevel
{
  public static partial class Input
  {
    public static InputActionAsset globalAsset { get; }

    public static bool IsActionPressed(string actionName, string actionMapName = "");
    public static bool IsActionDown(string actionName, string actionMapName = "");
    public static bool IsActionUp(string actionName, string actionMapName = "");
  }

  public class Input<TActionType> where TActionType : struct
  {
    public InputAction action { get; }
    public bool isPressed { get; }
    public bool wasPressedThisFrame { get; }
    public bool wasReleasedThisFrame { get; }
    public TActionType value { get; }

    public Input(InputAction action);

    public bool WasPerformedThisFrame<TInteraction>() where TInteraction:IInputInteraction;
    public bool WasStartedThisFrame<TInteraction>() where TInteraction : IInputInteraction;
    public bool WasEndedThisFrame<TInteraction>() where TInteraction : IInputInteraction;

    public bool SetInteractionParameter<TInteraction, TParameter>(Expression<Func<TInteraction, TParameter>> expr, TParameter value);
    public bool TryGetInteractionParameter<TInteraction, TParameter>(Expression<Func<TInteraction, TParameter>> expr, out TParameter value);
    public bool AddInteraction(IInputInteraction interaction);
    public TInteraction AddInteraction<TInteraction>() where TInteraction : IInputInteraction, new();
    public bool RemoveInteraction<TInteraction>() where TInteraction : IInputInteraction;
    public bool RemoveInteraction(IInputInteraction interaction);

    // alternative API
    public void AddInteraction<TInteractionConfig>(TInteractionConfig config) where TInteractionConfig : IInputInteractionConfiguration;

    public static implicit operator bool(Input<TActionType> input);
    public static implicit operator InputAction(Input<TActionType> input);
  }

  public struct InputInteraction<TInteraction, TActionType> : IDisposable
      where TInteraction:IInputInteraction 
      where TActionType:struct
  {
    public bool wasPerformedThisFrame { get; }
    public bool wasStartedThisFrame { get; }
    public bool wasCancelledThisFrame { get; }

    public InputInteraction(Input<TActionType> input);
  }

  public partial class InputSettings : ScriptableObject
  {
    public InputActionAsset globalInputActionsAsset { get; set; }
    public bool disableGlobalInputActions { get; set; }
  }
}

API Usage

Increase a value while an action is pressed

  public class ChargedFire : MonoBehaviour
  {
    private float m_StartTime = 0;

    public void Update()
    {
      if (Input.IsActionPressed("Fire"))
      {
        m_StartTime = Time.time;
      }

      if (Input.IsActionUp("Fire"))
      {
        Fire(Time.time - m_StartTime);
        m_StartTime = 0;
      }
    }

    private void Fire(float chargeTime)
    {
    }
  }

Perform a gameplay action when a strongly-typed input is pressed

  public class Jump : MonoBehaviour
  {
    public void Update()
    {
      if (InputActions.jump)
      {
        GetComponent<Rigidbody>().AddForce(Vector3.up, ForceMode.Impulse);
      }
    }
  }

Use source generated interactions

  public class FireWeapon : MonoBehaviour
  {
    public void Update()
    {
      if (InputActions.fire.holdInteraction.wasPerformedThisFrame)
      {
        InputActions.fire.TryGetInteractionParameter((HoldInteraction x) => x.duration, out var duration);
        Fire(duration);
      }

      if (InputActions.fire.pressInteraction.wasPerformedThisFrame)
        Fire(0);
    }

    public void Fire(float chargeTime)
    {
    }
  }

Change the value of an existing interaction parameter

  public class ChangeInteractionParameter : MonoBehaviour
  {
    public void Update()
    {
      InputActions.chargedFire.TryGetInteractionParameter((HoldInteraction x) => x.duration, out var duration);
      InputActions.chargedFire.SetInteractionParameter((HoldInteraction x) => x.duration,
        duration + 0.1f);
    }
  }

Source generated strongly-typed input action access example

  public static class InputActions
  {
    public static Input<Vector2> move => new Input<Vector2>(globalAsset.FindAction("Gameplay/Move"));
    public static FireInput fire => new FireInput(globalAsset.FindAction("Gameplay/Fire"));
    public static Input<float> join => new Input<float>(globalAsset.FindAction("Player/Join"));
    public static Input<Vector2> navigate => new Input<Vector2>(globalAsset.FindAction("UI/Navigate"));
  
    public class FireInput : Input<Vector2>
    {
      // Type arguments follow the declaration order InputInteraction<TInteraction, TActionType>.
      public InputInteraction<HoldInteraction, Vector2> holdInteraction;
      public InputInteraction<PressInteraction, Vector2> pressInteraction;

      internal FireInput(InputAction action) : base(action)
      {
        holdInteraction = new InputInteraction<HoldInteraction, Vector2>(this);
        pressInteraction = new InputInteraction<PressInteraction, Vector2>(this);
      }
    }
  }

Source generated XMLDOC showing bindings and interactions on an action

XMLDOCExample

Notes

Source generated actions can have the action map name prepended if there is a collision in naming.
As part of this work, the binding logic should perhaps be improved so that errors are thrown during binding resolution if an incompatible control type gets bound to an action, for example a button binding on an action of type Vector2. Right now, the errors are only thrown when ReadValue is called, so it would be easy for that error to sneak into a game.
The global action asset setting in InputSettings will require some custom build pipeline work because all the source generated code has to work against it.
InputInteraction implements IDisposable so that we can remove performed/started/cancelled event handlers. Dispose should be called by the main Input class when the game shuts down or on exiting play mode.
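
The naming rule in the first note can be sketched as follows. This is only an illustration of one possible rule, not the generator's actual logic: `GeneratedNames` and its helpers are hypothetical, and the casing convention (lower camel case, map name in front) is an assumption based on the property names used in the examples above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: derives one generated property name per (map, action) pair.
public static class GeneratedNames
{
    public static List<string> Resolve(IEnumerable<(string map, string action)> actions)
    {
        var list = actions.ToList();

        // Action names that appear in more than one action map (case-insensitive).
        var colliding = new HashSet<string>(
            list.GroupBy(a => a.action, StringComparer.OrdinalIgnoreCase)
                .Where(g => g.Count() > 1)
                .Select(g => g.Key),
            StringComparer.OrdinalIgnoreCase);

        // Prefix with the map name only when the bare action name is ambiguous.
        return list
            .Select(a => colliding.Contains(a.action)
                ? LowerFirst(a.map) + UpperFirst(a.action)
                : LowerFirst(a.action))
            .ToList();
    }

    private static string LowerFirst(string s) =>
        string.IsNullOrEmpty(s) ? s : char.ToLowerInvariant(s[0]) + s.Substring(1);

    private static string UpperFirst(string s) =>
        string.IsNullOrEmpty(s) ? s : char.ToUpperInvariant(s[0]) + s.Substring(1);
}
```

One consequence of a rule like this is that adding a second action with the same name in another map renames the generated property of the first one, which would break existing call sites.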

Risks

The AddInteraction implementation could be very difficult. Currently the system creates interaction instances dynamically via Activator.CreateInstance from configuration data stored in string format e.g. "Hold(duration=0.5);Press(behaviour=1)" etc. This API though would allow injection of a specific instance of an interaction into the binding state to allow interaction parameters to be changed by direct modification of the instance.
The constructors for Input<TValue> and InputInteraction classes need to be public because they will be instantiated from source generated code in a user assembly. It would therefore be possible to wrap an input action in a non-type-safe way. We can add debug asserts at runtime, but that's not ideal.

Jul 21 '22 09:07 andrew-oc

Source generated XMLDOC showing bindings and interactions on an action

I assume we can only generate this for the strongly-typed wrappers on the built in actions. I'm slightly concerned we will set a precedent for users to expect this to work on their own actions added via assets.

But maybe I'm wrong in my assumption because these strongly-typed wrappers don't exist for user assets. Perhaps that will also be an expectation (could they be generated there too).

Aug 12 '22 17:08 lyndon-unity

if (Input.IsActionUp("Fire"))

I find the naming IsActionUp slightly confusing. Is this on the up edge or just not pressed in general? I think it's the up edge, but I'm wondering if a different name would reduce confusion.

E.g. IsActionReleased

Aug 12 '22 17:08 lyndon-unity

InputActions.fire.TryGetInteractionParameter((HoldInteraction x) => x.duration, out var duration);

This is quite a long function name - would TryGetInteraction be sufficient? Ignore me - this obviously clashes with the other term :(

Aug 12 '22 17:08 lyndon-unity

Source generated XMLDOC showing bindings and interactions on an action

I assume we can only generate this for the strongly-typed wrappers on the built in actions. I'm slightly concerned we will set a precedent for users to expect this to work on their own actions added via assets.

But maybe I'm wrong in my assumption because these strongly-typed wrappers don't exist for user assets. Perhaps that will also be an expectation (could they be generated there too).

I don't think it would really be possible to do this for user assets. The problem is that I don't know how we would tell the source generator what asset to look at. In some cases, the asset might not even be assigned before the player is built! The global asset exists in one well-defined and immutable location. The only concession I was able to find that might mitigate some of this is that the project settings editor will have a slot for replacing the global asset with an arbitrary other one, and then the source generator will work against that. I'm torn on this one as well though because if you change it half way through production, all the strongly typed references will cause compile errors.

if (Input.IsActionUp("Fire"))

I find the naming IsActionUp slightly confusing. Is this on the up edge or just not pressed in general? I think it's the up edge, but I'm wondering if a different name would reduce confusion.

E.g. IsActionReleased

Yep, went round and round on this one :) IsActionUp/Down was ultimately chosen so that users feel some sense of familiarity when moving from the legacy input manager, where we have GetKeyDown/Up and GetButtonDown/Up, which are all true on the up or down edge for the frame they occur in. IsActionReleased is problematic because IsActionPressed seems like the right name for when the action is held, but it would then have different semantics from how released works. Maybe

IsActionPressed
IsActionReleased
IsActionHeld

would work, but then we lose the familiarity.
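
The semantic difference between the three names in the alternative above can be sketched with a small state tracker. `ButtonSampler` is a hypothetical helper, not an Input System type, and the mapping of names to behaviour (pressed and released as edges, held as state) is the reading suggested in this thread rather than a settled design.

```csharp
using System;

// Hypothetical helper: tracks one button's state across consecutive frames.
public sealed class ButtonSampler
{
    private bool m_Previous;
    private bool m_Current;

    // Call once per frame with the button's state for that frame.
    public void Update(bool isDownNow)
    {
        m_Previous = m_Current;
        m_Current = isDownNow;
    }

    // State: true on every frame in which the button is down.
    public bool IsHeld => m_Current;

    // Edge: true only on the frame where the button goes from up to down.
    public bool IsPressed => m_Current && !m_Previous;

    // Edge: true only on the frame where the button goes from down to up.
    public bool IsReleased => !m_Current && m_Previous;
}
```

Under this reading, a charge timer would start on the pressed edge and fire on the released edge, while the held state would be used for anything that should repeat every frame.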

Aug 15 '22 12:08 andrew-oc

Thanks for posting this proposal. I think the API surface is small and easy to grasp in general and the examples help illustrate usage. Starting off review with mainly questions to be iterated on to shed some light on some aspects I am not sure I interpret correctly.

Regarding Proposed API

Regarding Input<TActionType>: public bool isPressed { get; }, public bool wasPressedThisFrame { get; } and public bool wasReleasedThisFrame { get; }. Do these only apply to actions that behave as buttons? Does it make sense to include them like this, or does this call for an extended button-behaviour input type? isPressed, wasPressed and wasReleased may be confusing if applied as step-function evaluations in a non-button context. For example, what should be expected from InputActions.navigate.wasReleasedThisFrame or InputActions.navigate.wasPressedThisFrame? Is the intention to map these to actuation based on some threshold, or to generate an error?

An implicit bool operator is typically convenient and can collapse expressions, but what is the intended behavior here? Will it map to performed, check whether the InputAction is valid, etc.? I am also curious about the story behind supporting the implicit conversion there.

Just to make sure I interpret this correctly: as with the Unity Input API, this API is still down-sampled to frame granularity, right? E.g. wasPerformed, wasPressed and wasReleased may all be true in the same frame, and in reality each reflects e.g. wasPerformedAtLeastOnceThisFrame. That means this supports use cases where sub-frame ordering or multiple events per frame are not important.
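
If that interpretation holds, the down-sampling could look roughly like the sketch below. `FrameFlags` is a hypothetical accumulator written for illustration, not an Input System type, and it says nothing about how the package actually buffers events.

```csharp
using System;

// Hypothetical accumulator: collapses any number of events between two frames
// into "happened at least once this frame" flags.
public sealed class FrameFlags
{
    private bool m_PendingPressed;
    private bool m_PendingReleased;

    public bool WasPressedThisFrame { get; private set; }
    public bool WasReleasedThisFrame { get; private set; }

    // Called for each input event, possibly many times between two frames.
    public void OnEvent(bool pressed)
    {
        if (pressed)
            m_PendingPressed = true;
        else
            m_PendingReleased = true;
    }

    // Called once at the start of each frame: publish what happened, then reset.
    public void BeginFrame()
    {
        WasPressedThisFrame = m_PendingPressed;
        WasReleasedThisFrame = m_PendingReleased;
        m_PendingPressed = false;
        m_PendingReleased = false;
    }
}
```

A press, release and second press between two frames would leave both flags true for the next frame, with the order and count of the events lost.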

How is the interaction Add/Remove intended to be used? Is it intended for adding additional ways of interacting to trigger the action from code? I think a use-case example illustrating this part of the proposal would be good, since none currently exists.

Regarding examples

Related to "Increase a value while an action is pressed": how come IsActionDown isn't the right tool for this particular use case? Tying back to the discussion with @lyndon-unity above, I understand the problems tied to naming here and the history. In general I think it would be good to avoid mixing up "state names" and "event names". E.g. IsActionDown/IsActionUp reflect a current state, which makes sense; however, down/up implies something about the physical design of the button, which might be confusing if the button isn't a press button, although that is probably the most common scenario and the naming has been used before. To me, "isPressed" is the most confusing, since "press" sounds like an "event name" for the act of pushing the button from "up" to "down", i.e. an edge. Hence wasPressedThisFrame and wasReleasedThisFrame make sense as evaluating whether an event happened at least once during "this frame", but also having isPressed is confusing; maybe this should be "isDown"/"isActuated" or similar.

Regarding source generated actions

"Source generated actions can have the action map name prepended if there is a collision in naming." This would result in inconsistent naming between colliding names and non-colliding names. Would it make sense to always prefix by map name, for consistency between generated code and source asset? I know the access pattern will require a few extra characters, but in any modern IDE that also means the possibility to "drill down" efficiently if you have a large number of actions. E.g.

public static class InputActions
{
   public static class Game {
       public static Input<Vector2> move => new Input<Vector2>(globalAsset.FindAction("Gameplay/Move"));
       public static FireInput fire => new FireInput(globalAsset.FindAction("Gameplay/Fire"));
       public static Input<float> join => new Input<float>(globalAsset.FindAction("Player/Join"));
       public static Input<Vector2> navigate => new Input<Vector2>(globalAsset.FindAction("UI/Navigate"));
    }
    
    public static class UI { ... }
}

I noticed we call the containing source generated class InputActions in the example above. Is this name fixed, or does it reflect a user-defined name? I would assume fixed, but I am happy to learn the motivation behind it. The reason I mention this is that it might make sense to tie the name used here to the concept name we are going to use, and stay consistent to avoid confusion. I.e. if we choose to call this "Global Actions", maybe this class should be called GlobalActions. Is it envisioned to live within UnityEngine.InputSystem.HighLevel or somewhere else? I ask because a containing namespace can carry part of the name or intent, so we can avoid repeating it.

Regarding "polling"

The API favours the polling approach to input by default and funnels users towards that type of usage. While the original Input Actions API and all their events are always accessible, they are another layer down, and none of the types in this proposal directly expose any events.

Maybe we should avoid calling it "polling" when discussing this API, since I suspect the underlying implementation wouldn't go to the device to ask, but would rather operate on time-window buffers and not yield different results on each query. The term could easily open up for interpretation of what is under the hood here, e.g. wiki definition. Maybe it's less confusing to call it something like "query-based" or "request-based"... not sure, and this is maybe very minor. It is potentially fine to use "polling", but then we should be clear that it is not polling the underlying signal.

Aug 16 '22 06:08 ekcoh

Review meeting

Clarified that this is mainly an escape hatch (globalAsset) and support for versions without code generation.

  public static partial class Input
  {
    public static InputActionAsset globalAsset { get; }

    public static bool IsActionPressed(string actionName, string actionMapName = "");
    public static bool IsActionDown(string actionName, string actionMapName = "");
    public static bool IsActionUp(string actionName, string actionMapName = "");
  }

Discussed that it might be desirable to limit this to a single interaction, instead of allowing multiple interactions to be added that don't make logical sense together. Not clear how this could be achieved at this point.

Why do we need to add and remove interactions? This belongs to the authoring stage, and the use case seems to be lost; maybe try to define a clear use case for this, e.g. swapping to a weapon that has a hold interaction. But this could be achieved with different actions instead. We should define how we support such a use case in a good way. The point was made that since we are not using callbacks, the single source of truth could simply be conditional evaluation: basically just ask based on the selected weapon index. It was also mentioned that the AddInteraction API might not be possible to implement, and might have problems tied to rebinding.

Way forward: Remove AddInteraction/RemoveInteraction to limit options and guide users towards authoring tools instead.
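
The conditional-evaluation idea from the discussion above could be sketched like this. `WeaponKind` and `FireRule` are hypothetical names introduced for illustration. The two bool parameters stand for per-frame query results, such as what pressInteraction.wasPerformedThisFrame and holdInteraction.wasPerformedThisFrame would return on the proposal's generated wrappers.

```csharp
using System;

// Hypothetical weapon categories used to pick which interaction result matters.
public enum WeaponKind
{
    Pistol,
    ChargeRifle
}

public static class FireRule
{
    // Both interactions stay authored on the action; game code decides per frame
    // which result to act on, instead of adding or removing interactions at run time.
    public static bool ShouldFire(WeaponKind weapon, bool pressPerformed, bool holdPerformed)
    {
        switch (weapon)
        {
            case WeaponKind.ChargeRifle:
                return holdPerformed;
            case WeaponKind.Pistol:
                return pressPerformed;
            default:
                throw new ArgumentOutOfRangeException(nameof(weapon));
        }
    }
}
```

In an Update method this would be called once per frame with the currently selected weapon, so swapping weapons changes only the branch taken and leaves the action's authored interactions untouched.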

Aug 26 '22 13:08 ekcoh

It is unclear why InputInteraction<TInteraction, TActionType> should be used when the wrappers are strongly typed anyway; it might no longer be needed.

What if parts that the code depends on are removed in the editor? Then the code wouldn't compile. We might be able to detect dependencies and highlight the problem, or alert a user who is trying to remove or rename an action. Another way would be to support automatic updating of e.g. code references.

Aug 26 '22 14:08 ekcoh

This work has been discontinued.

Oct 27 '23 11:10 jamesmcgill
