Input Actions

Posted On: 2019-11-25

By Mark

Over the past couple of months, I've been working on a substantial piece of technical debt: specifically, a set of remappable controls for the game. When I first confronted this problem (back in April), I chose to defer implementing such a system until a later date, both because of the significant scope of the effort and because of the state of Unity's input system. At the time, a prototype for a new input system was in the works, so there was a lot of uncertainty around what input handling would look like in future Unity versions.

Since then, Unity's new input system has reached 1.0 preview status and is planned to become fully integrated into Unity 2020.1. As such, it seemed like a natural time to put it to the test and see how it fares, both in ease of development and in its ability to resolve the many issues associated with building a robust user experience.

Action Abstraction

At the heart of Unity's new Input System is the concept of an action: a representation of what the user intended to do by pressing the button/key/whatever. This abstraction is quite powerful: rather than mapping input from individual buttons (such as the "A" button on a controller), developers can map input based on the player's intention (such as the "Jump" action). By using actions*, gameplay code becomes both clearer and more resilient to input remapping, because it can be written in terms that make sense for the gameplay (how much force to add when "jump" is performed, what state to move the character to when the "primary attack" is performed, etc.).
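To make the idea concrete, here is a minimal sketch of an action layer, written in Python rather than against Unity's C# API. Every name in it (Action, ActionMap, Character, try_to_jump) is hypothetical, and the control strings are only illustrative labels for physical inputs. The point it tries to show is that gameplay code subscribes to "Jump" and never mentions a key or button, so rebinding does not touch it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    """A named player intention such as "Jump" (hypothetical model, not Unity's class)."""
    name: str
    bindings: List[str] = field(default_factory=list)
    listeners: List[Callable[[], None]] = field(default_factory=list)

    def perform(self) -> None:
        # Notify every piece of gameplay code that subscribed to this action.
        for listener in self.listeners:
            listener()


class ActionMap:
    """Routes raw control names to the actions bound to them."""

    def __init__(self) -> None:
        self._actions: Dict[str, Action] = {}

    def add(self, action: Action) -> Action:
        self._actions[action.name] = action
        return action

    def handle_input(self, control: str) -> None:
        # Perform every action that has a binding for the pressed control.
        for action in self._actions.values():
            if control in action.bindings:
                action.perform()

    def rebind(self, action_name: str, old: str, new: str) -> None:
        # Swap one binding in place; listeners are untouched.
        bindings = self._actions[action_name].bindings
        bindings[bindings.index(old)] = new


class Character:
    def __init__(self) -> None:
        self.jump_count = 0

    def try_to_jump(self) -> None:
        # Gameplay logic is expressed purely in terms of the action.
        self.jump_count += 1


controls = ActionMap()
jump = controls.add(Action("Jump", bindings=["<Keyboard>/z", "<Gamepad>/buttonSouth"]))
hero = Character()
jump.listeners.append(hero.try_to_jump)

controls.handle_input("<Gamepad>/buttonSouth")            # jump via controller
controls.rebind("Jump", "<Keyboard>/z", "<Keyboard>/space")
controls.handle_input("<Keyboard>/space")                 # jump via the new key
```

After the rebind, the old "Z" key no longer triggers the jump, yet Character was never edited.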

Working with actions has been quite satisfying. I tend to think of character behavior in terms of affordances (e.g. can they jump) and the realization of those affordances (e.g. trying to jump), and I have found that actions represent both quite nicely. Additionally, actions are typically encouraged to be used in conjunction with events, which I personally find easier to read and maintain (e.g. when the "jump" action is performed, execute the "TryToJump" method). While using events to handle input could potentially lead to race conditions (e.g. if two conflicting actions are performed at the same time), I think this might actually be a benefit, as it encourages designing actions in a way that is not mutually exclusive**. Additionally, since polling for input is also available, one could use polling instead of events wherever a race condition is a real risk.
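The trade-off between events and polling can be sketched as follows. This is a toy Python model, not Unity's API: Action, Character, simulate_events, and simulate_polling are all hypothetical names, and the rule that "attack beats jump" is an arbitrary assumption chosen only to show where the priority decision lives. With events, the final state depends on the order in which the callbacks fire; with polling, the update function decides.

```python
from typing import Callable, List


class Action:
    """Hypothetical action that supports both callbacks and per-frame polling."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._listeners: List[Callable[[], None]] = []
        self._performed_this_frame = False

    def subscribe(self, callback: Callable[[], None]) -> None:
        self._listeners.append(callback)

    def trigger(self) -> None:
        self._performed_this_frame = True
        for callback in self._listeners:
            callback()

    def was_performed_this_frame(self) -> bool:
        return self._performed_this_frame


class Character:
    def __init__(self) -> None:
        self.state = "idle"

    def try_to_jump(self) -> None:
        self.state = "jumping"

    def try_to_attack(self) -> None:
        self.state = "attacking"


def simulate_events(order: List[str]) -> str:
    """Event-driven: whichever callback runs last determines the state."""
    hero = Character()
    actions = {"jump": Action("jump"), "attack": Action("attack")}
    actions["jump"].subscribe(hero.try_to_jump)
    actions["attack"].subscribe(hero.try_to_attack)
    for name in order:
        actions[name].trigger()
    return hero.state


def simulate_polling(order: List[str]) -> str:
    """Polled: the update step applies an explicit priority once per frame."""
    hero = Character()
    actions = {"jump": Action("jump"), "attack": Action("attack")}
    for name in order:
        actions[name].trigger()  # no listeners; this only records the input
    # Assumed rule for illustration: attack takes priority over jump.
    if actions["attack"].was_performed_this_frame():
        hero.try_to_attack()
    elif actions["jump"].was_performed_this_frame():
        hero.try_to_jump()
    return hero.state
```

Calling simulate_events with the two inputs in either order yields different final states, whereas simulate_polling yields the same state both times. Designing the two actions so they are not mutually exclusive would make the ordering irrelevant in both versions.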

* The new input system is not the only way to have such an abstraction - it is likely that many developers have developed similar abstractions to decouple gameplay from input code. That being said, having the input system provide this abstraction by default is a boon for me, personally, as I don't want the overhead associated with building and maintaining the input abstraction code myself (not least because my inexperience with input edge cases means the abstractions I would make myself might need rework down the line.)
** Based on my own experience and observations, the state-machine logic of "you can't do X because you're already doing Y" is a barrier that makes it hard for some players to wrap their heads around gameplay. While not necessarily right for every game, I think using non-exclusive models for representing character behavior could help more people get to a place where they can enjoy games.

Binding Pains

Coding gameplay to use actions has (so far) been relatively* simple and effective; however, actions present some interesting challenges when it comes to displaying the correct button prompts. Each action is associated with one or more bindings, each of which "binds" the action to a specific input source (such as a particular button press). This mechanism makes it possible to have one action controlled by different bindings for different devices (maybe the "Z" key on the keyboard and the "A" button on the controller), which generally simplifies coding controls for a variety of input devices. For many actions, one could assume a single binding per device (to make it easier to display the right prompt). There is, however, one particular use case that is vastly more complex, involving not just multiple bindings for a device but several advanced kinds of bindings as well.

Movement - up, down, left, and right - introduces quite a bit more complexity into bindings. Taken as a single whole, the "movement" action maps neatly onto an analog stick, but maps much less neatly onto keyboard inputs. Composite bindings (a single binding made up of several others) are required in order to turn four different inputs (such as the arrow keys on the keyboard) into a single binding to be used with the action.
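As a rough illustration of what a composite binding does, the sketch below combines four keys into a single 2D movement value. It is a Python model under stated assumptions, not Unity's implementation: the function name composite_2d and the key names are hypothetical, and normalizing diagonals to unit length is a design choice made here so that diagonal movement is not faster than cardinal movement.

```python
import math
from typing import Iterable, Tuple


def composite_2d(
    pressed: Iterable[str],
    up: str = "upArrow",
    down: str = "downArrow",
    left: str = "leftArrow",
    right: str = "rightArrow",
) -> Tuple[float, float]:
    """Combine four digital inputs into one (x, y) movement vector."""
    held = set(pressed)
    # Opposing keys cancel each other out on their axis.
    x = int(right in held) - int(left in held)
    y = int(up in held) - int(down in held)
    length = math.hypot(x, y)
    if length > 1.0:
        # Diagonal input: scale back to unit length.
        return (x / length, y / length)
    return (float(x), float(y))
```

The "movement" action then receives the same kind of value whether it came from this composite or from an analog stick, which is what lets gameplay code stay unaware of the device.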

Conversely, if the four distinct directions (up, down, left, and right) are represented as four separate actions, the keyboard maps nicely, but analog sticks are the ones that require advanced bindings. Specifically, an analog stick is a single binding (the X and Y position of the stick), but that binding can be decomposed into "synthetic" bindings that convert the one input into four different views of the same data (e.g. a "left" binding on the analog stick that represents negative X-axis values).
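The decomposition described above can be sketched like this. Again, this is a Python model rather than Unity code: split_stick, is_pressed, and the 0.5 press point are assumptions made for illustration. Each direction only sees the portion of the stick's position that points its way, and a threshold turns that analog value into a button-like press.

```python
from typing import Dict


def split_stick(x: float, y: float) -> Dict[str, float]:
    """Expose one stick position as four one-directional values in [0, 1]."""
    return {
        "left": max(0.0, -x),   # negative X-axis values
        "right": max(0.0, x),   # positive X-axis values
        "up": max(0.0, y),      # positive Y-axis values
        "down": max(0.0, -y),   # negative Y-axis values
    }


def is_pressed(value: float, press_point: float = 0.5) -> bool:
    """Treat a directional value as a button once it passes the press point."""
    return value >= press_point
```

With the stick pushed mostly to the left, only the "left" view crosses the press point, so a separate "move left" action can be driven by the stick just as it would be by a key.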

Lastly, supporting multiple bindings on a single device is likely to be more important for movement than for any other action. Many gamepads feature both an analog stick and a directional pad, and which one the player tries first will vary with both the controller and the player's preferences. For games that support it, mapping movement to both the stick and the d-pad will likely increase the number of players who can play without modifying the controls, thereby reducing the friction those players face. Unfortunately, the downside of supporting multiple bindings is that it is no longer clear which one should be displayed as a controller prompt (a problem that becomes much more difficult if players have modified one or both of the bindings).
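One possible way to approach the prompt question is sketched below. This is only a hypothetical heuristic, not something the input system provides and not necessarily the approach used in the game: among the bindings for the device the player is currently using, prefer the one they used most recently, and otherwise fall back to the first one listed. The names Binding and pick_prompt are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Binding:
    device: str    # e.g. "Gamepad" or "Keyboard"
    control: str   # e.g. "leftStick" or "dpad"


def pick_prompt(
    bindings: List[Binding],
    active_device: str,
    last_used: Optional[Binding] = None,
) -> Optional[Binding]:
    """Choose which binding to show as the on-screen prompt for an action."""
    # Only bindings for the device in the player's hands are relevant.
    candidates = [b for b in bindings if b.device == active_device]
    if not candidates:
        return None
    # Prefer whatever the player actually used last on this device.
    if last_used in candidates:
        return last_used
    # Otherwise fall back to the first binding in the list.
    return candidates[0]
```

Because the function works on whatever bindings are currently in the list, it keeps working after the player rebinds one or both of them, although it does not settle how a composite binding should be drawn as a single prompt.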

Gradual Progress

As one may infer from the "Binding Pains" mentioned above, I am currently working my way through the process of displaying the correct button prompts for each action. Doing so is, unfortunately, making me aware of incorrect assumptions I made while working on the rebinding UI, so I expect I may need to revisit that code as well. This situation (discovering that earlier assumptions were incorrect, or perhaps simply imprecise) has been something of a recurring experience throughout these past two months of working on input rebinding. At the outset, I expected everything would take about a month to complete, and each time I have stopped to look at what was left, I have estimated that about a month of work remained. Realistically, there's probably more than a month still left to do, though it's pretty clear at this point that my estimates on this topic should be taken with a grain of salt*.

Conclusion

While it remains to be seen how much longer I will be working on this, I think it will be worthwhile to provide a more complete write-up of everything I've learned at the end. If there's interest, I am also considering writing a tutorial detailing how to implement the new input system, complete with rebinding and button prompts. If that's something you'd like to see, please let me know.