Interface Paradigms Under Augmented Reality

A number of interaction models for augmented reality interfaces have been hypothesized, and in some cases prototyped, in an attempt to address the lack of a physical interface peripheral (keyboard, mouse, touchscreen, buttons, &c.). These typically fall into broad categories:

Contrived Peripherals

These models propose new interface peripherals designed for augmented reality and other applications without a clearly defined computing appliance. They often take the form of gloves or other hand-mounted systems. Hideyuki Ando et al. propose the most outlandish solution in a disappointingly brief release: a vibro-haptic fingernail appliance. Jessica Speir et al. designed an interesting fabric-based wristband and glove to test user preference for one- or two-handed operation. Masahiro Toyoura et al. developed a haptic glove for use in AR, which used electrically stimulated contracting threads to create a perception of physical contact with projected objects.

Tongue-based input devices are targeted primarily at quadriplegic patients for use as spellers in speech generation. Kencana and Heng, meanwhile, prototyped a tongue-driven extra-oral input device for the severely disabled. Caltenco, Struijk and colleagues document an intra-oral tongue-driven input device in one of a series of research articles for Caltenco's PhD dissertation.

Other intra-oral devices have been variously proposed for broader consumer adoption as supplemental input methods for head-up displays and implanted computer systems. As early as 1957, intra-oral interfaces were imagined by science fiction authors.

He pressed hard with his tongue against his right upper first molar. The operation that had transformed half his body into an electronic machine, had located the control switchboard in his teeth. Foyle pressed a tooth with his tongue and the peripheral cells of his retina were excited into emitting a soft light. He looked down two pale beams at the corpse of a man.

— Bester, A. (1957) The Stars My Destination. London: Signet.

Appropriated Peripherals

Due to the limitations of mechanical feedback devices, a number of researchers have proposed the opportunistic appropriation of local environment objects as props for simulation objects. Early work in this vein focused on attaching virtual interface elements to physical objects. Chen et al.'s briefly described "memoicon" system, Cheng et al.'s "iCon", and Corsten et al.'s "Fillables" all present similar variations on the user appropriation of physical objects as interface items, particularly using fiducials for tracking.

Demonstration video of Corsten et al.'s "Fillables" system.
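
The tracking layer common to these systems can be approximated with an off-the-shelf fiducial library. Below is a minimal sketch, assuming OpenCV's ArUco module (4.7+ API) and a hypothetical marker-to-widget mapping; the cited systems each implement their own pipelines.

```python
# Minimal sketch: detecting fiducial markers on appropriated objects with
# OpenCV's ArUco module (>= 4.7 API). The marker-to-widget mapping is
# hypothetical; the cited systems use their own tracking pipelines.
import cv2

# Hypothetical mapping from marker IDs to virtual interface items.
MARKER_TO_WIDGET = {7: "volume-knob", 12: "play-button"}

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

cap = cv2.VideoCapture(0)  # a webcam stands in for the AR headset camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    corners, ids, _ = detector.detectMarkers(gray)
    if ids is not None:
        for marker_id, quad in zip(ids.flatten(), corners):
            widget = MARKER_TO_WIDGET.get(int(marker_id))
            if widget:
                # Anchor the widget's render position to the marker's
                # centre in image space.
                cx, cy = quad[0].mean(axis=0)
                print(f"render {widget} at ({cx:.0f}, {cy:.0f})")
cap.release()
```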

More generally, other authors propose the less awkward and more opportunistic appropriation of objects as specific physical controls. Walsh, von Itzstein, and Thomas describe using physical objects to supplement AR sliders, knobs, and buttons. Corsten et al. present a system for programming physical objects as individual input supplements, for example linking a retractable pen click to a keyboard button for use as an ad-hoc slideshow presentation remote. Heun, Kasahara, and Maes take this idea one step further and propose a comprehensive system for programming and linking physical objects with AR controls.
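
The remapping idea reduces to a small binding registry between sensed object events and virtual commands. The sketch below is illustrative only, with hypothetical event names and callbacks; it is not Corsten et al.'s implementation.

```python
# Illustrative sketch: a registry that binds events sensed from physical
# objects to virtual commands, e.g. a pen click driving "next slide".
# Event names and callbacks are hypothetical.
from typing import Callable, Dict

class ObjectInputMapper:
    def __init__(self) -> None:
        self._bindings: Dict[str, Callable[[], None]] = {}

    def bind(self, object_event: str, command: Callable[[], None]) -> None:
        """Associate a sensed physical-object event with a command."""
        self._bindings[object_event] = command

    def dispatch(self, object_event: str) -> None:
        """Fire the bound command when the sensing layer reports an event."""
        command = self._bindings.get(object_event)
        if command:
            command()

mapper = ObjectInputMapper()
mapper.bind("pen:click", lambda: print("advance slide"))   # ad-hoc remote
mapper.bind("mug:rotate", lambda: print("adjust volume"))

# A sensing layer (vision, IMU tag, etc.) would call this on detection:
mapper.dispatch("pen:click")
```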

At the lowest order of environment manipulation, Todi et al. designed an augmented reality game utilizing real-world objects in their native form as game elements. Finally, the most credible generic AR prop appropriation scheme is developed by Hettiarachchi and Wigdor, allowing AR applications to opportunistically annex physical environment objects with physical characteristics similar to their virtual counterparts, and to adjust the virtual representation to better fit the physical prop. For example, a salt-shaker is appropriated as a graspable prop for a flashlight, so that the AR user has a credible physical object to interact with.
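
The matching step can be pictured as scoring scanned physical candidates against a virtual object by geometric similarity and picking the best proxy. The sketch below is a hedged approximation; the feature set and scoring function are illustrative assumptions, not Hettiarachchi and Wigdor's actual method.

```python
# Hedged sketch of the proxy-matching idea: score scanned physical objects
# against a virtual object by simple geometric similarity, then pick the
# best as a graspable stand-in. Features and weights are illustrative.
from dataclasses import dataclass

@dataclass
class ObjectShape:
    name: str
    width_cm: float
    height_cm: float
    depth_cm: float

def similarity(virtual: ObjectShape, physical: ObjectShape) -> float:
    """Inverse of summed relative dimension error; higher is better."""
    error = sum(
        abs(v - p) / max(v, p)
        for v, p in [
            (virtual.width_cm, physical.width_cm),
            (virtual.height_cm, physical.height_cm),
            (virtual.depth_cm, physical.depth_cm),
        ]
    )
    return 1.0 / (1.0 + error)

flashlight = ObjectShape("flashlight", 4, 18, 4)
candidates = [
    ObjectShape("salt-shaker", 4, 12, 4),
    ObjectShape("book", 15, 21, 3),
]
proxy = max(candidates, key=lambda c: similarity(flashlight, c))
print(proxy.name)  # salt-shaker: closest in graspable dimensions
```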

Gesture Recognition

The typical problem with dedicated input devices is that they must be brought along, where they interfere with daily tasks. Either they are bulky and must be lugged around, like dedicated controllers of any kind, or they are uncomfortable and get in the way of daily activities, like gloves and hand-mounted systems. These input peripherals detract from many of the benefits of augmented reality, and do not fit the paradigm of increasing interface directness.

One potentially promising solution to these problems is gesture-based interfaces. These typically involve camera systems for primitive motion capture, taking advantage of the AR system cameras that are already used to map the environment and position augment projections. Some recognize static hand positions, such as Rautaray and Agrawal's desktop webcam solution, while others feature moving gesture recognition.
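
As a concrete illustration of camera-based recognition, the sketch below detects a point-and-pinch style selection gesture from hand landmarks. It assumes MediaPipe's legacy Hands solution and an arbitrary pinch threshold; none of the cited systems necessarily use this stack.

```python
# Minimal pinch-detection sketch using MediaPipe's legacy Hands solution
# (an assumed stack, not the cited systems' pipelines). A "pinch" fires
# when the thumb tip (landmark 4) nears the index tip (landmark 8).
import math
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.6)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        thumb, index = lm[4], lm[8]
        # Landmarks are normalized to [0, 1]; the threshold is an assumption.
        if math.dist((thumb.x, thumb.y), (index.x, index.y)) < 0.05:
            print("pinch: select")  # stands in for a mouse click
cap.release()
```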

However, there are a number of problems with entirely gesture-based interaction models for augmented reality. First and foremost of these is fatigue, which results from having to make large or clumsy mid-air gestures for what would normally be very small movement tasks (e.g. mouse clicks for selection). Consider the footage below from a HoloLens user as a prime example of how clumsy and physically demanding a basic interaction task such as text entry becomes with the generic point-and-pinch gesture input method provided by the device.

The awkwardness of generic gesture recognition as a replacement for point-and-click user interfaces is demonstrated by Reddit user /u/drakfyre.

Dezfuli et al. describe the fatigue drawback of mid-air 3D gesture input and have designed a palm-based gesture interaction model for TV remote control that addresses this.

Another major problem with mid-air gesture interfaces is credibility when interacting with synthesized objects. The lack of physical presence presents significant challenges for motor learning and dexterity, as well as interaction credibility. Cothros, Wong, and Gribble found that limited visual cues had a significant impact on motor learning in virtual tasks, indicating that novel motor tasks for gesture interfaces may be significantly impacted by environmental context. Sato et al. developed a visual softness cue to improve bare-hand (null-haptic) grasping tasks with augmented reality objects. Chen et al. propose an electromyogram-assisted biofeedback system to simulate the experience of physical exertion without the need for mechanical feedback devices (gloves, actuators, suits, etc.) or props.

Even setting aside these significant mechanical challenges to the operator, gesture recognition suffers from limited accuracy and scope in motion capture. If a camera system is used, only gestures within its field of view can be recognized. If a glove or other device is worn, it interferes with all manner of daily tasks. Neither solution is viable for wearable computer systems, augmented reality or otherwise. Needless to say, traditional computer input methods are simply not viable when the user is in motion.

Neural Interfaces

A potential solution to both challenges, interaction without physical interfaces and the credibility of physical interactions with virtual objects, can be found in direct neural interfaces that hijack the body's own sensorimotor system.

Noninvasive systems for neuroimaging are generally clunky and low-resolution. Bleichner et al. developed a miniaturized EEG electrode system that could be comfortably worn throughout the day. Unfortunately, noninvasive systems lack the definition to be useful for dextrous interactions.

The most straightforward way to tap into the nervous system is by intercepting an existing motor nerve close to its terminus. Rossini et al. document the development and surgical installation of a prosthetic implant control for amputees. The system learns the patterns of nerve firing that correlate with trained grasping, and is used to control a one-dimensional grasping hand prosthetic.
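
The pattern-learning step can be illustrated with a toy classifier over windowed recordings. The sketch below uses synthetic data, simple amplitude features, and scikit-learn's logistic regression, none of which reflect Rossini et al.'s actual pipeline.

```python
# Illustrative sketch of the pattern-learning step (not Rossini et al.'s
# pipeline): classify windowed neural recordings into grasp / rest using
# simple amplitude features, then map the prediction to a one-dimensional
# hand-closure command. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def features(window: np.ndarray) -> np.ndarray:
    """Root-mean-square and mean absolute value of a recording window."""
    return np.array([np.sqrt(np.mean(window ** 2)), np.mean(np.abs(window))])

# Synthetic stand-in for labeled nerve recordings: grasp windows are
# higher-amplitude than rest windows.
rest = [rng.normal(0.0, 0.2, 200) for _ in range(50)]
grasp = [rng.normal(0.0, 1.0, 200) for _ in range(50)]
X = np.array([features(w) for w in rest + grasp])
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression().fit(X, y)

new_window = rng.normal(0.0, 1.0, 200)
closure = clf.predict_proba(features(new_window).reshape(1, -1))[0, 1]
print(f"hand closure command: {closure:.2f}")  # 0 = open, 1 = closed
```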

Cutting-edge neural interfaces are fairly invasive electrode arrays capable of both sensing and stimulating.