Time to disengage: the Metaverse and human-computer interaction

Stanislav Stankovic
Published in UX Collective · Nov 4, 2022

Hugo Gernsback wearing a Head Mounted Display.

I am old enough to remember when Snow Crash by Neal Stephenson originally hit the bookstore shelves. It was the early ‘90s and I was in my teens. Our computers still squeaked out a sequence of high-pitched noises whenever they tried to connect to a network. But cyberpunk was a fresh new thing, the vision of our hyper-connected future, in which we were supposed to spend a good chunk of our lives online.

Fast forward thirty years. It is the year 2022, and Metaverse, the word coined by Mr. Stephenson, is now a product name. The vision of this cyberspace has been co-opted by one of the biggest global corporations. I guess we are living in the future we were promised.

This future is once again tied to some concepts that the technology developers have been tossing around for decades, namely VR goggles and immersive 3D environments, bundled with a slew of new controller devices. The allure of this vision is that it will redefine the way we interact with technology by making it more intuitive and, in the process, somehow, make us more productive.

By definition, any interactive system, including every VR environment, consists of an input device, the environment, and a display device, which together form a complete technological loop that allows the user to interact with said environment. The keyboard of my laptop, the text editor in which I am writing this text, and the laptop screen constitute one such system. For the purpose of producing this document they are quite adequate, and I feel totally happy using them. No wonder: they are the result of several decades of incremental improvements. Then again, they did not initially materialize in their current form. As with anything, there is space for innovation and even disruption. VR, AR, XR, etc. promise exactly that: disruption of the way we interact with our technology.
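As a minimal illustration of that loop, here is a sketch in Python. The function names are invented for this example and do not belong to any particular toolkit; the point is only the shape of the cycle: input device, environment, display.

```python
# A minimal sketch of the loop described above: an input device, the
# environment (application state), and a display, wired together. The
# function names here are invented for illustration, not any toolkit's API.

def interaction_loop(read_input, update_state, render):
    state = ""
    while True:
        event = read_input()                # the input device's side of the loop
        if event == "quit":
            break
        state = update_state(state, event)  # the environment reacts to the input
        render(state)                       # the display closes the loop

# Toy stand-ins: a scripted "keyboard", a text buffer, and a console "screen".
events = iter(["h", "i", "quit"])
interaction_loop(
    read_input=lambda: next(events),
    update_state=lambda text, key: text + key,
    render=lambda text: print(f"screen: {text}"),
)
```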

As always, the ultimate verdict on any such promise will be passed by the end users: the kind of ordinary folks who are not enamoured with the technology itself, but would rather use it as a means to an end. The quality of the user experience (UX) will make or break any such proposal. On the flip side, wide adoption of any new Human-Computer Interaction (HCI) paradigm will have important repercussions on the way we do UX design.

In this text, I am deliberately not going to talk about simulation sickness. It is a phenomenon that has grabbed a lot of attention from both critics and proponents of the tech, and one that a lot of development effort is already focused on.

Instead, I am going to talk about three other, equally important human factors related to the UX of these new exotic devices. These factors manifest themselves as deep conceptual problems in designing new methods of human-computer interaction. I believe that anyone working in this field should be aware of them, and I will try to explain why in the text that follows.

Time to Disengage

The first of these three notions is what is known in HCI as Time to Disengage. This cryptic name denotes something that is at once very simple and very counterintuitive. We build systems in order to interact with them. Ideally, UX is about making that interaction more comfortable, efficient, and intuitive. However, we continue to live in the physical world. Even while immersed in the cyberspace of our technology, we remain embedded in the real world. Seamless switching between the two is one of the most important yet most neglected tasks.

While I am typing this text, I am focused on my work. I am in the flow. I do my best to ignore my surroundings. However, things might happen in the real world that require my attention. It could be a very banal thing. At some point a colleague might ask me about something. My son might need help opening a jar of cookies. Or the electric kettle might start to boil. The cat might jump on the desk. All sorts of things might happen. The house might be on fire. The time to transfer my attention from the laptop screen to the real world around me is measured in milliseconds. All that is required is that I move my eyes in the direction of the possible distraction. Returning to the virtual world on the laptop screen requires the same amount of effort.

Contrast this with wearing VR goggles. The device effectively makes me blind to the real world. The time required to emerge from the virtual world into the real one can again be very brief. However, it requires significantly more effort. It requires that I use my hands to pull the device down from my eyes. If my hands are strapped into some fancy 3D controller, this adds additional seconds. Even if I am perfectly used to these actions, no matter how fast I am, this will always be an order of magnitude slower than glancing away from the screen. Getting back into the virtual world requires the same amount of effort, or even more.

Small, insignificant interruptions can thus become a major annoyance when using VR gear. This can limit the number of situations in which these devices are practical to use, which in turn limits their adoption by users.

Passive Haptic Feedback

Drawing with a ballpoint pen on a plain piece of printer paper feels different than painting with a watercolor brush on aquarelle paper. Coloring with a crayon feels different than using a marker. Cutting a piece of raw meat feels dramatically different than slicing bread or cutting a piece of cheese. This is something we take for granted, but it is actually extremely important.

The forces exerted by our muscles upon a tool are reflected back onto the nerve endings in our skin. This intricate interplay of forces is essential to our performance of various tasks. It is a self-reinforcing sensory-motor loop in which our mind adjusts our motions to perform a delicate task precisely. What we are sensing is Passive Haptic Feedback.

The sensory-motor loop in VR.

There is nothing specifically built into the handle of a knife to convey this force feedback to our fingers. What we are sensing is a result of the fundamental laws of physics: the passive resistance of a solid object, due to its inertia, to the application of an external force. Thus, this is a passive form of feedback. Each object, with its form, its composition, and the context of its use, gives off unique and recognizable haptic feedback.

Now consider gesture-based interfaces. They were all the rage on game consoles a decade and a half ago, when the Nintendo Wii and Microsoft Kinect ruled the world. They are now emerging again with a new crop of devices and software, such as Google Tilt Brush.

Google Tilt Brush.

These devices and applications try to emulate, to mimic, the behavior of various real-world tools while operating in a virtual world devoid of the laws of physics that our bodies can recognize. In the virtual world, you are handling an immaterial gun or paintbrush. The space between your fingers is void. There is no physical desk or canvas to lean against.

The virtual world lacks the appropriate passive haptic feedback. Sometimes this doesn’t actually matter. Sometimes, it is even to the user’s advantage. No one misses real recoil when shooting a virtual gun. The fun in most shooter games is not in a faithful representation of the properties of weapons. However, as we slide more and more towards so-called professional applications of VR and AR, these things will start to matter more and more. Painting something in the virtual void feels significantly different than sculpting something out of physical clay or wood.

These things matter. Working without any passive haptic feedback is akin to trying to work under local anesthesia. Digital image-processing and painting applications have for decades been evolving ways to mimic the important properties of drawing with various physical tools. A large part of this is about mimicking the effects of the appropriate passive haptic feedback. Photoshop has a sophisticated set of settings that lets the user adjust the speed with which virtual ink flows and spreads around the virtual tip of the virtual tool.
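To make this concrete, here is a toy sketch in Python of the kind of parameters such settings control. The names, defaults, and formula are invented for illustration; this is not how Photoshop actually models ink.

```python
from dataclasses import dataclass

@dataclass
class BrushSettings:
    # Hypothetical knobs, loosely analogous to the flow/spread settings
    # a painting application exposes. Names and defaults are made up.
    flow: float = 0.5            # 0..1, how fast ink leaves the tip
    spread: float = 4.0          # base radius of the deposit, in pixels
    pressure_gamma: float = 1.5  # how strongly stylus pressure drives flow

def ink_deposit(settings: BrushSettings, pressure: float,
                speed: float, dt: float) -> tuple:
    """Toy model: the (amount, radius) of ink deposited in one time step.

    Faster strokes leave less ink per step, while harder pressure both
    pushes more ink through the tip and widens the deposit.
    """
    amount = settings.flow * (pressure ** settings.pressure_gamma) * dt / (1.0 + speed)
    radius = settings.spread * (0.5 + pressure)
    return amount, radius

# A light, fast stroke vs. a slow, heavy one with the same brush:
print(ink_deposit(BrushSettings(), pressure=0.2, speed=3.0, dt=0.016))
print(ink_deposit(BrushSettings(), pressure=0.9, speed=0.5, dt=0.016))
```

Even this crude model captures the intuition the body expects from a physical tool: a fast, light stroke deposits less ink than a slow, heavy one.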

On the other hand, the human mind is a remarkably malleable entity. Humans are able to adjust to the properties of new tools that they encounter.

False Input

Leap Motion gesture interface.

I make mistakes when I type. I am a bit clumsy, and I use only three fingers on each hand to type. However, making a mistake while typing still requires me to actually press a physical button. There is no ambiguity in pressing a button. Once a button is pressed, whether by intention or by mistake, the system can execute whatever function the software has assigned to it.

If we go back to the gesture interface from our previous example, we will see that things are not always so simple. Sure, you can have a sophisticated system to recognize the user's hand gestures, but how will the system know whether the user is waving a hand to flip the pages of a virtual book or waving for some other, unrelated reason? Issuing a command by gesture can seem intuitive, but the simpler the gestures are, the harder it is for the system to tell whether a gesture was intentional or accidental. The more complex the gestures, the harder it will be for the user to execute them properly.

This problem is not insurmountable, but it is there. Some systems circumvent it by making users hold a small device equipped with a button; pressing the button signals intention. However, even this approach has its limitations. First, it is not a pure gesture approach: it still requires contact between the user and a dedicated device. Second, it does not help in distinguishing between two distinct yet similar inputs.
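Here is a minimal sketch of this kind of intent gating, assuming a recognizer that emits named gestures with a confidence score. Both the event shape and the threshold are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class GestureEvent:
    name: str          # e.g. "page_flip", as labeled by some upstream recognizer
    confidence: float  # the recognizer's 0..1 score for that label

def should_execute(event: GestureEvent, clutch_pressed: bool,
                   threshold: float = 0.8) -> bool:
    """Act on a gesture only if the user holds the clutch button AND the
    recognizer is confident enough.

    The clutch turns an ambient hand movement into a deliberate command;
    the threshold filters out low-confidence, likely accidental gestures.
    Note what this cannot do: if two intentional gestures look alike, both
    will pass this gate, so the ambiguity between them remains.
    """
    return clutch_pressed and event.confidence >= threshold

wave = GestureEvent(name="page_flip", confidence=0.93)
print(should_execute(wave, clutch_pressed=False))  # False: just waving at a colleague
print(should_execute(wave, clutch_pressed=True))   # True: a deliberate command
```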

Even established devices have this problem. Consider swipe commands on a touch screen. Swiping up and down scrolls the content of a virtual page. Swiping left or right flips virtual pages, or tabs in the browser. Depending on how you hold your phone, your motions are almost never perfectly horizontal or vertical. If you are like me, they are somewhat diagonal. Every so often the system misinterprets them and does the opposite of what I intended, annoying me to no end.
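One common mitigation is to treat ambiguous input as no input at all. Here is a minimal sketch; the 30-degree tolerance is an arbitrary illustrative value. Instead of snapping every swipe to the nearest axis, anything too diagonal is simply rejected.

```python
import math

def classify_swipe(dx: float, dy: float, tolerance_deg: float = 30.0):
    """Classify a swipe as horizontal or vertical, rejecting diagonals.

    A swipe counts as horizontal only if its angle is within
    tolerance_deg of the x-axis, and likewise for vertical. Anything in
    between returns None, so the UI can ignore it rather than fire the
    wrong command.
    """
    angle = math.degrees(math.atan2(abs(dy), abs(dx)))  # 0 = horizontal, 90 = vertical
    if angle <= tolerance_deg:
        return "horizontal"
    if angle >= 90.0 - tolerance_deg:
        return "vertical"
    return None  # ambiguous diagonal: better to do nothing than the wrong thing

print(classify_swipe(100, 10))  # horizontal
print(classify_swipe(10, 100))  # vertical
print(classify_swipe(80, 60))   # None (about 37 degrees off horizontal)
```

The trade-off is a dead zone: some genuine swipes do nothing and must be repeated, but doing nothing is usually less annoying than doing the opposite of what the user intended.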

Conclusion

None of these three things represents an insurmountable obstacle to the development of VR. Technology developers can mitigate their effects or work around them. In some cases, it is even possible to turn them to the advantage of potential users.

However, anyone working in this field needs to be aware of these, and many other, notions. Failure to recognize their importance can lead to serious design failures. These three concepts are largely independent; however, they can act in conjunction, exacerbating each other's negative effects. For example, the combination of missing passive haptic feedback and false inputs has historically been a big obstacle to the development of gesture-based interfaces. The Microsoft Kinect remains a gimmick, while the gamepad, keyboard, and mouse remain the staples of gameplay to this day.
