A shift supervisor is standing in a large machine hall monitoring production. From her vantage point she can overlook the complete hall. She is carrying no laptop, no tablet, not even a phone. Instead she is wearing a plain pair of glasses and looking from one machine to the next. When she gazes at a machine, a window opens in her field of view, showing the machine's current status. The shift supervisor can check which job the machine is working on, the number of finished items, and whether all tools are in order. When she gazes away from the machine, the window disappears and she can look around the machine hall freely.
A promising vision, but is it realistic? Unfortunately at the current state of technology it isn’t – yet.
But the progress of companies like Microsoft, Magic Leap, or Daqri in developing augmented reality glasses suggests that augmented reality will be ready for use in the near future. This would be a giant step towards Industry 4.0: workers could be supported in their everyday tasks across the whole production process. But how can we prepare for this new technology? How can we gain experience now, so that the moment operational devices are released we can develop user-friendly applications that integrate augmented reality effectively and efficiently into work routines?
With the DeepSight project we at Centigrade have found a way to prototype augmented reality applications right now while identifying and leveraging possible advantages of this promising technology. For this we resorted to another technology that might be surprising in this context: virtual reality.
Why not use AR from the start?
Since the 2016 HoloLens release, visionary reports and videos have been appearing continuously, showing how people could be supported in their daily work with augmented reality glasses. Augmented reality (AR), the extension of reality with additional virtual elements, could be used in widely varying industry sectors. For example, the planning, development, and maintenance of products could be improved significantly. However, there is a catch to all these promising visions: at the moment they are just that – visions.
Of course the HoloLens release was a first big step for this technology. Other manufacturers are also feverishly working on their own AR glasses or releasing first prototypes. Still, available AR glasses have many restrictions. Most current devices are anything but inconspicuous, weighing down their wearers and limiting their field of view with tinted glasses. The actual display area for virtual elements is also very small at the moment. For example, the field of view of the HoloLens is only 30°×18°, whereas the human field of view usually spans about 180°×130°. This means only a small section of reality can be augmented with virtual overlays.
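To put those numbers in perspective, a rough back-of-the-envelope calculation (treating each field of view as a flat angular rectangle, which is a simplification; a precise comparison would use solid angles) shows how little of the human field of view the HoloLens can actually augment:

```python
# Rough area comparison of the two fields of view, treating each as a
# flat angular rectangle. This is a simplification, but the order of
# magnitude is the point here.
hololens_fov = 30 * 18      # 540 "square degrees"
human_fov = 180 * 130       # 23400 "square degrees"
coverage = hololens_fov / human_fov
print(f"{coverage:.1%}")    # roughly 2.3% of the human field of view
```

In other words, under this crude estimate well over 97% of what the wearer sees stays unaugmented.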
Additionally, most AR glasses have a short battery life, which limits the wearer's mobility. And the requirement that augmented reality systems work entirely in real time is one of the greatest and most critical challenges. At the end of the day, AR glasses not only have to overcome technical problems to be used in an industrial setting but also have to meet safety requirements. This means further technological development is needed before our visions can come to life.
The current Gartner study on emerging technologies forecasts that augmented reality will be used productively in five to ten years. But the unfinished state of augmented reality is no reason to sit back and wait. On the contrary, the time until productive use is a chance to take part in the development and to work on making the visions reality – bit by bit.
The new technology not only raises hardware questions but also brings completely new software challenges. The user interface in particular, and the related user interactions, have to be re-imagined from the ground up. Because of the additional third dimension, established design methods can no longer simply be reused. Completely new questions arise:
- How can virtual elements be integrated into the real world so users can access them optimally?
- How should AR elements be activated and deactivated to support users in their tasks without distracting or hindering them?
- How should virtual elements be designed and positioned to ensure good readability and an ergonomic work environment?
- How can users intuitively interact with AR elements?
In my master thesis I spent the last six months trying to find some answers to these questions with the DeepSight project.
With DeepSight we want to use a short scenario to look into the future of augmented reality. In cooperation with Festo Polymer GmbH we developed our vision of an AR-supported shift supervisor. As described at the start of this article the shift supervisor should be able to stand at a vantage point in the machine hall and overlook all machines in the hall. When she gazes at a machine, information on the machine’s current production job should be displayed. The shift supervisor should be able to get further details about the job by pushing a button. The scenario should also include a failure occurring on a tool inside a machine. Through an AR element the shift supervisor should be alerted to the failure so that she can look at the affected machine and fix the failure.
Using this scenario we wanted to develop and implement an AR user interface and an AR interaction concept. We assumed that the application is used with AR glasses that have overcome the current limitations. Of course it is difficult to develop and test such an application when the necessary hardware does not exist. This is why I had to cheat a little and take a detour via another technology: virtual reality (VR).
From VR to AR
What is the difference between augmented reality (AR) and virtual reality (VR)? While the wearers of AR glasses are looking at their real surroundings that are only augmented by some virtual elements, the wearers of VR glasses are transported to completely virtual surroundings and have no visual connection to reality anymore.
VR glasses are currently a little further along in their development than AR glasses. Even if they still have some weaknesses, they offer their users a large field of view and the possibility of limited movement in the virtual environment. The current state of VR technology lends itself well to building simulations. Typically, real situations are simulated this way, which can be useful during project planning or for education and training.
For DeepSight I wanted to place the user in a machine hall matching our scenario. By putting on the glasses a DeepSight user should take the role of the shift supervisor. She should have the impression of being in a real machine hall. When looking around the machine hall the only thing not seeming real should be the displays of machine information. This way it should be possible to develop the AR user interface and the AR interaction matching the scenario and test it in the VR application.
All of this may sound confusing at this point. Let me give an example:
Imagine you are wearing AR glasses. You are at home in your living room. In front of you are some objects. When you look at one of the objects a little sign appears over it which shows the name of the object. So far, so good.
Now imagine you are wearing VR glasses. They completely cover your field of vision. This means while you are still in your living room physically, through the glasses you suddenly see a sweeping Arcadian landscape. Here, as well, are some objects in front of you and when you look at one of them its name is displayed. For you the only practical difference between the two pairs of glasses is that the VR glasses visually transport you to another location. However the possible interaction, looking at objects and thereby activating the name tag, stays the same. The arrangement of the tags and their design is the same as in the AR glasses.
For exactly this reason it is already possible to develop AR user interfaces and AR interactions now and test them with VR glasses. But for the testing to work, the user has to feel actually present in the virtual environment.
Appealing to the users' senses
Most VR applications transport the wearer of the VR glasses into a world consisting of computer-generated 3D models and animations. Depending on what objects are displayed this way and how much time was spent on their creation such an environment can seem more or less real.
For users to actually feel present in the virtual environment, it was crucial for the DeepSight application that the machine hall and everything happening in it seem as real as possible. This is why we decided to use real footage of the machine hall of Festo Polymer GmbH in place of 3D models. To achieve the best possible visual depth and give users the impression of actually being inside the hall, we recorded the movie stereoscopically with two 360° cameras. Additionally, we recorded sound to later reproduce the ambient noise in our application.
Using the material recorded this way we were able to build a first version of the VR application that enables the wearer of the VR glasses to virtually visit the machine hall.
Because of the camera's raised position during the recording, viewers already had the feeling of looking down into a deep hall. The moving machine hall alone, combined with the noise of the working machines, was impressive. To give users a solid footing in the hall, I extended the film with a room built from simple 3D models that opens into the machine hall but is bounded by a railing. I let some colleagues test the result.
Watching the testers look around the hall and react was very interesting and sometimes had me smiling. Many of them walked up to the virtual railing to see as much of the machine hall as possible. Some cautiously leaned over the virtual railing trying carefully not to touch it. Others reached for the railing and had to laugh about their action as the railing only existed in the virtual world. Watching how my colleagues interacted with the virtual railing gave us the idea to expand the application with a physical counterpart. We built a prototype railing from PVC tubes and connectors and positioned it according to its virtual counterpart. Afterwards, nearly everybody testing the application for me actually rested their hand on the railing.
Even early in the project these tests made clear that a feeling of presence in a virtual environment can be reached only through appealing to as many senses as possible. Being able to not only see but also hear and even touch the virtual environment increased the impression of being actually present. We already investigated this effect in an earlier research project: DeepGrip.
Augmented reality user interface and interaction concept
Now that we had succeeded in displaying the machine hall in the VR application, I could focus on the actual goal of my project: the development of the AR interaction concept and the AR user interfaces. I was curious how the design of a 3D interface would differ from that of a 2D interface.
Festo Polymer GmbH provided data on the production jobs of individual machines so I could recreate the shift supervisor scenario. With this data I could design the AR elements that should be displayed in the user's field of view when they look at a machine. First I experimented with fonts and font sizes, examining them in the VR glasses. The legible display of text quickly stood out as a problem: the relatively low resolution of the VR glasses did not allow thin fonts or small type, which meant the AR elements had to be relatively large. Black backgrounds created contrast with the bright machine hall to keep the AR elements clearly recognizable.
Next I had to find out how best to position the AR elements in the machine hall. When the user looks at a machine, an AR element should open containing that machine's data. This means the AR element should be visible as an overlay directly in the machine hall and have a clear connection to the respective machine. First I positioned the overlay like a sign immediately above the focused machine. Again I let some colleagues test the application. We found that the AR element was easily recognizable, but that testers had to lift their gaze to read its contents. All testers were reluctant to look away from the machine because they assumed they would lose their selection, and they disliked having to move their head just to read the overlay. Some users also complained that the overlays were always oriented flat against the machines; especially for machines close to the user, this made the AR elements hard to read.
Through iterative adjustments and tests I finally determined that, for an ergonomic reading posture, the overlay should be tilted slightly so that it always faces the user. The head movement needed to read the AR element after it appears should also be as small as possible. By positioning the overlays next to the machines, towards the aisle, and calculating their alignment from the user's head rotation at the moment of fade-in, I found a display mode that all testers found comfortable.
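The "face the user" part of that behavior boils down to a small piece of geometry. The sketch below is an illustration under my own assumptions (flat floor coordinates, a hypothetical `overlay_yaw_deg` helper), not the project's actual code; the key design point from the tests is that the angle is computed once at fade-in rather than continuously, so the panel does not wobble with every small head movement:

```python
import math

def overlay_yaw_deg(user_pos, overlay_pos):
    """Yaw angle (degrees) that turns an overlay panel to face the user.

    Hypothetical helper for illustration: positions are (x, z) floor
    coordinates. Yaw 0 means the panel faces along +z; atan2(dx, dz)
    rotates it towards the user's position.
    """
    dx = user_pos[0] - overlay_pos[0]
    dz = user_pos[1] - overlay_pos[1]
    return math.degrees(math.atan2(dx, dz))

# Example: user stands 2 m to the right of and 2 m in front of the overlay.
print(overlay_yaw_deg((2.0, 2.0), (0.0, 0.0)))  # 45.0
```

Calling this once when the overlay fades in, and leaving the result fixed until the overlay closes, reproduces the stable behavior the testers found comfortable.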
Now I could turn to the interaction concept. A user of the DeepSight application should be able to open an overlay by looking at the respective machine. The head movements of the wearer of the AR glasses determine where she is looking at any moment. To ease control, I created a plain cursor that follows the user's head movements, similar to a mouse cursor. After activating an AR element, the user should be able to get more details on the selected machine. To enable this interaction we used the VR glasses' proprietary controller: by pushing a button, users can switch to the detail view of a machine. There, they can select menu elements through head movements and trigger further interactions by pushing the button again – such as switching off the machine or rejecting a tool. They can leave the detail view through a menu element as well as by hitting a back button on the controller.
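The gaze selection behind that cursor can be sketched as a simple angular test: the machine closest to the gaze direction is selected if it lies within a small cone around the gaze ray. The names and the 5° threshold below are my own assumptions for illustration, not the project's implementation:

```python
import math

def gaze_target(gaze_dir, machines, max_angle_deg=5.0):
    """Return the machine closest to the gaze ray, or None.

    gaze_dir: unit direction vector from head tracking.
    machines: list of (name, unit direction from the user to the machine).
    A machine is selected when the angle between the two directions is
    below max_angle_deg, mimicking a head-controlled cursor.
    """
    best = None
    best_angle = max_angle_deg
    for name, direction in machines:
        dot = sum(g * d for g, d in zip(gaze_dir, direction))
        # Clamp to guard against floating-point drift before acos.
        angle = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
        if angle < best_angle:
            best, best_angle = name, angle
    return best

machines = [("press_1", (0.0, 0.0, 1.0)), ("press_2", (1.0, 0.0, 0.0))]
print(gaze_target((0.0, 0.0, 1.0), machines))  # press_1
```

Looking straight ahead selects the machine directly in front; gazing away from all machines returns `None`, which is what closes the overlay again in the scenario.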
To fully implement the scenario I had to simulate a failure occurring on a machine tool. For this I created a simple warning sign that appears over a machine when a failure is triggered. At the same time, a warning sound alerts the user to the failure independent of her current viewing direction. The user can then fix the failure by switching off the affected machine or rejecting the faulty tool. Every change the user makes to the machines this way is reflected in the displayed data afterwards, so she can directly verify the effectiveness of her intervention.
With this, the complete scenario of the shift supervisor supported by augmented reality was implemented in the application. But how intuitive is the implementation? Do users understand how they can interact in the machine hall without prior briefing? Is the combination of gaze selection and buttons comfortable and self-explanatory or awkward and complicated? To determine if I successfully developed a user-friendly AR interaction and a comprehensible, attractive user interface I conducted a user study.
The user study
After months of work the moment of truth had arrived. How would people manage in the application without any information about the DeepSight project? I invited 33 testers to participate in my study. None of them had an industrial background: as I only wanted to test the general interaction concept, which could be reused in any other context, anyone could participate in the study.
To find out whether my testers could use the application intuitively, I did not give them any hints on how to interact with the machine hall. Every participant got a short introduction in which I told them they would be taking the role of shift supervisor. I explained that their duty in this role was to supervise the machines in a hall and to try to fix any faults that occurred. After the application was started, no further communication happened between the test administrators and the participants. The testers were on their own in their new role of shift supervisor.
One test run took about three minutes. I gave every tester some time to look around the machine hall and understand that they could activate AR overlays by gazing at machines. Then I triggered the first fault. The image from the VR glasses was replicated on my screen so I could watch excitedly how the participants reacted. Some systematically proceeded to find out what they could do. Others were visibly nervous until they had found the fix. I was excited for every tester and happy to see them become faster and more confident when fixing the second and third fault after they had successfully fixed the first.
The test results showed as well that the participants quickly learned to interact with the system.
The test results
29 of 33 participants were able to fix all three faults. 2 testers fixed only two faults, although one of them had already switched off a machine before a fault could occur. The other 2 participants could not fix any faults. One of those two testers had great difficulty reading the content of the AR overlays; he told me that his two eyes differ strongly in visual acuity, which may have kept the stereoscopic display of the VR glasses from working for him. The other test person who could not fix any fault said that she generally feels insecure interacting with technology. She still liked the impression of the machine hall.
Following the test I conducted a short interview with every participant and had them fill out a questionnaire. I surveyed the feeling of presence with the standardized IPQ questionnaire and the usability of the application with the standardized SUS questionnaire. The interviews already showed that most participants perceived their visit to the machine hall as very positive. The presence of workers in the hall in particular was remarked upon, and many participants noted the combination of visual and auditory impressions. The interaction concept was received positively by the testers as well and perceived as easy. Some impressions from the testers:
"It made a real difference that actual people were working there. It felt really alive."
"I found the overall impression very coherent. It gives you the feeling of being in there by being visual and also acoustic."
"To me, the gaze interaction was surprisingly comfortable and easy. If you had described it to me beforehand I would have thought that it is way more complicated and awkward. But you could simply gaze and something popped up. I found that surprisingly good."
"I was fascinated that it reacted to my gaze. I'm looking at something and seeing something. That was fun."
"You really had the feeling of being the boss of the whole thing, of being able to actually control it."
"I found it super exciting that it's relatively fast and intuitive to get the interaction concept and explore it a bit and try different things. When there was a fault it was very easy to fix."
The use of the controller, however, proved problematic. Because it was relatively large, many participants assumed it had more functionality than simply pressing two buttons. Some testers tried to move the selection in menus by pointing the controller. But as most of them moved their head while pointing, they managed to interact with the menus this way as well.
It was also interesting that most participants, despite a tip given before the start of the test, did not look at the controller. The controller was visible in the virtual world as well, and its two buttons were marked. Nevertheless, most participants kept their gaze fixed on the machine hall and pressed whichever button their finger happened to be over. Many did not notice the back button, which meant they could only close the AR overlays with the associated menu button. Instead of the controller, it would therefore have been better to use a smaller, simpler object, such as a smart ring carrying the two necessary buttons.
Overall, however, surprisingly many participants were able to work with the application without any introduction and to fix all faults. In the questionnaires the interaction concept was rated very well, too: the resulting SUS score for the developed application was 86.4.
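For readers unfamiliar with the SUS, that 86.4 comes out of a fixed scoring scheme: ten Likert items rated 1-5, with odd-numbered (positively worded) items contributing `response - 1` and even-numbered (negatively worded) items contributing `5 - response`, the sum then scaled by 2.5 to a 0-100 range. A minimal implementation of that standard formula:

```python
def sus_score(responses):
    """System Usability Scale score from ten 1-5 Likert responses.

    Standard SUS scoring: odd items (index 0, 2, ...) contribute
    (response - 1), even items contribute (5 - response); the sum of
    contributions (0-40) is scaled by 2.5 to a 0-100 score.
    """
    assert len(responses) == 10, "SUS has exactly ten items"
    total = 0
    for i, r in enumerate(responses):
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5

# A maximally positive answer sheet yields the top score:
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # 100.0
```

Scores above roughly 80 are conventionally read as very good usability, which is why 86.4 for a system used with zero instruction is a strong result.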
The feeling of presence was also rated very high overall. In addition, we found a significant link between presence and SUS score: participants who felt more present in the hall rated the interaction concept higher. We can therefore assume that the interaction concept we developed would be well received in an actual AR application. The importance of a realistic VR simulation for testing AR interaction concepts also became clear.
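A link like this is typically quantified with a correlation coefficient. The sketch below computes a plain Pearson r between per-participant presence ratings and SUS scores; the numbers fed in are made up purely for illustration and are not the study data (the article does not report the raw values):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up illustrative pairs (IPQ presence rating, SUS score) showing
# what a strong positive link looks like - NOT the study data:
presence = [3.1, 4.2, 5.0, 2.8, 4.6]
sus = [70, 85, 95, 65, 90]
print(round(pearson_r(presence, sus), 2))
```

An r close to +1 would mean presence and usability ratings rise together, which is the pattern the study observed.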
A promising vision
With the DeepSight project I was able to gain a lot of experience in virtual reality and augmented reality. In half a year I dealt with VR simulations, stereoscopic movies and their associated stumbling blocks, with creating a feeling of presence by incorporating as many senses as possible, and of course with the exciting creation of a new interaction concept and user interface design for a three-dimensional world. Thanks to my colleagues, who were always available for tests, and to all participants in my concluding user study, I succeeded in developing a demonstrably intuitive and user-friendly interaction concept for augmented reality applications.
Of course I only examined a single short scenario. The concept I developed is suited to support the supervision of large machine halls or similar facilities. Here, users have the advantage of quick and intuitive access to the most important information while having their hands free for other tasks. However augmented reality can also be used in many other areas.
With DeepSight I was able to show that virtual reality allows an evaluation of the chances and challenges of augmented reality interactions. I could determine that AR applications with a suitable interaction concept are surprisingly intuitive and easy to learn. Virtual overlays can be fluidly integrated into the real environment to facilitate and accelerate work. This means it is definitely worthwhile to bring future visions to life with virtual reality and to develop matching interaction concepts and user interfaces. This way, we can look into the future today, so that we can use AR glasses productively as soon as they overcome their remaining technical restrictions.