Jingle Bells or classification of AR technologies (2018-12-24)

Tagged as: blog, AR, AR technologies
Group: G

In this blog article, state-of-the-art technologies are presented and visualized as organograms.

Hello everyone!

Christmas is nearly here, and before we can look forward to the presents and huge amounts of food that lie ahead of us in the next few days, we would like to present to you the result of one of our big research milestones: the classification of Augmented Reality technologies.

Within the last few weeks we read different surveys, papers and books about this topic and gained a good overview of the different technologies that are used by Augmented Reality systems. Gaining an overview of the huge number of different technologies was an important research step for us before we can analyse evaluation methods of AR technologies. Besides, we are now able to put the technologies in relation to evaluation methods, e.g. we can compare the evaluation methods for smartphone AR applications to those for applications using head-mounted displays.

During our research we used literature from Milgram, Takemura, Utsumi, & Kishino (1995) and Azuma (1997) as important historical starting points and focused on recent technological surveys from the past ten years (Papagiannakis, Singh, & Magnenat-Thalmann, 2008; Van Krevelen & Poelman, 2010; Carmigniani et al., 2011) and especially on the state-of-the-art surveys from Billinghurst, Clark, & Lee (2015) and Schmalstieg & Hoellerer (2016). In general, the technological key components have remained the same since their definition in the 1990s (Billinghurst, Clark, & Lee, 2015), but the technology used for these components has evolved over time.

In the following we describe our classification of these technologies. We created a graphic for each key component to visually present our results. We combined and arranged the technologies from the literature in a hierarchical organogram to show their relations and structure.

Definition

As a starting point we used the most established definition of Augmented Reality by Azuma (1997) as it was described by Billinghurst, Clark and Lee (2015). Based on this definition, an AR application has three characteristics, where each characteristic is represented by one technological key component:

- It combines real and virtual images, therefore a display is necessary that combines both image types.
- It is interactive in real time, i.e. dynamic graphics are generated and a user interface is necessary for dynamic user input.
- It is registered in 3D, so a tracking system is necessary for determining the viewpoint position of the user and for fixing the virtual image correctly in the real world.

For each of the three key components, i.e. display, interface and tracking, different technologies are available; these are classified and described in the following.

Display

We classify displays based on the involved sense or on the positioning of the display, i.e. the distance from the user’s eye to the real world, as proposed by Billinghurst, Clark, & Lee (2015), Schmalstieg & Hoellerer (2016) and Van Krevelen & Poelman (2010). There are display types for each of the five senses of the human body, although AR has focused on visual displays, as visual input represents “roughly 70% of the overall sensory information” (Schmalstieg & Hoellerer, 2016). Besides, audio displays that offer on-demand sounds, e.g. for museum guides or as support for blind people, have emerged within the last years, as have haptic displays that offer haptic feedback of objects. Olfactory and gustatory displays that simulate smell and taste inputs are also available in the field of AR, albeit they are still rare. Multimodal displays combine multiple display technologies.

Visual displays have three different subcategories. Older papers from Milgram, Takemura, Utsumi, & Kishino (1995) and Azuma (1997) mentioned monitor-based systems, but recent papers do not include these types of displays anymore. Basically, there are two possibilities: the real world can be seen through a lens system, or the virtual images can be projected onto a real-world object (Schmalstieg & Hoellerer, 2016). The first one includes standard closed-view displays (Azuma, 1997), video or optical see-through displays, where the real world is directly seen with overlaid virtual images (Billinghurst, Clark, & Lee, 2015; Carmigniani et al., 2011; Milgram, Takemura, Utsumi, & Kishino, 1995; Schmalstieg & Hoellerer, 2016; Van Krevelen & Poelman, 2010), and eye-multiplexed displays, where the user needs to combine real and virtual images from two nearby displays themselves (Billinghurst, Clark, & Lee, 2015). Projected images can either be displayed by spatial projection displays with fixed projectors or by agile displays with mobile projectors (Schmalstieg & Hoellerer, 2016). Additionally, displays can have different positionings, i.e. different distances from the user’s eye to the real world. They can be located within or in front of the eyes, in the form of contact lenses or glasses (Schmalstieg & Hoellerer, 2016), mounted on the head or the body, held in the hand, or be stationary or projected onto a surface (Billinghurst, Clark, & Lee, 2015; Carmigniani et al., 2011; Van Krevelen & Poelman, 2010).
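
To make the organogram structure concrete, the following is a minimal sketch, in Python, of how the display classification above could be represented as a nested data structure; the keys and nesting are our own shorthand for the categories described in this section, not notation taken from the cited surveys.

```python
# Sketch of the display classification described above as a nested dictionary.
# The keys and nesting mirror our organogram; the wording is illustrative
# shorthand, not an official taxonomy from the cited surveys.
DISPLAY_TAXONOMY = {
    "by sense": {
        "visual": {
            "see-through": ["closed-view", "video see-through",
                            "optical see-through", "eye multiplexed"],
            "projected": ["spatial projection (fixed projector)",
                          "agile (mobile projector)"],
        },
        "audio": ["on-demand sounds, e.g. museum guides"],
        "haptic": ["haptic feedback of objects"],
        "olfactory/gustatory": ["smell and taste simulation"],
        "multimodal": ["combination of multiple display technologies"],
    },
    "by positioning": ["in the eye (contact lens)", "in front of the eye (glasses)",
                       "head-mounted", "body-mounted", "hand-held",
                       "stationary / projected onto a surface"],
}

def leaves(node):
    """Recursively collect the leaf entries of the taxonomy tree."""
    if isinstance(node, list):
        return node
    return [leaf for child in node.values() for leaf in leaves(child)]

print(len(leaves(DISPLAY_TAXONOMY)), "leaf entries in the display taxonomy")
```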

User Interface

We found 14 different types of user interfaces for AR applications that we categorized into eight classes. An information browser represents a two-dimensional interface for browsing and filtering information (Billinghurst, Clark, & Lee, 2015). Correspondingly, 3D user interfaces allow information manipulation along three dimensions (Billinghurst, Clark, & Lee, 2015). The group of technologies that augment physical objects contains different types of tangible user interfaces, i.e. interfaces on physical objects (Billinghurst, Clark, & Lee, 2015; Carmigniani et al., 2011; Schmalstieg & Hoellerer, 2016; Van Krevelen & Poelman, 2010), virtual user interfaces on real surfaces, and augmented paper (Schmalstieg & Hoellerer, 2016). Besides, there are natural user interfaces that support natural interactions like motion or gestures, as well as interfaces that involve other senses, such as haptic or aural user interfaces (Billinghurst, Clark, & Lee, 2015; Schmalstieg & Hoellerer, 2016; Van Krevelen & Poelman, 2010).

Additionally, multiple interface technologies can be combined into multimodal, hybrid or multi-view user interfaces (Billinghurst, Clark, & Lee, 2015; Carmigniani et al., 2011; Schmalstieg & Hoellerer, 2016; Van Krevelen & Poelman, 2010). Interfaces can also support collaboration between multiple users, as collaborative user interfaces (Carmigniani et al., 2011), or act as conversational agents that represent virtual conversation partners (Schmalstieg & Hoellerer, 2016). Finally, there are brain-computer interfaces, which are rarely used in the field of AR (Billinghurst, Clark, & Lee, 2015).

Tracking

Tracking is responsible for “[anchoring] virtual content in the real world such that it appears to be a part of the physical environment” (Billinghurst, Clark, & Lee, 2015). Therefore, the pose of the user needs to be determined, i.e. their position and orientation. Ideally, the tracking system can determine six degrees of freedom (DOF), i.e. “three variables (x, y, and z) for position and three angles (yaw, pitch, and roll) for orientation” (Azuma, 1997). We found that tracking methods can be categorized either based on the used sensor or based on the environment, i.e. indoor or outdoor usage (Carmigniani et al., 2011).
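
As a small illustration of the six degrees of freedom, the sketch below (our own illustrative code, not taken from the cited literature) models a pose as three position variables and three orientation angles and converts yaw, pitch and roll into a rotation matrix using one common convention:

```python
# Illustrative sketch of a 6-DOF pose: position (x, y, z) plus
# orientation (yaw, pitch, roll), as described by Azuma (1997).
from dataclasses import dataclass
import numpy as np

@dataclass
class Pose:
    x: float      # position in metres
    y: float
    z: float
    yaw: float    # orientation in radians
    pitch: float
    roll: float

    def rotation_matrix(self) -> np.ndarray:
        """Z-Y-X (yaw-pitch-roll) rotation; one common convention."""
        cy, sy = np.cos(self.yaw), np.sin(self.yaw)
        cp, sp = np.cos(self.pitch), np.sin(self.pitch)
        cr, sr = np.cos(self.roll), np.sin(self.roll)
        rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
        ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
        rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
        return rz @ ry @ rx

pose = Pose(x=0.1, y=0.0, z=1.5, yaw=0.2, pitch=0.0, roll=0.0)
print(pose.rotation_matrix())
```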

There are four main sensor types. Stationary tracking systems include mechanical, electromagnetic and ultrasonic tracking, which are rarely used in current applications due to their stationary nature (Schmalstieg & Hoellerer, 2016; Van Krevelen & Poelman, 2010). Mobile sensors include the Global Positioning System (GPS), wireless networks like WiFi and Bluetooth (Papagiannakis, Singh, & Magnenat-Thalmann, 2008; Schmalstieg & Hoellerer, 2016; Van Krevelen & Poelman, 2010), as well as magnetometers, gyroscopes, accelerometers and odometers (Schmalstieg & Hoellerer, 2016) that measure orientation, acceleration, velocity and distance properties. Optical tracking systems can be divided into visible-light and infrared tracking (Billinghurst, Clark, & Lee, 2015). The latter includes outside-looking-in and inside-looking-out tracking, where the sensors are either mounted in a stationary position or move along with the object (Schmalstieg & Hoellerer, 2016).

Visible-light tracking includes model-based and model-free tracking, where a reference model of the environment either is available at the start or must be created at runtime (Schmalstieg & Hoellerer, 2016). Besides, the natural feature tracking method uses natural keypoints, edges, or keyframes as tracking features. If the environment is difficult to interpret, fiducial markers are used, i.e. artificial markers positioned within the environment that the tracker is able to recognize (Schmalstieg & Hoellerer, 2016). There are also different tracking algorithms like Simultaneous Localization and Mapping (SLAM), Parallel Tracking and Mapping (PTAM), and Structure from Motion (SfM) (Billinghurst, Clark, & Lee, 2015; Carmigniani et al., 2011). Additionally, structured light tracking works by projecting light patterns onto objects (Schmalstieg & Hoellerer, 2016). Finally, there are hybrid tracking methods that use multiple sensors, either in a complementary, competitive or cooperative way (Schmalstieg & Hoellerer, 2016).
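
As a toy illustration of fiducial marker tracking, the following sketch uses OpenCV's ArUco module (from the opencv-contrib-python package; the exact ArUco API differs between OpenCV versions, so this assumes the classic function-based interface) to detect markers in a single camera frame and estimate their pose with cv2.solvePnP. The camera intrinsics and the file name are placeholders, not values from our project.

```python
# Minimal fiducial-marker tracking sketch with OpenCV's ArUco module.
# Assumes opencv-contrib-python with the classic ArUco functions and a
# calibrated camera; camera_matrix, dist_coeffs and frame.png are placeholders.
import cv2
import numpy as np

camera_matrix = np.array([[800., 0., 320.],
                          [0., 800., 240.],
                          [0., 0., 1.]])   # placeholder intrinsics
dist_coeffs = np.zeros(5)                  # placeholder distortion coefficients
marker_len = 0.05                          # marker side length in metres

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

frame = cv2.imread("frame.png")            # one camera frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)

if ids is not None:
    # 3D coordinates of the square marker's corners, centred at the origin,
    # in the same order as the detected image corners.
    obj_pts = np.array([[-1, 1, 0], [1, 1, 0], [1, -1, 0], [-1, -1, 0]],
                       dtype=np.float32) * marker_len / 2
    for marker_corners in corners:
        ok, rvec, tvec = cv2.solvePnP(obj_pts, marker_corners.reshape(4, 2),
                                      camera_matrix, dist_coeffs)
        if ok:
            print("marker pose: rotation", rvec.ravel(), "translation", tvec.ravel())
```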

As a next step we will analyse state-of-the-art publications and put the described applications and evaluation methods in relation to our categorization.

That’s all for today.

We wish you a merry Christmas and nice holidays! :-)

References:

Azuma, R. T. (1997). A Survey of Augmented Reality. Presence: Teleoper. Virtual Environ., 6(4), 355–385. https://doi.org/10.1162/pres.1997.6.4.355

Billinghurst, M., Clark, A., & Lee, G. (2015). A Survey of Augmented Reality. Foundations and Trends® in Human–Computer Interaction, 8(2–3), 73–272. https://doi.org/10.1561/1100000049

Carmigniani, J., Furht, B., Anisetti, M., Ceravolo, P., Damiani, E., & Ivkovic, M. (2011). Augmented reality technologies, systems and applications. Multimedia Tools and Applications, 51(1), 341–377. https://doi.org/10.1007/s11042-010-0660-6

Milgram, P., Takemura, H., Utsumi, A., & Kishino, F. (1995). Augmented reality: A class of displays on the reality-virtuality continuum. In H. Das (Ed.), Proceedings of Photonics for Industrial Applications (pp. 282–292). Boston, MA. https://doi.org/10.1117/12.197321

Papagiannakis, G., Singh, G., & Magnenat-Thalmann, N. (2008). A survey of mobile and wireless technologies for augmented reality systems. Computer Animation and Virtual Worlds, 19(1), 3–22. https://doi.org/10.1002/cav.221

Schmalstieg, D., & Höllerer, T. (2016). Augmented Reality: Principles and Practice. Addison-Wesley Professional. Retrieved from https://proquest.tech.safaribooksonline.de/9780133153217

Van Krevelen, R., & Poelman, R. (2010). A Survey of Augmented Reality Technologies, Applications and Limitations. International Journal of Virtual Reality (ISSN 1081-1451), 9, 1.