You are about to enter a new reality. Here, the world augments itself to you, morphing to your context, preferences, and needs. Reality becomes malleable, mutable, and highly personalized; it’s all defined and driven by you. The entire world becomes instantly translatable, breaking communication barriers, and creating a new sensory awareness that makes seeing, hearing, touching, and tasting brand new. The rules of the analog world no longer apply. Wearable computers, sensors, and intelligent systems are extending our human abilities and giving us superpowers.
This is the new Augmented Reality. Are you ready?
In this book, I’ll introduce you to Augmented Reality (AR), how it is evolving, where the opportunities are, and where it will go. I’ll guide you into a new dimension and a new immersive medium. However, you won’t need to leave your physical reality behind. The digital enters your world.
Let me explain.
SnowWorld, developed at the University of Washington Human Interface Technology (HIT) Lab in 1996 by Hunter Hoffman and David Patterson, was the first immersive VR world designed to reduce pain in adults and children. SnowWorld was specifically developed to help burn patients during wound care. Hoffman explains1 how VR helps to alleviate pain by distracting patients from their present physical reality:
Pain requires conscious attention. The essence of VR is the illusion users have of going inside the computer-generated environment. Being drawn into another world drains a lot of attentional resources, leaving less attention available to process pain signals.
VR relies on the illusion of being immersed in another space and time, one that is typically removed from your current reality. In AR, you remain in your physical world and the virtual enters your surroundings by way of a pair of see-through digital glasses, a smartphone, a tablet, or a wearable computer. You still see and experience the real world around you with all of your senses; it just becomes digitally enhanced and alterable.
One early application of AR to deliver a helpful experience was Word Lens.2 Imagine traveling to a new country where you aren’t fluent in the local language. Ordering food from a menu or reading road signs can be challenging without someone to assist you. Word Lens allows you to point your smartphone at printed text in a foreign language and translate it on the fly into the language of your choice. Suddenly, you are more deeply immersed and engaged with your surroundings via a newfound contextual understanding assisted by technology.
VR will have its dedicated uses, but AR allows us to be more deeply immersed in, and connected to, the real world—the world in which we actually spend the majority of our time and attention. As with VR, we must be cognizant of draining our “attentional resources” in AR and design experiences that do not further separate us from our surroundings or one another. We must think critically about how we will place human experience at the center of this new medium. It’s not about being lost in our devices; it’s about technology receding into the background so that we can engage in human moments.
The most commonly used definition of AR is a digital overlay on top of the real world, consisting of computer graphics, text, video, and audio, which is interactive in real time. This is experienced through a smartphone, tablet, computer, or AR eyewear equipped with software and a camera. You can use AR to point at and identify stars and planets in the night sky, or delve deeper into a museum exhibit with an interactive AR guide. AR presents the opportunity to better understand and experience our world in unprecedented ways.
We’ve been using the same definition of AR since 1997 when AR pioneer Ronald Azuma succinctly explained,3 “AR allows the user to see the real world with virtual objects superimposed or composited with the real world. Therefore, AR supplements reality, rather than completely replacing it.”
AR technology traditionally works by tracking a target in the real world using a camera and software on an enabled device like a smartphone. These targets can include things like an icon, an image, an object, a sound, a location, or even a person. The target input data is processed by the software and compared against a database of potentially corresponding information. If there’s a match, an AR experience is triggered and content is superimposed on top of reality.
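As a rough illustration, the traditional trigger pipeline described above can be sketched in a few lines of code. Everything here is invented for the sake of the example (the feature extraction, the similarity score, and the database entries are toy stand-ins, not any real AR library’s API):

```python
# Minimal sketch of the classic AR trigger loop: extract features from
# a camera frame, compare against a database of known targets, and
# return the linked content when a confident match is found.
# All names and values are illustrative, not from a real AR toolkit.

def extract_features(frame):
    # Stand-in for real computer-vision feature extraction:
    # here we just normalize the frame's pixel histogram.
    total = sum(frame)
    return [v / total for v in frame] if total else frame

def similarity(a, b):
    # Simple overlap score between two normalized histograms (0..1).
    return sum(min(x, y) for x, y in zip(a, b))

def match_target(frame, target_db, threshold=0.9):
    """Return the content linked to the best-matching target, or None."""
    features = extract_features(frame)
    best_name, best_score = None, 0.0
    for name, entry in target_db.items():
        score = similarity(features, entry["descriptor"])
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return target_db[best_name]["content"]  # trigger the AR overlay
    return None

# A toy database: each target pairs a stored descriptor with AR content.
db = {
    "dinosaur_page": {
        "descriptor": [0.5, 0.3, 0.2],
        "content": "show 3-D dinosaur model",
    },
    "star_chart": {
        "descriptor": [0.1, 0.1, 0.8],
        "content": "label visible constellations",
    },
}

print(match_target([50, 30, 20], db))  # confident match: dinosaur_page
print(match_target([33, 33, 34], db))  # below threshold: no trigger
```

The essential point is the "hit play" structure: a static library of descriptors, a match test, and a fixed piece of content per target. The second wave of AR, discussed below, makes both the triggers and the content dynamic.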
Azuma’s definition states4 that AR systems have the following three characteristics:
Combines real and virtual
Interactive in real time
Registered in three dimensions (3-D)
Registration, the third characteristic, is about seamlessly aligning virtual objects into 3-D space in the real world. Without accurate registration, the illusion of virtual objects existing in the physical world is compromised; the believability is broken. So, if a virtual AR lamp appears to float above your physical desk rather than being registered directly to its surface, then, short of believing your office is haunted, this technical glitch breaks the illusion of that lamp existing in your space. But when a shadow is added to the virtual object, it becomes even more believable because it mirrors the characteristics of your physical environment.
What’s missing for me in this definition today, and what distinguishes the next wave of AR, is one keyword: context. Contextual information transforms the AR experience and content because it now moves from an experience that is the same for every user to one that is specific to you, your location, your interests, and your needs. Context builds on the characteristic of registration because it is registering, or compositing, relevant and meaningful data on top of the real world to create a personalized experience for you.
The success of this contextual registration in the new AR will not be about a virtual lamp looking like it sits perfectly on your physical desk (as in the 1997 definition). It will be about that lamp appearing at the appropriate moment when you perhaps need more light, or even turning itself off to indicate that it is time for you to leave work. Technical registration will be solved, and although it will continue to be important, the focus will be on delivering a meaningful and compelling experience that enhances your reality.
The process of target matching now becomes more complex because it is no longer a “hit play” process connected to a static library of things, like a photo of a dinosaur in a textbook that triggers a 3-D model of a dinosaur displayed in AR. Today, that 3-D model and experience can adapt to factors like how far each student has progressed in a lesson plan, and even the student’s learning style. So, the next time the student returns to the AR book, the dinosaur species has changed and the lesson integrates other topics in which she is interested. AR technology becomes a living, breathing database: an interaction in which both the triggers and the content are dynamic and can change at any moment, adapting to your shifting contextual data to deliver timely, relevant information and experiences shaped by you and your environment.
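To make the contrast with the static “hit play” model concrete, here is a toy sketch of context-driven content selection for that same dinosaur trigger. The context fields and selection rules are entirely invented for illustration; a real system would draw on far richer signals:

```python
# Toy sketch of adaptive AR content: the same trigger returns different
# content depending on the user's context. The context fields and the
# selection rules below are invented purely for illustration.

def select_dinosaur_content(context):
    """Pick AR content for the dinosaur-page trigger based on context."""
    lesson = context.get("lesson_progress", 0)
    style = context.get("learning_style", "visual")
    interests = context.get("interests", [])

    # The species shown adapts to progress and personal interests.
    if lesson < 2:
        species = "Stegosaurus"  # introductory model
    elif "volcanoes" in interests:
        species = "Tyrannosaurus in a volcanic landscape"
    else:
        species = "Velociraptor"

    # The presentation mode adapts to the student's learning style.
    if style == "auditory":
        mode = "narrated tour"
    else:
        mode = "interactive 3-D model"

    return f"{species} ({mode})"

beginner = {"lesson_progress": 1, "learning_style": "visual"}
returning = {"lesson_progress": 3, "learning_style": "auditory",
             "interests": ["volcanoes"]}

print(select_dinosaur_content(beginner))
print(select_dinosaur_content(returning))
```

The same trigger, two different experiences: the database is no longer a fixed lookup table but a function of who you are and where you are in the lesson.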
We are well overdue to revisit what AR is and what it can become, especially with AR no longer limited to academic research facilities. AR once required highly specialized equipment, none of which was very portable. But with the number of sensors in your smartphone today, you have the power of AR in your pocket. The technology will continue to become more ubiquitous with wearable computing discreetly embedded in your clothing and glasses, and even under your skin.
Large companies like Apple, Facebook, Microsoft, Google, and Intel are paying close attention to and investing in AR’s future to bring it to a mass audience. Facebook CEO Mark Zuckerberg refers to AR as “a new communication platform.” He writes, “One day, we believe this kind of immersive, augmented reality will become a part of daily life for billions of people.”5
Apple CEO Tim Cook regards6 AR “as a big idea like the smartphone.” Cook says, “I think AR is that big, it’s huge. I get excited because of the things that could be done that could improve a lot of lives. And be entertaining.”7 In 2017, at the annual Worldwide Developers Conference (WWDC), Apple introduced ARKit, a cutting-edge platform for developing AR apps for iPhone and iPad. During the WWDC keynote, Craig Federighi, Apple’s senior vice president of software engineering, referred to ARKit as “the largest AR platform in the world.”8
AR is about augmenting the human experience, and it will not advance in isolation. AR will make its real impact when it becomes a super medium, combining other parallel emerging technologies like wearable computing, sensors, the Internet of Things (IoT), machine learning, and artificial intelligence.
The first wave of AR, which I refer to as “Overlay,” was all about a digital layering on top of reality. Overlay included examples like a 3-D model of a baseball player virtually appearing on a baseball trading card, or an augmented quiz game appearing on a beer coaster. There was little to no variation if you returned to the AR experience later; it was typically the exact same content, providing not much incentive for repeat experiences. Often in this first wave, you were also required to download and print a specific image or target to trigger the AR experience.
We are entering the second wave of AR, which I call “Entryway,” creating a more immersive, integrated, and interactive experience. The key difference between Overlay and Entryway (and the secret to creating meaningful AR experiences) is you. You are the driving force in Entryway. You are the context that defines the experience.
Unlike in Overlay, this next wave moves beyond printed targets toward a new spatial understanding and deeper intelligence of your environment. The entire world becomes a trackable target. In Entryway, we break through the limitations of overlays in the first wave, stepping into a new sensory awareness and heightened interaction with our world and each other.
New sensor-equipped AR smartphones, like the Lenovo Phab 2 Pro and the Asus ZenFone AR, powered by Google’s AR technology, Tango, are a great example of Entryway at work. Tango technology incorporates motion tracking and depth perception, enabling a device to navigate the physical world the way humans do.
As you hold the device and move it around a room, the depth-sensing camera sees what you see and is able to identify the physical boundaries and layout of your surroundings. It can recognize where walls are, where the floor is, and even where furniture is positioned. In the not-too-distant future, technology such as Tango will enable new types of everyday experiences like reading your child a bedtime story. Imagine the foot of the bed transforming into a virtual safari truck as you watch a monkey jump from the dresser onto the lamp, all while a lion sleeps soundly on top of the dresser. Your physical environment is integrated into the story world, putting you directly into the story.
Microsoft Kinect introduced a major shift in the way AR technology worked to recognize a target in the real world. Kinect was instrumental in putting you inside the AR experience because your moving body now became a trackable target. Prior to Kinect, AR targets were typically static and limited to things like printed images. This technology opened the door to more interactive experiences to better see and sense you and your actions, with the ability to even recognize your facial expressions and how you are feeling. (Chapter 2 looks at how computer vision has evolved in AR and how it’s giving us new eyes with which to experience the world.)
Kinect inventor Alex Kipman (and inventor of Microsoft’s AR headset HoloLens) describes9 Kinect’s impact as “a monumental shift, where we move the entire computer industry from this old world, where we have to understand technology, into this new world, where technology disappears and it starts more fundamentally understanding us.” AR technology not only sees us and the environment that surrounds us, it begins to understand our activity, and responds to us. The way we interact with technology becomes more natural because the technology disappears and the experience becomes central. This is Entryway.
Entryway is all about a new level of immersion: we’re reaching through the looking glass of Overlay to experience the virtual with all of our senses in a new dimensionality. Engaging the human senses beyond the visual will play a more prominent role in this next wave of AR. For example, augmented audio is often paired with visuals, but sound can be used in AR on its own without a display, or even integrated with other senses. In addition to sight and sound, we now are able to touch, smell, and taste the digital, and even create new senses (these ideas are further explored in Chapters 3, 4, and 5).
In Entryway, AR embraces a new mode of hybrid physicality and virtuality. AR imbues the physical world with digital properties, and the virtual gains a new sense of tactility. Haptics technology enables a person to experience the sensations of touch and to feel the digital using interfaces such as air pressure fields, deformable screens, and special controllers. For example, AR makes it possible to reach out and pet a virtual cat, actually feeling its fur and the vibrations of it purring.
Taste and smell are also possible in AR with devices like the “Electronic Taste Machine” and the “Scentee,” both inventions of Adrian David Cheok, Professor of Pervasive Computing at City University London. The Scentee, a small device that you plug into the audio jack of your smartphone, allows you to send smell messages that release aromas. The Electronic Taste Machine uses metal sensors to trick your tongue into experiencing various tastes, ranging from sour to bitter, salty, or sweet, depending on the electrical current passing through the electrode. This results in a virtual taste perception in your brain.
Cheok wants us to be able to interact with computers the way we interact in the physical world using all of our five senses. He explains:10
Imagine you’re looking at your desktop or your iPhone or your laptop, everything is behind a glass, behind the window, you’re either touching glass or looking through glass. But in the real world, we can open up the glass, open the window and we can touch, we can taste, we can smell.
This next wave of AR allows us to “open up the glass” and augment the human sensorium.
The human brain can translate digitized and electrochemical signals to create meaning and even new sensory experiences. Humans currently don’t see things like radio waves, X-rays, and gamma rays because we don’t have the proper biological receptors. It’s not that these things are unseeable; it’s that we are not yet equipped with the right sensors. AR can give us these new superpowers, letting us not only see, but use our entire bodies to fully experience a broad range of information and data. We have the technology to engage with and know our world in extraordinary ways.
AR has made it possible for medical practitioners to interact with scaled virtual 3-D models of human anatomy. A physician now can manipulate a digital model and even 3-D print different stages of a procedure. Recent developments in haptics will one day enable a surgeon to work physically on a virtual brain, engaging in a full tactile experience before performing a real-life operation.
We can use AR today to track facial expressions to see when a student is struggling. Teachers will be able to use this technology in the near future to alter and customize content to learners. For instance, if you’re taking a distance learning course or watching a lecture through your AR device, and you appear confused, the subject matter would be further explained to you. Alternatively, if you’re not paying attention, you might be prompted with a question.
AR already provides instructional repair guidance, with the ability to share what you’re looking at and receive real-time annotations. New design processes allowing real-time remote collaboration are emerging and will change the way we work across distances. For example, an architect in Japan could be on location with a builder in Canada, interacting and fully engaged at the job site.
One day, you might no longer need a television: your AR headset will become your entertainment hub, full of personalized content. Whether your favorite performer appears in your home and sings to you, or you’re in an open field and competing to make it through a virtual maze, new forms of digital content will be tailored to and coexist with your physical surroundings.
When I began working in AR 12 years ago, the primary focus of the field as a whole was on the technology; content came much later, if it came at all, and it was typically an afterthought. At a time when most researchers and developers were working on registration and tracking problems in AR, I was fortunate to be a member of a truly unique lab at York University in Toronto, Canada (led by Dr. Caitlin Fisher), where we were working on defining the future of AR storytelling. Our lab was very different from other research facilities at the time: we were based in the Faculty of Fine Arts and Department of Film, whereas most university research labs in AR were found in computer science departments. Other labs typically focused on one particular area of technical research in AR and specialized to invent and refine those techniques. Our lab, on the other hand, was centered on creating content and experiences.
We were software and hardware agnostic in our approach. The technology inspired the experiences we designed, but we didn’t limit ourselves to any of the technical restrictions of AR. There were a lot of labs working on solving those problems; the area that was not being explored was content creation and the new types of experiences this technology would allow. We experimented with multiple emerging technologies, combining them in new ways to push beyond the limitations of how AR was traditionally used. If the technology didn’t exist, we collaborated with engineers and scientists to create it.
In 2009, our lab developed SnapDragonAR, one of the first commercially available drag-and-drop software tools to enable nonprogrammers to build experiences and contribute to this new medium, making AR accessible to educators, artists, filmmakers, and a general audience. This created a gateway to content production for makers of all kinds. We expanded the world of AR beyond the technical realm of computer science, with innovators working in AR today continuing down this path.
AR is no longer just about the technology; it’s about defining how we want to live in the real world with this new technology and how we will design experiences that are meaningful and help advance humanity. The technology, awareness, and state of AR have evolved tremendously over the past decade. Now that we have all of this incredible technology, what are we going to do with it? This is our question to collectively answer as we define AR’s trajectory. We are in need of leaders across business, design, and culture to help steer and implement new experiences in this rapidly rising industry. AR will radically change the way we live, work, and play.
6 David Phelan, “Apple CEO Tim Cook: As Brexit hands over UK, ‘times are not really awful, there’s some great things happening’,” The Independent, February 10, 2017.