As we ease out of the first month of 2024, we are now fully engaged in the new year. In the past 30 days, I’ve had an opportunity to learn from my peers, such as Tom Emrich of Niantic (who watches trends in his newsletter) and Samuel Neblett of Boeing, co-chair of the AREA Research Committee, and to reflect on the projects in which I’m involved.
I’ve compressed my vague sense of hope and excitement into a few enterprise AR trends I will be watching over the next 11 months. These are not predictions but significant areas of focus that I believe will drive innovation and the adoption of enterprise AR. I’m now officially keeping track of these trends to see whether, where, and how they come about.
Please share these with your colleagues and partners. Do you have evidence that confirms or challenges any of these trends in your company? I hope you will share your evidence, feedback, and ideas with me at [email protected].
Artificial Intelligence
The convergence of AI and AR is the most significant and least surprising of the trends to watch in 2024. The signs are everywhere.
#1 Enterprises are beginning to internally test Generative AI (GenAI), including LLM lakes and private co-pilot solutions. Early adopters will increasingly combine these capabilities with AR tools. There are dozens of ways AI can improve workflows and reduce the costs of enterprise AR. Well-positioned, well-trained AI can extract relevant content from corporate data sets for visualization. Here are a few examples of where and how GenAI could boost AR:
Using Digital Twins as a baseline and AI to detect and match features in 3D environments (rare in 2023), we expect enterprises to expand their interest in and need for spatially aware apps and services. For example, we will see a proliferation of AR-assisted Visual Positioning Services for navigation and risk detection based on 3D maps; a minimal sketch of the feature-matching step follows.
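To make the Visual Positioning idea concrete, here is a minimal, hypothetical sketch of the feature-matching step using OpenCV. The file names and parameters are illustrative assumptions; a production service would match against a full 3D map and estimate 6-DoF pose rather than compare two images.

```python
# Hypothetical sketch: match live camera features against a
# pre-mapped reference view (e.g., rendered from a digital twin).
import cv2

reference = cv2.imread("mapped_view.png", cv2.IMREAD_GRAYSCALE)  # illustrative file
live = cv2.imread("camera_frame.png", cv2.IMREAD_GRAYSCALE)      # illustrative file

orb = cv2.ORB_create(nfeatures=1000)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_live, des_live = orb.detectAndCompute(live, None)

# Hamming distance suits ORB's binary descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_ref, des_live), key=lambda m: m.distance)

# Many strong matches suggest the device is seeing a known location;
# a real VPS would then solve for pose (e.g., with cv2.solvePnP).
print(f"{len(matches)} feature matches against the mapped view")
```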
Combined with advances in hardware (see below), GenAI will permit the automatic generation of richer AR experiences for hundreds of use cases, including but not limited to 3D spatial maps. Multi-modal LLMs, an advanced type of AI that can understand and generate not just text but other types of data, such as images, audio, and possibly even video, are on the rise. These multi-modal AI models incorporate previously captured scenes into new instructions. They will detect sounds from the environment, predict risks, and suggest specific responses to the user without being programmed in advance; the sketch below shows what such a scene-to-guidance call could look like.
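As one hedged illustration of that flow, this sketch sends a captured scene to a vision-capable LLM and asks for risk guidance. The model name, prompt, and file name are assumptions for illustration; an enterprise would point this at its private co-pilot endpoint instead.

```python
# Illustrative only: ask a multi-modal LLM to assess a captured scene.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

with open("scene_capture.jpg", "rb") as f:  # hypothetical captured frame
    frame_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumption: any vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "List any safety risks visible in this scene and "
                     "suggest one AR overlay per risk."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{frame_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```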
#2 AI and computer vision advancements could address concerns over privacy in data collection and handling. Privacy and sensitivity to security risks from the use of cameras and other sensors in the workplace continue to be obstacles to large-scale AR deployments. With AI, real-time image and feature detection, blurring, and obfuscation methods can be combined with AR displays (or their associated services and software) at lower cost and power; see the sketch below. Enterprise AR solutions that protect the privacy of things, places, and people (AR device users and those around them) with AI in the loop will proliferate in response to the need for compliance with corporate privacy policies as well as national and international regulations.
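A minimal sketch of the idea, assuming Python and OpenCV: detect faces in each frame and blur them before anything leaves the device. A production system would also redact badges, screens, and other sensitive features, likely on dedicated vision silicon.

```python
# Sketch: blur detected faces in the camera stream before sharing it.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def redact(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
        frame[y:y + h, x:x + w] = cv2.GaussianBlur(
            frame[y:y + h, x:x + w], (51, 51), 0)
    return frame

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("redacted", redact(frame))
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```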
Hardware
#3 Aside from a few roles (e.g., architects or those viewing medical imagery), knowledge workers don’t need to spend their time or money on large virtual screens (aka Apple Vision Pro). Video see-through isn’t a viable substitute for optical see-through in the workplace, where employee tasks require hands-free AR and peripheral vision. Video quality issues, including distortion, fixed camera IPD, high ISO, low dynamic range, low camera resolution, and low frame rate, are exceedingly difficult (think: high power use) to overcome. However, a lot of money will be invested, and marketing campaigns will encourage people to try. Try though they will, the video see-through headset push will not make a significant dent in the optical see-through requirement for enterprise AR displays. I’ve heard repeatedly that any risk manager who approves video see-through XR displays in a high-risk production environment is risking their employment.
#4 Smaller, more capable, and less power-hungry sensors will be more economical to deploy and manage. In addition to lowering the cost of implementing and managing IoT, more specialized semiconductor solutions, especially those for computer vision but also for processing audio and motion, are increasingly being added to AR display devices. Imagine sensors on the device detecting the user’s need for corrective lenses and then generating the corrected version of the real world (enhanced with AR, of course) without the user being aware or needing to wear two pairs of glasses. Improvements in display capabilities, combined with cheaper hardware distributed in the user’s environment (think: intelligent spaces) and connected to AI in the display or on edge computing hardware, are making context awareness less expensive to acquire and more reliable. A deeper understanding of context feeds many of the other trends identified in this list.
#5 More companies will introduce lightweight, cheaper (and less capable) AR glasses to the market. Not all users need or want a full “computer” on their heads. There are more ways to add value than a helmet or a heavy, powerful wearable AR display. Some devices offload processing to tethered phones. Others offer wireless, monocular AR glasses that display only heads-up messages. We will also watch for the audio-only AR glasses segment to expand where voice prompts and AI-enabled audio responses satisfy the use case requirements.
UX
#6 New modes of interaction are beginning to complement or replace controllers and virtual keyboards. We are already starting to see more use of eye tracking, gaze, and natural gestures (e.g., pointing with better hand tracking) for inputs; a small hand-tracking sketch follows. Improvements in hand gesture tracking technologies will, in many cases, translate to lower cognitive loads and lower computational loads. Neural inputs via a headband or muscular signals (EMG) via a wristband allow users to control all their digital devices using natural human interfaces. The user’s tongue might even become a source of input. Also, look out for brain sensing with EEG.
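For a taste of controller-free input, here is a small sketch, assuming Google’s mediapipe package: it detects a thumb-index “pinch” from hand landmarks, a common hands-free select gesture. The distance threshold is an illustrative value to tune per device.

```python
# Sketch: treat a thumb-index pinch as a "select" input.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        thumb, index = lm[4], lm[8]  # thumb tip and index fingertip
        dist = ((thumb.x - index.x) ** 2 + (thumb.y - index.y) ** 2) ** 0.5
        if dist < 0.05:  # normalized-coordinate threshold (illustrative)
            print("pinch -> select")
    cv2.imshow("input", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```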
#7 As with #6, new and different sensors in devices will change how users receive and perceive digital data in context in the workplace. In addition to animations, video clips, still images, and text, we will see rapid experimentation and exciting opportunities to use spatial audio and to provide just-in-time instructions and information to users in combination with other wearables (e.g., watches and smart garments).
Infrastructure
#8 Private 5G networks, combined with 5G-compatible hardware and cloud and edge computing, will permit richer experiences without heavier or more power-hungry devices. While the jury is still out on the cost-effectiveness of private 5G networks based on current implementations and use cases, they are gradually improving, and there will be more 5G support in next-generation AR displays. These core enabling technologies will lead to increased adoption of AR experience streaming and collaborative AR experiences; the back-of-envelope budget below shows why latency is the figure to watch.
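To see why streaming becomes plausible, here is a back-of-envelope budget with purely illustrative numbers (assumptions, not measurements); the point is that private 5G and edge rendering shrink the two network legs enough to leave room for rendering.

```python
# Illustrative latency budget for streamed AR (all numbers assumed).
budget_ms = 50  # a commonly cited comfort target; varies by use case
pipeline_ms = {
    "capture + encode on device": 8,
    "uplink (private 5G)": 5,
    "edge render / inference": 20,
    "downlink (private 5G)": 5,
    "decode + display": 8,
}
total = sum(pipeline_ms.values())
verdict = "fits" if total <= budget_ms else "over budget"
print(f"{total} ms of {budget_ms} ms budget -> {verdict}")
```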
#9 Security for AR experiences may be addressed in the network using improvements in off-device and automatic authentication of AR users and devices. Ensuring corporate cybersecurity is an enormous concern for all IT departments, and most AR devices are ill-equipped to meet all the requirements. Expertise in security risk reduction is not a core competency of most AR providers. Innovations that ensure strong corporate data protection and privacy and reduce exposure from intentional or inadvertent AR user actions will come from network technology providers. They and their service provider customers have solutions emerging from research that will be tested in the near future.
Software
#10 Low-code/no-code will continue to gain traction with the assistance of AI. There are now dozens of low-code/no-code solutions available. The challenge is figuring out which ones meet enterprise requirements, including but not limited to security concerns. While AI eats away at the need to manually code experiences, subject matter experts are becoming the authors of more and more custom experiences. The biggest winners from this trend will be medium-sized companies without the engineering resources to meet all their AR use case needs. As low-code/no-code options reach greater maturity and ease of use, the need for dedicated, highly paid AR experience developers and for tools with steep learning curves will diminish.
#11 Standards are increasingly relevant and, combined with expanded support from open-source libraries, reduce the need to develop and maintain display-specific apps and content for delivering experiences across a range of AR devices. Although W3C WebXR continues to evolve slowly, the processing requirements of Web-based solutions are increasingly met by the hardware in a broader range of AR display devices, and improvements in network infrastructure make more edge processing possible. Using the Web to deliver AR experience content is highly scalable and can be deployed entirely on a company’s intranet. Khronos Group’s OpenXR is already widely adopted on AR hardware and, combined with support for glTF, is significantly simplifying the development of content creation platforms (fueling the no-code/low-code trend); a small sketch of the portability argument follows. We expect that other standards will also be adopted for AR experiences.
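As a tiny illustration of that portability argument, the sketch below loads a single glTF asset with the open-source trimesh library; the same file could feed any OpenXR-capable runtime instead of per-device content pipelines. The file name is hypothetical.

```python
# Sketch: inspect a standard glTF/GLB asset with an open-source library.
import trimesh

scene = trimesh.load("work_instruction_overlay.glb")  # hypothetical asset
for name, mesh in scene.geometry.items():
    print(name, len(mesh.vertices), "vertices")
```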
#12 AR developers’ skill sets and tools will become more specialized, and the learning curves will become steeper. While AI and the adoption of standards simplify and accelerate the creation of AR experiences, they also introduce new risks. These are golden opportunities for specialization. AR developers and those with expertise in adjacent fields will increasingly have new offerings, such as deeper integrations with Learning Management Systems, Enterprise Resource Planning, and Product Lifecycle Management platforms. Editing AR experience recordings to preserve knowledge and accelerate its transfer will combine AR expertise with AI tools.