Smart speakers and virtual assistants. You have most likely heard of them, or heard stories about them. Possibly you even own one! (I personally don’t trust them. More on that later.)
Awareness of voice-activated speakers is increasing; over half of GB adults claim to know a little bit about them. However, ownership is still low: 5% currently own a voice-activated speaker, and just 10% of non-owners are likely to buy one in the next 12 months.
– Source: Ipsos Connect TechTracker, Q4 2017
These devices have one thing in common: they are always on, always listening, waiting for voice commands that are usually initiated with ‘wake words’ such as “Alexa”, “OK Google” or anything GCHQ deems to warrant putting you on some list. How do you design for zero UI?
Top smart speakers
Virtual assistant: Google Assistant (now available on fridges)
Smart speaker: Google Home
Virtual assistant: Siri
Smart speaker: HomePod
Virtual assistant: Cortana
Smart speaker: Microsoft’s Harman Kardon Invoke
Virtual assistant: Alexa
Smart speaker: Echo
Smart Speaker and Virtual Assistant
Facebook Portal is a video-calling smart speaker that integrates Amazon’s virtual assistant, Alexa.
How we interact
Because we use our voices, our interactions with these devices are naturally different from, for example, typing into a search engine. Queries are more conversational, and we speak to the devices as if we were interacting with another human:
- I want to go on a holiday.
- I want some sushi; where is the nearest place?
- What Nikes should I buy?
Natural language processing (NLP)
The device that was listening to your conversation now actually has a query that requires action. It handles this using what’s known as natural language processing (NLP).
Almost 70% of requests to Google Assistant are expressed in natural language, not the typical keywords people type in a web search. And many requests are follow-ups that continue an ongoing conversation.
– Source: Google
Natural language processing (NLP) can be defined as the ability of a machine to analyse, understand, and generate human speech. The goal of NLP is to make interactions between computers and humans feel as natural as interactions between two humans.
Aside from listening to music and shopping, most users are controlling internet-of-things-connected devices.
- Turn off the lights in the kitchen.
- Set the heating 2 degrees higher.
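To make the idea of NLP concrete, here is a minimal sketch of how commands like the ones above might be mapped to an intent plus slots. The intent names and regex patterns are entirely hypothetical; production assistants use trained language models, not regular expressions, but the input/output shape is similar.

```python
import re

# Hypothetical intent patterns: each maps a regex over the transcribed
# utterance to an intent name; named groups become the slots.
INTENT_PATTERNS = [
    (r"turn (?P<state>on|off) the (?P<device>\w+) in the (?P<room>\w+)",
     "SetDeviceState"),
    (r"set the heating (?P<delta>\d+) degrees? (?P<direction>higher|lower)",
     "AdjustHeating"),
]

def parse_command(utterance: str):
    """Return (intent, slots) for a recognised utterance, else (None, {})."""
    text = utterance.lower().strip(".?! ")
    for pattern, intent in INTENT_PATTERNS:
        match = re.search(pattern, text)
        if match:
            return intent, match.groupdict()
    return None, {}

print(parse_command("Turn off the lights in the kitchen."))
# ('SetDeviceState', {'state': 'off', 'device': 'lights', 'room': 'kitchen'})
print(parse_command("Set the heating 2 degrees higher."))
# ('AdjustHeating', {'delta': '2', 'direction': 'higher'})
```

The extracted intent and slots are what get handed off to the smart home service that actually flips the switch.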
Among other things, we also just want them to tell us a joke.
Attitudes towards voice-activated speakers
I’m not the only one with privacy concerns about these devices. 1 in 3 adults in the UK worry about conversations being recorded, and 1 in 4 feel uncomfortable talking to a machine.
19% of Americans who own one keep the assistant in their master bedroom; let’s hope your partner is not called Alexa.
Designing the invisible
The total number of voice-assistant devices will reach 870 million in the U.S. by 2022 – a 95 percent increase from the total of 450 million estimated for 2017.
– Source: Techcrunch.com
What is zero UI?
Zero-UI is the concept of abstracting away the interface that exists between user and device, so the experience becomes a more seamless interaction with the technology.
The real problem with the interface is that it is an interface. Interfaces get in the way.
– Donald Norman, 1990
Designers for voice need to put themselves into the shoes of the user, take into account principles of social interactions and user intent, and then apply brand personality. The concept of tone of voice has expanded to incorporate other aspects of a brand’s principles and behaviour, from the jokes it makes to the emojis it uses. Now we’re in the realm of user experience design, around conversational UI (but without the interface).
Bret Kinsella (editor of voicebot.ai) gave a keynote speech at the 2018 Smart Voice Summit which further confirmed the urgency for brands to create a voice strategy. Advances in technology (machine learning can now rival human accuracy) and market readiness (16% of Americans own a smart speaker) create the ideal environment for brands to launch their voice app. According to Kinsella’s stats, currently, 41% of consumers prefer voice to mobile apps or the web. The top reason given for this is because it is more convenient.
Designing for voice: there are clear guidelines on how you are expected to ‘design’ a voice experience.
A good UI also means validating user input and managing expectations in order to earn their trust and instil confidence.
– Source, slightly creepy guidelines: The Conversational UI and Why It Matters
Possible applications for voice
Search engines use different methods to return the most relevant content for voice searches. Now is the time to start thinking about how to optimise your content for natural language search.
Things to think about for returning the right results
- Context: at home, at work, on the go
- Device: smart speaker, desktop PC
- Ranking technology: machine learning (which can now rival human accuracy), Google’s RankBrain algorithm
Optimising content for zero UI
With natural language voice search, understanding intent becomes even more important, and navigating its nuances is critical to success. The shift towards more conversational search language is one of the main drivers behind the growth of voice search.
When you’re optimising for voice search, you need to think about SEO differently. For instance, unlike the typical queries you type on your computer, voice search queries are longer than their text counterparts; they tend to be three to five (or more) keywords in length. This means you need to change the way you do keyword research and focus on longer, long-tail keywords.
Voice queries also tend to ask a specific question, typically using trigger words like who, what, where, when, why, how and best.
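As a toy illustration of those two signals (length and trigger words), here is a sketch of a heuristic that flags queries which look like natural-language voice questions. The threshold and word list are assumptions for demonstration, not anything a search engine actually publishes.

```python
# Question trigger words commonly seen in voice queries.
TRIGGER_WORDS = {"who", "what", "where", "when", "why", "how", "best"}

def looks_like_voice_query(query: str) -> bool:
    """Heuristic: voice queries are longer and contain a trigger word."""
    words = query.lower().strip("?!. ").split()
    return len(words) >= 3 and bool(TRIGGER_WORDS & set(words))

print(looks_like_voice_query("where is the nearest sushi place"))  # True
print(looks_like_voice_query("best sushi near me"))                # True
print(looks_like_voice_query("sushi"))                             # False
```

Running your existing keyword list through a check like this is a quick way to see how much of your content is phrased for typed search rather than spoken questions.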
Give the machines what they want
I have talked about the importance of structured data and semantics before, and for good reason: it is one of the biggest contributors to being ranked for voice search queries. Schema markup helps match the search intent with the content most relevant to the query.
- Talesmith – schema and semantics
- 3 invisible – but important – ways we made our website more SEO-friendly
Mobile search traffic has surpassed desktop traffic worldwide. And with the growth in voice-activated digital assistants, more people are doing voice queries. In these cases, the traditional “10 blue links” format doesn’t work as well, making featured snippets an especially useful format.
Write content in a conversational manner, and think about natural-language processing. Build content that answers questions quickly. Make sure structured data markup is integrated into your website where appropriate.
When consumers are searching on their phone, Alexa or another device, your goal is to answer these questions with written content focused on long-tail topics as a part of a greater mix of content included on your site.
Top tips: Use these structured data types
This will most likely soon be your bible: Schema.org Actions. Remember action is intent.
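To ground this, here is a sketch of the classic Schema.org Actions example, a `WebSite` with a `potentialAction` of type `SearchAction`, generated as JSON-LD with Python’s standard library. The URL is a placeholder; swap in your own domain and search endpoint.

```python
import json

# Minimal Schema.org SearchAction markup as JSON-LD (example.com is a
# placeholder). potentialAction tells crawlers how searches can be
# performed on the site -- "action is intent" expressed as structured data.
markup = {
    "@context": "https://schema.org",
    "@type": "WebSite",
    "url": "https://www.example.com/",
    "potentialAction": {
        "@type": "SearchAction",
        "target": "https://www.example.com/search?q={search_term_string}",
        "query-input": "required name=search_term_string",
    },
}

# Embed the output in a page inside:
# <script type="application/ld+json"> ... </script>
print(json.dumps(markup, indent=2))
```

The same pattern extends to other Action types; the key idea is that the markup declares the intent a user can fulfil, which is exactly what a voice assistant is trying to resolve.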
Actually, it should just be your browser’s start page.
Summary (or tl;dr)
With a solid voice strategy that captures the essence of your brand principles and personality, you can begin to shape your content away from traditional single-keyword queries towards more bespoke, long-tail, conversation-like interactions. The importance of structured data cannot be stressed enough.
The US market is ready, and the technology is gaining traction in the UK.