Voice recognition is the enabling technology for virtual assistants like Siri, Google Assistant and Alexa.
What makes this type of technology unique is that it uses voice as the primary mode of interaction. Speaking is faster than typing; it improves accessibility for users with disabilities and helps you carry out everyday tasks in an intuitive way.
The predecessor of the Voice User Interface, the Interactive Voice Response (IVR) system, was designed to help users find answers to their questions through a phone keypad, without the need to speak to a live agent.
The principles used to design graphical interfaces differ from the ones used to create voice user interfaces. In a VUI, designers have no screen on which to display information, options or commands. Audio interfaces must present only the possible interactions and the necessary information, without overloading the user.
According to the Interaction Design Foundation, to build voice experiences you need an understanding of the fundamentals of voice interaction and of how people communicate. In their book “Wired for Speech”, Stanford researchers Clifford Nass and Scott Brave argue that we are “voice-activated”: we communicate with voice assistant devices the same way we communicate with people.
There are three main steps to creating a Voice User Interface:
Create a Persona
You should start by creating an identity for your app. Think about the characteristics you want to associate with your brand and regard it as an opportunity to strengthen the relationship with your users. Make the voice persona as relatable as possible. It has to be “someone” your users would like to interact with in real life. In “The Media Equation”, Clifford Nass and Byron Reeves write that people have a natural tendency to treat computers the same way they treat real people with real feelings.
Set the context and write sample dialogues
Before you start creating a user flow, think about the context in which the communication takes place. To make sure the conversation flows naturally, interaction designers need an understanding of linguistics: you want to build conversations based on how people speak, not how they write. In their research, many designers draw on the cooperative principle and the conversational maxims established by linguist and philosopher Paul Grice in his 1975 article “Logic and Conversation”. The cooperative principle states that dialogue happens by building on each other’s contributions, moving the conversation forward. Grice’s maxims describe conversation “as an exercise in which people try to be helpful”. All conversations should be guided by four principles: quality, quantity, relevance and clarity.
Quality – speakers should say things that are true.
Quantity – each speaker should contribute only what the conversation demands: not too much, not too little.
Relevance – what is said should clearly relate to the purpose of the conversation.
Clarity – contributions should be orderly and brief, not obscure or ambiguous.
After you set the context, the next step is to create a conversational flow between the user and the system. Write down the kinds of questions you want your device to answer and create a script. Treat it as something to show your client before developing further.
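A script like this can be sketched in code before any speech recognition is involved. The following is a minimal, hypothetical sketch (the intent names, trigger phrases and replies are invented for illustration) of a scripted flow that maps user utterances to intents and responses, with a fallback that keeps the conversation moving forward:

```python
# Minimal sketch of a scripted conversational flow.
# Intent names, sample phrases and replies are hypothetical.

SCRIPT = {
    "greeting": {
        "phrases": ["hello", "hi", "hey"],
        "reply": "Hi! How can I help you today?",
    },
    "weather": {
        "phrases": ["weather", "forecast", "rain"],
        "reply": "It looks sunny today.",
    },
}

FALLBACK = "Sorry, I didn't catch that. Could you rephrase?"

def respond(utterance: str) -> str:
    """Match the utterance to a scripted intent and return its reply."""
    text = utterance.lower()
    for intent in SCRIPT.values():
        if any(phrase in text for phrase in intent["phrases"]):
            return intent["reply"]
    # Fail gracefully: a re-prompt keeps the dialogue going.
    return FALLBACK

print(respond("Hello there"))           # → Hi! How can I help you today?
print(respond("What's the forecast?"))  # → It looks sunny today.
```

Writing the script as data, separate from the matching logic, makes it easy to review the dialogue with a client and to rewrite lines without touching code.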
Test with real users
Testing is a critical step in the design process. Information about how real users interact with the device is invaluable and is usually not apparent before release. You might find eye-opening errors in the software, which can be real opportunities for improvement.
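One lightweight way to act on test sessions is to replay transcripts of real user utterances against the script and flag the ones the system failed to handle. A hypothetical sketch (the stand-in `respond` handler and the transcript are invented for illustration):

```python
# Replay recorded user utterances and report the unhandled ones.

FALLBACK = "Sorry, I didn't catch that."

def respond(utterance: str) -> str:
    # Stand-in for the real dialogue handler (hypothetical).
    replies = {"hello": "Hi there!", "weather": "It looks sunny today."}
    for keyword, reply in replies.items():
        if keyword in utterance.lower():
            return reply
    return FALLBACK

def unhandled(transcript: list[str]) -> list[str]:
    """Return the utterances that fell through to the fallback reply."""
    return [u for u in transcript if respond(u) == FALLBACK]

session = ["Hello!", "How's the weather?", "Play some jazz"]
print(unhandled(session))  # utterances the script failed to cover
```

Each utterance in the resulting list is a candidate for a new intent or an extra trigger phrase, which is exactly the kind of improvement opportunity user testing surfaces.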
The market for digital virtual assistants is projected to grow from $3.25 billion in 2019 to $8 billion by 2023. In an increasingly digitised world, the field of Voice User Interaction will become more and more relevant.
If you’re interested in exploring the field of virtual assistants further and learning how to build Voice User Interfaces, check out Voicebot.ai. It contains a wealth of information on how AI and voice assistants will change the world. You might also want to read “Wired for Speech” by Clifford Nass and Scott Brave and Cathy Pearl’s “Designing Voice User Interfaces”. An interesting listen is this Stanford conference, which includes a panel of experts in speech recognition, AI and robotics: Margaret Urban, Jeff Cabili, Cathy Pearl, and Omar Abdelwahed.