Google Assistant can be as mind-blowing as it is frustrating, as illuminating as it is annoying. We know what Google has planned for the near term - smart displays, Routines rolling out and more languages.
But what's coming later down the line? Here are five future features we can expect to see, from the people thinking about and building one of the most ubiquitous AI assistants right now.
Marc Paulina, a senior user experience designer on Google Assistant, speaking at the Re:Work AI Assistant Summit, described how teams of Googlers use role play and improv as a "theatrical toolset" to get into the minds of both the user and the voice assistant and to encourage relationship-building empathy.
"You experience the social interaction and the same emotional journey the user has, that's the first step towards empathic design," he said. "You have to learn how to listen before you can create these engaging social interactions. It starts with the context - where/who/why. Then need, the expectations, what do they want to achieve? Then the motivations and the anxieties is the super important part."
Paulina says that Google is now analysing not only accuracy and speed but also human UX elements, such as how comfortable the user is throughout the interaction, how much they trust the AI (judged by how confidently or hesitantly they speak) and the impact on the overall relationship. He also doesn't think we need special emotion-detecting tech to get there: "These are human problems, it's all psychological and there's so much we can do to be empathic through our design process, that we don't have to have a technology to solve a human problem."
An Assistant that knows how to make mistakes...
Really. It's less about cutting out errors entirely, more about how the assistant deals with them. Paulina says that errors are among the most important moments to get the persona and brand right.
"If we understand the user, we can anticipate what these might be and then we can design persona or dialogue strategies to bring the user back on track," he said. "Try to identify where errors will occur, use conversational language and stay in persona. When your voice assistant shows up in this worst case scenario, they're going to feel that emotional connection."
... And not make the same one twice
Yariv Adan, also speaking at the Re:Work AI Assistant Summit, went further, suggesting that Google's personal profile of each individual could become clever enough to process your reactions and adapt accordingly. With voice, we don't simply say 'yes' or 'no'; we show our emotions to the system.
"Google Search isn't smart, it's smart at getting the signals from users. There is a very simple, single signal - the click," he explained. "We know what results users are clicking. In the Assistant, it's very interesting because people are giving us many more, rich signals - 'You're stupid, you're bad, next, skip, volume up, awesome, I love you, not this one, I hate you, I'm a veggie.' So many signals that actually Assistant has the potential to learn amazing things.
"In three to five years, we'll get to at least that type of learning of understanding the reaction and not making the same mistake again. Right now I ask for restaurants and I hate hamburgers, but it gives me hamburgers because that's what everyone likes."
AI that asks the right questions
Adan says that discoverability is still a "huge, unsolved problem" for assistants and that tips and suggestions aren't going to cut it. In an ideal world, everything a user asks for would work, but until we get there, he reckons the trick is getting Google Assistant to ask the right questions:
"So if I say 'I want to buy flowers', then Assistant could say 'Cool, from what provider?' If Assistant can understand what it doesn't understand or get to the level of disambiguation that needs to happen, then it can be better on failures, I think that's the way. Not saying 'Sorry I don't understand' all the time."
Machines that understand our nonsense
One of the demos Adan gets most excited about shows Google Assistant pulling up a 1975 Norwegian animated film called The Pinchcliffe Grand Prix based on his long, rambling description of it, which involves very few proper nouns and relies on phrases like "where the guy steals the car design".
"The truth is humans don't speak so nicely, we think as we go on, we blather especially when we try to remember things," he said. "So to me this is like super crazy magic. You throw all these things at it and it understood me within 500 milliseconds. It finds the right entity in the history of humankind that maps to that crazy bunch of words. I think actually this is the direction... I'm optimistic on this happening: conversing in natural language."