What’s The Immediate Future For Chatbots And Voice Recognition?
Over the next few months, I want to focus here on the intersection between technology and business solutions.
As someone who studied software engineering and later took an MBA focused on organizational psychology, I’m comfortable connecting business needs with technology, but I realize that the two areas often don’t meet. Important technological tools or ideas go unsupported simply because business leaders don’t understand their value.
So let’s begin by looking at voice recognition. In the 1960s, the classic science-fiction TV show Star Trek featured Captain Kirk and his crew talking to a computer. At the time it seemed unbelievable, and in the fourth Star Trek film the engineer Scotty travels back in time to 1986 and is baffled when he has to use a keyboard to operate a computer.
In the present day, the idea of talking to a computer has been completely normalized by the popularity of home assistant devices such as the Amazon Echo and Google Home, and assistants such as Apple’s Siri. I can talk to my Chevrolet and easily send a WhatsApp message using Siri, and that’s a lot safer (and more legal) than trying to type a message while driving.
But voice interactions are still a bit one-sided. I can ask Alexa to tell me a joke, but I can’t ask her to explain why blue contrasts well with yellow or why Monty Python is funny. We can issue commands, but we can’t really have a conversation yet.
But researchers around the world are working hard to improve this. Google demonstrated its Duplex system for the Assistant almost three years ago, claiming it could handle natural speech and make decisions on behalf of the user. The most famous demonstration had the Assistant phone a salon to book a haircut. It was impressive, but at the time I kept thinking: what if the hairdresser taking the booking asked something unexpected? What would the Assistant say?
The reality is that this problem is like an onion: there are layers of complexity. Imagine if the hairdresser responded, “There is no parking available on the day you have booked, so will you be able to find parking elsewhere?” That is a perfectly normal question, yet it doesn’t fit the standard script for booking an appointment. An Assistant that is smart enough would, faced with this kind of problem, say something like, “I will inform my client of the situation and we will contact you again if there is a problem.”
If the Assistant is completely confused, then that’s a problem for everyone. It’s rather like the customer service chatbots that have become so popular as companies ask us to talk to a robot rather than a real person in a contact center. Automating these interactions saves the company money, but when customers have a problem outside the range of what the robot understands, the experience is painful and frustrating.
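The difference between a helpful bot and a maddening one often comes down to one design decision: what happens when the input matches nothing the bot knows. A minimal sketch of that decision, using a toy keyword matcher (the intents, replies, and function names here are invented for illustration, not any real chatbot framework):

```python
# Toy intent table: keyword -> canned reply. Real systems use trained
# classifiers, but the fallback question is the same.
INTENTS = {
    "book": "Which day would you like your appointment?",
    "cancel": "Your appointment has been cancelled.",
    "hours": "We are open 9am to 5pm, Monday to Saturday.",
}

def handle_turn(utterance: str) -> str:
    """Match the utterance to a known intent, or fall back gracefully."""
    text = utterance.lower()
    for keyword, reply in INTENTS.items():
        if keyword in text:
            return reply
    # The crucial design choice: admit uncertainty and hand off to a
    # human, rather than looping the user back to the start.
    return "I'm not sure I understood. Let me connect you to a person."

print(handle_turn("I'd like to book a haircut"))
print(handle_turn("There's no parking that day, is that OK?"))
```

The second utterance is exactly the “onion layer” case: nothing matches, and the quality of the whole experience hinges on that last return statement.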
I recently faced this with PayPal. I was locked out and needed to reset my account, but they wanted to send a reset code to a phone number I no longer use. There was no other way to reset the account, so I started a conversation with the help system, only to find I was talking to a bot. I asked for help updating my phone number, but for that I needed to log in. I was stuck in a Kafkaesque loop, and the bot didn’t understand my frustration. I only reached a real human after calling out their poor customer service on Twitter.
The Watson system from IBM is currently one of the most advanced examples of Natural Language Processing (NLP) and Automatic Speech Recognition (ASR) in the world today. IBM has been working on speech recognition since 1962, when its original Shoebox system could understand 16 different words – perhaps they inspired Captain Kirk?
ASR is already an extremely useful tool for many companies, in several ways.
Think about a customer service center: hundreds of agents are talking to your customers simultaneously. How do team leaders and managers keep quality high and catch problematic calls? They used to sample a small percentage of calls, which meant most problems were missed. Now they can apply ASR to all the voice traffic, convert the conversations into text, and then analyze the content for sentiment or quality – this can also be very important for compliance in regulated industries.
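Once ASR has turned every call into text, the analysis step is ordinary text processing. A sketch of what that downstream step might look like, assuming transcripts already exist (the keyword lists and scoring thresholds below are purely illustrative, not a real quality-assurance rubric):

```python
# Illustrative word lists; a production system would use trained
# sentiment models and a compliance checklist from the regulator.
NEGATIVE = {"frustrated", "angry", "cancel", "complaint", "unacceptable"}
COMPLIANCE = {"recorded", "terms", "consent"}

def score_call(transcript: str) -> dict:
    """Flag a transcript for human review based on simple keyword hits."""
    words = set(transcript.lower().split())
    negative_hits = len(words & NEGATIVE)
    return {
        "negative_hits": negative_hits,
        "compliance_ok": bool(words & COMPLIANCE),  # required phrase spoken?
        "needs_review": negative_hits >= 2,
    }

calls = [
    "thank you this call may be recorded for quality purposes",
    "i am frustrated and angry and i want to cancel my account",
]
for transcript in calls:
    print(score_call(transcript))
```

The point is the coverage, not the sophistication: instead of sampling a few percent of calls, every single conversation gets at least this level of scrutiny, and only the flagged ones need a human reviewer.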
NLP is where the really exciting work is taking place. The IBM Watson research is taking natural voice recognition and processing to a new level, so the system can understand idioms and sarcasm, names and places, and actually interact seamlessly. Business applications include personal assistants, machine translation, chatbots, sentiment analysis, and even spam detection.
As my PayPal experience shows, I am particularly interested in how tools like Watson can improve the customer-to-brand interaction and experience. If I could simply talk to the brand, explain my problem, and be understood, it would be a dramatic improvement on today’s situation, where I first have to work out how to find a human who will listen. NLP could be a game-changer for many companies. Real humans will probably remain important in customer service for several years yet, but more and more of the basic interactions will be handled automatically.
In my lifetime, I believe we will move from commanding Alexa to play a David Bowie album to full, rich conversations. Maybe we won’t be able to discuss the books of Jean-Paul Sartre, but at least it should be possible to ask DHL where my package is and receive a sensible answer.