Data Science: Natural Language Processing
We are switching from mainframe topics to data science. Data science is a multi-faceted field and we at IBA Group divide it into classic data science, natural language processing, and computer vision.
The focus of this vlog is Natural Language Processing or NLP. I invited Elena Romanova, a leading specialist in NLP at IBA Group, to share her views and experience of working with NLP technologies.
Mark Hillary, a writer and analyst focused on technology, helps Elena share her ideas.
Mark Hillary: “I remember when we saw the first Google Duplex demo and many people were really stunned by this. The demonstration was Google Duplex calling a hair salon and booking an appointment, and the salon employees didn’t realize they were talking to a computer.”
Many people were stunned by Google Duplex when it was first demonstrated two years ago. I remember watching it book a hair salon appointment without the salon employee realizing they were talking to a computer. How far has natural language processing developed since then?
Elena Romanova: “Ten years ago, we dreamed about the possibility to have a translation service. Now we have this option in Google, Yandex, social networks, and many messengers.”
Google Duplex is a voice assistant. It works in four steps: it converts speech to text, processes the text to understand the entities, generates an answer, and converts the answer text to speech. Each step is a separate artificial intelligence task, and all of them have seen huge progress over the past few years.
Ten years ago, we only dreamed of a translation service that could translate a whole text. Now we have this option in Google, Yandex, social networks, and many messengers. We have computer assistants to answer customer questions in banks, mobile services, railway stations, and more. NLP models help doctors diagnose diseases or prescribe medicines after a proper analysis of the medical history. NLP helps process incoming documents for banks and insurance companies.
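The four steps can be sketched as a simple pipeline. This is only an illustrative sketch, not how Google Duplex is implemented: every function name here is hypothetical, the speech steps are stand-ins that pass text through as bytes, and the entity extraction is a toy rule where a real system would use trained models.

```python
import re


def speech_to_text(audio: bytes) -> str:
    # Stand-in for step 1: a real system would call a speech recognition model.
    return audio.decode("utf-8")


def extract_entities(text: str) -> dict:
    # Stand-in for step 2: toy rules that pick out a time and a service.
    entities = {}
    time_match = re.search(r"\b(\d{1,2})\s*(am|pm)\b", text, re.IGNORECASE)
    if time_match:
        entities["time"] = f"{time_match.group(1)} {time_match.group(2).lower()}"
    if "haircut" in text.lower():
        entities["service"] = "haircut"
    return entities


def generate_answer(entities: dict) -> str:
    # Step 3: build a reply from the entities that were understood.
    if "service" in entities and "time" in entities:
        return f"Booking a {entities['service']} at {entities['time']}."
    return "Could you repeat the service and the time, please?"


def text_to_speech(text: str) -> bytes:
    # Stand-in for step 4: a real system would synthesize audio here.
    return text.encode("utf-8")


def handle_call(audio: bytes) -> bytes:
    # Chain the four steps in order.
    text = speech_to_text(audio)
    entities = extract_entities(text)
    answer = generate_answer(entities)
    return text_to_speech(answer)


if __name__ == "__main__":
    print(handle_call(b"I would like a haircut at 3 PM"))
```

Keeping each step behind its own function mirrors the point that each one is a separate task: any stand-in could be swapped for a commercial or open source service without touching the others.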
– What are the main applications you are seeing clients explore with NLP?
We always have a choice between commercial frameworks and open source. Open source products are free or low priced, but they usually come with trade-offs. They may take much more time for processing, require more data for training a model, and sometimes show worse results than commercial frameworks and services. For example, IBM Watson Assistant requires at least five examples for a chatbot intent to start working, while Rasa (open source) requires about 200.
For speech-to-text and text-to-speech tasks, the most powerful frameworks are commercial, such as Google Speech-to-Text, Watson Speech to Text, and Yandex SpeechKit. For creating a chatbot (assistant), the best are Google Dialogflow CX and IBM Watson Assistant. For processing documents, I would choose spaCy (free, open source) and Apache UIMA (free).
– Are there any situations that are more challenging? For example, how does it manage strong accents?
Sure, there are difficulties in each area. Accents in voice recognition are among them. OCR errors and misspellings in scanned documents cause problems too. Also, NLP models can process only one language at a time, while some documents can be multilingual, for example visa applications or international contracts.
As for accents in voice recognition, the key to converting speech to text is to build a collection of voice recordings and train the model on them to understand the meaning. So, if we want our application to understand several languages and accents, we need a set of recordings for each language and each accent. But usually we don’t need to start each project from scratch, because common services and frameworks already exist. Commercial products like Google Speech-to-Text handle such problems successfully.
– Can companies that offer customer service actively start using this technology now or is it safer to still consider it just for simple questions only? Perhaps a safer idea is for the system to listen to the call and suggest ideas immediately to the human customer service agent?
Every artificial intelligence task is still in the research phase. There is always a part that is already done: it is quite clear, shows good results, and is used by many people. On the other hand, many new ideas and problems to solve keep appearing. One interesting improvement in text-to-speech conversion is making a computer voice sound more humanlike; neural networks are used to modify the tone of speech.
We will always have something to add or improve, but the more people use a technology, the less time it takes to make valuable progress. So learn about the possibilities, use them, and enjoy!
– Elena, can you please share your personal experience of working on NLP projects?
I have been involved in developing NLP models that extract data from protected health information (PHI): diagnoses, adverse events, surgery information, complications, and surveys. The models help doctors take family medical history into account in patient care. NLP analyzes gene test descriptions and final results, the type and grade of mutation, and exact details of test procedures. In addition, we gathered information about treatment from experimental medicine protocols and research. The models extract key points from a corpus of data and help doctors work faster and more efficiently.
Another project is a customer support assistant for a bank. Without customer authentication, the assistant can give general information such as the bank’s address, working hours, and services. For an authenticated customer, it can answer frequently asked questions about card status, credit details, and the number of wrong PIN codes entered. It can also perform operations, such as blocking or unblocking a card and resetting the count of wrong PIN codes. Finally, it directs the customer to a human if it cannot resolve the problem. One advantage of such an assistant is that it can reach the customer through many channels, such as a website, phone, messengers, and social networks.
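The way such an assistant decides what to do with a request can be sketched roughly as follows. This is a hypothetical illustration, not the project’s actual code: the intent names and the `route` function are invented for the example, and two behaviors are assumptions the description does not state, namely that card operations also require authentication and that an unauthenticated customer is asked to authenticate.

```python
# Hypothetical intent names grouped by what the assistant may do with them.
PUBLIC_INTENTS = {"bank_address", "working_hours", "services"}
ACCOUNT_INTENTS = {"card_status", "credit_details", "wrong_pin_count"}
OPERATION_INTENTS = {"block_card", "unblock_card", "reset_pin_counter"}


def route(intent: str, authenticated: bool) -> str:
    """Return the action the assistant takes for a recognized intent."""
    if intent in PUBLIC_INTENTS:
        # General information needs no authentication.
        return "answer"
    if intent in ACCOUNT_INTENTS or intent in OPERATION_INTENTS:
        if not authenticated:
            # Assumption: ask the customer to authenticate first.
            return "ask_authentication"
        return "execute" if intent in OPERATION_INTENTS else "answer"
    # Anything the assistant cannot resolve goes to a human agent.
    return "handoff_to_human"


if __name__ == "__main__":
    print(route("working_hours", authenticated=False))
    print(route("block_card", authenticated=True))
    print(route("mortgage_dispute", authenticated=True))
```

Because the routing is independent of the channel, the same logic could sit behind a website widget, a phone line, or a messenger bot.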
We developed another assistant that helps a train inspector fill in the train state form using a mobile phone and a Bluetooth headset. The assistant provides reference information about train details or malfunctions. It understands a number of voice commands, including logging into the application, setting train parameters, adding comments, and submitting a form. In addition, we had a minor project on sentiment analysis of tweets.
This blog post is a part of a series of video discussions on data science. Please share your thoughts about the discussion and offer your topics for future videos by leaving your comments or suggestions here.