Ilya Dudkin, 26/11/2018

The sales of Amazon Alexa are off the charts, and it’s fairly easy to see why. Thanks to state-of-the-art technology, customers can experience a new generation of personal assistant that is capable of much more than any version since Alexa was first released back in 2014. The machine learning processes that Alexa uses fundamentally change the way even routine tasks are done. To appreciate this modern marvel, let’s take a closer look at how Alexa works.

Alexa Artificial Intelligence

Thanks to artificial intelligence, Alexa can adapt to each individual user, but this comes with a catch: the smarter Alexa becomes, the harder it is to build. Think about the way we talk: we use different word patterns and combinations, along with euphemisms, jargon, onomatopoeia and many other ways of phrasing the question we want answered. Such differences in speech and pronunciation are evident even in the most basic commands, such as waking Alexa up.

To wake up, Alexa needs to detect the “wake word”. Once this word is detected, the audio is sent to the speech recognition system in the cloud, which converts the speech to text. This transforms Alexa’s task from a binary problem (was the wake word spoken or not?) into a sequence-to-sequence problem (turning a sequence of sounds into a sequence of words). This problem has given professional software developers a lot of headaches, since the possibilities are endless. The recognizer can no longer listen for a single word; it must consider the entire English vocabulary, and cloud technology is the only scalable solution that can handle such a volume of data. So, if you were ever wondering exactly how much data Alexa uses, the answer is: a lot.

To accommodate everyone’s unique vocal timbre, pitch, inflection and so on, large volumes of data are needed to convert voice commands into text and determine what a sequence of words actually means.
This process is twofold: first, Alexa looks at the plain text, without any extra features, to determine the user’s request. If this does not work, it falls back on the second process, which uses machine learning models trained on a combination of audio and transcripts.

Alexa Deep Learning

We already mentioned that engineers use machine learning to train Alexa, and this training determines what Alexa can do. One area that relies heavily on machine learning is natural language understanding (NLU). There is a lot of mapping going on here, but most often everything starts with rules and expressions. However, you can still run into problems if your request involves multiple factors.

For example, let’s say you ask Alexa “What is the weather in New York?” For Alexa to generate a response, it needs cross-domain intent classification. The application layer finds the weather, and the dialog manager decides whether more information is needed for an accurate response. A language generator produces a prompt, and natural language generation (NLG) gives Alexa the text it needs to say the answer out loud. Usually, when constructing a conversational agent, you would create a template, which works well until the point when you must scale up. Such speech engines use concatenative synthesis, which divides recorded audio into smaller units; the engine then tries to find the best sequence of units to make the audio sound as natural as possible.

So, if you enjoy using irony, sarcasm or other rhetorical devices in your speech, Alexa will be able to recognize what you are saying by piecing together all the snippets. It will also be able to fine-tune its speaking voice to be more natural and soothing. However, if you give Alexa a sarcastic or sassy remark, you can expect the same in return.
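
The idea behind concatenative synthesis, choosing the best sequence of recorded units, can be illustrated with a small sketch. This is not Amazon’s implementation: the `Unit` fields, the pitch-based `join_cost` and the tiny inventory in the usage note are all assumptions made for illustration, and a real engine would also score how well each unit fits the target sound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A recorded audio snippet (hypothetical): its sound and the pitch at each edge."""
    phone: str
    start_pitch: float
    end_pitch: float
    clip_id: str


def join_cost(left: Unit, right: Unit) -> float:
    """Penalty for an audible pitch jump where two snippets meet (assumed cost)."""
    return abs(left.end_pitch - right.start_pitch)


def select_units(targets, inventory):
    """Pick one unit per target phone so that the total join cost is minimal.

    Returns (total_cost, chosen_units). Uses dynamic programming over the
    candidates for each position instead of trying every combination.
    """
    # candidates[i] = every recorded unit that can realise the i-th target phone
    candidates = [[u for u in inventory if u.phone == p] for p in targets]
    if not candidates or any(not c for c in candidates):
        raise ValueError("no recorded unit for some target phone")

    # best holds, for each candidate at the current position,
    # the cheapest (cost, path) that ends in that candidate
    best = [(0.0, [u]) for u in candidates[0]]
    for layer in candidates[1:]:
        new_best = []
        for u in layer:
            cost, path = min(
                ((c + join_cost(p[-1], u), p) for c, p in best),
                key=lambda cp: cp[0],
            )
            new_best.append((cost, path + [u]))
        best = new_best
    return min(best, key=lambda cp: cp[0])
```

With an inventory of two recordings each for the sounds “HH” and “AY”, `select_units(["HH", "AY"], inventory)` returns the pair whose pitches line up most closely at the join, which is the “best sequence of units” in miniature.
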
The engineers behind Alexa use professional narrators who can make Alexa sound like a “wise guy”, and it helps that Amazon has its own audiobook publishing service.

The next stage for Alexa would be to determine a person’s emotions or mood based on their tone of voice and the words they use. In the future, if someone asks Alexa to fetch the weather in New York and does it with a hint of glumness, Alexa will ask whether something is bothering them and whether there is anything it could do, or maybe play a song to cheer them up. It will also be able to hold a conversation much better by remembering what you said earlier, and soon your basic requests will turn into much more animated conversations that include jokes about the weather, your daily commute in bad weather and many other nuances.

Amazon has big plans for Alexa. It does not want to replace human interaction altogether, but it does want to fundamentally change the way humans interact with machines. Amazon does not want Alexa to be an accessory like a smartphone, but rather a personal assistant that you simply cannot live without. This can be very useful for people with limited mobility who cannot easily get up to turn on the television or adjust the thermostat. Alexa can even be adapted for someone with a speech impairment so that it works on their terms.

These are just some of the uses of Alexa that we will be able to see in the future, although they are very important ones. Home automation and entertainment will be completely transformed to provide an unprecedented level of comfort and convenience. If you currently own an Alexa device, it can even help you solve problems that you are experiencing at work or in your personal life. Instead of complaining about all your problems to your best friend, try doing it to Alexa.
It will be able to handle a twenty-minute conversation and produce actionable suggestions to help you out. For example, you can ask it “How do I increase my company’s search engine rankings?” and it will produce a whole list of possibilities for you. With the machine learning that goes into Alexa, very little is off limits, and you will be pleasantly surprised by the results. So, give it a shot. With the rapid advance of technology, you will see Alexa and many other forms of AI in use in the very near future.
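
For readers who want to see the request flow from the “weather in New York” example in code, here is a minimal sketch. It assumes a two-stage understanding step (plain-text rules first, then a fallback that stands in for a trained model), a dialog manager that asks for missing information, and a template for the spoken answer. Every name here (`RULES`, `EXAMPLES`, `lookup_weather`, the intent names) is hypothetical, the forecast is a hard-coded placeholder, and none of it reflects Alexa’s actual internals.

```python
import re

# Stage 1 (assumed): one regular expression per intent, applied to the plain text.
RULES = {
    "GetWeather": re.compile(r"\bweather\b(?:.*?\bin (?P<city>[A-Za-z ]+))?", re.I),
    "PlayMusic": re.compile(r"\bplay\b", re.I),
}

# Stage 2 (assumed): labelled example utterances standing in for a trained model.
EXAMPLES = {
    "GetWeather": ["will it rain today", "is it cold outside", "do i need an umbrella"],
    "PlayMusic": ["put on some jazz", "i want to hear a song"],
}

# Response templates used by the language generation step.
TEMPLATES = {
    "GetWeather": "It is {condition} and {temp} degrees in {city}.",
    "PlayMusic": "Playing some music.",
    "Unknown": "Sorry, I did not understand that.",
}


def classify_by_rules(text):
    """Return (intent, slots) from the first matching rule, or None."""
    for intent, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            slots = {k: v.strip() for k, v in match.groupdict().items() if v}
            return intent, slots
    return None


def classify_by_similarity(text):
    """Fallback: pick the intent whose examples share the most words with the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))

    def overlap(intent):
        return max(len(words & set(example.split())) for example in EXAMPLES[intent])

    best = max(EXAMPLES, key=overlap)
    return (best, {}) if overlap(best) > 0 else ("Unknown", {})


def understand(text):
    """Try the rules first; fall back to the similarity classifier."""
    return classify_by_rules(text) or classify_by_similarity(text)


def lookup_weather(city):
    """Placeholder for the application layer; returns fixed, made-up data."""
    return {"condition": "sunny", "temp": 20}


def respond(text):
    """Dialog manager: answer if enough is known, otherwise ask a follow-up."""
    intent, slots = understand(text)
    if intent == "GetWeather":
        if "city" not in slots:
            return "Which city do you want the weather for?"
        return TEMPLATES[intent].format(city=slots["city"], **lookup_weather(slots["city"]))
    return TEMPLATES[intent]
```

Calling `respond("What is the weather in New York?")` is handled entirely by the rule stage, while `respond("Will it rain today?")` matches no rule, falls through to the fallback classifier, and leads the dialog manager to ask which city is meant. Fixed templates like these are the part that stops scaling as the number of intents grows.
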