Deep-Domain Conversational Artificial Intelligence
Tim Tuttle | MindMeld
What is MindMeld and what do you do?
MindMeld is a leading Conversational AI company, now offering Deep-Domain Conversational AI, to help companies create intelligent conversational interfaces for apps and devices. The company has pioneered the AI technology behind the emerging generation of intelligent voice and chat assistants.
What's unique about MindMeld's tech vs. others on the market?
MindMeld’s platform leverages a more advanced technology called Deep-Domain Conversational AI. This describes the AI technology required to build voice and chat assistants that can demonstrate deep understanding of any knowledge domain. MindMeld’s Deep-Domain Conversational AI relies on state-of-the-art machine learning approaches, big data techniques to manage large amounts of training data, and the curation of custom knowledge graphs that capture important domain knowledge for any application.
Why are voice-enabled devices like Amazon Echo and Google Home continuing to grow now when before they weren't as popular?
For nearly half a century, AI researchers tried, to no avail, to build speech recognition and language understanding technology that could reach human levels of accuracy. Long touted as the future of computing interfaces, these technologies remained frustratingly out of reach for decades. This all began to change in the late 2000s and early 2010s. Fuelled by massive amounts of data from exploding mobile Internet usage, a long-studied discipline of machine learning called ‘supervised learning’ began to deliver surprisingly promising results. Long-standing AI research challenges such as speech recognition and machine translation began to see leaps in accuracy that dwarfed all improvements made over the previous decades combined.
As a result of these advances in machine learning, virtual assistants, which had a notoriously hit-or-miss track record in their early years, started to see significant widespread adoption for the first time. This trend began in 2014 and 2015 and accelerated in 2016. To compound this trend, 2016 was also the year when every major consumer Internet company launched open developer APIs on nearly every major platform that supports conversational interactions. This includes virtual assistant platforms like Google Assistant, Cortana and Siri as well as messaging platforms like Facebook Messenger, Skype, and Slack. It also includes the new generation of voice-enabled devices like Amazon Echo and Google Home. As a result, any company that is able to build a useful conversational interface can now reach potentially billions of new users across some of the most popular virtual assistant, messaging, and connected device platforms. For all of these new open platforms, human conversation is truly the lingua franca, and any organization that masters the ability to understand the natural language requests of its users will gain a huge strategic advantage in this emerging conversational application landscape.
What kinds of companies are using your tech today?
MindMeld currently powers advanced conversational experiences used by some of the world’s largest media companies, government agencies, automotive manufacturers, and global retailers. MindMeld's customers and investors include Google, Samsung, Intel, Telefonica, Liberty Global, IDG, USAA, Uniqlo, Spotify, In-Q-Tel and others.
What is the future of voice?
Over the next decade, voice will become a feature in every application and on every device where it might prove useful. It is not hard to envision applications where a voice-first experience would be ideal; wearables and virtual reality experiences are obvious candidates. Regardless, GUI-based interfaces are unlikely to become obsolete. There will remain many applications and situations where a traditional GUI will still be preferable.
What impact does AI have on voice assistants?
Conversational applications may seem simple on the surface, but building truly useful conversational experiences represents one of the hardest AI challenges solvable today. The challenge lies in the inherent complexity of human language. Simple applications that support a very narrow vocabulary of commands are straightforward to build using rule-based approaches, but users nearly always find these simple applications trivial and tiresome. Applications to date that have succeeded in delighting users impose few constraints on a user’s vocabulary; they simply let users speak to the application as if they were conversing with another human. Applications like this, which can understand broad-vocabulary natural language, are notoriously complex due to the inherent combinatorial complexity of language, also called the ‘curse of dimensionality’. In other words, the number of different ways a human might phrase even a simple question can quickly explode into many thousands of variations. The human brain is remarkable at making sense of many trillions of language variations in a fraction of a second with near-perfect accuracy. This same feat is all but impossible for today’s most advanced AI technology.
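To make the combinatorial point concrete, here is a minimal Python sketch. It assumes a single simple request ("play some jazz") split into five hypothetical slots, each with a handful of interchangeable wordings. The slot contents are invented for illustration and are not drawn from MindMeld data. Even with only four to six alternatives per slot, the request can be phrased 2,400 different ways.

```python
from itertools import product

# Hypothetical slots for one simple request; every alternative listed
# here is illustrative, not taken from real usage data.
SLOTS = [
    ["", "please ", "hey, ", "could you ", "can you ", "I'd like you to "],
    ["play", "put on", "start", "queue up", "find me"],
    [" some", " a little", " a bit of", ""],
    [" jazz", " jazz music", " jazz songs", " jazz tunes"],
    ["", " now", " right now", " for me", " in the kitchen"],
]


def count_variations(slots):
    """Number of phrasings: the product of the number of alternatives per slot."""
    total = 1
    for alternatives in slots:
        total *= len(alternatives)
    return total


def enumerate_variations(slots):
    """Yield every phrasing by picking one alternative from each slot."""
    for parts in product(*slots):
        yield "".join(parts).strip()


variation_count = count_variations(SLOTS)  # 6 * 5 * 4 * 4 * 5
```

Adding one more slot with five alternatives would multiply the total by five again, which is why hand-written rules struggle to keep up with broad-vocabulary input.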
In the past few years, machine learning approaches, namely supervised learning and deep learning, have proven effective at understanding natural language in a wide range of broad-vocabulary domains. To date, large-scale supervised learning is the only approach that has yielded truly useful conversational applications embraced by millions of users. All of today’s most widely used conversational services (Cortana, Siri, Google Assistant, and Alexa) rely on large-scale supervised learning. All supervised learning systems require two key ingredients: high-quality, representative training data and state-of-the-art algorithms. If the training data reflects the range of user inputs and outputs the application will experience during normal usage, then the algorithms can learn to recognize the important patterns in the data that dictate how each request should be interpreted. The MindMeld Conversational AI platform was specifically designed to handle both the state-of-the-art algorithms and the large training data sets required for building successful voice and chat assistants.
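The two ingredients named above, labelled training data and a learning algorithm, can be illustrated with a toy intent classifier. The sketch below is not MindMeld's implementation. It assumes a tiny hypothetical data set and uses a bag-of-words Naive Bayes model with add-one smoothing, written with the Python standard library only, simply to show how a model learns word patterns from labelled examples.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled training data: (utterance, intent) pairs.
TRAINING_DATA = [
    ("play some jazz", "play_music"),
    ("put on a rock playlist", "play_music"),
    ("can you play my favourite song", "play_music"),
    ("what is the weather today", "get_weather"),
    ("will it rain tomorrow", "get_weather"),
    ("is it going to be sunny this weekend", "get_weather"),
    ("set an alarm for seven", "set_alarm"),
    ("wake me up at six tomorrow", "set_alarm"),
    ("set a timer alarm for noon", "set_alarm"),
]


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count how often each word appears under each intent."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter()
    vocabulary = set()
    for text, intent in examples:
        intent_counts[intent] += 1
        for token in tokenize(text):
            word_counts[intent][token] += 1
            vocabulary.add(token)
    return word_counts, intent_counts, vocabulary


def predict(model, text):
    """Return the intent with the highest Naive Bayes log-probability."""
    word_counts, intent_counts, vocabulary = model
    total_examples = sum(intent_counts.values())
    best_intent, best_score = None, -math.inf
    for intent, count in intent_counts.items():
        score = math.log(count / total_examples)  # log prior
        denominator = sum(word_counts[intent].values()) + len(vocabulary)
        for token in tokenize(text):
            if token in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[intent][token] + 1) / denominator)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


model = train(TRAINING_DATA)
```

Calling predict(model, "will it rain this weekend") picks the weather intent even though that exact sentence never appears in the training data. With nine examples the model is only a demonstration; the representative, large-scale data described above is what makes the approach hold up in real usage.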
An interesting side of the Google Home story is the advertising revenue hit Google may take as the device is adopted. How will voice apps and assistants pay for themselves?
In the long run, any dominant platform which helps connect users with information and services will have ample opportunity to monetize this traffic using a variety of advertising models. While it is too early now for a thriving marketplace to exist which pairs voice queries with advertisers, this marketplace will no doubt emerge in the coming years as voice interactions go mainstream.
How will user interfaces change in apps with more widespread adoption of voice?
While changes may not happen overnight, over the next five years, Conversational AI will become the primary way that we interact with many online services. Most companies will need to build their own specialized conversational experiences, which will help respond to customer requests across a wide range of applications and devices. In order to make this vision a reality, enterprises will need to be empowered with a new generation of tools and technologies for building great conversational applications. MindMeld is well down the path toward providing this new generation of Conversational AI capabilities to power the emerging application landscape.
What are the Top 3 governing principles/best practices for developers and data scientists when building advanced conversational applications?
1. Select a use case that mimics a familiar, real-world interaction so that users will have intuition about the types of questions they might ask. Selecting an unrealistic or incorrect use case will render even the smartest app dead on arrival.
2. Gather a large enough set of ‘ground truth’ training data to ensure that the vast majority of user interactions can be captured and measured. Dipping your toe in the water does not work. Real-world accuracy can only be evaluated after you take the plunge.
3. Employ large-scale AI and analytics to ensure that your application achieves at least 95% accuracy across the long tail of possible user interactions. Spot checking and small-scale user testing will be unable to expose long-tail corner cases which might fatally undermine overall accuracy.
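The third principle can be illustrated with a short Python sketch that breaks accuracy down by intent instead of reporting a single aggregate number. The function name, the intent labels, and the evaluation counts below are all hypothetical. The 0.95 threshold comes from the 95% target mentioned above.

```python
from collections import Counter


def accuracy_report(pairs, threshold=0.95):
    """Summarize accuracy overall and per intent.

    pairs: iterable of (expected_intent, predicted_intent) tuples.
    Returns (overall_accuracy, per_intent_accuracy, intents_below_threshold).
    """
    totals = Counter()
    correct = Counter()
    for expected, predicted in pairs:
        totals[expected] += 1
        if expected == predicted:
            correct[expected] += 1
    overall = sum(correct.values()) / sum(totals.values())
    per_intent = {intent: correct[intent] / totals[intent] for intent in totals}
    below = sorted(i for i, acc in per_intent.items() if acc < threshold)
    return overall, per_intent, below


# Hypothetical evaluation set: one common intent and one rare, long-tail intent.
results = (
    [("play_music", "play_music")] * 960
    + [("play_music", "get_weather")] * 10
    + [("cancel_order", "cancel_order")] * 15
    + [("cancel_order", "play_music")] * 15
)

overall, per_intent, below = accuracy_report(results)
```

In this made-up data the overall accuracy is 97.5%, comfortably above the target, yet the rare cancel_order intent is right only half the time. A spot check of a few dozen common requests would be unlikely to reveal that gap.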
For more on this topic: https://www.mindmeld.com/docs
The content & opinions in this article are the author’s and do not necessarily represent the views of RoboticsTomorrow