Conversation AI - Pittsburgh ML Summit ‘19
Thank you so much for coming out. You heard everything about how to get involved in the community. I am part of the developer advocate team, so I work with people in the community like that, and I get to work with developers every day who are trying to work with our products and use them, however possible, in their own products or companies. Then, when they ask questions or want to troubleshoot something, I'm part of the team that does that.

So today I'm going to be talking about conversational AI and the new user experience. And I don't think I want to do this, because I can't use my hands; it's just difficult.

So, we call either a telephone system that's software-based or any automated machine that we want to have a conversation with, and the reason for that is pretty simple. Most of the systems that we work with, especially on the phone, are based on flowcharts and stuff, where it's like: OK, if somebody says this, do that, and if they say this, do that. It's very rigid. It's not based on how humans talk and how humans interact with each other.

The reasons for some of those problems are many. These are some on the list. Some of the top ones are: the use cases are not defined properly, so you've built a bot or an automated system for something that you should not be building it for. For example, a user probably wants to talk to somebody for a specific thing, and if they are getting a machine for that, it doesn't seem personal to them. Or, on the other side, you might not be telling the user that they are talking to a machine, which could be misleading. Those are some examples, but there are many more that lead to a situation where people have a frustrating experience. And the biggest, obviously, is that the artificial intelligence that you're trying to use is
not defined for the use case that you're solving. We're going to see a lot more of this, but I wanted to set the stage with some of the problems here.

But in spite of all of that, we are still finding that conversation is the new UI, and the reason for that is pretty simple. As years go by and days go by, the number of devices that we interact with every day has grown. Just as an example: a lot of us speak to our Google Home in the morning to learn our calendar and what's going on in the day, or Alexa, or any other smart speaker device. And if you're like me, who likes to play around with these things, then we have those connected with our lights and our fans and our coffee makers, and we're like, "Oh, can you start coffee?" And as we start to work with devices
that make our lives better or more convenient, it's going to be impossible for us to interact with them on a screen, or to look at the app and then start. How inconvenient would it be to look at the app and start the coffee maker when your eyes cannot even open in the morning? So it's just thinking about those types of things. And that's when you're not even disabled or differently abled. In a small situation, like if I'm carrying groceries, coming from my car in the garage to the house, and I can't use my hands but I still want a lock to be open, right? Just little situations like that. And we are used to using devices like that. So as those devices become more prevalent, and as people get more comfortable using them, we are going to have to find ways to interact with them, because hands and eyes will just not be enough. And there are so many of these: how many apps are we going to use and download?

So that is why conversation is becoming the new UI, and there are studies that companies have done which prove that fact. 60 percent of customers want to use self-service as an option to get help, and that is a pretty big number. When we say self-service, this is around customer service: when we place that phone call, or when we try to get help to, say, refund or return our stuff, or get help to buy something new, things like that. 80% of customer interactions can actually be resolved through a well-designed bot. So it all boils down to how we use the conversational technology that we have available today to get to that 80% number for interacting with our users, and to design those well and get to those well-designed bots.
I've talked a lot about why this is important. Now, what a conversational experience is is also important to understand. A conversational experience is anything that we can speak to, in terms of machines. There are different terms that go around. You might have heard about chatbots, voice bots, conversational UX, conversational UI, conversational apps. These are all just terms for kind of the same thing, which is how you interact with machines. It can be a phone, it can be an app; basically a computer on the other side. And all the technologies that are used in any conversational experience are pretty much under the same umbrella, which is natural language understanding.

Natural language understanding is the technology behind it, and it's really how a machine translates what we are saying in human language into computer language, and vice versa. That sounds pretty simple, but there's a lot that happens behind the scenes. I don't expect you to read this, but what I wanted you to get out of this slide is that natural language understanding is a subset of natural language processing. It is the subset which actually helps computers determine what the sentiment in a sentence was, how to paraphrase something the user has said in my own terms, understanding different accents, understanding different types of users, understanding spelling corrections, and (I think I mentioned) sentiment: some of those nuances that come under natural language understanding. All the other terms are kind of useful to know, because you would use text-to-speech to convert text to speech, and vice versa, which is speech-to-text. ASR is useful to know because automatic speech recognition is
the first step that you would do in order to understand what the user is saying, and then you apply speech-to-text and text-to-speech, and on top of that you would apply natural language understanding. I just wanted to make sure that the terms are laid out so everybody understands them. And the power of natural language understanding is that it applies to both text, like written text, as well as voice, so you're not using different technologies for both of those.

OK, so what does it take to actually build a conversational experience? To build a conversational experience, it really doesn't take much. There are a lot of automated tools, and we are going to see some later in the presentation. But to build a good conversational experience, like we saw, 80% of user experiences can actually be served with good conversational experiences, so we need to understand what that good or best conversational experience would look like. And for that, let's see some challenges that
need to be solved in order to get to a good conversational experience, if you were to do this on your own.

The first thing you have to understand is: what are the different ways in which a user will be asking me questions? I'll take an example here. When I ask for coffee in the morning, I can say "I need coffee." Depending on the day and the time and my mood and where I am, there could be different ways in which I ask for coffee: "I need coffee," "Can I just get my coffee?", "Coffee, right now," "Oh, coffee would be amazing right now." Those are very different ways of asking for coffee, and I am only one person. If we go around the room and ask all of us to ask for coffee, we are probably going to come up with thousands of different ways in which people ask for coffee. That goes to say that when you are designing a conversational experience, you need to incorporate how many different ways people can ask for coffee, and that obviously requires machine learning. You're going to take all of that data and train your model to respond to that question. And that leads to the point that if you were doing this on your own, this is probably going to be pretty hard.

Then the second part is entities, and we'll take another example here. Entities are nothing but parameters that you need to grab when a user has said something or asked for something. If I say "I want to set an appointment for 2:00 p.m. tomorrow," then "2:00 p.m.," "tomorrow," and "appointment" are three critical pieces of information I need in order to set up that appointment. So how do you grab those variables from what the user has said in the intent, and use them to take an action?

And then there's the third piece as well, which is context. Context is, I would say, pretty much the most important part of making a bot really smart.
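To make the first two ideas concrete, here is a minimal sketch in Python of matching an utterance to an intent and pulling out entities. It is not how Dialogflow or any trained model works: the word-overlap scoring, the `TRAINING_PHRASES` table, and the `extract_appointment_entities` helper are all hypothetical stand-ins invented for this illustration.

```python
import re

# Hypothetical training phrases per intent; a real system learns from far more.
TRAINING_PHRASES = {
    "order_coffee": [
        "I need coffee",
        "can I just get my coffee",
        "coffee right now",
        "coffee would be amazing right now",
    ],
    "set_appointment": [
        "I want to set an appointment",
        "need an appointment for 2 pm tomorrow",
        "book an appointment",
    ],
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def detect_intent(utterance):
    """Toy matcher: pick the intent whose phrase overlaps most (Jaccard score)."""
    words = tokenize(utterance)
    best_intent, best_score = None, 0.0
    for intent, phrases in TRAINING_PHRASES.items():
        for phrase in phrases:
            phrase_words = tokenize(phrase)
            score = len(words & phrase_words) / len(words | phrase_words)
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent


def extract_appointment_entities(utterance):
    """Grab the time and day parameters with simple patterns (illustrative only)."""
    text = utterance.lower()
    time_match = re.search(r"\b(\d{1,2}(?::\d{2})?)\s*([ap])\.?m\.?", text)
    day_match = re.search(r"\b(today|tomorrow)\b", text)
    return {
        "time": f"{time_match.group(1)} {time_match.group(2)}m" if time_match else None,
        "day": day_match.group(1) if day_match else None,
    }


if __name__ == "__main__":
    print(detect_intent("Oh, coffee would be amazing"))
    print(extract_appointment_entities("I want to set an appointment for 2:00 p.m. tomorrow"))
```

The point of the sketch is the shape of the problem: many phrasings map to one intent, and the parameters have to be lifted out of free text. A hand-written matcher like this breaks quickly, which is why the talk says this needs machine learning.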
Again, another example; imagine any example in this case, but this is really to explain how. For us humans in general, context comes naturally. What I was talking about two seconds ago, I can assume that you guys retain that, right? That's just natural; we don't even have to think about it. And that's the part we need to train the computers or machines to understand, and we do that through variables, which is what we call context.

In this case, let's take an example. I'm trying to book a flight from LA to Hawaii for less than $300. Then I can continue on with that conversation and say, "Oh, how about LA to Pittsburgh for less than $300?" I'm still talking about flights, so that part should be stored somewhere from the previous conversation if your bot does more than just booking flights. Because I assume, if that's the case, you probably also do refunds, you probably also do some other things: rebooking, maybe cars. So depending on what your bot does, you want to store the context of the conversation. You see this happening if you use Google Home and other devices like that: I ask "What's the weather today in San Francisco?" and then I say "Oh, how about in the afternoon?" or "How about tomorrow?" or "How about in Mountain View?", because maybe I'm traveling. So context is pretty important to make the conversation seem
natural, and you want to start off thinking about context as the first thing in a conversation, along with the other things, which are intents and parameters, and how you are going to get all of that. Those are just three; there are a lot more, but I picked out the top three because those are pretty big ones. We'll see some more in this architecture as it evolves.

When you are talking about a conversational experience, in order to build it you need to think about the left-hand side here, which is: you need to make sure it's multi-channel. People these days are not going to communicate with a brand, or anybody, in just one fashion. They are not always going to pick up the phone to do that. They could be in their cars and using the digital voice from there; they could be wanting to get help through Facebook Messenger; they could be on Skype, or any other mobile app, or Google Home. It could be anywhere. So you need to design for a multi-channel experience; you can't be thinking about just one thing. And that itself requires integration with all of those things, so think about that.

Then you have to detect intent; we talked about that already. Then you apply machine learning on it. And the list keeps growing. You have to start to think about natural language processing: once you have all of that, you're going to process the language. And then, say, if you start to go global, you need to think about all the multiple languages that you need to support; it can't just be English anymore. Then think about analysis on that. Think about small talk, because sometimes you may have to engage in conversations like "Oh, how's the weather today at your place?" even though they are there for something else.
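The context idea from the flight example (a follow-up like "how about LA to Pittsburgh?" reuses the earlier intent and price limit) can be sketched as a small piece of per-session state. This is a hedged illustration only: `ConversationContext`, `apply_turn`, and the parameter names are hypothetical and are not Dialogflow's context mechanism.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConversationContext:
    """What the bot remembers from earlier turns in the same session."""
    intent: Optional[str] = None
    parameters: dict = field(default_factory=dict)


def apply_turn(context, intent, parameters):
    """Merge a new turn into the stored context.

    `intent` is None for a follow-up such as "how about Pittsburgh?": the
    previous intent is reused and only the parameters mentioned in this turn
    are overwritten.
    """
    if intent is not None and intent != context.intent:
        # New topic: start fresh so stale parameters do not leak across intents.
        context.intent = intent
        context.parameters = dict(parameters)
    else:
        context.parameters.update(parameters)
    return context.intent, dict(context.parameters)


if __name__ == "__main__":
    ctx = ConversationContext()
    apply_turn(ctx, "book_flight", {"origin": "LA", "destination": "Hawaii", "max_price": 300})
    # Follow-up turn: no intent detected, only a new destination.
    print(apply_turn(ctx, None, {"destination": "Pittsburgh"}))
```

Resetting the parameters when the intent changes is one possible design choice; a bot that also handles refunds or rebooking might instead keep some parameters (such as the booking reference) across intents.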
And then the most important piece, which is that all of those need to then connect to your back end. The back end is very critical. I say this all the time when I do this talk: you can have the best possible natural language experience on the front end, but if it doesn't connect properly to your back-end systems, then it is all worthless. Imagine a situation where I say, "I'm calling about this order, and I want to return the pair of shoes because they don't fit me," or I don't like them, whatever it is. That system needs to go back and look at my order, and it needs to be smart. It first needs to understand what I'm saying, which, say, you've figured
out through natural language processing and machine learning and all of that. But it still needs to go back into whatever the back-end system is. Whether it's a database, Salesforce, or another order processing system, it needs to dip into that (it could be an API) and then come back and give me a response. And that should be done in a way that's seamless, so it can go back and forth during that interaction. So it's very important to understand how scalable that system at the back is, and how I'm going to be able to connect to it in order to provide that conversational experience.

So we talked about all of the challenges that one needs to face in order to build a good conversational experience. If you look at all the red pieces, it's pretty hard to do. It's not as simple as "Oh, let me just build a bot." You can, in probably an hour, but when you start to think about all of this, you're going to find that it's not that easy. So you need some tools in order to build this, so that you're not redoing some of the stuff that is already out there. One of the great tools that we have built at Google is Dialogflow. How many of you have heard about Dialogflow? Oh, OK, a few. That's good; so I'm teaching something new, which is amazing.

With Dialogflow, basically, all of the red pieces you saw here I've pretty much replaced with Dialogflow. It can handle a lot of the machine learning; it comes pre-built with machine learning and natural language processing, so you basically just provide examples. I'll show you the tool right after this.
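Going back to the shoe-return example, the back-end step can be sketched as follows. This is a minimal illustration under stated assumptions: `InMemoryOrders` is a hypothetical stand-in for whatever system of record sits behind the bot (a database, a CRM, an order API), and the `returnable` flag is an invented policy field, not something from the talk.

```python
from typing import Optional


class InMemoryOrders:
    """Hypothetical stand-in for the real back end (database, CRM, order API)."""

    def __init__(self, orders):
        self._orders = orders

    def get_order(self, order_id: str) -> Optional[dict]:
        return self._orders.get(order_id)


def handle_return(backend, order_id, item):
    """Look the order up in the back end and turn the result into a reply."""
    order = backend.get_order(order_id)
    if order is None:
        return f"I couldn't find order {order_id}. Could you check the number?"
    if item not in order["items"]:
        return f"Order {order_id} doesn't include {item}."
    if not order["returnable"]:
        return f"Sorry, {item} on order {order_id} can no longer be returned."
    return f"I've started a return for {item} on order {order_id}."


if __name__ == "__main__":
    backend = InMemoryOrders({"A100": {"items": ["shoes"], "returnable": True}})
    print(handle_return(backend, "A100", "shoes"))
    print(handle_return(backend, "B200", "shoes"))
```

Keeping the back end behind a small interface like `get_order` is one way to let the same conversational front end talk to different systems without changing the dialogue logic.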
You can see that you're not really doing machine learning; you don't need to be an expert in machine learning to use the tool. You can just start to put in some examples of how I ask for coffee and how you ask for coffee, and then it will train itself; you just click a button to train it. Once it's trained, you can enable multiple languages, so as you start to go into different countries, it makes your life easier. It gives you some analysis; it also shows you how your bot is doing over time. You can learn if some of the intents are not working, and you can start to create new intents. So it's a tool that makes it a little bit easier to start with conversational experiences. If you wanted to do it yourself, you would have to think about a lot of that.

It's natural and rich, in the sense that it has natural language processing built into it, so you don't have to train it to learn natural language and how people ask for things. It also understands grammar and stuff, so if I say something that's not accurate English, it would still be able to pick it up, because you can provide examples of bad grammar as well.

So, demo. All right, I have five minutes, so I'm going to try to do this pretty quick. I'm going to pray to the demo gods. What I've done here is build an experience with Django, which is the front end, and it's on App Engine. So the interface that you're seeing here is on App Engine, and it speaks to Cloud SQL for some of the database requirements (we don't have to worry too much about that at this point). All the images that I take from the user go into Cloud Storage. And for everything that needs to go to Dialogflow, I use the Dialogflow API to make a request to it.
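As a rough idea of what a request to the Dialogflow API involves, the sketch below builds the URL and JSON body for the Dialogflow ES v2 REST `detectIntent` method. It only constructs the request; actually sending it needs an OAuth access token and an HTTP client, which are omitted. The project and session IDs are placeholders, and the speaker's own code may well use a client library instead, so treat this as an assumption-laden illustration.

```python
import json

DIALOGFLOW_ENDPOINT = "https://dialogflow.googleapis.com/v2"


def build_detect_intent_request(project_id, session_id, text, language_code="en-US"):
    """Return (url, json_body) for a text query to the detectIntent REST method."""
    url = (
        f"{DIALOGFLOW_ENDPOINT}/projects/{project_id}"
        f"/agent/sessions/{session_id}:detectIntent"
    )
    body = {"queryInput": {"text": {"text": text, "languageCode": language_code}}}
    return url, json.dumps(body)


if __name__ == "__main__":
    # "my-project" and "user-123" are placeholder identifiers.
    url, body = build_detect_intent_request(
        "my-project", "user-123", "I need an appointment for 2 pm tomorrow"
    )
    print(url)
    print(body)
```

The session ID is what ties successive requests from the same user together, which is how the service can carry context from one turn to the next. The response carries a `queryResult` object with the matched intent, the extracted parameters, and the reply text.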
I'm also using two other services, which we'll come back to.

OK, so this is the bot. It can set appointments and explore landmarks. I wanted to show two very different things; that's why this is doing two very different things. Let's say I want to set an appointment for 2:00 p.m. tomorrow, if I can spell. It actually also has spellcheck, but I'm not going to try that, because demos, always. Driver's license; I could say registration or something else. And now it's basically talking to Google Calendar and setting me an appointment. I should be able to see that appointment on my calendar. That's my actual calendar; this is the calendar that you want to see, and that's the appointment that it set up for 2:00 p.m. tomorrow. So I'm basically making a hook into a calendar and calling that.

The other thing that it can do: just in the interest of time, I'm going to upload an image, and then it should be able to dip into the ML API for vision, look at what the image was, and parse the landmark that was in the image.
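For a sense of what "dipping into the ML API for vision" means, here is a minimal sketch that builds the JSON body for the Cloud Vision REST `images:annotate` method with a landmark detection feature. It only constructs the request (sending it requires credentials), and it is an assumption that the demo is equivalent to this; the talk does not show the code, and a client library would hide these details.

```python
import base64
import json

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_type="LANDMARK_DETECTION", max_results=1):
    """Return (url, json_body) asking the Vision API for one kind of annotation."""
    body = {
        "requests": [
            {
                # Image bytes are sent base64-encoded inside the JSON body.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature_type, "maxResults": max_results}],
            }
        ]
    }
    return VISION_URL, json.dumps(body)


if __name__ == "__main__":
    # Placeholder bytes; a real call would read the uploaded image file.
    url, body = build_annotate_request(b"not-a-real-image")
    print(url)
    print(body)
```

Switching `feature_type` to `"TEXT_DETECTION"` asks for text annotations instead, which is the kind of request a receipt-reading bot would make.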
The image itself was the Golden Gate, so it parses that, calls the ML API, and gives the response.

So how is this happening? This is Dialogflow. You go into the intents and you create those intents. I have it so that if somebody says "hi" or "hello," that's when I trigger the bot, and I say, "Yes, I can help you with appointments and exploring landmarks." (I have one minute, so I'm doing this really quickly.) Then, once somebody has said "OK, I want to upload an image," I go into the file upload. For scheduling an appointment, I come here and I say: OK, if people say this, this, and that, like "need an appointment for 2 p.m. tomorrow," then I ask for a couple of things that they have not told me. They didn't tell me that they want to come in for a license, so I ask for that. That's pretty simple, because I just make it a required field, and I don't move forward without them giving me that information. All of that is pretty simple to do.

Now I go to fulfillment to actually fulfill that request. In fulfillment, I write a little piece of code where I say: this is where I want to connect to my calendar function. This is the calendar function that's taking the date, the time, and the appointment type, which are the three pieces of information we extracted, and from there it's setting up an appointment. If it can't find that slot to be open, it will throw an error and say that time is not available, or there's a conflict with that appointment. I'm doing the same for the ML API: I'm calling the Vision API, creating an instance of the Vision API, and then asking just for landmark detection.
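The fulfillment step described above can be sketched as a function that takes a Dialogflow ES style webhook request, reads the extracted parameters, and returns a reply in the `fulfillmentText` field. Everything specific here is hypothetical: `FakeCalendar` is an in-memory stand-in for the real calendar call, and the parameter names `date`, `time`, and `AppointmentType` are assumed, since the talk does not show the actual code.

```python
class SlotTaken(Exception):
    """Raised when the requested slot is already booked."""


class FakeCalendar:
    """In-memory stand-in for a real calendar API (illustration only)."""

    def __init__(self):
        self._booked = {}

    def create_appointment(self, date, time, appointment_type):
        key = (date, time)
        if key in self._booked:
            raise SlotTaken(f"{date} {time} is already booked")
        self._booked[key] = appointment_type


def handle_webhook(request_json, calendar):
    """Read the three extracted parameters and try to book the slot."""
    params = request_json["queryResult"]["parameters"]
    date, time, kind = params["date"], params["time"], params["AppointmentType"]
    try:
        calendar.create_appointment(date, time, kind)
    except SlotTaken:
        return {"fulfillmentText": f"Sorry, {time} on {date} is not available."}
    return {"fulfillmentText": f"You're set for {kind} on {date} at {time}."}


if __name__ == "__main__":
    calendar = FakeCalendar()
    request = {
        "queryResult": {
            "parameters": {
                "date": "2019-10-02",
                "time": "14:00",
                "AppointmentType": "driver's license",
            }
        }
    }
    print(handle_webhook(request, calendar))  # books the slot
    print(handle_webhook(request, calendar))  # same slot again: conflict reply
```

Because the appointment type is a required field in the intent, the webhook can assume all three parameters are present by the time it is called; a more defensive version would still validate them.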
You could also do more: I've done this for receipts as well, where you pass a receipt in the image and look for text annotations in it, and you can then start to do refunds and learn more about what the user is trying to say from the refund perspective, and things like that. So this is just to show the art of the possible with the ML API.

I'll go back to my slides so that you can find me. All of this code is available on my GitHub repository. And if you are pretty new to Dialogflow in general, or to conversational experiences, I have a series called Deconstructing Chatbots on a YouTube channel. It has about 15 episodes, and this demo is also in there, so you can try it by yourself, and I have a step-by-step video to follow along with as well. With that, I'll wrap it up. Thank you; hopefully this was helpful.