Designing for Accessibility (Google I/O'19)


You see this? This is my old hearing aid, and this is the one that I wear today. What's different? This one's flesh colored, and this one's red. It may seem like a small design tweak, but it changed my life. It made me feel as if I belonged again.

You see, right before fifth grade, my mom sat me down and she said, "You need to tell Michelle about your hearing loss." Michelle was my best friend, but I hadn't seen her all summer long, and two weeks before, I was told that I was losing my hearing and it was going to just get worse and worse and worse. But I was 10 years old. I didn't know how to deal with heavy stuff like this. And so when I called her up before school started, we talked about our summers, we talked about sports, we talked about everything but my hearing loss.

On the second day of school, she was standing behind me in line, and she tapped me on the shoulder, pointing to my hearing aid, and said, "What's that?" It was an innocent question, but I didn't know how to respond, and so I said, "A hearing aid," as if she was stupid not to know. And then there was just silence. This silence followed us for the rest of our relationship. She never asked me again about my hearing loss, and I spent that year watching her slowly drift away. I felt different, as if I didn't belong. And she, being just ten years old, didn't know how to deal with this difference. Many kids avoid difference because they're just not sure what to do with it.

And so I found out quickly that I didn't want to be seen as different, and I began this long struggle to try to prove that although I had a hearing loss, it didn't change me. I was still normal. And I did this by overachieving. When I went to college, playing just one sport wasn't enough; I had to play two. I had to go to an Ivy League. I became one of the first few deaf lawyers in the U.S. I did some work at the United Nations, and then I became a designer.

But somewhere along the way, I realized something: I am the new normal. You know that TV show, Orange Is the New Black? Well, I am the new normal. Difference is the new normal. Difference, even if it seems like a limitation, is what makes us fly, what makes us valuable.

Now, I would like you to think about this: disability encompasses all of us. If we all live long enough, we will all get a disability at some point in our lives. And who here has broken their legs or their arms? Really? That's it? Come on. That's an example of a temporary disability. But what comes next is key. We also all experience something called momentary disabilities.

Now I'd like to pull up a volunteer to come up and help me demonstrate what they are. What I'd like you to do, yes, is to pick up that box and those books and then come over. While carrying the box, take a sip of the water. You're going to have to open it, though. Yep. That was pretty impressive. But was it easy or hard? Yes. So thank you very much for your help.

As we go about our lives, we encounter situations where we'll be momentarily disabled, whether we're carrying a box or trying to open up a door. And so disability really encompasses all of us; there are just some of us that experience it a lot more than others.

Now, as a lawyer, I fought for equality in race, gender, and disability, and you would think that I would have been outfitted with the skills necessary to feel accepted and valued by society. But to my surprise, I found my strongest tools when I transitioned from law to design. Design has this powerful ability to shift perceptions, but it's up to you to use it. Up to you.

So finally, it happened. After law school, I went back to the audiologist to get a new hearing aid, and I was just thrown, because they weren't just these flesh-colored things anymore. They made red ones and blue ones and green ones. So I opted for the bright red one, and then something magical happened: my hearing aid became cool. People started saying things like, "I love the red." This little thing created this huge shift in my life. It allowed me to celebrate my difference, and it allowed others to join in celebrating this difference with me. This is because it opened up the door to conversing about difference without being focused on limitations.

Okay. Thank you, Elise. That was a beautiful talk, and it was a very good introduction to our story, which we call Project Euphonia. We're going to start the story by telling you about one of our colleagues at Google.
This is Dimitri Kanevsky. Dimitri, it turns out, is a mathematician. He's worked at some of the great institutions for mathematics in the world, but for the last two decades he's really been thinking primarily about designing for accessibility, that is, trying to invent technology that is helpful in some way or other. Dimitri himself has a disability: he's deaf, and he also has a very strong Russian accent. So the first time that I met Dimitri, I found it very hard to understand what he was talking about. But, you know, hanging out with Dimitri, eventually you get the idea. It turns out that our computers have the same problem. That is, when Dimitri speaks to his phone as I might speak to my phone, his phone doesn't understand him very well, and this is a clip in which he explains that himself.

"Google has very good general speech recognition, but if you do not sound like most people, it will not understand you."

What you see from this is that the phone that was being shown was running the Google Cloud speech recognition model, and what I would claim is that if you only looked at the phone, you would not be able to really understand the thread of Dimitri's conversation, of what Dimitri was trying to communicate. So we asked ourselves the question: why is that the case? Why is it that the phone was not able to understand Dimitri but, for example, is able to understand me?

In order to explain this, I need to tell you a little bit about how speech recognition works and why it is that speech recognition has gotten so much better over the past number of years. When we speak, what we're doing is creating a waveform. A waveform is just a sound wave, and on its own it looks rather unintelligible.
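To make the waveform concrete, here is a minimal sketch of a sound wave as a list of samples, together with the frequency-by-time picture (a spectrogram) that speech recognizers commonly work from. It is an illustration only, not the pipeline used in the talk: the synthetic 1000 Hz tone stands in for recorded speech, and the sample rate, frame size, and function names are all assumptions chosen to keep the example small.

```python
import cmath
import math


def make_tone(freq_hz, sample_rate, n_samples):
    """Synthesize a pure sine tone as a list of samples (a stand-in for speech)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]


def spectrogram(samples, frame_size, hop):
    """Cut the waveform into overlapping frames and return, for each frame,
    the magnitudes of a naive discrete Fourier transform (bins 0..frame_size/2)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window tapers the frame edges to reduce spectral leakage.
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size))
                    for n, x in enumerate(frame)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            total = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n, x in enumerate(windowed))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


SAMPLE_RATE = 8000  # samples per second (assumed)
FRAME_SIZE = 64     # samples per frame (assumed)

tone = make_tone(1000, SAMPLE_RATE, 512)
spec = spectrogram(tone, FRAME_SIZE, hop=32)

# The strongest frequency bin in the first frame, converted back to hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames,", len(spec[0]), "bins, peak near", peak_hz, "Hz")
```

Each row of `spec` is one column of the picture: how much energy the sound has at each frequency during one short slice of time. Real systems use a fast Fourier transform and further processing; the naive transform here is only meant to be readable.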
The job that we're asking a computer to do is to take the picture on the left and to somehow turn it into the words that are being said. As you all know, humans have gotten very good at interpreting pictures, and so the way that speech recognizers work is that we first take the waveform and turn it into a picture. The picture is called a spectrogram, and it's just a picture of colors, but it's still unintelligible as to what was being said. Then what we do is take the picture and stick it into a neural network, which is a big computer program that has lots of parameters in it, and the idea is to make the computer program so that it outputs what was being said.

Now, of course, just like us, if you don't train the computer program, it has no idea what was being said. So what we do is take all of the numbers in this computer program (there are millions of numbers that you have to tune) and we give it one sentence at a time, somebody saying something. The computer predicts what is being said, and then it gets it wrong, and we bang the computer over the head, twiddle the parameters around a little bit, until eventually, by giving it lots and lots of sentences, it gets better at speech recognition, and we have phones that work for people whom the computer has heard.

Now, in order to do that, it takes huge numbers of sentences: tens of millions of sentences need to be given to the computer for it to develop a general type of understanding. But the problem is that for people like Dimitri, or indeed anyone who speaks in a way that is different from the pool of examples that the computer was given, the phone can't understand them, just because it's never heard the example before.

And so the question that we asked, and this was a question that we started asking in collaboration with an ALS foundation that we've been working with, ALS TDI, who gave me this T-shirt, was whether or not it's possible to basically fix the speech recognizers to work for people who are hard to understand. Dimitri is amazing, and he decided to take this on. Remember what I said: it takes tens of millions of sentences to train a speech recognizer. It's completely crazy to ask someone to sit and record tens of millions of sentences. But Dimitri has a great spirit, and so he sat in front of his computer and he just started recording sentences. For example, here is a sentence: "What is the temperature today?" The computer would say, "What is the temperature today?" and Dimitri would read, "What is the temperature today?" He sat there for days recording these sentences, until he had recorded upwards of 15,000 sentences.

We then decided to train the speech recognizer to see if it was able to understand him. I should tell you that none of us knew whether or not it was even conceivable that this could work, because, as I said, it took many more sentences to train the thing in the first place, for people who speak in a way that is more typical for speech recognizers. So here's Dimitri at the end; he was still happy after doing this. I'm now going to show you a quick clip of what happened.

"We need to make all interactive devices be able to understand any person who speaks to them."

What you see is that the device on the right was able to understand Dimitri, whereas the device on the left, which is the Google Cloud device, was not. This really gave us confidence that it was possible to make progress on this task, and so we started working in earnest with our collaborators at ALS TDI, who recruited a large number of people with ALS to start recording sentences.
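The idea of training on a large general pool and then adapting to one speaker with far fewer examples can be sketched with a toy model. This is not the speech recognizer from the talk: a real recognizer has millions of parameters, while the model below has two, and both datasets are synthetic numbers invented for illustration. What it does show is the loop described above, where the model sees one example at a time and its parameters are nudged slightly after each wrong guess, first on general data and then on a small personal set.

```python
def predict(params, x):
    """A two-parameter linear model standing in for a neural network."""
    w, b = params
    return w * x + b


def mean_squared_error(params, examples):
    """Average squared gap between the model's guesses and the right answers."""
    return sum((predict(params, x) - y) ** 2 for x, y in examples) / len(examples)


def train(params, examples, learning_rate, epochs):
    """Show one example at a time; after each guess, nudge the parameters
    a little in the direction that shrinks the error (stochastic gradient descent)."""
    w, b = params
    for _ in range(epochs):
        for x, y in examples:
            error = (w * x + b) - y
            w -= learning_rate * error * x
            b -= learning_rate * error
    return (w, b)


# Hypothetical "typical" data: many examples following y = 2x + 1.
general_examples = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(-20, 21)]
# Hypothetical data from one speaker: only five examples, following y = 3x + 0.5.
speaker_examples = [(i / 2, 3.0 * (i / 2) + 0.5) for i in range(-2, 3)]

# Step 1: train from scratch on the general pool.
general_model = train((0.0, 0.0), general_examples, learning_rate=0.05, epochs=200)
# Step 2: start from the general model and keep training on the small personal set.
personal_model = train(general_model, speaker_examples, learning_rate=0.05, epochs=200)

loss_before = mean_squared_error(general_model, speaker_examples)
loss_after = mean_squared_error(personal_model, speaker_examples)
print("error on the speaker before adapting:", round(loss_before, 4))
print("error on the speaker after adapting: ", round(loss_after, 4))
```

The general model does well on typical data but poorly on the one speaker; continuing training from it on a handful of that speaker's examples closes the gap. Whether the same holds for real speech with few recordings is exactly the open research question the talk describes.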

We want to see if this works. Now, of course, getting someone to record 15,000 sentences is completely crazy; that's never going to work at scale. So instead we are investigating, technically, whether or not it's possible to make progress with smaller numbers of sentences. What I can report to you is that we're making progress. We're not there yet; we do not feel that we've solved this problem in any way. But we're working hard, and there are groups of engineers at Google who are working hard. This is just a little example: the left column is the ground-truth phrases, the rightmost column is what Google Cloud recognizes on this particular person, who happens to have ALS, and the middle column is what our recognizer is doing right now. We're hard at work trying to figure out if it is possible to make this work for people without requiring so much training data.

This is Dimitri as of this week. Dimitri now carries around with him about five different phones in his pocket, each of which has a different speech recognizer on it, and he's testing and trying to figure out the best way. It is our hope that we can get this to work with Dimitri's help and with all of your help, and hopefully people will make recordings for us. The reason for this call for data that Sundar made is that we need more data from people, just recordings, to be able to make this work. Hopefully we will get there; that is our goal.

And so this is sort of the general goal of Euphonia's mission: what we would like to do is to improve communication technology by including as many people as possible, whatever features the people have and whatever means they have to communicate. Of course, speaking is an important way of communicating, but it is not the only way that we communicate. We communicate with each other by looking, by feeling, by doing so many different things, and there are people who don't have the ability to speak. So now I'm going to turn it over to Irene, who will start to talk about other modalities.

All right, thanks, Michael. So far we've talked about Dimitri and about speech, but what about other forms of communication? What about folks who can't communicate verbally? We want to show you how we're approaching the research for those types of cases as well.

For that, I'd like to introduce our second protagonist for the day, the amazing Steve Saling. He's an incredible person. He had a brilliant career as a landscape architect, and when he learned that he has ALS, he set about to rethink how people with his condition get care. He also started thinking about how he could leverage technology to create more independence for himself, so that he didn't have to rely as much on other people to take care of him. One thing he helped do was create a smart-home-like system that lets him request an elevator, close the blinds, and turn on the music, all by using his computer. It's really amazing. So Steve happened to be one of the perfect people to partner with for this research, because he is a technologist himself.

Speaking of computers, we want to show you how many folks who have ALS communicate today. They use something called an eye gaze pointer to type out letters one by one. These are two different systems that they can use: either a keyboard or, on the right, something called Dasher. It works, it does the job, but as you can imagine, it's just a little bit slow. What he's missing is a layer of communication that all of us are familiar with: interruptions, mannerisms, jokes, laughs. Synchronous communication that comes by quickly is something that's really hard for Steve and people with his condition to do.

So something we wanted to try with him was to see if he could train his own personal machine learning models to classify different facial expressions. The thought was: is this even useful for him, to be able to trigger things more quickly, so that he might be able to open his mouth and trigger something on the computer, or raise his eyebrows and trigger something else? It was a research question, and we didn't know the answer. So with Steve's feedback, his ideas, and a lot of testing, we developed a machine learning tool that anybody can use to train classification models in the browser.
By classification I mean a model that tries to predict what category a certain type of input belongs to. Let me show you an example to see how it works. This is my colleague Baron, and he's training two classes, one to detect his face and one to detect this really cute cat pillow that he has. He's giving the computer a bunch of data, he's training it, waiting for it to finish, and then he's testing the model on the right. And then he publishes the model. All of this is happening in the browser in real time, and the processing is happening on his computer, so the images aren't being sent to a server.

We're calling this Teachable Machine. It's a tool for anybody to train machine learning models in the browser without having to know how to code, and it's built on top of TensorFlow.js, so all of the underlying technology is free and open source for you to use.

So, okay, how is Steve using this? Well, as I mentioned, he's training face classification models for cases where he might want a faster response time than what he can achieve with his eye gaze pointer, and Teachable Machine is the prototyping tool that's allowing him to do this and explore what types of use cases are actually helpful for him.

So why is this useful? Well, Teachable Machine is situational in two ways. First, ALS changes over time; people with the condition deteriorate over time, so Steve might be able to do an expression today that he can't do in a year. He has to be able to retrain those models on his own, perhaps week by week or month by month, as he needs it. The second thing is that you might imagine that he might want to use different models for different use cases. One thing that he actually tried was training a model that would trigger the sound of an air horn when he opens his mouth and trigger a boo when he raises his eyebrows. He used it one night to watch a basketball game with one of his favorite teams, to react quickly to the game as it progressed. Unfortunately, that night his team didn't win, but it was actually really fun to set up.

So we've got a long way to go with this research. This is really only the beginning.
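The expression-triggered sounds described above can be sketched as a small classifier whose predicted class is mapped to an action. This is a stand-in, not Teachable Machine's implementation: the real tool trains image models in the browser with TensorFlow.js, whereas this sketch uses a nearest-centroid classifier over two made-up features (mouth openness and eyebrow height, each between 0 and 1). The feature values, class names, and action names are all hypothetical.

```python
import math


def centroid(vectors):
    """Average a list of equal-length feature vectors, component by component."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def train_classifier(labelled_examples):
    """Training here is just remembering the average example for each class."""
    return {label: centroid(vectors) for label, vectors in labelled_examples.items()}


def classify(model, features):
    """Predict the class whose average example is closest to the input."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical training data: [mouth_openness, eyebrow_height] per example.
training_data = {
    "neutral": [[0.10, 0.20], [0.15, 0.25], [0.05, 0.20]],
    "mouth_open": [[0.90, 0.20], [0.80, 0.30], [0.85, 0.25]],
    "eyebrows_raised": [[0.10, 0.90], [0.20, 0.80], [0.15, 0.85]],
}

# Which action each expression triggers; a neutral face triggers nothing.
ACTIONS = {"mouth_open": "air_horn", "eyebrows_raised": "boo"}


def react(model, features):
    """Return the name of the action to trigger, or None for no action."""
    return ACTIONS.get(classify(model, features))


model = train_classifier(training_data)
print(react(model, [0.88, 0.22]))  # an open mouth
print(react(model, [0.12, 0.88]))  # raised eyebrows
print(react(model, [0.10, 0.20]))  # a neutral face
```

Because training is only a matter of supplying a few new examples per class, retraining as someone's range of expression changes is a call to `train_classifier` with fresh data, and a different `ACTIONS` table gives a different use case with the same model.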

We hope to expand the tool to support many more modes of input. The tool itself will be available later this year for anyone to train their own classification models, but as I said before, all of the technology is already available in TensorFlow.js.

We're committed to working with people like Steve and Dimitri to make their communication tools better, and the idea really is to start with the hardest problems, which might unlock innovations for everyone. It's our sincere hope that this kind of research might help people with other types of speech impairments, people with cerebral palsy or Parkinson's or multiple sclerosis. And perhaps one day it could be helpful to even more people, people who freely communicate today, like folks who have an accent in a second language.

In fact, we started calling this approach to building "start with one, invent for many." We think anybody can work this way, and you can apply it to many more types of problems. The idea is actually quite simple: start by working together with one person to solve one problem. That way you can be sure that what you make for them will be impactful to them and the people in their lives. And sometimes, it doesn't always happen, but sometimes what you make together can go on to be useful to many more people. Start with one, invent for many.

If you'd like to hear more about this project and about start with one, if you'd like to hear more about Dimitri and Steve and actually play with Teachable Machine, we have all these projects in the Experiments sandbox tent, which is really close to the stage.

And finally, we'd like to invite you to help this research effort. As Michael was saying, we don't expect people to record fifteen thousand phrases in order to get a model like this, so we need volunteers to share their voice samples with us, so that we may one day generalize these models. So if you or anyone you know has hard-to-understand speech, we'd like to invite you to go to this link and submit some samples, and hopefully one day we can make these models more widely accessible to everyone. Thank you.

2019-05-20 17:33


Comments:

Grayson Peddie, 2019-05-22 09:07

Google Developers, how can I help correct the closed caption when I see "hearing laugh" and "hearing lapse?" I think she said "hearing loss." It seems the automatic closed captioning is not picking up the "ss" sound when she says "hearing loss." 4:03 "Disability encompasses ah of us" should be "Disability encompasses all of us." Seems to me the automatic closed caption thinks she said "ah" instead of "all." 4:15 For confirmation, did she say "yeah" in "who yeah has broken their legs or arms?" 6:26 "From na2 design." I don't know if I was lipreading right, but didn't she say "from my to design?" Could automatic closed captioning system do lipreading in the future? 7:20 Again, I did some more lipreading and I think she said "it allowed me to celebrate" and the dynamics of her voice caused automatic closed caption to miss "me" after "allowed" as in "it allowed to celebrate." 7:58 "So this is Dimitri connects key" should be "So this is Dimitri Kanevsky." 20:57 "Start with one invent for money" should be "start with one invent for many." At least she said "start with one, invent for many" the second time and automatic closed caption got it right.

Satyan Khajuria, 2019-05-22 15:12

For me by my
