Researching Building & Consuming Responsible ML | BDL207
Machine learning is transforming the way that we do business, but as we become more dependent on this cutting-edge technology, we need to consider how to build positive outcomes while preventing unexpected results. To talk more about this emerging area, responsible machine learning, please welcome our esteemed panel: we have Willem, we have Hanna Wallach, and we have Mehrnoosh.

Thank you so much, all, for joining on Build Live.

Thank you. Thank you for having us.

Of course, of course. So I don't want to introduce you, because if I was introducing you it would just be "each one of these people is amazing," and that's all you need to know. What I'd like you to do is each introduce yourself. So Willem, do you want to go first?

Sure, yeah. I'm Willem. I'm from the Netherlands. I work as a technology advocate for Info Support, and I typically help customers get started with AI, machine learning, data platforms, this sort of stuff. That's what I like. In my spare time I run the Global AI Community, which is a very, very large group of people across the world that are doing all sorts of cool things around AI, trying to teach each other about machine learning and AI. And generally coding. That's what I love.

Fantastic. And Hanna, can you introduce yourself?

Yeah, absolutely. Hey everyone, I'm Hanna Wallach. I'm a senior principal researcher at Microsoft Research New York. I'm a machine learning researcher by training. I've been doing machine learning for about 20 years now, so way before this stuff was fashionable. Nowadays most of my work focuses on issues of fairness, intelligibility, and stuff like that as they relate to AI and machine learning. I run Microsoft's internal Aether working group on fairness, and I'm also a contributor to the Fairlearn toolkit that was released this week.

Fantastic. And we'll get to that very, very shortly.
But first, Mehrnoosh, can you introduce yourself for us as well?

Absolutely. My name is Mehrnoosh Sameki. I'm from Microsoft Boston, or really Cambridge, and I'm a senior program manager driving the product efforts behind two of our open-source responsible ML offerings, InterpretML and Fairlearn.

Fantastic. So, for my descriptions: we have an absolute community hero in Willem. Hanna just blows me away every time I see her speak about the amazing stuff she does around ethics in AI. And Mehrnoosh is obviously pulling research into actual product. So I just want to thank everyone for joining us. We have a little bit more time than normal to quiz you all, or hear your perspectives, on responsible ML. I would love to start, because we briefly mentioned it in some of the introductions, with some of the announcements that have happened at Build. So Mehrnoosh, can you tell us a little bit more about responsible ML, but also the part you took in it from the Azure Machine Learning team?

Yeah, absolutely. As you mentioned, we have had some major announcements at this conference, which is pretty exciting. We have two particular open-source toolkits, InterpretML and Fairlearn. The first one is for helping you understand your models better: how did my model make its decision? What factors went into the decisions of my model? That can be at the global level, how the model is making its predictions overall, or at the individual level: for this particular person, what factors went into his or her loan getting rejected?

Besides that, we have a fairness assessment and unfairness mitigation toolkit that was also announced at this conference, called Fairlearn, which is based on the awesome research from Hanna and her team, Miro Dudík. That one exists because we understand that AI systems can behave unfairly, and there are multiple types of harms that can arise. This toolkit is to enable you to really understand and assess the unfairness that is happening in your model, and then of course it offers a set of mitigation algorithms to mitigate that unfairness.
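The global-versus-individual distinction described above can be made concrete with a tiny linear scoring model, where each feature's contribution is simply its weight times its value. The following is a minimal sketch under stated assumptions, not InterpretML's API: the feature names, weights, bias, and applicants are all invented for illustration.

```python
from statistics import mean

# Hypothetical linear loan-scoring model: score = BIAS + sum(weight * value).
# All names and numbers here are invented for illustration.
WEIGHTS = {"income_k": 0.04, "student_loan_k": -0.05, "years_employed": 0.10}
BIAS = -2.0


def score(applicant):
    """Raw model score; higher means more likely to be approved."""
    return BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)


def local_explanation(applicant):
    """Individual level: each feature's contribution to this one score."""
    return {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}


def global_explanation(applicants):
    """Global level: mean absolute contribution of each feature overall."""
    return {
        name: mean(abs(WEIGHTS[name] * a[name]) for a in applicants)
        for name in WEIGHTS
    }


applicants = [
    {"income_k": 60, "student_loan_k": 30, "years_employed": 2},
    {"income_k": 80, "student_loan_k": 10, "years_employed": 5},
    {"income_k": 40, "student_loan_k": 40, "years_employed": 1},
]
rejected = applicants[2]

print("score:", score(rejected))
print("why this applicant:", local_explanation(rejected))
print("what matters overall:", global_explanation(applicants))
```

For the third applicant, the local view points at the student loan as the factor pulling the score down the most, while the global view ranks income as the most influential feature across everyone. Real models are rarely this simple, which is where dedicated explainers come in.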
Now, we were aware that lack of access to actual, practical tools for developing and deploying artificial intelligence responsibly has been the number one blocker for a lot of executives, data scientists, and machine learning developers who want to really get AI right. That's why, besides releasing them in the open so that everyone can benefit from them and everyone can join the effort and co-develop with us, we also integrated them with Azure Machine Learning. We want all of our Azure clients to have access to this state-of-the-art set of tools, to really put a lot of control, understanding, and analysis around their machine learning life cycle.

I was going to say, I love this in Azure Machine Learning, I love this in the SDK. I just haven't even had the chance at the moment to go in there and take a look, so I'm pretty excited, and starting next week it's the first thing I'm going to be doing. But it's really, really interesting to see that it's coming out of research and then straight into our products. I'm a big fan of Azure Machine Learning, so I love, love, love that it's in there. And I guess that probably segues us very nicely to Hanna. Hanna, you mentioned you've been working on this open-source framework, Fairlearn, and you also have a much wider scope of fairness, accountability, transparency, and ethics in your research space. Can you tell us a little bit about how the toolkit fits in that space? I'm assuming it's not a one-and-done thing; there's more to it than this.

Yeah, that's right, absolutely. I'd love to talk a bit about this. One of the things that we've seen over the past few years is this growing realization that AI systems really can end up discriminating against or disadvantaging already disadvantaged groups.
In order to really engage with these issues, we have to approach them not just from a software perspective, from a technical perspective, but from a broader socio-technical perspective. The reason why AI systems can behave unfairly is, yes, oftentimes because of the data, or decisions that are made during the development and deployment life cycle, leading to particular characteristics of models or systems as a whole. But ultimately it's when these systems interact with the real world, and the people who are using them or whose lives are impacted by them, that we start to see that kind of unfair behavior. And because we're dealing with a topic that is fundamentally socio-technical, we have to draw on a wider range of tools and resources than simply software.

So yes, Fairlearn is incredibly useful for really understanding particular types of fairness issues with machine learning models: specifically, issues around allocation of resources or opportunities, and also issues around quality of service. So, does my machine learning system perform as well for one group as it does for another? But when we start digging into these kinds of issues, how they affect people's lives, and the different ways that we might want to mitigate them throughout development and deployment, we have to use other things as well as software tools.

One of the things, for example, that my team has been working on is an AI fairness checklist. And again, I don't mean to imply that this is one-size-fits-all, you know, just use this checklist and you'll be done. In fact, many of the items on this checklist are much more intended to provoke conversations and discussions, and really to make sure that teams are engaging with this space, even though it is difficult and maybe outside the realm of things that technologists normally think about. But we think that all of these resources together, things like the checklist, things like Fairlearn, even broader education in this space, and helping people think about these kinds of issues from day one of development, are really the first step towards making progress.

Another way to put all this is that when we're thinking about issues like fairness, it's a bit of a culture change: moving away from the traditional tech-first mentality of "can I do this, will it work," and moving to more of a people-first mentality: if I do this, what will be the impacts on people, on their lives, on society as a whole? And of course, in order to make that kind of culture change, you need a wide variety of resources, tools, materials, and so on.

That's amazing. I know you've been doing work in this for quite a while, so it's super exciting to hear both the collaboration into engineering, but also that statement of "we're all on this learning journey," right? The toolkit is going to help those of us who are building models to hopefully rectify those mistakes. But I love when you say this has got to be designed, it's got to be talked about right from the start, just like we say about security.

Exactly. It's a lot like security and privacy, in that fairness is not something that can be left as an afterthought.
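The two kinds of issue named above, allocation and quality of service, map onto two simple per-group measurements: how often each group receives the positive outcome, and how often the model is right for each group. Here is a minimal plain-Python sketch of that idea. It is not Fairlearn's API, and the labels, predictions, and group names are invented toy data.

```python
from collections import defaultdict


def by_group(y_true, y_pred, groups):
    """Return {group: {"selection_rate": ..., "accuracy": ...}}."""
    rows = defaultdict(list)
    for truth, pred, group in zip(y_true, y_pred, groups):
        rows[group].append((truth, pred))
    report = {}
    for group, pairs in rows.items():
        n = len(pairs)
        report[group] = {
            # Allocation: how often does this group get the positive outcome?
            "selection_rate": sum(pred for _, pred in pairs) / n,
            # Quality of service: how often is the model right for this group?
            "accuracy": sum(truth == pred for truth, pred in pairs) / n,
        }
    return report


def disparity(report, metric):
    """Gap between the best-off and worst-off group for one metric."""
    values = [metrics[metric] for metrics in report.values()]
    return max(values) - min(values)


# Invented toy data: 1 = approved (prediction) or repaid (truth), 0 = otherwise.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

report = by_group(y_true, y_pred, groups)
for group, metrics in report.items():
    print(group, metrics)
print("selection-rate gap:", disparity(report, "selection_rate"))
print("accuracy gap:", disparity(report, "accuracy"))
```

In this toy data, group "b" is never approved and is also served less accurately, so both gaps are large. The two gaps can also diverge, which is one reason a single number rarely settles whether a model is fair.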
It is something that has to happen from day one, and so it's super exciting to see Microsoft firstly committing to this space, and secondly rolling out a variety of different tools and resources to help developers really prioritize this. One of our ultimate goals for Fairlearn is not to have it focus just on assessment and mitigation, so we have this assessment dashboard component and then also these mitigation algorithms, but to really expand it out into other resources as well, like domain-specific guides for when you might or might not want to use particular fairness metrics, and also for when you might want to use particular mitigation algorithms.

We also want to see this expand to other materials that help people think about other types of unfairness beyond just these allocation issues or quality-of-service issues. Here, for example, we're really interested in moving to things like stereotyping and denigration. How can we help our developers prioritize those? They're much more socio-technical kinds of unfairness than allocation or quality of service, which are still socio-technical, but you can often get more of a handle on them from that technical perspective. Ultimately, we want to be building out to really consider that broader space, and help developers navigate that landscape together.

It's also really exciting to see all this coming out of research. As a researcher myself, it's just super exciting to see things happen in the research world and then actually make it out to the tools and the resources that we're putting out there for developers.

I was going to say, it feels like we're hearing this more and more. In fact, I think three times today we've spoken to people from Microsoft Research, because they've so closely collaborated with Azure, with Microsoft 365, or whatever it is. So it is fascinating to see that this literally is cutting edge, and immediately: where can it be used?

Right, it's a lot more satisfying, I have to say, than just writing yet another academic paper. It's really nice to see this stuff actually out in the world, and people engaging with it and using it, and having it make a difference.

That's amazing. Don't play down the academic papers, I bet you've got a million of them. But cool, thank you so much, Hanna, for that initial insight into that space. I wanted to turn the conversation a little bit to Willem, my bestie from the Global AI Community. Willem leads the Global AI Community, for those who don't know: globalai.community. Go there if you want to learn about AI. This is the community to be a part of. It is so exciting, and we work super close with Willem and his team. But Willem, what I wanted to know was: you actually got a chance, just before Build, to have a look at these technologies that Mehrnoosh and Hanna are talking about. Can you tell us a little bit about what you found?

Yeah, so I got a chance to try out Fairlearn a few weeks before Build. Back then it didn't have a lot of documentation, so for me it was actually reading the papers that Hanna and others wrote.
For me, that's the kind of thing that I like to do. I like to try new stuff, and especially in the field of AI it's very important that you learn to read those papers, because they give valuable insights on how to use particular tools. With Fairlearn it's the same. So I started reading the paper, I started trying it out, and very quickly I found that one of my models was sort of unfair towards female customers. It wasn't immediately visible to me, because I was very careful: I tried to exclude features like gender, ethnicity, and age. All that stuff I tried to remove from my training set, because I knew it would be sensitive, that there would be something my model would pick up. But despite the fact that I removed them, my model still picked those features up, and that's kind of funny. Fairness is not something that you can pick up right off the bat. You don't know up front whether your model is going to be fair. You have to train it first, then measure its fairness using Fairlearn, and then go back and try to improve it. So it was kind of fun to do that.

That's amazing. And obviously there is some documentation out there. Did you use documentation to get started with that tech?

Yeah, so right now the website is up and running. There's a user guide that takes you through getting started, with a couple of notebooks that explain step by step what you have to do to measure fairness and how to improve it afterwards. They also include a very extensive description of the API. So once you work through those notebooks, you have a sense of where you're going, and afterwards the API description will help you truly understand how to select certain features, or what all the options mean. It's really great that they've been able to put up this website so quickly.

Fantastic.
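The experience described above, a model treating female customers unfairly even though gender was removed from the training set, typically comes from a proxy feature that correlates with the removed one. Below is a minimal sketch of that effect with invented toy data and a hypothetical rule-based model; the part-time feature is only an assumed example of a proxy, not something taken from the session.

```python
# Invented toy customers: (gender, works_part_time, income_k).
# In this toy data, part-time work is correlated with gender.
customers = [
    ("f", True, 45), ("f", True, 50), ("f", False, 55), ("f", True, 48),
    ("m", False, 52), ("m", False, 47), ("m", True, 50), ("m", False, 49),
]


def model(works_part_time, income_k):
    """Hypothetical model that never sees the gender column."""
    return int(income_k >= 45 and not works_part_time)


def selection_rate(gender):
    """Share of customers of one gender that the model approves."""
    preds = [model(pt, inc) for g, pt, inc in customers if g == gender]
    return sum(preds) / len(preds)


# Gender is used here only for measuring outcomes, never for predicting.
print("approval rate, female customers:", selection_rate("f"))
print("approval rate, male customers:", selection_rate("m"))
```

The model has no access to gender, yet the approval rates differ sharply between the two groups, because the part-time flag carries much of the same signal. This is why the sensitive feature still has to be kept available for measurement even when it is excluded from training.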
And obviously there are other members of the Global AI Community who've had a chance to cover other technologies. Is there somewhere we can send our amazing audience to go and read about them?

I don't have the link handy at the moment, but I'm pretty sure we can share the link afterwards. We've got a couple of blog posts online that explain all the responsible AI initiatives. We've looked at WhiteNoise, the differential privacy tool that Microsoft announced. We've looked at SEAL, which is a tool that allows you to train models without actually seeing the data. That's also pretty cool, and pretty complicated; I've learned that from Sami, who's been trying this out. And we've looked at stuff around explainers as well, specifically for NLP jobs: trying to figure out why my model is predicting a certain category for an email, or a document, or something like that. It's pretty awesome. And I'm pretty sure we can send a link afterwards.

That's fantastic. I was going to say, we do have comments to the right, so maybe we could pop there on mybuild.microsoft.com just afterwards and spam them with a few links. We have a really interesting poll going at the moment from our audience, and it is fascinating to see all of the different types of systems that people are building with machine learning. We've got pure prediction, so maybe more classical modelling. We've got quite a lot in the area of image, a little bit in speech, and personal assistants is an interesting one. I guess it's been big at Build with chatbots as well, and thinking about that low-code, no-code aspect. So I just wanted to share that with you very quickly. And then could we switch back to Mehrnoosh for a second?
I had a chance, because InterpretML is a GitHub repository, right, all of this stuff is open source and available on GitHub, I had a chance at the start of the year to dive in a little. It's quite complex, so brace yourself, everyone. But can you tell us a little bit more about it? Maybe I should have just come straight to you.

Yes. So basically, when you land on the InterpretML org on GitHub, you'll see a few repositories underneath it. The main one is interpret. Interpret holds a collection of what we call glass-box models, or in very simple words, interpretable models. They are linear models, decision trees, and of course one state-of-the-art machine learning model out of Microsoft Research called the Explainable Boosting Machine, or EBM. If you haven't checked it out, you should: it is very interpretable, yet accurate and efficient. So that's one collection of interpretability techniques for tabular data that we have there.

Also under interpret we have collected the state-of-the-art black-box explainers: interpretability techniques that can explain any black-box model. What they need is the input features and the predictions. They do not care what's happening inside the model. They use a bunch of approximations on top to infer how the model has, sort of, played the game on top of these features in order to come up with its predictions. So that's interpret, if I want to put it short: a collection of interpretable models and black-box interpretability techniques for models trained on tabular data.

Another repo that we have underneath the InterpretML organization on GitHub is interpret-text. We learned a lot about the need for expanding this repository to include text scenarios. Our very first attempt was with text classification, so exactly the scenario that Willem mentioned: classifying an email as spam or not spam, say. What words contributed to this prediction of spam? That text classification scenario was the initial scenario we started with, and that's where you find three different interpretability techniques, under interpret-text.

And then the other one is what we call DiCE, or Diverse Counterfactual Explanations. That is really, really helpful for understanding and debugging your model. What it says is: show me the similar data points with different outcomes. So say Mehrnoosh's loan has got rejected. Who is the closest person to Mehrnoosh who has got approval on his or her loan? It might be that, oh, Amy is the closest person to Mehrnoosh: same age, same marital status, same salary, everything the same, just her student loan is 10K less than Mehrnoosh's. That's why she has got approved. Using these counterfactual points you can really see whether your model's decisions make sense. Of course, if my counterfactual is someone with the exact same feature set as me, just gender flipped to male, then you know that there is a fairness issue that you need to go and investigate, with Fairlearn for instance. So it's like a red flag to you. These are the multiple repos. Now, in the near future, we also have interpretability for computer vision on our radar, and we really want to expand our repository to include better debugging and error analysis tools.

Nice, nice. I actually have, I think, quite an interesting question from the audience. I think, Mehrnoosh, you'll want to comment on this one first, but feel free to jump in, everyone. What is the likelihood of introducing unfairness through actually tweaking your model, trying to introduce fairness, and maybe ending up with more unfairness? I don't even know if that's a thing, but is that something that you're worried about? And will this tool help us to clarify whether we're doing that or not?

Yeah. So basically, with all of these techniques, your model can still behave unfairly, and it is very difficult with an interpretability toolkit alone to really understand that fairness aspect. But of course we have Fairlearn there to bring a lot more understanding: okay, I made this change to my model, now take the outcome of the new model to the Fairlearn assessment phase and try to understand how that impacted the predictions for different demographics, for different groups of people, say females versus males versus people of non-binary gender. None of these toolkits is like magic that uncovers everything for you, but the combination of interpretability and Fairlearn, with of course domain knowledge and context, can really help you at least investigate, and bring a lot more understanding around your model. I know that Hanna has a lot of thoughts there too.

Yeah, maybe I can just jump in there, just to say a couple of things. I think you more or less covered it, Mehrnoosh. I guess the only thing I want to flag is that when you're working with any machine learning model, it's of course entirely possible to make decisions that are well intentioned, and intended to reduce things like unfairness, but that might not actually help. I think Willem's example of removing features like gender, or race, or age from your dataset before training your model, with the idea being "okay, this will get rid of unfairness, then I'll be good to go," is kind of a perfect example. Even with those kinds of modifications, models can of course behave unfairly, in large part because there are still signals about these kinds of things in other features as well. That's why we thought it was so important to have something like the Fairlearn assessment dashboard. That way people really can say, okay, look, I think I've done something that's going to reduce unfairness, but did it actually reduce unfairness according to these metrics?

The other thing I want to flag is that there are many different fairness metrics that you can assess your model with in that Fairlearn dashboard. That's really important, because these metrics all capture different aspects of fairness and unfairness. Fairness is this fundamentally societal concept, and there's a lot of disagreement about what it might mean for a model to behave unfairly. Does it mean, for example, that the error rates should be exactly the same for different groups? Or perhaps does it mean something else?
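Going back to the counterfactual idea from earlier in this exchange (find the closest person whose loan outcome differs, then look at what is different about them), the lookup can be sketched in a few lines. This is a simplified stand-in that searches an existing dataset for the nearest differently-classified record; it is not DiCE's API or algorithm, and the model, features, scales, and people are all invented for illustration.

```python
def distance(a, b, features, scales):
    """Scaled L1 distance over the listed numeric features."""
    return sum(abs(a[f] - b[f]) / scales[f] for f in features)


def nearest_counterfactual(query, people, predict, features, scales):
    """Closest record in `people` whose predicted outcome differs from `query`'s."""
    target = predict(query)
    candidates = [p for p in people if predict(p) != target]
    if not candidates:
        return None
    return min(candidates, key=lambda p: distance(query, p, features, scales))


def changed_features(a, b, features):
    """Features on which the two records differ, as (a_value, b_value)."""
    return {f: (a[f], b[f]) for f in features if a[f] != b[f]}


def predict(person):
    """Hypothetical loan model, invented for illustration: 1 = approved."""
    return int(person["salary_k"] - person["student_loan_k"] >= 50)


FEATURES = ["age", "salary_k", "student_loan_k"]
SCALES = {"age": 10, "salary_k": 20, "student_loan_k": 10}  # assumed feature scales

applicant = {"name": "applicant", "age": 30, "salary_k": 80, "student_loan_k": 35}
others = [
    {"name": "p1", "age": 30, "salary_k": 80, "student_loan_k": 25},
    {"name": "p2", "age": 45, "salary_k": 120, "student_loan_k": 0},
    {"name": "p3", "age": 29, "salary_k": 60, "student_loan_k": 30},
]

match = nearest_counterfactual(applicant, others, predict, FEATURES, SCALES)
print("closest person with a different outcome:", match["name"])
print("what differs:", changed_features(applicant, match, FEATURES))
```

Here the closest approved person differs only in having a student loan that is 10K lower, which is a plausible reason for the different outcome. If the only differing feature were a sensitive attribute such as gender, that would be the red flag described in the discussion.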
One of the things about Fairlearn is that we do have all of these different fairness metrics in there, so that people can explore different ones and really ask: how is my model behaving as a whole?

I love that. I love that breakdown of interpretation and then Fairlearn: here are the different ways that we can consider whether we are building fair models, whether there is something I need to know, something I need to do. So I guess a call to action here is just, you know, go read those docs and get immersed in what's going on with those dashboards and those interpretability kits. For anyone building these models, I guess it's innate in data science, it's a skill that data scientists have, digging in and interpreting what's going on in the data. I sometimes make the joke that I'm like the Sherlock Holmes of a data problem, trying to sift it down, figure it out, infer things, all of that kind of stuff. So I love that you've lined it up nicely, splitting up the two different things. And then, Willem, can I go to you for a minute? You've had a chance to look at the tech. Where do you see this being used in your day-to-day job? Is this something where you're like, "God, I've got to get on it straight away"?

Yeah, that's a good question. Right now customers are getting started with MLOps, so they're starting to use Azure Machine Learning service and similar tools to make sure that they have a reproducible model. That's one of the other problems that we still have in the field: we build a model, and then we change the data, and then we can't reproduce the results that we had before. So that's what's happening right now. What I see happening in the near future is that people start to wonder: why should I trust my AI model? Azure Machine Learning service in combination with InterpretML is going to help people understand: why am I getting this particular prediction? And might there be a feature that's extra heavy in this prediction, which might indicate unfairness? Then you can start to use tools like Azure Notebooks in combination with the Fairlearn dashboard to discover what the unfairness in my model is. Right now Fairlearn is mostly a debugging tool for me, as a data scientist and a machine learning engineer, to take a look at what's actually going on in my model, and why it is behaving in this weird way. So I think the combination of MLOps on the one hand and these tools on the other hand will only make things better for the near future.

Fantastic.
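One simple way to check whether a single feature is "extra heavy" in a model's predictions, using only inputs and predictions as black-box explainers do, is permutation importance: scramble one feature's column and see how much accuracy drops. This is a named swap-in technique chosen for illustration, not necessarily what InterpretML uses, and the model, feature names, and rows below are invented.

```python
def accuracy(predict, rows, labels):
    """Share of rows where the model's prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, feature):
    """Accuracy drop when one feature's column is permuted across rows.

    The model is treated as a black box: only inputs and predictions are used.
    The column is rotated by one position so this sketch stays deterministic;
    a fuller version would average over several random shuffles.
    """
    baseline = accuracy(predict, rows, labels)
    column = [r[feature] for r in rows]
    rotated = column[1:] + column[:1]
    permuted = [dict(r, **{feature: value}) for r, value in zip(rows, rotated)]
    return baseline - accuracy(predict, permuted, labels)


def black_box(row):
    """Hypothetical model that leans entirely on `part_time`."""
    return int(not row["part_time"])


rows = [
    {"part_time": True, "income_k": 45},
    {"part_time": False, "income_k": 52},
    {"part_time": True, "income_k": 50},
    {"part_time": False, "income_k": 47},
]
labels = [black_box(r) for r in rows]  # toy labels the model fits perfectly

for feature in ("part_time", "income_k"):
    drop = permutation_importance(black_box, rows, labels, feature)
    print(feature, "importance:", drop)
```

A feature whose permutation wipes out the model's accuracy, while the others barely matter, is worth a closer look, especially if it could be standing in for a sensitive attribute.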
And if you can just put your other hat on for a second there, Willem, your community hat: is this going to be top of mind for some of your next Global AI Community virtual tours, all of the amazing stuff that you do?

Yeah, so we've got a virtual tour planned on June 9. During that tour we're going to explain more about Fairlearn and about all the other tools that we've explored with the community. I've got a number of other things happening soon, mostly online of course, because of the corona pandemic that's happening around us. But sure, a lot of people are interested in these new tools. I've got a lot of questions from people: what does it mean to be fair? What do you think about ethics in AI? So there's sure a lot of stuff happening in the community.

Fantastic. We have about a minute left on this panel, so if I can do a quick-fire round, 10 seconds each: what's the key takeaway that you want people to take away for responsible machine learning? Let's take the first one.

I just want to really promote that: let's move away from treating machine learning models as black boxes. InterpretML and Fairlearn both started with the goal of bringing a community around these toolkits and co-developing them together. So definitely check us out, send us GitHub issues, sign up for contributing, and we'll be super happy to grow this community.

Fantastic. Hanna, 10 seconds, can you do it?

Right. Yeah, I think the main thing I want people to take away from this is that this is a complicated space that engages with broader societal issues, and as a result software tools are not going to be a single one-size-fits-all solution. You will need to engage with these issues and learn more about them. But we're starting to see a lot more resources out there to help developers really navigate that space and really understand it.

Fantastic.
I'm so sorry, we have a hard stop on this session. Thank you, everyone.
For more on how we can engage in responsible machine learning, check out aka.ms/msbuild-responsibleml. Next up, a conversation sure to inspire the technologist in all of us. It's our third digital breakout of the day, this time with Scott Hanselman and Luanne Murphy, who are inspiring next-gen coders with MakeCode. Go take a look, a little bit of sleep, and