James Mickens on why all data science is political

And for all the people listening, I want to look right in the camera. Where is it? Oh, I'm stuck in Zoom world. It's right here. I want to make this very, very clear: all data science is political. [Music]

Yes, I will do that. You ready, everybody? Yeah. We'll see if I can do this in one take. Sometimes I can, sometimes I can't. Just depends on the day. Okay.

Today we are joined by James Mickens. James is the Gordon McKay Professor of Computer Science at Harvard's John A. Paulson School of Engineering and Applied Sciences, as well as a director at the Berkman Klein Center for Internet and Society at Harvard University. Welcome, James. We are very excited to talk to you today.

Thank you. Thank you for that introduction.

Good to see you again, James. Glad you could join us. So let's get started. I want to talk first about people and technology, and specifically the people who create technology. The people who create technology are in a unique position to know when bias might be introduced. Do you have a sense of what proportion of those people are actually aware that they have this big responsibility?

I can't give you a specific number. I know that the number is lower than we would hope. I think that a lot of people come through their technical education, whether it be formal, through university, or informal, through self-teaching and watching courses on the internet, and a lot of technologists, and in fact a lot of tech-centered entrepreneurs, come through that educational process thinking implicitly that technology is value neutral: that somehow we just create these products, and sure, they can be used for good or bad, but ultimately that's not for the technologist or the business person to decide. There's some sort of hope that if it gets too bad, maybe the government will intervene, or social forces will take over, there'll be boycotts, but you
know, it's not our problem. But that's wrong. That's the wrong way to think about it. And so I think that the fraction of people who actively devote thought to this, it's getting bigger, so that's the good news. I think that particularly the younger generation of entrepreneurs and tech folks are starting to think about these things more explicitly. But there's still a large swath of the tech community that would prefer to not get bogged down in these quote-unquote non-technical details.

It's an interesting observation, and maybe it says something about our community. We reached out on Twitter with a poll before interviewing you and asked: if a developer sees something that really should be dealt with, or there's a heavy bias in a product, should they always say something, should they only say something if they think it's a really big deal, or is it not their problem? And it was pretty universal that they should always say something. So maybe that says something about our community, or maybe the trend is headed in the right direction.

Yeah, I mean, or maybe we can't necessarily trust the polls. It's the same sense in which we would all say we would help the grandmother try to cross the street. We would certainly do that. But of course when we actually get to the intersection, you see everyone looking around saying, "Who else is going to help this person across the street? I've got somewhere to go." And so I think that you're exactly right that in the abstract, when you ask people these questions, they say, "Oh, of course, we should think about the well-being of other people when doing this, that, and the other." But once you get the pressure of deadlines, the product has to ship next week, once you get the pressure of shareholders, they want to make sure
that you're competitive, where competitive is oftentimes defined as how many features do you have, how fast is your software, so on and so forth. When you start looking at these more complex real-world situations, I think it's easier for people to lose some of their moral centering, if you will.

I wanted to follow up on that theme, around the way that we bifurcate or silo, right? There's technical issues and questions and problems, and then, say, call them social ones. And it goes both ways; we just don't traverse that boundary a lot. So it seems to me what you're saying about education is really critical. If we educate people differently, so that these ethical considerations are embedded in how they think from the get-go, that seems ideal. But of course we're at a place today where there's plenty of people in positions of power making decisions who didn't have that education. And you had this great quote in a talk that I was watching in preparation for this interview. I think you said something like, ethical considerations become less important to us when considering them could hurt our revenue, or something else; you mentioned deadlines earlier. We just have these different incentives. So I would be curious to hear you talk about where we are today. How do we meaningfully integrate these ethical considerations into our decision making, and how do we even communicate about them when there's so much variation in what people know on both the social and the technical sides?

Those are great questions. I think the first step is always realizing that you have a problem. In other words, the first step is always stepping back and saying, look, I can't just silo myself in my narrow domain of expertise. The things that I build, the company,
the technology, and the people that I train, they interact with the larger society. So I think just getting that high-level idea in people's heads, that's the first fight. Then after you've won that battle, I think the next struggle is to convince people that many of these ethical challenges don't have a very simple yes-or-no answer. And of course, as engineers, that is what we want to hear. As engineers we want to say, "Look, I get it. Ethics, certainly. I want to make sure that I don't go to jail and I get to heaven. So can you just give me a checklist? Then whenever I have to make some type of ethical decision, I just consult ye olde checklist, and I go, yep, okay, boom, we're done." Unsurprisingly, perhaps, this is not the way that these situations actually resolve themselves in the real world. You actually have to think about these things, and you have to make difficult decisions, whereby the decision you end up making may still have some bad effects. It may be the best of a series of difficult decisions. So that's where I think it's helpful to have people, either on staff or people that you can talk to, who are classically trained in thinking about these difficult issues. Many hospitals, for example, have an ethics board, and it's not because the doctors aren't aware of these issues. It's that the doctors were first and foremost trained to heal people. They weren't trained to think about philosophers and caves, whatever it was that was obsessing the Greeks back in the day. So I think that increasingly, in at least some of the bigger companies, you're starting to see roles inside the company where the job is to think about some of these issues, in the
same way that you have a lawyer to think about legal compliance. You might have some philosophers, some ethicists, some sociologists on staff to think about these issues: what is the right and wrong thing to do here? Who are the stakeholders? Are we ensuring that we're providing equity along multiple dimensions: gender, race, disability status, things like that? So I think that's ultimately where we want to go. The endgame is that even if you have a company that seemingly is only focused on one thing, making trades go fast or providing a social network or things like that, they still have the capacity to think more holistically about how those products are integrating with the rest of society.

There's a lot of talk about bias and data, but there are a lot of steps in data. There's the gathering, the collection, the storage, the parsing of it. Then there's the analyzing of it. Then on the back end there are systems of AI and ML interpreting it and making decisions or projecting the future. I wonder, are there different problems at each stage, and are some of the problems bigger than others? Do you see a biggest problem, or has everyone just got to be aware all along?

Yeah, it's wall-to-wall problems, chock-full of problems. There are many. Christmas has come early, and your gifts are problems. That's the way that I would look at it. I think that human nature is to look at problems and try to be as reductionist as possible, and to say, "I've got this complicated process, ah, but here's the problem." We want to point fingers at something and say, if we fix that, then we're done. I think that particularly when you look at big data pipelines, machine learning, things like that, because these pipelines are oftentimes so deep, because the data sets are
so complex and so multi-dimensional, it's oftentimes hard to say, yeah, if we just fix this one thing, that'll solve 90 percent of our quote-unquote ethics problems or our quote-unquote diversity problems or whatnot. It's typically a more holistic type of reasoning that you have to apply. And I think this is somewhat related to our conversation from a while ago about there being no checklist, in as much as, if you do this, this, and this, then you're scot-free. Instead, really, it's a lifestyle. At every step of design, and then implementation, and then testing, you have to be thinking about some of these questions. And one pushback that you'll sometimes get, particularly from engineers or from quants or people like that who view themselves as hard technical people, is they'll say, "This isn't what you hired me for. You didn't hire me to watch these videos about the value of diversity. I believe it, I believe it, let everyone in, but I don't want to talk about this." And the problem is that there's a lot of research which shows that that type of attitude, I understand bias but I just don't want to deal with it in my day-to-day professional life, does not lead to unbiased outcomes. It's a thing that you always have to think about, in addition to the technical side of things or the business side of things. So, to get back to your original question: you've got some big data pipeline that involves a lot of different players, that involves a lot of different systems. At every step you have to think about who are the stakeholders. Who are the people you're trying to help? What are their interests? How do you make sure those interests are being protected? Who are the set of people that you don't care about,
right? Who are the set of people for whom you're not actually targeting their concerns? Being explicit about that stuff is very important, because when you don't think about it explicitly, you end up getting tech that fails in ways that end up causing a lot of harm, even though, let's say, the developers and the business folks don't have any explicit malice in their heart. A great example is facial recognition for cameras built into laptops. A lot of the early cameras that came out couldn't track the faces of people of a certain skin tone. And I have no doubt that the people who designed a lot of these early systems weren't explicitly trying to do that, but they didn't ask these questions about, for example, where's our training data coming from? So they get training data that itself is biased. It then results in a biased facial recognition algorithm. They didn't know that, though, and so they just shipped the product, saying, "We've hit all of our internal metrics for accuracy." It wasn't until people started posting things on YouTube, where you'd have someone come into frame and then the camera would just freak out and start emitting smoke, it took that to make them understand: we really need to rethink our process from the ground up. Not just the engineering and the algorithms once you have the data, but where is that data coming from in the first place?

We're talking a lot about engineers thinking differently. What's the space or need for people who are non-engineers, and what's their role? And interestingly enough, what would they have to be doing differently? Do we just take ethicists and put them in a room with engineers, or do they need to learn something first? What's your view on that?

Exactly, yeah. My number one recommendation: you take the ethicists, chain them to a radiator, bring
in the other set of people, chain them to a radiator, see who makes it out. My bet's on the ethicists. They're kind, but they're cunning. I would say that if you look at any modern enterprise, there's a very good chance that the set of job titles that you have at that enterprise is pretty varied. I'm just making up this example, but even in a tech company, they have a huge number of lawyers, they have a huge number of business people and economists. These are very diverse companies. And so what that means is that even if you think, oh, I work at a widget company, the majority of people who work at this company are directly making widgets, that's oftentimes false. They're in these support roles to support the widget-forward workers. You can tell I'm not a business person; that's not business talk. Anyways, I think that what ends up happening a lot of times is that there are these decisions about products and services that have to be made that don't just involve the people on the ground, the people who are actually making those services or products. The decisions oftentimes bubble up to other parts of the company, which are not those front-facing people: the lawyers, the HR people, things like that. And so I think that when you talk about things like diversity training, when you talk about things like ethics training, it's not just a teaching that you want to do to the people who are front line making the widgets. It's all the way up and down the stack. And to be honest, I think a lot of the training also has to be directed to shareholders as well, because I think another key tension that oftentimes arises is that people, and by people I mean shareholders, say things like: well, yeah, all these things that you're doing that are not directly profit focused, great,
we should definitely do that, but also don't hurt profit. They want to have this ambivalence towards these things, and that ambivalence is problematic.

So you do a lot of research on cybersecurity, and it seems like that's an area where we're just beginning to grapple with the implications of that topic around diversity, inclusion, justice, fairness. I wonder if you could talk a bit about the connections between cybersecurity and equity.

Sure. So I think that with all of these issues that we look at in technology, and I think increasingly in business too, just defining these terms is becoming messier and messier. It's becoming more and more difficult. So going back to this example of facial recognition: imagine that you have cities, I mean, we don't have to imagine this, there are cities who have cameras deployed throughout the streets, and those cameras are used, among other things, to help prevent and then later on unravel what type of criminal activity happened. Well, if the data that was used to train those cameras to identify faces was biased, that puts certain communities at greater risk. And so there's an interesting cybersecurity angle there too, because if someone were to break into those systems and, let's say, change the way that they identified criminals versus not criminals, that risk would fall disproportionately on certain segments of society. To put a finer point on it: depending on what zip code you live in, you are more or less likely to have, let's say, police cameras in that zip code. And as a result, the impact of a camera system deployed by the city being hacked into will fall disproportionately on people from different zip codes. So I think that this notion of cybersecurity is evolving. As I said, it used to just be
can people break into my stuff. It's become more encompassing as technology has become more pervasive. So now, for example, cybersecurity includes things like, can people break into my power grid? By my, I mean a county, a state, a nation. Cybersecurity includes things like, can someone tamper with our elections? And once again, there are opportunities for disproportionate impact in terms of the way that, let's say, foreign nation-states might try to tamper with the votes that have been registered by certain members of certain communities. So once again, I think this all harkens back to this idea that when we talk about these issues of tech or business or cybersecurity or bias or diversity or ethics, you really have to take this increasingly broader perspective on things, because so many aspects of society are intertangled now with so many other aspects of society.

And it brings up a point I always wonder about: integrating these data sets. You mentioned certain zip codes have more data, or more gathered information, on the residents in those areas. And out there, there are companies, for instance financial institutions, that want to evaluate people for loans, and they're aggregating data from many, many sources. So inherently there would seem to be more information about people and potential crime, or crime, in certain zip codes, and if the financial institutions just take that data at face value, they may make interpretations. Because that's another thing someone has to decide: how do I aggregate all this data together and decide a profile of who you are? And then there are implications further beyond just the criminal justice system. And this aggregation of data, I mean, how many people are looking at that, and what are they seeing?

It's a real problem, what you just described. And for all the people listening, I want to look right in the camera. Where is it? Oh, I'm stuck in Zoom
world. It's right here. I want to make this very, very clear: all data science is political. Okay? It's impossible to take a data set and analyze it in a quote-unquote perfectly objective way, because you're always going to be putting on there some type of value judgment about what the data set represents, whether that data set covers all the attributes that you care about, and what you are trying to do with the data set. And I think that, once again, it's very easy for technologists and entrepreneurs to say, "Let the machines handle it, because it's just zeros and ones." But that's not at all the way that this system works. So when you look at things like predicting whether someone's going to default on a loan, for example, maybe to give them a mortgage or something like that, let's say I give you some data set which looks at the historical rate of loan defaults for "a bunch of different people." I put that in scare quotes intentionally. First of all, which communities are you looking at? Were they already the target of, let's say, previous predatory lending, which put them in a poor position to pay for new loans? Things like that. Thinking about those questions, and whether those questions are important, that is a political process. And by political, I don't mean political in the sense of, you're a capital-R Republican or a capital-D Democrat. I mean political in the sense that you are making a statement about what you would prefer the world to look like, given that you're going to analyze data in a certain way. That's what I mean by political. And so I think it's so important for people to understand that, because so often you hear this attitude of, yeah, we have these humans making these decisions, and of course humans are biased, but once we feed it into this machine learning thing, once we feed it into this algorithm, all of our bias problems go away. And that's just completely false.
And what you end up seeing time and time again is that if you don't ask these political questions, if you're not honest about that kind of stuff, then you see the old biases that you were supposedly trying to get rid of being replicated in these new systems that you create, except now you have this facade of, oh, it's just zeros and ones. So when I hear about things like, oh, we're going to use algorithms to determine the first pass of CV screening, so you submit an application and then an algorithm basically says, here's the first cut, that concerns me. Because there are many studies that show, for example, that if you have two resumes that are exactly the same and you just change the name, all kinds of bad things happen based on whether you change the name to a woman's or a person-of-color-sounding name, so on and so forth. And so if you say, oh, the goal of our algorithm is to be just as accurate as our old system, but your old system wasn't accurate, you know? So I think it's really important for everyone listening: if you're at a company and your CTO or some data scientist says, "Don't worry, we've got an algorithm on the case, we're not going to have any bias problems," fire that person. Arguably make a citizen's arrest. Look up, be knowledgeable of, the statutes. Try to just do something to them till the authorities can show up, because that's just a terrible way of thinking about things.

So this is a question that I was thinking about while watching some presentations you've given and also reading how you talk about teaching. Because you're an educator, you're not just a researcher, and you're clearly a communicator, and you're somebody who cares a lot about trying to foster these conversations so we can address some of the things that you're calling out and identifying as, I think you called it,
wall-to-wall problems, which is a great turn of phrase that I will be using in my own life. And I think anybody who knows much about you knows that you really lean into humor, and also storytelling and narrative, in these public talks, and I imagine maybe in the classroom too. So I would just love to hear you talk about why you do that, why you think that's useful, because I imagine that it's a deliberate choice that you're making.

Yeah, I mean, heavy is the crown. Sometimes I wake up and just have too many jokes in my mind, and it's difficult to find a way to share that gift with humanity. And yet I try. So I think that one reason I try to incorporate storytelling and humor into my public speaking is that you hear from politicians and leaders all the time: we need more people in tech, think more about tech, tech is a great thing to get into, science and math and engineering and studying, blah blah. And yet we don't have as many popularizers of STEM subjects as one might expect, given all these exhortations to go into that field. And I think that it's very easy for lay people to get this impression that, oh, STEM stuff is very stodgy, and I'm just going to be locked away in a lab all day, and it's not fun. But I think it is in fact fun. I think it is in fact interesting. And furthermore, it is in fact important. Many of the issues we've talked about in this conversation are issues that are extremely important to large swaths of society. And I think that because of this latter issue in particular, that there are so many important issues which need to be talked about but which can be uncomfortable to talk about, that's the reason why I think humor can be very useful. Because I know from personal experience teaching engineers, and being an engineer myself, sometimes there's this reaction when someone comes to you and they're like, hey, have you
thought about this thorny ethical dilemma? You're just like, "Get off my lawn. I don't want to hear about this kind of stuff. You couldn't come into my chair and write as many lines of bug-free code as I could, so just get out of here." And that misses the point. It misses the point about what it means to be a member of society, where you have to care about things beyond just your narrow worldview. But sometimes you have to lead people to that river so they can drink like a horse, or whatever that saying is. And so what I find is that if you bring up some of these issues using the delivery mechanism of humor, people are more likely to be less defensive when you start talking about the more difficult things. Because no one likes to be told, for example, that they're biased. And I mean, anyone out there who's listening, take an implicit bias test. You will find out that you are basically one bad day away from living in the 1500s. I mean, it is rough. It doesn't matter how open-minded you think you are. You'll take the test and you'll be like, "I've probably got a 95 out of 100."
You'll get, like, a negative 18 out of 100. I guarantee that. I've never seen a score higher than negative five. And so I think that's tough. It's tough for people to hear that message, and so that's why I think it's helpful to use comedy to soften some of those blows, and to tell personal stories. I grew up in the South, and I had a certain set of experiences there, some of which were troubling in a certain sense. But it's useful to share some of those stories because ultimately we're all people. I don't want to tear up on camera, but we're all people. There are a set of these universal experiences that we all have, and I think that people realize that quicker through laughing. Because when you laugh together... what is comedy, right? It's very interesting. Not to get too philosophical here, but when you tell a joke, you've built a worldview, and you're asking people to join you in that worldview, to join you in this little universe that you've created, and then find the same things that you find funny. You want them to find that funny themselves. And in a certain sense, that's what any teaching is. That's what any advocacy is. You're saying, I've built this universe, this way of thinking about things, and I'm inviting you to come live in that universe. And so that's why I think it's so important that we get these messages out there and we deliver them in a way that is both honest but also caring, that makes it clear that we all make mistakes, but if you're trying to constantly learn and you're trying to constantly think about these issues explicitly, then that's the best that we can do, and that's the most you could ever ask of someone.

I think it's working, by the way. Just feedback for you.

I agree. No, that's great. The only thing I wanted
to say is that what you're saying about humor and storytelling as a way to bring people into a place where they feel like they're part of a community, like they're doing something together, and there's a shared experience and a shared sense, I think it really concurs with research that's been done on bias training, which has found that if you just educate people about bias and that's it, just explain implicit bias to them, and they get it, and they know it's real, and they know everyone has these implicit biases, it actually makes people ultimately behave in more biased ways. And the only way to avoid that is to frame it around: we're all trying to overcome these biases. And so I think that's what you're doing in a different way, creating that condition around we're all trying to learn together and grow and overcome this, rather than just take the test and, oh my gosh, yes, it's the Middle Ages and everything is terrible. So I just wanted to say I think there's research out there that really supports that approach as well. And anecdotally, I agree with Dave.

Yeah, for the viewers out there, I saved Dave from a life of crime many times, and now he's all cleaned up. It's just been really, really heartwarming to see. So don't give up hope out there, kids.

Yes, it's a heartwarming story. We'll tell that story in another episode. It'll be like a prequel to this interview. So before we close, is there anything that we haven't asked you, or anything that you haven't had a chance to speak to, that is important and that you want to leave people with? And if not, any resources or other places you want to point people to go to learn more, who are interested in these topics?

I
think that, and this is building upon something that I mentioned a bit tangentially earlier, one of the most important things anyone can do, regardless of what their job is or what profession they're in, is talk to people. Because I feel that a lot of frictions or issues or problems arise because people just haven't been exposed to certain ideas or certain perspectives. So one thing I'd really recommend is that people try to talk to their co-workers, talk to their neighbors, talk to their friends. Just listen and see what kind of issues are top of mind for those people. Because if we look specifically at this problem of tech that's gone awry, tech that didn't serve the population the way that we thought it would, many of the issues that arise were foreseeable. They could have been dealt with early on if only we had talked to and valued the opinions of other people, who in many cases are not far away. It's not like you need to put in a telegram to someone living at the center of the earth. You just need to go talk to the person in your next cubicle, or just down the street. So that's one thing I'd really encourage people to do. I'd also say that if you are interested in some more formal training for these types of things, and by these types of things I mean things like ethical reasoning and diversity training, if you're currently a student, or anyone who is at a university, oftentimes universities offer resources to help with these types of things. If you're particularly interested in issues at the intersection of computer science and ethics, you can check out Harvard's website for Embedded Ethics. That's publicly available. You can see some of the readings that
we have there that technologists can look at but yeah really my high level piece of advice is just talk to people try to you know think about these issues involving business and tech and ethics holistically and you know i think you'll see better outcomes great i think that's perfect now 10 yes yes yes so um you know back when we talked about cyber security colleen i was gonna say you know this conversation can go in two directions because um james has mentioned bias in very specific places that aren't necessarily just tied to um you know all the things we were talking about and then he mentioned presidents um so i was thinking the conversation could have gone either way but i'm glad the way we went just you know just for everyone's blood pressure just for everyone's yeah hey look man i mean if you've got sizzle reel i can take it to msnbc i'm looking for some side money so we're happy to we're happy to help with that that's a wrap on the interview but the conversation continues and we want to hear from you send your comments questions and ideas to justdigital.hbs.edu you've been watching pathways to a just digital future an investigative project that aims to better understand and address inequality in tech this program was produced by the harvard business school digital and gender initiatives our team includes ethiopia almighty and my cat tanya flint one more time liz sarley thomas i'm dave holma and i'm colleen ammerman thanks for hanging out with us keep exploring at justdigital.hbs.edu

2021-01-18 06:13
