Faculty Research Lecture 2019 - Lise Getoor on Responsible Data Science

[Video plays]

Good evening. My name is Kimberly Lau. I'm the current chair of the Academic Senate, and I'm so pleased to welcome all of you to the fifty-third Faculty Research Lecture at UCSC. It's wonderful to see so many of you here tonight. The Faculty Research Lecture is an institution supported by the Academic Senates on all ten campuses of the University of California. Its purpose is to highlight a faculty member's distinguished research record and to provide the campus and the local community with an opportunity to learn more about the honoree's work. The first Faculty Research Lecture was delivered at UC Berkeley in 1918, and UC Santa Cruz initiated its tradition with Maurice Natanson's 1967 lecture. Each year, a committee of Academic Senate faculty from across all five academic divisions nominates a professor to present this prestigious lecture based on the overall excellence of their work. This year, the nomination of Professor Lise Getoor was approved by enthusiastic acclamation at a Senate meeting on May 16th, 2018. Dr. Getoor's research spans several fields in data science, including machine learning and reasoning under uncertainty, data management, visual analytics, and social network analysis. As you can see from this brief overview, her work is extremely timely, and as the director of the UCSC D3 — Data, Discovery, and Decisions — data science center, she is also particularly concerned with the responsible use of big data. I'll leave the details of her work to Dean Alex Wolf, who will present them in his introduction, and of course to Dr. Getoor herself. Before then, however, it's my great pleasure to introduce our Chancellor, George Blumenthal.

Good evening, everyone, and thank you all for coming. This event, which is a peer-chosen honor and the highest academic honor we bestow at UCSC, is always a real highlight of the year for me, and it is so because it spotlights research, which is so central to our mission at UC Santa Cruz. At UCSC, research goes hand in hand with excellent teaching, and we are built on a foundation of academic research. Over the past half century — over the time that I've watched UC Santa Cruz grow and evolve — it has gone from an innovative and pioneering experiment in higher education to a nationally ranked research university, and I'm incredibly proud of that. It's the work of our faculty, people like Lise Getoor, tonight's lecturer, that has gotten us to this place. It really feels amazing that this is our 53rd annual Faculty Research Lecture, and, just for the record, while I've been to a lot of them, I haven't been to all of them. It's important to note that research is also what sets the University of California apart from other systems of higher education. It's an engine of discovery that has fueled the Golden State for decades. Importantly, research also satisfies that uniquely human quest for exploration, for knowledge, and for insight. UC Santa Cruz is home to artists and scientists, philosophers and engineers, astronomers and poets, sociologists and economists. Scholars across all of these disciplines advance our understanding of the world around us.

Their contributions enrich the educational experience that we offer to our students. UC Santa Cruz students get extraordinary opportunities to work side by side with leading scholars, big thinkers who are advancing their disciplines and really transforming their fields. Our professors keep asking important questions, keep writing, and keep sharing their knowledge. Tonight we get to hear from one of our very best, Lise Getoor, the 2019 Faculty Research Lecturer. I had the immense pleasure of getting to know Lise on an international trip we made a few years ago to South Korea, where I remember that she wowed executives at several Korean companies we visited, just as I'm sure she will wow us tonight with her presentation. And here to formally introduce Lise is Baskin School of Engineering Dean Alex Wolf. Alex?

Thank you, George, and thank you all for coming tonight. I'd like to offer a special thank you to my fellow academic senators for recognizing tonight's speaker. As dean of the Baskin School of Engineering, one of the many privileges I get is to become acquainted with our remarkable faculty and their very impressive work, and then I get to trumpet how wonderful they are to as many people as I can. So tonight I could not be more honored to introduce Professor Lise Getoor, a scholar who embodies the definition of world-class faculty. First, though, I'd like to offer a bit of biographical background on Lise. Lise is a genuine product of the West Coast: she was born in Seattle and grew up in San Diego. Lise is also a child of the UC system; her father was a well-known mathematician on the faculty of UC San Diego. She earned her bachelor's degree in computer science at UC Santa Barbara and a master's at UC Berkeley. And then something went wrong: somehow she earned a PhD in computer science at Stanford. PhD in hand, Lise joined the computer science faculty at the University of Maryland, College Park, and distinguished herself with her research in, as Kim said, machine learning, reasoning under uncertainty, data management, visual analytics, and social network analysis. In 2013 she returned to her UC roots and joined us here at UC Santa Cruz in the Department of Computer Science. When you hear just some of the ways in which Lise has distinguished herself, you'll understand why we're so lucky to have recruited her and why we're so proud to have her here with us today. Needless to say, she has been an incredibly successful and highly cited researcher, and a highly sought-after speaker and lecturer. Lise is a fellow of the Association for the Advancement of Artificial Intelligence (AAAI).

Lise is also the PI for one of only a handful of National Science Foundation Transdisciplinary Research in Principles of Data Science, or TRIPODS, Phase I grants. This grant brings together UC Santa Cruz faculty to develop a unified theory of data science applied to uncertain and heterogeneous graph and network data. She and her colleagues are now hard at work on a Phase II proposal, which hopefully will move the work into a larger, multi-institutional research setting. Lise led the establishment of the UC Santa Cruz Data, Discovery, and Decisions data science research center, D3, and serves as its founding director. D3, among its research activities, works with companies to provide opportunities for collaborations between students, faculty, and industry. But beyond her scholarly work, Lise has been a highly visible national and international leader and advocate. Lise served as chair of the Computing Research Association's subcommittee on data science and was lead author of a highly influential CRA study on computer science and the emerging field of data science. Lise serves on the National Academy of Sciences roundtable on data science education, where the next roundtable discussion is focused on collaborations between industry and academia on data science. Lise is also a quiet yet exceptionally strong supporter and advocate for women in the data science and machine learning communities, and is committed more broadly to mentoring students, junior faculty, and women in STEM. To say the least, Lise brings great distinction to the Baskin School of Engineering and to UC Santa Cruz, and I'm delighted to see her accomplishments recognized so publicly tonight. So please join me in welcoming the presenter of the 2019 Faculty Research Lecture, Professor Lise Getoor.

All right. Well, thank you, George, Alex, and Kim, for a really nice introduction, and thanks, everybody, for helping to make this happen. And thank you all for coming out in the rain. I'm really happy to see all of you here, and I'm going to be talking about a topic that I think is really important — responsible data science — and I hope that by the end of this lecture you will agree with me. Now, as far as my goals in this talk, I have several. The first one, like any good academic — and I am a professor — is that I want to educate you a little bit. I also want to excite you about some of the opportunities in data science. I also want to caution you.

And at the end, I want to leave you with some tools — tools that ideally help you separate reality from hype, and vice versa, but also enough background that you can engage in what I think is going to be a really important, emerging conversation about what we do as we go forward with data science and more.

So I'm going to start at the very beginning: what is data science? Is it this kind of emerging discipline, or is it just, you know, a giant fad that's going to go away? I really like this quote from Cathryn Carson. She is a historian and ethnographer of contemporary science, and what she has to say is: "Data science is a shimmering concept. No one agrees exactly what it is, but it gets at changes underway that are serious and real, both in academic disciplines and out there in the world. Something is emerging in the space where masses of pervasive data, its computational handling, and its analytic manipulation come together to underwrite inferential conclusions and actions in a datafied world." She goes on to say: "Whether we're looking at foundational approaches, domain-area uses, or contextual implications and entanglements, the intellectual terrain is just starting to get mapped... However each goes forward, there's something irreversible happening. Whatever else happens to the techniques, the questions, or even the disciplines, the data are not going away." So, as Alex mentioned, I've been involved in a number of these conversations around data science, nationally and internationally, and it is true that something is emerging. Just to give you one example: at UC Berkeley they have a course, their Data 8 course, and it's an interesting course. It's designed for freshman and sophomore students, who don't have to have any computation or statistics background. The most recent edition, in the fall of this past year, had 1,300 students, with 300 more on the waitlist. And the notable thing about it — well, we do this in a lot of our courses too — is that it's grounded in applications, but a lot of the applications are about the social world. Their goal is that eventually all incoming students will take this class and will then be literate in data science, computational thinking, statistical thinking, and ethics. Here at Santa Cruz, Abel Rodriguez has led many of the efforts around data science, together with folks in CSE and beyond. What's interesting as well is that there are a lot of cool things about the course — for example, the gender ratio in the course is much more equal.

The focus of this talk, however, is on responsible data science. In order to ground this work, I'm going to be using the concept of sociotechnical systems. "Sociotechnical systems" is a term that comes from the science, technology, and society community, but fundamentally it gets at the interaction between people and technology, and you'll see that throughout the talk I return to this as an important component of responsible data science.

So this talk has three parts. First, I'm going to give you some basics.

In the basics, what I really hope to do is unpack, for you, a number of terms that are being used in the media — algorithms, AI, machine learning. Then I'm going to go over some research, and then I'm going to spend some time going over cautions and things to be aware of. Throughout the talk, what I'd like you to keep in mind is the how, the when, the why, and the why not. I'm going to try to give you enough of an intuition about how some of these things work that, ideally, you'll have some understanding of when they might be applicable and when they might go wrong.

So let me get started with the most basic piece, which is: what is an algorithm? This term is being thrown around by the media quite a bit, and the basic idea is that it's just a kind of recipe for doing something. People use them all the time — any time you have some task to do. I'm going to bake a cake: I need some ingredients, I have some steps, I follow those steps, and I get a cake out. That's a human algorithm. When the term is used by the media, they're talking about computer algorithms. So I'm going to give you a crash course in algorithms. I'm going to go through five different kinds, and as I go through them, keep in the back of your mind the different ways they could be used, but also the different ways they could potentially go wrong.

First one: straight-line algorithms. Okay, everybody, how many of you have ever written a computer program? Okay. So the very first program you probably wrote was a straight-line program, where you basically just go through some steps in order — this is an example of one that computes your account balance after you make a withdrawal. The next kind of algorithm is a rule-based algorithm, where you have some sort of conditionals, some sort of if-then. Even in my little code snippet, some of you may have thought: well, it might have been smart to check whether you had enough money in your account before you made the withdrawal. The interesting thing is that these are the kinds of programs often used for diagnosing things — here's a little snippet of how you might diagnose what's going wrong with your printer if it isn't working. The key thing to note is that these kinds of algorithms go by the term "expert systems" in AI. They were initially super popular in the '80s, but they're still very common for detecting problems and debugging things — those annoying call centers that you call, which then walk you through a script, are a form of expert system.

Okay, now let's get to a data-driven algorithm. A data-driven algorithm is one where you have some data coming in, you go through the data, you do some counts, and you check how often something occurs. As an example, here I have some shopping data, and I've counted the number of times someone bought bananas and milk, the number of times they bought bananas and bread, bananas and carrots, and, you know, whatever it is in that white jar. Then a new customer comes in with bananas, and I'll pick whichever pairing happened most often — maybe most often it was milk — and I'll say: oh, go down the milk aisle.
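To make this concrete, here is a minimal Python sketch of these first three kinds of algorithms. This is not the code from the slides; the balances, the basket data, and the function names are all invented for illustration.

```python
# A minimal sketch of the first three kinds of algorithms discussed above.
from collections import Counter

# 1. Straight-line algorithm: fixed steps, no branching.
def withdraw_straight_line(balance, amount):
    balance = balance - amount           # one step after another
    return balance

# 2. Rule-based algorithm: an if-then conditional guards the step.
def withdraw_rule_based(balance, amount):
    if amount <= balance:                # the check a careful programmer adds
        return balance - amount
    return balance                       # refuse the overdraft

# 3. Data-driven algorithm: count co-occurrences, suggest the most frequent.
baskets = [                              # toy shopping data
    {"bananas", "milk"},
    {"bananas", "milk"},
    {"bananas", "bread"},
    {"bananas", "carrots"},
]

def recommend(item, baskets):
    counts = Counter()                   # what else shows up with `item`?
    for basket in baskets:
        if item in basket:
            counts.update(basket - {item})
    return counts.most_common(1)[0][0]   # the most frequent companion

print(withdraw_straight_line(100, 40))   # 60
print(withdraw_rule_based(100, 140))     # 100: withdrawal refused
print(recommend("bananas", baskets))     # 'milk'
```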
Now, these kinds of algorithms are often called machine learning algorithms, and this one in particular is an example of a recommender system. You can get fancier than this — you can have some statistical model behind it, and so on — but the key underlying thing being done is counting. Keep that in mind when you're thinking about when these will work and when they won't. Another very useful kind of algorithm is a randomized algorithm. Randomized algorithms have some sort of simulated coin flipping involved, and they're super useful for different things. One thing they're really useful for is searching very large spaces. Another is simulating draws from distributions in some sort of statistical model. And they're also useful if you're designing a game where you want it to stay interesting, so the same thing doesn't happen every time you play.
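Here is a minimal sketch of a randomized algorithm in that spirit — simulated coin flips used to estimate a quantity, the same basic idea behind sampling from distributions in statistical models. The sample count is arbitrary.

```python
# A minimal randomized algorithm: Monte Carlo estimation of pi.
import random

def estimate_pi(num_samples=100_000):
    inside = 0
    for _ in range(num_samples):
        x, y = random.random(), random.random()  # random point in the unit square
        if x * x + y * y <= 1.0:                 # inside the quarter circle?
            inside += 1
    return 4.0 * inside / num_samples            # area ratio approximates pi

print(estimate_pi())   # roughly 3.14, and different on every run -- by design
```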

And the last algorithm category that I want to go over is deep learning. How many of you have heard of deep learning? Okay, so deep learning is getting a lot of attention these days. At the simplest level, deep learning is about constructing a super-simplified, abstract neural network. When you have large numbers of input-output pairs, these networks are very good at coming up with a compressed representation for them — they're good at memorizing large amounts of data and constructing an abstract representation. The models are considered "black box" because, just by inspecting a neural net, it's really hard to figure out what exactly it's doing. And there have been some pretty serious issues — there's a ton of examples of deep networks gone wrong. One of the first basic ones came out of UC Irvine, where they were looking at a deep network trained to recognize the difference between huskies and wolves. It performed really well on the training data; then they took some new data, and it didn't do so well on that. It turned out that what the neural net had memorized was that all of the wolves had snow in the background and all of the dogs had grass in the background — that's what it had learned. So a colleague of mine, Rich Caruana — and a number of people are advocating caution with these — has this line: "friends don't let friends use black box models." And I actually think this Wired title gets it right: "Greedy, Brittle, Opaque, and Shallow: The Downsides to Deep Learning." There are some really cool things these models can do, but there are cautions, and I'm going to return to those in a later part of the talk.

So: you have just had a crash course in algorithms. You get your gold star and check mark — you have now had an extensive education in data science. I'm now going to turn to research, and in talking about my research I'm going to come back to these algorithms as we go, and I'm going to return to these sociotechnical systems, which are complex, connected, heterogeneous, and so on. One thing that all of these machine learning and data science algorithms typically do is take this rich structure and flatten it.

They flatten it into tables, where each row in the table is treated independently and atomically — they take this kind of cool, interconnected data and put it into a table. So what are the issues with that? It turns out there are a bunch. First off, this flattening is oftentimes making incorrect independence assumptions. Further, the models often end up not being interpretable — and here "not interpretable" in a different sense than for deep learning — because oftentimes you do a lot of feature engineering to transfer that rich, graphy structure into the columns of the table. You wrote some code to do that, you lose the code, and you no longer remember how you derived the features — very bad. It's not declarative in that way. But the issue I want to emphasize is that flattening doesn't support collective reasoning. Collective reasoning is the idea that, rather than treating all of the inferences independently, we model the dependence between them. In the particular way we model that dependence, we can use local information about whatever we're trying to make inferences about, but then also use the relational structure. You can use this for prediction — and I'll probably lapse into talking mostly about prediction — but you can also use it for discovery, just trying to understand what's going on in your domain or your data. And you can also use it for causal modeling. Causal modeling is important when you actually want to make an intervention. I'm not going to talk about it much here, but I'm very interested in causal modeling in networks, where there might be interference and so on — so if anybody's interested in that, come talk to me afterwards.

So let me go through some examples of collective reasoning. They're going to be simple examples, but I hope you'll see the utility of each. The first is information integration. Information integration is something that happens all the time in any kind of data science problem: you have different digital representations for things, and you have to figure out which ones are talking about the same thing. Think about bioinformatics: you have a bunch of articles talking about genes, which are referred to in different ways — how do I figure out which are the same? I have different treatments — how do I figure out which drugs are the same? Or something like the digital humanities, where you have a bunch of text, maybe video, maybe images, and again you're trying to figure out whether this person over here and this place over there are the same across these different texts. The challenge with all of this is that there are typically all sorts of noisy clues that you essentially want to piece together. So I'm going to do a very relevant Santa Cruz example. This is a collection of documents about the university, talking about our mascot, and the question is: which references refer to the same real-world entity? I'm going to start by making use of local information within a single document.
I'm going to figure out which mentions refer to the university — to UC Santa Cruz — and because I've figured out that those two are the same, I can then figure out that "the university" and "the school" are the same; I can match "mascot" and "banana slug." This is something that in natural language processing is referred to as coreference resolution: figuring out whether all these mentions refer to the same thing within, say, a single document. But now I can go to a whole document collection and do the same kind of reasoning, except now the reasoning goes across the different documents to real-world entities. And then I might have another document — this one happens to be about a falafel restaurant in Santa Cruz, and it talks about the owner, Sam — and I figure out that that Sam is not the same as Sammy the Slug. So there's this process of figuring out what's the same and figuring out what's different.
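As a rough illustration of the idea — not the actual system — here is a sketch that scores pairwise similarity between mentions and then merges matches transitively, so that if A matches B and B matches C, all three end up in one cluster. The mentions, the alias hints, and the 0.8 threshold are all made up.

```python
# A minimal entity-resolution sketch: pairwise similarity + transitive merging.
from difflib import SequenceMatcher

mentions = ["UC Santa Cruz", "the university", "the school",
            "UCSC", "banana slug", "the mascot"]

# Noisy clues: a few hand-coded alias hints stand in for the richer
# contextual evidence a real system would extract from the documents.
aliases = {("UC Santa Cruz", "the university"),
           ("the university", "the school"),
           ("UC Santa Cruz", "UCSC"),
           ("banana slug", "the mascot")}

def similar(a, b):
    if (a, b) in aliases or (b, a) in aliases:
        return 1.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Union-find keeps the merged clusters consistent under transitivity.
parent = {m: m for m in mentions}
def find(x):
    while parent[x] != x:
        x = parent[x]
    return x

for i, a in enumerate(mentions):
    for b in mentions[i + 1:]:
        if similar(a, b) > 0.8:          # illustrative threshold
            parent[find(a)] = find(b)    # merge the two clusters

clusters = {}
for m in mentions:
    clusters.setdefault(find(m), []).append(m)
print(list(clusters.values()))
# e.g. [['UC Santa Cruz', 'the university', 'the school', 'UCSC'],
#       ['banana slug', 'the mascot']]
```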

This has a number of challenges. One of the biggest is that there's tons of uncertainty. I made it seem totally obvious which mentions referred to the same thing; usually there's a lot more ambiguity, and handling all that ambiguity is the challenge. And I showed this to you for four documents — how do you do it at scale, when you're overloaded with digital information? So that's the first example of collective reasoning.

The next one is collective classification. Collective classification is just the idea that in many data science problems you have a bunch of entities that you're trying to label or classify in some way — it could be according to some demographic attribute, like gender, or some political persuasion, and so on. I'm going to do a slightly more complex example: trying to figure out, in an online debate, different people's positions on a topic. So I have a debate, people are posting things, and those arrows — a green arrow means "I agree with you," a red arrow means "I disagree with you." The question is: what is each person's stance — are they pro the topic or against it? This is a problem that my former PhD student Dhanya Sridhar, who's now doing a postdoc at Columbia, worked on together with Marilyn Walker; Marilyn Walker does tons of work in this space of dialogue. I'm going to go through a simplified version. To figure out users' stances, I can start by looking just at a post, and I can infer something from it — maybe sentiment, or other words used in it — about what the stance is likely to be. But the relational information is exactly about then using the agree and disagree links to reason about what the stances should be. I can reason about them with statements like: if someone's pro and another person agrees with them, then that other person is pro; if someone's pro and a person disagrees with them, then that person is likely to be anti. And just as before, even though I'm writing these as logical rules, they're actually uncertain — they won't always hold — but they're little clues about what the stances might be. So I can go through and make the inferences. Now, one of the challenges that may have crossed your mind as I described this is: well, what if I didn't want you to figure out my stance? Privacy. With this kind of inference, even if I try to hide an attribute — maybe a demographic attribute — it's oftentimes very easy to infer it anyway through this kind of reasoning. I've been in a number of conversations around big data and privacy where people say: oh, the way we'll make sure this can't be disclosed is we'll say you're not allowed to store it. Well, if you're able to infer it with high probability, then that's an important kind of leak.
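To see how easy that kind of leak can be, here is a toy sketch: a simple majority vote over the neighbors who do reveal an attribute recovers a hidden attribute with high confidence. The network and attribute values are entirely invented, and a real model would use much richer probabilistic inference.

```python
# A minimal sketch of relational attribute inference (the privacy leak).
from collections import Counter

friends = {
    "alice": ["bob", "carol", "dave"],
    "bob":   ["alice", "carol"],
}
# Publicly stated attribute values; Alice has hidden hers.
stated = {"bob": "party_x", "carol": "party_x", "dave": "party_y"}

def infer_hidden(user):
    # Majority vote over neighbors who reveal the attribute -- a crude
    # stand-in for the probabilistic inference a real model performs.
    votes = Counter(stated[f] for f in friends[user] if f in stated)
    value, count = votes.most_common(1)[0]
    return value, count / sum(votes.values())

print(infer_hidden("alice"))   # ('party_x', 0.67): hidden, but leaked anyway
```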

My former PhD student Elena Zheleva, who's now an assistant professor at the University of Illinois at Chicago, actually did some interesting work on this way back in 2009, basically on Facebook, looking at the ways you could figure out demographic attributes through group memberships. We're used to the idea that your friends tell you something about yourself, but it turns out that group membership tells you even more. And interestingly enough, for group memberships, the group owner had control over that information — you didn't even have control over who saw it. So this whole topic of privacy, and particularly privacy in sociotechnical systems, is really interesting and important to be aware of.

The last pattern I want to talk about is recommendation. Recommendation is something I'm sure all of you are familiar with — a very common setting where you're trying to recommend some item, or give users a ranked list of documents, or the like. Here's an example: I'm trying to figure out whether a user will like an item, and which items a user will like. I can use local information: if a user is interested in a topic, and the item — in this case maybe a news article — is about that topic, then I can say the user will like that item. But I can also use the graph information: if a user likes an item, and another item is similar to it, then the user will like that second item. I can also reason about the similarity between users: if a user likes an item, and there's another user who's similar to that user, then the second user will like that item too. So I can infer all these little triangles. What's the challenge here? The challenge is how to define the similarity measure. There isn't going to be one unique similarity measure to use — there are all kinds of ways you can be similar — and part of our work is very much about letting you specify a bunch of different similarity measures and combining them together in a tractable way.
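Here is a toy sketch of those two triangle rules. The ratings and the choice of Jaccard similarity are illustrative only — this is exactly the spot where a real system would plug in many competing similarity measures and combine them.

```python
# A minimal sketch of the two "triangle" rules:
#   user likes item1, item2 co-occurs with item1  =>  user may like item2
#   user1 likes item, user2 is similar to user1   =>  user2 may like item
likes = {
    "ana":  {"article_a", "article_b"},
    "ben":  {"article_a", "article_c"},
    "cruz": {"article_b"},
}

def jaccard(s, t):
    return len(s & t) / len(s | t) if s | t else 0.0

def user_similarity(u, v):
    return jaccard(likes[u], likes[v])        # similar taste so far

def predict(user, item, threshold=0.3):
    # Rule 1: the user already likes something that co-occurs with this item.
    fans = [likes[v] for v in likes if item in likes[v]]
    item_evidence = any(likes[user] & (s - {item}) for s in fans)
    # Rule 2: a sufficiently similar user likes this item.
    user_evidence = any(item in likes[v] and user_similarity(user, v) >= threshold
                        for v in likes if v != user)
    return item_evidence or user_evidence

print(predict("cruz", "article_a"))   # True: ana likes a and b; cruz likes b
```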

So the commonality across all of these is that there's some kind of relational structure, there are these complex, heterogeneous interdependencies, and there's noise and uncertainty. My research is all about taking a more nuanced approach that takes into account the relationships, but at the same time the context and the probabilistic dependencies. I've worked in this space for a long time, and I'm going to tell you about some of our most recent work, which is called probabilistic soft logic, or PSL. It's a programming language that allows you, in an easy and scalable way, to represent these kinds of collective inference problems. The cool thing about it is that it combines those rule-based algorithms we saw before with data-driven approaches. It combines logic and probability. It combines both hard and soft constraints — hard constraints that have to be satisfied, and soft constraints that you'd like to have satisfied. Fundamentally, it combines knowledge and data in a really interesting way. And at this point I have to acknowledge my awesome group of students — this is LINQS, for Lise's inquisitive students; I have some of them here in the audience, I think. They're awesome. They're the ones that make being a professor awesome — that's the best part. So thank you, guys. PSL, then, is a probabilistic programming language for collective reasoning, and the way a model is encoded is this: we have some weighted rules, we have some data together with the rules, and — the cool part — the program then defines a probability distribution over the collective outcomes. This is an example of a PSL program for that stance example from before: it's pretty simple to write and pretty interpretable. It basically takes this program, together with some data, and instantiates them into a particular kind of distribution.
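To give a flavor of what such a program expresses, here is a sketch of PSL-style weighted rules for the stance example, together with the soft-logic relaxation discussed next. The rule syntax in the strings is abbreviated and the weights are invented — the real language and worked examples are in the open-source release mentioned below.

```python
# A sketch of the idea behind a PSL program for the stance example.
# In PSL, Boolean truth values are relaxed to [0, 1] (Lukasiewicz soft
# logic), and each ground rule contributes a hinge-loss "distance to
# satisfaction" that a convex optimizer then minimizes.

rules = [
    # (weight, "body -> head"); illustrative syntax and weights
    (1.0, "LocalProEvidence(U) -> Pro(U)"),
    (1.0, "Pro(U1) & Agrees(U1, U2) -> Pro(U2)"),
    (1.0, "Pro(U1) & Disagrees(U1, U2) -> !Pro(U2)"),
]

def soft_and(a, b):
    # Lukasiewicz conjunction: max(0, a + b - 1)
    return max(0.0, a + b - 1.0)

def distance_to_satisfaction(body, head):
    # "body -> head" is satisfied when head >= body; otherwise the hinge
    # max(0, body - head) measures how badly the rule is violated.
    return max(0.0, body - head)

# One ground rule: Pro(ana) = 0.9 and Agrees(ana, ben) = 1.0.
body = soft_and(0.9, 1.0)                      # 0.9
for pro_ben in (0.2, 0.9):
    print(pro_ben, distance_to_satisfaction(body, pro_ben))
# Pro(ben) = 0.2 incurs a 0.7 penalty; Pro(ben) = 0.9 incurs none.
# Weighted sums of such hinges define the hinge-loss Markov random field,
# and MAP inference becomes convex optimization over the truth values.
```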

The distribution is a particular form of Markov random field — a hinge-loss Markov random field. If I were in a more technical setting, I would go off on a deep dive here for an hour and tell you all about it, but since I'm not, I'll give you the high level. The really cool high-level point is that it takes logical inference, which is known to be intractable, and maps it to convex optimization. The neat thing — this is very much the work of my former PhD student Stephen Bach, now an assistant professor at Brown, and my former postdoc Bert Huang, now an assistant professor at Virginia Tech — is that they brought together those randomized algorithms and some results from that area; some results from machine learning on graphical models and certain kinds of relaxations for them; and soft logic, something from the AI world where, rather than Boolean true/false values, you have values that can be interpreted either as degrees of truth or as similarities. We were able to show that there was one formalism with an equivalent optimization under these three quite different interpretations, and that it ends up giving a scalable way of doing inference in these large models. And that's really cool: whenever you end up with the same optimization under three quite different interpretations, I think it shows there's something fundamental there. There's still a lot more work to do — if this excites any of you, come talk to me, because there are lots more things we can do from here. So, PSL in a nutshell: it does inference really fast, and you can make it even faster using state-of-the-art optimization techniques and distributed processing; in turn, there's a lot of fine-grained local parallelism. I'm not covering it here, but you can learn the rules, you can learn the weights for the rules, and you can also deal with latent variables, which I'll give an example of in just a minute. It combines data and knowledge in this interesting way that ends up giving you models which are much more interpretable. And the cool thing is that it's open source — the code, data, and tutorials are all available online — and if any of you are interested in using it, we like to help people, so come talk to us.

Let me now go over two examples that are a little more complex than the ones so far. The first is one where we used PSL models to detect cyberbullying. We were able not only to look at different messages and infer when they were bullying messages, but also to try to tease out the social structure — what the different roles are — and to discover the different kinds of attack types. This is work by Sabina Tomkins, a former PhD student who just graduated from here last year and is now doing a postdoc at Harvard. This is the program — I won't make you read it, but if you want the code, it's here. She demonstrated it on cyberbullying on Twitter, which, as you can imagine, has a lot of complexities: you see a tweet, and is it really a bullying tweet or not?
One of the things we were able to show is that by modeling the uncertainty in the label, you could actually do a better job. She was also able to uncover some evidence of certain kinds of power dynamics.

And she found things like what the most common attack types are, and so on. So that's one example. Another example in the social domain is some work by my former postdoc Bert Huang, whom I mentioned before, on inferring social trust. Here I have a set of individuals; the green links mean they trust one another, and the red links mean they distrust each other. There are actually two very well-known, competing theories of trust in social psychology. One is called structural balance. This is the idea that a friend of a friend is a friend, an enemy of an enemy is my friend, and so on — it enumerates the combinations of trust and distrust in a triangle and says which ones are stable, and the theory says you'll move toward those stable configurations. The competing theory is social status theory, which is much more about hierarchical relationships: the idea that I trust people who have more expertise than me, or who are higher in the hierarchy, and I distrust the people below me. The interesting thing is that this theory licenses different combinations of trust and distrust links. So what we could do is build a PSL model for each of these — a balance model and a status model — and we compared them to a baseline and to what were, at the time, state-of-the-art methods. This is for predicting distrust links, which is actually really hard to do, and both of our models did better than the state-of-the-art methods. But the thing we did next is, I think, really cool: we added in a psychological variable — a latent variable that captures how trusting, or how trustworthy, an individual is — and adding this to the model boosted performance significantly. I think this happens a lot in these kinds of models: judicious insertion of latent variables, especially when you're able to give them some interpretation, is very interesting. The cool thing is to then go back and look at the organization and ask: what is it about the places where the balance model was working? What is it about the parts where the status model was working? Can I say more?
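Here is a toy sketch of the two theories, checked on single triangles; the encodings are simplified stand-ins for the actual PSL models, and the example people are invented.

```python
# Structural balance vs. social status on signed triangles.
# Edge signs: +1 = trust, -1 = distrust.
from itertools import permutations

def balanced(sign_ab, sign_bc, sign_ca):
    # Structural balance: a triangle is stable when the product of its
    # signs is positive ("a friend of a friend is a friend", "an enemy
    # of my enemy is my friend").
    return sign_ab * sign_bc * sign_ca > 0

def status_consistent(edges):
    # Social status: a positive edge u -> v says v ranks above u; a
    # negative edge says v ranks below u. The triangle is consistent if
    # some ranking of the three people satisfies every edge.
    people = sorted({p for e in edges for p in e[:2]})
    for order in permutations(people):
        rank = {p: i for i, p in enumerate(order)}
        if all((rank[v] > rank[u]) == (sign > 0) for u, v, sign in edges):
            return True
    return False

print(balanced(+1, +1, +1))                     # True: all friends, stable
print(balanced(+1, +1, -1))                     # False: unstable triad
print(status_consistent([("a", "b", +1),        # a defers to b,
                         ("b", "c", +1),        # b defers to c,
                         ("a", "c", +1)]))      # a defers to c: True
print(status_consistent([("a", "b", +1),
                         ("b", "c", +1),
                         ("c", "a", +1)]))      # a cycle of deference: False
```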

We've done a lot of other projects. Just to mention some: work by Arti Ramesh, a former PhD student who is now an assistant professor at SUNY Binghamton, on MOOCs, looking at engagement and different kinds of social and behavioral effects in MOOCs. Pigi Kouki, who just recently graduated, has done a bunch of work on recommendation, but also on explanations — why did you give me that recommendation? And Jay Pujara, who is now an assistant professor at USC, did a lot of work on knowledge graphs, which goes a bit beyond the information integration I've talked about so far: you're trying to extract these knowledge structures from text or other digital information.

So that was the research, and I'm very, very happy to talk more about it. Now I want to go into some cautions. For the cautions — what can go wrong? Go back to those little algorithms I gave you at the beginning and think about some of the things that can go wrong. It turns out a lot can go wrong. I'm going to go through some examples that I've gotten from my colleagues — I'm actually going to be teaching a course on this next year, so if you have other examples, send them my way. These have been getting a lot of press, and I'm just going to go over a couple of them. The first example comes from Amazon; this was covered by Bloomberg in 2016. Amazon built a tool to decide where they should offer same-day delivery — it's a Prime service — and they trained it on users' buying history, income, and location. Focus right now on just the left side of this slide: the part in gray shows the part of Atlanta that did get the service — the northern part got it. This is Chicago: everybody except the South Side got it. And this is Boston: everybody except Roxbury got it. What happens to be true about these assignments is that predominantly white areas got the service and predominantly African-American areas did not. So, essentially, they built a tool that did not explicitly use race but — not surprisingly, given those inputs — basically amounted to a digital redlining algorithm. They got a lot of flak, and everybody in those areas is now receiving the service. That's one example. Another Amazon example: they built a tool to recommend resumes. And guess what — they trained it on their own resumes, and their own resumes were predominantly male. And guess what: that algorithm ended up showing a gender bias.

So Amazon's not alone; here are some Google examples. As a matter of fact, the Science and Justice Research Center is going to have a speaker in March who will be talking about bias in search results, so I'll cover a couple of different Google examples. One of them is in Smart Compose, where they're filling in pronouns. It turns out that if you type "CEO," the pronouns it fills in are all going to be male pronouns. This one is Google Translate: you take a language like English that is gendered, and you say "she is a doctor, he is a nurse"; you translate it into Turkish, which is not gendered; you translate it back into English; and you get "he is a doctor" and "she is a nurse." But probably the example that has gotten the most attention is around recidivism prediction. How many of you are familiar with this story, the ProPublica story?
Okay, so, not a lot of you. This is really interesting, and I'm not going to be able to do justice to all of its nuances, but basically there is an algorithm that is being used widely in the criminal justice system — for pretrial release, bail, and sentencing — where the algorithm is supposed to predict the defendant's likelihood of committing a crime. Well, it turns out that ProPublica was able to show, in a particular county, that there was strong racial disparity in the effects. First off, false positives — where you say someone is high risk, but they're not — happened twice as often to African-Americans as to whites. Then false negatives — where you say someone is low risk, but they actually do go on to commit a crime — happened almost twice as often for white defendants. So the issue here is over-predicting recidivism for African-Americans and under-predicting recidivism for whites.
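The underlying audit is a simple computation: false positive and false negative rates, broken out per group. Here is a sketch; the records are invented, and the point is the computation rather than the numbers.

```python
# A minimal sketch of a per-group error-rate audit.
# Each record: (group, predicted_high_risk, actually_reoffended) -- invented.
records = [
    ("group_a", True,  False), ("group_a", True,  False), ("group_a", True, True),
    ("group_a", False, True),  ("group_a", False, False),
    ("group_b", True,  False), ("group_b", False, True),  ("group_b", False, True),
    ("group_b", False, False), ("group_b", True,  True),
]

def error_rates(group):
    rows = [(p, y) for g, p, y in records if g == group]
    fp = sum(1 for p, y in rows if p and not y)    # flagged, didn't reoffend
    fn = sum(1 for p, y in rows if not p and y)    # cleared, did reoffend
    negatives = sum(1 for _, y in rows if not y)
    positives = sum(1 for _, y in rows if y)
    return fp / negatives, fn / positives          # FPR, FNR

for g in ("group_a", "group_b"):
    fpr, fnr = error_rates(g)
    print(g, f"FPR={fpr:.2f}", f"FNR={fnr:.2f}")
# Similar overall accuracy can still hide very different FPR/FNR across
# groups -- exactly the kind of disparity the ProPublica story documented.
```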

So, what went wrong? A bunch of things. One thing that went wrong in all of these examples is biased data. If the input to your system is biased — whether the bias comes from selection bias, institutional bias, or societal bias — then the output is going to be biased. This is the famous computer science phrase, "garbage in, garbage out": if you give it bad data, you're going to get bad results out. Now, it turns out there are a lot of other kinds of bias, beyond these, that are important to take into account, and one of them is automation bias. This is a well-known effect: people have this habit of saying "the algorithm told me to do it" — they trust the algorithm too much. This is a serious issue. Sometimes you simply trust the algorithm; sometimes it's a hard decision, and so "the algorithm made me do it." In the case of recidivism prediction it's actually even worse, because in a number of states, if the judge disagrees with the algorithm, they have to write a report explaining why — which imposes additional overhead. All of this leads to what's referred to as algorithmic discrimination: the idea that algorithms can amplify bias, which we just saw; that they can also operationalize it, so that it's applied much more widely than any individual's decision would be; and that they legitimize it, so that it's given too much credence.

Not surprisingly, this has become a very active area of research. As an example — this is from a tutorial by Solon Barocas and Moritz Hardt — in 2011 there were hardly any papers about machine learning and fairness; by 2017 there were a ton of them. There are several workshops, and there's now actually an ACM conference on fairness, accountability, and transparency that is specifically trying to bring together technologists, policy folks, and practitioners to talk about these issues. So there's been a huge surge in academic research. Also not surprisingly, there's been some criticism of that research — a healthy amount of critique of these technological fixes: oh, I'm going to have my machine learning algorithm, and then I'm going to bolt an ethics box onto the end to fix everything. The cool thing is that I think some real collaborations are starting to form, where discovery of what was happening on the technical side, together with interpretation of it on the social side, is producing new and interesting things. I really like this quote by Bill Howe at the University of Washington: "Responsibility means going beyond technical or technocratic solutions to also involve substantive debate about ethics, values, and competing interests. How is ethical expertise defined? Who gets to be at the table? What are the limits of certain kinds of solutions?" So that's what's happening in academia; it turns out things are happening in the real world, too. This is the first law, in New York, on automated decision systems, and there are a number of other efforts. I'm happy about this — there's a lot of energy and research happening here.

Now I want to talk about what else can go wrong — the things that I don't think are receiving the attention they deserve. First is just poor quality. Just as there's bad data, there are bad algorithms. If you look at the first line of that ProPublica piece: this thing had only 61 percent accuracy. That's barely better than flipping a coin.

We should not be using these things in these settings. The second is magical thinking. A lot of what's being written right now about machine learning and AI — all the things they can supposedly do — is overinflated, so you really need to question the claimed capabilities, and I hope I'm giving you some tools that will help you be a little more skeptical of some of the claims. To unpack why you should be skeptical, I'll mention one technical problem, the frame problem. This is the basic idea — which you should have seen from the examples I gave — that these are really crude models. They make simplifying assumptions; they can take into account only some limited amount of information; and from that limited information they can make big mistakes. So AI and machine learning methods can work well, but usually in very constrained settings — this is what's referred to as weak AI, or narrow AI. The fourth thing that can go wrong is values. All of these algorithms, at their heart, are optimizing some metric, and there's a question of who supplied the metric: who gets to decide it, whose values are encoded in it, and so on. Another issue is just bad code. Just as we had bad data and bad models, you can have bad code. The whole area of software engineering — our dean, that's one of his research areas — is about how you ensure that code doesn't have mistakes in it. Now, when you move to data science algorithms and data-driven algorithms, how do you ensure things like statistical properties, and good science practice around reproducibility, generalizability, transparency, and interpretability? These are all important. And the last one I want to mention is taking into account that algorithms can actually shape people. Again, going back to our sociotechnical systems: we have people interacting with technology and algorithms, and people adapt their behaviors in response to algorithms — you can probably think of something you've done yourself. Sometimes it's in benign ways, sometimes in adversarial ways, and sometimes just in unintended ways. We really need to think more carefully about the potential impacts on people's behaviors — in particular on their agency and their autonomy — as we design ever more complex adaptive systems.

So, in closing, I have a couple more things I'd like to go over. First off, I'm going to give some advice; my students and colleagues will attest to the fact that I love to give advice. First, advice to computer scientists — some things to keep in mind. First off, data is not objective: it has biases, historical context, and more. This is something social scientists are totally well-versed in, but it's not as commonly taught or emphasized in computer science. Technology is not neutral: technologies have values baked in, and it's important to think about them. And there's a moral imperative to understand the domains in which your work is being applied, and to consider the domains to which your work could be applied. There's a ton of new research that needs to be done to deal with these incomplete, uncertain, biased sociotechnical systems.
And finally, we really need to educate people — about some of the limitations of algorithms, machine learning, AI, and more. Okay, so: advice for collaborating with computer scientists — some things to keep in mind. One is that we love abstractions and simplifications. If you're a humanist, this is kind of a totally different way of thinking — humanists are all about the specific, the context, and so on — so keeping in mind that you have different mindsets is actually useful. We also really like logic: we like zeros and ones, true and false, black and white. That's another useful thing to keep in mind. We don't always have the greatest people skills — yes, there are more introverts, and folks with Asperger's, among other things, among computer scientists and engineers, and it's useful to keep this in mind. I remember the first student I had who had Asperger's: once I figured out that that was the issue, it made everything much, much easier to deal with. And we're not typically trained in ethics — yet. There is a huge surge of interest nationally in ethics training — ethics training that's not just a band-aid, not just "here's your ethics class," but that really runs throughout the curriculum.

And, like many of you, we want to make the world a better place, and we want to collaborate with folks in doing this. So, what is responsible data science? I've emphasized the literacy aspects of it: computational literacy, statistical literacy, ethics literacy, justice literacy, domain literacy. It's also data science for social good. This is an active area where a number of universities and other organizations are developing programs around data science for social good — Rayid Ghani at the University of Chicago was one of the first people to do this, and I've had him come speak on campus a number of times. One of the things I find interesting is that the most interesting ethics discussions are coming out of the folks working in this space; some of the deepest discussions I've heard about ethics are from people coming from here. So responsible data science also means working on these hard societal problems. We can use data science to discover bias, to discover injustices. We can look at hard problems like homelessness. And we can look at things like education justice — Rebecca London and Rod Ogawa are doing some cool work here, combining education outcomes with juvenile justice and social program data to understand what's happening, especially to marginalized populations. There's work by a former PhD student of mine combining two hard things — the environment and human trafficking — and looking at what the impacts are. So I think we have a choice. We can build sociotechnical systems that are dystopian — where we automate bias, where they surveil us, where they make us angry and unhappy — or we can build utopian ones, where we collaborate, take collective action, and help make the world a better place. I think it's our choice, and we really need to be thinking about the bigger picture: which are we going to choose? So, returning to my goals: hopefully I've managed to teach, to excite, to caution, and to give you some tools.

In terms of takeaways: there's this literacy aspect — computational literacy, statistical literacy, ethics, justice — and then a process-oriented view toward designing and critiquing the systems we're building. And to do this, we really need to collaborate — across engineering, the humanities, social science, art, education. And what better place to do this than UC Santa Cruz? We have strength in all these areas. So let's work together. Thank you.

Thank you, Lise. You know, I'm reminded of something my mother always said to me: just because you can doesn't mean you should — which seems to be kind of a theme here. The thoughtfulness that you're bringing to this is, I think, just incredibly important, and what's also important is that it's not just about the technology, but about the social context that surrounds the technology. So, as you said, UC Santa Cruz is a wonderful place to explore those questions. And speaking of questions, we have time for some. I think there are some runners.

This is not my area, so a naive question — I'm right down here in front of you. These algorithms that look at Facebook posts or Twitter messages: is the core data just individual words — are they really just doing word counts and associations — or do they do something more sophisticated, like the meaning of a word in a sentence?

Okay, so the question was about — well, we've got you on the mic, so I don't have to repeat it, right? And I wanted to put up some resources. First off, I have to give my disclaimer regarding Facebook and Twitter. But to answer: hopefully you could see, even from what I was saying, that just looking at the text, you can do a surprising amount — text alone would just be the word counts. But looking at the relational structure — who liked things, who's friends with whom — and other kinds of attributes: that's something that leaks a lot of information. In terms of inferring additional meaning: yes, there are techniques for trying to infer meaning from text; sometimes that's grounded in some way, and sometimes it's not so well grounded — but that's kind of a separate topic. Because I think part of what might be behind your question is: how much do they know about us? One of the things these companies do is aggregate a lot of external information, and that external information gives a ton of semantics. So it's really through the combination of these things — although I will admit that when I've gone to talk, for example, to Facebook, they say they're not doing a lot of graph inference; a lot of what they're doing is pretty basic.

In one of your last slides you showed the picture of the utopia and the dystopia, and you said that we can choose which one we have. But do we really have that ability, or is it out of our hands? You know, this data is external, like you said.

So, first off, I think it is important that we think beyond just the technology — that we understand important concepts: economic concepts, political concepts, social concepts, psychological concepts — so that we understand where we can effect change and where we can effect it most effectively.

I do think that right now we're on the cusp of an opportunity for significant change, so yes, now is a time — especially around privacy — when there's a changing feel to the debate, and participating in it is really, really important. And then also: think of how addicted you are to your little devices, and try to change some of your own behaviors, which you do have control over.

Lise, first, thank you very much for your talk. I don't know how many in the room know Hannah Arendt or her work, but a lot of us in the humanities and social sciences are, I think, in sync with her these days. She wrote a book called The Human Condition, and in it she opens with her worries about the rise of the mathematical sciences and how they might lead to a world in which we could no longer speak and talk about that which we do. What I love about what you've done here is that you have done the work to speak about that which you do — which can be very technical — but you've made it into a language that we can share and talk about, and I just want to say that I really appreciate it. And now I have five different versions of algorithms to work with, where I used to have only one — and I used to code, back when it was BASIC in the 1980s; I don't know how many people had that experience. But I want to go back to the issues Arendt alerted us to a long time ago, which are coming home to roost right now, and of which you've presented all these wonderful examples. The question we're getting to at the end is: how do we change? The last question was whether it's just that we choose, and yes, there's an agency part of this — we can choose — but who's the "we," and how are we going to choose? That becomes a really hard bit. So I want to go back to a point in your talk where you said there is a moral imperative for engineers to think about how what they're doing is going to play out. First of all, I'm interested in how you ground the moral imperative; but secondly, I want to know how you turn the moral imperative into a practical endeavor. And if we think about it — and I think universities are really important sites for this, because what are we doing other than training the next generation of scientists and engineers — how, pedagogically, do we change? What do you see as some of the pedagogical challenges in engineering schools? What do you see being tried? What do you think is promising? What do you think can happen here? So: easy questions.

This is what I love — the STS kinds of questions. Big question — awesome. And this is why I love being able to talk with folks like you around UC Santa Cruz about these topics. First off, I think it's very interesting that a lot of the work on sociotechnical systems from, I don't know, two decades ago is all super relevant now. This is something I've been a participant in — being a technologist talking to humanists about some of these concepts — and we do struggle with the language.
justice, I was like, justice? Yes, justice sounds good; of course I like justice. And then, in the past few years, kind of reading up more, I realized there's a whole freaking iceberg of meaning behind it. And right now, around ethics and technology, a number of folks have said they feel like there's a renaissance that's emerging. That renaissance is nascent right now, and a lot of it is, you know, we don't always have the language yet for talking about hard issues. I've been fortunate enough to have had some very patient economists kind of explain to me all these different structures, and philosophers explaining things, and so, kind of learning about those, I do see that the conversations that even as recently as three years ago seemed like band-aids have gone much deeper. So I see progress, but it requires this kind of respect, a respect that goes across disciplines: you know, yes, I'm willing to explain the complexities of these structures, where are the places where you can have practical impact, if I had to rank them where would I look, and so on. And what are the concepts that need to be taught? If you are trying to train folks, you know, going beyond just saying, oh, there's Kant and there's utilitarianism, they need something deeper than that. And that deeper thing comes from understanding social science more: understanding, I think, economics, power and how that fits in, understanding political science, understanding sociology, understanding psychology, and more are important in that. And yep, let's work together on this.

Lisa, I would like to ask you a question. You know, I'm probably quite impressed, because I think it's the first data science talk I've ever heard that did not include the phrase "big data." Anybody notice that? But still, it used to be "very large data." But big data, you know, to me there is an important sense in which it's something to be considered, because we think that bringing information together is going to provide us with data that will feed these models with better information, or at least more information, allowing us to make larger decisions that consider more factors. But big data also has the implication of a concentration, a concentration of power and of knowledge, and this kind of gets at an earlier question, because that concentration then becomes something that's harder to work against. So do you have any thoughts about how to deal with that aspect of the data science question, this collecting and collocating and encapsulating of large amounts of data that actually drives this whole sociotechnical enterprise?

It's a very interesting question, and there are multiple levels to answer it at. So, you know, one of the important concerns is, yes, there is a concentration of power due to a very small number of players having a lot of data. That's beyond a data science issue, and it's really important, probably the most important thing. The other piece, though, as hopefully came across in my talk, is that I don't like things to be purely data-driven. So there's the data-driven part, but then, you know, I have some knowledge, I have some theory about the way the world works, and I want a way of representing that as well and kind of marrying the two. So I didn't emphasize big data, in part because, you know, I did say we're scalable, and by the way, we do build the largest graphical models that anybody does; even Google is impressed with how big our graphical models are.
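For readers who want the "marrying the two" idea in code: below is a minimal sketch, under made-up numbers, of blending a per-user data-driven score with one hand-written relational rule. The function name, the rule, and the weight are all hypothetical; the speaker's actual systems express such rules inside large, jointly optimized graphical models rather than a fixed weighted average.

def combined_belief(data_score, friend_scores, rule_weight=0.5):
    # Blend a per-user model score with a domain-knowledge rule that says
    # "people tend to resemble their friends". A fixed weighted average is
    # the crudest possible combination; statistical relational systems
    # instead learn and jointly optimize the rule weights.
    if not friend_scores:
        return data_score
    relational_score = sum(friend_scores) / len(friend_scores)
    return (1 - rule_weight) * data_score + rule_weight * relational_score

# A purely data-driven model is unsure about one user (score 0.5), but a
# theory about the user's neighborhood shifts the combined belief:
print(combined_belief(0.5, [0.9, 0.8, 0.85]))  # -> 0.675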

Yes, yes. But I think this idea that big data is going to solve the problem: certain things are not big data problems. And, you know, since I'm interested in data science for social good, a lot of those problems are not big data issues, for a variety of reasons.

Could I ask one question? Yeah, I think we have time for one more question. Okay, lovely. Um, I think there's a very natural trade-off between keeping track... yeah, oh yeah, I'm over here, hi, oh, thanks. Hi, thank you so much for talking to us. I just wanted to ask about the natural trade-off between keeping track of those ethical concerns and implications of other people using your work, and progress. I think it's really,
