Innovation Conference: Transforming AI into Organizational Performance: Challenges & Opportunities?
Now to our next session, which is Transforming Artificial Intelligence into Organizational Performance: What Are the Challenges and Opportunities? I invite all the speakers from that session to activate their cameras and their microphones. Welcome, and thank you for coming.
Melissa, thanks for having us. I'm Rob Seamans, a professor here at NYU Stern and Melissa's colleague, and we're really excited about this next panel. I'm going to moderate; I've got a mix of industry and academic experts. I'll start by inviting the panelists to do what Melissa just said, if you haven't done so yet: unmute yourselves and turn your video on. While you're doing that, let me provide a very brief, high-level overview of what we're going to cover over the next hour. The past decade has seen rapid advances in AI, especially in controlled laboratory settings. This has led to an explosion of startups and commercial applications. But some argue that the on-the-ground truth is turning out to be a little different: that things are harder to implement in real-world settings than anticipated in lab settings. Are these minor hiccups, or are they part of a larger problem? What types of solutions are there? Are there differential impacts when it comes to large firms versus small firms, and how can we remedy that difference? To better educate ourselves on this topic, we've assembled a panel of industry and academic experts, and I'm going to let them introduce themselves to you. We're going to go alphabetically by last name, so that means we're going to start with Liz O'Sullivan. Liz, if you could take a minute and tell us about your background and what you're doing right now.
Sure. Thanks, Rob, and thanks, everybody, for having me. It's great to be here.
My name, as Rob said, is Liz O'Sullivan. I've been working in the New York City tech startup scene for pretty much my whole career, roughly 10 years now. I started out, all in the AI space but on the business side, as a product manager and product marketer for an NLP company; we were doing programmatic job matching and advertising. More recently, I've been doing a little computer vision at a company called Clarifai. My new company, Arthur AI, is an algorithmic accountability company: we monitor models to make sure that they're doing the kinds of things that you would want them to do.
In my spare time I also volunteer for a non-profit called the Surveillance Technology Oversight Project, where, again, we try to reduce the amount of AI hype that exists in the world and monitor New York City and State specifically for abuses of surveillance and other kinds of really high-powered technology. So I'm right there on the line: this technology is incredibly transformational, and we need to allow it to grow and to foster it, but there are certainly some applications that are going to do a lot of harm to society, and we need to work those out at the same time.
Thanks, Liz. We look forward to hearing in particular about some of the products that you've got at Arthur AI. Now, a quick sidebar. I was actually an English major in undergrad, so you would think that I'd get the alphabet down, right? But Natalia, you're laughing. Yeah, I've already messed it up. I said we'd go alphabetically by last name; technically, that would mean Levina first. So, Natalia, why don't you introduce yourself?
No problem, Rob. Since I'm not a native speaker, I don't keep track of the alphabet either. I'm Natalia Levina. I'm a professor at the NYU Stern School of Business. I was the inaugural director of the Fubon Center, which is running this wonderful conference, and I am happy to see our second conference off to a great start. The reason I'm on this panel is that I have just completed a 10-month-long study, with my PhD student Sarah Lebovitz, of artificial intelligence adoption in radiology, which is often given as an example of the frontline adopter in the high-expertise space. Broadly, my research is about boundary spanning, technology development, and use. I won't take more of the time, and I am very much looking forward to the other panelists' remarks, as mine will probably be playing devil's advocate. Thanks.
Thanks, Natalia, and we look forward to hearing more about that study. Continuing with the introductions: Vincenzo, you're next.
Hi, everyone. Thanks, Rob. I'm Vincenzo Palermo. I'm the global data science lead for the research team at Accenture. I've been in this world for three years now, and the team focuses not only on implementing AI analysis for clients' problems but also on developing new, thought-provoking research around AI, technology, and innovation. Very happy to be here and discuss this important topic with all of you. Thanks.
Thanks, Vincenzo. It's great to have you, and we're looking forward to that high-level view that you'll be able to bring to the panel. Foster, you're next.
Hi. Happy to be here; thanks for doing this, Rob. I'm a professor here at Stern, and I'm also a distinguished scientist at the real estate unicorn Compass.
I've been studying and doing AI and data science for an awfully long time, and I think the most relevant thing for today is that I've been deploying, and orchestrating the deployment of, AI and data science solutions for 30 years, including dozens of applications across industries: in telecom, in banking, in tech, for the DoD, in advertising, in real estate. I also had a boutique consulting company that helped stakeholders figure out how to get a return on their AI investments. That was acquired by Compass last year, which is one of the reasons I'm working there now. So I'm looking forward to our discussion.
Great, thank you, Foster, and thank you all for being on the panel; really looking forward to hearing your thoughts. The first topic that we're going to start with is: where are we seeing successful implementations? What's going well? Vincenzo, I wanted to start with you, with the high-level view that I'm hoping you can bring to the panel. Given your role at Accenture, if you could touch on that, that'd be great. What's going well?
Yeah, absolutely. In line with this topic, we actually did some of our research right pre-COVID. We interviewed about 1,500 companies globally, and we were able to identify the key success factors, at least within our sample; that is, what's working for this set of companies. The first main driver is culture: culture across the entire organization, driven by a CEO vision that is fully focused on data-driven, AI-driven insights. As you can imagine, the second step, once you have a vision, is how you implement it: how you tangibly bring AI within the organization and to your customers, whether you are in a B2B or B2C industry. That's done by creating dedicated units within companies. These units are not only based on talent (we all know the discussions around talent, around data scientists, data architects, data strategists, consultants, and so on) but are also driven by a clear figure in the organization, often the chief data officer, the chief AI officer, or some senior analytics executive, who is able to be the connection point between the implementation and the strategic vision.
From a more tangible perspective, we didn't find differences in how much money companies are investing in AI pilots. So it's not a matter of throwing more dollars into projects; it's a matter of how the money is being invested. Beyond culture and talent, it's a matter of setting the right expectations. We find that successful companies set guidelines for return on investment, and the time frames they use to evaluate their projects tend to be much longer: usually 12 months, or even 24 months or longer. For companies that are less successful, they tend to be six months or four months, and it's difficult to estimate the real impact of an AI pilot in such a short time frame.
Finally, one critical change is getting over technical debt. Larger corporations especially, as opposed to new startups, face legacy systems. We know about the discussion around breaking down the silos and bringing in all the data across the organization and from external vendors and partners. But even that is still not enough, because now we have an overflow of data. Being able to go through the data and cut through its noise is when we find the true value added of so-called big data to inform the AI project.
To give you some idea: all these points that I gave you in terms of success factors seem normal and kind of conventional, yet we find that only about 15 percent of the companies we surveyed can be classified in this bucket. So about 85 percent of them are still struggling. In fact, among all the C-level executives, 84 percent say, "Yes, we want to implement AI," but 76 percent are still struggling to do that, despite the financial returns they might be able to see.
Wow. Thanks, Vincenzo. That's really interesting, (a) about the success factors and (b) about how few of the companies actually have those in place already. I wanted to turn to Foster, who, as he was introducing himself, mentioned how he's had one foot in academia and one foot in, let's call it, the real world, trying to implement some of this.
I'd love to get your thoughts on what Vincenzo just said, Foster.
Yeah, that actually all seems directly in line with my experience. My experience is that there's a small percentage, maybe 15 percent (I don't know what the percentage is; I trust those numbers), of firms that know what they're doing, and a large percentage of firms that don't. A lot of the failure simply comes from not knowing what you're doing, and I can come back to that later. What I wanted to say is that I think there's an orthogonal dimension to what Vincenzo was talking about, which I've observed repeatedly. When I was thinking about where we see success with AI, data science, whatever we want to call it, it doesn't seem to be based on industry. We can come up with great successes in any industry. We could probably come up with some industry where nothing is automated and there's no data, if we really wanted to find an edge case, but I think that would be the exception that proves the rule, if that even makes sense.
There is another dimension here, though, because there are really three different sorts of AI applications in business and in other organizations, and I'm going to present them in order from easiest to hardest to actually deploy. The first is providing what I'll put in air quotes, "decisions," to massive numbers of consumers, often take it or leave it. I think the poster child of this would be recommendations: we can recommend songs, we can recommend books, we can recommend wine, we can recommend all sorts of useful things, and we do that at scale. And they're really take it or leave it.
Another instance of this would be search. Google Search is a massive AI product. It's amazing. But it actually provides massive numbers of consumers, at scale, with take-it-or-leave-it decisions. So that's the first, and we can go on: find my kid's face in all the pictures, things like this.
The second, which is harder but still, I think, on the easier side, is 100 percent automated decision making: not giving take-it-or-leave-it decisions that a consumer then acts on, but fully automated decision making. I worked full-time for the phone company in the '90s, basically bringing machine learning products to realization, and we did things like dispatching technicians to fix problems in the network, and fraud detection, where a portion of it was just automated, and so on. Here, again, we've been able to do this over and over again, credit scoring, for example, across industries. It's harder because you're actually making decisions, but the reason it's not as hard as the next one is that there aren't pesky humans getting in the way.
The last one, which is the hardest and is actually my favorite, one where I've focused a good bit of my professional career, is helping experts do a better job at something they're already expert at. For instance, the other half of the fraud detection work is helping the fraud analysts do a better job. We worked with banks to help them catch people who are committing crimes. In real estate, we help real estate agents do their jobs better, and the ones we have, at least, are really good at doing their jobs. This is the hardest. We do it successfully, but this is a place where you want to bring the pros in; you don't want to just fumble around and try to do it yourself.
Foster, thanks for laying out that framework. I think it's something that we'll probably come back to a few times throughout the discussion.
What I'd like to do right now is use that last bucket you were talking about to segue and ask Natalia to describe a little bit about her research, because it feels like it's really related to some of what she's been doing.
Exactly. It's like Foster made the perfect link. Can you guys hear me? Oh, good. So, yes, indeed, we were studying how AI was adopted by top experts in a top setting to augment or improve their decision making. Actually, I'm going to go back to the second bucket for a second, because for some of the artificial intelligence applications in diagnostic radiology there is hope for full automation one day. What I wanted to say right off the bat, which might be interesting to the wide audience here, is this: we think, and read in The Economist and other places, that doctors like diagnostic radiologists should feel threatened because your next X-ray or MRI will be read by machines. What we saw in practice was that the doctors were very much interested in and welcoming of the technology and wanted it to help, because their job is very, very hard and they carry extremely high legal and professional responsibility. And ethical responsibility too, to be honest: they care, and they want to do good. So if people think what I'm going to say has to do with resistance or threat, we saw zero evidence of that.
What we did look at was a very prominent hospital, which I cannot name, but which is on the forefront both in terms of doctors' expertise, with top rankings, and in terms of their adoption of AI, seeing themselves as pioneers in the space, with a lot of top management support for trying things and evaluating things. So what I'm going to say next is a little disappointing, not only to us as researchers and to people on this panel but also to the doctors themselves: most tools were disappointing in terms of their results.
One thing that brings us back to the topic of performance claims that we're discussing is that when you read about it in The New Yorker and The Economist, and even in academic reports, you see a lot of really great results. Probably most of you have seen how good image recognition has gotten: it can beat humans now, based on ImageNet tests. If you go into those databases, it says it's better than humans. Well, kind of. In diagnostic radiology and medicine, what we end up with is a question: what does it mean to be better than a human? How do we know what we know? I'm going to use an example that is actually not from medicine: if you think about pictures of animals and you try to classify them, sometimes we misclassify an animal. In the same way, doctors can misclassify a diagnostic case. Yet these tools have been trained on doctors' prior decisions.
So what happened is that our hospital was considering only the tools that did exceptionally well on all sorts of accuracy measures, including the graphic measures that you see in various reports, such as area under the curve, of which Foster is actually one of the pioneers, across AI and everywhere. These measures would be close to one, meaning that the tool has very few false positives and very few false negatives; it's golden. Then they start piloting the tool, and we've seen at least five tools with in-depth pilots. By the way, speaking to Vincenzo's point that so few are doing anything: we think of the world as everybody doing a ton, but the reality is that this forefront hospital was actually piloting five tools. In all of these pilots they were surprised at how far off the tool was from their diagnostic quality, in terms of their own expert opinion. In some cases they were saying, well, maybe it's okay for triage tools, for example; it's okay if it's a bit off as long as it gives us some red flags, kind of like fraud detection in Foster's case. But in other cases we have quotes like "it missed half of the brain"; it's not really anywhere close in performance.
What I want to say is that when you look at this, there is a next decision. Let's say the tool is a bit off: are we actually going to implement it, and how do we know whether we as doctors are right or the tool is right? In some of these cases the financial ROI was quite attractive. So in spite of significant issues with quality, which is how I would talk about it, the knowledge quality, they're kind of proceeding. I don't think they're doing it out of malice or for cost savings; it's because at the end of the day they're confronted with the issue that we don't know who is right. I think this is one of the biggest issues that we see in AI more broadly: when you start automating expert work, you often do not know who is right. So what you do generally is go for augmented models, with both humans and AI involved. But in order for these models to work, you actually need some way of interrogating the tool and establishing gold-standard ground truth, which we find very difficult in medicine. With that said, I just hope we keep our skeptical hats on as we talk about the future of this technology.
Natalia, thank you. Super interesting to hear about. In a minute, Liz, I'm going to turn to you, but before we do that, let me mention to the audience that we will have time at the end for questions and answers, so please feel free to start using the Q&A feature to put in some questions; we certainly will leave time at the end to get to them. Liz, I wanted to turn to you, both to hear a little bit about the tools that your company is producing, which presumably could be helpful to some of the companies that Natalia, for example, has been visiting and studying, and also to get your sense more broadly about this issue.
Sure, yeah, it's a great question. It's one that we in industry have been working on for as long as AI's been an idea: what can it do, and what can't it do? I think we're nearing the end of this phase of wildly experimenting, with computer vision especially, but with most of these technologies: is it suited for hiring? Is it suited for document extraction?
In a lot of cases we're finding that very narrow applications are really good. Document extraction works very well, and this can automate a ton of processes. Similarly, NLP is powering lots of chatbots and call centers now. These very narrow use cases, specific to one company, where your clients are only going to ask five questions and there are only seven answers to them, where the problem is easily quantified: that's where I think AI is in its sweet spot right now, and we see that over and over again.
But one thing that I think people who deploy AI neglect more often than not is that it's not a magic wand. It's not an easy button where you can just put something in place and let it go, because these systems are organic; they evolve. And even if it's not the system that evolves, even if you deploy a version that you've tested and interrogated and you know that it works against your special ground-truth accuracy set (which is nearly impossible to find, but they do exist sometimes), people's behavior can change. COVID is a perfect example of this. Take all of our healthcare models that are trying to predict where early interventions might be needed or beneficial to the population, to help prevent potential suicide, or to catch rare diseases where the diagnosis could be made by machines, the argument goes, in some cases better than by a human. That's up for debate, but this is all happening, and people aren't doing the same things that we were doing before COVID. As practitioners, we don't know when these events change things. All we see is a drop in our KPIs, in our business metrics: customer churn, or profit is gone, or something terrible happens all of a sudden and your system is shut down. The unfortunate thing is that it goes through a process of root cause analysis, and it takes a while to figure out that, ultimately, oh, the model messed up.
So our platform is an observability tool, a lot like things you see for systems management and cybersecurity, that basically shines light into the systems on an ongoing basis. It aims to reduce the friction of transitioning from a lab setting into the real world, which is exactly the critical point where most systems fail. Things rarely behave in the real world the way they did in the lab, simply because we can't predict everything; it's simply impossible for humans to do. You also can't understand the very complex math, the calculus and linear algebra, that the machines are doing, so we've had to invent tools (these are peer-reviewed, academic, mathematical concepts; we're not reinventing the wheel) that reverse engineer the rationale behind why a machine made a particular decision and what the factors leading to it were. We like to say you really can't optimize something that you're not measuring, that you're not looking at on a regular basis, and we have added some degree of automation to that, to bring humans to the table when things change, when something needs immediate attention. This is not a tool that will solve every problem. It's not ethics in a box. But it is a technical approach to a technical challenge that is, I think, a really important step forward in making sure the systems behave the way we want them to.
Thank you, Liz. What I'd like to do is open it up a little bit to whoever on the panel would like to chime in first, and this is building off of what Liz was just describing a moment ago. We have the vision of AI and what it can be doing. We have examples of where it's perhaps not living up to that. Some examples, thank you, Vincenzo, of where things are going great, right, the 15 percent. Plenty of examples where things aren't going great; Natalia provided some of those to us. I'd like you to help us all think through what the possible solutions are, and maybe that's too grand a word. Liz describes some technical tools, which is one approach that could be taken. Vincenzo is starting to hint at other types of, let's say, necessary conditions.
But I'd love to get folks to chime in, and you could visually raise your hand. So I see Natalia, Foster. We'll start with Natalia, and then Foster.
Thank you, Rob. I was thinking, as Liz was discussing what are kind of explanatory tools: everybody believes that it would help if some of these tools were more transparent, and especially if they could explain their logic. And we saw several applications of AI where this would indeed be the case. However, in a setting like medicine, and I'm pretty sure in many other settings, maybe not all but many, there is tremendous time pressure to provide the diagnosis. It was pretty clear to us that if some of these tools were given the additional functionality of explanation, the doctor looking at tool results that disagreed with their initial judgment would indeed be able to figure it out, but they would need a lot more time. That brings in the questions that Vincenzo would care about, which are the business model and ROI. In our case, universally, AI was adding quality, not saving time; if it added anything, it would be quality rather than replacement and time savings. So the whole business model and work processes have to be reworked to compensate, in a sense, for that quality. An example of this that has already happened is that in many places now, in the US for example, for mammography detection there is a mechanism: Medicare and other insurers pay extra if it's read with a computer. But that is just for the radiologist to look at the computer diagnosis. For radiologists to actually make sense of it, which would mean having time to look at explanations, they'd have to pay something like twice as much as they pay now to justify this extra quality. So what I wanted to say, and it's actually a question, maybe to Vincenzo as well, is: where do you see people willing to pay for quality? Because I don't think it's a cost case in this high-expertise space. Somewhere down the line maybe somebody is saving overall healthcare costs, like avoiding procedures, but that's what I was thinking about explanatory AI: the important part is who's paying for the time it takes to read the explanation.
Okay, thanks, Natalia. I did say I was going to go to Foster next, but Foster, if you don't mind, since Natalia specifically mentioned Vincenzo a couple of times, let's go to Vincenzo and then Foster. Vincenzo.
Absolutely. And thanks and italian. And it's it is a, major, so managing the cost is, an aspect on the business, model absolutely. Yeah. As and going back to this point also, explaining. What's happening so what is ai so right now we there is the hype we just use algorithms, they are widely available. The problem is why. And by simply. Adopting, algorithms. That's what actually we see we can see an increase in cost because then right now we are, randomly. Experimenting. And i want to be excited to exaggerate, a little bit we are randomly. Experimenting. Using, a code, or using, ai, simply, because it's event. But without having a clear scope in mind. And you in your research you mentioned obviously. It was performing, very well in in the lab setting, then it was performing, last week but, it was still a way to augment, the doctors. And, if, you think about. Just having, a technology. That is able to, support, our. Skills. Today. It's a way for us to. Physically, ultimately, reduce the cost because now we can. Be augmented, by the eye and we can. Shift our attention, or that, cost-saving. Time setting that we're experiencing. Towards, more productive, functions. So we do see that. That's the key aspect. On the finance, on the financial, on the financial, cost saving on the financial, roi. We do also notice, that, uh, more tangible, metrics, in terms of revenues. For especially for large, corporations. And in our research we do find that. Those companies, that 15. That, small group they actually, have a narrow-eyed. 3x, larger. Than. Everyone, else so. There is some proof that, you you actually can experience, larger financial, returns. But it's all a matter of time you you, can't expect, to simply. Plug and play an algorithm, unless like foster said you're doing something very simple. Something automatic, or what at least mentioned just a black or white decision, that is. 
Easy, to implement, and easy to define but when it becomes a complex, decision, machine learnings. And ai are not. Uh they don't have a conscience, they they don't have common sense they are, very very powerful. In uh pattern. Recognition. Thanks nintenzo. Um. Let's come back to foster. Okay, so um. I want to product, um. Let me let me start with one thing. Right, because i see this missing, in most of the discussions. Of ai hype and failure, in the press. Um, and it just seems to rarely, be, be discussed, and to me it is like the, the most critical, issue here, right. So. When we actually go out and we go and we and we deploy. Ai applications. As with. A lot of kinds of deployment, of products. You have, the engineering. You have the science, and you have the product management. Right. You know, most of the failures that i've read about. There's just no product management, done at all. Right, and so essentially, i mean so so let's just take the example, of, since, since uh uh natalya, called me out yes i have been a champion. Of roc, curves roc analysis, area under the roc, curves and so on i've written, tons of papers, on that, you know on my on the science side. It is a science, evaluation. Measure. Area under the roc curve says, nothing. At all, about how well a product, using that model, is going to perform. I mean unless it is. A 0.5. Which basically, says, the the model the model actually isn't predictive, at all right it says nothing about it, i've worked in applications, where you can have area under the rrc curve of 0.99. Where that wasn't good enough and ones where you have 0.59. And actually it was. Extremely, high return, right. We take the science side, we work with the product. The people, product management, we design, a product, and that has to do has user research. It works on iterating, to try to figure out how to use the models, well right i mean, the. Anyone who is saying look the science results, are good. And my product, failed. 
Don't blame the science results. That's my position, until you have somehow worked through the product side of things. And I agree completely with Liz: in order to do this right, we need to be able to explain the individual decisions that are made by the systems. You cannot build a good product for most applications where you have high-stakes decisions without being able to understand why the model is making the decisions that it's making.
Can I stand by Foster for a second? I self-nominated as a devil's advocate, and I'm going to have to play it to the end. Actually, of course, I'm on the same page with Foster, and he's aware of our work. So I had a couple of questions. Yeah, thank you. In fact, just to give you an example of Foster's point.
The one tool that we saw that was super useful, the one where physicians were saying, "I cannot imagine how I worked without it before," was the tool with the improvement in performance, with an ROC curve of 0.65. Just so you get a sense of it: the radiologists alone, without it, were at 0.6, so that little boost was very important to them. However, and that brings me to a wider point that applies across different domains, this type of usefulness, let's call it, or insight, was achieved because in this practice the product managers, you could imagine, were the doctors themselves, who were pioneers and were co-developing the tool in a sense. They looked at the whole process. To get that sort of quality improvement that matters, that was worth spending time to unpack the AI, they were reconfiguring the whole process, so that another tool, one that wasn't AI, that they had on their desktop allowed them to quickly understand whether the AI result was actually meaningful. This whole configuration is what it took to get the ROI. And I think, from cases I know outside medicine, it takes a sort of high-level view of the whole process, which I know Foster, being in a senior position in the company where he's doing it, has the leverage to take. But I'm going to put out there that by introducing AI for a particular task and not considering the whole process, both in terms of value and in terms of its configuration, it's unlikely that anybody will get anything out of it. So I wonder, Foster, if you would accept a modification: given that we're talking about expertise, it would be the process engineer, almost, not the product manager. The process manager has to be one step up, looking at the overall practice reconfiguration and the ROI on the whole practice.
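The point made above, that area under the ROC curve is a science measure and does not by itself say how a product built on the model will perform, can be illustrated with a small sketch. Everything in it is an assumption for illustration and does not come from the panel: the labels, the two sets of scores, the payoff per correct and incorrect action, and the idea that the product can only act on the two highest-scored cases.

```python
# Illustrative only: labels, scores, payoffs, and the budget k are invented.

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def profit_at_top_k(labels, scores, k, gain_tp, cost_fp):
    """Profit if the product acts only on the k highest-scored cases."""
    ranked = sorted(zip(scores, labels), reverse=True)[:k]
    hits = sum(y for _, y in ranked)
    return gain_tp * hits - cost_fp * (k - hits)


labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Model A ranks well overall but puts a negative case at the very top.
scores_a = [0.8, 0.7, 0.6, 0.9, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
# Model B ranks worse overall but its two top-scored cases are both positive.
scores_b = [0.95, 0.9, 0.01, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(name, round(auc(labels, scores), 3),
          profit_at_top_k(labels, scores, k=2, gain_tp=100, cost_fp=50))
```

Under these made-up numbers, model A has the higher area under the curve (about 0.86 against 0.67), yet model B earns more when only two cases can be acted on (200 against 50). Which model is better depends on how the product uses the scores, which is a product design question and not something the curve alone answers.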
Because for me this is one of the... I mean, in the product management organization, that's a responsibility of product management. I guess the manager has failed if they haven't actually taken that stuff into account.
I guess in organizations like hospitals it would be practice leaders, somebody with a broad sense of what's going on rather than a narrow task focus. Whatever we call them, that's a person or a group of people who can reconfigure the practice to get the real performance, whatever that means: quality, of course, or speed.
Absolutely. But by the way, I want to say that there is one difference with your study, which is bringing in something that, at least purportedly, had already been productized.
Exactly.
They're evaluating a bunch of products. What was the product management? How did they work together to actually make it a product rather than a toy, to be extreme? A lot of these things that I see make me think of: "Oh, we went out and we hired some PhDs, and they had written some papers on this thing that showed good results, and so we had them build the bridge. And then the bridge fell down." And you're, like, surprised? Is that how you would build a bridge? You wouldn't check to make sure you have professional civil engineers on the task, and people who have actually built bridges before that didn't fall down, and so on? That's a lot of what we see in the practice of AI. Again, there's the 20 or whatever of companies that just do it really well, and then everyone else, to me, is still kind of fumbling around. So I don't see it as necessarily that this is the way it has to be; it just is the way it is.
Yeah, so Liz, I see your hand. Yes, thank you, I did want to come to you. Go ahead.
Yeah, no, I think that's a really good point. Both of you have made very good points about it not just being a problem with the science but with the whole system and how the system interacts with the real world. And I think there's one tiny dimension that I might add, with my background coming from startups, which is that there are also the business motivations of the companies that have productized these tools. Those may change, or the salespeople may be selling a model that was designed for one thing to do something else, and I think especially after COVID. For those of you who aren't in the space, this is very dangerous: when a model is designed for a certain purpose and it's repurposed to something else, a different domain, different data, a different use case, that can mean the world in whether
it fails or is very successful. We've seen this a lot recently with computer vision companies pivoting from person identification and security applications to "Does this person have a mask on?" or "Is this person ill?" Can we determine somehow, from the pixels in this photo, whether this person is at risk of having COVID, whether that's a thermal image detecting fever or something like this? But it's a little unclear, especially, whether this generalization from domain one to domain two involves retraining, whether it's a brand-new product, whether it's being well thought out, whether it has a product manager. A lot of startups forgo that, especially early-stage startups. And with so much of this technology being open source, a lot of times you also have business owners who aren't experts in the space and are just making "AI for X" because it's lucrative, because it gets VC, because it looks great in a PowerPoint presentation, and things like that. So I think the whole industry has a lot of maturing to do, both organizationally and operationally, and we just need to continue educating people about what is a good use of AI and how to do it in a smart, responsible way.
Thanks, Liz. I'm so glad you brought up COVID, because that was what I was going to specifically ask you about. It was interesting that you brought it up; Vincenzo also sort of touched on it early on in his comments. We've been getting some questions in, and we'll go to those in just a minute. By the way, other people who are listening in, if you have questions, please feel free to put them in the Q&A. We'll get to those in just one minute. Given that COVID came up, I just wanted to come back to you, Vincenzo, because you did mention COVID very briefly. Is there anything you wanted to add on that?
I'm 100 percent in line with what Liz said, and it's relevant. COVID has been a complete shift in the way businesses are working, the way the supply chain is moving.
The way consumers are changing their behavior, too. That also implies that all the AI processes we had in place up to six months ago need to be updated. They need to adjust now to everything that is changing. To give an example from credit card fraud: a one-way airplane ticket used to be an indicator of fraud. Now everyone can imagine how many people, including myself, bought a one-way ticket a few months ago, and now another one to come back to New York. So that's not an indicator anymore. How are you going to adjust from an organizational perspective? Let's assume your model is excellent, your predictive score is fantastic, it's working, it's in place, and you don't have organizational processes blocking the development. Now you still need to adjust to the change in the market, to the change in the industries, to exogenous shocks. That adds a layer of complexity to this entire discussion that we've been having.
Can I comment quickly on the question? In March, of course, most of us researchers started thinking about COVID-related things, and we started talking to our hospital about, obviously, imaging AI for COVID. Nowadays I start a lot of my classes by asking: do you want your local radiologist in your hospital to be diagnosing COVID, or AI? Most students, which is public perception for me, go for either AI or AI-augmented. But the right answer, and I'm going to put it out here in writing, is that your doctor should be the one diagnosing. Forget the AI,
unless you are in a place where you don't have enough expertise. The reason for this, and I'm talking about radiological diagnosis, is that it is a fairly clear thing for doctors to see. They have vehicles for disseminating knowledge about it through their academies, and in fact they're pretty good at it once they have imaging of the right type, whereas for AI to learn something new from a small data set at the beginning of the pandemic is quite hard, as all of us know. The reason I'm saying it is actually to highlight how much difference there is between the public perception of the issue, which is what we're talking about, and expert perception. Having now attended many medical dialogues about the use of AI for COVID detection in imaging, it's pretty clear that there is no evidence that it's useful, basically, unless you don't have access to professional expertise, where it is better than nothing. With that said, I think we have to be careful in terms of how the public perceives things as well, because having this overhype might hurt us, as all of you have mentioned, and one should be very, very careful.
Thanks. Foster?
Yeah, if I could just quickly say, because I focused so much on the seeming lack of product management, I want to make sure that I don't give short shrift to the engineering and to the science. Doing really good engineering is important. And I think the case where the world changes is one of the reasons why you also want to make sure you have top-notch science talent, because for machine-learned models there are certain changes in the distribution which we can handle just fine.
There are certain changes in the distribution that are challenging but that we know how to deal with, and there are certain changes to the distribution that are going to be big problems. It's the scientists who understand these things. So it's not, "Oh, don't have any PhDs, just have product managers." No, you need to have this well-rounded team that has all the different lines of expertise that you're going to need.
Thanks. Okay, we're going to turn to some questions. We've gotten several questions in, and they're in a bunch of different domains, so I'm going to pick and choose. I'm going to start with this one, from Fabrice: "Can you talk more about explainability in consumer finance? A couple of statutes require that an adverse credit decision be explained to the denied borrower. How is the technology going to meet this requirement?" I'm not sure who wants to take that. I'm going to volunteer Foster for this. Liz, I think you also were talking a little bit about explainability, so maybe we'll come to you.
Yeah, Liz, if you want to go first, that's fine with me. Yeah, so I've done a lot of research on this. We can explore cases like the finance case, and what I'm saying here is cases where the individual inputs to the model are comprehensible themselves. There we know how to understand why the models made the decisions that they made. We can't take the rest of the time just to go through how the things work, but basically you can ask the question: what is it that caused the
system to make the decision that it did? There's a variety of different techniques here. Things get more difficult when the inputs themselves aren't comprehensible. The extreme case is pixels in an image. But people have also worked on ways of understanding that. For instance, take an image that's classified as Foster. Why did it classify as Foster? You can ask what's the minimal amount of change you can make to the image, maybe in blocks or something like that, such that it's no longer classified as Foster. And you find that he's got these eyes that look very special, and this funny thing around his mouth, and that makes him characterized as Foster, or it's just his whole face, or whatever. Here's a rule of thumb: the more difficulty a human would have answering "Why did I make that decision?", the more difficult it's going to be to explain. How do you know that that's Foster and not his brother, who looks very much like him? "I don't know, it's just Foster, not his brother." That's going to be a harder thing for the AI to explain as well. If you ask why somebody would be declined credit, you could say: well, guess what, they don't make enough money given how much they're asking for in their loan. Those kinds of things we can do, and I can point everyone, and Liz maybe can as well, to lots of papers written on it.
Yeah, I'll agree with all of that. I think explainability is a really exciting field. It's not that old as far as fields go, and it's developing every day. This is a very exciting and hot space, and at the conferences, NeurIPS, FAccT, you see all kinds of papers on explainability,
because there's this gold rush right now. People are racing to have the most robust, most explanatory versions of techniques that have existed for a little while. Right now we're basically building a proxy, an approximate model: we create a second model, and that model is easily explained because we have very clear feature weights and importances. You try to focus that on one particular decision to see the reasons for that decision. It's not as robust right now as it could be in terms of really getting at the core of it, but that's going to change, because like I said, the field is just moving very, very fast. As to whether this will work in credit, I think that's more of a regulatory question than a technology question. Yes, we have these techniques, and they can be used, but will the Fed let us? That's a really important note. There's a lot of scrutiny on using AI in credit decisioning right now, especially after the Apple Card Goldman scandal, around notions of bias. Our platform similarly tracks these metrics across protected categories, and you see there's a lot of disparate impact: people who are purple get different kinds of interventions than people who are green, get different loan prices, and things like that. So in the end, yes, I think we are seeing some appetite from companies to use AI to price loans and to do things in banking, and they're in turn putting pressure on the OCC and the Fed and the SEC,
and they're also feeling a lot of pressure from the public to have better laws around these. The most recent guidance, I think, is from 2007, around model governance and model oversight, and things are different now. So I would keep a very close eye on the space around anything in fintech and AI. I think we're going to see some big changes in regulation soon, and they're coming from a coordinated effort, not just from one agency.
Can I add something? I think there's a clarification that's really important, because a lot of the methods for explainability are explaining the output of the model, which is different from explaining why a decision was made. One of the most popular methods for explaining the output of the model for a particular decision is called SHAP. SHAP tells you which features have the highest weight in explaining the output of the model. It turns out that those features may have nothing to do with why the decision was made. You could have a decision where the top SHAP feature is not important at all, and you can have a decision where the only important feature has a SHAP weight of zero. So I think we need to make sure we distinguish these; they are just two different things. I use SHAP all the time in my work, because a lot of times I need to understand why the model did what it did. But we just have to clarify that that's different from asking the question, "Why was the person denied credit?" The top SHAP feature may have nothing to do with why the person was denied credit.
So there's one of the questions that we've gotten that I think is related to this discussion. This is from Richard Berner: "What are the best practices for identifying bias and back testing?"
Yeah. Go ahead.
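The distinction drawn above, between explaining a model's output and explaining why a decision came out the way it did, can be sketched with a toy credit model. This is a minimal sketch under stated assumptions, not the speakers' method and not the SHAP library: the weights, threshold, baseline applicant, and candidate changes are all hypothetical, and the attribution used is the simple "weight times deviation from a baseline" form for a linear score, standing in for attribution methods in general.

```python
# Hypothetical linear credit score; weights, threshold, and data are invented.
WEIGHTS = {"income": 2.0, "years_employed": 1.0, "debt": -3.0}
THRESHOLD = 10.0


def score(applicant):
    return sum(WEIGHTS[f] * applicant[f] for f in WEIGHTS)


def approved(applicant):
    return score(applicant) >= THRESHOLD


def attribution(applicant, baseline):
    """Additive attribution: how far each feature moves the score from baseline."""
    return {f: WEIGHTS[f] * (applicant[f] - baseline[f]) for f in WEIGHTS}


def single_feature_flips(applicant, candidates):
    """Counterfactual check: which single-feature changes reverse the decision?"""
    flips = {}
    for feature, values in candidates.items():
        for value in values:
            changed = dict(applicant, **{feature: value})
            if approved(changed) != approved(applicant):
                flips.setdefault(feature, []).append(value)
    return flips


baseline = {"income": 5.0, "years_employed": 4.0, "debt": 1.0}
applicant = {"income": 5.0, "years_employed": 1.0, "debt": 2.0}
# Changes assumed to be feasible for this applicant (an assumption, not a fact).
candidates = {
    "income": [6.0, 8.0],
    "years_employed": [2.0, 4.0],
    "debt": [1.0, 1.5],
}

print("approved:", approved(applicant))
print("attribution:", attribution(applicant, baseline))
print("flips:", single_feature_flips(applicant, candidates))
```

In this toy case the attribution for income is zero, because the applicant's income equals the baseline, while employment history and debt carry all the weight. Yet among the changes assumed feasible, only a change in income reverses the denial. The result depends heavily on which changes are treated as feasible, which is itself a judgment about the application and not something the model supplies.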
No, please, you go first; I've been talking forever.
Well, one aspect, which was mentioned a little bit before by Foster, is definitely the data you are bringing in. Through the process, eighty percent of your effort is on data, whether it's data collection, cleaning, preparation, or tagging. A moment ago Liz mentioned the Apple Goldman case for the credit card; we can mention other cases about bias, especially around mortgages. An AI, an algorithm, is like a toddler that you are teaching how to make a decision, how to give you an output. If your data, from the beginning, is already teaching the toddler that minorities are less likely to be approved for a credit card or for mortgages, this is what the algorithm is going to learn. So, going back to best practice, start from understanding the data: how are you cutting through the noise, how are you making sure that the data you're using is correct? That's absolutely step one from my perspective.
Thanks, Vincenzo. Liz?
Sure. Most companies are doing extensive outcomes analysis on synthetic data, and different kinds of tools exist to help you interrogate a model with what-if analyses: if I were to change the gender of this inference, what would happen to the prediction? Would it change? There are 50-some-odd different definitions of fairness, so to speak, in AI, so it's very case dependent what the appropriate way of measuring it is for your situation. Critical thinking is a big part of that. Something I don't hear enough in these kinds of analyses is that this is, yes, a science problem, it's a math problem, but it's also a humanities issue. You see this a lot of times in healthcare,
where you have data sets, open-source data sets or things that have been collected, being again repurposed for uses that maybe they weren't intended for, and those models are not as effective.
And so if you don't know, for instance, that the data set by which you're making a forecast about which women over a certain age can't get pregnant is coming from a study in France from the 1800s involving prisoners, well, knowing that is going to make you feel a lot differently about how that information can generalize to the real world. That is a fact that has nothing to do with AI, but it has sort of permeated the collective consciousness; that's why most people think that women can't get pregnant over a certain age. The data itself is really at the core of the issue. It's always a data problem, whether it's just reflecting history or reflecting some actual bias that's encoded into us, into our behavior, like you see nowadays with police data sets. So I wish I had a magic bullet for it, I really do. But at the end of the day, the one most important thing is that you are looking for these things, that you're trying, and that you're not just stopping when you achieve a certain level of what you would call fairness that's appropriate for your business; that you continue that pursuit, to continue to lessen the disparate treatment with a lot of these techniques that are now available and, again, growing very quickly.
Thank you.
Can I make a quick pitch here? Last year, as part of the Fubon Center AI in Business series, Foster did an excellent fireside chat with Solon Barocas from Microsoft Research, who is one of the pioneers in this field. They spent about an hour, right, Foster, going in depth into algorithmic fairness, and I think part of Solon Barocas's research is being very interdisciplinary, kind of almost law and sociology
focused. So I highly recommend that people who are interested go to our Fubon Center website and get to that video. I think you'll get a lot out of it. It's from just this past year.
Thanks, Natalia. We've got about four minutes left, and we still have a few topics and a few questions. I want to pick one that is perhaps selfish on my part; it's something that's near and dear to my heart. It's about education. This is from Isioma Utomi: "What are new opportunities or applications for AI in education, particularly for customized and continuous learning?" Do any of you have any thoughts on this?
I just saw a recent study. I'm sure Foster knows lots of applications, but I thought of one. Obviously, a lot of the AI work in education has been done around MOOCs, on large platforms like Coursera and such, and there is a recent study that I have seen where they can truly use machine learning and other analytical tools to increase engagement. One of the key issues is that people start and drop off, right?
So pinpointing who needs which little push. And then our colleague Alex Tuzhilin, who is one of the prominent recommender systems researchers, has built a platform for a free university that NYU supports that helps catch students through midterm assessment, basically giving them a recommender system as to what they need to study and where they fall behind. Both of these basically take the teacher out of the loop when the teacher can't spend time with millions of students. So I think for large free platforms it's invaluable. But these are just the two things that came to mind; I think others would have more depth than this.
Anybody else have any other thoughts on AI when it comes to education? Vincenzo?
I think, just a last comment building on what Natalia said: the recommendation system is definitely one of them, so, what's the next class you should take? But if you take a step back, you can even imagine, given your current skill set, regardless of what classes or knowledge you have, what's the next skill you should improve to achieve a certain goal. Think of it as talent development: a first layer of AI adoption defines the skills and suggests your next skills, or the right skill combinations, to get to a certain level, and from there comes the educational path. So it might even be a two-layer aspect.
Absolutely. Just to clarify, what Alex has built with his postdoc is not a course recommender. It's actually within the learning platform, so it relates to a point within the platform: let's say you've mastered quadratic equations but you're really failing on graphing. The beauty of it is that it's completely hands off, and it even pulls the material
from a huge database of textbooks and sources, so it's not just from course material. So it is quite out there. I don't want to say it's a little course recommender system; it's a big plug. Okay, just to be clear.
Yeah, I'm not an expert in this area, but I studied it a long time ago, and my opinion is that this is where we could see tremendous impact in the world. We've known how to do intelligent tutoring systems for a long time for things like math, algebra, geometry,
and so on. I think we could also do languages reasonably well. One of the issues is that in developed countries it's not as needed. But extending basic education across developing countries could, I believe, have amazing impact. One of the problems is that it's possibly not a big commercial success, so who's going to fund this?
So if we're talking about delivering it, are we talking about delivering it like this?
It could be delivered like this.
Yeah, because I've got to say, the interface needs huge, huge, huge improvements. That's my own take, dealing right now with two kids, six and ten, at home, who are just literally struggling with this, one because she can't deal with it, the other because there are so many distractions out there.
Immersive reality.
What's that?
No, I'm just kidding around. I said they need immersive reality teaching, not sitting in front of a screen. Okay, yes.
Everybody, it's 1:40, so that concludes our panel. I would love to continue to talk with all of you all day long, because I'm a big fan of Foster, Natalia, Liz, and Vincenzo and all the work that the four of you do, and I always learn a lot when I interact with you. Hopefully the audience did as well; I know I did. Thank you all very much. The audience can't clap, but I sure can.