Artificial Intelligence, Jane Austen, and the Law - Apoorv Agarwal
Good evening, and welcome to the Marian Miner Cook Athenaeum. My name is Isabelle Lilius, and I'm one of the Ath Fellows this year. In the fall of 2015 I took a Jane Austen literature course (this is a complete coincidence), which I still consider to be one of the best classes I've ever taken. Around this time two years ago we were reading Northanger Abbey, and one of my favorite statements about the power of literature is a quote from this book, which reads: "It is only a novel, only some work in which the greatest powers of the mind are displayed, in which the most thorough knowledge of human nature, the happiest delineation of its varieties, the liveliest effusions of wit and humor, are conveyed to the world in the best-chosen language."
Now, while I'm biased toward the novel because I am a literature major, there is an abundance of ways and languages to understand and examine human nature and the way of the world. Yes, maybe the combination of artificial intelligence, Jane Austen, and the law may not roll off the tongue or click in the mind as easily as bread and butter or wine and cheese, but they are all ways in which we perceive our environment, process the structures that govern it, and inform our actions and choices. For who's to say that technology and law don't also display the mind's greatest powers, or the thorough knowledge humanity has to offer in all its complexity and variety?
Our speaker tonight will address the question of whether artificial intelligence can augment our understanding of literature and change the practice of law. An expert in machine learning and natural language processing, Apoorv Agarwal is dedicated to understanding and improving how humans and machines benefit from working together. His work, which focuses on sentiment analysis, relation extraction, text summarization, and automated QA, has received more than a thousand citations across
the international research community. Apoorv Agarwal received the IBM PhD Fellowship award in recognition of his work as first author on two separate patents for IBM's Watson. He received his MS and PhD in computer science from Columbia University and received the Andrew P. Kosoresow Memorial Award for outstanding performance in teaching. He has published more than 30 academic papers in machine learning and natural language processing. In 2014 he was awarded a grant from the National Science Foundation's Innovation Corps program, which allowed him to lay the foundation for founding his company, Text IQ. Beyond academia, Apoorv Agarwal has been cited by American Banker, Wired, Popular Science, and Science magazine. With Text IQ he aspires to harness and channel the complementary strengths of humans and machines towards solving high-stakes enterprise data problems. Among other novels, his research focuses on the work of C. S. Lewis and Jane Austen. He is the co-creator of the TED talk "Artificial Intelligence," a performance explaining artificial intelligence to a non-technical audience through movement.
As always, I must remind you that audio and visual recording are strictly prohibited. Please silence and put away your mobile devices at this time, and please join me now in welcoming Mr. Agarwal to the Athenaeum.
Well, thank you so much for the kind introduction. I'm really very excited to be at this venue. Since morning I've had an opportunity to interact with a lot of students from the five schools, and I've really enjoyed my time here so far. Just to continue from the introduction: I joined Columbia for my PhD in 2009, and we decided to do some research on literature. We used artificial intelligence systems to extract social networks of characters and their interactions from literature. Later, in 2014,
we got a grant from the National Science Foundation to
start this company, where clearly there were a lot of things that we had learned and done research on which we could now take outside of the context of literature and use on a real-world problem. That problem happens to be in the field of law. Of course it was an extremely interesting journey, and I've always wanted to find time to articulate how that happened and when that happened. This talk gave me an opportunity to articulate this journey and story, so thanks a lot for this opportunity. I'm very excited to share the story of how we started some research around literature, which no one could ever imagine being commercialized, and then were able to create an application that is now even helping the government find sensitive information in datasets.
So, there are a couple of goals that I'd like to achieve through this talk. For the first one: I've been hearing this question a lot today, and also from a couple of your alums. Jay is in fact still at CMC now. Jay and Ben work at Text IQ, and they've been curious about whether or not they should learn computer science, and if so, how much and why. So one of the goals I'd love to achieve today is really to encourage people, not so much to do a computer science major, but to learn how these things work. I think we live in a world where the number of devices that are smart or intelligent is only going to increase. If I'm texting my refrigerator asking if there are enough tomatoes or not, I think it's important to understand how this process is working. People who don't understand will live with this perpetual anxiety of "are these machines now going to take over the world or not?" So I think, especially
in this world, it's rather important to understand how these computers function: everything is logical, and it's really very controlled. This bleeds into the second goal that I want to achieve through this talk, which is to remove some of the anxiety around AI taking over the world or bringing the world to war. There are a lot of celebrities in AI who've rather recently put out statements saying that this is inevitable, but I really don't believe it's ever going to happen. So I'd love to present my thought process and an argument for why we are not only far, far away from it, if it were to happen, but why I actually believe a lot of work needs to be done even for us to get to a point where we can say this will or will not happen. So these are a couple of goals I'd love to achieve through this talk.
I thought a lot about how to structure this. There are a lot of things to get through: to tell you about this journey, tell you about AI, something about literature as well as law, and what we do at Text IQ, and at the same time project outwards and maybe say a bit about the future of AI and such. In thinking about how I wanted to structure it, I thought I'd structure it around the purposes of AI: what are the purposes that AI is serving in our society right now?
I think one of the most obvious purposes that it's serving is that of automation. We are taking pretty much all the monotonous and mundane jobs that don't really require creativity and automating them using artificial intelligence algorithms. That's where our company plays a role: we're not replacing humans, but we are augmenting humans and their capabilities in a way where they can achieve
what they want to achieve in a more efficient and effective way. So one of the purposes of AI, which is obvious and clear, is automation, and that will be most of my talk, where I'll give you a sense of what can be automated today and what cannot be,
followed by what we do at Text IQ, followed by the connection to literature and how it all got started.
The second purpose of AI, I feel, is self-realization. "Self-realization" may not even be the right phrase to use here, but let me try to explain what I mean by this. I was working on Watson after it won the Jeopardy! game show, and we were trying to adapt Watson to the medical domain. Sure, it was a great marketing stunt, but now we wanted Watson to do something interesting, something useful. What we were trying to do was give Watson a set of symptoms for a patient and have it come up with a diagnosis. So if the patient has a low-grade fever, a headache, and so on, what does Watson believe the diagnosis to be? That's what we were trying to adapt Watson to, and in doing so we actually ended up hiring real doctors. Our lab was pretty open, and I remember all the researchers would sit around the doctors; we'd bring up a question and then ask the doctors to think out loud about the thought process they were following to come up with the answer.
Very much like that example, the process of building AI systems is actually a very introspective activity. We're perpetually curious, thinking about how the human mind works in solving a problem, and once we understand that, we try to replicate that thought process through a machine learning pipeline or process. So I would argue that one of the purposes AI is serving, even if it's through a reward function, is helping us think harder about how our minds work, how we solve problems, and also where we came from and how the world evolved. So
this would be the second and shorter part of my talk.
Finally, I think one of the other purposes that AI is serving is that of accelerating creativity. Unfortunately, this is a whole beast in itself, and it's also a topic I haven't really thought deeply about, so I won't really go into it. But I'm certainly seeing instances around me where mathematicians and artists use AI to accelerate the creative process: they have AI produce, I don't know, hypotheses or a set of structures, and then the human comes in and chooses what works for them. I won't really get into this, but I think this is yet another purpose that AI is serving. This is a list that I just came up with, so in the Q&A session or later I'd love to get your feedback and hear from you if you think there are other purposes that AI is serving.
Starting with automation, just a few examples. Self-driving cars: I don't know if you've seen them in LA. I've not seen them in New York, thankfully, but they're inevitable; very soon we'll be driven around by self-driving cars. Virtual assistants: I actually have an assistant named Amy who schedules all my meetings. I literally just cc Amy, and the person I'm trying to schedule a meeting with doesn't even know this is a robot. Amy talks in natural language and has access to my calendar. She says, "I'm free at these times; which time works for you?" If the person says, "Hey, this time doesn't work for me," she comes back with other suggestions. One thing I find absolutely amazing about Amy, which I think humans have a hard time doing, is time zone translation. While I
didn't dislike scheduling meetings so much, if I had to schedule a meeting in India or London or other time zones, it was just a nightmare. So I literally just tell Amy, "Hey, I'm in this time zone and the person is in this other time zone, figure it out," and she does it really well. So this is one example of automation.
I don't know if you've come across brain-computer interfaces, but there are certainly interfaces now where we've studied the brain enough to understand that if a person thinks about something, maybe the color red, or wants to make a movement with the hand, we can figure out the patterns of neurons that get fired in the brain, and then we can translate these patterns into an actual physical movement. So now there are devices out there, though they're not totally commercial, I guess. This woman in the chair can't really move her hand, but she's just thinking that she wants to have water, and this robotic arm has the bottle and actually brings the bottle to her for her to have water. So that's yet another, in fact very noble, automation that we can do using AI.
So that's all good, and like I mentioned earlier, we very much live in a world where, and I think it's debatable how quickly, whatever can be automated will be automated. There's absolutely no question about it. The big question is what can be automated, and not in 50 years, not in 20 years, not in 10 years, but what can be automated today.
There was a very good study done by McKinsey. I don't love the title; the title is "Where machines can replace humans." I like to think of it more as machines augmenting humans. But this was a great study that looked at a lot of verticals in the industry, so finance, oil and energy, and so on, and certain kinds of job functions, and tried to analyze them. The example I'll show here is finance and insurance. They found two broad categories of job functions that people do in the finance and insurance industry. One set of job functions
has to do with managing other people, stakeholder interactions, or applying some expertise. The other set of job functions is around data: data collection and data processing. They took all of these job functions and calculated how much time people collectively spend on them. It turns out over 50 percent of the time is actually spent collecting data and processing data. So not management, not stakeholder interactions: a large part of the time in this vertical is spent just collecting and processing data. Then they also presented some statistics around what they thought could be automated in each of these job functions.
There are a couple of upshots here. One upshot is that nowhere in the near future will we have robotic managers; robots won't be managing humans anytime soon. Only about 15% of everything that people in leadership roles, or people doing stakeholder interactions, do can be automated, so that's not a lot. However, over 50% of everything that people doing data collection and processing do can be replaced by machines. So this was an interesting statistic, and if you were to make career choices based on this report, you'd certainly want to manage other humans or do stakeholder interactions, and not do a lot of data collection and processing, because a lot of that will be automated. But the thing to note here is that even that's not 100%. It's not that all of data collection and all of data processing will soon be done by machines. And that brings me to the point that we very much live in a world where it's a great time for humans and machines to collaborate towards
not a common goal, since machines don't really have a goal, but towards our goals.
Going back, I know for a fact that 10 years ago, when I was working on sentiment analysis or some other NLP problems, as a technologist it kind of hurt our pride to know that a human would be in the loop to run the machine. We wanted to build machines that would just replace a function. Now that I have gained more and more experience working on real-world problems, I'm able to appreciate the role of a human in the loop. So I think technologists in general have come to accept this idea of having a human in the loop when they're running their AI systems. At the same time, I think non-technologists, who don't study, let's say, artificial intelligence or computer science, have also come to accept AI. If we're trusting our cars to drive us around, I think we'd better trust our virtual assistants to schedule our meetings. That's less deadly, after
all. Or maybe not. But there's certainly, I think, an acceptance from both the people who build AI systems and the consumers of AI systems to accept one another and work together to achieve the end goal.
I'm going to mention a few frameworks. I don't expect anyone to really recognize these frameworks; it's very much machine learning parlance. But in case you're interested, there are frameworks that we can now use to bridge this gap and have machines and humans work together. One such framework or paradigm is called active learning. Another is called reinforcement learning; it's used a lot in game-theoretic settings, so when Google built AlphaGo, which beat humans at the game of Go, it used a lot of reinforcement learning. And there's also interactive machine learning, where humans and machines can work together to cluster things or do other kinds of things.
I now want to give you a sense of how powerful this is, especially in an economic sense. It actually makes a big, big difference in the real world. One of the things we do at Text IQ, and I'll go more into it later, is identify sensitive information. So I want to give you a sense of how long it takes and what it takes to do the same job of finding sensitive information without machines and with machines. What we've built at Text IQ (HCI stands for human-computer interface) is a system where a non-technical person can interact with the machine, make sure the output looks good, or provide some feedback or training data, and together they can work to find sensitive information in datasets.
These are real statistics on a real matter that we worked on. It was a big healthcare provider, and they
had about two million documents to go through. If it weren't for technology, if they had used their normal process, it would have taken them hundreds of humans and about 20 weeks to go through these two million documents. We've now shown multiple times that this process is actually pretty inaccurate, and they know for themselves that the human mind is fallible and gets tired, and humans do miss a lot of sensitive information. On average we've seen that humans miss about 10% of sensitive information. And the cost to the client is the most remarkable statistic here. People are actually paying to get this done per document: about $2 a document. They're paying humans two dollars to look at every single document. So for this client, with two million documents, running their normal process would have cost them four million dollars in just a matter of twenty weeks. That's a big number, and two million is not that big a number of documents. There are certainly matters where people have many more documents, and the sizes are only increasing.
Fortunately for us, this client didn't really have twenty weeks. They did have the money; they would have spent the four million. But fortunately for us, they just didn't have the time to spend twenty weeks. They had about four weeks, because the court imposes deadlines on when these people have to finish their job. So they were very stuck. We're a small startup; we don't
really have a lot of security certifications. But these guys were stuck; they had no other option but to try a technology. So that's how they shortcut their security clearance process and actually ended up using a technology. After doing that they were very happy, because we were able to get them the results in two weeks, which is about ten times faster. They did their own sampling, looked at our results, and didn't find any sensitive information that we had missed, so they were very comfortable turning over documents to the other side. And we were able to save them about three million dollars in a matter of a month. This client is now very happy with the technology and continues to send us datasets in which it wants us to find sensitive information.
But just to contrast: the second column is just humans and search terms, and the third column is humans with machines. The value that the latter process provides is pretty clear: it has a faster turnaround time, it's more accurate, and it's much cheaper. So that's the impact that this idea of humans and machines working together can have, and is having, on real problems in the real world.
That's a good segue to what we do at Text IQ. I've alluded to it a bit, but I'd love to tell you a few of the things that we do and, more importantly, the kind of machine that we're trying to build. That will lead me to talking about literature. So, all the literature enthusiasts, it's coming, and I will spend some time on it.
So, Text IQ. There are a number of legal and compliance disasters out in the world. Facebook,
I think a few months ago, was fined over a hundred million dollars by the European Union around some data policy and privacy concerns. Earlier this year Aetna had to pay a huge fine of a billion dollars to Humana because their merger didn't go through. There are several examples of these hundreds-of-millions or billion-dollar fines that are really hurting the US economy and the economy at large. The question is: what are companies doing today to
either mitigate the risk of these legal and compliance disasters, or how are they dealing with them?
Believe it or not, the status quo even right now is very much search terms and human bodies. What people do is take a bunch of search terms. Let's say someone's looking for a certain kind of sensitive information (I actually do have an example, so it'll become clear). They'll come up with a list of search terms, very much like the way you use Google search, and whatever comes back, they'll get humans to go through every single document that hits a search term. Just to give you a statistic, JPMorgan alone, a big financial institution in New York, has hired over 8,000 people since 2008 just to look for sensitive information in its datasets. So they've spent a lot of time, a lot of money, and a lot of resources doing this, and even then, we know humans make mistakes. It's not a very effective process.
Now just imagine you happen to be this person who's looking at the emails of some employee who works at JPMorgan. This email is literally pulled out of context. You don't know who the people are, you don't know what the conversation is about, you don't know what the context is. So it's really hard for humans to look at documents that are pulled out of context and make any sense out of them. That's what makes this process very risky; that's why humans are missing a lot of sensitive information that they should be catching.
Before I give you a very specific example of how this technique falls short and where we come in, let me just give a definition. This is one kind of sensitive information that we look for in our clients' datasets. It's attorney-client privilege, and the
definition of this kind of sensitive information, or at least part of the definition, is that these are communications between attorneys and clients regarding legal advice. Let's say you're an engineer at Google and you're working on a patent application. You're communicating constantly with the in-house lawyers, lawyers who work at Google or represent Google, and you're putting down all your ideas about what's novel about this patent, what the innovation is, why it's different. All of this information actually exists in emails between people, and it's highly sensitive. Imagine Apple: Apple's whole business is reliant on this element of surprise, that the world doesn't know what they're coming out with next. It's a very business-centric problem. They cannot put out in the world a document that tells everyone what the product roadmap looks like.
So people actually put a lot of sensitive information in their emails while communicating with attorneys. By law, this set of communications, this set of documents, is protected. If Google gets into litigation, let's say with Oracle, they don't need to turn over privileged documents to Oracle. They will have to turn over everything else that's relevant to the litigation, but they don't need to turn over privileged documents, which usually have a lot of sensitive information. So this is one type of sensitive information that we find for our clients right now.
Just to be able to visualize what the current process looks like: the current process is to come up with a list of search terms. Because these are communications between attorneys and clients, what they would do is get
the names of all the attorneys that work at, let's say, Google, and they'll come up with other generic words like "attorney" and "legal." They'll search for all of these terms in a large set of documents, and whichever document hits any of the search terms, there's a human sitting in front of a computer, in fact hundreds of humans sitting in front of computers, going through every single document and saying whether or not it is privileged, whether or not it is sensitive. So that's the current process.
There are two obvious problems with this process. One problem is that search terms catch too much junk. Let's say there's this email: "Hey Mary, how was your weekend?" Even if Mary is an attorney, this email is not going to be privileged. Then, I don't know if you've seen the kind of disclaimer that people put at the end of their emails: "This email is privileged and confidential," and so on. A lot of people have this footer set by default, and it's automatically added whenever they send an email. So when you search for a word like "privileged," this footer is in pretty much all of a corporate email dataset, and pretty much everything comes back. So search terms are highly over-inclusive: they catch a lot of junk that has nothing to do with sensitive information. This is what makes the process time-consuming and expensive, and this is why, for the client that I alluded to earlier, it would have taken 20 weeks: there are simply too many things to go through.
But the bigger problem is that search terms are also what we call under-inclusive: they miss a lot of sensitive information that we want to catch. Here's an email to Mike. It says: hey, heard back from John, he recommends we multiply the
two primes (I made this up) to generate the key. Now, all the words in this email are very general. You're not going to search for "John" in a corporate email dataset; everything will come back. It doesn't have the other obvious words like "privileged" or "confidential," so a search term will actually never hit on this email. But if John turns out to be John Powell, the attorney, then by definition this email is actually privileged, and Google or whichever company it is can withhold it. They don't need to send it to the other party. So this is where search terms fall short, and this is what makes the process rather risky: people are in fact turning over confidential and privileged information to the other side all the time.
Just to summarize, this is the status quo, and hopefully I've got the point across that it's highly inefficient, it's expensive, and even after throwing whatever money you want at it, it's still very risky.
So, just to give you a sense of what we do at Text IQ, or how we'd be able to catch this document, think about the human thought process of why we would call this document privileged. First, as humans, we'd want to know who Sarah and Mike are at this company. We'd like to know who this John is that's being mentioned in this email. We'd also like to know what the relation is between John and Sarah, and between John and Mike. Once we establish that John in fact turns out to be John Powell, who's an attorney, and that Sarah and Mike are clients of John, that's when we'll be able to conclude that this is in fact a privileged document that we should withhold.
So at Text IQ, one of the things we do is, given these emails, we are able to automatically identify the organizational roles of people
who are emailing one another. We're also able to automatically identify who John is. Think of it this way: if Sarah is emailing Mike and mentioning John on a first-name basis, then intuitively both Sarah and Mike know John, and if they know John, they would have communicated with John at some point. If there are too many Johns in their shared social network, we can use language and context to figure out exactly which John it is. So it goes far beyond search: we need to bring in not just linguistic analysis, but also an analysis of people and their roles to figure out what's going on in this document.
Then, many times people don't even have a complete list of the attorneys that work at their company. Believe it or not, this is true: we've now worked on several matters where people don't have a complete list of known attorneys. So one of the things we do for our clients is actually find,
Go Fish for attorneys, in their own company, and find that hey this, person is actually an attorney who, maybe you know left ten years ago or whatever but was an attorney and. We. Find relationships, between people, and. We can you know tell how these relationships, kind of change over time. So. In in terms of you, know kind of division. The. Sort of machine, learning platform. That we're. Trying to build at, text IQ it, not only understands. Language and you know it can understand, many, kinds of language so our. Machines can work with about 130, different. Languages. But. It also understands, people to some extent and understand. Understands. Roles of people within an organization. It. Can comment about the personalities, of people. You. Can also under it also understand, relationships. Between. People, in. Fact. Yeah. If this video is going live we'll need to edit this part of, the talk for sure but, a lot of our clients are very worried about. One. Of the sensitive information that, they care about are actually, romantic. Relationships, within an organization, believe it or not every. Single large, organization. You, know not. Every single but a lot of them we worked with they're, always these. Embarrassing. Documents. Within, corporate, email datasets that they obviously, never, want the, other party to discover, because, then in litigation or in court they can you know get an upper hand so one of the things we now do is actually, find, go in and find unknown. Romantic. Relationships, between, people. And, and, and. The reason why we're kind of you, know building this is because, now because, I mean. Our world is first. It consists, of people their interactions. And their relationships, and everything. Else you. Know fills in you. Know given a document a standalone, document if, you understand, who's the author who's, the recipient we. Understand, what. Are the interest, of the author what. Are the background, what is the role given. 
All this context, will actually understand. The content, of the document itself, better. So. This is kind of the main thesis that you know we're working with and. The thesis is that before, we go in and you, know understand, a document. We kind of need to understand, people. And their. Interactions, and their, relationships, so that comes first and.
once we understand how people are organized within an organization, and how the organization functions... In fact, I would argue that CEOs of large organizations have no idea how the company is working: what the departments are doing, who's doing what, how everything is organized, how they're generating revenue. As soon as companies grow beyond even 50 people, it's very hard for one person or a set of people to really understand how the organization is functioning. But given the communications of people, we're building machines that can understand people, their relationships, and their roles, and start to understand how the company is actually functioning. Once we understand that, zooming in on all kinds of sensitive information becomes much easier. So what we're trying to do is build a platform that understands how the organization functions. There are many different kinds of sensitive information that people care about, and we're building applications per type of sensitive information to help our clients find those in their corporate data sets.

The motivation, believe it or not, germinated from this novel, that fat novel that was with me for a year while I was trying to get through it. I didn't understand a word. There are just so many characters, such a large community, so many complex social interactions and relationships, that it was really hard to follow. In fact, the only thing I think I remember from War and Peace is why Napoleon got the number triple six, and how we can do the triple-six calculation given someone's name. So it was very frustrating to get through War and Peace. When my advisor got this grant, when I started my PhD, I was like: hey, if you want to build a system that understands characters and their interactions, that can figure out communities and social relationships, I'd really want to run it on War and Peace. Hopefully I'll understand it better after the machine summarizes War and Peace for me around characters and their interactions. That's where the motivation to even start working on something that understands people, their relationships, and their interactions germinated from.

So this is the segue to literature, the research that we did around literature. Just to rewind the clock and give you some background: we incorporated our company, Text IQ, in 2014, but a lot of the research that we do or use we started doing in 2009. We got this grant from the National Science Foundation, which was about studying how people create social and organizational relationships. But the grant was in the context of Enron emails. I don't know how many of us know about Enron, but this is a company that committed a big accounting fraud and was shut down in the early 2000s. After it went bankrupt, some part of its emails was released to the public, to researchers, to see if researchers could find out how this company functioned, or whether there are ways we can develop to detect these kinds of frauds or events automatically, so that California doesn't lose electricity.

So we had to do this research in the context of Enron emails. However, there were two problems that we realized. One was that we didn't have all the communications of Enron employees; we only had the communications of a few employees, so the data was inherently incomplete. The second was that we really didn't have any ground truth. None of us worked at Enron. We had no idea what the relationships between people were, and no idea how the organization functioned. So if we built a machine that made some predictions, and let's say a prediction was bad, we wouldn't know if it was bad because of our methodology or because of the data set we were using. The data set was incomplete, and there was no ground truth we could compare our predictions with.

That's why we decided: let's build this machine using literature, and once we build something that is in fact useful and provides value, then we can run it on a corporate email data set.

An author, of course, in a work of fiction introduces characters, builds the characters' personalities, introduces other characters, and through interaction builds relationships. All of these things evolve over time, but it's a confined world and it's a complete world. That alleviates the first problem: it's not incomplete data, it's complete data. If our methodology is right, then our machines should be able to comment accurately on the roles of characters in the novel. The other thing is, all of us have read Alice in Wonderland. We know what the Queen's role was; we know what the Rabbit's role was. That's just general knowledge. So if our machine predicted that the Rabbit is the protagonist or the main character, clearly that's the wrong prediction, and we'd have to improve our methodology; we can't blame the bad prediction on the data anymore. So that was the charm and the advantage of working with literary text: even though these are artificial worlds, these worlds are still complete, and we know who the characters are, what the communities are, and what the relationships are, so we can compare our predictions with what we know.

With that motivation, we set out to try to summarize novels around characters and their interactions. But we spent a lot of time just thinking about interactions. How do we define interactions? What kinds of interaction do we want to differentiate between: two people talking to each other, versus two people having dinner, versus two people running together? All kinds of categorical or ontological questions came up, and we ended up spending a lot of time reading psycholinguistic literature and the like to come up with a categorization for interactions.

It was an ontology that we defined, but at the very high level there were two types of interactions that we came up with. These are rather obvious. One kind of interaction is bidirectional. In this picture you see two people; the clouds are their cognitive states. One is person A and one is person B. Person A has person B in their cognitive state, and person B has person A in their cognitive state. Right now we're having a conversation: you're aware of me and I'm aware of you. So this is a bidirectional interaction.

The other kind of interaction is one-directional. Let's say I start talking about Obama right now. There's evidence that I have Obama in my cognitive state, but as much as I'd like it, there is no evidence that Obama has me in his cognitive state. So that's a one-directional relationship.

Through a study on Alice in Wonderland, we were able to show that differentiating between these two types of interaction tells us a lot about the roles of characters. In fact, it also tells us about the status or influence of characters in a story.

Just to give you an example: we found that the Queen was a person in power; clearly, she was responsible for Alice's execution. We found that throughout the novel a lot of people talked about the Queen; they referred to the Queen in all kinds of different plots and events. But the Queen interacted with a small set of people, at the end, and in fact the set of people that talked about the Queen was quite different from the set of people that the Queen interacted with. So this is a clear differentiation: we used both unidirectional relationships, people talking about the Queen, and mutual interactions, the people the Queen is interacting with. Because we differentiated between these two kinds of interactions, we were able to notice this.
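
The contrast between "who talks about a character" and "who the character interacts with" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system described in the talk: the `(kind, a, b)` event format, the function names `interaction_profile`, `jaccard`, and `summarize`, the hand-written `toy_events`, and the choice of Jaccard overlap as the comparison are all hypothetical.

```python
from collections import defaultdict


def interaction_profile(events):
    """Collect, per person, who talks ABOUT them and who they interact WITH.

    Each event is a (kind, a, b) tuple (a hypothetical format):
      ("one_way", a, b): a has b in mind, with no evidence of the reverse.
      ("mutual", a, b):  a and b are aware of each other.
    """
    talked_about_by = defaultdict(set)
    interacts_with = defaultdict(set)
    for kind, a, b in events:
        if kind == "one_way":
            talked_about_by[b].add(a)
        elif kind == "mutual":
            interacts_with[a].add(b)
            interacts_with[b].add(a)
        else:
            raise ValueError(f"unknown interaction kind: {kind!r}")
    return talked_about_by, interacts_with


def jaccard(x, y):
    """Overlap of two sets: 1.0 means identical, 0.0 means disjoint or both empty."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def summarize(events):
    """Summarize everyone who is mentioned by someone or takes part in a mutual event."""
    talked, inter = interaction_profile(events)
    summary = {}
    for name in set(talked) | set(inter):
        mentioners = talked.get(name, set())
        partners = inter.get(name, set())
        summary[name] = {
            "mentioned_by": len(mentioners),
            "interacts_with": len(partners),
            "overlap": jaccard(mentioners, partners),
        }
    return summary


# Hand-written toy events for illustration only; not extracted from the novel.
toy_events = [
    ("one_way", "Alice", "Queen"),
    ("one_way", "Rabbit", "Queen"),
    ("one_way", "Duchess", "Queen"),
    ("one_way", "Hatter", "Queen"),
    ("mutual", "Queen", "King"),
    ("mutual", "Queen", "Alice"),
    ("mutual", "Alice", "Rabbit"),
]

if __name__ == "__main__":
    for name, stats in sorted(summarize(toy_events).items()):
        print(name, stats)
```

In this toy data the Queen is mentioned by four people but interacts with only two, and the two sets barely overlap, which is the pattern described above. How many mentions or how little overlap should count as "influential" is left open here; any threshold would be a further assumption.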
And you can imagine, for example, a celebrity: everyone talks about the celebrity, but the celebrity would interact with very few people, and this set of people would be quite different from the set of people that talk about them. So in corporate email data sets, if you want to find people in a position of power, or who is influential in a given team or business unit, this observation helps us identify people whose title may be misleading. The title could be "engineer", but it turns out that this engineer is very influential in the team: literally everyone is talking about them, but they have a very small and different set of people they're interacting with. So differentiating between these two types of interaction allowed us to find things that weren't even written anywhere. It was nowhere written that this was in fact the case, but we were able to infer knowledge that wasn't memorialized in writing anywhere, which is very powerful.

We were also able to find (and this is a matter of fact, but we get excited as computer scientists that we were able to find it automatically) that the story was being told from the perspective of Alice, and that she was in fact the protagonist. Again, we had to make use of both the unidirectional and bidirectional relationships to infer this.

We were also able to find the roles of characters in a story. We found that the role of the Rabbit was pretty much to move the story along. The Rabbit was never in a position of power or influence. In fact, if you study Propp's character theory, or other character theories that say what kinds of roles characters need to play to build a story, like protagonist and antagonist, one of the characters that people often use is (I don't know if that's the right technical term) a flat character, whose only role in the story is to move the plot along, who really doesn't contribute much else but allows the author to move the plot along.

Depending on time, I'd love to tell you a funny story, but... how much more time? Oh, really. All right.

So we built a system that was able to extract a social network. This is not the prettiest picture, where we took Jane Austen and pushed it through our system. The important characters are in big circles; this is the influence the characters have. So Emma really was the central character. Knightley, Harriet, and some of the other characters were also prominent. This is a static view, but in a dynamic view you could look at exactly what the communications were about, you could summarize, and you could find the kinds of relationships different characters had.

There was an application of all of this to literary theories. Unfortunately I don't have the time to get into this, so I'm going to skip over the literary theories that we validated. Here was the main conclusion. One of the theories said that as settings go from rural to urban, the social networks change. In a rural setting you would expect that everyone knows everyone else and it's a complete social network; in an urban setting there are different kinds of social structures. But for the author to make the novel digestible for readers, they take out all the urban noise from the novel. Once they take out all the urban noise, the novel, in terms of its social network, starts to look very much like a novel in a rural setting. So that was one of the main conclusions.

What else can we do? We can automate the Bechdel test. This is a test we automated on film screenplays; we never got around to doing it on literature. But now that we have machines that can read through thousands of novels, we can see whether there are any patterns by author, time, or genre, where more novels pass the Bechdel test in one group than in another. We can also study Propp's character theory and validate for ourselves: is this a complete set of characters that goes into building a story, or are there more, or rather fewer?

So does this augment our understanding of literature? For sure, if I were in front of a crowd of computer science majors who know nothing about literature, or as much as I do, my answer would have been yes, boldly yes. But you are obviously experts in literature, so I'm a bit hesitant to say yes. So I don't know what my answer is. Well, what do you think?

Yeah. And the next part of the talk, which I can zip through in five minutes, or not. Okay, okay, I'll zip through it. It may not make any sense; it may not have made any sense even if I had the time. But let's see.

The second purpose of AI, I think, is self-realization. And I believe none of this is going to happen: Elon Musk saying AI could lead to a third world war, Stephen Hawking on AI taking over mankind. Sam Harris argues that superintelligence is inevitable, that it will just happen. I really don't think it will happen.

Think of these examples of automation or AI that we are building. We are building AI that can drive cars, and we will continue to improve this AI. But no matter how much we improve this AI that drives cars, it's not the case that suddenly this AI will become self-conscious, or will start acknowledging other machines as a species, and start to realize that "hey, we're slaves to humans" and retaliate or whatever. So the AI will continue to improve, but for the kind of AI we're talking about, self-consciousness is not even on the path of that improvement. No matter how much it improves, it's never going to be self-conscious, at least not the kind of AI we're pursuing right now.

And this is just a belief; I'm literally sharing my beliefs with you. I don't have any citations or empirical evidence to validate it.
But I think if we were to build machines that are self-conscious and aware of one another, in some sense we need to simulate evolution. To simulate evolution, of course, we'll need very powerful hardware, because evolution is an exponential process, a very time-consuming process. So we need to come up with hardware that can do an exponential number of calculations in linear time, because we don't want to simulate an evolution that takes as long as our own evolution took. We want to simulate something that finishes in a month or in a year. So we absolutely need very good hardware to be able to simulate this evolution. Then we'll need to come up with some initial conditions for how we kick-start this evolution, and, I don't know if you want to call them laws of nature, some basic laws that would govern this evolution.

So I think the way to get to self-conscious machines, or machines that recognize one another, is to simulate evolution. These are the things we need; I'm sure we'll need more, but at the least we need all of these to simulate evolution.

First of all, we are very, very far from it right now. Quantum computing can hopefully get us there, but there are a lot of problems out there that are very hard, that we simply cannot solve, that require exponential calculation. So hopefully at some point we'll get to hardware where we can do exponential calculations in linear time or constant time.

But this is a very confined environment. It's confined by the hardware, because the simulation is running on hardware, and unless the hardware is ether, which I can't imagine happening, it will be confined just by definition. Also, I think we will be absolutely in charge of what laws we use to have this evolution go on. It would be sort of bubble-wrapped; it would be confined, and we would be more or less in control of how this world evolves.

So, to recap: I argued that we need to simulate evolution to build machines that are self-conscious. The environment that we simulate in is a very controlled environment, and it's hard to think how these self-conscious machines that we build are going to break out of the simulation, or what's going to happen for them to come into our world and retaliate or take over our world.

And I think the reason why we would do this simulation is not because we want to build self-conscious machines. I think the motivation will come from this perpetual quest for understanding how we came into being, or what the laws of nature are that govern our world. If we ever wanted to discover the laws that govern our world, the best way of validating what that set of laws is, and what the set of initial conditions is that led to the world we live in, is to simulate it and see if in fact that happens.

It's of course not going to be one world that we'll simulate. We'll end up simulating an exponential number of worlds with different sets of laws, and see which of these worlds evolve to start looking like the world we live in. Let's say in one of the simulations, simulation number 729, we find that, oh, there are dinosaurs in this world that we created; it's starting to look like our own world.

There are very hard questions that we still need to answer. How will we actually detect whether a simulation is leading to our world? Because there is this process of mutation, and it is unclear whether the sequence in which organisms appear in this world is going to be the same, whether they're going to look like humans, or whatever. It's very unclear. And can these simulations lead to a species that is smarter than us? That's also unclear. I don't see why the species that we simulate is going to be smarter than us, but maybe.

And if we as a species have this urge to discover where we came from, and the way we discover this is through simulation, there is no reason not to believe that the simulation that we run will go through the same process, and they'd want to discover themselves. So they'll make enough progress to get to a point where they simulate their own evolution. And this, as you can see, very quickly becomes a recursive process. It's like a fractal. Now it's unclear when this recursion is actually going to stop, and even if it does stop and
we are able to discover the laws of nature that govern our world, it's unclear what happens when people, or species, are actually able to break out of these simulations.

So there are a lot of questions, I think, and a lot of progress that needs to be made, for us to go from our world to self-conscious machines. And for this reason I really don't believe AI is bringing the world to war or taking over the world anytime soon.

So I managed to run over time, but hopefully I've convinced you that computer science is cool. It does give us an opportunity to think about a lot of things other than just programming. It does allow us to connect with different kinds of fields and specializations. So I would highly encourage all of you to get at least a basic understanding of computers. And hopefully I have, at least to some extent, helped remove the anxiety around AI taking over the world. Thank you.

We now have time for a few questions. Please raise your hand if you have a question, and Isabel or I will come to you with the microphone.

Thank you so much for coming. I wanted to ask your opinion on gendering personal assistants. Alexa, Siri, and Amy are all gendered female, presented consciously to users as being female. I'd like to ask your opinion on, first of all, whether this was necessary, perhaps, and your opinion on the implications that might have, or changes in the area.

Sure. Yeah, that's a great question. In fact, the company that built Amy, x.ai, also has a male version. My advisor uses Siri, but I think he has the male Siri. So for all of these agents, maybe it was an afterthought, but people are trying to also produce male versions of all of these assistants and let humans decide which one to use. Now, unfortunately, when someone did a study of how many people use a female assistant versus a male one, it did turn out that, even given the choice, most people were using female assistants. So I think that just goes back to this unfortunate gender bias in society, in the community, where I guess we associate some kinds of roles with gender. I wouldn't know how else to answer this question.

Thank you so much for your talk. I was wondering if you've heard of Ray Kurzweil and his super optimistic predictions for AI, and if so, what you think of them.

I've been living under a rock; running a company is very hard. So unfortunately I haven't. But if there's time, maybe we can take this offline.

Thanks for coming. I wanted to ask: we've seen a lot of high-profile individuals talk about how AI is going to take over the world. I found your argument to be very convincing, so my anxiety is suppressed; thank you very much for that. Where do you think the main divergence is between you and some of the other high-profile individuals in the AI space on that argument?

That's an excellent question. I think one of the major assumptions they're making is this: AI has improved so much in 10 years, so imagine how much it's going to improve in the next 10 years. It will go from being intelligent to being superintelligent, and once it's superintelligent, look, it can do things better than humans. Which, by the way, it's doing right now, but these things are very specific. Machines can only do very specific things better than humans. No matter how much progress they make in driving cars better, they're not going to become self-conscious. So I think for superintelligent machines, or for machines to really generate or create their own things, we need to simulate evolution. No matter how much progress we make in the kind of AI we're seeing right now, I don't think
that kind of superintelligence can be achieved. So I would say that's the point of divergence. I think the divergence is coming from the set of assumptions that we believe in or are working with.

Hello. I just had a question about computer programmers: how their opinions and perspectives are incorporated into the algorithms, and whether, from a computer science perspective, those have checks and balances, especially when making these future predictions. And just your opinions and thoughts on that: whether there are regulations in place, and if so, what those look like; if not, how they should look.

Sure. So if I followed your question correctly, there is this need, in fact an urgent need, to regulate some part of AI, especially the part of AI that is now driving robots or driving cars, which is actually a matter of life and death. So the government has a lot of work to do, because the technology is ready to go but the regulations and laws are not really there yet. There is in fact a lot of work, but interestingly this work is not coming out of the computer science world; it's coming out of policy. There are a lot of policymakers in DC right now who are working night and day to come up with bills to regulate AI, especially the autonomous-vehicle kind of AI. Most of the boundaries and checks computer scientists would put in are just to make sure the system doesn't crash, so they're not very interesting, I think, to a larger audience.

Hi. I was interested in asking whether, for finding human relationships, you would give the machine a set of different novels or stories and tell it, "this name refers to the protagonist of each story," and then have the computer find the common ground between those different protagonists. Or was it the other way around, where you worked with one story and edited the parameters until it finally returned the right person?

Right. So we knew the ground truth; we knew who the protagonists are in different stories. We would run the novel through our machine and it would make a prediction, and then we would validate: out of, let's say, I don't know, six hundred novels, for how many did it get the protagonist right? So we would just use that to compare the results. And then we of course have a training data set, which we give to the machine to learn from. When it's training itself, it looks at a complete set of novels. It would still of course read one novel after the other, but it would summarize and learn from the complete data set to figure out how to predict who the protagonist is.

Hi, thank you so much for your talk. I noticed in the map of relationships for Emma that there are a few people on the side, like Shakespeare, who weren't characters; they were just mentioned. I was wondering how good your program is at discovering who is actually a character and who is not, or if it's good at catching that.

Mm-hmm. Yeah, that's a great question, and we didn't do this. Obviously our machines are not perfect; they make mistakes. Maybe Shakespeare was just mentioned, but because it was the name of a person that the machine detected, it thought it was a character in the story, and that's why it appears on the map. I'm sure we can figure out ways of eliminating these characters, because they are not really going to interact with other characters in the story. So characters that are just being mentioned or talked about, but don't really play a role in the story, maybe we can label as external characters. But that's something we didn't do in our research.

Hi, thank you so much for your talk. I was wondering how you arrived at the conclusion that to make self-realized machines we need to simulate evolution. I was a little confused by that.

Again, this is very much a work in progress, and I didn't get around to talking about this, but of all the things I mentioned, one implication is that there is no reason for us to believe that we are not in a simulation. And we're of course self-conscious, and that's the reason why I believe that raising consciousness is a very complex operation, and I don't know how we can just create it without following a process of evolution from cells that are very simple, cells that are not aware of one another. But then, through
Kind, of a process of evolution more, c