You and AI – the history, capabilities and frontiers of AI
Well, hello everybody, and welcome to the Royal Society. My name's Andy Hopper. I'm Treasurer, but, perhaps of relevance to this, I'm Professor of Computer Technology at Cambridge University as well. This society has been around for a little while, and over those centuries, 350-something years, it has played its part, its Fellows have played their part, in some of the most important discoveries, and actually in practical use over the years as well.

In April 2017, about a year ago, we launched a report on machine learning. Actually, it's one of a series of reports in what might be called the digital area: on cybersecurity, on machine learning, on teaching computer science, that sort of thing. Our report on machine learning called for action in a number of areas over the next thirty years. We used the phrase "careful stewardship" in relation to machine learning, data and that sort of thing, to ensure that the benefits of this technology are felt right across society, and that we encourage, facilitate and participate in a public debate on this broad topic and discuss how the benefits are distributed, as well as trying to think ahead about some of the perhaps major issues and other things.

This series of events and lectures, which is supported by DeepMind, we hope will help develop a public conversation about machine learning, AI and so on, and provide a forum for a debate about the way these technologies may, actually will, and already do affect the lives of everybody on the planet. So it's great to see you here at what is our first event.

We have Demis Hassabis, a superstar, to give our first lecture in this series, and I'm very pleased to say he was an undergraduate in my department, so, you know, boy done good. We like that sort of thing in my department, and everywhere else I guess as well, so that's good. But then he went on to a PhD in neuroscience at UCL. An
interesting thing, which you'll see actually comes together in his work. And then he did a couple of things, but co-founded DeepMind in 2010. He's very distinguished. He's received a Fellowship of the Royal Academy of Engineering, also the Silver Medal, and is a Fellow of the Royal Society of Arts as well. In 2014 DeepMind was acquired by Google and has grown enormously. It retains the name DeepMind, which I think is very interesting and positive, but it also has activities in Edmonton and Montreal, and an applied team in Mountain View. So Demis has done well, to the point that, for example, on the one hand Time listed him as one of the 100 most influential people in the world, but also he was awarded a CBE for services to science and technology. So welcome, Demis, and we look forward to all our minds being improved on your favorite topic. Thank you very much.

Well, thank you, Andy, for that very kind introduction, and thank you all for taking the time to come this evening. It's great to see you all here. We're very proud at DeepMind to be supporting this very important lecture series here at the Royal Society. We think that, given the potential of AI to both transform and disrupt our lives, it's incredibly important that there's a public debate
and public engagement between the researchers at the forefront of this field and the broader public. We think that's very critical going forwards, as more and more of this technology comes to affect our everyday lives. So what we're hoping here, and the idea behind this Royal Society lecture series, is to open up a forum to facilitate an open and robust conversation about the potential and the possible pitfalls inherent in the advancing of AI. I look forward to answering all your questions at the end of the talk.

Today I'm going to talk about AI, but specifically focused on how AI could be applied to scientific discovery itself. I thought this particularly appropriate given this is a lecture at the Royal Society, but it's also the thing that I'm most passionate about. The reason I've spent my whole life and my whole career trying to advance the state of AI is that I believe that if we build AI in the right way and deploy it in the right way, it can actually help advance the state of science itself. I'll come back to that theme throughout this talk.

To begin with, there's no exact definition of what AI is, but a loose heuristic that's worth keeping in mind is that AI is the science of making machines smart. That's what we're trying to do when we embark on this endeavor of building AI. DeepMind itself, my company, we founded in London in 2010. We became part of Google in 2014, but we still run independently, right here in King's Cross, just up the road. The way to think about DeepMind and the vision behind it was to try and bring together some of the world's top researchers
in all the different sub-disciplines that were relevant to AI, from neuroscience to machine learning to mathematics, and bring them together with some of the world's top engineers and a lot of compute power, to see how far we could push the frontiers of AI and how quickly we could make progress. You can think of it as an Apollo program effort for AI, and nothing that existed until the point we founded DeepMind was really set
up to do this in this way.

Another explicit thing behind the vision that we had for DeepMind was to try and come up with a new way to organize science. What I mean by that, and this would be a whole lecture in itself, is: could we fuse together the best from the top academic labs, the blue-sky thinking and the rigorous science that goes on in those places, with a lot of the philosophy behind the best startups or best technology businesses, in terms of the amount of energy, focus and pace that they bring to bear on their missions? Would it be possible to fuse together the best from both of these two worlds? That's the way you can think about the culture at DeepMind: it's a hybrid culture that mixes the best from both of those two fields.

Now, what is our mission at DeepMind? We articulate it as a two-step process. It's slightly tongue-in-cheek, but we take it very seriously. Step one: fundamentally solve intelligence. Then we feel that if we were to do that, step two would naturally follow: use it to solve everything else. What we mean by "solve intelligence", just to unpack that slightly, is to fundamentally understand this phenomenon of intelligence: what it is, what kind of process it is. Then, if we can understand that, can we recreate the important aspects of it artificially and make it universally abundant and available? That's what we mean by the "solve intelligence" first part of this mission. And I think if we are to do that in a general way, both
DeepMind and the research community at large, then I think step two will naturally follow, in that we can bring this technology to bear on all sorts of problems that for the moment seem quite intractable to us: things perhaps as far afield as climate science, all the way to curing diseases like cancer, that we don't know how to do yet. I think AI could have an important role to play as a very powerful tool in all of these different scientific and medical endeavors.

So that's the high-level mission, and that's our guiding star at DeepMind. But how do we plan to go about this more pragmatically? What we talk about is trying to build the world's first general-purpose learning machine, and the key words here are obviously "learning" and "general". All the algorithms that we work on at DeepMind are learning algorithms, and what we mean by that is that they learn how to master certain tasks automatically from raw experience or raw input. They find out for themselves the solution to tasks; they're not pre-programmed with that solution directly by the programmers or the designers. Instead, we create a system that can learn, then it experiences things and data, and then it figures out for itself how to solve the problem.

The second word, "general": this is the notion of generality, the idea that the same system, or the same single set of algorithms, can operate out of the box across a wide range of tasks, potentially tasks that it's never seen before.

Now, of course, we have an example of a very powerful general-purpose learning algorithm, and it's our brains, the human mind. Our brains are capable of doing both of those things, and are an exquisite example of this being possible. Up till now our algorithms have not been able to do this, so the best computer science has to offer has fallen, and
it still is, way short of what the mind can do.

Now, internally at DeepMind we call this type of AI artificial general intelligence, or AGI, to distinguish it from the traditional sort of AI. AI as a field has been going for 60 or 70 years now, since the time of Alan Turing, and a lot of traditional AI is handcrafted. So
this is specifically researchers and programmers coming up with solutions to problems and then directly codifying those solutions in terms of programs. Then the program itself, the machine, just dumbly executes the program, the solution. It doesn't adapt, it doesn't learn. The problem with those kinds of systems is that they're very inflexible and very brittle. If something unexpected happens that the programmers didn't cater for beforehand, the system doesn't know what to do, so it usually just fails catastrophically. This will be obvious to you if you've ever interacted with assistants on your phone. Often they'll be fine if you stick to the kind of script that they already understand, but once you start conversing with them freely, you very quickly realize there isn't any real intelligence behind these systems. They're just template-based question-answering systems. By contrast, the hallmarks of AGI systems would be that they're flexible, adaptive and robust, and what gives them these kinds of properties are these general learning capabilities.

In terms of rule-based or traditional AI, probably still the most famous example of that kind of system was Deep Blue, IBM's amazing chess computer that was able to beat the world chess champion at the time, Garry Kasparov, in the late 90s. These kinds of systems are called expert systems, and they're pre-programmed with all sorts of rules and heuristics to allow them to be experts in the particular type of task that they were built for. In this case Deep Blue was built to play chess. Now, the problem with these systems, as you can quickly see, is that Deep Blue was not able to do anything other than play chess. In fact, it couldn't even play a strictly simpler game like noughts and crosses. It
would have to be reprogrammed again from scratch.

I remember I was doing my undergrad at Cambridge when this match happened, and I remember coming away from it more impressed with Garry Kasparov's mind than I was with Deep Blue. Of course Deep Blue is an incredible technical
achievement and a big landmark in AI research, but Garry was able to more or less compete equally with this brute of a machine. And of course Garry Kasparov can do all sorts of other things with this single human mind: engage in politics, speak three languages and write books, all of these other things that Deep Blue had no idea how to do. So to me it felt like there was something critical missing, if this was intelligence or AI, from the Deep Blue system. I think what was missing was this notion of adaptability and learning, so learning to cope with new tasks or new problems, and this idea of generality, being able to operate across a wide range of very differing tasks.

The way we think about AI at DeepMind is in the framework of what's called reinforcement learning, and I'll just quickly explain to you what reinforcement learning is with the aid of this simple diagram. Think of the AI system, and we call AI systems "agents" internally at DeepMind, here on the left, depicted by this little character. This agent finds itself in some kind of environment, and it's trying to achieve a goal in that environment, a goal specified by the designers. If the environment was the real world, then you can think of the agent as a robot, a robot situated in a real-world environment. Alternatively, the environment could be a virtual world like a game environment, which is what we mostly use at DeepMind, and in that case you can think of the agent as a virtual robot, a kind of avatar or game character.

In either case the agent only interacts with the environment in two ways. Firstly, it gets observations about the environment through its perceptual inputs. We normally use vision, but we are starting to experiment with other modalities like sound and touch. But
for now almost everything we use is vision input, so pixel input in the case of a simulation. The first job of the agent is to build a model of the environment out there, a statistical model of the environment, as accurately as it can, based on these noisy, incomplete observations it's getting about the world. It never has full information about how this environment works; it can only surmise and approximate it through the observations and the experience that it gets.

The second job of the agent, once it has that model of the environment and is trying to make plans about how to reach its goal, is this: it has a certain amount of time to pick the action it should take next from the set of actions available to it at that moment. It can do a lot of planning and thinking: if I do action A, how will the world look, how will the environment change? If I do action B, how will it change? Which one of those actions will get me nearer to my goal? Once it's run out of thinking time, it has to output the best action it's found so far. That action gets executed, which may or may not make a change to the environment, and that will then drive a new observation. So this whole system continues around in a kind of feedback loop, and all the time the agent is trying to pick actions that will ultimately get it towards its goal.

That's reinforcement learning in a nutshell, and how it works. This diagram is pretty simple, but there are a lot of very complex technicalities behind trying to solve this reinforcement learning problem in the fullest sense of the word. But we know that if we could solve all of these technical issues, this framework is enough to deliver general intelligence, and we know that because this is the way biological systems learn, including our human minds. In the primate brain and the human brain, it's
the dopamine neurons, the dopamine system in our brain, that implement a form of reinforcement learning. So this is one of the methods that humans
use to learn.

The second key piece of technology that's created this new renaissance in AI in the last decade is called deep learning. Deep learning is to do with hierarchical neural networks. You can think of them as loose approximations to the way our real neural networks work in our brain. Here's an example of a neural network working. Imagine that you're trying to train one of these neural networks, here on the right, these layers of neurons, to distinguish between pictures of cats and dogs. What you would do is show this AI system many thousands, perhaps even millions, of different pictures of cats and different pictures of dogs. This is called supervised learning. You'd show the input layer, at the bottom here, the raw pixels from a picture of a cat or a dog, and then the neural network would process that picture and ultimately output a label: either a guess saying "I think that's a cat" or a guess saying "I think that's a dog". Depending on whether it was correct or incorrect, you would adjust the neural network, adjust the weights between these neurons, so that the next time it gets asked whether this is a cat or a dog, it's more likely to output the right answer. It uses an algorithm called backpropagation to do that, so it goes back and adjusts the neural network weights depending
on whether you got the answer right or wrong, so that you're more likely to get the answer right next time. Once you do this incremental learning process many thousands, perhaps even millions, of times, eventually you get a neural network that is really amazing at distinguishing between pictures of cats and dogs, in fact better than I am: I actually can't tell whether that's a cat or a dog from that particular picture.

One of our big innovations at DeepMind was to pioneer the combination of these two types of algorithms. We call this combination, quite logically, deep reinforcement learning. We use deep learning to process the perceptual inputs, to process the observations and make sense of the world out there, these visual inputs that the system is getting, and then we use reinforcement learning to make the decisions, to pick the right action to get the system towards
its goal.

We pioneered this field, and one of the big things that we demonstrated was that we built the world's first end-to-end learning system, and it's called DQN. What we mean by end-to-end is that it went all the way from raw perceptual inputs, in this case pixels on a screen, to making a decision about what action to take. So it really was an example of one of these full systems that can go all the way from processing the vision to making a plan and executing that plan. The first thing we tested it on was Atari games from the 80s. We tested it on 50 classic games. Those of you in the audience who are old enough to remember these games, which is probably not many of you: they were Space Invaders, Pac-Man, these kinds of games that I'm showing here at the bottom. I'm going to show you how the DQN system learnt and how it progressed through its learning in a second, in a video on the next slide. But before I show that, I just wanted to be clear about what you're going to see. The only input that the DQN system gets is the 20,000 pixel values on the screen. Those are the inputs that it gets, just these pixel numbers. It doesn't know anything about what it's supposed to be doing or what it's controlling. All it knows is: these are the pixel values, and you've got to maximize the score. That was the goal. It has to learn everything else from scratch.

The architecture we use is here on the screen. This is a neural network that you can think of as on its side. On the left-hand side you can see the current screen, the pixels on the screen, being used as the input. Then it gets processed through a number of layers, and at the output you've got a whole bunch of actions that can be taken. I think it's 19 actions that can be taken: the
eight joystick movements, the eight joystick movements with the fire button, or doing nothing. So it's got to make a decision about which of those actions to take in the next time step, based on the current screen input.

This is how it works on the classic game Breakout. Breakout is one of the most famous Atari games. In this game you control the bat, the pink bat at the bottom of the screen, and the ball, and what you're trying to do is break through this rainbow-colored brick wall, brick by brick. You're not supposed to let the ball go past your bat, otherwise you lose a life. I'm going to show you this video now of the agent learning over many hundreds of games of play. This is DQN after 100 games. You can see it's not a very good agent yet; it's missing the ball most of the time, but it's starting to get the idea that it should move the bat towards the ball. Now this is after 300 games, so 200 more games' experience, and now it's got pretty good at the game. It's about as good as any human can play the game, and it pretty much gets the ball back every time, even
when the ball is going very fast at a very vertical angle. But then we let the system carry on playing for another 200 games, and it did this amazing thing: it figured out that the optimal strategy was to dig a tunnel round the left-hand side and then put the ball behind the wall. Of course this gets it more reward for less risk, and gets rid of the rainbow-colored brick wall more quickly. That was for us really our first big "aha" moment, a watershed moment at DeepMind. This is now from four or five years ago, and we realized we were onto something with these kinds of techniques. It was able to discover something new that even the programmers and the brilliant researchers of that system did not know how to do. We hadn't thought about that solution to the game.
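As a rough illustration of the observe, act and learn loop described in this part of the talk, here is a minimal sketch in Python. It is not DQN: there is no neural network and no pixel input. It assumes a made-up one-dimensional corridor environment and uses plain tabular Q-learning with epsilon-greedy action selection; the function names, the corridor and every parameter value are hypothetical choices made only for this example.

```python
import random


def train_q_table(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor (hypothetical environment).

    The agent starts in state 0 and receives reward 1.0 only when it
    reaches the last state. Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_states - 1
    # One row of action values per state, all starting at zero.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy: mostly exploit the current estimates,
            # sometimes explore; ties are broken at random.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best])

            # The environment responds with a new observation and a reward.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: nudge the estimate towards
            # reward + discounted value of the best next action.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action index for every state."""
    return [row.index(max(row)) for row in q]


if __name__ == "__main__":
    table = train_q_table()
    print(greedy_policy(table))
```

The agent is never told that moving right is correct; it only sees states and rewards, which is the sense in which the talk says the system has to learn everything else from scratch. DQN, as described in the talk, replaces the table with a deep neural network that reads screen pixels.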
So then more recently, a couple of years ago, we started work on what is probably still our most famous program, called AlphaGo. AlphaGo was a system to play the ancient Chinese board game of Go. This is what Go looks like, for those who don't know, and this is what they play in China, Korea and Japan instead of chess. Go is actually a very simple game. There are only two rules, basically, and I could teach it to you in five minutes, but it takes at least a lifetime, some would say many lifetimes, to master the game. This is a position from the end of a game. There are two players, black and white, and they take turns putting stones on the board. Eventually, when the board fills up like this, you end up counting how many areas of territory you walled off with your stones, and the side that has walled off the most territory, the most squares, with their stones wins the game. In this case it's a very close game, and white wins by one point.

Now the question is: why is Go so hard for computers to play? I just told you at the beginning of the talk that chess was cracked twenty years ago, and since then Go has been one of the holy grails for AI research. It's much, much harder, and there are two main reasons why Go has been much harder than chess. One is the huge search space, the huge number of possibilities in Go. There are actually 10 to the power 170 possible positions in Go, which is way more than there are atoms in the universe; there are about 10 to the power 80 atoms in the observable universe. What that means is that if you ran all the world's computers for a million years on calculating all the positions, you still wouldn't have calculated through all the possibilities in Go. There are
just too many to do through brute force.

The second, and even harder, thing about Go is that it was thought to be impossible to explicitly write down by hand what's called an evaluation function. That's a function that takes a board position and tells the computer which side is winning and by how much, and it's a critical part of how the chess programs work. That's why Deep Blue was so powerful: a team of chess grandmasters and brilliant programmers at IBM came together, and the programmers tried to distil what was in the minds of the chess grandmasters into an evaluation function that would allow the Deep Blue system and its successors to evaluate whether the current position was good or not. That's what's used to plan out what move you should take. In Go this was thought to be impossible because the game is too esoteric, almost too artistic in a way, to be able to evaluate in that sense with hard and fast rules. If you talk to a professional Go player, they'll tell you the game is a lot more about intuition and feeling than it is about calculation, unlike a game like chess, which is more about explicit calculation and planning.

We made this big breakthrough with AlphaGo, and the way we were able to do this is that we tackled those two problems, the problem of combinatorial explosion and huge search spaces, and the problem of the evaluation function, with two neural networks. The first neural network we used was called a policy network. What we did here was feed in board positions from strong amateur games that we downloaded off the internet, and we trained the neural network to predict the next move the human player would make. In blue here is the current board position with the black and white stones on it, and the output is another board, but here with the probabilities that AlphaGo
thought for each possible move in the position. The higher the green bar, the higher the probability it would give to a human player playing that move. What this policy network allowed the system to do is, rather than look at all the possible moves in the current position and then all the possible replies to those moves, and you can imagine how quickly that escalates, it can instead look at the top three or four most likely, most probable moves, rather than the hundreds of possible moves that you could make. That massively reduces the breadth of the search tree.

The second thing we did was create what we call a value network, and what
we did is we took the policy network and we played it against itself millions of times. So AlphaGo played against itself millions of times, and we took random positions from the middle of those games. We of course know the result of the game, which side won, and we trained AlphaGo to make a prediction, from the current position, about who would end up winning the game and how certain AlphaGo was about its prediction. Eventually, once we'd trained it through millions of positions, it was able to create a very accurate evaluation function, this value network. What this value network did is take a current board position, again in blue here at the bottom of the screen, and output a single real number between zero and one. Zero meant white was going to win, with a hundred percent confidence in that; one would mean black was going to win, with a hundred percent confidence in that; and 0.5 would mean AlphaGo judges the position to be equal.

By combining these two neural networks, we solved all of the hard problems inherent in computer Go. What you'll notice is that, instead of us building an explicit evaluation function like they do for chess programs, typing in all these hundreds of different rules, and in fact modern chess computers have of the order of a thousand specific rules about chess and about positions in chess, we didn't have any explicit rules. We just let the system learn for itself through experience, by playing the game against itself many thousands, indeed millions, of times.

Once we had this system, we decided to challenge one of the greatest players in the world, an incredible South Korean grandmaster called Lee Sedol. I describe him as the Roger Federer of Go, because that's the equivalent position he occupies.
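To make the roles of the two networks described above a little more concrete, here is a small Python sketch. It is only an illustration under stated assumptions, not AlphaGo's actual search: the move probabilities and position values are passed in as plain dictionaries and callables standing in for the policy and value networks, and the helper names (`top_k_moves`, `choose_move`, `tree_size`) are made up for this example. The value convention follows the talk: 1 means black is certain to win, 0 means white is, and the sketch assumes we are choosing a move for black.

```python
import heapq


def top_k_moves(move_probs, k=3):
    """Keep only the k moves a (hypothetical) policy network rates most likely.

    move_probs maps each legal move to the probability assigned to it.
    """
    return heapq.nlargest(k, move_probs, key=move_probs.get)


def choose_move(move_probs, value_after_move, k=3):
    """Pick, among the top-k candidates, the move whose resulting position
    a (hypothetical) value network scores highest for black."""
    candidates = top_k_moves(move_probs, k)
    return max(candidates, key=value_after_move)


def tree_size(branching, depth):
    """Number of move sequences of the given depth at a fixed branching factor."""
    return branching ** depth


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    probs = {"D4": 0.40, "Q16": 0.35, "K10": 0.20, "A1": 0.05}
    values = {"D4": 0.52, "Q16": 0.61, "K10": 0.48, "A1": 0.10}
    print(choose_move(probs, values.get, k=3))
    # Looking 4 moves ahead: every move considered vs. only the top 4.
    print(tree_size(250, 4), tree_size(4, 4))
```

The last line shows why narrowing the breadth matters: with roughly 250 candidate moves per turn (an assumed round figure), four moves of lookahead already gives billions of sequences, while keeping only four candidates per turn gives a few hundred.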
Lee Sedol has won 18 world titles, a bit like Grand Slams, and he's considered to be the greatest player of the past decade. We challenged Lee Sedol to a match, a one-million-dollar challenge match, in Seoul, South Korea, back in 2016. It was an amazing, once-in-a-lifetime experience, and the whole country pretty much came to a standstill. One thing you've got to know about South Korea is that they love AI, they love technology and they love Go, so for them this was the biggest confluence of all the things they find exciting, all together. And Lee Sedol is a sort of national hero there; he's the equivalent of David Beckham or something for us. So that was an incredible experience. This is a picture, at the top left, of the first press conference. It was literally a huge ballroom full of journalists and TV cameras, and there were over 200 million viewers across Asia for the five-game match, which was incredible. AlphaGo won the match 4-1. It was hugely unexpected. Even just before the match, Lee Sedol was asked to predict what he thought was going to happen, and he predicted a 5-0 victory for himself, or 4-1 at minimum. In fact, it was proclaimed to be a decade before its time, both by AI experts, including computer Go experts, and
also by Go players and the Go world.

The important thing here was not just the fact that AlphaGo won, but how AlphaGo won. That was the critical thing. AlphaGo actually played lots of creative, completely new moves and came up with lots of new ideas that astounded the Go world, and which are in fact still being studied now, nearly two years later, and are revolutionising the game. So it's not a question of AlphaGo just learning human heuristics and motifs and then reliably regurgitating them. It actually created its own unique ideas. Here's the most famous example of that, which I just want to quickly show you. This is move 37 in game 2. Go has been around for 3,000 years, and has been played professionally for several hundred years in Japan, China and other places. There is this notion in Go of famous games that are looked back on and studied for hundreds of years, and indeed famous moves in those famous games go down in history. Move 37 from game 2 is considered to be going to follow in that lineage. This is the move, here on the right-hand side. AlphaGo here is black and Lee Sedol is white. When AlphaGo played this move, Lee Sedol sort of literally fell off his chair, and all the commentators commentating on it thought this was a terrible move. The reason for that is that in the opening phases of a Go game you normally play on the third and fourth lines. Go is played on a 19 by 19 board, and you normally play on the third and fourth lines; that's the accepted wisdom of how you should play in the opening. Those are the critical lines. But here you'll notice that AlphaGo played this relatively
early move, move 37, still very early in the game, on the fifth line, so one line further up. And this is normally considered to be a huge mistake, because you're giving white, your opponent, a huge
amount of territory on the side of the board. So it's considered to be a very weak move, the sort of thing no professional would ever consider playing.

And the key thing about what AlphaGo did here is that it played this move anyway. The thing about Go is that in Asia it's considered to be kind of like an art form, but it's sort of objective art. Any one of us could come up with an original move; we could just play a random move and it might be original. But the key question is, did it make a difference and impact the result of the game? That's what determines whether it's a beautiful and truly creative move. And in fact move 37 did exactly that. You'll see the two stones here that I've outlined in the bottom left; they're surrounded by white stones, so they're in big trouble. But later on, about a hundred moves later, the fighting that was going on in the bottom left-hand corner spilled out into the center of the board and ran all the way across the board, and those two stones down in the bottom left ended up joining up with that move 37 stone. That move 37 stone was in exactly the right place to decide that whole battle, which ended up winning AlphaGo that game. So it was almost as if AlphaGo placed that stone presciently, a hundred moves earlier, to impact this fight elsewhere on the board at exactly the right moment. So this was really quite an astounding moment for Go and computer Go.

Lee Sedol himself was incredibly gracious, and an absolute genius. And what
was really amazing was that he won a game, and it was an incredible game that he won; he made an amazing move too. He said afterwards that he was very inspired by the match, that he realized learning to play Go had been a really good choice, that this was sort of the reason he played Go, and that it had been an unforgettable experience. He actually went on a three-month unbeaten winning streak in human championship matches after this match with AlphaGo, and he was trying out all sorts of new ideas and techniques. If you're interested in that and you want to see the behind-the-scenes story, I recommend you watch this documentary, which was done by an independent filmmaker and won all sorts of awards at film festivals. It's now available on Netflix, and it will really give you a behind-the-scenes look at how AlphaGo was created and what went into it.

Since then we've continued working on these kinds of systems, and now we've created a new program called AlphaZero, which advances what we did in AlphaGo and takes it to the next level. What we've done with AlphaZero, and we released this just before Christmas, is we generalized AlphaGo to be able to play not just Go but any two-player game, including chess and shogi, which is the Japanese version of chess, both of which are played professionally around the world. So it plays more than one game. Don't forget, this gets at the notion of generality; that was something I criticized about Deep Blue. That program could only play chess, whereas AlphaZero can play any two-player game.

The second thing is that we removed the need for human data. For Go, if you remember what I said about the policy network, it was first trained to mimic strong human amateur players, using games that we'd shown it from the internet, but
instead of that, what AlphaZero does is it starts completely from scratch, so it relies only on self-learning, playing against itself. When it begins, it starts off totally
randomly, so it knows nothing about the game or anything about what the good moves or the likely moves are. It has to learn all of that, literally starting from random. So it doesn't require any human data to bootstrap its learning.

We tested this program in chess. Of course there are already many very, very strong chess programs, way stronger than the human world champion. The current top program is called Stockfish; it's an open source program, and you can think of it as the descendant of Deep Blue twenty years later. It's way, way stronger now, you can run it on your laptop, and it's so strong that no human player would have a chance of beating it. In fact many of my chess player friends, and I used to play chess when I was a lot younger, thought that Stockfish could never be beaten, that that was the limit at which chess could be played. Amazingly, AlphaZero, after just four hours of training, so it started off random and then had four hours of this self-play and a few million games, was able to beat Stockfish twenty-eight nil, with 72 draws, in a hundred-game match. So these are really quite astounding results, again for the AI world but also for the chess world. We've just released preliminary data on this, and we're going to publish it in a big peer-reviewed journal in the next few months.

And again, just like with Go, where it came up with these new motifs, playing on the fifth line in the opening, that overturned thousands of years of received human wisdom, here in chess, even more quickly and more amazingly, it seems to have invented a new style of playing chess. The summary of that is that it favors mobility, the amount of mobility your pieces have, over materiality. So
in most chess programs, the way they're written, one of the first rules you input into a chess program is the value of the pieces: a rook is five points, a knight is three points, a bishop is three points. So obviously you don't want to swap your rook for a knight, because that's minus two points. Those kinds of rules were among the very first things put into the very first chess computers. Whereas AlphaZero is actually very contextual: in certain positions it will be very happy to sacrifice material to gain mobility for the remaining pieces it has, to increase their power on the board. What that means is it can make incredible sacrifices to gain positional advantage, really long-term sacrifices.

We released ten sample games from this 100-game match, and these are being pored over by chess grandmasters at the moment. There are lots of great YouTube commentaries on this; if you're an amateur chess player or you're interested in chess, I recommend you have a look at a few of these great commentaries on YouTube that talk about why this AlphaZero style is so interesting. The second interesting thing about it is that a lot of these professional chess players commented on how AlphaZero seemed to have a much more human style than the top chess programs, which have a much more mechanical style; the way that computer chess programs have played until now is a little bit ugly to the human eye.

So these are some of the breakthroughs that we've had, and there are many other breakthroughs in many other domains from other groups around the world. AI right now has become a huge buzzword, and a massive amount of progress has been made in the last five to ten years. But I don't want to give you the impression that we're anywhere close yet to solving AI. There's
actually tons of key challenges that remain. In fact it's a very exciting time; in some senses I feel like all we've done is deal with the preliminaries, and now we're getting to the heart of what intelligence
is. And I'll just give you a little taste of some of the things that I'm personally thinking about and that my team is. Each one of these things would be a whole lecture in itself, and indeed I think some of the other speakers in this lecture series will probably cover some of these topics.

Unsupervised learning is a key challenge that is not solved yet. What I've been showing you is supervised learning, like the cats and dogs, where I tell the system the answer so that it tries to figure out how to adjust itself to be more likely to get the right answer. I've also shown you reinforcement learning, where you get a score or reward; in Go the machine gets a one for winning and a zero reward for losing, and it wants to get reward. But what about the situation where you don't have any rewards and you also don't know the correct answer? That in fact is most of the time. When we do human learning, and when babies learn, most of the time they're not getting any feedback, and yet they're still learning things for themselves. So how does that happen? That's called unsupervised learning.

The second thing is memory and one-shot learning. What I've shown you is systems that are in the moment: they process what's currently in front of them, they make a decision in the moment, and then they execute that decision. But of course to have true intelligence you need to remember what you've done in the past and how that affects your future plans. And you also need to be able to learn much more efficiently. I've told you that AlphaGo needs to play millions of games against itself to get to this level, but humans can learn much more quickly. We are able to learn things sometimes in one shot, from just one example, and that's enough. Both of those things are kind of related, and
actually this is what I studied, as Andy mentioned, for my neuroscience PhD: how the brain does this. It's a brain area called the hippocampus, which is what I studied for my PhD, and it is critical to both one-shot learning and episodic memory.

Another thing is imagination-based planning. One thing is to plan by trying out possibilities, like in chess or Go, where it's quite simple. Although Go has got lots of possibilities, the dynamics of the game itself are very simple: the rules are simple, and you know what will happen if you make a move and how the next state will look. Of course the real world is much more complicated, and it is not easy to figure out what's going to happen next when you take an action. This is where imagination comes in. This is how we make plans as humans: we imagine, viscerally, how we might want a job interview to go, or a lunch, or a party, or something like that. We actually visualize it in our minds, and that allows us to ask, what if I said this thing, or if I did this thing, then how would this other person react, and so on. We play these scenarios through in our minds before we actually get to the situation. That's an extremely efficient way to do planning, and it's something that we need in our AI systems.

Then there's learning abstract concepts. What I've shown you here is implicit knowledge, kind of figuring out what this perceptual world is about, but what we really need to learn is abstractions, high-level concepts, and eventually things like language and mathematics, which we're nowhere near currently.
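As an illustration of the "plan by trying out possibilities" idea described above for games with simple, known dynamics, here is a minimal sketch in Python. It assumes a toy one-dimensional world invented purely for this example; `GOAL`, `ACTIONS`, `model`, `score`, `rollout` and `plan` are all hypothetical names, and this is not how AlphaGo or any DeepMind system actually plans. It only shows the basic loop: use a model of "what happens next" to imagine every short action sequence, score the imagined outcomes, and take the first step of the best one.

```python
from itertools import product

# Hypothetical toy world: an agent on a number line wants to reach GOAL.
GOAL = 3
ACTIONS = (-1, +1)  # step left or step right


def model(state, action):
    """Known dynamics, like the rules of a board game: the next state is fully predictable."""
    return state + action


def score(state):
    """Objective to maximise: 0 at the goal, more negative the further away."""
    return -abs(GOAL - state)


def rollout(state, actions):
    """Imagine applying a sequence of actions using the model, without acting in the world."""
    for action in actions:
        state = model(state, action)
    return state


def plan(state, depth):
    """Try every action sequence of length `depth` in imagination and
    return the first action of the sequence with the best final score."""
    best_sequence = max(
        product(ACTIONS, repeat=depth),
        key=lambda seq: score(rollout(state, seq)),
    )
    return best_sequence[0]
```

For example, `plan(0, 3)` returns `1` (step right), because the only three-step sequence that ends on the goal starts by moving right. The exhaustive enumeration grows as `len(ACTIONS) ** depth`, which is exactly why this brute-force approach stops being practical once the dynamics are not known or the space of possibilities is large, as the talk notes for the real world.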
Transfer learning is another key thing, which is where you take some knowledge you've learned from one domain and you apply it to a totally new domain. That might look perceptually completely different, but actually the underlying structure of that domain is the same as some other domain that you've experienced. Again, our computer systems are not good at doing this kind of learning, but humans are exceptionally good at it. And then finally, of course, in all the things I've shown you here, Atari games, Go, chess, none of them yet involved language, which as we all know is key to intelligence. So that's a whole field that still needs to be addressed with these kinds of techniques.

I just want to talk a little bit now about how this is already being applied, even with the systems we have today. There are many challenges to come, but I think the systems we have today can already be usefully used in science. In fact we've seen that in some work we've done, and many other groups are using some of these systems I've already talked about, deep learning and reinforcement learning, in all sorts of very interesting scientific domains. They're being used to discover new exoplanets by analyzing data from telescopes. AI systems are being used for controlling plasma in nuclear fusion reactors. We and others have been working on how it can help with quantum chemistry problems. And it's also being used a lot in healthcare domains; we have a partnership with Moorfields to help the radiographers quickly triage retinopathy scans, or scans of the retina, to look for macular degeneration. So it's being used in very diverse fields, and I could have done many slides on different applications that are currently going on with AI, and I think this is just the beginning. One of the things I'm most excited about is applying
it to the problem of protein folding. This is the problem where you get an amino acid sequence, the sequence of the protein, and you need to figure out the 3D structure the protein will eventually fold into. That's really key to a lot of disease research and drug discovery, because the 3D structure of the protein governs how it will behave. This is a huge, long-standing scientific challenge in biology, and we're working quite hard on it with a project team, together with some collaborators from the Crick.

Other scientific applications I see coming up are helping with things like drug design, the design of new materials, and in biotechnology, in areas like genomics. In fact, if I were to boil down the properties of problems that are well suited to the AI we already have today, let alone what we're going to create in the future, I think it comes down to three key properties. Property one: it's got to be a massive combinatorial search space, so that's got to be inherent in the problem. Secondly, can you specify a clear objective function or metric to hill-climb against, to optimize against? It's almost like a score in a game; you have to have some kind of score of how well you're doing towards the final goal. And then you either need lots of data to learn from, actual real data, or an accurate and efficient simulator, so you can generate a lot of data in the way that we do with our game systems.

As long as you satisfy those three properties, I think the AI systems we already have today could potentially be usefully deployed in those areas, and I think there are actually a lot of areas in science that already fit these desired properties. And then of course there are all sorts of applications to the real world that we're working on in combination with Google, including
with healthcare, where we work with the NHS on many projects, making the assistant on your phone more intelligent, and also areas like education and personalized education. I think AI is set to revolutionize a lot of these other sectors.

So just to sum up now: one of the reasons that I've spent my whole career on AI is that I've always felt that it's a kind of meta-solution to many other problems that face us today. If you think about how the world is today, one of the big challenges is the amount of information that we're confronted with and
that we're producing as a society. I mean that both in our personal lives, in terms of choosing our entertainment, and in science, where there's just so much data now being produced from something like CERN or in genomics. How do we make sense of it all? And indeed, the systems that we would like to better understand and have more control over are incredibly complex systems. Think about climate, or nuclear fusion systems; these are incredibly complex systems that in some cases are bordering on chaotic systems. So they're very difficult to describe with equations and to understand, even for the top human scientists and experts.

For a long time big data was the buzzword; before AI became the buzzword it is now, big data was the buzzword. And I think that in a way big data can actually be seen as the problem. If you think about it from an industry point of view, all companies have tons of data now and talk about big data. The problem is, how do they get the insights out of that data, and how do they make use of all of it so as to be useful to their customers and their clients and so on? I think AI is the answer, to help find the structure and insights in all of that unstructured data. In fact, I think one way to think about intelligence is as a process, an automatic process, that converts unstructured information into useful, actionable knowledge. And I think AI could help us automate that process. For me, my personal dream, and also the dream of my team, is to make AI-assisted science possible, or even perhaps to create AI scientists that can work in tandem with their human expert counterparts.
From a neuroscience point of view, one of my dreams as well is to try and better understand our own human minds. I think building AI in this neuroscience-inspired way, and then comparing that algorithmic construct with the way the human mind works, will potentially shed some light on some of the unique properties of our own minds, things like creativity, dreaming,
perhaps even the big question of consciousness.

So to sum up, then, I think AI holds enormous promise for the future, and I think these are incredibly exciting times to be alive and working in these fields. But with all this potential also comes a lot of responsibility. I just want to mention this, and I think some of you will probably have questions about it later. We believe at DeepMind that, as with all powerful technologies, AI must be built responsibly and safely and used for the benefit of everyone in society, and we've been thinking about that from the very beginning of DeepMind. This requires lots of things that we're actively engaged with right now with the wider community: research on the impact of the technology, and on how to control this technology and deploy it. We need a diversity of voices both in the development and in the use of the technology, and meaningful public engagement, which is why we're so happy to be supporting this lecture series. We've just launched our own ethics and society team at DeepMind, which is working with many outside stakeholders to figure out the best way to go about deploying and using these types of technologies to benefit everyone in society. We've also been involved at an industry scale, across the whole field, in co-founding the Partnership on AI, which is a cross-industry collaboration with both for-profit and nonprofit organisations. So the biggest companies in the world are coming together to talk about this and try to agree some best practices and some protocols around how to research this technology and how to engage the public with it.

And all this is happening, for us, right here in the heart of London, our home. We're a very proudly British company, and we work here at King's Cross with
our colleagues at the Crick Institute and the Alan Turing Institute, which is based in the British Library. King's Cross, with UCL of course around there, is becoming quite a hotbed of AI research. And as Andy mentioned at the start, we should leverage all of our incredible strengths in the UK, these amazing universities that we have here, Oxford, Cambridge, UCL, Imperial and others, that have incredible strengths in computer science. I feel very strongly that DeepMind needs to play its part in encouraging and supporting this AI ecosystem through sponsorship, scholarships and internships, and actually lectures given by DeepMind staff.

I'm very passionate about establishing the UK as one of the world leaders in AI. I think we have an amazing position, and we should really be building on our heritage in computing. That starts with Charles Babbage inventing computing really a hundred years before its time in some senses, and then of course it continued with Alan Turing, who famously laid down the fundamentals of computing in the 40s and 50s, and then the World Wide Web, with people like Sir Tim Berners-Lee instrumental in creating the internet. I feel like the next thing in the lineage of those types of technologies is artificial general intelligence, and I think the UK has a huge part to play in that, and I hope DeepMind will play its part in that too.

So it's great to be opening this series of lectures. I think we need to capitalize on what we have here in the UK, both
from the ethical side and the technological side. One thing I would say is that it's important for us to be at the forefront of technology if we want to have a seat at the table when it comes to influencing the debate on the ethical use of this technology. And again, I would encourage you all, the public, to get involved: understand these technologies better and how they're going to affect society, and then engage on how you would like to see these technologies deployed for the benefit of everyone. I think AI can be an incredible benefit to society if we use it responsibly. I think it could be one of the most exciting and transformative technologies we'll ever invent. Thanks for listening.

So, folks, we have time for questions and so on, but let me start off with a question just to get things going. You've outlined some of the technological possibilities. What are the barriers, what are the difficulties, what stands in the way of the deployment, the use, the science of some of the things you've talked about?

Yeah. I mentioned in the slide what the remaining challenges are. I think it's important to remember that there are lots of very difficult things about intelligence that we still don't know how to do. I outlined some of those key areas. We're working hard on that, and many others are too, but we don't know how quickly those solutions will come. I think some really big breakthroughs are still needed, at least as big as the ones we've had, and
possibly many of those, so I think that's to come over the next few years. In terms of the barriers to using them, I think we have to think very carefully about how we want to test these kinds of systems, because in a sense these systems are adaptive and they learn, so it's a very new type of software in a way. With software generally, we write some software and then you test it, and you stress test it, and there's unit testing; there are all these ways of testing software. Then you know it's ready to be shipped and deployed, and out it goes, and you know it's going to behave the way you want. Now, of course, one of the advantages of our systems is that when you put them out in the world they'll continue to adapt and learn in the new situations they encounter that you may not have thought of. But then the question is, how do you make sure they still behave the way you want them to, and how do you test that kind of system? So actually that's a big challenge.

I can be a little light-hearted. In flying, I'm told, a common phrase from one pilot to another is "what's it doing now?" So, a little bit of that sort of stuff. So, questions. Let's start off at the front here, the lady right in the front. There's a microphone coming very quickly.

Yeah, sorry. Maggie Murphy from The Telegraph. Hi. So I've got a question about the implications AI may have on democracy further down the line. A lot of what you talked about was predicting human behavior. Do you think that there's a legitimate concern around predicting human behavior at scale, potentially manipulating people, serving them political advertising, or even, with private corporations, perhaps gambling, or gaming where you have to pay to level up? What are your thoughts on that?

Yeah, I'm not sure I did talk about predicting human behavior, but, so,
yes. If you're referring to the kind of Facebook stuff, what we're talking about here is finding structure in any kind of data. I'm thinking more about scientific data that you've got, or, in the case of our stuff, the gaming data, so it playing against itself and generating its own data and then finding paths. You can think of it as an intelligent search. You've got this huge combinatorial space, and you want to try and find, say, a new material design or a new drug design, or indeed a new Go position. How do you efficiently search through all of that amount of data? What you've got to find is structures and patterns that can help you reduce the size of that search. Really, that's what you can think of AlphaZero and AlphaGo as doing, and that's where we are at the moment. Of course, eventually these systems could potentially predict all sorts of things, but right now the first thing you have to do is get the data into some kind of format that you can actually express, and the second thing is an objective function, some kind of goal that you want the system to achieve. So I think it's quite different to the kinds of goals that, say, Facebook has with their systems.

I have a question from downstairs. I'm afraid I don't know who it comes from, but here it is. I'll paraphrase.
Humans are irrational. What's the approach? How do you approach an automatic system which has to deal with some level of irrationality?

Sure. Well, I think in fact that's one of the most tricky things about a lot of systems. Economics, I think, is one of the most difficult areas of science, because it really is a sort of aggregate of human behavior, and of course they will tell you better than most scientists about human irrationality and how that impacts things. I think what we've potentially got here is systems that can be quite rational, and then we've got to think about what aspects of rationality they need to model, if at all, to understand those systems. At some point they're probably going to need something. I can imagine that, if they interact with human experts and human systems, they're going to need to understand a little bit, to empathize with, how humans behave and what they can expect from them. But I think part of the power of these systems is that they could be very rational systems.

Have we an early-career person? The gentleman halfway down, at this end. Would you like to say whether you're early-career?

Yeah, kind of. Michael Vicks, startup founder. So there's this discussion between Elon Musk and Mark Zuckerberg about the threat AI poses to humanity. So firstly, what are your thoughts on that? From a personal perspective it just seems crazy to me, because what exactly are we worried about? Are we worried about a system we can't turn off? Are we worried about a Westworld-type scenario where there's AI running around manipulating us? It just seems so far off, when at the moment we're just talking about games; we're not talking about the complexity that you discussed towards the end of your lecture.

Yeah, I mean, it's a good question. I think, you
know, that's why I tried to emphasize that although there have been some impressive breakthroughs, we are still at a nascent stage, and we are talking about just board games and things at the moment. But I think my view is somewhere in between the sides of that debate you're talking about. Just for those who don't know, Elon has sounded the alarm bell a lot about the dangers of AI, the sort of existential risks of AI, and then Mark Zuckerberg replied, kind of, that there aren't any, that we shouldn't worry about that, and that it's all sort of roses. I think actually the real answer comes somewhere in between. My view is that a lot more research has to be done about what these systems are and what their capabilities could be, and what type of systems we'd want to design. I think a lot of these things, because we're very early, are still unknown.

And I think a lot of the things people worry about are going to get a lot better. For example, the interpretability of these systems. This is one thing that I often get asked: how does AlphaGo play Go? We don't know that yet. It's a big neural network, and it's a little bit like our brains; we roughly know what it's doing, but not the specifics. It's not like a normal program where you can point to it and say, this bit of code is doing this. For safety-critical systems, perhaps in healthcare and others, or if it was to control a plane, you'd actually want to know exactly why a decision was made, and be able to track that back, for accountability, to make sure there's no bias, for fairness, and all these other things. This is a very active area of research, and we have a whole team that researches this, and I'm
not, you know... I think things will get a lot better on that front in the next five-plus years. It's because we're at the very beginning of even having these systems working that we don't yet know how to build visualization tools and other things, but I think we will do.

Having said that, you know, we've got to make sure... it's a very, very powerful technology, and the reason I work on it is because I think it's going to have this amazing transformative effect on the world for the better, in things like science and technology and medicine. But, you know, like all powerful techn