TF Machine Learning for Programmers (TensorFlow @ O’Reilly AI Conference, San Francisco '18)
All righty, shall we get started? Thanks, everybody, for coming to this session. I'm going to be talking about TensorFlow, and particularly TensorFlow from a programmer's perspective: machine learning for programmers. I'd like to show some code samples of using TensorFlow in some simple scenarios, as well as one slightly more advanced scenario. But before I do that, I always like to do a little bit of a level set. If you were at the previous session, sorry, some of the content is going to be similar to what you've seen already.
When I come to conferences like this one about AI, or when I read the news about AI, there are always stories about what it can do or what it might do, but there's not a whole lot about what it actually is. So part of my mandate, and what I like to educate people around, is what AI actually is from a programmer's perspective: what it is for you, how you can begin to learn to program it, and then how you can apply it to your business scenarios.
We're also at the cusp of a revolution in this technology. Lots of people are calling it the fourth Industrial Revolution. For me, I can only describe it as the third big shift in my own personal career. The first one came in the early to mid 90s, when the web came about. If you remember, when the web came about we were all desktop programmers. My first job was as a Visual Basic programmer, programming Windows applications. Anybody ever do that? It was fun, wasn't it?
So then the web came around, and what happened with the web is that it changed the audience of your application from one person at a time to many people at a time. You had to start thinking differently about how you build your applications, to be able to scale them to lots of people using them. The runtime also changed: instead of writing something that had complete control over the machine, you would write something for a sort of virtual machine that the browser gave you, and maybe that browser would have plugins like Java that you could use to make it more intelligent. As a result, this paradigm shift gave birth to whole new industries.
I work for a small company called Google. Anybody heard of them? Things like Google weren't possible before. Anybody remember Gophers? Yeah, that's really old-school, right? Gophers were almost the opposite of a search engine. With a search engine, you type something into it, it has already found the results, and it gives them to you. A gopher was this little application that you would send out into the nascent Internet; it would crawl everywhere, a little bit like a spider, and then come back with results for you. So, for me, whoever had the great idea to say "let's flip the axes on that" and come up with this new business paradigm ended up building the first search engines, and as a result companies like Google and Yahoo were born. Ditto with things like Facebook: that wouldn't have been possible without the browser. Can you imagine trying that before,
in the pre-Internet days, when there was no standard protocol for communication and you wrote desktop applications? Can you imagine being able to build something like a Facebook or a Twitter? It just wasn't possible. So, to me, the web was the first great tectonic shift in my own personal career.
The second one came with the advent of the smartphone. Now users had this device that they could put in their pocket that's got lots of computational power, it's got memory, it's got storage, and it's loaded with sensors like cameras and GPS. Think about the types of applications that you could build with that. Companies like Uber became possible. I personally believe, by the way, that all of these applications are built by introverts, because all of these great things that you can do nowadays serve introverts. I'm highly introverted, and one thing I hate to do is stand on a street corner and hail a taxi. So when Uber came along it was like a dream come true for me: I could just do something on my phone and a car would show up. And now with shopping it's the same kind of thing, right? I personally
really dislike going to a store and having somebody say, "Can I help you? Can I do something for you? Can I help you find something?" I'm introverted. I want to go find it myself, put my eyes down, take it to the cash register, and pay for it. Online shopping has done the same thing. I don't know why I went down that rabbit hole, but the second tectonic shift has been the advent of the mobile application, so that these new businesses, these new companies, became possible.
The third one that I'm seeing now is the AI and machine learning revolution. There's so much hype around this, so I like to draw the diagram of the hype cycle. Every hype cycle starts off with some kind of technological trigger. With AI and machine learning, that technological trigger really happened a long time ago. Machine learning and AI have been in universities and in industry for quite some time, decades. It's only relatively recently that the intersection of compute power and data has made it possible for everybody to jump on board, not just university researchers. With the power of things such as TensorFlow, which I'm going to show later, anybody with a laptop can start building neural networks, where in the past neural networks were reserved for the very best of universities. So that technological trigger, which has rebooted the AI infrastructure in many ways, has only happened in the last few years. With any hype cycle, you end up with this peak of inflated expectations, where everybody's thinking AI is going to be the be-all and end-all and will change the world and everything as we know it, before it falls into the trough of disillusionment. And then at some point we
get enlightenment, and then we head up into productivity. When you think about the web and mobile phones, those revolutions that I spoke about, they all went through this cycle, and AI is kind of going through this cycle now. You can ask a hundred people where we are on this cycle right now and you'd probably get a hundred different answers, but I'm going to give my answer: I think we're right about here. When we start looking at the news cycle, it loosely shows that. We look at news, we look at glossy marketing videos: AI is going to do this, AI is going to do that. At the end of the day, AI isn't really doing any of that. It's smart people building neural networks, with a new metaphor for programming, who have been able to build out these new scenarios. So we're still heading up that curve of inflated expectations, and at some point we're probably going to end up in the trough of disillusionment before things get real and you'll be able to really build whatever the Uber or the Google of the AI generation is going to be. Maybe somebody in this room will do that, I don't know.
At Google, we have this graph that we use when we train our internal engineers and our internal folks around AI and around the hype in this area. We like to layer it in three ways. First of all, AI, from a high level, is the ability to program a computer to act like an intelligent human. How do you do that? There might be some traditional coding in that, but there may also be something called machine learning in that. What machine learning is all about is this: instead of writing code, where it's all about how the human solves a problem, how they think about a problem, and expressing that in a language like Java or C# or C++, it's a case of you training a computer by getting
it to recognize patterns, and that opens up whole new scenarios. I'm going to talk about that a little bit more. Another part of that is deep learning. The idea behind deep learning is machines being able to take over some of the role that humans take in the machine learning phase. I'm going to show a slide next about activity detection: in the case of activity detection, instead of me explicitly programming a computer to detect activities, I will train a computer based on people doing those activities.
Let me describe it this way. First of all, how many people in this room are coders, have written code? Oh wow, most of you. OK, cool. What languages, out of interest? Just shout them out. C#, thank you. I've written a bunch of books on C#. I still love it. I don't get to use it anymore, but that's nice to hear. So I heard C#, I heard Python, C++. OK, cool. Ruby, nice. Now, what do all of these languages have in common? That you as a developer have to figure out how to express a problem in that language, right?
So say you're building an application for activity detection, and you want to detect the activity of somebody walking. I'm wearing a smartwatch right now. I love it, because since I started wearing smartwatches I became much more conscious of my own fitness. Think about how the smartwatch monitors
my activity: when I start running, I want it to know that I'm running, so it logs that I'm running. When I start walking, I want it to do the same thing, and count calories and all that kind of stuff. But think about it from a coding perspective: how would you build a smartwatch like this one if you're a coder? You might, for example, be able to detect the speed that the person's moving at, and you'd write a little bit of code like this: if speed is less than four, then the person's walking. That's kind of naive, because if you're walking uphill you're probably going slower and if you're walking downhill you're going faster, but I'll keep it simple like that. So in code, you have a problem, and you have to express the problem in a way that the computer understands, that you can compile, and you build an application out of it.
So now I say, OK, what if I'm running? Well, I can probably go by the speed again, and I say: if my speed is less than a certain amount, I'm walking, otherwise
I'm running. I go, OK, now I've built an activity detector, and it detects if I'm walking or if I'm running. Pretty cool, pretty easy to do with code. So now my next scenario is biking, and going by the data of my speed I can do a similar thing, right? If my speed is less than this much, I'm walking; otherwise, if it's less than that much, I'm running; otherwise I'm biking. So, great, I've now written an activity detector, a very naive activity detector, just by looking at the speed that the person's moving at.
But now my boss, who loves to play golf, says, "This is great. I want you to detect golf, tell me when I'm playing golf, and calculate what I'm doing when I'm playing golf." How do I do that? I get what I, as a programmer, call the "oh crap" face, because now I realize that all of this code that I've written, and all this code that I'm maintaining, I have to throw away, because it can't be used in something like this. This scenario just doesn't become possible with the code that I've written. Going back to the revolutions that I spoke about: something like an Uber wouldn't have been possible before the mobile phone, something like a Google wouldn't have been possible before the web, and something like my golf detector wouldn't be possible, or would be extremely difficult, without machine learning.
So what is machine learning? Traditional programming I like to summarize in a diagram like this one. Traditional programming is a case of you expressing rules using a programming language like Ruby or C# or whatever, and you have data that you feed into it, and you compile that into something that gives you answers.
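The rule-based detector described above can be sketched in a few lines of Python. This is only an illustration of the "rules plus data give answers" idea: the function name, the units, and the running threshold of 12 are assumptions made for the sketch; only the walking threshold of 4 is mentioned in the talk.

```python
def detect_activity(speed: float) -> str:
    """Naive rule-based activity detector that looks only at speed.

    The threshold 4 comes from the talk; the threshold 12 and the
    unit of speed are assumptions made for this illustration.
    """
    if speed < 4:
        return "walking"
    elif speed < 12:
        return "running"
    else:
        return "biking"


# Data goes in, the hand-written rules produce answers.
for speed in (3, 8, 20):
    print(speed, detect_activity(speed))
```

Note that there is no sensible speed threshold to add for golf, which is the limitation the talk is pointing at.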
Keeping the very simple example of an activity detector, that's what gives me the answer: you're playing golf, you're running, you're walking, all those kinds of things. The machine learning revolution just flips the axes on this. The idea behind the machine learning revolution is that now I feed in answers, I feed in data, and I get out rules. Instead of me needing to have the intelligence to define the rules for something, this revolution is saying: I'm going to tell a computer that I'm doing this, I'm doing that, I'm doing the other, and it's going to match those patterns and figure out the rules for me.
So now something like my activity detector for golf and walking and running changes. Instead of me writing the code for that, I would say: I'm going to get lots of people to walk, get lots of people to wear whatever sensor it is, maybe a watch or a smartphone in their pocket, and I'm going to gather all that data and tell a computer, "This is what walking looks like." I'm going to do the same for running, the same for biking, and I may as well do the same for golfing. So now my scenario becomes expandable, and I can start detecting things that I previously would not have been able to detect. I've opened up new scenarios that I previously would not have been able to program using if-then rules in whatever language. Anybody remember the language Prolog? Anybody use that? Yeah. Even Prolog couldn't handle that, even though they said Prolog was an AI language.
The idea behind this is that it kind of emulates how the human mind works. Instead of me telling the computer, by having the intelligence to know what golf looks like, I train the computer by taking data about what golf looks like, and the
computer recognizes that data and matches that data. So in the future, when I give it more data, it will say, "That kind of looks like golf, so I'm going to say this is golf."
We talk about learning, we talk about the human brain, so I always like to think about how you learn something. Remember this game? Everybody knows how to play this game, right? This game seems to have different names in every country, by the way, which is always hard to remember. I grew up calling it noughts and crosses. Heads nodding. Most people in this country maybe grew up calling it tic-tac-toe. I gave a talk similar to this in Japan earlier this year, and they had a really strange name for it that I couldn't remember. But this is a very simple game, right? If I were to ask you to play that game right now, and it's your move, where would you go? How many people would go in the center? Look. How many people would not go in the center? We need to talk.
You probably learned this as a young child, and maybe you teach this to children. But
the strategy of winning this game is this: if it's your turn first, you will never win by not going in the center first, unless you're playing against somebody who doesn't know how to play the game. Now, remember how you learned that. If you have a really tough teacher like me, I would teach my kids by beating them every time at the game. If they would start at the corner, I would beat them, and they would start somewhere else, and I would beat them, and keep doing this kind of thing until they eventually figured out that they have to go in the center or they're going to lose. That is how the human brain learns.
So how do we teach a computer the same way? Think about, for example, your kid, who has never seen this board before. In this society we read left to right, top to bottom, so the first thing they probably do is go in the top left-hand corner. Then you go in the center, then they go somewhere else, and you go somewhere else, and you get three in a row and you beat them. They now have what, in machine learning parlance, is a labeled example: they see the board, they remember what they did on the board, and that's been labeled as "they lost." Then they might play another game and have another labeled example of "they lost," and they'll keep doing that until they have labeled examples of tying, and then maybe eventually labeled examples of winning. So knowing how to learn is a step towards this kind of intelligence. When we talk about machine learning, we get our data and we label it; it's exactly the same as teaching a child how to play tic-tac-toe, or noughts and crosses.
So if I go back to this diagram for a moment before I look at some code: thinking in terms of tic-tac-toe, you have the answers, the experience
of playing the game. You have the labels for that: you know that you've won, you lost, whatever. And out of that, as a human, you begin to infer the rules. Did anybody ever teach you: if you go first, you must go in the center; if you don't go in the center first, you go in a corner; if you don't go in a corner, you block somebody with two in a row? You don't learn by those if-then rules. I know I didn't, and most people I speak to didn't. Instead they ended up playing the game and they inferred the rules for themselves. That's exactly the same thing with machine learning. You build something, the computer learns how to infer the rules with a neural network, and then at runtime you give it data and it will give you back classifications
or predictions, or, you know, intelligent answers based on the data that you've given it. This is what we call the training phase, and this is what we call the inference phase. But enough theory. I like to explain a lot of this in code, so let's look at some code.
A very simple hello world scenario, as all programmers have: I'm going to give you some numbers, and there's a relationship between these numbers. Let's see who can figure out what the relationship is. You ready? OK, here are the numbers. Where X is minus 1, Y is minus 3. Where X is 0, Y is minus 1, etc. Can you see the relationship between the X and the Y? If Y equals something, what would Y equal? 2X minus 1, excellent. So the relationship here is Y equals 2X minus 1. How do you know that? How did you get that? What's that? I can't hear you, sorry. A linear fit. OK, thanks.
Yes, you've probably done some basic geometry in school, and usually there's a relationship Y equals mX plus c, something along those lines. So you start plugging in an m and a c in your mind until you find something that works, right? You go: OK, if Y is minus 3 where X is minus 1, maybe that's a couple of Xs, which would give me minus 2, and I'll subtract 1 from that to give me minus 3. Then I'll try that with 0 and minus 1: yep, that works. Now try that with 1 and 1: that works. So what happened is that there were a couple of parameters around the Y, and you started guessing what those parameters were and trying to fit them in to get that relationship. That's exactly what a neural network does, and that's exactly the process of training a neural network. When you train a neural network to try and pick out a relationship between numbers like this, all it's doing is guessing those parameters at random, going through each of the values, calculating which ones it got right and which ones it got wrong, calculating
how far it got them wrong by, and then trying to come up with new values that will be closer to getting more of them right. That's the process called training. So whenever you see training, and talk about needing lots of cycles for training, needing lots of GPU time for training, all the computer is doing is trying, failing, trying, failing, trying, failing, but each time getting a little closer to the answer.
So let's look at the code for that. I don't have the code on my laptop, so I've got to look back at the screen, sorry. Using TensorFlow and Keras, here's how I'm going to define a neural network to do that linear fitting in just a few lines. The first thing I'm going to do is create my neural network. This is the simplest possible neural network: it's got one layer with one neuron in it, and this is the code to do that. Where you see keras.layers.Dense, units equals 1, input shape equals 1, all that I'm doing is saying: I have a single neuron, I'm going to pass a single number into that, and you're going to try and figure out what the number I want to come out of that is. So very, very simple.
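The "try, fail, adjust" loop described above can be sketched in plain Python, with no library at all, to make the idea concrete. This is not what TensorFlow does internally line for line; it is a minimal sketch assuming mean squared error as the measure of "how wrong" and plain gradient descent as the way of choosing new guesses. The six data points are chosen to satisfy y = 2x - 1 (the talk states only the first few), and the starting guesses of zero and the learning rate of 0.01 are assumptions.

```python
# Six points on the line y = 2x - 1 (assumed; the talk lists only the first few).
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]


def train(xs, ys, epochs=500, learning_rate=0.01):
    """Fit y = w * x + b by repeatedly guessing, measuring the error, and adjusting."""
    w, b = 0.0, 0.0  # initial guesses for the two parameters (assumed start)
    n = len(xs)
    loss = None
    for _ in range(epochs):
        # How far off is each prediction with the current guesses?
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Loss: mean squared error over all points.
        loss = sum(e * e for e in errors) / n
        # Gradient of the loss with respect to each parameter.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # New guesses: step a little way in the direction that reduces the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss


w, b, loss = train(xs, ys)
prediction = w * 10 + b
print(f"w={w:.4f} b={b:.4f} loss={loss:.6f} prediction for x=10: {prediction:.4f}")
```

After 500 passes the parameters end up close to 2 and -1 without reaching them exactly, so the prediction for x = 10 is close to 19 rather than exactly 19.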
Then my next line of code. Remember I said that all a neural network is going to do is try and guess the parameters that will make all the numbers fit. So it will come up with a couple of rough guesses for these parameters, and then it has these two functions, one called a loss function and one called an optimizer. If you remember that set of six numbers I gave you, it's like saying: OK, if Y equals something times X plus something, I'm going to guess those two somethings, I'm going to measure how many of my Ys I got right, I'm going to measure how far I'm wrong on all of the ones that I got wrong, and then I'm going to try and guess new values for those somethings. The loss function is the part where it's measuring how far I got wrong, and the optimizer is saying: here's what I got the last time, I'm going to try to guess new parameters, and I'll keep going until I get Y equals 2X minus 1 or something along those lines.
So that's all you do: you just compile your model, you specify the loss function, you specify the optimizer. These are both really heavy, mathy kinds of things. One of the nice things about Keras and TensorFlow is that they're all done for you; you're just going to specify them in code. I'm going to say I'll try mean squared error as my loss function, and I'm going to try something called SGD, which is stochastic gradient descent, as my optimizer. Every time it loops around, it's going to guess new parameters based on those.
The next thing I'm going to do is feed my values into my neural network. So I'm going to say my Xs are going to be this array, minus 1, 0, 1, etc., and my Ys are going to be this array. Here I'm creating the data, and I just load them into a couple of arrays. This is Python code, by the way. Now all
that I'm going to ask my neural network to do is to try and come up with an answer, and I do that with the fit method. Here I just say: try and fit my Xs to my Ys. This epochs equals 500 means you're going to try 500 times, so it's going to loop 500 times like that. Remember I said it's going to guess those parameters, get it wrong, optimize, guess again, get it wrong, optimize. In this case, in my code, I'm just saying do that 500 times. At the end of those 500 times, it's going to come up with a model such that if I give it an X, it's going to give me what it thinks the Y is for that X. You do that using model.predict. So if I call model.predict for the value 10, what do you think it would give me? If you remember the numbers from earlier, Y is 2X minus 1. What do you think it would give? 19, right?
It doesn't. It'll give me something really close to 19; it gives me about 18.97, and I'm going to try to run the code in a moment to show that. But why do you think it would do that? What's that? It's predicting, and it's also being trained on very few pieces of data, right? With those six pieces of data it looks like a line, and it looks like a linear relationship, but it might not be. There's room for error there, given that I'm training on very, very little data. This could be a small part of a line that, for all we know, goes like this instead of being linear once you move out of those points. As a result, those kinds of things get factored into the model as it's training. So you'll see it's going to get a very close answer, but it's not going to be an exact answer. Let me see if I can get the code running.
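Putting the steps above together, the program being described probably looks something like the following. This is a reconstruction from the spoken description, not the speaker's actual slide: the exact array values beyond the first few are assumed to be six points on y = 2x - 1, and the input is declared with `tf.keras.Input(shape=(1,))` rather than the `input_shape` argument mentioned in the talk, on the assumption that this form works across more TensorFlow versions.

```python
import numpy as np
import tensorflow as tf

# One layer with one neuron, taking a single number as input.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),      # the talk passes input_shape to Dense instead
    tf.keras.layers.Dense(units=1),
])

# Loss function measures how wrong the guesses are; the optimizer picks new guesses.
model.compile(optimizer="sgd", loss="mean_squared_error")

# Six points assumed to lie on y = 2x - 1, shaped as one feature per row.
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]).reshape(-1, 1)

# Guess, measure, adjust, 500 times over.
model.fit(xs, ys, epochs=500, verbose=0)

# Ask the trained model what it thinks y is for x = 10.
prediction = float(model.predict(np.array([[10.0]]), verbose=0)[0][0])
print(prediction)
```

Because the starting weights are random, the printed value differs slightly from run to run, but it should land near 19 without being exactly 19.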
It's a little complex with this laptop: when I'm presenting, it's hard to move stuff over to that screen. Just one second. This requires some mouse-fu. All right, so I have that code. You can see that code I have now running up there, and if you look right at the bottom of the screen, over here, we can see where it has actually done the training. It's done 500 epochs' worth of training, and then when I called model.predict it gave me this answer, which is 18.976414. That was one that I ran earlier. I'm just going to try and run it again now, if I can, but it's really hard to see. Click that one arrow. This IDE is PyCharm, by the way. You see it ran very quickly, because it's a very simple neural network, and as a result I was able to train it through 500 epochs in whatever that is, half a second. What did it give me this time? 18.9747, is that what I see? So again, very simple neural network, very simple code, but it shows some of the basics of how it works. Next I want to get to a slightly more advanced example, once I get my slides back. Whoops. OK.
So that was very simple; that was hello world, right? We all remember our first hello world program. If you wrote it in Java it was like ten lines, if you wrote it in C# it was five lines, if you wrote it in Python it was one line, and if you wrote it in C++ it was like 300 lines. Do you remember that? I remember Petzold's book, Programming Windows. Anybody ever read that? The whole first chapter was how to do hello world in MFC, and it was like 15 pages long. I thought it was great. That was a pretty easy example; that, to me, is the hello world of machine learning, just doing that basic linear fitting. But let's think about something more complicated.
Here are some items of clothing. As a human, you're looking at these items of clothing and you've instantly classified them; you instantly recognize them, or at least, hopefully, most of them. But think about the difficulty for a computer to classify them. For example, there are two shoes on this slide, right? One is the high-heeled shoe in the upper right, and one is the sneaker in the second row. They look really different from each other, other than the fact that they're both red and they both vaguely fit a foot. With the high heel, obviously your foot has to change to fit it, and with the sneaker the foot is flat. But as humans, our brains automatically recognize this and we see these as shoes. Or look at the two shirts in the image. One of them doesn't have arms, because
we automatically see it as being folded, the one with the tie. And the green one on the lower left, we already know it's a shirt, a T-shirt, because we recognize it as such. But think about how you would program a computer to recognize these things given the differences. It's really hard when you think of the difference between a high-heeled shoe and a sneaker, for example.
So there's actually a data set called Fashion MNIST. It has 70,000 items of clothing, labeled in 10 different classes, from shirts to shoes to handbags and all that kind of thing, and it's built into Keras. One of the really neat things that came out of the research behind this, by the way, is that the images are only 28 by 28 pixels. It's faster to train a computer if you're using less data. You saw how quickly I trained with my linear example earlier on. If I were to try and train with high-definition images of handbags and that kind of stuff, it would still work, but it would just be slower. A lot of the research that's gone into this data set has shown how to train a neural network where all you need is a 28 by 28 pixel image to be able to tell the difference between different items of clothing, as you're probably doing right now. You can take a look and see which ones are pants, which ones are shoes, which ones are handbags, that kind of thing. So this allows us to build a model that is very, very quick to train. Here's an example of one item of clothing in 28 by 28 pixels, and you automatically recognize that, right? It's a boot or a shoe or something along those lines. This is the kind of resolution of data that's all you need to build an accurate classifier.
So let's look at the code for that. If you remember, earlier on the code that I was building was:
I created the neural network, I compiled the neural network by specifying the loss function and the optimizer, and then I fit it. In this case, it's a little bit more complex. Your code is going to look like this: you're going to import TensorFlow, and from TensorFlow you're going to import the Keras namespace, because the Keras namespace really nicely gives you access to that Fashion MNIST data set. Think about all the code that you'd typically have to write to download those 70,000 images, download their labels, match each label with an image, load all of that, and that kind of stuff. All that coding is saved and just put into these two lines of code. That's one of the neat things about Python that I find makes Python great for machine learning, because that second line of code there, where it's like: bracket, train images, comma, train labels, bracket, comma, bracket, test images, test labels, bracket, equals
fashion_mnist.load_data(). What that's actually doing is loading data from the data set, which is stored in the cloud. It's sorting those 70,000 items of data into four sets, and those four sets are split into two groups, one for training and one for testing. The one on the left there, the training data, is 60,000 images and 60,000 labels, and the other set is 10,000 images and 10,000 labels that you're going to use for testing. Now, can anybody guess why you would separate them like this? Why would you have a different set for testing than you would have for training? The clue's in the name, right? How do you know your neural network is going to work unless you've got something to test it against? Earlier, we could test our linear example by feeding 10 in, because I know I'm expecting 2X minus 1 to give me 19. Now it would be great for me to be able to test against something that's known, something that's labeled, so I can measure the accuracy as I go forward. That's all I've got to do in code.
So now let's look at how we actually define the neural network. Oh, sorry, before I do that: the training images are things like the boot that I showed you earlier on, 28 by 28 pixels. The labels are actually just going to be numbers rather than a word like "shoe." Why do you think that would be? So you can define your own labels and you're not limited to English, right? For example, 9 in English could be "ankle boot." The second one is in Chinese, the third one is in Japanese, and the fourth language, can anybody guess? Bróg rúitín. That's actually Irish Gaelic. Sorry, I'm biased, I have to put some in. So right now, for example, I could build a classifier not just to give me items of clothing but to do it in different languages. That's just what my labels are going to look like.
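Since the labels are plain numbers from 0 to 9, turning a label into a word is just a lookup that you define yourself. A minimal sketch follows; the English names are the ones published with the Fashion MNIST data set, while the helper name `label_to_name` is hypothetical and only for illustration. The same pattern would work with a list of names in any other language.

```python
# English names for the ten Fashion MNIST classes, indexed by label number.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_to_name(label: int, names=CLASS_NAMES) -> str:
    """Translate a numeric label into a human-readable name (hypothetical helper)."""
    return names[label]


# Label 9 is the ankle boot used as the example in the talk.
print(label_to_name(9))
```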
Now let's take a look at the code for defining my neural network. Here, if you remember the first line of code, where I defined the single layer with a single neuron: for the classification, this is what it's going to look like, and this is all it takes to build this clothing classifier. So you see there are three layers here. The first layer, where it says keras.layers.Flatten, input_shape equals 28 by 28: all that is, is I'm defining a layer to take in 28 squared values. Remember, the image is a square of 28 by 28 pixels, but you don't feed a neural network with a square, you feed it with a flat layer of values. In this case the values are between 0 and 255. My second layer is a layer of neurons, and there's an activation function on them, which we'll explain in a moment. And then my third layer is going to be ten neurons.
Why do you think there are ten in that one? Can anybody guess? So 28 squared for the input, 10 for the output. Anybody remember where the number 10 was mentioned? Yeah, the number of labels: there are 10 different classes. So what happens when you train a neural network like this one is it's not just going to give you an answer and say this is number three or this is number four. Typically what will happen is that you want to have ten outputs for your ten different labels, and each output is going to give you a probability that it is that label. So for example, the boot that I showed earlier on was label number nine, so neuron zero is going to give me a very low number, neuron one is going to give me a very low number, neuron two is going to give me a very low number, neuron nine is going to give me a very high number, and then by looking at the outputs across all of these neurons I can determine which one the neural network thinks it's classified for.
Remember, we're training this with a bunch of data. So I'm giving it a whole bunch of data to say this is what a number nine looks like, this is what a number four looks like, this is what a number three looks like. By saying okay, this is what they are, I encode the data in the same way, and as a result we'll get our output like this. Now every neuron has what's called an activation function. The idea behind that is a very mathy kind of thing, but in programmer's terms, the tf.nn.relu that you see there, if you think about it in terms of code: if x is greater than zero, return x, else return zero. Okay? A very simple function, and that's what the ReLU is. All that's going to do is, as the data is being filtered in and then down into those neurons, all of the stuff that's negative just gets filtered out. As a result it makes it much quicker for you to train your neural network, by getting rid of things that are negative, getting rid of things you don't need. So every time you specify a layer in a neural network there's usually an activation function like that. ReLU is one of the most common ones that you'll see, particularly for classification things like this. But again, the ReLU is a very mathy thing. A lot of times you'll go to the documentation, you'll wonder what ReLU is, you go look it up, and you'll see a page full of Greek letters. I don't understand that stuff. So for me, something like ReLU is as simple as: if x is greater than zero, return x, else return zero.
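The two ideas above can be sketched in a few lines of plain Python: ReLU as the "if x is greater than zero" rule, and reading the ten outputs by picking the largest one. The numbers in `weighted_sums` and `outputs` are made up for illustration, and `most_likely_label` is a hypothetical helper name, not part of TensorFlow.

```python
def relu(x):
    """Rectified linear unit: keep positive values, replace the rest with zero."""
    return x if x > 0 else 0


def most_likely_label(probabilities):
    """Return the index of the output neuron with the highest value."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Made-up values arriving at three neurons before the activation is applied.
weighted_sums = [-2.0, 0.0, 3.5]
activations = [relu(value) for value in weighted_sums]
print(activations)  # [0, 0, 3.5]

# Made-up outputs of the ten final neurons for an image of an ankle boot.
outputs = [0.01, 0.00, 0.02, 0.01, 0.00, 0.03, 0.01, 0.02, 0.00, 0.90]
print(most_likely_label(outputs))  # 9
```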
Right, so now I've defined my neural network, and the next thing I'm going to do, you'll see the same code as we saw earlier on, is compile my neural network. In compiling my neural network I've got to specify the loss and I've got to specify the optimizer. Now there's a whole bunch of different types of loss functions, and there's a whole bunch of different types of optimizer functions. When you read academic research papers around AI, a lot of them specialize in these, to say for this type of problem you should use a loss function of sparse categorical cross-entropy because X, or for this type of problem you should use an optimizer which is an Adam-based optimizer because X. A lot of this, as a programmer, you just have to learn through trial and error. I could specify the same loss function and the same optimizer that I used for my linear one, then try and train my neural network, see how accurate it is, see how quick it is, and then I could try these ones, see how accurate it is, see how quick it is. There's a lot of trial and error in that way, and understanding which ones to use right now is an inexact science. It's a lot like, for example, as a traditional coder: which is better, using a for loop or a do loop? Which is better, using a while or using a wend? Those types of things. As a result, as you're building your neural networks there's a lot of trial and error that you'll do here, but reading academic papers can certainly help, if you can understand them.
So in this case, for Fashion MNIST, after a bit of trial and error we ended up selecting, for the tutorial, these two functions, but as you read through the documentation you'll see all the functions that are available. So in this case I'm training it with an Adam optimizer. And remember the process of training: every iteration it will make a guess. It says, okay, this piece of data, I think it's a shoe. Okay, it's not a shoe, it's a dress. Why did I get it wrong? I'll use my loss function to calculate where I got it wrong, and then I'll use my optimizer to change my weights on the next loop to try and see if I can get it better. This is what the neural network is thinking; that's how it works as you're actually training. So in this case the Adam optimizer is what it's using to do that optimization, and sparse categorical cross-entropy is what it's using for the loss.
So now if I train it, it's the same thing that we saw earlier on: model.fit. So all I'm going to say is, hey, model.fit, I'm going to train it with the input images and the input labels, and in this case I'm going to train it for five epochs. Okay? That epochs number, it's up to you to tweak it. What you'll do is, as you're training your network and as you're testing your network, you'll see how accurate it is. Sometimes you can get the process called converging, which means it gets more and more accurate. Sometimes you'll find convergence in only a few epochs, sometimes you'll need hundreds of epochs. Of course, the bigger and more complex the dataset and the more labels that you have, the longer it takes to actually train and converge. But with Fashion MNIST, using the neural network that I defined in the previous slide, five epochs is actually pretty accurate. It gets there pretty quickly with just five.
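The define, compile, and fit steps described above might look roughly like the following Keras sketch. It is a reconstruction under stated assumptions, not the speaker's slide code: the hidden layer size of 128 and the softmax activation on the output layer are assumptions (the talk only says the ten outputs behave like probabilities), and the function name `train_fashion_model` is hypothetical. Calling the function needs TensorFlow installed and downloads the dataset.

```python
HIDDEN_UNITS = 128  # assumption: the talk does not state the hidden layer size
EPOCHS = 5          # the talk trains for five epochs


def train_fashion_model(epochs=EPOCHS):
    """Build, compile, and fit a small Fashion MNIST classifier."""
    # Imported inside the function so defining this sketch does not need TensorFlow.
    import tensorflow as tf

    # 60,000 training and 10,000 test images, each with an integer label 0..9.
    (train_images, train_labels), (test_images, test_labels) = (
        tf.keras.datasets.fashion_mnist.load_data()
    )
    # Scale pixel values from 0..255 down to 0..1.
    train_images = train_images / 255.0

    model = tf.keras.Sequential([
        # Flatten each 28x28 image into 784 input values.
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(HIDDEN_UNITS, activation=tf.nn.relu),
        # One output per clothing class; softmax is an assumption.
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    model.fit(train_images, train_labels, epochs=epochs)
    return model
```

Changing `epochs`, the optimizer string, or the loss string is how the trial and error mentioned above would be carried out in practice.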
Okay. And now if I want to test it, the model itself, again, the important object here is the model object. So if I call model.evaluate and I pass it the test images and the test labels, it will then iterate through the 10,000 test images and test labels. It will calculate, it will say I think it's going to be this, it will compare it with the label. If it gets it right it improves its score, if it gets it wrong it decreases its score, and it gives you that score back. So the idea here is, remember earlier when we separated the data into 60,000 for training and 10,000 for test: instead of you manually writing all that code to do all that, you can just call the evaluate function on the model, pass it the test stuff, and it will give you back the results. It will do all that looping and checking for you.
All righty. And then of course if I want to predict an image, if I have my own images and I've formatted them into 28 by 28 grayscale and I put them into a set, now I can just say model.predict(my_images) and it'll give me back a set of predictions. Now what do those predictions look like? For every image, because the output layer of the neural network had 10 neurons, every image is going to give you back a set of 10 numbers. And those 10 numbers, as I mentioned earlier on, nine of them should be very close to zero and one of them should be very close to one, and then using the one that's very close to one you can determine your prediction to be whatever that item of clothing is.
So if I demo this and show it in code, let's see, go back here. It's really hard to see it, so forgive me. Oops. I'm going to select fashion. Yeah, I really need a mouse. Let's select fashion. Okay, can you see the fashion code, or is it still showing the linear code? Is that fashion right there? All right. Okay, did I just close it? I'm sorry, it's really hard to see. So let me go back. Is that one fashion? Up one. All right, that one. Okay.
So here's the code that I was showing on the earlier slide. This is exactly the same code that was on my slides. I'm just going to go down. There's one thing I've done here that I didn't show on the slides, and that was: the images themselves were grayscale, so every pixel was between 0 and 255. For training my neural network it was just easier for me to normalize that data, so instead of it being from 0 to 255 it's a value from 0 to 1, which is relative to the actual top value, and that's what those two lines of code there do. And that's one of the things that makes Python really useful for this kind of thing, because train_images is a set of 60,000 28 by 28 images and I can just say divide that by 255, and that normalizes it for me. So that's one of the things that makes Python really handy in data science. But you can see it's just the same code. So I'm going to do a bit of live audience participation.
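The normalization step can be spelled out in plain Python to show what the division does to each pixel. This is only an illustrative sketch: `normalize` is a hypothetical helper and `tiny_image` is a made-up 2 by 2 stand-in for a 28 by 28 image. In the demo the images are arrays, so a single expression such as `train_images / 255.0` applies the same division to every pixel of every image at once.

```python
def normalize(image, max_value=255.0):
    """Scale every pixel of a 2-D grayscale image into the range 0.0 to 1.0."""
    return [[pixel / max_value for pixel in row] for row in image]


# Made-up 2x2 image with pixel values in the original 0..255 range.
tiny_image = [[0, 51], [102, 255]]
scaled = normalize(tiny_image)

for row in scaled:
    print(row)
```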
Hopefully I can get it to work with this. So remember I said there are 10,000 testing images? Okay, so somebody give me a number between 0 and 9,999. Don't be shy. What's that? 27? Okay. So hopefully I can see it so I can get it. That's not 27, is it? Okay, 27. And here, 27. I tested it earlier with a value of 4560. So what's going to happen here is that I'm going to train the neural network to identify those pieces of clothing, so it'll be able to identify pieces of clothing. I have no idea what piece of clothing number 27 is in the test set, but what it's going to do once it's done, by the end, you'll see it says print the test label for 27. So whatever item of clothing 27 is, there's a pre-assigned label for that, and it'll print that out. And then the next thing it will do is print out what the predicted label will be, and hopefully the two of them are going to be the same. There's about a 90% chance, if I remember right from this one, that they will. So if I run it, it's going to take a little longer than the previous one. Now we can see it's starting to train the network, and because I'm doing it in PyCharm I can see it in my debug window. So you can see the epochs: epoch two, epoch three, epoch four. This accuracy number here is how accurate it is against testing, so it's about 89% correct. And then you see it's actually printed two numbers below, and they're both zero. That means for item of clothing number 27, the class was zero, and the predicted class was actually also zero. So it got it right, yay.
Anybody want to try one more, just to prove that, see if we can? What's that? 42. I love it. That's the ultimate answer, but what is the question? Okay, 42. I'm guessing 42 is probably also item zero, but let's see. Hopefully I haven't broken any of the bracketing. Let me run it again. Because it's running all of the code, it's just going to train the network again. Okay.
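The check the demo performs (print the known label for one test item, then the label the model predicted) and the overall accuracy figure can be sketched without TensorFlow. Everything here is illustrative: `confident_output` fabricates stand-in network outputs, and `predictions` and `test_labels` are made-up substitutes for what `model.predict(test_images)` and the dataset would supply.

```python
def predicted_label(probabilities):
    """Index of the largest of the ten outputs, i.e. the most likely class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def confident_output(label, classes=10):
    """Made-up network output: high for one class, low for all the others."""
    return [0.91 if i == label else 0.01 for i in range(classes)]


def accuracy(predictions, labels):
    """Fraction of items whose predicted label matches the known label."""
    correct = sum(
        1 for probabilities, label in zip(predictions, labels)
        if predicted_label(probabilities) == label
    )
    return correct / len(labels)


# Hypothetical stand-ins for model.predict(test_images) and test_labels.
predictions = [confident_output(0), confident_output(6), confident_output(9)]
test_labels = [0, 3, 9]

index = 0  # the item chosen by the audience would go here
print(test_labels[index], predicted_label(predictions[index]))  # 0 0
print(accuracy(predictions, test_labels))
```

In this made-up data the second item is misclassified, so two of the three predictions match their labels.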
There's epoch two, epoch three. Hello. There we go. So remember earlier I said I'm just training it for five epochs; that just makes it a little bit quicker. And if you look at the convergence, on epoch one it was 82% accurate. Oh, I got it wrong for 42. It predicted it would be a six, but it's actually a three. But the first epoch, you see this accuracy figure, .8245, that means it calculated it was 82% accurate. The second epoch, 86% accurate; the third, 87; all the way down to the fifth, at 89. I could probably train it for five hundred epochs, but we don't have the time, and then it might be more likely to get number 42 correct. And thanks, Mr. Douglas Adams, that you've actually given me one that doesn't work, so I can come back and test it. Okay.
So that's Fashion MNIST and how it works. Hopefully this was a good introduction for you to really the concept, from a programmer's perspective, of what machine learning is all about. I always like to say in talks that if you only take one slide away from this talk, if you've never done machine learning or you want to get into programming machine learning, take this one here, because this is really what the core of the revolution is all about. Hopefully the code that I showed you demonstrates that machine learning is really all about taking answers and data and feeding them in to get rules out. I didn't write a single line of code there today that says this is a t-shirt, or this is a jacket, or this is a handbag. You know, this has sleeves; if it has sleeves then it's a t-shirt, if it has heels then it's a shoe. I didn't have to write any of that kind of code. I just trained something on the data using the below part of the diagram: feeding in answers, feeding in data, building a model that will then infer the rules about it. So with that I just want to say thank you very much, and I hope you enjoy the rest of the conference.