Art + AI - Pittsburgh ML Summit '19
Hi everyone, it's really exciting to be here. The last time I was in Pittsburgh was a really long time ago. I went to school here in Pittsburgh, so, go Tartans. Right now I live in New York and I never get a chance to come back and see everyone, so it's really nice to see all of you today. My name is Victor Dibia. I'm a machine learning GDE (Google Developer Expert), and what that means is I spend some time writing articles about machine learning, contributing to the community, and having nice little conversations and talks like this one. The rest of my work, my day job actually, is as a research engineer at Cloudera Fast Forward Labs. We're an applied machine learning research and development group based in Brooklyn, New York. So that's my day job. The night job is a different story: at night I pretend that I'm an artist, and I spend my time using neural networks to try to generate art. So here we are. The title of my talk is Art and AI, and I'll talk about my work generating African mask art using generative adversarial networks. It's not an extremely technical talk; the idea is to get everyone excited about the opportunities that lie in this space.

A little bit of background on why art and AI, why this particular intersection is really interesting to me. It's a little bit personal. I grew up in Nigeria, a country in West Africa, and one of my fondest memories from my childhood is that once every year, typically in December, my family and my extended family, my uncles, aunts, and their kids, would all make this trip back to our village, and during that period we would spend time just hanging out.
We would go through a few activities, and one of the highlights of that event was the acrobatic masquerade dance of eastern Nigeria. What happens there is that you see people dressed in interesting attires like this, wearing all these interesting masks, doing all these flips and acrobatic dances. As a kid I was really fascinated by African mask art, and now, as a machine learning engineer, this kind of work is one way to explore that fascination a little bit more.

There are a few things you'll notice if you've ever seen African mask art. In the diagram on the right there are two examples of mask designs from Africa: the first is a bronze mask from the Benin kingdom in Nigeria, and the second, on the right, is a Chokwe mask from Congo. A few things stand out about them right out of the box. The first is that they're typically fierce, and the main reason is that they're supposed to represent some kind of deity, or ancestors, or mythological beings that typically have power over humanity. The second thing you'll notice is that they're pretty sophisticated: you see them in a range of materials, from wood to clay, bronze, and iron, and they typically have these really interesting, complex patterns in them.
The third thing is that they're surprisingly ancient: archaeological excavations and research show that many of these masks date as far back as the ninth century BC, which is pretty interesting. And finally, they're functional. As opposed to artistic pieces that are just consumed for their visual excellence, these things actually represent identity: just like you wear clothes and accessories as a reflection of identity, African masks are designed that way. They also represent communal memory, so some of these masks represent things like planting crops, harvests, wars, and all that interesting stuff. They have this very rich background.

So why did I go ahead and do all of this? The first reason, like I said, is personal: it's a way for me to bridge my interests in art and technology, express my identity, and look at that intersection. The second reason is that if any of you have worked with generative adversarial networks, you'll have seen that there is this growing genre of artistic pieces created using neural networks, and the interesting thing is that that space is typically dominated by classical European art. You see artistic expressions using datasets of images from Van Gogh, Rembrandt, and Picasso, and a project like this is an opportunity to diversify that space. Then finally, as a researcher working in machine learning: there's all this really good work around creating generative models, and the datasets we have today, really good datasets like CelebA and CelebA-HQ, serve as good benchmarks for evaluating the results of the neural architectures that come out. So part of this project is, at some point when we understand the dataset really well, to contribute it as a potentially good benchmark for evaluating generative models.
That's the motivation. So, what do the results look like? The image you see right here is an image of things that look like masks, but the interesting thing is that none of them are real. All of these are examples of masks generated using a deep convolutional generative adversarial network (DCGAN). This is an example of what the results of this exploration can look like.

OK, so how did I go ahead and build this little project? First, some of the related fields that surround this area. The first is generative art, or computational creativity, or code art.
This is a fairly old and robust field, typically described as art that, in whole or in part, is created using some kind of autonomous system. Over the years there have been some pretty good tools out there. How many of us have used Processing? OK. Processing is basically a programming language that helps you generate patterns and artistic expressions. We've also seen things like p5.js. The interesting thing about tools like these is that they're basic programming tools: you have some kind of artistic vision, you write some code, and the tools help you realize that vision.

Neural networks take a slightly different approach. In this case, rather than you specifying the rules, your dataset becomes the primary input to the creative process. Imagine I wanted to generate some nice-looking patterns of circles. If I was going to take the traditional approach, I would write some code, say in Processing, that helps me generate all of that, put in some parameters, and see what the results look like. On the other side, if I was going to use neural networks, the idea would be to collect a few thousand examples of circles, get the neural network to learn what that distribution looks like, and then sample from that distribution to create new circles. It's not really intuitive, and it sounds like a lot more work, but it turns out that if the thing you want your autonomous tool to do is really complex, for example generating things like mask art, it's hard to write rules for that, and so generative models are a really good choice.

The process is typical of any machine learning project. You start out by collecting your dataset, and in this case the dataset needs to be carefully curated, especially because it's a fundamental part of the process.
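To make that rule-based versus learned contrast concrete, here is a minimal sketch of the rule-based side in Python (standing in for Processing). Every property of the pattern comes from explicit parameters the artist chooses, not from data; the function name and parameters are purely illustrative.

```python
import random

def generate_circle_pattern(n_circles=50, width=400, height=400,
                            min_r=5, max_r=40, seed=42):
    """Rule-based generative art: every design decision is an
    explicit parameter chosen by the artist, not learned from data."""
    rng = random.Random(seed)
    circles = []
    for _ in range(n_circles):
        circles.append({
            "x": rng.uniform(0, width),
            "y": rng.uniform(0, height),
            "r": rng.uniform(min_r, max_r),
        })
    return circles

pattern = generate_circle_pattern()
print(len(pattern))  # 50
```

The neural-network route replaces this explicit rule set with a dataset of example circles and a model that learns their distribution.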
The second part is training the model, and here we use a generative adversarial network; in the example I'll show, a deep convolutional generative adversarial network. And then finally there's some work to evaluate the results you come up with: do these results make sense, is the model learning well, is the network memorizing its inputs, all that good stuff.

The first piece is data curation, or data collection, which I guess is the best part of the process. Do we agree that data collection and cleaning is the best part of the process? Yeah, I see someone who really enjoys data collection. We all know that to get really good results you need really good data, and this step tends to eat a huge amount of your time. For this project it was all around data scraping and, in some cases, manual image downloads. The next thing is to carefully hand-curate every single image: you remove vector images, because we want actual photographs of masks; you remove images that are incorrectly labeled; you remove masks that have nothing to do with Africa. And then finally you need ways to identify duplicates; there are some semi-automatic ways to do this, but I won't discuss that here. At the end of this really nice process, over a couple of weeks, I ended up with about 11,000 curated images, and that became the nucleus for the experiments performed here. The image on the right shows examples of what some of these masks actually look like, and the range is really diverse.

OK, so now that we have our data, the next thing is that we need to come up with a generative adversarial model, and the goal is to create something that tries to understand what a mask looks like, the distribution of African masks.
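On the duplicate-detection step the talk skips over: one common semi-automatic approach is perceptual hashing, where near-duplicate images produce hashes with a small Hamming distance. This is a sketch of an average hash over a grayscale image given as a 2D list of 0-255 values; it is an assumed technique for illustration, not necessarily the one used in the project.

```python
def average_hash(pixels, hash_size=8):
    """Average hash: block-average the image down to hash_size x
    hash_size, then threshold each block against the overall mean."""
    h, w = len(pixels), len(pixels[0])
    small = []
    for by in range(hash_size):
        row = []
        for bx in range(hash_size):
            ys = range(by * h // hash_size, (by + 1) * h // hash_size)
            xs = range(bx * w // hash_size, (bx + 1) * w // hash_size)
            vals = [pixels[y][x] for y in ys for x in xs]
            row.append(sum(vals) / len(vals))
        small.append(row)
    mean = sum(sum(r) for r in small) / (hash_size * hash_size)
    return [1 if v > mean else 0 for r in small for v in r]

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(a, b))

# Two identical 64x64 "images" hash to distance 0.
img = [[(x * y) % 256 for x in range(64)] for y in range(64)]
print(hamming(average_hash(img), average_hash(img)))  # 0
```

In practice you would flag pairs whose Hamming distance falls below a small threshold and review them by hand.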
Then, at test time or generation time, we can get it to generate stuff that's new and novel. For that I used the basic generative adversarial network formulation, and this stuff, I find it really exciting and really interesting. It's an arrangement of two neural networks that play a competitive skill game. It has two parts: the first is the generator, and the second is the discriminator. What the generator ends up doing is that it learns to map a noise vector of some specific length to some candidate in the distribution of images. So it takes as input some noise, and its output is actually a generated, fake image.

That's the first part of the network. The second piece is called the discriminator, and it has one simple task: it takes in an image created by the generator, something that possibly doesn't exist, and at the same time it takes in an image from the actual dataset, the dataset of real images.
Its task is to decide: is this real or fake? Real meaning that it comes from the actual distribution, and fake meaning that it's something synthesized by the generator.

Essentially, the way this works is that at training time, for each image that comes in to the discriminator, we look at its prediction: does it get it right, real or fake? Then we take that signal, backpropagate it, and use it to update both the discriminator network and the generator network. The idea is that at the end of the training process, if everything works well, we have a generator that has become really good at generating images that the discriminator can't tell apart from real ones, and a discriminator that has become really good at telling real from fake.

One of my favorite analogies here is the analogy of the thief and the policeman. We start out with a thief who is really bad at his job; he's an amateur. He creates a fake painting, and right out of the box the policeman can spot that it's a fake: this doesn't exist, you're going to jail. So the thief goes to jail, and in jail he meets the Godfather. Because the Godfather has a really good network, he has people in the police department, he can tell the thief: this is the reason why you were caught. So the thief serves his time, he gets released, and what does he do next? He's like, oh, you know, I'm not the same thief who went to prison, I'm much better now. He goes out, he creates a new fake, and this time it takes the police a while to figure it out. This game continues, and at the end of the process we have a thief who has become really good at generating fakes, and a policeman who has become really good at telling these things apart. Hopefully they don't spend all of their lives doing that; at some point they'll shake hands and say goodbye, but I don't know.
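The game that analogy describes corresponds to the standard GAN objective. Below is a compact sketch of the two losses; the non-saturating generator loss is the variant commonly used in practice (an assumption, since the talk doesn't specify which form was used).

```python
import math

def gan_losses(d_real, d_fake):
    """Standard GAN losses, given the discriminator's outputs:
    d_real = D(x), its probability that a real image is real;
    d_fake = D(G(z)), its probability that a generated image is real."""
    eps = 1e-12  # avoid log(0)
    # The discriminator wants D(x) -> 1 and D(G(z)) -> 0.
    d_loss = -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))
    # The generator wants D(G(z)) -> 1 (non-saturating form).
    g_loss = -math.log(d_fake + eps)
    return d_loss, g_loss

# Early in training the discriminator easily spots fakes:
d_loss, g_loss = gan_losses(d_real=0.9, d_fake=0.1)
print(d_loss < g_loss)  # True: the generator has the most to learn here
```

Both losses are backpropagated each step, the first updating the discriminator's weights and the second the generator's, which is exactly the thief-and-policeman loop.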
The image you see on the left is just a standard visualization of GAN training: we start with random noise, so the generator starts by generating really bad images, and as training progresses you start to see that it becomes really good at generating images. For some reason I'm getting some permission issues here.

So how did I implement and run these experiments? I started with a DCGAN implementation written in TensorFlow, and the main modification I had to make was to write a custom input pipeline, essentially to ensure that my curated images could flow through that network. The second thing I did was to modify the generator and discriminator networks to generate 64 px and 128 px images respectively; typically that's just about adding more capacity to both of these networks. And then finally I ran all these experiments on TPUs. The process is typically straightforward; the only challenge I had, which was more about my own mistakes, was that it turns out the sample code for the TPU implementation represents an image matrix in a really unusual way. Typically I would represent an image matrix by the width, the height, and then the number of channels, say 64 by 64 by 3, but this code used 3 by 64 by 64. What ended up happening was that the first couple dozen experiments just output nonsense, until I hunted down that little problem and fixed it.

So what does it take to actually train a model on a TPU? The first thing you need to do is log in to the Google Cloud console; hopefully you already have some access to a TPU. You use this really nice command-line tool they designed, called ctpu, and you just run that; typically it walks you through the process of setting up a Compute Engine VM for the project.
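As an aside, that layout bug is the classic channels-last (HWC) versus channels-first (CHW) mismatch. A small NumPy sketch, shapes only, for illustration:

```python
import numpy as np

# Channels-last layout: height x width x channels, which is what most
# image-loading code produces.
img_hwc = np.zeros((64, 64, 3))
print(img_hwc.shape)  # (64, 64, 3)

# The TPU sample code instead expected channels-first:
# channels x height x width.
img_chw = np.transpose(img_hwc, (2, 0, 1))
print(img_chw.shape)  # (3, 64, 64)

# Feeding one layout into a pipeline written for the other silently
# scrambles the pixels, which is why the early runs produced nonsense.
restored = np.transpose(img_chw, (1, 2, 0))
print(restored.shape)  # (64, 64, 3)
```

A single transpose in the input pipeline is enough to fix it, but nothing errors out if you forget, so the failure mode is garbage outputs rather than a crash.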
Typically that Compute Engine VM will be attached to a TPU pod or a TPU machine that's available. The next thing it helps you do is SSH into that machine, which can access your TPU; it's pretty much a straightforward process. And then finally you set your environment variables. Essentially what I have here is my training script, and you run it exactly the way you would run it on your local machine: you point it to where your input data is and your model training parameters, and if you wanted your training script to write its output to some storage bucket, you could specify all of that. So it's an easy, straightforward process.
So what are the pros and cons of actually using TPUs? The pros: first, it's easy to set up. Second, it's pretty fast, so it's easy to automate and perform a lot of experiments when you use TPUs. And the final thing is that it's available: there's something called the TensorFlow Research Cloud for students and researchers. The good thing is that, I think for most of you, this is available for free, and if you want more information on how to actually access it, I'm happy to have a conversation with you about that. The limitation is that, at least until recently, you needed to rewrite parts of your code to actually run on TPUs. There's this thing called TPU Estimators, and you need to build your neural network and your training loop using TPU Estimators, so if you haven't used them before, your code won't just translate and run automatically. The good part is that with the release of TensorFlow 2.0, there's work to finalize support for Keras models out of the box. I think it's super close; the main challenge right now is that a lot of the TPU VMs don't have images that support TF 2.0 out of the box, but I think in the next month or so you should be able to run Keras models out of the box on TPUs.

So at this point I have this model that can generate 64 px by 64 px images. The problem is that in a high-resolution world, 64 px is still super small, and if you try to generate something larger, 128 pixels or 256 pixels, you run into a known problem with generative adversarial networks: they become really unstable to train, and they suffer from a problem called mode collapse, which is essentially a failure where the network just generates the same thing over and over again and there's no diversity.
So the idea is: can we use neural networks, maybe some other type of neural network, to solve the small-resolution problem? It turns out that yes, we can. You can use something called super-resolution GANs. It follows the same generator and discriminator pattern, but instead of mapping input noise to generated images, what the generator tries to do is map a low-resolution version of an image to a high-resolution version of the same image. At the end of the day, G learns to super-resolve its inputs, and D learns to distinguish between super-resolved images from G and the original high-quality images. It achieves this by using a perceptual loss, which is a combination of both the traditional adversarial loss and some kind of content loss. I'm happy to talk about the details of this at some other point.

So does this really work? What do the results look like? I'll just flow through a few examples that I thought were really interesting. I guess, as an artist, if I was a good artist I should name all of these pieces, but I haven't done that, so I guess that's on my to-do list. What you see right now on the left is a 64 by 64 image generated by the GAN, and on the right is a 6x super-resolved version, using another generative adversarial network, this time a tool from Topaz Labs. The interesting thing to notice here is that there are details on the right that pretty much don't exist on the left, but this model has been able to hallucinate, or complete, that image in a way that actually makes sense, and it looks interesting. This is a really cool example, and these things look like interesting artistic pieces to me; I guess that's up for debate, but they look really interesting to me.
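For reference, the perceptual loss mentioned above can be sketched compactly, following the SRGAN formulation: a content loss over feature representations plus a small adversarial term. The feature vectors here are tiny stand-ins for the VGG feature maps the actual method uses, and the weights are illustrative assumptions.

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def perceptual_loss(feat_sr, feat_hr, d_sr,
                    content_weight=1.0, adv_weight=1e-3):
    """SRGAN-style perceptual loss.
    feat_sr, feat_hr: feature representations of the super-resolved
      and original high-res images (VGG feature maps in SRGAN).
    d_sr: discriminator's probability that the SR image is real."""
    eps = 1e-12
    content = mse(feat_sr, feat_hr)      # content loss
    adversarial = -math.log(d_sr + eps)  # adversarial loss
    return content_weight * content + adv_weight * adversarial

# Perfect reconstruction that also fools the discriminator: near-zero loss.
print(perceptual_loss([1.0, 2.0], [1.0, 2.0], d_sr=1.0) < 1e-6)  # True
```

The content term keeps the upscaled image faithful to the original, while the small adversarial term pushes it toward the manifold of realistic-looking images, which is what produces the hallucinated detail.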
These are other examples, some of my favorite examples; I really should add names to them. So at this point, as a good research scientist, I have these really nice images, and I guess the thing on my mind is: are the images actually novel? Has the GAN actually come up with new African art pieces, or has it just memorized the input data? Other questions are: what kind of representations has the GAN learned, and how can we actually explore this space? To do that, I built a tool, something I'm calling an algorithmic art inspection interface. What happens here is that at the bottom you see all of the images, or some of the images, generated by the GAN, and if you select any one of them, it shows the selected image on top.
On the right-hand side of that, it shows all of the images in the original dataset that are most similar to that image. It provides an interface to compare: is the generative model just memorizing the input dataset, or is it really doing something novel and interesting?

The other question is how you actually compute similarity. In some papers people do things like pixel-wise mean squared error; that's not a good idea. Another approach, which is what I actually use here, is to use features extracted from a pretrained convolutional neural network. Here I take two layers from VGG16: an early layer that focuses on low-level features, its first convolutional layer (or the pooling layer after its first convolutional block), and also the last pooling layer. I use both of them to compute similarity, using cosine distance as the metric. Through these explorations you start to learn a few things: the GAN has learned to generate oval images, masks with oval shapes, and sometimes it focuses on masks with hair-like projections. This tool is actually available online on GitHub, so you can go and tweak around with it.

So finally, for reflections: I think one of the things I learned is that, as this whole field of generative art proliferates, as we see neural networks actually generate art, it's important that we explore tools that help us tell: is this thing really novel? What is the visual quality like? And it's important not just to build these tools, but also to find metrics to evaluate novelty.

The other interesting thing is: who has agency here?
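The similarity search behind that inspection interface can be sketched simply. The feature vectors below are tiny stand-ins for the flattened VGG16 activations described above; in the real tool each dataset image would have a precomputed feature vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two flattened feature vectors
    (cosine distance is simply 1 minus this value)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def most_similar(query_feat, dataset_feats, top_k=3):
    """Rank dataset images by similarity to a generated image's features."""
    scored = sorted(range(len(dataset_feats)),
                    key=lambda i: cosine_similarity(query_feat,
                                                    dataset_feats[i]),
                    reverse=True)
    return scored[:top_k]

feats = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(most_similar([1.0, 0.1], feats))  # [0, 2, 1]
```

If the top-ranked real images look nearly identical to the generated one, that's evidence of memorization; if they are only loosely related, the sample is more plausibly novel.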
Let's say in a few years the results of this process become really interesting: who owns the IP? Is it me, or is it the machine? My thoughts around that are that dataset creation is a careful process in which the artist can express themselves, and I think that's art. There's parameter selection: what kind of model do I use? And there are a lot of strategies the artist can use here, like training on one dataset, then pausing and training on a different dataset, or using different model architectures. And then finally there's interpretation of the images. These are all ways in which the artist can express themselves and contribute to the process; the model is just a tool.

Then finally, there are important thoughts around ethics: essentially ensuring that the artifacts generated from a project using art like this are beneficial to, and inclusive of, its origins. It's also important to explore best practices around the ethical collection and distribution of the dataset, and a few other emerging concerns.

And finally, and I'm going to stop here: next steps. This project will explore things like conditional generation; there's some effort to label this dataset and explore conditional GANs. There's also an effort to use unsupervised methods to explore interesting latent codes, using things like InfoGANs. Then there are efforts to improve the quality of the generated images, for example optimizing the loss functions using things like variational autoencoders and their combination with GANs. And then finally, some kind of interactive installation tool; there's a version of that online, and there's opportunity to improve it.

So I'm at the end of my talk. The source code for training the GAN, you can find it out there.
The interactive tool for inspecting novelty, you can find it at the second link. Thank you.