TensorFlow for JavaScript (TensorFlow @ O’Reilly AI Conference, San Francisco '18)

How's it going, everybody? We're here to talk about TensorFlow and JavaScript today. My name is Nick, and this is my colleague Ping; we work on TensorFlow.js here in Mountain View. The traditional thinking is that machine learning only happens in Python, right? That's what everybody assumes. But is that always the case?

Has anybody seen this before? It's something we host in the TensorFlow documentation: the TensorFlow Playground. It was built by our colleagues on the East Coast, originally just as a visual for some of our ML classes, and it shows how data flows through a connected neural network with different activation functions. It was a really popular project, it was a lot of fun to make, and it gained a lot of traction, so we started to think: maybe it makes sense to do ML in the browser.

There are a lot of opportunities for doing ML directly in the browser. You don't need any drivers, there's no CUDA installation or anything; you just run your code. The browser has a lot of interactive features, especially after the last several years of development: there's access to things like sensors and cameras, so you can easily hook into those data streams. And the other great part about doing ML directly in the browser is privacy: you don't have to send any user data over the wire, over an RPC, to do inference behind the scenes in your infrastructure; you can do it directly on the client.

Coming back to the TensorFlow Playground: it's about 400 lines of JavaScript, written very specifically for that project. So our team took that prototype and started building a linear algebra library for the browser. That project began as a fully open-source effort;
it was called deeplearn.js. We took deeplearn.js, aligned it with what we're doing in TensorFlow internally, with eager execution and that style of API, and launched TensorFlow.js last April.

Once we launched it, we saw a lot of really great community-built and Google-built products, and I want to highlight a couple. This is one we built at Google called Teachable Machine. It's all done in the browser: there are three labels, green, purple, and red, that you can assign to whatever you're training on through the webcam, and it shows how a basic image recognition model can run directly in the browser. It all still exists online; you can still find it.

The community built a self-driving car, all in the browser, called Metacar, and it's cool: you can watch it train and run inference as the car drives. People built games, too. This is a web game somebody trained with TensorFlow.js: it's a funny animation with a little dude running back and forth, hiding from the big balls.

The model is learning to avoid the balls, all through TensorFlow.js, as the game keeps playing.

This one's really cool: a Google project called Magenta, which does a lot of ML with audio. There's a large library called Magenta.js, built on TensorFlow.js, for doing audio in the browser. This is a cool demo somebody built: a digital synthesizer that learns how to play music, and you can play along with it. Another cool example that just came out, all community-built and open source, is face-api.js. It's a library that sits on top of TensorFlow.js, does a few different types of image recognition, and can detect faces and facial features; it even works pretty well on toddlers.

So I want to showcase how our library fits together. There are two main components to TensorFlow.js: a Core API and, on top of it, a Layers API. In the browser, that's all powered by WebGL; that's how we do the linear algebra, by bootstrapping it all through WebGL textures. On the server side, we ship the same C library that powers TensorFlow in Python, so you get the high-end CPU and GPU support, and eventually we're working on the TPU integration story for serving. For those who have used Keras, the Layers API is almost the same as Keras, very similar syntax. The Core API is our op level, and anyone who's worked with TensorFlow SavedModels will find that API pretty familiar.

OK, what can you do today with TensorFlow.js? Well, you can author small models directly in the browser. Browsers have a limited amount of resources, and we'll get into that a little later, but right now you can do pure model training in the browser. You can also import pre-trained models: models that have been trained somewhere else, usually
in the cloud or with Python on some device, and we have a tool to serialize the model and then run inference in Node or in the browser. And we have the ability to retrain models, a very basic form of transfer learning: we can bring in a model and adapt it, and for anyone who's seen TensorFlow for Poets, it's a very similar exercise.

To get started with the Core API, I want to do a very simple, basic polynomial fit. This is a scatter plot of some data, and we're going to write a really simple model to find the best fit for it: the classic f(x) = ax^2 + bx + c.

The first line is ES6-style JavaScript for those who are familiar: we import @tensorflow/tfjs, the name of our package, and namespace it as tf. Our first step is to declare three variables, a, b, and c, which we initialize to 0.1; these will be updated during training. The next step is to declare our function, f(x) = ax^2 + bx + c, using the TensorFlow.js API, and we have some sugar to make that more readable through a chainable API, which is a very common pattern in JavaScript. Next we declare a loss function, just a mean squared loss. Then we declare the SGD optimizer, with a learning rate we've defined somewhere in this code. And finally we loop through our training epochs, and at every step we minimize our loss with the SGD optimizer. This is very similar to eager-style Python for those who have used that on the Python side.

The next thing I want to highlight is the next step up: the Layers API, our Keras-style API. To show it, we've been working on audio recognition directly in the browser, built around really simple spoken commands like 'up' and 'down'.
Commands like 'left' and 'right', too, can be run through an FFT to build a spectrogram. We take audio in, build a spectrogram from it as an image, and train our model on that. And we can build that convolutional network pretty simply with our Layers API.

The first step is the same as in the polynomial fit: we include the @tensorflow/tfjs package. Then we build a sequential model; this is very Keras-style. Our first layer is a conv2d with some number of filters and a kernel size, and a ReLU activation function; again, very Keras, very familiar for those who have used it. Then we have a pooling layer, then some more conv2d layers and another max pooling layer, and so on, repeating as we work our way down the funnel.

We then flatten out our layers, add some dropout, add a large dense layer near the very end, one more dropout layer, and then finally a softmax for the audio labels. Finally, we compile the model. Again, this is very similar to Keras: we compile the model we built, and we'll learn about any errors as the model is constructed; we give it an optimizer, and we call model.fit to start passing in our training data with our labels. Once the model has trained, we can save it; we have options for saving directly in the browser and, in Node.js, to a file. And finally we can use that model to do prediction: we call model.predict and pass in our spectrogram.

OK, so those are two quick passes at the APIs we offer: the higher-level Layers API and the lower-level Core API. But one of the cool parts of being in the browser is that we can take models that were trained elsewhere and build interactive demos. For that I want to showcase a short video. This was built in collaboration between the TensorFlow.js team and an internal Google design firm. We built this game for mobile devices, and it uses MobileNet to run an emoji scavenger hunt: the game suggests an emoji, and you have to run around the office and find it with the camera on your phone, with all the inference powered directly by the browser. I'll play a quick video, which will give you a better sense of what's going on.

This is a slice of pizza. So is this. Emojis have become a language of their own. We use them every day to communicate in our texts and emails, so much so that it's easy to forget about the real-world objects they're based on. Which got us thinking: can we create a game that challenges people to find the real-world versions of the emojis we use every day? Introducing Emoji
Scavenger Hunt. Emoji Scavenger Hunt uses TensorFlow.js, open source, where machine learning meets JavaScript meets fun. It works like this: we show you an emoji; use your phone's camera to find it before the clock runs out. Find it in time and you advance to the next emoji.

While you're searching, you'll hear a machine learning system doing its thing. See if you can find all the emojis before the timer runs out. Emoji Scavenger Hunt, powered by machine learning. Start your search at g.co/emojiscavengerhunt.

Well, I see a URL. Cool. So that showcases: once someone's already done the hard work of training a model, we can build a great interactive demo on top of it.

OK, I want to highlight how we actually do that behind the scenes. The first step is taking that pre-trained model; this is MobileNet, which has been trained in Python TensorFlow, and in MobileNet, as those who have used it will know, there's an object detector you can tune for specific labels. Then we import that into our JavaScript app, the scavenger hunt.

The first step, once the model has been trained, is to save it. There are a few different paths for this: there's the traditional TensorFlow SavedModel API, and we also support Keras, so that could be a sequential MobileNet model saved from Keras. Then we have a conversion step. This is a tool that TensorFlow.js ships over Python: you pip install tensorflowjs, and you use the tensorflowjs_converter for interacting with a SavedModel. There are a couple of different options for identifying the output of the inference graph and choosing where to serialize the artifacts, and we also support a Keras-style converter for the HDF5 file format.

And finally, we load those artifacts into the browser. This is all JavaScript code: for a converted SavedModel it's the tf.loadFrozenModel call, with two different artifacts that our script creates, a link to the weights and a link to the JSON file that describes the inference graph. Again, there's a Keras-style loader too; Keras actually ships everything in one JSON file, which has a downside.
The downside is that a single file misses out on some of the caching we provide for converted SavedModels.

So what happens in that model conversion step? The first thing: a SavedModel in particular has a lot of different graphs in it. There's the inference graph, which is the one we want; there are steps for training; and a lot of the time, if you're using the tf.data pipeline, there are graphs for all the data ops. So we pull out the inference graph and just the ops it needs, and run some optimizations. The other great thing we do for a SavedModel is sharding of the weights into four-megabyte chunks, which cache nicely in modern browsers, so for larger models it's only a one-time fetch. We support about 120-plus of today's TensorFlow ops in that conversion step, and we're always adding more. And again, tf.keras layers are supported by this conversion step as well.

I also want to showcase one more demo. This is a newer one that we just shipped this summer, using PoseNet, which is a human pose estimation model. For this I'm going to hand it over to Ping, who's going to walk through it.

All right, guys. PoseNet is another example of converting a Python-trained model and loading it into the browser. On the right side you can see a lot of controls that can fine-tune the model, and on the left side is a live video feed; in the video you can see it detect my facial features as well as my body parts. This is a collaboration between a Google research team and an external contributor, and the model is in our model repository; you can check it out.

On the left side you can see it actually runs at about 15 FPS, frames per second, so you can build some cool applications, like recognizing motions for sports, etc. We also have other models in that repository, like the audio command model that Nick mentioned earlier, and we're adding
other models too, like an object detection model. All of that is available for you to use in the browser, so go ahead, check them out, and let us know if you build anything cool. Thanks.

The great part is that it's feeding directly off the camera feed in real time, and we're doing about 15 frames a second, presented over USB-C, so it does pretty well.

OK, I mentioned earlier training directly in the browser; this is the retraining, the transfer learning story. For this we have another cool demo to showcase. Again we're using MobileNet, and Ping is going to pull the demo up while I'm talking. We built a demo where we have a baseline MobileNet model loaded in the browser, and we're going to train Ping's face to play Pac-Man. He starts by collecting samples from the webcam of what his 'up', 'down', 'left', and 'right' are, and for this he's using his face. As he moves his face around, he's collecting different samples that we'll pass into the retraining step, one set each for up, down, left, and right, and in this demo, as you hold the button down, we collect more and more frames.

He's getting close... OK. Now that he's collected his frames, he clicks the 'train model' button, and we watch our loss shoot straight down; it only takes a couple of seconds. And now he's ready to play Pac-Man, so go ahead and hit play there, Ping.

All right, there you go. The model is running directly in the browser; we retrained it on those pictures of his face, and the controls are lighting up left, down, right, based on what the model is doing. So this is a great use case of taking advantage of what the browser provides, enabling accessibility for machine learning and building cool things. OK, Ping, we could play Pac-Man all day, man. These demos are all available on our site, which we'll showcase at the end; you can run this today, no drivers, nothing to install.

OK, cool. All right, so far I've shown off a bunch of demos.
They used our Core API and the Layers API, brought in pre-trained models, and did some basic retraining.

So where does performance stand for TensorFlow.js, for our WebGL-powered browser runtime? Here are some benchmarks we've done against Python using MobileNet. There are two machines we used. The top one is a high-end workstation with a GTX 1080, the high-end NVIDIA card, so it's super fast: a little under three milliseconds for Python inference. The second is a 13-inch MacBook Pro with an integrated graphics card rather than a standalone one, running the CPU build, so just the default AVX instruction set we ship with TensorFlow, and that was doing a little under 60 milliseconds per inference.

So where does TensorFlow.js stand against those benchmarks? On that super beefy 1080 card, we're really close: about 11 milliseconds per inference. On the CPU side, running in the browser on the laptop, it's a bit slower, a little under 100 milliseconds per inference, but that still gives us 15 to 20 frames per second, which lets you build interactive demos.

This discussion leads us to the next part: where does TensorFlow.js on the server side come in? We think there are a lot of great opportunities for doing JavaScript ML on the server side under Node.js. The ecosystem of Node packages is really awesome: there are tons of pre-built libraries off the shelf on NPM, so you can build applications really quickly and distribute them on all these different cloud services.

The default runtime for Node.js, V8, is super fast; it's had tons of resources put into it by companies like Google, and we've seen benchmarks where the JavaScript interpreter in Node is ten times faster than Python's. And by enabling TensorFlow with Node.js, we get access to that high-end hardware: the cloud TPUs, the GPUs, and so on.

So those are all exciting things. I wanted to showcase one really simple use case of Node.js and TensorFlow. The code snippet up on the screen is a really simple Express app, for anyone who's used it: just a request-response handler. We handle the endpoint /model, which takes a request and writes out a response. In this handler we have a model we've defined, and we do some prediction on input that's been passed to the endpoint. And to turn on TensorFlow.js with Node, it's one line of code: just importing the binding. This is a binding we ship over NPM, and it gives you the high-end power of the TensorFlow C library, all executed under the Node.js runtime.

What can you do today on the server side? All those demos we showed of writing models in the browser just run under Node as well, and you can use our conversion script. We ship the three major platforms, macOS, Linux, and Windows, on CPU, and we also have GPU support with CUDA for Linux and Windows; we just launched Windows late last week. And the full library is supported: the Layers API and our Core API all work today, right out of the box, with Node.js.

To highlight how we can bring all these components, NPM, TensorFlow.js, and Node.js, together, we built a little interactive demo. I know not everybody is super familiar with baseball, but Major League Baseball Advanced Media has this huge dataset where, using sensors at all the stadiums, they record the different types of pitches that players threw during games.
There are pitches that are really fast, with high velocity and low movement, and there are pitches that are a little slower but have more movement. We curated this dataset and built a model, all in TensorFlow.js, that trains against this data and detects, I think, seven or eight different types of pitches, and it renders the results over a socket. Don't get too hung up on the intricacies of baseball; this is just solving a bread-and-butter ML problem of taking sensor data and producing a classification. For this, I'll have Ping run through the demo.

OK. All right, so for web developers, you can really use TensorFlow.js to build a full-stack kind of application. On the left side is a browser where I've started a client, inside the browser, which tries to connect to the server through Socket.IO. On the right side I have my console, where I'm going to start my Node.js server. Immediately you see it's bound to our TensorFlow CPU runtime. As it goes, the model is getting trained, and the training stats are fed back to the client side. As training progresses, you can see the accuracy increase for all the labels; the curveball has about 90% accuracy right now.

With this kind of server-side implementation, it's easy to feed in data, which inside a browser is much harder. Let me try to click this button: what this does is load live MLB pitch data into the application, and we'll try to run inference on it. So let me click on that. Immediately you can see the orange bars are the prediction accuracy for all of these labels. Some of them actually get better with the live data: it's 90 percent for the changeup; some are a little less accurate, the two-seam fastball is only 68 percent. So all of this, I think, demonstrates that the model actually generalizes

pretty well on the live data too. So, yeah, back to you.

I'm going to kill that demo before my laptop dies. Great. Just highlighting exactly what was going on there: the Node.js server was doing our training; there was a training dataset and a validation dataset, and we were reporting back over Socket.IO how good we were at each class through our evaluation. Then we had the ability to easily reach out to MLB Advanced Media through Node, parse their data, and send it to the model, which produced the orange predictions. So it's a cool use case of 'I've trained my model; how does it stack up against real-world data?', with a quick visualization, and that was all plain JavaScript and plain HTML. All the source code for the examples we've shown you today is open source, and we'll link to it at the end.

Performance: I highlighted where the WebGL runtime stacked up against the Python benchmark, so let's look at that Python benchmark against the Node.js runtime. Again, these are those initial benchmarks I highlighted. The Node runtime itself is just as fast as the Python runtime for inference on MobileNet. This is because we're using the same C library Python uses, there's not much code to get through, and we're running on that high-end hardware.

OK, I've highlighted a lot of what we've built, basically all this year: we launched in April, and the Node stuff has been out since the end of May. So what's next? What direction is TensorFlow.js looking to go? We have some high-level bets. There's a project going on that we're going to release very soon, in the next month or so: our visualization library, the ability to pull things into the browser and do quick visualizations of your model and your data, so look for that coming soon.
We also have a full data API coming, very similar to tf.data. It will be browser- and Node-specific, so there will be convenience functions for things like 'I just want to read data off my webcam without converting it to tensors myself'; this API will provide that for you. And on the server side, it will provide highly optimized data pipelines for Node.js training. Those are our two high-level bets, the big projects that cut across both of our runtimes.

Looking forward, for the browser we're working on performance. In those benchmarks I showed with WebGL, a lot of the bottlenecks are limitations of WebGL itself. We use 2D textures to hold the tensor data, and there are bottlenecks in downloading and reusing those textures, so we're working on WebGL optimizations. We're also adding more and more ops; lately the focus has been audio and text models, so we're adding a lot of ops to help with those. We already have a great, stable library of image recognition ops, and the audio support is coming.

The other thing we're looking at is helping push the spec forward. The WebGL runtime was really interesting, and it helped bootstrap ML in the browser, but WebGL isn't the best fit for this, and we're looking at a few different options. One is compute shaders, which are much more CUDA-like: you can allocate exactly the GPU memory you need and work with it directly. We're also following the WebGPU spec closely; there are a bunch of different proposals from Chrome, Internet Explorer, and the other browser vendors, and we're helping watch that space and provide guidance as needed.

And on the Node.js side, cloud integrations are a thing we're looking at; this includes serverless-type integration points, integration with TPUs, and so on. We're also working on generating op code to provide all the core TensorFlow ops in Node.js; in the Python version of TensorFlow, most of that code is actually generated from our op registry internally, so we're writing that generation for TypeScript and JavaScript users too. And we're providing better async support with libuv; libuv is the underpinning of Node.js for asynchronous programming, and we're working to make that scheduling play much nicer, so we don't block the main application thread as much.

OK, wrapping up. I've shown you a lot of stuff, so I want to step back and highlight a couple of things. First, there's our Core API; that's the bread and butter of TensorFlow.js, the op library that lets you interact with tensors. We also have our Layers API, which is our Keras-style API for training. We also support SavedModel and Keras model conversion today through our converter script. And the newest runtime we have is Node.js, which we just got done talking about.

With that, I want to thank you all for attending. Everything I've shown you is on js.tensorflow.org. We have quite a bit of stuff up there: all those demos I showed are linked on that page, so you can find them along with the source code. We also have a variety of GitHub repos; everything we do is on GitHub. tensorflow/tfjs is our root repo, our union package; we keep track of all of our GitHub issues there. It also links out to a variety of things: we have an examples repository with maybe ten to fifteen examples you can just run. There's also a link to our model zoo, models that we've pre-trained,
packaged up for JavaScript use, and published over NPM. A lot of them actually have wrapper APIs where you don't even have to take data and convert it to tensors to pass in for inference; it's 'here's an image or an HTML canvas, can you do a prediction?'. Those are really cool, and all that stuff is linked on tfjs. We also have a gallery of community-built stuff, and it's always growing. There's our community mailing list, which you can also find on our website; there's a lot of good discussion there, whether it's 'how do I do X, Y, and Z' or 'I need this feature, can you please help?'. The gallery repo I just mentioned is where all of our community-built examples live, and there's the models repo. And that's all.

2019-01-02 03:31

The ax^2 + bx + c equation is a mess: at 6:48 he writes a*x^2 + b^x + c, which is not even remotely the same thing.

Also, the Tenori-on (@3:43) is a sequencer, not a synthesizer. It has an instrument that sounds like a synthesizer, but what it actually is is a sequencer.

Does tfjs support multi gpus on server side?
