Qualcomm Technologies and the power of Windows AI Q&A | DIS282H

Hey everybody, welcome. I'm Fred Balsiger and I work on the Windows silicon team. I'm super excited to be here today to introduce my friend Jeff from Qualcomm. Jeff, why don't you tell us about what you do?

Okay, great. Thank you, Fred. I lead the Qualcomm AI software team. We're going to talk a lot about our AI stack throughout this little chat. I'm really excited to be at Build. We just came out of the day-one keynote, with exciting stuff on all the generative AI and Copilot work going on in Windows, so I'm excited to talk about how we can help.

Yeah, I have a handful of questions, and I bet the audience and perhaps folks online do too, but may I be so bold as to jump in?

Sure.

So based on the keynote, the name of the game is running AI models on the edge, the edge meaning on the device, at the computer chip level, as opposed to in the cloud or Azure. How does Qualcomm help developers with that?

Yeah, great. We're really excited to be here at Build and to introduce bringing the Qualcomm AI Stack (I think I've got a little diagram here, the Qualcomm AI Stack) from our entire ecosystem to Windows as well. We've been partnering with Microsoft internally on the experience pack, using our stack, for a while, but now we're really excited about a whole bunch of announcements related to bringing the Qualcomm AI Stack to Windows. We'll talk in a bit about both the innovation with ONNX Runtime and the Qualcomm AI Engine Direct, and maybe about how different layers of the stack can help developers in different ways. But I think the important message is that Qualcomm has been focused on AI at the edge for years now. We have Android devices and all kinds of devices, including Windows on Snapdragon devices, that run our stack accelerated on our silicon. And I think the important message of the diagram that everybody's seeing right now is silicon
that addresses a whole bunch of markets, and lots of ways that you can access it based on your experience and your needs as a developer: get onto the silicon, develop one model in one way, and use that stack to port it across everything. So whether you're in cloud, at the edge, in automotive and so on, you can use the same stack and run stuff just about everywhere.

That's perfect. You kind of teased my second question, which is: how are you supporting ONNX Runtime? This week at Build there are going to be a lot of familiar messages around development patterns, how you run your AI models both in the cloud and on the edge, and how that works back and forth. I think this diagram really shows that ONNX Runtime is a consistent solution for developers who want to run inference on their AI models, and they can do so at any particular level with Qualcomm.

Yeah. We're excited to be partnering with Microsoft on ONNX Runtime. You'll be able to see demos in our booth (we'll talk about that later) showing ONNX Runtime, and we'll show a demo here in a few minutes. Working with your teams, that's one of the ways, and I think maybe the preferred way for a lot of developers, to get their AI models and bring them to Windows, and hopefully to Windows on Snapdragon as well. That is part of building up a consistent approach, and we'll talk more as we get into the stack. If you want to use ONNX Runtime, of course, that's a really powerful and common way to do it for Windows in general, and regardless of how you go about it, you're going to get the performance out of our silicon with this common stack approach that we've been talking about.

That makes sense. So, talking about deployment of AI models: I know you have the Qualcomm Neural Processing Engine SDK, but you also have what I believe you call the AI Engine. And so deployment's
a little bit of a tricky subject in terms of those two technologies. How do you unwind those?

So let's go back a little bit to this more detailed diagram. We saw lots of stack diagrams; in fact, this one was up on stage this morning, so it won't be a surprise to developers. The way we think about it is that this AI Engine Direct layer is kind of the gateway to the hardware acceleration. Underneath it, every IP block in our heterogeneous silicon is accelerated for best performance on the kinds of workloads that hardware really excels at. In gaming you might be on the GPU, and in other applications you're going to be on our dedicated AI hardware. That is a building block, and we use that building block to deliver the ONNX Runtime experience. We also use it if you want to come in natively, at the metal so to speak, if you're a developer who has a threading architecture for your app that doesn't fit these other runtimes. In our other ecosystems, in Android, it might be TF Lite, or it might be our proprietary runtime. All of those runtimes, all of those ways to bring your model into your developer experience, are really powered by AI Engine Direct. So it's a fundamental building block: we build our stack on it, and we've helped you guys build ONNX Runtime for Snapdragon in the same way.

Okay, so building on that, one of the things my team will be demoing later today, or perhaps tomorrow, is generative AI models, something like Stable Diffusion. I think there are a lot of interesting software and hardware challenges in running a Stable Diffusion model, or some other generative AI, at the computer chip, at the Qualcomm Snapdragon level. How are you thinking about that? What are those challenges?

Yeah, so excited. I think Satya said something this morning like, it's like we
woke up in January and just started issuing press releases, right? Three months ago, if you'd said, hey, let's stand up here at Build and talk about generative AI on the edge, you'd be like, are you sure about that? So here are some example images. In just a moment I'll ask Ankit to come up and show us a live demo. We have it live on Windows on Snapdragon, and you'll be able to see it both with our proprietary stack and with ONNX Runtime. You won't be able to tell them apart. So the nice story here is you get the same performance as a developer, regardless of how you want to interact with the stack. So maybe the fun thing is, let's do a live demo and we can play around with it a little bit.

Perfect.

I'll call him up here, and he's going to drive and we'll talk about it. Again, this is a Windows on Snapdragon device, and this is Stable Diffusion. We can talk a little bit about some of the magic behind how it runs. We've got a couple of prompts loaded up here that we can pick from. I think we're doing a country home, so if you have a design aesthetic you want to try on this device... It doesn't take so long to generate an image. We went through a whole bunch of steps to make this thing run faster. I want to really underscore this idea of design and build once and run everywhere: this is the same model that we showed on a mobile handset at Mobile World Congress a few months back.

Okay, so look at the picture, very high resolution. That developer focus on building once and running it all over the place is, I think, super important, because it gives developers that portability, and they don't have to make that reinvestment in their innovation. They can really focus on the use cases. That's great.

Why don't we change the seed, so you can see this is live. We'll have it in our booth, and you can come play with it and type
in your own queries; it's open query. And so it'll run again. It's running entirely on device; there's no cloud connection or anything going on here. It's about a billion parameters, to give you some idea of the complexity of the model. And we'll get a different country home. Oh, nice, this looks a little more like, I don't know, the Hall of Mirrors in Versailles or something. Okay, so why don't we switch back to the slides, if you don't mind.

Okay, really cool. We've got this generative AI, and you'll be able to see it live here at the show. So AI's changing fast, right? There's effectively a hockey-stick-looking curve for how we're investing, and it's industry-wide. So what does your vision look like over the next couple of years for Qualcomm in terms of the investments you're going to make in AI?

Okay, so, huge. I think you saw a sample of it this morning in the remarks at the keynote. You guys are amazing innovators; I was blown away by all the use cases that you've already imagined, prototyped, and have running. We look at a lot of different markets, and I think there are a lot of interesting use cases starting to emerge. Productivity use cases, I think, are kind of obvious. We have people who want to do this kind of thing, create images; of course that's important. We have people who want to, for example, transplant their family photo from in front of a mundane restaurant to the beach. So you can imagine combining computer vision techniques (we have some of this stuff running in our labs already): segment out your family, tell a generative model like this one to create a photorealistic beach scene, compose them back together, and boom, do it entirely on device. So, creative use cases. Then there are cases like in-vehicle. You don't immediately think, okay, we're going to do generative AI there, but take some of the whisper
net stuff that was shown this morning: connectivity, where I want to run some workloads locally on the compute or NPU, and then hit the cloud when I need to.

Sure.

Or you might think, well, what am I going to do with generative AI in the car? But imagine interactive nav, conversational nav: "Let's go pick up Susie, and on the way back let's go get a pizza." Okay, plot all that out. "Did you mean this pizza place or that pizza place?" These kinds of things. Then there are industrial applications for IoT, and all kinds of on-device productivity and screen summarization. In some cases these are things that, frankly, you don't want to send to the cloud. You don't want your chat, or maybe your memo, to have to go to the cloud. Or you're on an airplane. There are a lot of use cases where that blend of on-device and cloud, and that natural transition, can really complement each other. So I'm really excited about it.

So that's the use case side. On the technology side, what are the technological limitations? These models are big. I mentioned a billion parameters; we're on track to do 10 billion by the end of the year, give or take. That gives you an idea of what we think is possible on a device not connected to the cloud. And what does that take? Well, it takes thinking about the architectures. It takes thinking about the custom silicon that we have. How do we get the parameters to it? How do we get to INT4 quantization, maybe lower? How do we preserve accuracy when we do that? What do we do for memory management? These are the kinds of technical challenges that my team and others at Qualcomm are looking at, and we've got some of this stuff running in the lab. I feel pretty confident that we're going to get there, and I think that's really the gateway that we need. If you want to bring a copilot, plug it into Edge or plug it into Word or whatever, and you
really want that experience, we're going to need a sizable amount of that running on device, so that it doesn't break on an airplane. I flew here; you live here. We want to bring that to the device, and we want to make it a private, immersive, high-throughput experience.

That makes sense. I spent a lot of time getting ready for Build, as a bunch of people did, and you always think about the audience, the developers who show up and are interested in this tech. What is the call to action for developers? We have a Qualcomm expert up here on stage: what do you want them to do?

Okay, so I want to come back to this idea of ONNX Runtime as one piece; that's a big theme for you and for us at the show. And I want to come back to a simplified version of the stack, which is up here on the screen, and re-hit what appeals to a developer in terms of access. We as Qualcomm want to be open to all different kinds of developers, whether you're developing for Windows itself, or in some adjacent market, or for other ecosystems. This idea that we've got a common stack that supports silicon from hearing aids and earbuds all the way up to compute devices, automobiles and so on is, I think, super powerful for developers. And part of that goes to ONNX Runtime as an easy on-ramp for getting your applications and your AI models on device. So the takeaway here is: the Neural Processing SDK is our proprietary runtime. It predates ONNX Runtime; it predates a lot of stuff. We know you guys have used it, and now you're moving to ONNX Runtime. And then, if you want to be closer to the metal, say you've got an app that only needs to be accelerated on our Hexagon accelerator, you're going to get best-in-class power
performance on that accelerator, and you can use the AI Engine Direct if that's the way you want to build your app. So, the call to action, a couple of things. We've got a booth, booth 428. People can come see us there; you'll see the demos, you can play with Stable Diffusion to your heart's desire, and you can see it both on ONNX Runtime and...

Okay.

We've got a bunch of talks. We've got a talk on on-device AI leadership by Leendert, an SVP at Qualcomm who runs AI and a bunch of other software stuff, ex-Microsoft. We've got talks on hybrid AI, so this idea of cloud and edge; on the apps ecosystem, so Windows on Snapdragon in general, what kinds of things we're enabling and where we are with compilers and all of the development environment around that; and then a deep dive by Ankit on Stable Diffusion, if you want a little bit more about how we built it. I should note that this morning we launched Qualcomm AI Engine Direct for Windows on our website. It's the first time that we've announced it as available for Windows, so you can get the Neural Processing SDK and you can get Qualcomm AI Engine Direct for Windows on our web page. There is also a how-to on Stable Diffusion. We break it down in Jupyter notebooks (I think there are three of them) that cover all the steps we went through: what we did for quantization, what we did to get it on device, how we broke the model up in a way that manages memory and makes the performance good, and so on. So people can take a look at that, and we're very eager for feedback on it. We encourage you to give it a go and try it out. We're going to be doing more developer-facing stuff, particularly with gen AI, as the months go forward.

That's perfect. I'm looking at the slide here and this rings true to me. My team shipped Windows Studio Effects. These are the audio and video effects that leverage the Qualcomm Snapdragon NPU and AI models for a better collaborative
video call. What we did is we first adopted it at the bare metal that's shown here, SNPE (we lovingly call it "Snappy"), the Snapdragon Neural Processing Engine. We started there because we needed to get something going really quickly. And now we're talking to first-party teams, first party being all of our teams within Microsoft, about embracing Qualcomm and offloading their AI models onto the NPU in a consistent way. So we're working our way up the stack that we're projecting here, to ONNX Runtime. So this rings true even for what we do in-house at Microsoft.

And look at ONNX Runtime: I think it's in developer preview at the show. The Qualcomm QNN execution provider, I believe, is in preview. So again, we're looking for feedback. If you've got a model that doesn't run, or doesn't run fast enough, that's how we're going to make it better. So we'll work together, support the developer community, and get it done.

That's great. Jeff, I've been to a bunch of Builds before, and this format's quite a bit different from the other ones; this is supposed to be a very casual Q&A. I hope we have an engaged audience both online and physically, so if folks here have any questions, by all means feel free to raise your hand. We have a couple of people with microphones who will come around the audience so your question is heard on the call, or we're happy to repeat it. Again, it's a casual format for those in this room.

Yeah, open for some questions. We've got a little bit of time.

Great, thank you. I'm David from Meta, and I've actually had some experience playing around with these technologies. For someone who has a little understanding of ONNX Runtime and Olive, and has also been playing with this
SDK, where can I go to square my product understanding with my SDK understanding, for the Venn diagram between what works on all of Windows and what works just here, and also how I interact with it?

Great question. We have a couple of sessions about Windows and AI and how to use ONNX Runtime coming up today and tomorrow. Those sessions are always recorded, of course, but at the end of them we're going to give you a QR code that points you to a landing page with all the great documentation. Everything we talk about and show at Build is going to be posted, so you can get all the sample code and try it today. And it's really exciting, because last year at Build we had Steve Bathiche get on stage in an executive keynote and espouse the idea of hybrid: how your AI models work in the cloud and how you can move them back and forth between cloud and client. Well, today it's a reality, and that's what we're super excited about. So pay attention to the Windows and AI sessions this afternoon and tomorrow afternoon, because we're not just going to show it and demo it to you; we're going to make code samples and blogs available, everything about it. I think you'll have a great understanding of the Venn diagram, if you will, after looking at the content we're going to talk about later today.

Okay, thank you.

Good question. Did you have anything to add?

The only thing I'll amplify is that this AI Engine Direct based EP, we're going to keep improving it. The reason we're making it publicly available is so that you can go get the latest one and plug it in, and that's also why we made the how-to for some of our materials. Again, we're trying to be transparent about how it all goes together, so you can have a clear picture.

Nothing else in the room at the moment. There's a
question online. Yeah, more Moto Johnson writes: "Will the Windows SNPE SDK be improved to include the tools so Linux is no longer required, for example model conversion, quantization, etc.?" Jeff?

I think this is probably for me. Okay, so the short answer is yes. We're, I don't know, maybe halfway through that. When you download the Windows Neural Processing SDK today, it will contain some Linux artifacts, but at least it's all packaged together now, so you don't have to pick and choose. And I would say within the next quarter or so we're going to complete the transition to fully Windows-native tooling. At our Snapdragon Summit in the winter, I guess in November, we also announced more upcoming tooling: the Qualcomm AI Studio. Not to take from the keynote this morning, the model studio, AI Studio; ours has a little bit of a different twist. It'll be focused on these issues of quantization, conversion, this kind of stuff, more than tuning prompts and so on. That will also be coming, and it'll run on Windows and on Linux, so you have a lot of choices there too.

I assume that's good. All right, we'll continue online. "Does the Qualcomm AI Stack work across hardware?" That's from Sumit.

Yeah, so let's go back; we're going to make that point if it wasn't already made. Here we are. Okay, so yes. It doesn't literally say hardware at the bottom (we have a lot of operating systems we support), but today the AI stack runs on approximately 50 of our SoCs. So when you're talking about a use case that you want to run on this device, let's say, and then you've got an adjacent use case in, I don't know, a handheld scanner in a factory that is running Windows or something, there's likely a part that scanner runs on that will run this stack. We've tried really hard to build it, I think, like the quote
from Bill Gates about the platform this morning: we've tried really hard to build a platform, so you really can build once and port that innovation to other or adjacent use cases and markets. So the short answer is yes, we support a huge range of our SoCs. We make updates monthly to our stack, so you get a continuous upgrade, and we try really hard, even on those older long-tail parts, to maintain a current set of features, so you can keep innovating even once a part's been in market for a while.

That's great. And Jeff, what I think is particularly neat, looking at a bunch of developers here and I'm sure many online, the aha moment, at least for me, would be: hey, we've got Qualcomm, and there are other hardware people out there that produce chips, but today, in market, the Snapdragon 8cx Gen 3 has the most powerful neural processing unit possible. You have a lot of frameworks and runtimes that we're all talking about, but let's not bury the point that a lot of these AI models can move directly to the NPU, offloading the compute from the CPU and the GPU, and so you just get benefits.

That's right. And this gets to application architecture, like concurrency. My CPU or my GPU is busy; my GPU is running a game. Not only can I offload, so I get another processor to help do something complementary, but in that processor you get the best power performance in the industry. These are all-day devices. With this laptop, if you're going to fly to London, you want the battery to still be good when you get there for your presentation. That's what we want to do, and we don't want to compromise the AI experience, or the complementary AI experience, just because I ran it on the GPU or the CPU, where I'm not power efficient. So that's obviously a key thing, that custom silicon. And there'll be deeper talks: this talk on
hybrid AI and these other talks will foreshadow our roadmap. We continue to improve that device and make it more and more programmable, and also more and more power efficient.

That's great. In the room, please go ahead; otherwise I'll keep dominating. I think more Moto (apologies, it was mono Moto Jonathan) had a follow-up question: "What's being done next to have a native experience with ONNX Runtime and move away from using DLC, where we follow the ONNX format?"

Cool, good question. So here's somebody who's actually... okay, you want to go ahead?

I was going to ask my friend here. All right, so moving away from DLC, as in downloadable content. This does touch a little on deployment. I don't have a great "here's clearly the right answer," but there are a bunch of different technologies that we're trying to leverage in thinking about how you deploy models, keep them updated, their serviceability and whatnot. I asked you a question earlier about the Neural Processing Engine SDK as opposed to your AI Engine stack, and how deployment of models works there. So it's not the perfect answer, but it's something we're working on, something I think we're furthest along the path on with our Qualcomm friends here, but it is acknowledged that this is a tricky thing. I don't know if you have anything to add.

Yeah, I think we might have the same three-letter acronym for two different terms. DLC is also a format we use in SNPE, so this might be a developer who's used to the SNPE EP. Let's assume that's the question for a second. In that case, yeah, we're moving to the Qualcomm AI Engine Direct based EP, and that will come with an upgraded set of features. I think it'll be a more natural experience, again because we're going to rely on ONNX Runtime to be that
orchestrator, that heterogeneous compute traffic coordinator part of the stack. Then, when you want to get to the Hexagon, you go through the Qualcomm AI Engine Direct EP. So that, I'll call it an artifact of the stack, will go away as we finish this integration.

That's great. All right, continuing. Ivan Burger writes: "Will AIMET or other quantization techniques from ONNX and Qualcomm be talked about further at Build, so devs can use or build quantized models to run on Qualcomm NPU hardware?"

Yeah. I don't know specifically all the talk tracks. I would suspect that Vanesh, who will do a deep dive, will talk about a lot of that. You can also get some insight by looking at the Stable Diffusion how-to, which talks a lot about the tricks we used for quantizing generative AI to make it fit on the device and make it performant. And look for more of that kind of tutorial material coming from us.

And somewhat related, we also have a breakout session, a hands-on lab, and a discussion Q&A similar to this one around optimizing your AI models for ONNX Runtime. This is the technology we're introducing called Olive, and there are a bunch of different sessions about what Olive is, how it works, and how it optimizes your model. So definitely pay attention to that and show up at a lab. It should be a lot of fun, and we're going to show you how to do this firsthand.

And find one of the Qualcomm people, come to the booth, come to one of these talks. Quantization is, I think, at the forefront, particularly for generative AI, to really get the best power performance. I'd like to think we're leaders in that space, but that's not to say it's an entirely trivial matter. We have a lot of tooling around it because it takes some tricks, but we want to demonstrate what it takes and help developers get there, because quantizing your models just saves an incredible amount of memory
bandwidth and power. And I think when you look at these 10-billion-parameter networks, if we can get them quantized down, we can effectively move the frontier of what's possible at the edge to bigger and bigger models. So that's super important, and we want developers engaged in that activity.

That's right. We haven't done a lot of "why would you want to run on the edge" 101, but basically, running your models on the edge, which again is on the device at the computer chip level, on a neural processing unit, gives you advantages in latency, security, privacy, and even economics. So there are a lot of reasons why you'd want to do that. And again, this hybrid AI framework showcases how you would do both: take advantage of the space in Azure and run up there on the web, as well as, for the advantages we just enumerated, run on the edge.

Yeah, and we were talking about use cases earlier, and we hit a little bit on security and privacy. With these generative AI models, these copilots, imagine a personal copilot that could examine your calendar and suggest how to rearrange it. It's not really clear that people want their entire contact database, their entire calendar, and everywhere they've driven in the cloud. I'm not sure I would. But that doesn't mean we should limit the experience of the end users, the use cases, just because we can't figure that out. We can bring a lot of that innovation to the device, and you can have a real virtual personal assistant, a helper, a real copilot in life that sits on your device and protects your privacy and the immediacy of the experience. So there's a big spectrum of use cases here. Really exciting.

Anything more in the room? So as a developer, I think a lot about what products are in market today and what silicon's powering them. So the latest, obviously, is in some Surface Pro 9 5Gs; I see this
device here.

Yeah, that's the 8cx Gen 3, right. It's 8280, I think we call it, something like that.

So how, as a developer, as you continue to innovate and build new chips, generation after generation, do I think about backwards compatibility and support of more legacy silicon as you continue to crank out newer products?

Yeah, so it's a huge topic. If we look at the AI stack and what I said earlier, we support 50 different SoCs. We didn't do that overnight, and we put a lot of attention on making sure that the API stays stable and that we tell developers when we're going to change an API. I think I was just exchanging some email about it: we finally went to, like, 2.0 of our API for the Neural Processing SDK, after something like 60 or 70 releases of basically the same API. So it's top of mind. And I think in a market like compute, I'll call it the Windows market, people expect to get a new device, get their apps, and have them run. At the same time, we want to take advantage of the latest hardware. So we're doing a lot of work right now, and like you said, AI's evolving super fast. When you think about something like ONNX Runtime, we want to move some of that, I'll call it late binding, onto the device, so that a model that you've quantized or trained, or that has a certain architecture you like, will run on this device and will run on the next device that's coming. Again, build it once, run it lots of places. But that next device can take advantage of hardware innovations, new operators, more streamlining. We're really thinking about that right now; we literally have teams of people thinking about it. Because when you guys innovate in ONNX and want to add an operator or something else, we want to make sure that you can map that innovation onto our silicon. We have to do that by translating
those models, quantizing them, and then doing that mapping, and we want to do it in a way where it's going to be portable. So there's a lot of innovation going on in the ecosystem, and that could involve compiler technology, it can involve late binding, just-in-time kind of binding, and we're looking at all those strategies to preserve the developer investment and the end-user experience. That makes sense, thank you. I believe we might have a question online. Yeah, I think people are excited to see a billion parameters here. Follow-up question then: what are the techniques to move towards 10 billion parameters on edge? Oh, if I could tell you everything. Is it simply 8-bit quantization, or is anything else needed? Okay, so I think you have to look at it as a system problem. Generative models that are text to text have a certain set of problems, and voice to text to image, for example, has a slightly different set of problems. In the image case you're doing a lot of what we think of as CV today: you're pushing tons of pixels around, and you're driven mostly by the memory and the bandwidth required to do all that pixel computation. In language, you're basically moving a lot of words around, you're moving big lookup tables around, and that's a big pressure on the memory subsystem. So quantization is key, right? Making those models physically smaller: they take less storage, they take less time to read off of your SSD, they take less time to bring into RAM. At some point these devices don't have enough RAM for these mega models, so quantization is clearly key. And then really it comes down to quantization, it comes down to doing it in a way that preserves accuracy, and really managing system resources. So we've spent a lot of time on: how do we get that on Hexagon, how do we do it at low power, how do we do things like lookahead, so we do multiple tokens in one pass, or making efficient
use of the memory that we've brought into the processor, a lot of those kinds of techniques. So really, system-level, hard computer science problems. So there's the question from us. Actually, there's a question there. Yeah, please. Great, we got a brave soul. So this is a question about the device ecosystem. I hear about the ubiquity of AI accelerators in hardware, and I'm just curious if you can share forecasts or your opinion: when will this matter to most of my users? As a developer, when will they have the best experience available? Is this something that's going to happen in certain tiers of PCs? Okay, so you elaborated on your question. The first answer to your question is that it matters now, right? Most of the developers on Qualcomm silicon are taking advantage of this high-performance, low-power AI accelerator for literally thousands of use cases now. The second half of your question, about when it's going to be available: we're here. Windows on Snapdragon is an extension to the Windows ecosystem of every investment we've already made in the rest of our silicon, and we're going to push it super hard. We're really excited to partner with Microsoft on ONNX Runtime. I think it matters now because this silicon is here, you can buy this device. The next one is literally here, it's in the lab. We've got this gen AI stuff running on the next version of silicon in our labs right now, in my lab in San Diego. It's here. So you have to go to where the hockey puck is going, right? I really encourage it. I mean, you saw how much generative AI happened in the first, what is it, five months of the year. We have another six to go. So I think now, take advantage of it now. It's a great opportunity for us, working together with Microsoft, to make it happen on device, for sure. And it's a good question, because if I do my job
right, the framework for the platform above the hardware actually just lights up, right? So the more mature the hardware gets, the more NPUs that are entering the market and stuff like that, you would hope, and our intent is, that the ONNX framework, with Olive optimization and stuff, goes and leverages what's available. So it knows what the hardware is, it understands that the 8cx Gen 3 has an incredible NPU, and then it goes and offloads things. So we're on a journey, we're not there yet, but today we're showcasing hybrid AI, how we're running in different places. Hybrid also means we're running on different processors. So soon, not today, everybody will have a neural processing unit. AI workloads are uniquely designed to run on these NPUs, right? And so pretty soon, adopting the ONNX framework means you're adopting the best hardware opportunities you can for your app. And that goes back to the backward compatibility or forward compatibility: we're going to work with Microsoft to make that experience transportable, so that this next laptop gets even better and you can really leverage that experience, but without leaving your customers behind on their older hardware right now, like I said. Yeah. Okay, appreciate it, thanks. Any other questions from the audience? I think we're almost at the session mark, so, five minutes or so, is that about right? Yeah, I think we have about five minutes. Great, so I'll tee up a couple of softballs. So, hey, you have a booth here. Yeah. I think you had a couple of things running. Do you have your sample app, Air Derby, running and stuff? Yeah, there's a bunch of them. That's gaming, that's AI in gaming, right? So again, I think that AI is really going to be everywhere. This idea that you could have an agent, I don't know, I'm not a huge gamer, but you could play against somebody on the plane. Your buddy is not available to play, you need an opponent, and you're
gonna get an opponent that's going to learn your tricks and really give you that competition. Super resolution, we didn't talk about that, but super-resolving images in a game, right? An XR experience where you can super-resolve the part that's foveated, whatever you're looking at, and not spend rendering energy on the pixels your eye can't see. AI is, back to this gentleman's question, becoming so pervasive, at every level of the user experience, from the camera that's looking back at you, like the experience pack, to super resolution. I know we're working with you guys on some super-res stuff, but we also do it with Meta, we work on XR super-res stuff. It's everywhere. And I think one of the challenges is that we're here as developers, we're at a developer conference, but for our end users it's like magic. We want to make it invisible to them for the most part, we just want them to have some magical experience. Right, that's awesome. And so, hey, we're winding down. I believe there's a final demo we perhaps want to close out on. Yeah, I think let's have another demo. Okay, another one. Yeah, let's do it. Maybe, at the risk of making it a real demo, we could take a prompt from the audience, if you don't mind. So does anybody have an idea for a prompt? Yeah, anyone online with a suggestion for a prompt? Okay, get set up. There you go. For you sports fans, we got a sports one. All right, yeah, let's go and do that one. What do we got? We got "portrait of Lionel Messi, red on green." Let's see what else. We're gonna throw some of the side posts, we tried it earlier. You had angry eyes or serious eyes or something. Serious eyes. For you soccer fans out there, let's see what we can do. The eyes are always tough. Let's see what it does. And this will show you that it's amazing. Imagine what your developers can do if this is just
the start of the gen AI revolution. Okay, it's looking good. It's going to be a head-on shot, right? Yeah. Yeah, it's amazing. I said red on green, and there you go. How's that? That's great. Serious eyes. Very nice, thank you. Yeah, what if we said funny eyes or something? Right, laughing. Say laughing and see what happens. You never know. This is like Vegas, only better. Again, it'll be in our booth, you could come between sessions. We wouldn't want you to miss it, you have to do the ONNX Runtime session, but if you have a minute, come by, type your favorite little query in, and you'll see it live. You'll see that it's, uh, related. We have a Windows on Arm session as well, and so we showcase some of the great Qualcomm stuff there as well. Excited about it. There you go, fun. So, okay, this is good. Yeah, it's been a lot of fun. Thank you very much. Thank you very much, and thank you for having us at Build. Really excited, lots more coming. Honor to have you. Thank you very much, appreciate it. Thank you. Thanks, everybody.
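
A rough sketch of the quantization point from the Q&A (making a 10-billion-parameter model physically smaller so it takes less storage and less RAM). This is not Qualcomm's or ONNX Runtime's tooling: the function names below are hypothetical, and symmetric per-tensor rounding is assumed here only as the simplest form of 8-bit quantization. The session does not say which scheme is actually used.

```python
# Illustrative only: every name here is hypothetical and not part of the
# Qualcomm AI Stack or ONNX Runtime. Pure standard-library Python.

def weight_footprint_gb(num_params: int, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the shared scale."""
    return [v * scale for v in quantized]


if __name__ == "__main__":
    params = 10_000_000_000  # the "10 billion parameter" model size from the talk
    for bits in (32, 16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_footprint_gb(params, bits):5.1f} GB")

    weights = [0.82, -1.27, 0.05, 0.4, -0.63]  # made-up example values
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(q, f"max round-trip error = {worst:.6f}")
```

Under these assumptions the weights of a 10-billion-parameter model come to about 40 GB at 32 bits and about 10 GB at 8 bits, which is the storage, load-time, and RAM saving described in the answer. The round-trip error printed at the end is the accuracy cost that, as the speakers note, has to be managed.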

2023-05-26 10:23
