This is how intelligence is made: a new kind of factory, generator of tokens, the building blocks of AI. Tokens have opened a new frontier, the first step into an extraordinary world where endless possibilities are born. [Music] Tokens transform words into knowledge and breathe life into images. They turn ideas into videos and help us safely navigate any environment. Tokens teach robots to move like the masters, [Music] inspire new ways to celebrate our victories ("A martini, please." "Coming right up." "Thank you, Adam."), and give us peace of mind when we need it most ("Hi, Moroka." "Hi, Anna, it's good to see you again." "Hi, Emma. We're going to take your blood sample today, okay? Don't worry, I'm going to be here the whole time."). They bring meaning to numbers to help us better understand the world around us, [Music] predict the dangers that surround us, [Music] and find cures for the threats within us. [Music] Tokens can bring our visions to life [Music] and restore what we've lost. [Music] [Applause] "Zachary, I got my voice back, buddy." They help us move forward, one small step at a time [Music] and one giant leap together. [Music] And here is where it all begins.

Welcome to the stage NVIDIA founder and CEO, Jensen Huang. [Music] [Applause]

Welcome to CES! Are you excited to be in Las Vegas? Do you like my jacket? I thought I'd go the other way from Gary Shapiro. I'm in Las Vegas, after all. If this doesn't work out, if all of you object, well, just get used to it. I really think you have to let this sink in. In another hour or so, you're going to feel good about it.

Well, welcome to NVIDIA. In fact, you're inside NVIDIA's digital twin, and we're going to take you to NVIDIA. Ladies and gentlemen, welcome to NVIDIA. You're inside our digital twin. Everything here is generated by AI. It has been an extraordinary journey, an extraordinary year, and it started in 1993. Ready, go. With NV1, we wanted to build computers that can do things that normal computers couldn't, and NV1 made it possible to have a
game console in your PC. Our programming architecture was called UDA, missing the letter C until a little while later, but UDA: Unified Device Architecture. And the first developer for UDA, and the first application that ever worked on UDA, was Sega's Virtua Fighter. Six years later, in 1999, we invented the programmable GPU, and it started 20-plus years of incredible advance in this incredible processor called the GPU. It made modern computer graphics possible. And now, 30 years later, Sega's Virtua Fighter is completely cinematic. This is the new Virtua Fighter project that's coming. I just can't wait. Absolutely incredible.

Six years after that, six years after 1999, we invented CUDA so that we could express the programmability of our GPUs to a rich set of algorithms that could benefit from it. CUDA initially was difficult to explain, and it took years. In fact, it took approximately six years. Somehow, six years later or so, in 2012, Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton discovered CUDA, used it to process AlexNet, and the rest of it is history.

AI has been advancing at an incredible pace since. It started with perception AI: we now can understand images and words and sounds. Then generative AI: we can generate images and text and sounds. And now agentic AI: AIs that can perceive, reason, plan, and act. And then the next phase, some of which we'll talk about tonight: physical AI.

2012. Now, magically, in 2018 something happened that was pretty incredible. Google's Transformer was released as BERT, and the world of AI really took off. Transformers, as you know, completely changed the landscape for artificial intelligence. In fact, they completely changed the landscape for computing altogether. We recognized properly that AI was not just a new application with a new business opportunity, but that AI, and more importantly machine learning enabled by transformers, was going to fundamentally change how computing works. And today computing is revolutionized in every single layer, from hand
coding instructions that run on CPUs to create software tools that humans use, we now have machine learning that creates and optimizes neural networks that process on GPUs and create artificial intelligence. Every single layer of the technology stack has been completely changed, an incredible transformation in just 12 years.

Well, we can now understand information of just about any modality. Surely you've seen text and images and sounds and things like that, but not only can we understand those, we can understand amino acids, we can understand physics. We understand them, we can translate them, and we can generate them. The applications are just completely endless. In fact, for almost any AI application that you see out there, ask: what modality is the input that it learned from, what modality of information did it translate to, and what modality of information is it generating? If you ask these three fundamental questions, just about every single application could be inferred. And so when you see application after application that is AI-driven, AI-native, at the core of it this fundamental concept is there. Machine learning has changed how every application is going to be built, how computing will be done, and the possibilities beyond.

Well, GPUs. GeForce. In a lot of ways, all of this with AI is the house that GeForce built. GeForce enabled AI to reach the masses, and now AI is coming home to GeForce. There are so many things that you can't do without AI. Let me show you some of it now. [Music] [Applause]

That was real-time computer graphics. No computer graphics researcher, no computer scientist would have told you that it is possible for us to ray trace every single pixel at this point. Ray tracing is a simulation of light. The amount of geometry that you saw was absolutely insane. It would have been impossible without artificial intelligence. There are two fundamental things that we did. We used, of course, programmable shading and ray-tracing acceleration to produce incredibly
beautiful pixels, but then we have artificial intelligence be conditioned, be controlled, by that pixel to generate a whole bunch of other pixels. Not only is it able to generate pixels spatially, because it's aware of what the colors should be; it has been trained on a supercomputer back at NVIDIA, and so the neural network that's running on the GPU can infer and predict the pixels that we did not render. Not only can we do that (it's called DLSS), the latest generation of DLSS also generates beyond frames. It can predict the future, generating three additional frames for every frame that we calculate. Take what you saw: because we're going to render one frame and generate three, four frames at full HD, 4K, is 33 million pixels or so. Out of those 33 million pixels, we computed only two. It is an absolute miracle that we can computationally, using programmable shaders and our ray-tracing engine, compute 2 million pixels and have AI predict all of the other 33. And as a result, we're able to render at incredibly high performance, because AI does a lot less computation. It takes, of course, an enormous amount of training to produce that, but once you train it, the generation is extremely efficient. So this is one of the incredible capabilities of artificial intelligence, and that's why there are so many amazing things that are happening. We used GeForce to enable artificial intelligence, and now artificial intelligence is revolutionizing GeForce.

Everyone, today we're announcing our next generation, the RTX Blackwell family. Let's take a look. [Music]

Here it is: our brand new GeForce RTX 50 Series, Blackwell architecture. The GPU is just a beast: 92 billion transistors; 4,000 TOPS, four petaflops of AI, three times higher than the last generation, Ada, and we need all of it to generate those pixels that I showed you; 380 ray-tracing teraflops, so that for the pixels that we have to compute, we could compute the
most beautiful image you possibly can; and of course 125 shader teraflops. There are actually concurrent shader teraflops as well as an integer unit of equal performance, so two dual shaders: one is for floating point, one is for integer. G7 memory from Micron, 1.8 terabytes per second, twice the performance of our last generation, and we now have the ability to intermix AI workloads with computer graphics workloads. One of the amazing things about this generation is that the programmable shader is also now able to process neural networks. So the shader is able to carry these neural networks, and as a result we invented neural texture compression and neural material shading. As a result of that, you get these amazingly beautiful images that are only possible because we use AIs to learn the texture, learn a compression algorithm, and as a result get extraordinary results.

Okay, so this is the brand new RTX Blackwell 5090. Now, even the mechanical design is a miracle. Look at this: it's got two fans. This whole graphics card is just one giant fan. So the question is, where's the graphics card? Is it literally this big? The voltage regulator design is state of the art. Incredible design; the engineering team did a great job. So here it is. Thank you.

Okay, so those are the speeds and feeds. So how does it compare? Well, this is the RTX 4090. I know many of you have one. I know it. Look, it's $1,599. It is one of the best investments you could possibly make. For $1,599, you bring it home to your $10,000 PC entertainment command center. Isn't that right? Don't tell me that's not true. Don't be ashamed. It's liquid cooled, fancy lights all over it. You lock it when you leave. It's the modern home theater. It makes perfect sense. And now, for $1,599, you get to upgrade that and turbocharge the living daylights out of it. Well, now, with the Blackwell family: RTX 5070, 4090 performance at $549. [Applause] Impossible without artificial intelligence, impossible without the four
TOPS, four tera-ops of AI Tensor Cores, impossible without the G7 memories. Okay, so 5070: 4090 performance, $549. And here's the whole family, starting from the 5070 all the way up to the 5090. The 5090: twice the performance of a 4090. Of course, we're producing at very large scale, with availability starting in January.

Well, it is incredible, but we managed to put these gigantic-performance GPUs into a laptop. This is a 5070 laptop for $1,299. This 5070 laptop has 4090 performance. I think there's one here somewhere. Let me show you this. Look at this thing here. There are only so many pockets. Ladies and gentlemen, Janine Paul. [Applause]

So can you imagine: you get this incredible graphics card here, Blackwell, and we're going to shrink it and put it in there? Does that make any sense? Well, you can't do that without artificial intelligence, and the reason for that is that we're generating most of the pixels using our Tensor Cores. So we ray trace only the pixels we need, and we generate all the other pixels using artificial intelligence. As a result, the energy efficiency is just off the charts. The future of computer graphics is neural rendering, the fusion of artificial intelligence and computer graphics. And what's really amazing is... oh, here we go, thank you. This is a surprisingly kinetic keynote. And what's really amazing is the family of GPUs we're going to put in here. The 5090 will fit into a laptop, a thin laptop. That last laptop was 14.9 mm. You've got a 5080, a 5070 Ti, and a 5070. Okay, so, ladies and gentlemen, the RTX Blackwell family. [Applause]

Well, GeForce brought AI to the world, democratized AI. Now AI has come back and revolutionized GeForce. Let's talk about artificial intelligence. Let's go somewhere else at NVIDIA. This is literally our office. This is literally NVIDIA's headquarters. Okay, so let's talk about AI. The industry is chasing and racing to scale artificial intelligence,
and the scaling law is a powerful model. It's an empirical law that has been observed and demonstrated by researchers and industry over several generations. The scaling law says that the more training data you have, the larger the model you have, and the more compute you apply to it, the more effective, or the more capable, your model will become. And so the scaling law continues. What's really amazing is that the internet is producing about twice the amount of data every single year as it did the year before. I think in the next couple of years humanity will produce more data than all of humanity has ever produced since the beginning. And so we're still producing a gigantic amount of data, and it's becoming more multimodal: video and images and sound. All of that data could be used to train the fundamental knowledge, the foundational knowledge, of an AI.

But there are in fact two other scaling laws that have now emerged, and they're somewhat intuitive. The second scaling law is the post-training scaling law. Post-training scaling uses techniques like reinforcement learning from human feedback. Basically, the AI produces and generates answers based on a human query, and the human then, of course, gives feedback. It's much more complicated than that, but the reinforcement learning system, with a fair number of very high-quality prompts, causes the AI to refine its skills. It could fine-tune its skills for particular domains. It could be better at solving math problems, better at reasoning, and so on and so forth. And so it's essentially like having a mentor or a coach give you feedback after you're done going to school. You get tests, you get feedback, you improve yourself. We also have reinforcement learning from AI feedback, and we have synthetic data generation. These techniques are rather akin to, if you will,
self-practice. You know the answer to a particular problem, and you continue to try it until you get it right. And so an AI could be presented with a very complicated and difficult problem that is functionally verifiable and has an answer that we understand, maybe proving a theorem, maybe solving a geometry problem. These problems would cause the AI to produce answers, and using reinforcement learning it would learn how to improve itself. That's called post-training. Post-training requires an enormous amount of computation, but the end result produces incredible models.

We now have a third scaling law, and this third scaling law has to do with what's called test-time scaling. Test-time scaling is basically when you're using the AI: the AI has the ability to now apply a different resource allocation. Instead of improving its parameters, now it's focused on deciding how much computation to use to produce the answers it wants to produce. Reasoning is a way of thinking about this. Long thinking is a way to think about this. Instead of a direct inference or one-shot answer, you might reason about it, you might break down the problem into multiple steps, you might generate multiple ideas, and your AI system would evaluate which one of the ideas that you generated was the best one. Maybe it solves the problem step by step, and so on and so forth. And so now test-time scaling has proven to be incredibly effective. You're watching this sequence of technology, all of these scaling laws, emerge as we see incredible achievements from ChatGPT to o1 to o3 and now Gemini Pro. All of these systems are going through this journey, step by step, of pre-training to post-training to test-time scaling.

Well, the amount of computation that we need, of course, is incredible, and we would like, in fact, for society to have the ability to scale the amount of computation to produce more and
more novel and better intelligence. Intelligence, of course, is the most valuable asset that we have, and it can be applied to solve a lot of very challenging problems. And so the scaling laws are driving enormous demand for NVIDIA computing. They're driving enormous demand for this incredible chip we call Blackwell. Let's take a look at Blackwell.

Well, Blackwell is in full production. It is incredible what it looks like. First of all, every single cloud service provider now has systems up and running. We have systems here from about 15 computer makers. It's being made in about 200 different SKUs, 200 different configurations. They're liquid cooled, air cooled, x86, NVIDIA Grace CPU versions, NVLink 36x2, NVLink 72x1, a whole bunch of different types of systems, so that we can accommodate just about every single data center in the world. These systems are currently being manufactured in some 45 factories. It tells you how pervasive artificial intelligence is and how much the industry is jumping onto artificial intelligence and this new computing model. Well, the reason why we're driving it so hard is that we need a lot more computation, and it's very clear that... Janine? You know, it's hard to tell. You don't ever want to reach your hands into a dark place. Hang on a second. Is this a good idea? All right. [Applause] [Music] Wait for it. Wait for it. I thought I was worthy. Apparently yor didn't think I was worthy. All right, this is my show and tell.

So, this NVLink system, this right here, is GB200 NVLink72. It is 1 and 1/2 tons, 600,000 parts, approximately equal to 20 cars, 120 kilowatts. It has a spine behind it that connects all of these GPUs together: two miles of copper cable, 5,000 cables. This is being manufactured in 45 factories around the world. We build them, we liquid cool them, we test them, we disassemble them, and we ship the parts to the data centers, because it's 1
and 1/2 tons. We reassemble it outside the data centers and install them. The manufacturing is insane. But the goal of all of this is that the scaling laws are driving computing so hard that we need this level of computation. Blackwell, over our last generation, improves the performance per watt by a factor of four and the performance per dollar by a factor of three. That basically says that in one generation we reduced the cost of training these models by a factor of three, or, if you want to increase the size of your model by a factor of three, it's about the same cost. But the important thing is this: these are generating tokens that are being used by all of us when we use ChatGPT or when we use Gemini or use our phones. In the future, just about all of these applications are going to be consuming these AI tokens, and these AI tokens are being generated by these systems. And every single data center is limited by power. So if the perf per watt of Blackwell is four times our last generation, then the revenue that could be generated, the amount of business that can be generated in the data center, is increased by a factor of four. And so these AI factory systems really are factories today.

Now, the goal of all of this is so that we can create one giant chip. The amount of computation we need is really quite incredible, and this is basically one giant chip. If we would have had to build a chip... here we go. Sorry, guys. You see that? That's cool. Look at that, disco lights in here, right? If we had to build this as one chip, obviously this would be the size of the wafer. But this doesn't include the impact of yield; it would have to be probably three or four times the size. But what we basically have here is 72 Blackwell GPUs, or 144 dies. This one chip here is 1.4 exaflops. The
world's largest supercomputer, the fastest supercomputer, an entire room of supercomputer, only recently achieved an exaflop-plus. This is 1.4 exaflops of AI floating-point performance. It has 14 terabytes of memory, but here's the amazing thing: the memory bandwidth is 1.2 petabytes per second. That's basically the entire internet traffic that's happening right now. The entire world's internet traffic is being processed across these chips. And we have 130 trillion transistors in total, 2,592 CPU cores, a whole bunch of networking. And so these... I wish I could do this. I don't think I will. So these are the Blackwells. These are our ConnectX networking chips. These are the NVLink, and we're trying to pretend about the NVLink spine, but that's not possible. And these are all of the HBM memories, 14 terabytes of HBM memory. This is what we're trying to do, and this is the miracle of the Blackwell system. The Blackwell die is right here. It is the largest single chip the world's ever made. But the miracle is really, in addition to that, this: the Grace Blackwell system.

Well, the goal of all of this, of course, is so that we can... thank you. Thanks. Boy, is there a chair I could sit down on for a second? Can I have a Michelob Ultra? How is it possible that we're in the Michelob Ultra stadium? It's like coming to NVIDIA and we don't have a GPU for you.

So we need an enormous amount of computation, because we want to train larger and larger models. And these inferences used to be one inference, but in the future the AI is going to be talking to itself. It's going to be thinking. It's going to be internally reflecting, processing. Today, when the tokens are being generated at you, so long as they're coming out at 20 or 30 tokens per second, it's basically as fast as anybody can read. However, in the future, and right now with GPT-o1, you know, with the new Gemini Pro and the new o1 and o3 models, they're talking to
themselves. They're reflecting, they're thinking. And so, as you can imagine, the rate at which the tokens could be ingested is incredibly high, and so we need the token generation rates to go way up, and we also have to drive the cost way down simultaneously, so that the quality of service can be extraordinary, the cost to customers can continue to be low, and AI will continue to scale. That's the fundamental purpose, the reason why we created NVLink.

Well, one of the most important things that's happening in the world of enterprise is agentic AI. Agentic AI basically is a perfect example of test-time scaling. It's a system of models. Some of it is understanding, interacting with the customer, interacting with the user. Some of it is maybe retrieving information from storage, a semantic AI system like a RAG. Maybe it's going on to the internet. Maybe it's studying a PDF file. And so it might be using tools, it might be using a calculator, and it might be using a generative AI to generate charts and such. And it's iterating: it's taking the problem you gave it, breaking it down step by step, and iterating through all these different models. Well, in order for AI to respond to a customer, it used to be: ask a question, and the answer starts spewing out. In the future, you ask a question and a whole bunch of models are going to be working in the background. And so with test-time scaling, the amount of computation used for inferencing is going to go through the roof. It's going to go through the roof because we want better and better answers.

Well, to help the industry build agentic AI: our go-to-market is not direct to enterprise customers. Our go-to-market is that we work with software developers in the IT ecosystem to integrate our technology to make new capabilities possible. Just like we did with CUDA libraries, we now want to do that with AI libraries. And just as the computing model of the past
has APIs that are doing computer graphics or doing linear algebra or doing fluid dynamics, in the future, on top of those CUDA acceleration libraries, we'll have AI libraries. We've created three things for helping the ecosystem build agentic AI. NVIDIA NIMs are essentially AI microservices, all packaged up. It takes all of this really complicated CUDA software, cuDNN, CUTLASS, or TensorRT-LLM, or Triton, all of these different, really complicated pieces of software, and the model itself; we package it up, we optimize it, we put it into a container, and you could take it wherever you like. And so we have models for vision, for understanding languages, for speech, for animation, for digital biology, and we have some new, exciting models coming for physical AI. These AI models run in every single cloud, because NVIDIA's GPUs are now available in every single cloud. They're available at every single OEM. So you could literally take these models, integrate them into your software packages, and create AI agents that run on Cadence, or they might be ServiceNow agents, or they might be SAP agents, and they could deploy them to their customers and run them wherever the customers want to run the software.

The next layer is what we call NVIDIA NeMo. NeMo is essentially a digital employee onboarding, training, and evaluation system. In the future, these AI agents are essentially a digital workforce working alongside your employees, doing things for you on your behalf. And so the way that you would bring these specialized agents into your company is to onboard them, just like you onboard an employee. We have different libraries that help these AI agents be trained for the type of language in your company. Maybe the vocabulary is unique to your company, the business process is different, the way you work is different. So you would give them examples of what the work product should look like, and they would try to generate it, and
you would give feedback, and then you would evaluate them, and so on and so forth. And you would guardrail them: you say, these are the things that you're not allowed to do, these are the things you're not allowed to say. And we even give them access to certain information. So that entire pipeline, a digital employee pipeline, is called NeMo. In a lot of ways, the IT department of every company is going to be the HR department of AI agents in the future. Today they manage and maintain a bunch of software from the IT industry. In the future, they will maintain, nurture, onboard, and improve a whole bunch of digital agents and provision them to the companies to use. And so your IT department is going to become kind of like AI agent HR.

And on top of that, we provide a whole bunch of blueprints that our ecosystem could take advantage of. All of this is completely open source, and so you could take it and modify the blueprints. We have blueprints for all kinds of different types of agents.

Well, today we're also announcing that we're doing something that's really cool, and I think really clever. We're announcing a whole family of models that are based off of Llama, the NVIDIA Llama Nemotron language foundation models. Llama 3.1 is a complete phenomenon. The download of Llama 3.1
from Meta: 350, 650,000 times, something like that. It has been derived and turned into about 60,000 other, different models. It is singularly the reason why just about every single enterprise and every single industry has been activated to start working on AI. Well, the thing that we did was, we realized that the Llama models really could be better fine-tuned for enterprise use. And so we fine-tuned them using our expertise and our capabilities, and we turned them into the Llama Nemotron suite of open models. There are small ones that interact with very, very fast response time, extremely small. There are what we call Llama Nemotron Supers; they're basically your mainstream versions of your models. Or your Ultra model: the Ultra model could be used as a teacher model for a whole bunch of other models. It could be a reward model, an evaluator, a judge for other models that create answers, deciding whether it's a good answer or not, basically giving feedback to other models. It could be distilled in a lot of different ways. It's basically a teacher model, a knowledge distillation model, very large, very capable. And so all of this is now available online.

Well, these models are incredible. They're number one on leaderboards: the leaderboard for chat, the leaderboard for instruction, the leaderboard for retrieval, so the different types of functionality that are used in AI agents around the world. These are going to be incredible models for you.

We're also working with the ecosystem. All of our NVIDIA AI technologies are integrated into the IT industry. We have great partners and really great work being done at ServiceNow, at SAP, at Siemens for industrial AI. Cadence is doing great work, Synopsys is doing great work. I'm really proud of the work that we do with Perplexity; as you know, they revolutionized search. Yeah, really fantastic stuff. Codeium: every software engineer in the world... this is going to be the next giant AI
application, the next giant AI service, period, is software coding. There are 30 million software engineers around the world. Everybody is going to have a software assistant helping them code. If not, obviously, you're going to be way less productive and create lesser-quality code. And so this is 30 million. There's a billion knowledge workers in the world. It is very, very clear that AI agents are probably the next robotics industry and likely to be a multi-trillion-dollar opportunity. Well, let me show you some of the blueprints that we've created and some of the work that we've done with our partners with these AI agents.

AI agents are the new digital workforce, working for and with us. AI agents are a system of models that reason about a mission, break it down into tasks, and retrieve data or use tools to generate a quality response. NVIDIA's agentic AI building blocks, NIM pre-trained models and the NeMo framework, let organizations easily develop AI agents and deploy them anywhere. We will onboard and train our agentic workforces on our company's methods, like we do for employees. AI agents are domain-specific task experts. Let me show you four examples. For the billions of knowledge workers and students, AI research assistant agents ingest complex documents like lectures, journals, and financial results and generate interactive podcasts for easy learning. By combining a U-Net regression model with a diffusion model, CorrDiff can downscale global weather forecasts from 25 km to 2 km. Developers, like at NVIDIA, manage software security AI agents that continuously scan software for vulnerabilities, alerting developers to what action is needed. Virtual lab AI agents help researchers design and screen billions of compounds to find promising drug candidates faster than ever. NVIDIA analytics AI agents, built on an NVIDIA Metropolis blueprint including NVIDIA Cosmos Nemotron vision language models, Llama Nemotron LLMs, and NeMo Retriever: Metropolis agents analyze content from the billions of cameras generating
100,000 petabytes of video per day. They enable interactive search, summarization, and automated reporting, and help monitor traffic flows, flagging congestion or danger. In industrial facilities, they monitor processes and generate recommendations for improvement. Metropolis agents centralize data from hundreds of cameras and can reroute workers or robots when incidents occur. The age of agentic AI is here for every organization.

Okay, that was the first pitch at a baseball game. That was not generated. I just felt that none of you were impressed. Okay, so AI was created in the cloud and for the cloud, and for enjoying AI on phones, of course, it's perfect. Very, very soon we're going to have a continuous AI that's going to be with you, and when you use those Meta glasses, you could of course point at something, look at something, and ask it whatever information you want. And so AI is perfect in the cloud. It was created in the cloud; it's perfect in the cloud. However, we would love to be able to take that AI everywhere. I've mentioned already that you could take NVIDIA AI to any cloud, but you could also put it inside your company. But the thing that we want to do more than anything is put it on our PC as well.

And so, as you know, Windows 95 revolutionized the computer industry. It made possible this new suite of multimedia services, and it changed the way that applications were created forever. Windows 95, this model of computing, of course is not perfect for AI. And so the thing that we would like to do is to have, in the future, your AI basically become your AI assistant. And instead of just the 3D APIs and the sound APIs and the video APIs, you would have generative APIs: generative APIs for 3D, generative APIs for language, generative AI for sound, and so on and so forth. And we need a system that makes that possible while leveraging the massive investment that's in the cloud. There's no way that the world
can create yet another way of programming AI models it's just not going to happen and so if we could figure out a way to make Windows PC a world-class AI PC it would be completely awesome and it turns out the answer is Windows it's Windows WSL2 Windows WSL2 basically it's two operating systems within one it works perfectly it's developed for developers and it's developed so that you can have access to bare metal WSL2 has been optimized for cloud native applications and very importantly it's been optimized for Cuda and so WSL2 supports Cuda perfectly out of the box as a result everything that I showed you with Nvidia NIMs Nvidia NeMo the blueprints that we develop that are going to be up in ai. so long as the computer fits it so long as you can fit that model and we're going to have many models that fit whether it's vision models or language models or speech models or these animation digital human models all kinds of different types of models are going to be perfect for your PC and you download it and it should just run and so our focus is to turn Windows WSL2 Windows PC into a target first-class platform that we will support and maintain for as long as we shall live and so this is an incredible thing for engineers and developers everywhere let me show you something that we can do with that this is one of the examples of a blueprint we just made for you generative AI synthesizes amazing images from simple text prompts yet image composition can be challenging to control using only words with Nvidia NIM microservices creators can use simple 3D objects to guide AI image generation let's see how a concept artist can use this technology to develop the look of a scene they start by laying out 3D assets created by hand or generated with AI then use an image generation NIM such as FLUX to create a visual that adheres to the 3D scene add or move objects to refine the composition
change camera angles to frame the perfect shot or reimagine the whole scene with a new prompt assisted by generative AI and Nvidia NIM an artist can quickly realize their [Music] vision Nvidia AI for your PCs hundreds of millions of PCs in the world with Windows and so we could get them ready for AI OEMs all the PC OEMs we work with just basically all of the world's leading PC OEMs are going to get their PCs ready for this stack and so AI PCs are coming to a home near you Linux is good okay let's talk about physical AI speaking of Linux let's talk about physical AI so physical AI imagine whereas your large language model you give it your context your prompt on the left and it generates tokens one at a time to produce the output that's basically how it works the amazing thing is this model in the middle is quite large has billions of parameters the context length is incredibly large because you might decide to load in a PDF in my case I might load in several PDFs before I ask it a question those PDFs are turned into tokens the basic attention characteristic of a Transformer has every single token find its relationship and relevance against every other token so you could have hundreds of thousands of tokens and the computational load increases quadratically and it does this with all of the parameters and all of the input sequence processing it through every single layer of the Transformer and it produces one token that's the reason why we needed Blackwell and then the next token is produced when the current token is done it puts the current token into the input sequence and takes that whole thing and generates the next token it does it one at a time this is the Transformer model it's the reason why it is so incredibly effective and computationally demanding what if instead of PDFs it's your surroundings and what if instead of the prompt a question it's a request go over there and pick up that you know that box and bring it back and instead of
what is produced in tokens being text it produces action tokens well what I just described is a very sensible thing for the future of robotics and the technology is right around the corner but what we need to do is we need to create effectively the world model you know as opposed to GPT which is a language model and this world model has to understand the language of the world it has to understand physical dynamics things like gravity and friction and inertia it has to understand geometric and spatial relationships it has to understand cause and effect if you drop something it falls to the ground if you you know poke at it it tips over it has to understand object permanence if you roll a ball over the kitchen counter when it goes off the other side the ball didn't leave into another quantum universe it's still there and so all of these types of understanding are intuitive understanding that we know that most models today have a very hard time with and so we would like to create a world we need a world Foundation model today we're announcing a very big thing we're announcing Nvidia Cosmos a world Foundation model that was created to understand the physical world and the only way for you to really understand this is to see it let's [Music] flip the next frontier of AI is physical AI model performance is directly related to data availability but physical world data is costly to capture curate and label Nvidia Cosmos is a world Foundation model development platform to advance physical AI it includes autoregressive world Foundation models diffusion-based world Foundation models advanced tokenizers and an Nvidia Cuda and AI accelerated data pipeline Cosmos models ingest text image or video prompts and generate virtual world states as videos Cosmos generations prioritize the unique requirements of AV and robotics use cases like real world environments lighting and object permanence developers use Nvidia Omniverse to build
physics-based geospatially accurate scenarios then output Omniverse renders into Cosmos which generates photoreal physically based synthetic [Music] data whether diverse objects or environments conditions like weather or time of day or edge case scenarios developers use Cosmos to generate worlds for reinforcement learning AI feedback to improve policy models or to test and validate model performance even across multisensor views Cosmos can generate tokens in real time bringing the power of foresight and multiverse simulation to AI models generating every possible future to help the model select the right path working with the world's developer ecosystem Nvidia is helping advance the next wave of physical [Music] AI Nvidia Cosmos Nvidia Cosmos the world's first world Foundation model it is trained on 20 million hours of video the 20 million hours of video focuses on physical dynamic things so dynamic nature themes humans walking hands moving manipulating things you know things that are fast camera movements it's really about teaching the AI not about generating creative content but teaching the AI to understand the physical world and with this physical AI there are many downstream things that we could do as a result we could do synthetic data generation to train models we could distill it and turn it into effectively the seed the beginnings of a robotics model you could have it generate multiple physically based physically plausible scenarios of the future basically do a Doctor Strange because this model understands the physical world of course you saw a whole bunch of images generated this model understanding the physical world it also could do of course captioning and so it could take videos caption them incredibly well and that captioning and the video could be used to train large language models multimodal large language models and so you could use this
technology to use this Foundation model to train robots as well as large language models and so this is the Nvidia Cosmos the platform has an autoregressive model for real-time applications has a diffusion model for very high quality image generation an incredible tokenizer basically learning the vocabulary of the real world and a data pipeline so that if you would like to take all of this and then train it on your own data this data pipeline because there's so much data involved we've accelerated everything end to end for you and so this is the world's first data processing pipeline that's Cuda accelerated as well as AI accelerated all of this is part of the Cosmos platform and today we're announcing that Cosmos is open licensed it's open available on GitHub we hope that this moment and there's a small medium large for very fast models you know mainstream models and also teacher models basically knowledge transfer models Cosmos world Foundation model being open we really hope will do for the world of robotics and industrial AI what Llama 3 has done for enterprise AI the magic happens when you connect Cosmos to Omniverse and the reason fundamentally is this Omniverse is a physics grounded not physically grounded but physics grounded it's algorithmic physics principled physics simulation grounded system it's a simulator when you connect that to Cosmos it provides the grounding the ground truth that can control and condition the Cosmos generation as a result what comes out of Cosmos is grounded on truth this is exactly the same idea as connecting a large language model to a RAG to a retrieval augmented generation system you want to ground the AI generation on ground truth and so the combination of the two gives you a physically simulated a physically grounded multiverse generator and the application the use cases are really quite exciting and of course for robotics for industrial applications it is
very very clear this Omniverse plus Cosmos represents the third computer that's necessary for building robotic systems every robotics company will ultimately have to build three computers the robotics system could be a factory the robotics system could be a car it could be a robot you need three fundamental computers one computer of course to train the AI we call it the DGX computer to train the AI another of course when you're done to deploy the AI we call that AGX that's inside the car in the robot or in an AMR or you know in a stadium or whatever it is these computers are at the edge and they're autonomous but to connect the two you need a digital twin and this is all the simulations that you were seeing the digital twin is where the AI that has been trained goes to practice to be refined to do its synthetic data generation reinforcement learning AI feedback such and such and so it's the digital twin of the AI these three computers are going to be working interactively Nvidia's strategy for the industrial world and we've been talking about this for some time is this three computer system you know instead of a three body problem we have a three computer solution and so it's the Nvidia robotics so let me give you three examples all right so the first example is how we apply all of this to industrial digitalization there are millions of factories hundreds of thousands of warehouses that's basically the backbone of a $50 trillion manufacturing industry all of that has to become software defined all of that has to have automation in the future and all of it will be infused with robotics well we're partnering with KION the world's leading warehouse automation solutions provider and Accenture the world's largest professional services provider and they have a big focus in digital manufacturing and we're working together to create something that's really special and I'll show you that in a second but our
go to market is essentially the same as all of the other software platforms and all the technology platforms that we have through the developers and ecosystem partners and we have just a growing number of ecosystem partners connecting to Omniverse and the reason for that is very clear everybody wants to digitalize the future of industries there's so much waste so much opportunity for automation in that $50 trillion of the world's GDP so let's take a look at this one example that we're doing with KION and Accenture KION the supply chain solution company Accenture a global leader in professional services and Nvidia are bringing physical AI to the $1 trillion warehouse and distribution center market managing high-performance warehouse logistics involves navigating a complex web of decisions influenced by constantly shifting variables these include daily and seasonal demand changes space constraints workforce availability and the integration of diverse robotic and automated systems and predicting operational KPIs of a physical warehouse is nearly impossible today to tackle these challenges KION is adopting Mega an Nvidia Omniverse blueprint for building industrial digital twins to test and optimize robotic fleets first KION's warehouse management solution assigns tasks to the industrial AI brains in the digital twin such as moving a load from a buffer location to a shuttle storage solution the robot's brains are in a simulation of a physical warehouse digitalized into Omniverse using OpenUSD connectors to aggregate CAD video and image to 3D lidar to point cloud and AI generated data the fleet of robots execute tasks by perceiving and reasoning about their Omniverse digital twin environment planning their next motion and acting the robot brains can see the resulting state through sensor simulations and decide their next action the loop continues while Mega precisely tracks the state of everything in the digital twin now
KION can simulate infinite scenarios at scale while measuring operational KPIs such as throughput efficiency and utilization all before deploying changes to the physical warehouse together with Nvidia KION and Accenture are reinventing industrial autonomy is that incredible everything is in simulation in the future every factory will have a digital twin and that digital twin operates exactly like the real factory and in fact you could use Omniverse with Cosmos to generate a whole bunch of future scenarios and then an AI decides which one of the scenarios is the most optimal for whatever KPIs and that becomes the programming constraints the program if you will the AI that will be deployed into the real factories the next example autonomous vehicles the AV revolution has arrived after so many years with Waymo's success and Tesla's success it is very very clear autonomous vehicles have finally arrived well our offering to this industry is the three computers the training systems for training the AIs the simulation systems and the synthetic data generation systems Omniverse and now Cosmos and also the computer that's inside the car each car company might work with us in a different way use one or two or three of the computers we're working with just about every major car company around the world Waymo and Zoox and Tesla of course in their data center BYD the largest EV company in the world JLR has got a really cool car coming Mercedes has a fleet of cars coming with Nvidia starting this year going to production and I'm super super pleased to announce that today Toyota and Nvidia are going to partner together to create their next generation AVs just so many cool companies Lucid and Rivian and Xiaomi and of course Volvo just so many different companies Waabi is building self-driving trucks Aurora we announced this week also that Aurora is going to use Nvidia to
build self-driving trucks autonomous 100 million cars built each year a billion vehicles on the road all over the world a trillion miles that are driven around the world each year that's all going to be either highly autonomous or you know fully autonomous coming up and so this is going to be a very large industry I predict that this will likely be the first multi-trillion dollar robotics industry this business for us notice in just a few of these cars that are starting to ramp into the world our business is already $4 billion and this year probably on a run rate of about $5 billion so really significant business already this is going to be very large well today we're announcing that our next generation processor for the car our next generation computer for the car is called Thor I have one right here hang on a second okay this is Thor this is a robotics computer it takes sensors and just a madness amount of sensor information and processes it you know umpteen cameras high resolution radars lidars they're all coming into this chip and this chip has to process all that sensor data turn it into tokens put them into a Transformer and predict the next path and this AV computer is now in full production Thor is 20 times the processing capability of our last generation Orin which is really the standard of autonomous vehicles today and so this is just really quite incredible Thor is in full production this robotics processor by the way also goes into a full robot and so it could be an AMR it could be a humanoid robot it could be the brain it could be the manipulator this processor basically is a universal robotics computer the second part of our Drive system that I'm incredibly proud of is the dedication to safety Drive OS I'm pleased to announce is now the first software-defined programmable AI computer that has been certified up to ASIL-D which is the highest standard of functional safety for
automobiles the only and the highest and so I'm really really proud of this ASIL-D ISO 26262 it is the work of some 15,000 engineering years this is just extraordinary work and as a result of that Cuda is now a functionally safe computer and so if you're building a robot Nvidia Cuda yeah okay so now I told you I was going to show you what we would use Omniverse and Cosmos to do in the context of self-driving cars and you know today instead of showing you a whole bunch of videos of cars driving on the road I'll show you some of that too but I want to show you how we use the car to reconstruct digital twins automatically using AI and use that capability to train future AI models okay let's play it the autonomous vehicle revolution is here building autonomous vehicles like all robots requires three computers Nvidia DGX to train AI models Omniverse to test drive and generate synthetic data and Drive AGX a supercomputer in the car building safe autonomous vehicles means addressing edge scenarios but real world data is limited so synthetic data is essential for training the autonomous vehicle data factory powered by Nvidia Omniverse AI models and Cosmos generates synthetic driving scenarios that enhance training data by orders of magnitude first Omnimap fuses map and geospatial data to construct drivable 3D environments driving scenario variations can be generated from replay drive logs or AI traffic generators next a neural reconstruction engine uses autonomous vehicle sensor logs to create high fidelity 4D simulation environments it replays previous drives in 3D and generates scenario variations to amplify training data finally Edify 3DS automatically searches through existing asset libraries or generates new assets to create sim-ready scenes the Omniverse scenarios are used to condition Cosmos to generate massive amounts of photorealistic data reducing the sim-to-real gap and with text prompts generate near infinite variations of the driving
scenario with Cosmos Nemotron video search the massively scaled synthetic data set combined with recorded drives can be curated to train models Nvidia's AI data factory scales hundreds of drives into billions of effective miles setting the standard for safe and advanced autonomous driving [Music] is that incredible we take thousands of drives and turn them into billions of miles we are going to have mountains of training data for autonomous vehicles of course we still need actual cars on the road of course we will continuously collect data for as long as we shall live however synthetic data generation using this multiverse physically based physically grounded capability so that we generate data for training AIs that are physically grounded and accurate or plausible so that we could have an enormous amount of data to train with the AV industry is here this is an incredibly exciting time super super excited about the next several years I think you're going to see just as computer graphics was revolutionized at such incredible pace you're going to see the pace of AV development increasing tremendously over the next several years I think the next part is robotics so humanoid robots my [Applause] friends the ChatGPT moment for general robotics is just around the corner and in fact all of the enabling technologies that I've been talking about are going to make it possible for us in the next several years to see very rapid breakthroughs surprising breakthroughs in general robotics now the reason why general robotics is so important is whereas robots with tracks and wheels require special environments to accommodate them there are three robots in the world that we can make that require no greenfields brownfield adaptation is perfect if we could possibly build these amazing robots we could deploy them in exactly the world that we've built for ourselves these three robots are one agentic robots
agentic AI because you know they're information workers so long as they could accommodate the computers that we have in our offices it's going to be great number two self-driving cars and the reason for that is we spent 100 plus years building roads and cities and then number three humanoid robots if we have the technology to solve these three this will be the largest technology industry the world's ever seen and so we think that robotics era is just around the corner the critical capability is how to train these robots in the case of humanoid robots the imitation information is rather hard to collect and the reason for that is in the case of a car you just drive it we're driving cars all the time in the case of these humanoid robots the imitation information the human demonstration is rather laborious to do and so we need to come up with a clever way to take hundreds of demonstrations thousands of human demonstrations and somehow use artificial intelligence and Omniverse to synthetically generate millions of synthetically generated motions and from those motions the AI can learn how to perform a task let me show you how that's done developers around the world are building the next wave of physical AI embodied robots humanoids developing general purpose robot models requires massive amounts of real world data which is costly to capture and curate Nvidia Isaac GR00T helps tackle these challenges providing humanoid robot developers with four things robot Foundation models data pipelines simulation frameworks and a Thor robotics computer the Nvidia Isaac GR00T blueprint for synthetic motion generation is a simulation workflow for imitation learning enabling developers to generate exponentially large data sets from a small number of demonstrations first GR00T-Teleop enables skilled human workers to portal into a digital twin of their robot using the Apple Vision Pro this means operators can capture data even without a physical robot and they can operate
the robot in a risk-free environment eliminating the chance of physical damage or wear and tear to teach a robot a single task operators capture motion trajectories through a handful of teleoperated demonstrations then use GR00T-Mimic to multiply these trajectories into a much larger data set next they use GR00T-Gen built on Omniverse and Cosmos for domain rand
2025-01-14 13:32