What's New for Xe2 Graphics and AI on Next-Gen Core Ultra | Talking Tech | Intel Technology
- Hi, and welcome to "Talking Tech." I'm your host, Alejandro Hoyos. And today we have a great show for you guys. We're gonna be talking about Xe2 cores. We're gonna be talking about Lunar Lake graphics.
And also, we're gonna have a chat on AI PC. (upbeat music) To have this amazing conversation, we have TAP with us. Hey, TAP.
- Hey, Alex. Good to see you. - Good to see you too. So tell us, so we have this new Xe2 core architecture. Can you tell us a little bit about it? - Well, Xe2 is a huge investment for us here, obviously, right? It's our second generation of Xe.
And you can think about every time we're developing a new graphics core it's usually evolutionary from where you were. So we've been looking at performance for gaming, performance for AI, media, and display. And we've really invested a lot to make this core our very best ever.
So Xe2 is our next generation. - That's great. So what do you look at when you are trying to create something better when it comes to graphics? Do you look at specific workloads? What are you looking at? - Yeah, you know, in our first generation we couldn't run all the workloads that we wanted to 'cause we didn't have hardware. But now that we've got our Arc graphics cards, we can get full performance testing done.
And we've looked at hundreds of games, really thousands of games, in lots of different settings, lots of different configurations and we've uncovered a lot of things that we can improve versus our prior generation. Same with AI. You know, AI has exploded over the last five years and effectively it's a primary workload for graphics on our PCs. So the whole AI PC market is new, and our Xe2 graphics core accelerates that dramatically. - This is a sort of, no, not sort of.
This is a scalable architecture, right? - It is a scalable architecture. - And one of the things you're actually gonna do with this nice building block is implement it in something else. - We are. So this Xe2 architecture is obviously the basis of our graphics for Lunar Lake, which is integrated, but it's also going into our Battlemage next-generation discrete graphics. So Battlemage is gonna be a larger configuration with faster memory subsystems, and bigger caches, and all the rest of it, and maybe even different clocks and different power envelopes, but the fundamental building block, that Xe2 core, is going to be the same. And this is hugely important because it makes us much more efficient, gives us much faster time to market, and we can leverage all of our engineering efforts across all of these different market segments.
- So one way to look at it is like you have this amazing little piece of LEGO block that you're not gonna change. Like, it's already amazing. We have improved it, perfected it, and now we're just gonna build stuff around that. - Yeah, I mean, you know, the biggest thing that changes between integrated and discrete is obviously the memory subsystem. So on integrated, you have a shared memory subsystem with the CPU and it's generally a little bit more constrained by the economics of a PC, using system memory in a shared resource, right? On a discrete graphics card you have your dedicated memory controller, which is maybe GDDR or who knows what, and it's gonna be fast, and it's gonna be local, and it's gonna be built for you.
So the memory subsystem and sort of like the power budgets are what change dramatically between integrated and discrete. - Yeah, that makes sense, specifically for something like Lunar Lake. You know, this is for a laptop, so we're thinking about battery life and a bunch of other stuff, but you also want to have that great performance in there.
- Absolutely. I mean, to me, it's like this: our Xe-core is our building block. And Xe-core from generation to generation is gonna improve primarily on perf per watt and perf per area. And those two characteristics drive both our better performance for integrated when you're power constrained and better performance for discrete when you're less power constrained, but more like how big a chip can I build, you know? - Yeah, no, that makes complete sense.
All right, so we kind of covered a little bit of how it is a building block, but now let's dive into what this building block has inside and what has changed? - And what's different? - Yeah, what's different? - Okay, well, I would start from, you know, our architecture is built around Xe-core. Xe-core is our sort of like fundamental computational block, and it's primarily made up of two big things. One is our vector unit and the other one is our matrix unit. There's some cache in there as well.
So for the first time, we're taking our XMX instructions and moving them up and down the stack. So our integrated products, as well as our discrete products, implement XMX. And XMX allows us to run matrix arithmetic very fast.
And that's all brand new in Xe2. So matrix acceleration for integrated is all about enabling AI PC. - So XMX is all the matrix instructions that help do all these matrix calculations, which leads to that two-letter acronym. - AI. Yeah. Yeah, XMX is our architecture that accelerates matrix multiplication.
And it's built around a systolic array, which is a fairly complex thing. But think of it as it's a way to cache computations so that you can do these big vector, oh, sorry, big matrix computations without writing back to memory a lot. So you can keep all that stuff local as you're iterating across these large matrices. And that is gonna dramatically accelerate the performance on GPUs for integrated, for AI.
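To make that locality idea concrete, here is a minimal NumPy sketch of a tiled matrix multiply: the accumulator for each output tile stays local while the loop walks the shared dimension, and memory is written only once per tile. It is purely illustrative and not a model of the actual XMX systolic-array hardware.

```python
import numpy as np

def blocked_matmul(a: np.ndarray, b: np.ndarray, tile: int = 16) -> np.ndarray:
    """Tiled matrix multiply: keep each output tile's accumulator local
    and write it back to memory only once, instead of after every step."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    c = np.zeros((m, n), dtype=np.float32)
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            acc = np.zeros((min(tile, m - i), min(tile, n - j)), dtype=np.float32)
            for p in range(0, k, tile):
                # Accumulate partial products locally as we iterate across k.
                acc += a[i:i + tile, p:p + tile] @ b[p:p + tile, j:j + tile]
            c[i:i + tile, j:j + tile] = acc   # single write-back per tile
    return c

a = np.random.rand(64, 48).astype(np.float32)
b = np.random.rand(48, 32).astype(np.float32)
assert np.allclose(blocked_matmul(a, b), a @ b, atol=1e-4)
```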
- So let's talk a little bit more about AI. So for AI, and let's start from the top. - Okay. - For AI, what has been going on nowadays? What has been changing? What are the new things? - Oh, my gosh. - Because that's on everyone's mind right now. - AI is obviously changing every market.
And it does seem like it's going to continue to accelerate. And the big thing that's happening right now are these things called large language models. I think everybody's familiar with ChatGPT or other things like that.
And what's really starting to happen is the amount of information that's being trained into these models is allowing AI to start doing higher-level reasoning and more generative AI. Where you're generating things that normally required people to do before, now you can do it, you know, automatically without a lot of human intervention. So that's all being enabled by a fundamental change in architecture of AI called transformers. And transformers were invented a while back at Google, but now they've become much more widely adopted. And they're the kernel of most of these large language models. So transformers becoming mainstream, large language models becoming the core of what's gonna drive the experience on a PC.
So the way I think about it is, you know, you gotta start with what can you do with a large language model on a PC? And one thing you could think about is, well, you could have datasets on your PC that are private that you don't wanna put up in a shared storage somewhere, and you can have your AI on your local PC working with that data to make you a more effective producer. You could be a better financial person. You could be a better creator. You could be a better coder.
All of this can actually be done on your local PC. But I think we're just at the beginning, right? These experiences for AI PC are rapidly accelerating. You know, we have a huge developer relations team that's working with many, many hundreds of developers for AI applications. So that is definitely the tier one workload that's driving Xe2 to adopt this sort of more AI-centric approach to design.
- So you're saying like there's new applications that we want to keep within the PC, hence the AI PC, because of, I guess, some of these privacy issues, latency, and other stuff. - Yeah, privacy, latency, and cost are the big three reasons. Privacy is you have things that you don't really want to share with the rest of the world, and in some cases the rest of the world really doesn't wanna host that data for you. The second thing is obviously some applications need to be very, very quick.
You can't afford to go back up to a cloud and down to a PC. And then the third one is cost. When you have millions of processors deployed that have this hardware acceleration for AI, it saves the application developers, you know, millions of dollars for building servers in the sky.
So there's lots of economic advantage to having AI acceleration inside of client PCs. - No, that's great. What I'm going over right now in my head is trying to see, like, how do we bring all these large language models onto a PC? Like, what's the right approach to that. 'Cause right now my brain is going through like, "Okay, so." - How do you take billion-parameter models. - Exactly, parameter models. - And make them work on a laptop, right? Well, that is actually a fairly complex topic, Alex.
And it does bring to the fore, you know, like a lot of technologies that Intel's been working on primarily centered around this thing called quantization. And quantization is an art form where you're looking at these billion-parameter models, saying, "I can't run that on a PC. How do I make it work?" So there's techniques that are trying to shrink that down from, say, 32-bit floating-point models to much smaller data types like INT8 or maybe even a smaller data type. And being able to accelerate those lower-precision data types on a PC is critical, both for compression of the model, so you're not saving, you know, a billion times, you know, 32 bits for every parameter, you're saving much smaller parameters, but you're still getting good results. And all of that is a science that's kind of emerging. I think you're gonna see more and more acceleration techniques.
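As a rough illustration of the kind of shrinking TAP is describing, here is a minimal sketch of symmetric INT8 quantization in NumPy: each FP32 weight is replaced by an 8-bit integer plus one shared scale, cutting storage by 4x. Real toolchains (per-channel scales, zero points, calibration data) are considerably more sophisticated.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map FP32 weights to INT8 plus a single shared scale factor."""
    scale = np.abs(weights).max() / 127.0 + 1e-12   # avoid divide-by-zero
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP32 approximation from the INT8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.random.randn(1_000_000).astype(np.float32)     # stand-in for layer weights
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("storage:", w.nbytes, "->", q.nbytes, "bytes")      # 4x smaller
print("max abs error:", float(np.abs(w - w_hat).max()))   # bounded by scale / 2
```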
Quantization is like at the fore of it right now, but there are other ideas of, you know, maybe sparsity. Like, how do I make my model smaller without sacrificing, you know, accuracy? - We were talking earlier about embedding and how it will do, for example, text prediction, and all those things. So can you give us a little bit of an insight on how does that work from a computer perspective? - Okay, so you're kind of getting at the fundamental thing that's new with large language models. And it's this idea that I can give it a prompt and it will generate something. And that something that it generates could be a picture, it could be music, it could be more text.
And the fundamental idea is built around this concept of embedding. And what you're trying to do is say, "I want to have a long vector which is just a list of numbers, and I want that list of numbers initially to just represent a word," okay? Now, the way you might generate that list of numbers is by looking at how words are used in sentences, and say, "I would like to make words that are used in similar positions relative to other words have similar encodings," right? And machines do this, so humans don't have to get involved. But they're looking at all of Wikipedia, and they're looking at the internet, and over a long training process they're generating these vectors that represent that word on its own, by itself. What large language models do is they've found a way, using these attention models, to take that initial embedding, and now look at other words that are nearby in the prompt, and then modify that embedding to actually represent the context of not just that word, but the entire context that preceded that word as well. So you're generating this one vector that no longer represents one word.
It now represents the entire context that led up to that word. So if you can do that, which is a very complicated process, but you can do that, then you could use that context that now represents everything and you can use that as an input to something that's trained to generate from context. So if you're trained to generate from context, you could generate words, you could generate pictures, you could generate music because you've got this magical vector that represents the meaning of everything that came before it.
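Here is a deliberately stripped-down sketch of that idea: each word starts with its own embedding, and a self-attention step blends every word's vector with the vectors of the words around it, weighted by similarity, so the last row ends up representing the whole context. Real transformers add learned projections, multiple heads, and causal masking; none of that is shown here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def toy_self_attention(embeddings: np.ndarray) -> np.ndarray:
    """Blend each word's embedding with the embeddings of the other words
    in the prompt, weighted by how similar (relevant) they are."""
    d = embeddings.shape[-1]
    scores = embeddings @ embeddings.T / np.sqrt(d)   # relevance of word j to word i
    weights = softmax(scores, axis=-1)                # attention weights
    return weights @ embeddings                       # context-aware vectors

prompt = np.random.randn(5, 16)          # 5 words, 16-dim starting embeddings
contextual = toy_self_attention(prompt)
print(contextual[-1])                    # last word's vector, now carrying context
```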
- No, that's great. So for example, if we're like trying to give the meaning to the word orange, the algorithm and everything will look at different sets of data in which the word orange comes out and then it will see how it's related to other words. So it should be able to determine that oranges are sweet, that oranges will grow on the trees, and so forth and so on. - Yeah, well, actually, you know, computers don't really know what orange means. All it knows is that orange appears right before the word tree.
And other words appear right before the word tree, so maybe like orange, and apple, and pear are somehow similar. The machine doesn't really know how they're similar, but what it says is that, "Hey, let's go ahead and make the embeddings for orange, and apple, and pear be very similar, and let's just iterate." So now it's gonna look around. It's gonna say, "Oh, you know, I see apple appears near pie, and cherry appears near pie." So now, again, apple and cherry are gonna kind of have similar embeddings relative to pie. So it's this really iterative process of looking at a lot of words, how words are connected together, and then trying to generate embeddings that reflect that correlation.
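A toy version of that iteration, just to show the mechanics: count which words appear right before "tree" or "pie," treat each word's row of counts as its embedding, and compare rows with cosine similarity. The tiny corpus and the two context words are made up for the example.

```python
import numpy as np

# Tiny made-up corpus: (word, word-that-follows-it) pairs.
pairs = [("orange", "tree"), ("apple", "tree"), ("pear", "tree"),
         ("apple", "pie"), ("cherry", "pie")]
contexts = ["tree", "pie"]
words = sorted({w for w, _ in pairs})
row = {w: i for i, w in enumerate(words)}

# Each word's "embedding" is simply how often it precedes each context word.
counts = np.zeros((len(words), len(contexts)))
for first, second in pairs:
    counts[row[first], contexts.index(second)] += 1

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

print(cosine(counts[row["apple"]], counts[row["pear"]]))    # similar: share "tree"
print(cosine(counts[row["apple"]], counts[row["cherry"]]))  # similar: share "pie"
print(cosine(counts[row["orange"]], counts[row["cherry"]])) # no shared context: 0
```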
- So this is how you can quickly see how just one word has so many embeddings, or so many correlations, and why these models are so huge. 'Cause orange can go with tree, can go with juice, can go with all these different things. They're just different types of. - Yeah, and that's the problem that attention networks have improved, right? 'Cause embeddings of one word don't know anything about the context. So that's not sufficient to be used for generation of a new word or generation of a picture. You need to have a vector that represents the full context.
So that's what these attention matrices do. These attention mechanisms look across all the other embeddings and they pull forward, or they push, the embedding of the current word towards other embeddings that are relevant. And it's a very cool kind of magical process. So that whole thing drives AI PC. - It just does that little thing. - Yeah, yeah, that whole thing. It really does.
And what's amazing is it's not magic, right? It's just this really amazing invention that has resulted in the explosion that's generative AI. And so that drives Xe2 architecture and future architectures because AI workloads are becoming the primary application for things like graphics on PCs. - What will be like a common AI workload, for example, for graphics in a PC? 'Cause I know there are different workloads for different.
- I think what we haven't seen yet is probably gonna be the most common workload, okay? 'Cause I could tell you about like background blurring, or face removing, picture editing, sound editing, all that stuff. Those are like the very first applications of electricity. If you think about back in the day, electricity was invented and it took decades for PCs to show up. And it took, you know, who knows how long before computers showed up. So we're in that phase. We have AI in a PC, and people are struggling, and they're working hard to figure out how do I use this engine? I think what you're gonna see is invisible AI.
So this is the type of AI where programs are talking to other programs and they're using generative AI to communicate to other programs, right? So you would no longer, you know, ask Excel. You'd not be, like, typing on Excel. You might talk to something, an agent. The agent would be talking to Excel and you would get back an answer.
- Oh, like that demo that was shown back at ITT about the RAG and? - Yeah, yeah. It's very cool. So RAG is another kind of piece of AI invention that's happening right now where it allows you to take additional context and use it without retraining a large language model. Because these large language models are obviously huge and they take months or many months to train, you can't retrain them for every bit of information that's local. RAG is a technique that allows you to enhance the knowledge base of one of these large language models based on local data.
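For readers who want the shape of it, here is a minimal RAG-style sketch. The embed() and generate() functions are stand-ins invented for this example (a real AI PC would call a local embedding model and a quantized LLM there); the retrieval step is just cosine similarity over the private chunks, and the chosen chunks are pasted into the prompt instead of retraining anything.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy embedding: hash words into a bag-of-words vector.
    Stand-in for a real local embedding model."""
    v = np.zeros(64)
    for word in text.lower().split():
        v[hash(word) % 64] += 1.0
    return v

def generate(prompt: str) -> str:
    """Stand-in for a call into a local, quantized LLM."""
    return "<model output for prompt:\n" + prompt + "\n>"

def rag_answer(question: str, local_chunks: list, top_k: int = 2) -> str:
    """Retrieve the most relevant private chunks, then let the model
    answer with that context, with no retraining of the base model."""
    q = embed(question)
    def score(chunk):
        c = embed(chunk)
        return float(q @ c / (np.linalg.norm(q) * np.linalg.norm(c) + 1e-9))
    best = sorted(local_chunks, key=score, reverse=True)[:top_k]
    context = "\n".join(best)
    return generate(f"Context:\n{context}\n\nQuestion: {question}")

notes = ["Q3 revenue in Europe grew 12 percent.",
         "The team offsite is scheduled for March.",
         "API keys rotate every Friday."]
print(rag_answer("How did revenue in Europe do?", notes))
```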
- Yeah, and just to get a quick word in for the people who are watching this. So in the demo that we had, someone was giving a prompt to the system about writing a function. And it would use Whisper to capture that prompt, and then it would create and write the function for you. - Yeah, it's very cool. Very cool. And RAG is another one of those recent innovations that just shows you how early it is.
You know, we are right at the beginning of this AI PC revolution, and everybody is just running as fast as they can because we know it's gonna be amazing. But it's not exactly clear what the long-term ramifications are yet, you know. - So at the beginning you were saying that transformers were invented by Google or something, okay? Give us a little bit of. - [TAP] The original paper was a Google paper, yeah. - Yeah, can you give us a little history? Or how does it work? Yeah, like, how? - Like the evolution, how'd they get there? - How did they get there? Yeah, and today.
- Well, I can give you a little history. Like if you go back to the very beginning, I would say the AI Big Bang occurred when some folks figured out how to use GPUs to accelerate AI for, you know, AlexNet, effectively. And that was one of the seminal papers where they really used GPUs to dramatically improve recognition of images, right? You know, it's basically better than humans. And that was a huge step forward. But after that, models have just gotten better, and better, and better.
And people didn't stop, you know, with the traditional feed-forward network model. Things evolved through CNNs, which were doing sort of local, spatial computation for image processing. And then people realized you needed sequentiality 'cause we're trying to think about not just what's happening now, but the context that occurred prior. And so models that were called like LSTMs, or other forms of memory storage models, RNNs, were invented. And I'd say those were sometime, I wish I had a picture, around 2013 or so, 2014. - [Alejandro] 2012, I think.
- 2012, okay. And then, I think it was 2017 or so, was the attention paper. I think "Attention Is All You Need" was the paper from Google.
And in that, these guys basically said, "Hey, there is a method that we've invented called an attention method." Or I think it was called an attention matrix. And it basically did this thing that we talked about with embedding, using embedding, which was a prior invention. But they used the embedding, and now they're modifying that embedded vector using context. That's really the thing that kind of blew up in 2017.
I didn't know about it at the time, right? I was just doing my education then. And that just sort of bubbled along getting bigger and bigger. Then you heard next about ChatGPT.
ChatGPT uses the same architecture. It uses an attention model. And then now you've heard of bigger ChatGPT-4. These are all using that same architecture, getting bigger, and more data, and, you know, new techniques to make them faster.
So there's been a big explosion, and it's not done. - Yeah, it's pretty interesting 'cause like we didn't hear about it, or like you said, AI kind of didn't start to hit people until ChatGPT came around. But if you do the research, like you said, it started way, way back. - Oh, yeah. Oh, yeah. Well, I mean, this is not the first time AI has been big, right? AI started, I don't know the exact date, but I'm thinking the very first stuff was in the '50s and '60s. And then expert systems kind of evolved, and they had their moment, and then they faded.
And now we're in sort of a resurgence of AI because computation is now available and models and architectures are evolving so quickly that we're going faster than Moore's Law. Like if you think about the AI complexity and the AI computation, that is definitely moving faster than anything in history. - Yeah, yeah. We can definitely see that. - Yeah. Yes. - So talking about quantization, we have some tools here that we can actually provide to software developers that can help with this quantization. Can you tell us a little bit more about that? - Sure, I think of it as our path for AI on a PC primarily follows two different software environments.
One is if you're using OpenVINO, which is an Intel-specific development package. If you're using OpenVINO, you get a set of Intel quantization tools that optimize that model to run on whatever Intel platform you're doing. And that will usually, today, get you your best performance on Intel. But there's also an ONNX environment, which is a much more multi-vendor standard thing that's from Microsoft today. It's an open platform. And we've provided compression technology for that called the Intel Neural Compressor Framework.
And that's now been open sourced to the ONNX runtime community. So compression is this idea of how do I shrink my model down? And it's very important to take big large language models and move them to mobile form factors like phones and PCs. So that technology is critical, and we make it available in multiple different patterns. - And it's not like you're just going, well, there's different types of doing that. It's not like you're just gonna chop it in half. - No, no, no.
Actually, I really think of it as there's three main ways of doing this. One is you just sort of shrink the model. You basically lower the size of the weights. And then you calibrate, meaning you test it after you lower it down, and say, "Did it work, or not?" And if it didn't work, you kind of like don't shrink it as aggressively. That's a relatively simple idea and it's easy to do, but the results are less good than some more sophisticated techniques. The second step is you shrink specific layers.
So you're now gonna have a mixed model. Some are gonna remain FP32, others are gonna shrink down to INT8. And you're constantly testing and calibrating with a known dataset as you're doing that.
So this is a little bit more iterative, and it's sort of like trial and error until you get as small as you can. And the third one, which is much more labor intensive, is recalibrating. So you're basically gonna retrain certain layers using reduced precision datasets. So it's a whole family of algorithms. And people are still spending a tremendous amount of resource inventing new ones because it's the magic sauce that lets us take these large language models and run them on mobile form factors.
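A sketch of the middle approach, shrinking layer by layer and checking a calibration metric as you go. Everything here is a stand-in: the "model" is just a dict of weight arrays and the evaluation is a toy drift measurement rather than a real accuracy benchmark, but the keep-or-fall-back loop is the shape of the algorithm.

```python
import numpy as np

def fake_int8(w: np.ndarray) -> np.ndarray:
    """Simulate INT8 quantization: round to an 8-bit grid, return FP32."""
    scale = np.abs(w).max() / 127.0 + 1e-12
    return (np.clip(np.round(w / scale), -127, 127) * scale).astype(np.float32)

# Stand-in model: four layers of FP32 weights.
fp32_model = {f"layer{i}": np.random.randn(256, 256).astype(np.float32)
              for i in range(4)}

def calibration_score(model: dict) -> float:
    """Toy calibration metric: 1.0 minus average drift from the FP32 weights.
    A real flow would run a small labelled dataset through the model."""
    drift = np.mean([np.abs(model[k] - fp32_model[k]).mean() for k in model])
    return 1.0 - drift

def mixed_precision_quantize(model: dict, budget: float = 0.01) -> dict:
    """Quantize one layer at a time; keep the INT8 version only if the
    calibration score stays within `budget` of the FP32 baseline."""
    baseline = calibration_score(model)
    current = dict(model)
    for name in model:
        candidate = dict(current)
        candidate[name] = fake_int8(model[name])
        if baseline - calibration_score(candidate) <= budget:
            current = candidate            # accepted: this layer goes to INT8
        # else: this layer stays FP32
    return current

quantized = mixed_precision_quantize(fp32_model)
print("final calibration score:", calibration_score(quantized))
```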
- Oh, that's great. Well, so that, in a nutshell, is AI PC. - That is AI PC, yeah. That and, you know, obviously the hardware that enables it. And on the GPU side we've mentioned that our strategy is all about DP4a and XMX for these large matrix multiplications.
- And both of these instructions we have them on. - [TAP] On Xe2. - On Xe2. That was great. All right, talking about Xe2, let's go back to that. - Okay, Xe2.
- Let's talk about the vector engine and all the different things actually that are also related to all this. - Yeah, so when you're starting to dig into the Xe2 architecture, obviously our building block is the render slice and within the render slice is Xe-core. And Xe-core, as I mentioned, has a vector unit and a matrix unit.
And the vector unit has also had a bunch of enhancements for Xe2. You know, that's our fundamental computational block for most graphics. And it's moving to SIMD16. So SIMD is single instruction/multiple data, and it's the fundamental way that you organize your computational pipeline.
And going towards a higher degree of SIMD improves game compatibility. We found that a lot of games have certain SIMD preferences that are hard coded into the games, so the closer our fundamental architecture is to that, the easier it is for us to support legacy games. So that's very cool. - SIMD means same instruction. - Single instruction/multiple data. - Multiple data.
So you're doing the same thing over and over again, but with like? - Wider data. So think of it as when you're doing graphics, you're generally running the same operation across multiple pixels. So the way to do that is generally each lane represents effectively one pixel.
And you're doing lots of similar computations using different data to get different colors for pixels. That's grossly oversimplified, but that's the idea. And that same idea applies for geometry where you're kind of running multiple vertices all at the same time doing the same mathematics to do sort of translation in world space or in screen space. You're doing basically these large matrix multiplications all at the same time on different data. And so that's why our pipeline is organized like that with multiple threads and multiple data for single instructions.
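A quick way to picture SIMD from software: the same multiply-add "instruction" applied to every pixel at once instead of one at a time. NumPy's vectorized arithmetic is only an analogy for what SIMD16 lanes do in hardware, but the shape of the computation is the same.

```python
import numpy as np

pixels = np.random.rand(1920 * 1080, 3).astype(np.float32)   # one RGB triple per pixel

def brighten_one(px, gain=1.2, lift=0.05):
    """What a single lane sees: one pixel, one multiply-add, one clamp."""
    return np.clip(px * gain + lift, 0.0, 1.0)

# The SIMD view: the identical operation issued across all pixels at once.
brightened = np.clip(pixels * 1.2 + 0.05, 0.0, 1.0)

assert np.allclose(brightened[0], brighten_one(pixels[0]))
```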
- Okay. So that makes sense. And what about the extensions, the XMX extensions, we were talking about earlier? - Oh, so XMX is this whole matrix thing. And on graphics, you know, the matrix extensions are harder to make use of. They're mostly about accelerating AI-style processing. So XMX on graphics would typically be doing a post-process like XeSS. XeSS is obviously enhanced for Xe2, and it uses our XMX instructions to get great-quality supersampling.
And that now applies to our integrated products as well. - That's great. There's another thing that has also been brought in here. I can't remember if it's new or not, but you have ray tracing, and that has also been improved.
- Yep, we've improved our ray tracing performance. We've upped the triangle intersections, so, you know, if you back up and say, "What is ray tracing?" It's an engine that allows you, in sort of three-dimensional space, to cast a ray out into the world and ask the engine, "What triangle did I hit?" And if you think about that fundamental operation, it's kind of like, well, is that good for something? And you use that fundamental operation to do lots of different effects. Like you might calculate a shadow using that engine, or you might calculate global illumination, or you could calculate reflections. So using this simple operation, which is cast a ray into this space and find that triangle, that is very useful to do many, many different effects.
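That fundamental query, "cast a ray, which triangle did I hit," can be sketched with the classic Moller-Trumbore intersection test. This is just the single-triangle math in NumPy; real ray tracing hardware runs it against millions of triangles through an acceleration structure.

```python
import numpy as np

def ray_hits_triangle(origin, direction, v0, v1, v2, eps=1e-8):
    """Moller-Trumbore test: return the hit distance t along the ray,
    or None if the ray misses the triangle (v0, v1, v2)."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:                    # ray parallel to the triangle's plane
        return None
    inv_det = 1.0 / det
    s = origin - v0
    u = np.dot(s, p) * inv_det            # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = np.dot(direction, q) * inv_det    # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv_det
    return t if t > eps else None         # only count hits in front of the origin

# Cast a ray down the z-axis at a triangle sitting in the plane z = 5.
tri = (np.array([-1.0, -1.0, 5.0]),
       np.array([ 1.0, -1.0, 5.0]),
       np.array([ 0.0,  1.0, 5.0]))
print(ray_hits_triangle(np.zeros(3), np.array([0.0, 0.0, 1.0]), *tri))   # ~5.0
```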
So that whole engine is a significant advancement and it's getting better for Xe2. - So let's talk a little bit about the render slice and kind of what does that mean when it comes to kind of performance. - Sure. Well, I mentioned the render slice is sort of our fundamental building block that we can scale up. And it's important that that thing gives you great performance, but not just performance, but it's gotta be efficient.
So most of the work we've done for Xe2 is present in the render slice, and it's improving performance dramatically. We're gonna see Lunar Lake performance I think we're all gonna be very happy with. And this is also gonna translate dramatically into Battlemage performance. So our engineers have been very, very busy improving over our Arc GPU to where we are now ready to take our second crack with Battlemage. - No, that's great.
And there's a lot of people looking forward to it. I think to both of them, Lunar Lake and Battlemage. - Yeah, I think I shared a few microbenchmarks in the presentation. Maybe you can flash them up. But you'll see performance improvements. And these are just microbenchmarks.
They're sort of this engineering-oriented thing. We're not giving out real game data right now. But you can see the variation; at the top end it's a huge amount, and that's when we're doing ExecuteIndirect. I didn't mention ExecuteIndirect yet, but that's one of those instruction types that turns out to be very important for next-generation game engines.
And it's a different way of processing data in the graphics pipeline. We didn't support it natively in Arc GPUs and Xe. Now we do on Xe2. So you can see our benchmarks that are focused on ExecuteIndirect are, you know, 12X and 7X.
That's because we're moving from more of an emulated style to a directly implemented style. But all across the board, whether it's bandwidth or it's computation, you're gonna see much more close-to-peak performance. - It's very exciting, and like it's funny to say, but it is off the chart that we actually had to shrink the chart in order to see it. - Yeah, for sure. And, you know, this is, again, your mileage will vary. This is not clock related. This is not power related.
This is just talking about Xe2 and the fundamental architecture that's gonna be driving both Lunar Lake and Battlemage. - Right, 'cause your mileage might vary depending on the design of the platform, power, thermals, - Absolutely. - and everything else. - Absolutely. - Right now that we're on the side of the platform, let's talk a little bit more. I mean, I love Xe2. Looking forward to this amazing core. But Xe2 is the core for the graphics on Lunar Lake.
- Yeah, it is. - So let's talk a little bit quickly about, yeah, about. - About how Xe2 impacts Lunar Lake, right? - Yeah. - So I think I have a curve in there you can pull up which is showing effectively a frequency diagram, right? This picture here, the Xe2 GPU performance picture you're looking at, says that we are 1.5X versus prior generation.
And the way to think about that curve on the top is Xe2's GPU performance. And you can see compared to Meteor Lake H and Meteor Lake U, we're always above the line, which means that at any power we're gonna deliver a higher performance. And it also is one line versus two lines.
So for Meteor Lake we used to have two different variations. We had a Meteor Lake H SKU and a Meteor Lake U SKU. For Lunar Lake, there's just one implementation, and that spans the entire power range all the way from that high mobile part up to higher performance discrete parts, or higher performance sort of, you know, different application parts. - It's amazing how wide of a range that is.
Before, you needed two, and now with this new whole architecture, you're able to span both of them. - Yeah, it does scale much better. And this is, of course, showing you the gen-on-gen performance versus Meteor Lake, which is pretty cool. And that's what it's all about.
You know, we can talk about, you know, XMX instructions, and vector instructions, and all the work we've done on our pixel backend or all our work on our caching structure, but what really matters and people care about most is what's the delivered experience. And I think a good baseline for Lunar Lake is about 1.5X prior generation. - So we have talked a lot about the graphics and Xe2-core, but there's also other blocks in there. For example, the display. So let's talk a little bit more about the display. - Well, a new display engine for Lunar Lake as well.
One thing that's gonna be different this time is we're not connecting the display engine to Xe2. So the display engine is gonna be moving forward on its own cadence, just like the media block is going to be moving forward on its own cadence. But the display engine for Lunar Lake does add eDP 1.5 for the first time, which is gonna bring a bunch of new features for how panels can interface directly to Lunar Lake. And it does things like bring much better support for variable refresh rate panels directly to Lunar Lake.
So that's gonna be very, very cool. And on the media side, we're supporting VVC, which is a brand-new media codec. It allows us to do basically higher-quality images at lower bit rates, which is, you know, a very good thing.
And it also allows us to do dynamic compensation for streaming congestion. So there's some new techniques inside of VVC that will automatically adjust the transmission without sort of tearing down and resetting your media stream. So it's gonna make things like Netflix, once, you know, everything's deployed, look better. - No, that's great. Actually, I had a question for you with respect to this out of curiosity.
When you are implementing a new codec, it doesn't have to be VVC or anything else, what do you have to take a look at from an architectural perspective? Like, what do you have to do? - Wow, well, the first thing to start with is all these codecs are collaborations. So there's large standard bodies that are developing these together and everybody's bringing their best ideas. But there's a couple of techniques that have emerged in codecs. I actually did a video, maybe you can link it, about, you know, how do we do media codecs? And it's super complex and it's super cool, but there's a basic technique, which is there's a frequency compression that happens. So you're gonna take your image and you're basically gonna do some math to convert it to a frequency representation.
And then there's actually some interesting algorithms for how to prune data in the frequency space. The other thing you're doing is temporal compression. So you're looking at frame to frame how much of that data is the same. So the algorithms typically focus in those two spaces. One is how can I detect more reuse of pixels from prior frames? And maybe instead of looking at one frame in the past, I'll look at two.
Or maybe instead of looking across a smaller region, I'll look across a bigger region. So these techniques are always kind of looking for things that are high ROI, big bang for the buck, because these codecs tend to get more complex. And that means it's harder to compress things, it takes more energy to convert the original raw bitstream to this compressed thing, and it generally will take more energy to decompress it. So there's a trade-off between sort of compression quality and energy required to decompress. That's why we generally put all this stuff in hardware. So when you look at our compression, and actually, right now, it's just decompression for VVC, but that is all in hardware now.
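The frequency-space trick can be shown in a few lines, assuming SciPy is available: transform an 8x8 block with a DCT, zero out the small high-frequency coefficients, and transform back. A smooth block reconstructs almost perfectly from a fraction of its coefficients, which is the core intuition; real codecs like VVC layer prediction, temporal reuse across frames, and entropy coding on top of this.

```python
import numpy as np
from scipy.fft import dctn, idctn

# A smooth 8x8 block (a ramp), the kind of content that compresses well.
block = np.add.outer(np.linspace(0, 255, 8), np.linspace(0, 64, 8)).astype(np.float32)

coeffs = dctn(block, norm="ortho")                  # pixel space -> frequency space
threshold = np.percentile(np.abs(coeffs), 75)
pruned = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)   # keep the largest 25%

reconstructed = idctn(pruned, norm="ortho")         # frequency space -> pixel space
print("coefficients kept:", int(np.count_nonzero(pruned)), "of", coeffs.size)
print("mean abs error:", float(np.abs(block - reconstructed).mean()))
```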
- Wow, that's impressive. And it's in a lot of things in Lunar Lake. And, I mean, from XMX to the compression/decompression, to all the new codecs, to eDP 1.5. - And AI. A new NPU. We've got a new Xe-core with faster XMX.
So I would say that Lunar Lake is shaping up to be quite a part. - Yeah, I can believe it. I can't wait to play with it. And, TAP, thank you so much for spending so much time with us.
- Hey, happy to do it. - This was a great conversation. - Yeah, always happy to be here. - I appreciate it.
Thank you for having us. (upbeat music) (bright music)