Intel Core Ultra Processor (Series 2) Architecture and OC Overview | Talking Tech | Intel Technology


- Hi, and welcome to "Talking Tech". I'm your host, Alejandro Hoyos, and today we're gonna be talking about Intel's latest desktop processor, Intel Core Ultra Series 2, also known as "Arrow Lake". (upbeat music) - I'm the Platform Marketing Manager for Arrow Lake. Essentially, what that means is I'm the central product truth owner for all things Arrow Lake encompassing the entire platform.

- So I'm the Execution GM, running the Arrow Lake program, the technical program manager for Arrow Lake for the entire family. We have it on mobile, starting at 15 watts, all the way up to desktop at 250 watts. So I'm managing the entire program. - I'm Dan Ragland, my title is Senior Principal Engineer. And we're here in Intel's Overclocking Lab in Hillsboro, Oregon. - What is Arrow Lake and what do you mean by "the entire platform"? - Yeah, so when we talk about the entire platform, Arrow Lake isn't just about the CPU itself.

It's everything that encompasses the platform, whether it be the connectivity, the graphics that attach to it, things like the chipset, different software features, different utilities that function with it. And so we, as the platform marketing group, own all of that in terms of conception and then eventually bringing it to market. - Yeah, that makes sense. And so for this architecture, we have three different products, right? - Mm hmm. - We have Arrow Lake S, Arrow Lake HX, and Arrow Lake H. - [Greg] Mm hmm.

- And so how do we start with S and HX? Can you tell us a little bit more about that? - Yeah, so Arrow Lake S is our desktop- - Mm hmm. - ...family, and then HX is the enthusiast mobile family.

So both of 'em are enthusiast products, desktop and mobile, sort of the top of the line for Arrow Lake. And they share the same tiles on each part, so they're both an 8+16 configuration and that kind of thing. So that's sort of the top of the product stack. - Yeah. - So it's a multi-tile processor. - Mm hmm. Yep.

- Okay. So multi-tile. What will be the tiles that we have here? - Yeah, so I mean, first of all, it's the first time we've had a disaggregated part in the enthusiast segment. And so we've got essentially four tiles.

So there's the SOC tile, the IO tile, the GPU tile. Then, of course, the compute tile, which has our compute cores on it. - [Tomer] Moving from a monolithic to a disaggregated architecture has several benefits. One, it allows you the flexibility of mixing and matching between different process technologies and different tiles. To uplevel the discussion, it's kind of a Lego thing: you can replace one piece without changing the others, which helps in delivering a complete portfolio. And even for the future, if you want to enhance a certain tile, that's a flexibility it enables.

In addition to that, on top of the disaggregation, we apply Foveros technology, which is what allows us to take all the tiles and put them together on a base die. That mechanism allows us to shrink the overall package size for Arrow Lake by about a third, so 33% smaller than what we had on the prior generation. - [Alejandro] So, okay. So can you tell us a bit more about the compute tile and the cores and what we have now? - Yeah, so what we have now on Arrow Lake, specifically S and HX, we have eight P-Cores, that's the new Lion Cove architecture. And then 16 E-Cores, which is on our brand new Skymont architecture.

- Right. So we've seen these kind of brought over from Lunar Lake, right, but these have been tweaked or- - Yeah, so there are some minor differences on the P-Core specifically. There's a higher L2 cache on the P-Core, but the architecture itself is the same. You're right. Yeah. - Okay. So all the different gains that we saw from being completely redesigned from scratch are carried over. - Yes, exactly. Exactly.

So a lot of that's the same. When it comes to things like IPC, there's gonna be obviously some slight differences because this is a completely different power envelope - Mm hmm. - than Lunar Lake typically operates in. And so you won't see the exact same type of IPC stuff.

- We were able to get a performance benefit at lower power. So performance per watt is delivered at a much higher scale in Arrow Lake than what we had in the prior generation. - [Alejandro] So on the P-Cores, we have made the choice to take away hyper-threading.

- Mm hmm. - And that's kind of what we were talking about earlier. And you were saying like that's because the cores have changed. - Mm hmm. - Can you talk a little bit

about that? - Yeah, so from the beginning of the core architecture design phase, we had a set of design goals in mind. Obviously there were more than two goals, but one of the main things, what really matters at the end of the day, is the performance for our end users. That's the most important thing. The other thing we're focusing on is performance per power, per area optimization. And so when we look at things like SMT and hyper-threading, obviously that was a big deal when we had one type of core and we needed to maximize our multi-thread performance. Now that we've moved to a P-Core, E-Core configuration, that's not necessarily the case. And so when we think about what we're bringing to market and the type of multi-thread performance gains we're bringing over 14th Gen with Arrow Lake, we're really excited to show that, when we can, in a bigger way.

And so I don't think anyone's gonna miss that. (laughs) - Yeah. - Miss the hyper-threading. - The hyper-threading, right? So it's gonna look pretty good.

- And one of the things that we have seen is that the performance of the Skymont cores is amazing. And one of the changes that we made for the P-Core, for Lion Cove, is that we went ahead and removed hyper-threading. What's kind of the reasoning behind that and how's that gonna work? - Since you are able to add more efficient cores and get the performance at lower power, that's a much better trade-off as opposed to having a performance core running multi-threaded with higher power consumption. - You know, one of the things that we're also talking about today, other than the raw performance, is the fact that we're delivering this performance increase at up to 30% less power than 14th Gen. And one of the things that happens when we do these types of redesigns, deciding what to keep and what not to keep, is making very important decisions in terms of that IP, and staying lean there is really what helps us deliver that power improvement. - And that also has to be done, I guess, when it comes to the E-Cores, right? Because that's kind of the new- - The overhaul was on both the P-Core and E-Core, right? We've moved to a modern design database.

It's gonna make it a lot easier to iterate on these designs as we go forward. So that's really important to us. And the type of IPC gains we're getting, both in single thread and multi thread, are really, really good, especially on the Skymont E-Core.

- So we know that we have P-Cores, and each P-Core has a cache, and the E-Cores also have a set of cache. - Right. - And there's this whole hierarchy. So can you explain this? - Yeah, I can break that down.

Yeah. So we have the L3 cache, or L3 Smart Cache, and that has 36 megabytes, right? For the L3. In terms of the L2, it's obviously split between the P-Core and the E-Core. Each P-Core has three megabytes of L2 cache, which is an increase from our previous generation.

And then each E-Core cluster, which is four E-Cores per cluster, will have four megabytes of cache. - [Alejandro] Of cache. - Mm hmm. - [Alejandro] All right.
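The cache hierarchy just described can be tallied with a quick sketch. The per-core and per-cluster sizes are the ones from the conversation (plus the 36 MB shared L3 of the top Arrow Lake S part); the totals are simple arithmetic, not official specifications:

```python
# Cache hierarchy for Arrow Lake S/HX as described in the conversation.
# The totals computed here are illustrative arithmetic, not a spec sheet.
P_CORES = 8
E_CORES = 16
E_CORES_PER_CLUSTER = 4

L2_PER_P_CORE_MB = 3      # each P-core has a private 3 MB L2
L2_PER_E_CLUSTER_MB = 4   # each 4-core E-core cluster shares 4 MB of L2
L3_SHARED_MB = 36         # shared L3 "Smart Cache" (36 MB on the top S part)

e_clusters = E_CORES // E_CORES_PER_CLUSTER
total_l2_mb = P_CORES * L2_PER_P_CORE_MB + e_clusters * L2_PER_E_CLUSTER_MB

print(f"E-core clusters: {e_clusters}")   # 4
print(f"Total L2: {total_l2_mb} MB")      # 8*3 + 4*4 = 40 MB
print(f"Shared L3: {L3_SHARED_MB} MB")    # 36 MB
```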

- So we're talking about E-Cores and P-Cores, but can you tell us a little bit more, when it comes to performance, when it comes to IPC, where are we on the different types of cores that we have? - Yeah, so from a performance standpoint, especially from an IPC standpoint, from a Lion Cove perspective for Arrow Lake S and HX, we're seeing about 9% IPC improvement. But the thing that really gets me and the team excited is some of the IPC gains we're seeing on the E-Core, on Skymont. And this will lead into a lot of the multi-thread performance that we will eventually be showing off in the future. But from an E-Core perspective, there's obviously integer and floating point, two major things that we measure. - Yeah. - And with regards to the IPC from a single-thread perspective on integer, we're seeing about 32% IPC improvement over- - [Alejandro] Improvement. 32%.

- [Greg] Yeah. And on floating point, we're seeing around 72%. - [Alejandro] 72%? - [Greg] 72% IPC improvement. And we took a number of workloads, about six or seven workloads, and the average was about that. That means some of those workloads are higher.

So, and obviously some were lower, but it's really incredible what we're seeing. And then from a multi-thread perspective on the Skymont E-Core, we're seeing very similar numbers. So again, 32% on the integer side, and on the floating point side, about 55%. Those are very large IPC gain numbers and we are- - Those are huge. - Yeah. (both chuckle) So we're really, really excited about that.
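The quoted 72% floating-point figure is an average over roughly half a dozen workloads, with some landing above it and some below. A minimal sketch of that averaging, using entirely hypothetical per-workload uplifts chosen only so the mean lands near the quoted number (the real per-workload results were not disclosed):

```python
# Averaging per-workload IPC uplifts. These six numbers are hypothetical,
# picked only so the mean lands near ~72%; they are not Intel data.
fp_uplifts = [0.58, 0.95, 0.70, 0.64, 0.81, 0.66]

mean = sum(fp_uplifts) / len(fp_uplifts)
print(f"average uplift: {mean:.0%}")                        # ~72%
print("above average:", [u for u in fp_uplifts if u > mean])
print("below average:", [u for u in fp_uplifts if u < mean])
```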

And again, this all comes back to being able to deliver incredible performance, but also save a lot of power. So, you know, keep in mind, you know, when we talk about these IPC gains, we're showing off these, you know, really large numbers, but it's not really at the expense of kicking up the power. We're still gonna be, you know, up to 30% lower power than 14th gen.

So this is- - Is- - Yeah. - So all these are- ? - It's probably my favorite part of this whole story. (laughing) - Performance, and especially when they're double digits. - Yeah. Yeah. - It's great.

And this is all compared to the previous gen, to the 14th Gen? - All compared to the 14th Gen. - No, this is amazing. You can see all the different changes that we have. We have designed all the cores from scratch, and so you can see now how it's paying off. - Mm hmm. Yeah, yeah.

- [Alejandro] Oh, that's great. - Really excited about it. - So now we have the P-Cores and the E-Cores, and it's a different setup on Arrow Lake compared to Lunar Lake.

- [Greg] Mm hmm. - We had fabrics that tied 'em together. How is it done here in Arrow Lake? - Yeah, so on Arrow Lake, it's similar to what we've done on products like Raptor Lake and Meteor Lake, where the 16 E-Cores are on the same compute tile as the P-Cores. And so they're not off the main ring, so to speak.

- [Alejandro] Right. - The thing that we have on Lunar Lake is that we can run the E-Cores without having to really power up the P-Cores, and that's really what generates a lot of the power savings there. Yeah. That's not the case on Arrow Lake,

but on Arrow Lake it's much more about efficiency at power - Right. - when they're fully utilized and we're running on the CPU. - So you can say that they're in the same ring architecture, like on the same power and ring architecture.

- Yeah, exactly. Mm hmm. - [Alejandro] Okay. And the other thing that was a little bit different that I saw is there's no low power island, right? On Arrow Lake. - There's no low power island on Arrow Lake S and HX. - Okay. - Right? So Arrow Lake H will still have a low power island and we're gonna talk more about that at a later date. - Yep.

- But yeah, no, S and HX don't have a low power island, as is pretty typical for this segment. - So, what is the thermal envelope set to for Arrow Lake? - Yeah. So Arrow Lake can scale all the way to 250 watts.

So we have a PL1 that we're running at 125 watts, and it can go up to 250 watts, very similar to the prior generations. We are maintaining the high TDP that we want in order to get the performance associated with that. - So we have this completely new processor, right? This means also a new socket.

- [Tomer] Yes. - Why is there the need to change sockets? Previously we had LGA 1700; now the new one is LGA 1851. Why the need to change? - Yeah. You know, we would love to maintain compatibility forever. That would be a dream.

But at the same time, you know, we want to drive innovation. And when you bring new features, when you put in new cores, when you bring an integrated NPU, when you drive a certain level of innovation, you have challenges coming at the package level, signal integrity and others. Sometimes you need to route more pins, and that's what drives the change. In order to maintain the flexibility of bringing innovation and new features, we are sometimes forced to make the change. We try hard to maintain at least two years of compatibility, so we try to avoid gen-over-gen, year-over-year changes. But unfortunately, changes are required to bring innovation to the market. - We were talking earlier about how the E-Cores and the P-Cores are now configured.

How will the workload threading, or the Thread Director, be handling that? Is it the same as Raptor Lake or-? - Yeah, so there are things carried on from Raptor Lake that are the same. Some things are new. So there are some things in Thread Director that we're bringing to Arrow Lake that we obviously didn't have in 14th Gen, the first of which is that we now have E-Core telemetry. Before, the feedback to Thread Director was just, "Hey, how's the IPC on the E-Core?" Now, we're getting much more of the full telemetry that we have on the P-Core. On the P-Core side, whereas we did have telemetry feedback coming from the P-Core in 14th Gen, we've, I wouldn't say "overhauled", but we've improved a lot of the metrics there and added a few more in terms of the telemetry feedback. And then the last thing we're doing is we're being much smarter on core assignment for workloads.

And so one of the things that we're talking about is, sort of, efficiency containment, where we can lock certain workloads that are running really well onto very specific cores. And that's something the Intel Thread Director is helping to do. So there are efficiency gains we're bringing to the market with Arrow Lake, and then this feature on Thread Director is really just eking out a few more percent here and there when it comes to efficiency and performance. Mm hmm. - That's pretty cool. I would really like to see all those changes that we haven't seen in a while. So.
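Thread Director's "containment" happens in hardware, cooperating with the OS scheduler. A rough user-space analog, manually pinning a process to a subset of cores, can be sketched with the standard Linux affinity API; to be clear, this is plain `os.sched_setaffinity`, not a Thread Director interface:

```python
# User-space analog of "containing" a workload on specific cores. This uses
# the Linux-only os.sched_setaffinity API; it is NOT Thread Director itself,
# which works cooperatively between hardware telemetry and the OS scheduler.
import os

def pin_to_cores(cores):
    """Restrict the current process to the given logical CPUs."""
    os.sched_setaffinity(0, set(cores))  # pid 0 = the current process

if hasattr(os, "sched_setaffinity"):     # guard: Linux only
    allowed = sorted(os.sched_getaffinity(0))
    pin_to_cores(allowed[:4])            # contain to (up to) four allowed cores
    print("now runnable on:", sorted(os.sched_getaffinity(0)))
```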

- Yeah. Yeah. Yeah. - [Alejandro] It'll be interesting to see that. So, okay, that's kind of the overview of the compute tile.

Let's talk a little bit about the GPU tile. So tell us a bit more on that. - Yeah, so we're bringing the Xe core architecture to the enthusiast segment for the first time. It's actually the same architecture that we have on Meteor Lake, but obviously Meteor Lake didn't have the enthusiast segment products. And so we're really excited about that. Compared to 14th Gen, we're gonna be at greater than a 2X improvement on graphics.

It's greater than 2X the compute capability as well from a graphics standpoint. So we're really, really excited about that. Yeah. - No, that's interesting.

So, here's where we hit a little bit of a fork in the road when it comes to the current products we're talking about. So we have the HX and S and they're gonna be using the Xe, right? The current? - Mm hmm. - And then for the H, it's a little bit different. - H is a little bit different, yes. So, on the S and HX, it's a four Xe-core configuration, with the Xe architecture, a great architecture. On H, it's what we're calling "Xe with XMX".

And so there's some nuance there. One is, obviously, it's larger, so it'll be eight Xe cores instead of just four on Arrow Lake H. And it also will have XMX arrays.

So what these are, are systolic arrays attached to each and every vector engine. So there are actually 128 of these arrays in the graphics core. And these are AI workhorses, right? They will greatly improve the AI performance.

So Meteor Lake H, for example, was capable of up to about 18 TOPS from an integer eight (INT8) standpoint. - [Alejandro] Mm hmm. - [Greg] On Arrow Lake H, we'll be able to go up to 77. - [Alejandro] Right. - [Greg] So this is four times, four times. (laughs)

- Yeah, that's pretty big. (chuckles) - The sort of raw capability from an AI throughput point of view. There are some other changes as well. You know, we have twice the cache size, and it'll also have twice the ray tracing capability compared to Meteor Lake.
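The INT8 throughput numbers quoted a moment ago check out with quick arithmetic:

```python
# Quick check on the INT8 throughput figures from the conversation.
meteor_lake_h_tops = 18   # ~18 INT8 TOPS (Meteor Lake H graphics)
arrow_lake_h_tops = 77    # up to 77 INT8 TOPS (Arrow Lake H, Xe with XMX)

uplift = arrow_lake_h_tops / meteor_lake_h_tops
print(f"uplift: {uplift:.1f}x")   # ~4.3x, i.e. the "four times" quoted above
```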

Yeah, so cache size, ray tracing. But the big thing that's really exciting us, and exciting some of our partners as well, is how capable the graphics tile is from an AI standpoint. - [Alejandro] Yeah. And I guess staying a little bit on the AI subject, this is the first time that we're seeing- - An NPU. - [Alejandro] Yes, exactly.

- In the enthusiast segment. Yeah. For Intel. Exactly. No, this is something that's really a game changer. And it's not just about running large language models and things like that, which is very, very cool. - Yeah. - Obviously. (laughing)

But it's also about, you know, a lot of the designs and systems that these products are in are gaming rigs, for example, right? And we're seeing the NPU and AI workloads work their way into sort of every aspect of life. And so, for example, we've got something that we've been talking about where we can offload some of that work from, let's say, a gaming and streaming perspective, right? If you're doing background removal - Right. - or anything like that, AI workloads like that while you're streaming, typically that kind of stuff runs on the discrete GPU that you're also gaming on, right? - [Alejandro] Mm hmm. - Well now, you can run it on the NPU on your big gaming rig and you'll get 10 to 15% better gaming performance.

- Yeah. I mean that, that's good. - That's pretty good. (laughs) - That's really good because then you- - I'll take 15% gaming performance. - Oh, I'll take double digits anytime.

- Yeah, exactly. Yeah. - [Alejandro] And it's, it's pretty cool 'cause then your GPU then actually can be dedicated. - Exactly. It could be dedicated to what GPUs were- - Yeah. - Always meant to do, right? - Meant to do. - Was to play all of my favorite games. (laughs)

- [Alejandro] Exactly. - Yeah, exactly. - [Alejandro] So one of the things that I wanted to ask you, going back to the graphics chat we were having: why that separation? Why do we have Xe with XMX on H, but not on S or HX? - Yeah, so S and HX, these are our enthusiast, top-of-the-line type of systems, right? These are in your biggest and baddest gaming designs.

These are in big mobile workstations and desktop workstations and stuff like that. So, one of the things that happens in these designs is the vast majority of 'em have discrete graphics. And so there's really no requirement or need to have anything larger than something like a four Xe-core solution in those. Arrow Lake H, on the other hand, satisfies a lot of different segments as well. You'll get some designs on Arrow Lake H that have discrete graphics, especially in gaming and things like that, but there are also the premium thin-and-light and powerful type segments where you wanna run just on your integrated graphics.

And that's where we need that performance and power out of the graphics tile. - Ah, that's awesome. That completely makes sense. - Yep. - So moving on to memory, have we improved the memory bandwidth or made any changes there? - From the system memory point of view? - [Alejandro] Yeah.

- Yeah. So we've actually increased the memory speed that we start at, up to DDR5-6400. It comes in all forms.

UDIMM, CUDIMM, all that stuff. We also obviously support dual-channel memory, as is typical for the segment. And then we're actually offering ECC support for more security-conscious designs. - [Alejandro] Designs. - Mm hmm.
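DDR5-6400 in a dual-channel configuration translates into a theoretical peak bandwidth that is easy to compute. Standard arithmetic, assuming the usual two 64-bit channels for this segment:

```python
# Theoretical peak bandwidth for DDR5-6400, dual channel.
# Assumes the standard two 64-bit (8-byte) channels for this segment.
transfers_per_second = 6400e6   # DDR5-6400 = 6400 megatransfers/second
bytes_per_transfer = 8          # 64-bit channel width
channels = 2

peak_gbps = transfers_per_second * bytes_per_transfer * channels / 1e9
print(f"peak bandwidth: {peak_gbps:.1f} GB/s")   # 102.4 GB/s
```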

- A really cool thing, a little history on Arrow Lake. We first started talking about this back in 2019, working with Intel's memory architects and the Overclocking team. And we set some very aggressive goals.

And for the first time ever, we set these, what was that, five-years-out goals. Specifically, we created a new tiered system. We have tiers one, two, and three. Internally, we have goals and targets that we're all aligned to achieving. And so we've said we want to reach these different goals. Some of them are for the average everyday XMP customer.

Some of 'em are for the more aggressive air-cooled overclocker. And some of 'em are for those folks that are out there setting world records and reaching new heights. Super proud of the hard work done by the memory architects at Intel. They put together an amazing new memory controller, brand new compared to the past. It is capable of reaching greater heights, overclocking-wise. We have the ability to support a technology called CUDIMMs.

And from an overclocking perspective, we're seeing just incredible results. So a CUDIMM is very similar to the traditional UDIMM, but one of the main differences is that there is a re-driver, what we call a clock driver, internal to the memory module. And so it will generate essentially its own clock, nice and clean.

And that can enable higher speeds, especially for those kind of mainstream and low-frequency XMP levels, regardless of some of the characteristics of the motherboard that could otherwise be an impediment. So even lower-price-point boards could actually really benefit from this.

- All right, so this is an enthusiast product, so of course we have to start talking about overclocking. - Yeah. - What is new? - Yeah. Yeah. - What does Arrow Lake bring new to the table? To the overclocking table? - Yeah, okay. So, you know, we're not really gonna talk about how much frequency- - Yeah, exactly. - Yeah, yeah, yeah. But we are bringing a ton of new features into how we overclock.

- Arrow Lake has a tremendous number of new overclocking capabilities and features. See, having a new architecture, the new SOC, a multi-chip design, presents new opportunities for overclockers. So we have a lot of new features.

Right off, just knowing that we now have multiple chips, how about the ability to overclock the interfaces between the chips that are talking to each other? So we have a new die-to-die overclocking feature that's really exciting. Die-to-die is an interface between our compute die and our SOC die, where the memory controller resides. And what's cool for overclockers is now we're allowing you to overclock that interface, creating a whole new opportunity to improve bandwidth and further tune your system. We have a new feature that allows you to configure the processor's overclocked ratios in 16.6 megahertz steps. In fact, they can be larger.

But 16.6 megahertz is interesting because now overclockers can move in sub-hundred-megahertz steps. So now, overclockers can configure each P-Core in 16.6 megahertz increments, and the E-Cores, each group of four, can also be configured in 16.6 megahertz increments for those top frequencies.

So imagine you reach the end, what you thought was the end, 'cause you couldn't go another hundred megahertz up. Well, with this, you might be able to go 16 or 32 megahertz or so on, to find where that absolute max threshold is for your system. We've got a new dual BCLK architecture. Now, everyone knows about BCLK going back 20 years. There's generally always been a single BCLK input, you know, that's your base clock. Now, we have dual BCLK, and that helps because you can now have a separate clock frequency for your compute die versus the rest of the chip, the I/O die, memory, et cetera. That's really important because, when it comes to BCLK overclocking, you're always at the mercy of the first subsystem to fail.
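The 16.6 MHz figure is one sixth of the usual 100 MHz base clock, which is why it lets a core land between the old whole-ratio frequency points. A sketch of the arithmetic; the sub-step mechanism here is a simplified illustration, not the actual register interface:

```python
# Why 16.6 MHz steps matter: core frequency = ratio x BCLK, and with a
# 100 MHz BCLK, whole ratios move in 100 MHz jumps. A 16.6 MHz step is one
# sixth of that. Simplified illustration only, not the register interface.
BCLK_MHZ = 100.0
STEP_MHZ = BCLK_MHZ / 6          # ~16.67 MHz sub-ratio step

def core_freq_mhz(ratio, substeps=0):
    """Frequency for a whole ratio plus some number of 16.6 MHz sub-steps."""
    return ratio * BCLK_MHZ + substeps * STEP_MHZ

print(core_freq_mhz(57))         # 5700.0  -- old whole-ratio ceiling
print(core_freq_mhz(57, 1))      # ~5716.7 -- one extra 16.6 MHz step
print(core_freq_mhz(57, 2))      # ~5733.3
```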

So by being able to separate that into dual BCLK, you essentially can have one section of the processor at a higher frequency than others. You can have, for example, one external clock and one internal clock, or you could actually control them both internally. So it gives you a lot of flexibility, a lot of choice. Now we have the ability to do fabric overclocking on Arrow Lake, another opportunity brought by the new architecture. This essentially can improve the communications within the SOC die itself.

And when you're doing memory overclocking, you may choose to overclock the fabric, die-to-die, and then of course the memory frequency as well, to ensure full bandwidth without a bottleneck. - We also have something called DLVR bypass. So this is one of the more higher-end, higher-tech type of features.

It lets you totally bypass our automatic voltage control. - [Dan] DLVR is Digital Linear Voltage Regulator, and it is a new power delivery architecture for desktop processors. It brings with it some great efficiency benefits. So that power efficiency is there.

And it can allow the opportunity for overclockers to essentially have per-core voltage. So each P-Core can have its own voltage, and for each group of four E-Cores, you can configure a voltage for those as well. So it's added flexibility.

You don't have to run them all at that same fixed voltage. For our extreme overclockers, people looking to go after world records, that top 1% of all consumers, they can actually bypass DLVR. Again, it's extreme overclocking. We expect to see that done with liquid nitrogen, but you can bypass it such that your voltages can go even higher and enable those maximum LN2, liquid-helium-type frequencies.
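The per-domain voltage control described here, one voltage per P-Core and one per four-wide E-Core cluster, can be pictured as a simple table. The dictionary layout and the voltage values below are hypothetical, purely for illustration; real tuning happens in BIOS or in tools like XTU:

```python
# Per-core voltage domains enabled by DLVR: one per P-core plus one per
# 4-wide E-core cluster. Layout and values are hypothetical illustrations.
voltages = {f"P{i}": 1.25 for i in range(8)}              # 8 P-cores
voltages |= {f"E-cluster{i}": 1.10 for i in range(4)}     # 4 E-core clusters

voltages["P2"] = 1.32           # push one strong core a bit harder...
voltages["E-cluster3"] = 1.05   # ...and undervolt a cluster that runs cool

print(len(voltages), "independent voltage domains")       # 12
```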

Arrow Lake's new overclocking features are also available within the Extreme Tuning Utility, specifically the 16.6 megahertz ratios, the dual BCLK, viewing the memory telemetry, and a few settings as well. All these settings can be found in the XTU user interface. So for me personally, one of the things I'm excited about with Arrow Lake, really, are the new 16.6 megahertz ratios. It's actually hard to pick a favorite. So I'll also have to say exploring that die-to-die interface overclocking, and the fabric overclocking as well.

And I think for me, on the memory overclocking side, I'm gonna get some modules that can run at that 8,000 megatransfer range, but that have headroom so that I can push them a bit further. I'm gonna really go after that 9,000-plus range over time. So we'll see what I can do there.

But I'm just really excited to get a handle on and explore these new customizations that I can make that were never possible on prior generations. - What is the thing that you're most proud of when it comes to the development and launch of this product? - It's really the team, the team dedication. Arrow Lake was a product that we executed pretty much around the globe. We had four major sites contributing to this amazing product, and there are many other Intel sites spread around as well. We have a team in the US, a team in Malaysia, a team in India, and a team in Israel. And those are major teams, different IPs, different functions.

And if you look at the time zones associated with that, it's just crazy. - Yeah. - But yet, the team put it together. There were really creative ways of working, automation, and hard work that really brought all of them together to deliver an amazing product. So the team effort, the dedication, and the one intent that came across in this product are by far the things that I'm most proud of.

(upbeat music) (bright music)

2024-10-15 13:30
