Welcome to the Siemens EDA podcast series on 3D IC chiplet ecosystems brought to you by the Siemens Thought Leadership team. In our recent podcast on 3D IC topics, we talked about chiplet ecosystems, the design workflow needed for 3D IC, the current state of 2.5 and 3D SIP design flows, and the evolution required to support the design community to develop these 3D IC-based devices. Today, we will discuss some of the front-end architecture, RTL level design, and verification aspects of the 3D IC flow. I’m pleased to introduce two special guests today: Tony Mastroianni, who is the Director of Advanced Packaging Solutions at Siemens EDA, and Gordon Allan, Product Manager for Verification IP solutions, also at Siemens EDA. Welcome back,
Tony, and hello, Gordon. Thank you both for taking the time to talk with me today about 3D IC front-end architectural aspects. And before we dive into the discussion, would you both mind giving our listeners a brief description of your current roles and background? Sure, John. My name is Tony Mastroianni, and I’m responsible for developing our 2.5 and 3D IC strategies and workflows at Siemens EDA. My background prior to Siemens has been primarily in IC design, and mostly project management over the last several years. I was involved in advanced
packaging flows at my previous employer, which was a fabless semiconductor company. And there we were developing very complex integrated circuits for various customers. I was there about 18 years. In the last three years or so, we started getting involved in 2.5D designs, incorporating HBM and interposers, as well as a design where we actually split a chip into two dies that were integrated in a 2.5D package. While I was there, it became apparent that our traditional design methodologies needed some major enhancements. So, for the last three years there, I was working on developing our new integrated packaging and IC design flows. I started about a
year and a half ago at Siemens, as I mentioned, working on the 2.5 and 3D workflows. Hey, John, thank you for the introduction and for inviting me to talk with you today. My name is Gordon Allan, and I’m the Product Manager for our Verification IP portfolio here at Siemens EDA. My background is in SOC design and verification, stretching back to the early 1990s, where I gained broad knowledge from spec through to silicon. But my main area of expertise is verification. I was one of the architects and authors of UVM. And I’ve been with Siemens EDA for the last decade, bringing solutions based around SystemVerilog and UVM to our customers.
Thanks to both of you for sharing that with us. Let’s get into the front-end design topic for 3D IC. We’re hearing a lot of talk in the industry about these 3D IC technologies and their technical challenges. And many think these technologies and flows are dominated by physical aspects: packaging technologies, thermal stress, mechanical stress, all of those great problems that we have to solve and that we’re putting flows into place for. But what about the front end of the IC design process: chip architecture, RTL design, and RTL verification? Is that even relevant here? And why are we talking about this? Yes, John, this is very relevant. Traditional IC design scaling has been accomplished primarily through IC technology scaling over the past 30 years or so. This process is referred to as Design Technology Co-Optimization. But as IC technology scaling with Moore’s law has dramatically diminished, a new process named System Technology Co-Optimization is extending this design scaling. System Technology Co-Optimization, referred to as STCO, is about enabling architectural and technology trade-offs early in the system design process to achieve high-performance, cost-effective solutions in reduced timeframes. Predictive modeling is a fundamental component of STCO that leverages high-level modeling tools during the planning phase to home in on an optimal solution. 3D IC design implementation requires co-design and co-optimization workflows. But before you jump into the back-end packaging flow,
this decomposition needs to happen at the architectural level. But how do you do that? How do you even know what different options are available, and whether they’re valid for your requirements? And if so, how do you home in on the right microarchitecture, where you’re starting to partition things into blocks? Now, with 3D IC and chiplet ecosystems, those blocks may be blocks within an ASIC, or they may be separate chiplet blocks integrated into a package. These chiplets could be off-the-shelf devices, or they could be full custom ASIC designs.
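To make that partitioning idea concrete, here is a small, purely illustrative Python sketch of how candidate partitionings might be captured as data. The chiplet names, process nodes, and power numbers are invented assumptions for illustration, not a real chiplet catalog or any Siemens tool API.

```python
# Hypothetical sketch: capturing one candidate 3D IC partitioning as data.
# All names and numbers are illustrative assumptions, not a real catalog.
from dataclasses import dataclass

@dataclass(frozen=True)
class Chiplet:
    name: str
    source: str        # "off-the-shelf" or "full-custom"
    process_nm: int    # assumed process node of this die
    power_w: float     # rough power budget estimate in watts

# One candidate microarchitecture: a custom compute die surrounded by
# reusable "Lego block" chiplets on a 2.5D interposer.
scenario = [
    Chiplet("compute",   "full-custom",    5, 45.0),
    Chiplet("io-serdes", "off-the-shelf", 12,  8.0),
    Chiplet("hbm3",      "off-the-shelf", 10,  6.0),
]

custom_dies = [c for c in scenario if c.source == "full-custom"]
total_power = sum(c.power_w for c in scenario)
print(f"{len(custom_dies)} custom die(s), {total_power:.1f} W total")
```

The point of a representation like this is that each scenario becomes a cheap object to enumerate and compare, before any RTL exists.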
So, to determine which microarchitecture is best for your application, you need to do some high-level predictive analysis quickly and find an architecture that meets your specific product requirements. Those requirements or priorities may include power, performance, the physical size or footprint of the product you’re building, nonrecurring engineering cost, the unit cost of the devices you’re building, and time to market. Some customers will be designing at the bleeding edge of technology, while others will be looking for more of a “Lego block” type of approach, where they can just snap these chiplets together to optimize the overall cost of these complex systems in packages. Now, going the chiplet route needs to be considered as an additional design paradigm. So, let’s look at those multiple concerns, actually all of the usual concerns for any normal single-die SOC project: architectural analysis; then factoring in manufacturability, that would be test, package, thermal, and stress analysis; the functional verification of your design, your interfaces, memories, processors, and integration; and then physical assembly and verification, which is floor planning, timing, bandwidth, signal integrity, and power. And we’re now looking at ASIC, interposer, packaging, and PCB techniques. So, for the microarchitecture analysis and optimization, the idea is identifying your die-to-die interfaces and your typical components and the associated data interfaces, capturing those viable design scenarios, and then running your functional simulations and doing some initial high-level analysis of power, thermal, and throughput. So, this is the predictive analysis we need to do upfront. There is an analogy here, I think, at the 3D package level: it’s like doing upfront floor planning. And just as we must do SOC pad ring design today, now we’re talking about synchronizing multiple pad rings within the package. And in my world, which is RTL functional verification,
there’s the need to have functional verification of the off-chip, and die-to-die interfaces with bandwidth and latency, overall design behavior, performance, throughput, and so on, all of these need to be measured. Ideally, we’d like to be able to do as much of this analysis as possible at this initial high level before the serious RTL development and integration begins. So, the idea is we will go in and start selecting the packaging technology and then the chiplet mapping and interconnect, make some choices there, understand that high-level floorplan and pad rings. And then we would do the mapping to target implementation where we can look in more detail, verifying our PPA. And the tool flow can enable more detailed floor planning and some rough routing of some of the channels to do some preliminary signal integrity analysis and so on.
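As a toy illustration of the kind of predictive analysis being described, here is a hedged Python sketch that scores a few candidate packaging scenarios against weighted product priorities. The scenario names, normalized metrics, and weights are all invented for the example; a real STCO flow would derive these from predictive models rather than hard-coded numbers.

```python
# Illustrative only: weighted scoring of candidate packaging scenarios.
# Metrics are normalized per scenario (higher is better on each axis);
# the numbers below are assumptions made up for this sketch.
scenarios = {
    "monolithic-soc": {"perf": 0.9, "power": 0.6, "cost": 0.5, "ttm": 0.4},
    "2.5d-chiplets":  {"perf": 0.8, "power": 0.7, "cost": 0.7, "ttm": 0.8},
    "3d-stacked":     {"perf": 1.0, "power": 0.8, "cost": 0.4, "ttm": 0.5},
}

# Product priorities: this hypothetical product values cost and time to market.
weights = {"perf": 0.2, "power": 0.2, "cost": 0.3, "ttm": 0.3}

def score(metrics):
    """Weighted sum of the normalized metrics."""
    return sum(weights[k] * metrics[k] for k in weights)

best = max(scenarios, key=lambda s: score(scenarios[s]))
for name, m in sorted(scenarios.items(), key=lambda kv: -score(kv[1])):
    print(f"{name:15s} score={score(m):.2f}")
print("chosen:", best)
```

Changing the weights to favor raw performance instead of cost would steer the same scenario set toward a different winner, which is exactly the sort of early trade-off exploration described above.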
And there we’re getting towards the collaboration between the system designer or RTL architects, and package architects and design teams who would assess those deeper back-end concerns. Okay, you’ve got me convinced it’s necessary to get started with the 3D IC process upfront. So, let’s look at the front-end design and verification aspect of 3D IC. What do
the SOC architects, design leads, and verification leads need to know as they take steps into the new flows for their upcoming projects? There are several areas here that we can discuss. One is architectural decisions around packaging, partitioning, and reuse that affect the IC’s functional architecture. The second would be interface connections: how to communicate from die to die, and how to design that communication channel. Third would
be interface verification: how to integrate and verify all of those die-to-die connections using standard protocols and memory interfaces. That’s right. In any 3D IC project, certain packaging and partitioning decisions will be made upfront from fixed criteria and experience, and others will be decided during the course of the architectural exploration and definition phase by evaluating several options and choosing one that meets the evolving requirements. Still others will be deferred until the project has resolved sufficient technical unknowns to finalize these decisions. You can see that the introduction of this new technology has an impact here. Actually, it brings a great opportunity that was not previously available to chip architects. But with that comes a need to consider the impact on the whole design flow, the manufacturing flow, and the associated cost. Right. Packaging and partitioning is hard
enough in a single-die SOC flow. But now we’re introducing chiplets as solutions for reuse, for integration, for handling disparate power or other technology aspects. The chip architect needs to evaluate and narrow down the options early on in the process. The key message here is that you’ve got a lot more tools in your toolkit and one more degree of architectural freedom that you didn’t have before. It’s a problem, but there’s room for optimism here; it’s a good problem to have. We believe that 3D IC is a topic that every front-end design lead or SOC architect needs to be
informed about today. In fact, there are two new things just announced this last week which are relevant here and which will help change the game for 3D IC technologies. I want to mention them both during our discussion today. First up is the new industry consortium announced on March 2nd: the Universal Chiplet Interconnect Express standard, or UCIe for short, which rhymes with PCIe. This has several major market players bringing together the best of PCI Express Gen 6 technology and Compute Express Link, or CXL, technology into a proposed interface standard for die-to-die communication. UCI Express will leverage the PCI Express Gen 6 interconnect and physical layer to provide a known-quantity, robust, fast die-to-die interconnect, letting you as architects make partitioning decisions your own way, knowing that the interconnect can be a solved problem. We anticipate that this will help with the expected ecosystem of chiplets available for integration in packages. And of course, UCIe joins a list of other contenders for that die-to-die interconnect: XSR, USR, AIB, and others. But we really see its potential here. So, clearly, there are a lot of concerns to juggle early in the chip definition and development project before we go off and write RTL and verify it. You mentioned that in some respects,
this is just like a normal SOC design process, but just more. Can you comment on some aspects that are particularly interesting or enabled by the adoption of 3D IC and chiplet technology? One area of interest, I think, is functional safety and redundancy. We see in certain markets, like automotive, that redundancy is a big deal because it’s a harsh environment. So, you need redundancy in hardware and software systems if that chip is going to drive the car. So, they’re going to use different architectures, so that if one fails, they have a backup. But even for the interconnect in 3D IC, these connections are tiny little bumps on a substrate or interposer. There are a lot of vibrations going on in a car, so there’s going to be some finite risk of mechanical failures on those connections. This necessitates smart technologies baked in with Redundancy and Repair techniques, or R&R for short, which are supported today, for example, with HBM memories and in some of the other interconnect protocols; that’s going to be required for automotive applications. That’s a good point. In harsh environments, such as an automobile, the internal die-to-die interconnect on the interposer, which connects the chiplets together, can fail from either electromigration or mechanical stress. And
for these types of designs, redundant routing channels can be deployed. So, test hardware and methods are available to detect and actually repair these defects by rerouting the failed channels to redundant channels that are designed into the interposer. This approach can also detect and repair memory defects. Additionally, as you mentioned, redundant internal blocks or
even chiplets can be deployed in the system, and then swapped in on the fly during the operation of the device if one of those components is not functioning properly. It’s interesting what multiple chiplets bring to this problem. Some aspects of functional safety should be considered upfront at the architecture stage. There are multiple levels of potential redundancy: within the die, within the package, and across packages. So, a chiplet approach can help because your multiple redundant elements are now often separate chiplets, so they’re unaffected by point failures or common-mode concerns like power, thermal, and mechanical stress. If one of the chiplets fails, the others are going to survive and still drive the car. That’s a very interesting conversation, guys. So,
what are you, in EDA, doing to help front-end chip design and verification teams who are looking at partitioning, interconnect, architectural definition, and redundancy? That is, what flows and solutions do you have here? We’ve heard a lot in other podcasts about different parts of the flow. My own specialty is in verification IP. We provide solutions for PCI Express Gen 6, for Compute Express Link, CXL, and for advanced memories, DDR5 and HBM3 memory interfaces. And we have customers across multiple markets, processor makers, memory leaders, SOC makers, aerospace and defense leaders; they’re all interested in this technology. One of our areas of strength is the automation we provide to auto-generate test
benches and let design verification teams get up and running in minutes. This kind of productivity enables architectural exploration of the sort that we’re talking about here, and will give early confidence in the solution space that architects are looking at. And you can be sure that we will be providing solutions for the emerging 3D IC interfaces such as UCI Express as part of Siemens EDA’s overall end-to-end flow for 3D IC. As an industry, we’re making a bet that we can tip the balance, both financially and technically, and help customers achieve more complex designs through disaggregation and decomposition, by deploying STCO. And with this disaggregation, we can have multiple teams, each working on their respective domains, making verification easier to divide and conquer. So, we help those teams make their architectural trade-offs and their early experiments and exploration. As Gordon said, we can provide verification IP and workflows that
help to automate the rapid generation of these scenarios for exploration. And then, ultimately, the rapid generation of more detailed scenarios once we’re homing in on a chosen architecture, so that we can go into the packaging implementation flows with all of this upfront design completed, including the architectural design and the upfront RTL verification, knowing that we have a solid solution that’s going to meet our functional goals. So, earlier, you mentioned that there were two pieces of major news this past week. We talked about UCI Express already. I’m curious, was the other one related to Apple’s new M1 Ultra processor chip announcement? Is that what’s causing the stir? Yes, indeed, it was like a “one more thing” moment, a major 3D IC announcement from Apple that we’ll talk about. But first, the audience listening to this podcast might be in all kinds of different end-use markets for their ICs. And the question
is not which markets are relevant for 3D IC, it’s becoming more like which markets are not relevant for 3D IC. It looks like this technology is applicable to multiple markets. We’ve been talking with customers from [17:55 inaudible] and space, all the way to high-performance compute, and consumer applications. As I mentioned, 3D IC is a topic that every front-end SOC design or verification team should make themselves aware of. The innovations that we see from the market leaders, such as Apple, will surely ripple down to all of us. Consumer applications – so, take a look inside your Apple Watch or your i-device, look at the new Apple M1 chip family. They all use
chiplet and die-to-die technology, with wide memory in package, for example. And in larger desktop devices like the Mac, we see the new M1 Ultra processor chip just announced last week by Apple, which uses eight advanced memory chips in the package for a total of 32 fast memory channels; that’s 800 gigabytes per second of memory bandwidth. But more importantly for 3D IC, the main SOC in that package consists of two of Apple’s existing M1 Max dies, connected edge to edge by a silicon bridge. They refer to it as an interposer, and it hooks up over 10,000 signals that were pinned out along one edge of the existing M1 Max die to enable this doubling of processing capacity. That thing is about 47 millimeters long in total.
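The bandwidth numbers in this discussion can be sanity-checked with simple arithmetic. The 8 x 128-bit LPDDR5 configuration at 6400 MT/s used below is an assumption based on public reporting about the M1 Ultra, not an Apple specification:

```python
# Back-of-the-envelope check of the quoted bandwidth figures.
# Assumed config: 8 LPDDR5 packages x 128 bits each at 6400 MT/s
# (public reporting, not an official Apple datasheet).
bus_bits = 8 * 128                 # total memory bus width in bits
transfers_per_s = 6.4e9            # 6400 MT/s
mem_bw_gbps = bus_bits * transfers_per_s / 8 / 1e9   # bytes/s -> GB/s
print(f"memory bandwidth ~= {mem_bw_gbps:.1f} GB/s")  # ~819, quoted as 800

# Die-to-die side: 2.5 TB/s spread over roughly 10,000 bridge signals.
d2d_bits_per_s = 2.5e12 * 8
per_signal_gbps = d2d_bits_per_s / 10_000 / 1e9
print(f"~{per_signal_gbps:.1f} Gb/s per die-to-die signal")  # ~2 Gb/s each
```

The takeaway is that a very wide, relatively slow-per-wire parallel interface over a silicon bridge achieves aggregate bandwidth that serial multi-chip links struggle to match.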
The interconnect alone provides 2.5 terabytes per second of bandwidth from die to die, which is more than four times the bandwidth other multi-chip interconnect technologies could provide. And what’s interesting about this approach is that it looks like Apple architected it that way much earlier in the M1 design process, with the floor planning and interconnect already laid into the M1 Max chip, essentially future-proofing their design. And now they have
one of the fastest integrated processors and GPUs on the planet. So, these boundaries are being pushed by the top of the market, but we can be sure that the technology and design flows will trickle down and become more accessible to all, and that’s part of our job as EDA. We’re going to see a healthy mix of proprietary and industry-standard solutions on the menu. So, we’re seeing a lot of innovation coming from Apple, Intel, consortia, and foundries. What secret sauce is EDA working on for the front end? And what are the flows and solutions that will grow up with this emerging technology to help the front-end architect and RTL teams? Interesting question, John. As we previously discussed, we offer the ability to capture alternative design scenarios leveraging chiplets and 3D IC technologies, and the predictive models and workflows to assess each of those scenarios. So, as the complexity of these systems increases,
it can be a daunting task to generate and assess a multitude of scenarios. So, this challenge lends itself very nicely to leveraging machine learning technologies to automate the generation, assessment, and optimization of solutions, to home in on the options that best meet the design requirements in a more automated and timely manner. So, I think this is a key area of innovation that will extend this technology’s adoption to a broader set of customers beyond the current small set of very advanced users in this space today. Absolutely. And we encourage all of our audience
to invest time in learning this from day one. And we’re really excited to be investing to provide you with the tools and the flows to help you do that. That’s great. I want to thank you, Gordon and Tony, again, for another highly informative discussion on front-end architectural design verification considerations in this episode of our 3D IC series. We’re all out of time today, but we’re looking forward to the next 3D IC podcast with you. Again, thank you both. And we want to thank all of you, our listeners,
for listening to our podcast today. Yes, thank you, all. And thank you, John. Thanks, John, for hosting. And Gordon, great discussion.
2022-12-19