The power requirements of AI hardware continue to grow, and that requires innovation at every level: the architecture of the power flow in the data center, the power supply, and the individual component. We are even innovating on quantum computing hardware, which could be the next big thing in AI.
Hello and welcome. This is the podcast for engineers, the podcast you just have to listen to if you're interested in what's going on in the semiconductor market. I'm your host, Peter Balint, and today we are joined by Richard Kunčič, who oversees the business line responsible for power supply units, battery backup units, intermediate bus converters and much more in the realm of powering AI.
So welcome. Thank you, Peter. It's great to be here on your podcast. The first question I have for you: we're seeing this boom in the power needed for AI. For example, a few years ago we could look at a server rack in a data center and it consumed maybe 60 kW of power.
Today it's a different story. We're talking about 150 kW of power and more. So what does this energy explosion mean in terms of the trends that are happening right now? You're right, you're referring to the increasing power levels in server racks and data centers.
And I will be happy to dive into that topic, but I would like to give it a little bit of framing, because we here at Infineon believe that what we see now with respect to AI adoption, and in particular the increased server and computing capabilities needed for AI, is really just the beginning. What we see here is a convergence of technologies: software on one side, hardware evolution on the other, and on top of that decentralized energy sources. All three combined converge to create what I would call an exponential growth of technology and technology adoption.
And I believe that this exponential growth, this exponential era, will truly transform our society within the next few years, whether we like it or not. What we see right now with respect to the AI that every one of us can use are really just the first opportunities to use AI assistants like Grok, DeepSeek or AstraAI, for example if you want help with math exams. It's cool, but the potential of AI is much broader. Intelligence so far was basically confined to humans, to our brains.
And now it's becoming abundantly available everywhere. And the AI models are improving very quickly: large language models have now been tested with an IQ of up to 150, while just nine months ago it was around 90.
So you see this exponential improvement of solutions. Now just add the advancements in robotics, and you can see where this exponential age is taking us. Yeah, it's fascinating, but what is the role of energy in all of this? That's a good question, Peter, because I believe energy and intelligence are very closely linked. On one side, smarter AI can help us tap better into energy sources, in particular decentralized energy sources.
And this is important to bring the energy sources closer to the data center. On the other side, more energy allows us to build bigger data centers and to train bigger models, which are then more powerful. But with the advancements of AI, what we see is that the demand for electricity grows disproportionately.
And that is a bottleneck right now for the build-out of AI. There are multiple ways to address it, but three of them are mentioned very often. One is more and more decentralized renewable energy sources. Point number two is working on the software, and here the DeepSeek models showed that with smart preparation of the data you can really reduce the amount of energy needed to train the models. And the third one is about improving the hardware in a data center, in particular everything related to the power flow in the data center. This is where Infineon can really make a great contribution, because we're looking at the complete power flow from the grid.
And then through all the conversion steps down to the core, so to the AI chip. On one side, we are enabling better power supplies in these conversion steps.
And on the other side, we're even thinking about what the best architecture approaches could be, so the best conversion steps within the data center to make the whole power management more efficient. So then let's dive into this idea of increasing the power to the rack, to the server rack.
Why do you think architecture changes are necessary moving forward? To put it very simply, we have two challenges to solve. On one side, the GPUs, so the chips, are getting more and more powerful.
We are today at numbers of up to 2,000 watts per chip, and the numbers keep rising as we speak. On the other side, the architects of these racks want to put more and more GPUs into that very same rack. That means in the future you don't have enough space to also house the AC/DC power supplies in that rack. So you want to move the power supplies out, which has its own challenges: basically, you need to bring the power back in.
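To get a feel for where these rack power levels come from, here is a rough, illustrative sketch. The 2,000 W per chip figure is from the conversation; the GPU count and the overhead factor are hypothetical round numbers chosen for illustration:

```python
# Illustrative rack power budget. Only the per-GPU wattage is from the
# conversation; the GPU count and the 25% allowance for networking,
# CPUs and cooling are hypothetical assumptions.
GPU_POWER_W = 2_000
GPUS_PER_RACK = 72          # hypothetical dense-rack GPU count
OVERHEAD_FACTOR = 1.25      # hypothetical non-GPU load allowance

rack_power_kw = GPU_POWER_W * GPUS_PER_RACK * OVERHEAD_FACTOR / 1_000
print(f"Estimated rack power: {rack_power_kw:.0f} kW")  # 180 kW
```

Even with conservative assumptions, a rack densely packed with 2 kW chips quickly lands in the 150 kW-and-above range discussed here.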
Typically they use bus bars, but these bus bars need to be modified, because they now need to handle very high power at low voltages, which means a lot of current needs to flow over these bus bars into the racks. In addition, you have all these thermal challenges: despite the fact that you already use water cooling in these racks, there's a lot of heat, because basically all the power of the rack needs to be dissipated. So then how does this translate concretely to rack architecture? We see that everything that can be moved out of the rack, like the AC/DC power supplies, will be moved out. But then you need to bring the energy back in, and today this would happen at lower voltages.
So 50 volt, for example. But lower voltages mean higher current. So if you want to improve that, you could move to higher voltages to supply the rack.
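The effect of the bus voltage on current is plain Ohm's-law arithmetic. A minimal sketch, assuming an illustrative 300 kW rack load:

```python
# Current needed to deliver a given rack power at different bus voltages.
# I = P / V; the 300 kW rack load is an illustrative assumption.
RACK_POWER_W = 300_000

for bus_voltage_v in (50, 400, 800):   # 800 V ~ a +/-400 V split bus
    current_a = RACK_POWER_W / bus_voltage_v
    print(f"{bus_voltage_v:4d} V bus -> {current_a:,.0f} A")
```

At 50 V this load needs 6,000 A through the bus bars, while 400 V cuts it to 750 A, which is why thinner conductors become feasible at the higher voltage class.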
But once you start to use higher voltages, you need to take care of different safety and isolation requirements. The next logical step would be to go to a voltage class of 400 volt, or maybe even plus and minus 400 volt. But once you do this, you get a lot of additional advantages and disadvantages. So tell us about the advantages first.
The first and most obvious advantage is that you can bring energy and power with much thinner wires into the core. The second is related to the next conversion step, the DC/DC conversion, which can be done in a much more power-dense way. And the third advantage of using 400 volt is that it's a voltage that is already widely used in electric vehicles today.
So you can reuse a lot of the product ecosystem that has been developed around 400 volt. So those were the advantages. The disadvantages are that you really need to prepare a different infrastructure in the data center, and some of the older data centers might have legacy equipment.
That is not compatible with the safety and isolation standards you need to apply here. So I assume that this 400 volt DC will rather be used in new installations, like the programs around Stargate, where everything can really be designed from scratch and you can already prepare for that. One more disclaimer I would like to make: the work we do today around bringing more and more GPUs into one rack is still based on the hypothesis that this is a system requirement. And the system requirement comes from the fact that today the GPUs are connected via copper cables, and because they need to handle very high bandwidth, the length of these cables is limited to 1.5 m.
So you really want to have all the GPUs in one rack. In the future, you could also imagine photonics, basically laser light, connecting the GPUs. But at this point in time, we believe that this is something that's at least 3 to 5 years out.
So for the time being, we work with the hypothesis that this physical proximity of GPUs is necessary and the power management needs to adapt to these requirements at this given moment in time. Okay, interesting to look at the advantages and disadvantages. But if we go to this 400 volt architecture, what would this mean for our power supplies? Well Peter, such a change in the architecture of a data center really requires innovation on all levels of the power conversion.
And for us this means that we would really like to use our complete portfolio of power products: silicon, silicon carbide and even gallium nitride. So let's start with the first conversion step, which would be the AC/DC rectifiers, so the front-end power supply that converts up to 600 V of AC voltage down to 400 volt. In this case we would very likely use innovative, novel topologies.
I'm thinking about five-level topologies in the power factor correction piece of it. And we would use silicon carbide products together with silicon, depending on the use case, for best efficiency on one side and of course also price performance. Efficiency in particular is important for these AC/DC rectifiers because they are outside of the rack.
So there's enough space, but efficiency is important because there's a lot of energy that needs to be converted. And we are targeting efficiency levels of up to 98.5% for that piece of the conversion.
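At these power levels, even a small efficiency gap turns into serious waste heat. A minimal sketch, using the 98.5% target mentioned above and an illustrative 300 kW rack load:

```python
# Waste heat of a conversion stage at a given efficiency.
# The 98.5% target is from the conversation; 300 kW is illustrative.
def conversion_loss_kw(load_kw: float, efficiency: float) -> float:
    # Input power is load / efficiency; the excess is dissipated as heat.
    return load_kw / efficiency - load_kw

print(f"{conversion_loss_kw(300, 0.985):.1f} kW dissipated as heat")
```

Roughly 4.6 kW of heat per 300 kW rack, for a single conversion stage, and every extra tenth of a percent of efficiency directly shrinks that number.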
The next conversion step is from 400 volt DC down to 12 volt, or sometimes even below 12 volt. And here we are now moving back into the rack, close to the GPU. So here the requirements are slightly different: it's all about power density. The real estate around the GPU is scarce and super expensive.
So we want to come up with the smallest possible form factor, and for that we use our latest and greatest products in gallium nitride, for example bidirectional gallium nitride switches.
But we also use silicon, and depending on the sweet spot of each technology we use these components according to the use case. As a blend, we believe we can really deliver the highest power density for these DC/DC converters that are then feeding the GPU. So 400 V seems like a possibility that's right around the corner.
But what lies beyond that? Yeah, so the described solution with the transition to 400 volt would bring us to power levels of around 300 kilowatt per rack.
And that assumes that the power supplies are standing next to the rack. That is good, but not good enough: the power levels will continue to go even higher.
They will very likely reach half a megawatt in a foreseeable time horizon. For that, further and more fundamental changes in the power architecture are needed. The ultimate goal is to have only IT racks standing in the data center, supplied by a 400 volt DC busbar across the data center.
So you would have centrally produced 400 volt, maybe plus minus 400 volt, distributed across the data center. But that of course comes with its own set of safety and isolation requirements. You need to have dedicated personnel to handle this kind of equipment.
You also have the challenge of battery backup units that operate at that 400 volt level, and you need protection features: protection ICs and potentially e-fuses that can be used in a 400 volt environment. Infineon is innovating in all these product categories in order to be ready with a portfolio of products for the 400 volt ecosystem. So as we mentioned at the beginning of the podcast, the challenge is the increased need for more energy and power, and you've given us a couple of solutions that are maybe right around the corner. But what other alternatives are there? What else can we apply here? Yeah, in fact I do see a lot of potential in quantum computing. For certain use cases, quantum computers have clear advantages over traditional brute-force approaches, where you put billions of transistors on a chip and then use 100,000 of these GPUs in a data center to train models.
There are certain use cases, like optimization problems, and one of the most famous optimization problems is the traveling salesman problem. Let me briefly describe it for the audience: a salesman wants to visit customers who live in different cities, and you want to find the shortest path between these customers. You can, of course, calculate all the options and then see which path is the shortest. With a customer count of ten, this is still something you can do with a classical computer, because in total you already have a remarkable number of more than 181,000 options for the sequence in which to visit these ten customers.
But once the customer count increases, say to 100 customers, it's impossible to do this with brute force. It's impossible to calculate all the options, because there are more options than there are atoms in the observable universe.
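The combinatorics behind these numbers can be checked in a few lines. This is a sketch, counting closed tours where the starting city and travel direction don't matter, which gives (n - 1)! / 2 distinct routes:

```python
import math

def distinct_tours(n_cities: int) -> int:
    # Closed tours visiting every city exactly once: fix the start
    # city and halve for travel direction, giving (n - 1)! / 2.
    return math.factorial(n_cities - 1) // 2

print(distinct_tours(10))               # 181440 routes for 10 customers
print(math.log10(distinct_tours(100)))  # ~155.7, i.e. roughly 10^155 routes
```

Ten customers give the 181,440 routes mentioned above; a hundred customers give on the order of 10^155 routes, dwarfing the roughly 10^80 atoms in the observable universe.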
This is where quantum computers come into play; this is where you can use them at their best, in their sweet spot. And you find this kind of optimization problem in other fields too, for example in DNA sequencing or in supply chain optimization. And in the semiconductor industry, for example, we have a pretty complex supply chain to tackle. So even if AI advances big time, there will be certain limitations to the classical approach. And even though we are still in the early days of quantum computing, it appears that using quantum hardware to train AI models for some use cases will be significantly more energy efficient, and might even open up solution spaces and options that you cannot reach with traditional hardware.
So, okay, you introduced this topic of quantum computing to solve some problems, but we also have traditional computing. What are the benefits, and where would each of these be used? Yeah, indeed, I see that those two solutions could coexist. I see quantum computers excel wherever you have challenges with a limited amount of data, so a small dataset.
But that dataset has an enormous and vast number of ways to evolve. An example would be the folding of proteins, where a handful of different proteins can fold together in a huge number of possibilities to create and generate complex molecules. This is something that you can describe much better in a quantum environment than with classical computing. On the other side, wherever you have vast amounts of data, for example pictures on the internet that you want to process and compute, traditional AI hardware is still state of the art and will continue to be. So I believe for different challenges you will see different, highly specialized approaches to computing data. Okay, that makes sense. And quantum computing sounds like a real solution.
But give us an idea of what Infineon's role is in quantum computing. Infineon is innovating with a dedicated team on trapped-ion quantum computing. What we are doing is really the processing unit, so the core processor of a quantum computer.
And we have been investing in this since 2017, combining our expertise in high-volume semiconductor production with specialties like integrated photonics and integrated control circuits for quantum computing. We are working together with three industry-leading partners, and in the meantime we have more than 100 patents in the field of quantum computing. So at this moment we see another technology converging with AI. This could be really big.
So it looks like quantum computing is yet another solution out there, and a whole field by itself which we don't have time for today. Let me ask you to give a few final words and wrap things up. Yeah, thank you very much. Even if it's sometimes breathtaking to observe this exponential evolution of AI, it is important to stay optimistic and forward-looking, because AI is here to stay and we should embrace this future with confidence.
Because with the smart energy solutions of Infineon, we are not only embracing the challenges of AI, but also making sure that we save energy along the way. Perfect. Thank you so much for coming in today and sharing your knowledge with us. Thank you very much for having me. And thank you to our listeners for dropping in today. If you have any questions or ideas for future episodes, please don't hesitate to contact us at wepowerai@infineon.com.
Thanks a lot and see you soon.
2025-03-30 11:30