ISC 2023 - Sustainability Through Renewable Energy

[...] coverage of ISC High Performance 2023, where we're covering all things HPC, machine learning, AI, high-performance analytics and quantum computing. One of the most important topics in the HPC community is sustainability, and in this segment we're going to try to more deeply understand how organizations can achieve sustainability through green energy and renewable sources. With me to do that are Matt Foley from AMD, Serban Zirnovan from Dell Technologies and Guy D'Hauwers from atNorth. Gentlemen, welcome to theCUBE. Thanks so much for coming on.

Thanks for having us.

Big picture: if you think about all infrastructure power, how much of that is actually consumed by data centers, and then how much of that is HPC?

Well, first of all, Dave, let's put it into context. If you look at the overall consumption of data centers, you will see that it comes to 1 to 1.5 percent of the entire worldwide figure. You might say that doesn't look big, but in reality it is enormous. Based on several assessments done by Hyperion Research, you will see that, for instance, on-prem HPC represents approximately a fifth to a fourth of the overall server implementation, especially here in EMEA. Because the servers in an HPC system, usually the compute nodes, are very hungry in terms of power consumption, you might say that we are getting to probably 0.5 to 0.7 percent of consumption. This is not at all easy to maintain. It's really important to understand how to build it and how to drive it, because it represents an immense amount of power, which these days also comes with a very large cost. If you think about the past, we used to say half of the price was for the acquisition of the hardware, and the other half of the cost of running an HPC represents the
power. I think these days it has started to be slightly different.

Is the mix shifting? It sounds like it is. With HPC, and of course all this AI talk over the last couple of months, is HPC now consuming a larger and larger portion of that pie, or is it more sort of steady state, I wonder?

Well, we are in an IT world, so the usual answer to start with is "it depends." It depends on what you understand by HPC. With activity that supports discovery, creative processes, generative AI, everything right now, it really depends on how you understand that HPC is used outside of the data center. For instance, you have products like FPGAs, which are used in cameras or at the edge, more and more getting into production. So HPC usage is growing, and the same is happening with the associated power consumption.

Yeah, so I should have shared with the audience: Serban works with a lot of Dell customers, making sure that they're getting the most out of their infrastructure. Now Matt, you're with AMD, so you guys are down deep in semiconductor land. Talk about your role and your area of expertise.

Well, certainly. I manage the field application engineers for AMD across Europe, the Middle East and Africa for the commercial business. HPC and adaptive computing is really what we're about at AMD, and it's one of those things that we as a company have focused on because it provides a lot of good market insight. The advances that we make in order to compete successfully in the HPC market, we can then take and use across the piece, as Serban was mentioning before about how HPC is infiltrating all sorts of other areas outside of the data center. Beyond that, what we see as well is that, aside from just the usual efficiency moves where you take a better process or use better packaging
technology, what we really see here is a need for heterogeneous computing, where we're actually going to start taking the problems and decomposing them in ways that acceleration can better solve them. By doing that, we believe we can achieve step-function improvements in efficiency and sustainability instead of mere percentages, and we really look forward to working with all of our partners to package that up and provide it to the market.

Yeah, that's exciting. I mean, orders-of-magnitude improvement would be huge. Guy, I wonder if you could tell us a little bit more about atNorth and your world. Your background is multi-dimensional; you've worked for many, many firms and have a lot of expertise in this space.

Yeah, so atNorth is a data center, HPC and AI provider. We base everything on our very sustainable data centers, placed where we can find the highest degree of renewable energy, and we have built a whole stack. We built it with all of us here together, and we have very large enterprises from all over the world bringing their workloads. Either they bring them from on-premise, when their data centers are no longer up to date for running HPC or AI or accelerated computing, so they bring them over to ours, which are entirely built for this purpose; or, more and more, we have customers who migrate away from the public cloud, because they have experienced that it's actually very good for general-purpose computing and occasional usage, but for HPC, which tends to be used on a constant basis and optimized 24x7, it becomes way too expensive. So they come to us for cost reasons, for total-cost-of-ownership reasons, but first and foremost for the sustainability reason.

Oh, thank you for that. Okay, so maybe, Serban, you could answer this for me. My understanding is that in the near term we're going to get to about 125 kilowatts of power in a single HPC rack, and I was trying to
figure out, okay, is that a lot? It sounds like a lot. For a single day's usage, in kilowatt-hours, it's probably at least five to six times the average U.S. household, and probably more like 10x, because most of the time you're sleeping. So is this a concern?

Well, let's put it in a different format. I think that 125 kilowatts per rack is already possible today. Actually, 200 kilowatts is very likely to be possible tomorrow. But high-density power is always a concern. Besides the energy cost, you have safety, then you have cooling challenges, and these only add to it. Nevertheless, this is a result of today's semiconductor development, and like any vendor, we will need to cope with these challenges now and in the future. We do foresee changes in the environment, including the rack design, the cooling method and the power distribution. But on the energy issue and computational efficiency, I think we should go further and understand that what we are really discussing is the overall power footprint that a system like that is using.

Yeah, I think from our perspective here, we certainly see that the more dense you make the power, the more efficient the system is. That 125-kilowatt rack is a very efficient rack, but it's a difficult physical challenge to power and cool it: to get as much power there as you need and as much cooling there as you need. That's why we are constantly asked about liquid cooling, immersion cooling, and whether air still has a future. What I see is that all of the above are still very relevant and still definitely in play. I never count out the engineers behind all of those technologies, because of the significant breakthroughs when it comes to those cooling and power discussions. And also there's
other regulatory discussions as well that are had with the areas that host these data centers.

Yeah, so there's no silver bullet; as you say, it's all of the above, the combination. So what is the scale of the sustainability issue? Obviously the economics are important. People would love to lower the power bill, but that's not likely; you want to at least maintain it as a percentage of your overall spend, or maybe even compress it. But why should HPC be so worried about this? Matt, maybe you've got some thoughts on that.

So I think from our perspective, as great as these tools are, they're only as good as they are accessible. Currently we can do an exaflop in 20 megawatts; that's what the exaflop system at Oak Ridge National Laboratory does. If you continue that trajectory, it quickly becomes unsustainable: you're looking at half of a nuclear power plant in order to get to a zettascale system. So continuing on that present course and speed is simply unsustainable, which is why, again, for us, what we really need to do is figure out how to redo these problems. You can argue we've spent the last 20 or 30 years moving all applications to one architecture and then riding that semiconductor curve down, whereas in the future we're going to have to reconsider the actual computer science, the actual problem set, and take the problem apart in a way that can actually be accelerated meaningfully.

Okay, thank you. So Guy, my understanding then is that 125 kilowatts per rack is a good thing, but you've got to cool it, right? So what is the state of the art? How are people trying to reduce their power consumption? Using waste power, is that something that's common? How are organizations approaching this, and what's the impact?
Well, what we hear from our clients, and especially from our enterprise clients, is that they want to see the TCO, and that's still quite predominant, and then what the sustainability impact of all this is. Does it make sense for their business? That's why they look at the whole combination. Is it in the right location, so they actually don't need to cool so much? Secondly, the data center design, the cooling design, how the power is being distributed. It has already been said here, but the whole stack comes together, meaning every element of the technology comes together, and that's the whole equation that determines whether this is done at a good TCO, a total cost of ownership that makes sense for the business. And yes, technology-wise it's absolutely possible to run more than 125 kilowatts a rack, but what we then see in European data centers all over the place is that they leave the rest of the data center empty just to be able to convey power to and cool that one rack. That doesn't make any sense, right? So we want to do this with an optimized data center, fully occupied, with balanced power, balanced cooling and manageability of all that, because the technology equation also goes up: all the technology you need to add comes at the cost of your PUE. If you add redundancy, if you add complexity and so on, it all comes with a cost, and that needs to be paid in the end. That's why it needs to make sense; that's what we hear from our clients. We have clients who say, yes, we want to prove that the future is possible, we want to really put a stake in the ground and show it off, and that's a very good reason to go beyond. But if it is just to be economical and sustainable, we look for more of an average design that makes sense with a TCO, not just for the sake of being the most dense and the most advanced technology-wise. That can also make sense, for example, for supercomputing centers, when you want to show that off.

So I get that. I
mean, you've got to look at the whole picture, the whole house, if you will. But I still have to ask, and Matt, you may or may not know, but I should have asked you before: what does quantum do? Is it more power consumptive? Do we know yet? Is it less power consumptive?

[Laughter]

Well, we are comparing different technologies, and I don't think that's correct. What quantum will be at the end of the day is a different accelerator, or a different way to use acceleration, right? So it's definitely not the GPU. But it will definitely help the industry to develop further, let's put it like that. HPC drives innovation to progress humankind in industries like healthcare, manufacturing, lifestyle, whatever, and sustainability, including this one, will be a very important one to take forward.

Yeah, there's a lot more social good. Go ahead, Matt.

To some extent the answer is going to come from HPC, because a lot of the applications that we get asked to benchmark, to understand how well they work on HPC systems, are actually the quantum simulators. Those are the ones where we're actually modeling and trying to figure out and understand, and hopefully answer, the question you asked with regard to power efficiency, sustainability and really overall usefulness.

Interesting, quantum and AI. Guy, I think it's probably appropriate to ask you this question. Correct me if I'm wrong, but most data centers today are probably powered by coal, is that correct? So how do you design a data center differently if the power is green? Is there a different way to think about it?

Well, location is the first element to retain, where you get renewable energy to 100 percent or to a very high degree. In some areas it is very fluctuating, or it's just a very low percentage. That happens to be much easier for us, and that's why we are focusing so much, first of all, on HPC and AI, but
also on the location, being in the Nordics, where in all the countries we can get close to 100 percent, or 100 percent, renewable energy. So location is the first part. Then there is the design of the data center, entirely built for HPC and AI and for high-density workloads; that's the second one, from a power point of view and especially from a cooling point of view. And we use more and more heat recovery: we reuse the waste heat and sell it on to the municipalities. That also happens to be well developed in the Nordic countries, and not yet in many parts of the rest of the world.

Thank you. Serban, is the quest for a PUE of one still a holy-grail milestone if all the power is renewable?

Well, we heard Guy earlier saying that locality is one of the important things to consider. There were times when we built data centers because the land was cheap. Now we are looking to understand whether the workloads we are running are running in the correct data center, and here we have a lot of concerns related to security and things like that. PUE is power usage efficiency, right? It is an indicator of how a system is running. But in the short term we should probably look as well at the carbon footprint, not only from a price-performance-per-watt perspective but also at what it means in terms of the carbon we are releasing. So I think we definitely must drive for greater efficiency, and sustainability is part of how you design your solution in order to have a system able to perform to your needs. I don't think that PUE alone is the only indicator. I think the answer is that we'll need to look constantly at new indicators related to how a system is used.

Got it, makes sense. Go ahead.

I'm just going to chime in about the efficiency part. If we don't strive for continued efficiency, continue lowering
the PUE, then even if it's all renewable, we're wasting renewable energy that could go to other uses in society. So I think it's extremely important to make sure that we are as efficient as we can be, that we're good stewards of the resource that we have, because renewable energy isn't as yet enough to power the entire world. The more of it that we can use efficiently, and the more that can be shared across the other areas of the economy and of society, I think the better.

Gotcha. Guy, is it better to move data centers to locations where energy is green, or where maybe you can use outside air and the temperatures are cold, up in Iceland or wherever? What are your thoughts on that?

Well, the success we see now, and the demand we see, is that people are getting it. Of course this is not new. We have been doing it for 10 to 15 years, since we started to build out first in Iceland, and attracted workloads first from literally all of Europe, but now also from a lot of U.S. companies, and workloads being run out of Asia. I would say everything that's not latency dependent, and HPC usually is not, and machine learning is not, you can run wherever in the world, so you run it at the best place, where the PUE is actually low. The other aspect of PUE is that you don't always need full redundancy. You don't always need tier-three or very high-tier data centers, where everything is fully redundant. Part of the energy is being used for that redundancy, and if it's not needed, why over-design it? So that is the other aspect, but location is definitely the first one. There are so many data centers that are used just because people are used to having them nearby and want to see the lights flashing, but it's not really needed. There is no technical or economical
reason to keep them running in very inefficient places. And that's what we see now: we see really global companies moving en masse away from inefficient locations or from the general public cloud, because they see also that general public clouds are run, in Europe, mainly in the FLAP countries, and are by definition made for general compute and not optimized for HPC, so actually not fully efficient. So that is the other equation that we see, that people are really moving very, very large workloads. For example, BNP Paribas has done that for five years: they brought over their heaviest calculations, first to Iceland, then to Sweden, and literally all of it is now brought over, and they have saved more than 50 percent of their carbon footprint. So that's the proof of it: very large corporations are doing it.

To put it back in perspective, what we started the conversation from is that architecture is the design, and last but not least is the reliability of the system. It is exactly what Guy was mentioning earlier about why you need to double up and make it super redundant. If the system is reliable, you can use it better in this respect, and that saves a lot in economics and power.

Matt, what are the big sources of sustainable energy that are being used to power HPC today, and how will that change in the future?

I think it's really locational. If you're near a lot of water, you can use water. If you're in a sunny location, like the Middle East for example, solar comes into play. Geothermal is certainly the case in Iceland. I think it really depends on the environmental characteristics and where the renewable energy sources come from. Once you get the renewable energy, and also figure out a way to store it, then from that point on it's basically just
energy, and so it can go into the data center and power the workloads. But we need to have a broad portfolio of renewable energy sources, because the different areas of the world have strengths in some and weaknesses in others. I think the answer really isn't as much of a direct answer as you were looking for; the answer really is that it is situationally dependent on what's available.

Kind of another "it depends" question: what's the right model from a sustainability standpoint? Should I run my workloads in the cloud, or should I run them on-prem? Guy, do you have any thoughts on this?

Well, we just published a white paper, which is based on a lot of customer feedback, so please look it up on our website; it's fresh. In it, when you have occasional use, which in HPC is sometimes very important, when you have no idea how much you need to use and no idea for how long, then the cloud model is perfect to try it out. Once you have a lot of use, which a lot of HPC centers have, running 24x7 at 99 percent if not 100 percent utilization, that's absolutely not what the cloud is made for. So then you do it on premise, when you have it at the right location, with the right full-stack sustainability and efficiency. But the most efficient model is that you have a baseline capacity catering for your day-to-day needs, so that you have the best return on investment on all your engineers' time, but also on your software licenses. So you really have the most efficient system and the best TCO available for your baseline, and on top of that you scale up and you scale down. What we offer is to do that at the same location, so you don't have the data gravity problem, because we talk here about machine learning and big simulations, where the data pack is also huge. If you then need to send this
from on-prem to a public cloud, that cost of egress and ingress is a total waste, and that's also sustainability that is lost, because you are just sending data back and forth. So what we actually do is a combination of both worlds, the best of both worlds: we cater for a very solid, very efficient baseline, and there we grow the cluster and shrink it back. That is actually the best of both worlds, without the data gravity in the middle.

Yeah, you're seeing the whole cloud operating model move to on-prem. I mean, it's obvious that that's happening. This has been a great power panel, pun intended. But Serban, I'll give you the last word. Give us the summary and bring us home.

Look, first of all, it was a joy to have my colleagues here together with us. I sincerely believe that what we are putting into sustainability drives forward how every one of our customers is going to run their business. Back to the previous idea: you can build it on-prem, or you can consume it as a service, either as HPC on demand, as colocation, or as a service via APEX. Everything we are trying to put together in front of our customers is to make their journey much more predictable and much easier to understand from a results point of view. I think the future is really bright, and what we can put together as an industry in this segment is incredibly promising.

Excellent. Gentlemen, thanks so much. It was great to have you on the program to learn about this really important issue. All right, keep it right there for more coverage of ISC 2023 on theCUBE, your leader in enterprise tech coverage.

[Music]
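
Editor's sketch, not part of the conversation: the rack-power and PUE figures discussed above can be put into rough numbers with a few lines of Python. The 125 kW and 200 kW rack figures and the "exaflop in 20 megawatts" figure are the ones quoted by the speakers; the 600 kWh facility overhead and all function names are hypothetical, chosen only to illustrate the arithmetic. PUE is computed here using the common definition of total facility energy divided by IT equipment energy.

```python
# Back-of-the-envelope sketch. Figures quoted in the conversation:
# 125 kW and 200 kW per rack, one exaflop in about 20 MW.
# Every other number below is hypothetical and only illustrative.

HOURS_PER_DAY = 24.0


def daily_energy_kwh(power_kw: float, hours: float = HOURS_PER_DAY) -> float:
    """Energy in kWh used by a load drawing a constant power_kw for `hours`."""
    return power_kw * hours


def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """PUE as commonly defined: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def gflops_per_watt(flops: float, watts: float) -> float:
    """Compute efficiency expressed in GFLOPS per watt."""
    return flops / watts / 1e9


if __name__ == "__main__":
    rack_kwh = daily_energy_kwh(125.0)
    print(f"125 kW rack, one day: {rack_kwh:.0f} kWh")
    print(f"200 kW rack, one day: {daily_energy_kwh(200.0):.0f} kWh")

    # Hypothetical facility: 600 kWh of cooling and power-distribution
    # overhead on top of the rack's IT load for the same day.
    overhead_kwh = 600.0
    print(f"PUE: {pue(rack_kwh + overhead_kwh, rack_kwh):.2f}")

    # One exaflop (1e18 FLOPS) in 20 MW (2e7 W), as quoted by the speaker.
    print(f"Efficiency: {gflops_per_watt(1e18, 20e6):.0f} GFLOPS/W")
```

Under these assumptions a 125 kW rack running flat out uses 3000 kWh per day and a 200 kW rack 4800 kWh. The PUE value depends entirely on the assumed overhead, so it says nothing about any real facility.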

2023-05-29
