Armando Acosta, Dell Technologies & Alan Chalker, Ohio Supercomputer Center | SC23
Lisa Martin: Hey guys and girls, welcome back to theCUBE, the leader in live tech coverage, here in the Mile High City, Denver, Colorado, covering Supercomputing 2023. Lisa Martin here with John Furrier. John, you guys started last night with Savannah, and we've had a great day so far today, but this next segment has me on the edge of my seat. Tease it for us.

John Furrier: Well, we've got the supercomputing show, SC23. We're talking about supercomputing here, and the highlight is that the products are changing radically, overnight. AI plus HPC is creating a power dynamic that's going to spark massive innovation, one that doesn't foreclose the future and builds on what supercomputing has done. So this should be a great segment with some great stories.

Lisa Martin: Yes, and you hit the word "overnight." Armando Acosta is back, one of our alumni, the director of HPC solutions at Dell Technologies. Alan Chalker is here as well, the director of strategic programs for the Ohio Supercomputer Center. Guys, welcome.

Alan Chalker: Thank you for having us.

Lisa Martin: Alan, I have to go right to you. The Ohio Supercomputer Center is an '80s baby, founded in 1987. Talk about its mission and its vision as it relates to data.

Alan Chalker: Sure. First off, the origin story is really interesting. If you go back to the mid-'80s, the National Science Foundation created some national supercomputer centers. There was a group of faculty members in Ohio who said, look, we could be a national supercomputer center, but the NSF disagreed. So what did they do? They went back to the general assembly and the governor and said, the federal government won't fund this; will you? And they said yes. So we're a little bit unique in that we're not affiliated with one particular university. We are a state agency, a state entity, and we are there to provide benefits to the entire state, not just academia but also commercial industry, and to raise awareness of Ohio as a great place to work.

Lisa Martin: So you can't pick a team, then.

Alan Chalker: That's okay. I do like the Buckeyes, but I have to root for the Bearcats and everybody else too.

Lisa Martin: What are some of the benefits that you're delivering across that landscape?

Alan Chalker: It comes down to aggregation. As a larger state entity, we are able to buy in large aggregate, and it happens to be from Dell. We're happy to have four clusters from Dell right now, almost 60,000 cores, that can be used by anybody in the state, be it academics, be it private industry, whatever. We have $25 million in Dell hardware on our floor right now that they can leverage. Any given university, for the most part, can't do that.

Armando Acosta: It's huge. And that's the beautiful thing about it. What I love about what Alan is doing is that he's trying to enable more HPC users. At the HPC community event yesterday we talked about all these different use cases, but if we enable more users, we enable more use cases, and we're solving harder problems. Not only that, it raises all boats. That's what I love about what they're doing in Ohio.

John Furrier: Armando, I want to talk about that and tie it back to the Ohio Supercomputer Center, because the event yesterday was the Dell HPC Community event, but it wasn't just a Dell event; a lot of other vendors were there. It was basically a community ecosystem event. What does this tell us? When I walked in I was expecting to see something different, but what I saw was an industry in lockstep on the future. They clearly have an AI focus, but it's a celebration. It's not talk about how AI is this, that, and the other; it's go time on the product side.

Armando Acosta: We talked about this a bit before the cameras started, but some people think it's either HPC or AI, and really what we're saying is that it's both, because doing both enables new types of research. You see that in our customer stories, whether it's TACC or the Ohio Supercomputer Center. A prime example is weather modeling. In the past you would run a simulation, and that model would tell you, okay, here is where we think the hurricane is going to hit, for one example. But imagine you now take the result of that simulation and plug it into a neural network, and you run training on it. That insight, combined with the model, gives you two different perspectives, so you can actually get a better answer than from either one individually. That's the beauty of AI.

John Furrier: Alan, what's your story? You have some good stories. What's state of the art, what's going on in your world? Come on, show us some tech love; you're holding back.

Alan Chalker: Building on what Armando said: right now, as we speak, we have art and design undergrad students at Ohio University in Athens, Ohio, connecting on their iPads to Ohio Supercomputer Center Dell resources using something we call Open OnDemand, which we'll talk about in a second. They log in, they launch a Stable Diffusion app, they type in a prompt, and behind the scenes that fires off onto our clusters and gets a three-generation-old NVIDIA V100 GPU in Kubernetes. There's a string of words for you; I'm technical, we love it. The students are time-slicing that GPU, and a few seconds later they get back a generative AI image. They're art students; they probably can't even spell HPC, and they have no idea what's involved behind the scenes. Exactly as he was saying, they don't need to. All they need to know is that they can connect to some of the most powerful computing resources in the world using the latest and greatest technology. Two years ago, if I had said the sentence I just said, it wouldn't have made sense. Now we all know exactly what it means. This is a huge enabler.

Armando Acosta: When you look at Open OnDemand, that's what I love about what they're doing: they're abstracting the hard stuff away from HPC, which scares a lot of people.
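[Editor's note: Armando's weather example describes what is often called hybrid or ML-corrected modeling, where a learned model adjusts the output of a physics simulation. Below is a minimal, hypothetical sketch of the idea, with a least-squares fit standing in for a real neural network; all numbers are invented for illustration.]

```python
# Hypothetical sketch: a learned correction on top of simulation output.
# Past forecasts from a physics model vs. what was actually observed
# (made-up numbers for illustration only).
sim_forecasts = [10.0, 14.0, 18.0, 22.0, 26.0]   # e.g. wind speed, m/s
observations  = [11.2, 15.9, 20.1, 24.8, 29.0]

# Fit y = a*x + b by least squares (the simplest possible "model on top
# of the model"; a real system would train a neural network here).
n = len(sim_forecasts)
mx = sum(sim_forecasts) / n
my = sum(observations) / n
a = sum((x - mx) * (y - my) for x, y in zip(sim_forecasts, observations)) \
    / sum((x - mx) ** 2 for x in sim_forecasts)
b = my - a * mx

def corrected(sim_value):
    """Combine the simulation's raw output with the learned correction."""
    return a * sim_value + b

# A new simulation run predicts 20.0; the corrected estimate nudges it
# toward what the historical comparison says the model under-predicts.
print(round(corrected(20.0), 2))
```

A real workflow would feed many simulation fields into a trained network, but the structure is the same: simulation output in, corrected estimate out, which is the "two perspectives" Armando describes.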
Armando Acosta: Right, and so if you make it easier for the users to interact with the HPC cluster, guess what, you get more users. And like I said, it raises all boats. You abstract that, you make it easy; that's exactly what they're doing.

John Furrier: I want to let Lisa jump in here in a second, but let me follow up on what you said. The magic is putting that under the covers, and you mentioned Kubernetes clusters. That's key: the orchestration. Kubernetes is becoming a lingua franca, like what Linux is, and from a cloud perspective it pulls all this together invisibly to the user. That's going to create a new class of user. Expectations are different, applications will probably look different, and you're seeing that now. This changes the game. This is actually the purpose of HPC, yes, to provide this kind of horsepower?

Alan Chalker: Absolutely. Look at the number of science domains using our systems in Ohio. I already mentioned art and design students, but there are also horticulture and crop science students, anthropology students, political science students, fields you never would have thought would make use of this amazing technology we're surrounded by here. We eat, breathe, and think this inside baseball day in and day out, but they don't, and they don't need to. That's not what's important for them.

Lisa Martin: That's a great point: they don't need to know what's under the covers to create what they need, almost in real time. Open OnDemand, from what I understand, is about 10 years old. Give us a little of the history and how it has developed, because as you said, Alan, even two years ago what you described wouldn't have made sense. In terms of what John talked about, things rapidly changing overnight, you're living that.

Alan Chalker: Absolutely. This is actually the 10th anniversary of us introducing Open OnDemand to the world. Basically, we were in the right spot at the right time. If you go back to the late aughts, we, like many of our peer organizations, were starting to get requests from our clients. Think about it: the iPhone came out in the mid-aughts, and everybody got used to online banking and eBay and all those enterprise applications consumers were using every day. They were saying, wait a minute, I don't want to see green text scrolling across the screen like in the hacker movies. That's not for me.

John Furrier: They're saying, I don't write code.

Alan Chalker: Exactly. We were in the right place at the right time. We started to develop web interfaces to our systems and introduced them to the community, and the community said, oh my gosh, that's great, can we have a copy of this interface? And we said, wait a minute, this was just a thing we played with in house. So we went to the National Science Foundation, and we're now on the fourth in a series of multi-million-dollar awards that NSF has funded to take that, deploy it, and make it available open source to the community. I can announce right now that as of today we have nearly 700 research computing sites in 62 countries all over the world using Open OnDemand as their primary interface. And yes, you can use it on your cell phone now. In Ohio there's no law against drinking and computing. We don't necessarily condone it, but I've seen people in bars using Open OnDemand on their phones. I'm dead serious: I've seen pictures of grad students sitting in a pub, logging in to see how their jobs are doing.

Lisa Martin: That adoption has been amazing. How did you facilitate such wide adoption?

Alan Chalker: We were in the right place at the right time; lightning struck twice for us. I already mentioned we were there right when iPhones were taking off. The other thing, a bit of a silver lining, was the pandemic we just got through. During the pandemic, many universities had to go to a remote learning model, and as a result the students were not able to access the on-campus computer labs.
Alan Chalker: What were they able to do? Well, there were so many sites out there that had Open OnDemand, which provides remote desktop capabilities and remote software access, that I get stories left and right from different academic centers saying that if it weren't for Open OnDemand, they might not have been able to continue to teach throughout the pandemic.

John Furrier: You were there, in the right spot at the right time. One of the things that came up in our earlier segment is that with AI coming, there's going to be some low-hanging fruit, benefits from existing stuff, but it's the new things you don't see yet that are going to be compelling. As you get this on-demand, cloud-like experience with generative AI and compute, GPUs and CPUs and DPUs and QPUs, we're going to have our own processor for everything, the ability to do new things, to test, to be creative... the barriers to entry to experiment are going to be very low. So if you believe that to be true, the next question is: what do you see right now as the new enablement use cases? What are some of the things coming that could give us dots to connect to what we might see come out of these big, large-scale environments? Because obviously you have massive amounts of compute with generative intelligence and reasoning. Training is great, but inference is the holy grail; you have to do training first, and that's what a lot of people don't understand. "Inference is the new web app" was a big phrase we were kicking around at KubeCon. This points to what's next. We don't yet know, but what do you see as signs? Armando, you want to go first?

Armando Acosta: When you look at generative AI, some of the use cases we're looking at internally are about improving process: maybe we put some inputs in and see whether we can modify some types of code. What are the ways we could use generative AI to automate manual tasks so that we can validate, test, and build our products faster? Those are some of the examples of what we're looking at from a generative AI perspective. How about you, Alan?

Alan Chalker: I'll build on that with a very precise example from just yesterday. One of the colleagues we collaborate with at Idaho National Laboratory, one of the Department of Energy's nuclear labs and the experts when it comes to the US's nuclear energy, came up to me. They have Open OnDemand; they make it available. And they said, hey, Alan, by the way, we just deployed a ChatGPT-like local help app in Open OnDemand. It's not ChatGPT, but a local app, so that when clients have questions, instead of calling our help desk first, they can ask a conversational AI. And by the way, it's linked into our existing documentation and our existing training materials, and we can just dump more material in there, and it will steer them in the right direction before they ever call. So where is this going? It's allowing us to reduce that burden, reduce that friction, so when people have questions they don't necessarily need to call up the experts right away.

John Furrier: The role of government is going to be important here too, and not from a regulation standpoint; I'm anti-regulation, just for the record. Guardrails, okay, cool, virtue signaling with guardrails, whatever. But take this concept of successful publicly funded infrastructure that you're describing. We have national parks in this country; why can't we have national compute farms? Is there going to be a future where you need all this compute and citizens like me can just get compute, or will that be a private service? Is there a movement? Because you were talking about NSF funding this work, and that's how DARPA started 50 years ago with the internet: the paper was the first thing that went out, and we're celebrating 50 years of the internet this year.
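[Editor's note: the help app Alan describes follows a common pattern: retrieve the most relevant passage from existing documentation and hand it to a language model. Below is a minimal, hypothetical sketch of just the retrieval step, with word overlap standing in for real embeddings; the doc snippets and ids are invented, not INL's actual content.]

```python
def score(query, doc):
    """Crude relevance score: count words shared between query and doc."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d)

# Invented stand-ins for a site's existing documentation pages.
docs = {
    "login": "Use your university credentials to log in to the portal.",
    "jobs":  "Submit jobs from the job composer and monitor their status.",
    "gpus":  "Request a GPU by selecting a GPU node type in the form.",
}

def best_doc(query):
    """Return the id of the doc that best matches the question.

    A production help app would use embeddings for ranking and then feed
    the retrieved text to a language model; this shows only the idea of
    steering a question to existing documentation first.
    """
    return max(docs, key=lambda k: score(query, docs[k]))

print(best_doc("how do I request a GPU node"))  # prints "gpus"
```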
Armando Acosta: Well, I don't know whether the government is going to give you one big generative AI model; I might not see it. But I do believe that with generative AI, what you'll see take off is AI as a service. With AI as a service, the pitch is: maybe you don't have the time, maybe you don't have the experts, maybe you don't have the data scientists, but we will build that infrastructure for you, we will give you the tools, and essentially all you have to bring to the table is your code. I think those types of models will get interesting. In the long run, though, you're going to want to keep control of your data; you're not going to want to let some of it out into the public. So I think you'll have enclaves.

John Furrier: To me, tongue in cheek, we have national parks for people to use, but this raises the question of democratization. What we're seeing is no barriers to entry to do something creative, to move the needle for advancement.

Alan Chalker: I want to propose a different way to look at what you just said. What the future holds is that it doesn't matter whether it's a national cloud, a local resource, a commercial cloud, or your cell phone. All that matters is that the tool is there and you can click on it. This is one of the things I've talked to various people at NSF about: they fund ACCESS, they fund open science grants, they fund local resources. What if there were just a common interface? Open OnDemand, of course, is what I want. A common interface, so it doesn't matter where you're jumping to or what's actually happening behind the scenes. Let the magic happen. We have plenty of smart people who can figure out how to route things so that the end client just sees it on their phone, on their iPad, on their Tesla, in the metaverse and VR, whatever.

Armando Acosta: I think the smartest thing Alan did is that he understood how his users wanted to consume the technology. If it's an app and I just want to tap it and use it, well, guess what, we can make that easy and do it for HPC as well.

John Furrier: When things are boring, they're being used; Kubernetes is boring now, as they say. My final question: this ecosystem, this community, supercomputing, has been around since 1988, and a lot is changing fast here. You still have a lot of academics and a lot of long-view conversations around algorithms, architecture, and chip design, all the good stuff that's been going on for generations and decades, but now you have a very fast pace of play of commercial applications coming in. What do you see happening in the ecosystem? What changes, what stays the same? How do the partnerships happen when things are accessed and consumed this way, with people playing together with their data?

Alan Chalker: Sure. I think at least two of us have been around long enough, so let me give you one example of what I've seen, and then where we're going. I'm sure many people remember the ASCI Red program: not that long ago, the federal government spent hundreds of millions of dollars to get the first teraflop computer out there. Today we have, sitting on the floor of our data center, single Dell nodes at $60,000 a node and 55 teraflops. That's roughly $1,000 a teraflop, and clients are doing amazing things on them. In a couple of decades we've gone from hundreds of millions of dollars to about $1,000 per teraflop, and it's going to keep going. But what's happening now is the data. There's just so much data, and I'm sure you've heard this from other people, in terms of the ingest from all the remote instruments and edge computing.
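[Editor's note: a quick sanity check of Alan's arithmetic. A $60,000 node delivering 55 teraflops works out to about $1,091 per teraflop, which he rounds to $1,000.]

```python
# The two figures Alan quotes for a single current-generation node.
node_cost_usd = 60_000   # price per node
node_tflops = 55         # teraflops per node

cost_per_tflop = node_cost_usd / node_tflops
print(round(cost_per_tflop))  # prints 1091, i.e. roughly $1,000/teraflop
```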
Alan Chalker: We have to make sure all that data gets ingested appropriately and processed securely, that we have confidence in where it came from, that there's no disinformation injected into it, a single version of the truth, things like that.

Armando Acosta: From our perspective, we want to enable what Alan is doing, but the other big thing we're going to stick to is standards. We're going to push for standards on every new technology that comes out, whether it's CXL or accelerators; we want standards across the board. The other big thing we want is standard building blocks, so that customers don't have to waste their time trying to figure all this out on their own. That's the beauty of why we do validated designs, and why we work with Alan at OSC: we want to learn from them. And not only that, we can't build everything. Partnerships are still going to matter, and you're still going to have hardware, but what I believe is that the time now is software. With Open OnDemand he saw it, and when you look at the software ecosystem, I see a tighter integration of hardware and software, and with that tighter integration I think you get better performance. You know all about the different chip makers around every different AI use case, and I believe that if you can enable all those AI use cases with the right set of tools, then let the user go do something with it, and you'll be amazed at the results.

John Furrier: You guys are doing a lot of great work; I want to give you props on that. We appreciate you being in the industry. One thing you said yesterday at the community event, Lisa and I found really interesting, because we always ask what the impact of AI is on workflows. AI is iterative, and you were pointing this out in your keynote as well as other presentations: there's a new era of, what did you iterate? Were you writing it down? It's like making sauce: what did I put in there? You have to iterate to get the models. You now have programs for customers to come in, stand up HPC, iterate, lock in, know what to measure, and make it repeatable. This is going to be the new challenge: not the memory of the machine, but the memory of what you did for the AI. This is model management.

Armando Acosta: And that's the biggest thing to understand: once you build a model in your environment, you don't just set it loose and say, hey, I'm never going to touch it again. That's not how it works. You talked about data management. Say you build a model for finance and you're trying to predict the accuracy of a credit score so you can loan somebody money. Well, guess what: parameters change, different variables change, and so you have to retrain that model constantly to make sure your accuracy stays up to date. But not only that, you want what we call reproducibility. Who touched the model? When did they touch it? What data did they add? Did they add new layers? Did they add new weights? So you have all these things where you not only think about data governance but now also model governance. And not to put it in a bad way, but if you make the wrong decision, you might get sued, and guess what, you have to show how you arrived at that decision. So reproducibility is key.

Lisa Martin: Last question, in 60 seconds. Alan, the future of Open OnDemand: what does it look like to you?

Alan Chalker: The future of Open OnDemand is a community. For 10 years it has been all OSC. Ten years from now, I want it to be something like Jupyter, or maybe even Red Hat, with a community building it, using it, and doing amazing things, not limited by just what my colleagues and I can envision.
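[Editor's note: the reproducibility checklist Armando gives, who touched the model, when, with what data and what weights, amounts in practice to keeping an audit trail alongside the model. Below is a minimal, hypothetical sketch; the field names and values are invented, not a Dell or OSC tool.]

```python
import hashlib
import json
from datetime import datetime, timezone

# In-memory stand-in for a persistent model-governance log.
audit_log = []

def record_training_run(who, data_bytes, params):
    """Append an audit entry so a later decision can be traced back to
    the exact data and settings behind the model (model governance)."""
    audit_log.append({
        "who": who,
        "when": datetime.now(timezone.utc).isoformat(),
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "params": params,
    })

# Hypothetical retraining run on a new batch of credit data.
record_training_run("analyst-1", b"credit-history-batch-42",
                    {"learning_rate": 0.01, "epochs": 10})
print(json.dumps(audit_log[0]["params"]))
```

Hashing the training data and recording who ran the job and with which parameters is the minimum needed to answer the "show how you got to that decision" question Armando raises.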
Lisa Martin: Well, it sounds like you're well on your way there. Guys, thank you so much for joining me on the program and talking about what you're doing together and the massive adoption. We're definitely going to be keeping our eyes on this space.

Alan Chalker: Thank you.

Armando Acosta: Thank you for having us.

Lisa Martin: Our pleasure. For our guests and John Furrier, I'm Lisa Martin. You're watching theCUBE, live from SC23. We'll be back with our next guest after a short break, so we'll see you then.
2023-11-20 00:45