Making it Easier for Businesses to Deploy AI with Comprehensive Solutions with Seamus Jones | Intel
Hello everyone, and welcome to Intel on AI. I'm your co-host, Ryan Carson. -And I'm your other co-host, Tony Mongkolsmai. -So today we're going to be speaking to Seamus Jones about everything from AI factories to AI accelerators. Seamus is Director of Technical Marketing Engineering for Networking and Compute Systems at Dell. It's great to have you on the show, Seamus. Thanks for making time.
-Thanks, Ryan. Happy to be here. -It's good to have you on the show. There is a lot going on in the world of AI and we've got a couple of things we'd like to touch on. The first question is that we're hearing the term AI factory more and more. Can you define and explain that term for our listeners in case they haven't come across it yet? -Absolutely. So... AI is everywhere, right? When it becomes a term that my grandmother knows, okay, then it becomes something that's just ubiquitous.
Let's put it that way. The AI factory is really about making AI real, right? We're turning it into something that is tangible and deployable within a customer's infrastructure estate.
Dell is putting together these AI factories, which are effectively the building blocks customers need to build out a functional AI cluster, all the way from the hardware infrastructure, the GPU selection, the networking topology and fabrics, right through to the services and deployments. That way it can be a one-stop shop, making it as easy as possible for customers to deploy. Because the main feedback we got this past year at Dell Technologies World was really, look, we're interested, we want to dip our toe into this space. We want to understand how to make it work for our business and make it really impactful for our business.
But how do we go about doing that? Right? And we have the expertise, experience, and knowledge to put all those pieces together, deployed on a customer's site, on prem. That way they get all the benefits of an on-prem AI factory. That's the thought process, the ethos behind it. -So there's a lot of hardware that goes into that, obviously. When I think of Dell, I think of a hardware company.
But you talked a little bit about software. What does that look like? What kind of software are your customers expecting to be on this type of platform? -Yeah, and the software stack for AI is really the functional bit that makes it most interesting, right? I mean, the hardware is the foundation element, but the software stack, things like the enterprise hub we have on Hugging Face that can streamline that whole process for customers, plus the services expertise, is another piece that really makes it functional and applicable to their business. Each of these customer use cases needs different software, and if you look at the software options out there in the AI space, holy cow, there are thousands of options available. And how does a customer decipher that? Well, if I'm setting up in a retail space, I need computer vision and I want to set up those frameworks. How do I do that? Well, we have that experience and expertise.
So it's really about using the wealth of knowledge that we've built up with customers actually deploying these in the real world, and then applying that across a larger space. -Right. You mentioned use cases, and I would love for you to touch on some of the exciting use cases that you see being rolled out in the near future.
You mentioned computer vision and retail, and we'd love to hear you tell us some of the exciting things that you see going on. -Well, I mean, one that's not necessarily that new, but is definitely starting to hit with a lot of customers, is multimodal. And if you're not familiar with that space, multimodal is effectively taking multiple modalities, different ways of interacting with data sets, and making use of them within an LLM, for example.
Right? Let's make this real. One of the use cases we've done with our customers was a transportation logistics firm I was speaking with last week, where effectively they have shipping containers all across the globe. Right? What they're able to do is take real-time results from the shipping containers, things like humidity and other data points within each of the containers, aggregate those in real time, and deliver them back to a centralized hub that can then write up and deliver reports. So when, let's say, the shipping container clears Baltimore Harbor, they know exactly whether any of the containers have been damaged, or whether they need to address anything before it even enters the port.
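As a rough illustration of the kind of telemetry aggregation described here, the sketch below collects per-container sensor readings and flags containers to inspect before port entry. The field names and the humidity threshold are assumptions for illustration, not the logistics firm's actual system.

```python
# Illustrative sketch only: aggregate per-container sensor readings and flag
# anomalies before a ship reaches port. Field names and thresholds are assumed.
from collections import defaultdict

HUMIDITY_MAX = 70.0  # assumed spoilage threshold, percent relative humidity

def port_arrival_report(readings):
    """readings: iterable of dicts like
    {"container_id": "MSCU1234567", "humidity": 81.2, "shock_g": 0.4}"""
    by_container = defaultdict(list)
    for r in readings:
        by_container[r["container_id"]].append(r)

    report = []
    for cid, rs in by_container.items():
        worst_humidity = max(r["humidity"] for r in rs)
        report.append({
            "container": cid,
            "worst_humidity": worst_humidity,
            "inspect_before_entry": worst_humidity > HUMIDITY_MAX,
        })
    return report

if __name__ == "__main__":
    sample = [
        {"container_id": "MSCU1234567", "humidity": 81.2, "shock_g": 0.4},
        {"container_id": "MSCU1234567", "humidity": 64.0, "shock_g": 0.1},
        {"container_id": "TLLU7654321", "humidity": 55.3, "shock_g": 0.2},
    ]
    for row in port_arrival_report(sample):
        print(row)
```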
So all of that eliminates the manual process and steps, which makes it a real-world kind of implementation. Anything that can eliminate those manual steps and the human intervention means it's going to become a lot more efficient, there's less chance for error, and it's immediate, right? So I get a report any time that container ship actually reaches port, or hits any of the tracking milestones. -So some really interesting use cases. -Yeah, absolutely.
One quick follow-up on that. Obviously, traditional machine learning has been used for a long time now and solves a lot of problems. But now we're focusing a lot on Gen AI, hence the factory nature of this; we're actually producing tokens at scale for a lot of things. What do you see as some of the newer use cases in an enterprise sense? Not just a chatbot, but where do you see Gen AI starting to change the way businesses work and needing, you know, AI factories now? -You know, we run a fully functional lab, okay, 98 racks full of hardware where we set up configurations, everything.
The one piece that we've been able to deploy within our lab that was super impactful is log analysis: being able to not manually run through logs and decipher what they mean and what the resolution path is going to be. To have immediate log analysis and a suggested resolution path has been an absolute time saver for any of the IT administrators in my lab. And I know that's the same with customers doing the same thing. It's just a small thing you wouldn't necessarily think about, but it makes such an impact on a person's day-to-day workstream that my IT administrator or sysadmin can now go off and actually deploy things that are going to be helpful instead of chasing down issues. -Right. -Right?
That makes a big difference. -It's not sexy, but it's actually useful. -Exactly. That's exactly it. And we're getting to that stage.
First it was all about digital humans, it was all about these really interesting use cases. But now it's actually getting into usefulness, right? It's: where are we actually going to be able to reduce the time it takes to do certain tasks, or menial tasks that just nobody wants to do? No one's signing up for that job. -Yeah. -You know?
-I'm not. -I'm not either. So you talked a little bit about the container example, a computer vision example, and you mentioned multimodal, and you said that you use it for log analysis, obviously using some type of LLM for that with text input.
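As a loose sketch of that log-analysis pattern, the example below sends a log excerpt to an OpenAI-compatible chat-completions endpoint and asks for a likely cause and a suggested resolution path. The endpoint URL, model name, and log lines are hypothetical placeholders, not a specific Dell or Intel stack.

```python
# Illustrative sketch only: ask an OpenAI-compatible chat endpoint to summarize
# a failing log excerpt and suggest a resolution path. URL and model are
# hypothetical placeholders; any locally hosted compatible server would do.
import requests

ENDPOINT = "http://localhost:8000/v1/chat/completions"  # hypothetical local server
MODEL = "llama-3-8b-instruct"                            # hypothetical model name

def suggest_resolution(log_excerpt: str) -> str:
    prompt = (
        "You are assisting a sysadmin. Given the log excerpt below, summarize "
        "the likely root cause and suggest a resolution path.\n\n" + log_excerpt
    )
    resp = requests.post(
        ENDPOINT,
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.2,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    sample_log = (
        "kernel: nvme0: I/O timeout, resetting controller\n"
        "systemd: app.service: Failed with result 'timeout'.\n"
    )
    print(suggest_resolution(sample_log))
```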
When you have customers who are looking at buying something like a Dell AI factory, does the use case matter so much? I guess the better question is, do you have it to the point where you say, if you want to do LLMs, you want this type of configuration; if you want to do computer vision, you want this type of configuration? How do you guide your customers through that process of getting the right hardware and right software stack on their systems? -So we start off a few different ways, but we actually have these Dell Validated Designs, which effectively means that we've set it up, configured it, and run it in a lab, and we can ensure sizing and characteristics: concurrent users, token size, tokens per second returned.
You know, we can ensure that it fits within a specific framework. I mean, if you're setting up a 7-billion-parameter model versus a 70-billion-parameter model, those are going to be drastically different configurations, right? And you don't want to overspend on one, or underspend and have the solution not be performant. So to be able to give guidance on that, my lab, as well as some other CTO labs within Dell, have been putting together sizing parameters. Obviously every customer is unique and different, but we can give parameter guidance: this many concurrent users, this size of parameter model, and we can give some sort of sizing. And it really is the determining factor of, hey, what is your GPU spend going to be, because that's really where a lot of the cost of these deployments is coming from today, at least: the accelerators that you're implementing.
And the platforms, the software stack, and the services surrounding it. So that's been the unique thing, because in some instances you might not need GPUs; you might actually be able to do inferencing on CPU only. And if that's the case, you know, just use a scale-out model with more nodes.
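As a back-of-the-envelope illustration of that 7B-versus-70B sizing point, the sketch below estimates accelerator memory from parameter count and concurrent users. The bytes-per-parameter, KV-cache, and overhead figures are rough assumptions for illustration, not Dell validated-design guidance.

```python
# Back-of-the-envelope inference sizing: rough memory needed to host a model's
# weights, plus a crude KV-cache allowance per concurrent user. All constants
# are illustrative assumptions, not validated sizing guidance.

def estimate_gpu_memory_gb(params_billion: float,
                           bytes_per_param: float = 2.0,   # fp16/bf16 weights
                           concurrent_users: int = 16,
                           kv_cache_gb_per_user: float = 1.0,
                           overhead_factor: float = 1.2) -> float:
    weights_gb = params_billion * bytes_per_param          # 1e9 params * bytes -> GB
    kv_cache_gb = concurrent_users * kv_cache_gb_per_user  # crude per-user allowance
    return (weights_gb + kv_cache_gb) * overhead_factor

for size in (7, 70):
    need = estimate_gpu_memory_gb(size)
    print(f"{size}B model, 16 users: ~{need:.0f} GB of accelerator memory")
# Under these assumptions, a 7B model can fit on a single mid-range GPU (or even
# run on CPU for light loads), while a 70B model needs multiple high-memory GPUs.
```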
Whereas if you have a more intense model that you're training, then you're going to need higher-end GPUs, and more of them. -So to follow on from Tony's question, it feels like we're getting closer to the point where an enterprise can instantiate a certain hardware instance for an AI workload, but there's still a lot of MLOps and a lot of glue. I can tell we're getting close, and Dell is, I think, leading the way on this idea of deployable, solution-based hardware. But do you think in the next couple of years we'll see a typical enterprise be able to instantiate and deploy AI workloads pretty quickly? Or are we still talking about enterprises that have pretty specific AI workloads? -Yeah, I mean, today I think it is heavily focused on those cloud service providers that are really investing in the frameworks to be able to deploy LLMs at scale.
Okay? Or the likes of X; I mean, you might have seen some of our news around the large clusters that we've been deploying there. What we're actually seeing is that customers are doing two things, right? They're bringing that down the stack, so they're deciphering where and when it makes the most sense for them to dip their toe in the water and ensure that, hey, for this use case, model, and parameter size, this is the best approach for investment. And then the second piece is that they're allocating some of their previous IT spend and budgets to a viable solution that makes sense for their specific business. For me, I think it really comes down to this idea of the democratization of AI, and that has to do with two key factors in my mind, right? One is GPU choice. So, GPU options coming to market that are inclusive of that software stack.
And then the other factor is Ethernet fabrics. As Ethernet fabrics get faster and more capable, I mean, we're at 400 gig and we're going to have 800 gig very soon, and as those Ethernet fabrics become available, it democratizes that deployment, which makes it even easier to deploy in a customer's environment. -Yeah, we're pretty bullish on open standards and the need for those as well, so folks don't get locked in, so it's good to hear that. -Absolutely, yeah. I mean, we have our own switching infrastructure to be able to accommodate that.
But it's all open-standards based, right? -Awesome. -So it's a really useful deployment, let's put it that way. -Does the switch cost start becoming more and more important as we move towards these faster networks? I know you mentioned that accelerators are the bulk of the cost right now when we look at AI, but obviously as you talk about scaling up and out, the networking becomes very important. And I don't know what the cost of the networking is; I typically look at the cost of GPUs, but I'm assuming that people who are listening are like, well, okay, I care about total cost. We're talking about networking now. What does that look like?
And is that growing in relation to accelerators? -So the networking cost is not as significant as accelerators, nowhere near. But when you look at it historically, while InfiniBand does still have a place for low-latency, high-performance traffic, the networking costs around Ethernet, I mean, as you go to copper, you're able to really significantly reduce costs. You're able to go different distances within a rack and make it much easier to handle on a daily basis.
The costs are, you know, significantly less expensive. Where it tends to get more expensive: if you think about the fabrics within these AI clusters, the large GPU clusters, there are really four fabrics, right? And the most significant one is the GPU fabric. And when you look at the topologies for those switches, it's now a fat tree versus a rail topology.
And depending on how large you're going to scale, I mean, if you're going to go up to 64 nodes, then a fat tree might work, right? But if you're going to go beyond that... and when I say 64 nodes, that means, what is that, 512 GPUs? So it's a pretty significant deployment, right? But as you scale, we want to make sure that the performance equally scales and you're not doing multiple hops to try to accommodate that.
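To make that scale point concrete, here is a small sketch of the arithmetic. The eight-GPUs-per-node figure and the 64-port switch radix are assumptions for illustration, not a specific Dell reference design.

```python
# Rough GPU-fabric scale arithmetic. The per-node GPU count and switch radix
# are illustrative assumptions, not a specific Dell reference design.

GPUS_PER_NODE = 8      # typical 8-way accelerator server
SWITCH_PORTS = 64      # assumed radix of one GPU-fabric switch

def fabric_scale(nodes: int):
    """Return total GPUs and whether one switch tier can reach every node."""
    gpus = nodes * GPUS_PER_NODE
    single_tier = nodes <= SWITCH_PORTS   # beyond this, traffic needs extra hops
    return gpus, single_tier

for nodes in (16, 64, 128):
    gpus, single_tier = fabric_scale(nodes)
    note = "fits one switch tier" if single_tier else "needs another tier (more hops)"
    print(f"{nodes} nodes -> {gpus} GPUs: {note}")
```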
-Absolutely. On the note of accelerators, obviously we're really excited to partner with you on bringing a brand-new AI workhorse to the race in Gaudi 3, and Dell has been a wonderful partner on that. It's about giving enterprises and developers more choice, right? -Yeah. -Driving the total cost of ownership down for everybody across the board. So as we see really fast accelerators come to market, become competitive, and drive the total cost of ownership down, where do you see enterprises using accelerators, and what workloads should folks listening to the show think about running either on prem or in the cloud on an accelerator, versus on the edge? We're going to see a lot of edge computing, clearly. But let's talk about the accelerator workload and where you're excited to see enterprises really get the biggest squeeze out of the juice. -Yeah, I think customers are trying to decipher whether the juice is worth the squeeze, right? I think that's part of it.
That's a decision they're trying to make every day. And, you know, does the cost incurred make it worth it for my business? Right? That's the true impact statement in my mind. And for me, it's going to be anything that's going to help either give them a competitive advantage, modernize their infrastructure, and/or reduce time for their staff, so that way they can get more from their staff within their chosen industry. Any of those factors is what's going to really drive the use of accelerators. I mean, we've seen in the past that the big use for accelerators was VDI, right? Within a vSAN cluster, you would choose to deploy accelerators for VDI deployments.
Now it's all about LLMs, well, it's all about Gen AI, and it's about ensuring that you're choosing the right one. I mean, if you look at it, by the end of next year Dell will have somewhere in the range of 12 unique accelerators in our portfolio.
-Wow! -Okay? And that's both PCIe and OAM based, right? So it's one of those things where we are giving customers options in the market because they're demanding it. And if you remember a few months back, certain accelerator lead times were difficult at best. -Yeah. -Right? You know, we want to give them some choice.
Yeah. And we've got some great partners like Intel that are, you know, bringing some GPUs to market that are really going to help offer that choice for customers. And not just the silicon on the accelerator, on the GPU, but the software stack that they're bringing. Right?
I think that piece of it is going to be really interesting. -Yeah. And we are really excited about Intel Gaudi 3. Obviously, I mentioned earlier, before we started the podcast, that I worked on Gaudi in the data center.
I imagine that Dell is excited to have a chance to play with those as well. I know you have some of those in-house, I believe, as you prep, because we've already announced that we're going to be working together on that. How do you think enterprises are going to utilize AI accelerators like Gaudi 3 in the near future? -Yeah, I mean, whenever you're looking at any of these use cases that involve large models, right, they need a lot of memory within each of the GPUs. That's actually going to be a significant implementation use case. That, combined with the 800-gig connectivity, means that we actually partnered with Broadcom to bring out our Z9864 switch, which is a dedicated 800-gig switch, to be able to interconnect all of those Gaudi 3 XE9680s that we're bringing to market. So it's super exciting.
It's one of those things where I think we're going to be able to address larger datasets and larger parameter counts than we ever have in the past. And not only do that, but have some really amazing performance. Now, I haven't seen the numbers yet, no one has, because we literally unboxed the system yesterday, right? We're figuring out cabling and all this other stuff, but we're really excited to get our hands on it and see what the performance of those platforms looks like as the software stack continues to mature even more.
-Wow, it's like Christmas. I wish I could have been there to unwrap those. -Yeah. You should see us, because these things are heavy; it takes four of us to lift them, right? And you should see us, like little kids, all of us, when it arrives. -I'm so jealous, you know, on that front.
So, you know, we are hearing people talk about ten-megawatt clusters, and 100-megawatt clusters someday, and, you know, trillion-parameter models. It's clear that compute is getting bigger and bigger and bigger, and it appears that we're going to unlock superintelligence at some point by just adding compute.
So it's fun to be in the middle of this with you. Now, my question is, it feels like there's this ocean of data that enterprises have that is clearly not being used, even for ML right now, but it's going to be unlocked, and we'll need compute to work with it. Do you think, as we start to roll out these massively powerful clusters, hopefully together, that it'll be on the training side or on the inference side? Where's your gut instinct, and what's the takeaway for folks listening to the show? -Yeah, from most customers that we hear from, their first adoption is going to be on the inferencing side.
Your average customer is, you know, really going to be approaching that inferencing subset. And because they can probably do that with either existing hardware today or with small changes or implementations, they can set up some inferencing and base it off of a pre-trained model. Right? The thing is, though, customers' data is becoming more important to have on premises, and obviously you want to use your own data to garner your own results. Yeah. That way you're not training it off of someone else's data and making assumptions, not knowing whether a hallucination comes from something that was wrong with the root data, or something that is actually wrong with the model, the training of the model.
Yeah. So the training portion is going to come. I think people are starting to dip their toe into it, especially within their own environments. But we're seeing these large customers, the likes of Samsung and others, doing really interesting things like immediate translation on your cell phone: as you speak into it, it translates on the fly without any need for connectivity. I mean, those types of things are absolutely awesome. The training is being done back in the data center on these large clusters.
And those are the kinds of use cases that we're seeing for customers. Now, is a forty-person accounting firm going to be doing training? Probably not, right? That's just not realistic. But what they might do is, you know, use a pre-trained model that becomes available. I'll give you a perfect example. My wife's a child psychologist.
She's a school psychologist. -Nice! -Some say that I'm a really smart ten-year-old, okay? You know, so I take all the tests and things like that.
-I love it. -But the bottom line is that, you know, within her field they are now using LLMs to help write documentation. She'll run testing and things like this, and they have a pre-trained model that they're able to run it through. -Smart. -Now, whether or not that has a decent root data set is one thing.
So if we were able to take her historical reports and train on reports that she's written, in her style... -Yes. -...and then be able to deliver new reports in her style and things like that. -Wouldn't that be useful, right? -It would be amazing.
And I want to thank your wife, through you, for her work, because as a parent of kids, you know, it's very important work. And yes, if that could be augmented with AI, that's such a beautiful use case of the technology, and I think it will literally change lives. -Definitely. -Please thank her for us and pass on the good news.
Seamus, it's been lovely to have you on the show. We're really excited about our partnership with you, and we think bringing compute to more people will change the world, and we're excited to be doing that with you. What's the best place people can go to learn more about what Dell is doing in the AI space, or any particular messages you'd like to share as we close? -Absolutely. So the first place to start would obviously be dell.com/ai. And then the second place I would go for more technical content, use cases, and GitHub code would be infohub.delltechnologies.com. It's a little bit more technical; that's the triple-click, the meat around the bone, if you will, on how these AI factories and use cases are put together, and the best resources to deploy on your own.
Perfect. Well, thanks again for your time. We wish you all the best, and hopefully we'll see you online. -Take care. -Awesome. Thanks, guys. Good discussion.