Agent Q, no AI in art, and AMD acquires ZT Systems


AI agents: what are we expecting next? How do we put planning and reasoning alongside this large representation of the world we have now? Are we going to have products that truly never incorporate generative AI? I think never is such a strong word. And what's the most exciting thing happening in hardware today? It's nice to see that finally we're building big computers again.

I'm Brian Casey, and welcome to this week's episode of Mixture of Experts. We let Tim go on vacation this week, so you're stuck with me, and I'm joined by a distinguished panel of experts across product, research, and engineering: Volkmar Uhlig, VP of AI Infrastructure; Chris Hay, CTO of Customer Transformation; and Skyler Speakman, Senior Research Scientist.

There's been a lot of discussion in the market around reasoning and agents over the last six months or so, so my question to the panel is this: over the next year, do we think we'll get more progress in building reasoning capabilities through scaling compute, through algorithmic progress, or through good old-fashioned software engineering? Volkmar, over to you. "Very clearly algorithmic progress." Chris? "Software engineering." Skyler? "Algorithmic, that's the next step." All right, I like it; we've got some different opinions, and that leads us into our first segment.

A company called MultiOn released a new paper on Agent Q that demonstrates improvements in reasoning and planning. The scenario defined in the paper, using an agent to actually book restaurant reservations, combines LLMs with other techniques like search, self-critique, and reinforcement learning, and they report roughly an order-of-magnitude improvement in the success rate of LLM agents. So Skyler, as a way of kicking us off, why do LLMs struggle so much with reasoning today, and why is the work exploring other approaches so important to making progress?

LLMs have this amazing ability to build a world model; I've seen that phrase popping up more and more. Sometimes it gets criticized: all they're doing is predicting the next word. But in order to predict the next word as well as they do, they really do have, well, understanding might be too strong a word, but a model of the world. Up until these recent advancements they had no real reason, motivation, agency, whatever you want to call it, to go out and explore that world, but they had created that model and they could answer questions about it. So LLMs did a very good job of building a model of the world, and these next steps are about something else: now that we've got a representation of the world that is pretty good at the next-token prediction problem, how do we actually execute actions or make decisions based on that representation? That's the next step we're seeing, not just from Agent Q but from lots of research labs: how do we put planning and reasoning alongside this large representation of the world we have now?
I think these folks are off to a good start, one of the first to put something out there, and the paper is out and available for people to read. Lots of other companies are working on this as well, so I wouldn't necessarily say they're ahead of the pack.

Maybe Chris, I know we were talking a little bit about this: how indicative do you think the work this team did is of where everybody is going in this space? Is this paper just another data point in a continuation of everybody exploring the same problems, and do we think it's dialed in on where the agent problem space is going to be over the next year or so?

I think it is pretty dialed in. When I read the paper, it's similar to some of the stuff we're doing with agents ourselves, so that's always goodness. But if you really look at what's going on, they're not using the LLM for the hard bits. They're using Monte Carlo tree search to actually work things out. One of the major things they do is use a web browser as a tool: if they're trying to book a restaurant, they're doing a Monte Carlo tree search and navigating with that tool through different states. They use the LLM to self-reflect and to create the plan in the first place for how they're going to book that restaurant, but they rely on outside tools and outside pieces, like the tree search, to work out where they're going, and that's because LLMs are not great at that. It's more of a hybrid architecture in that sense, and everybody is doing the same thing with agents: you bring in tools, you bring in outside memory, you bring in things like graph searches, which is why graph RAG is becoming really popular in these spaces. Everybody is bringing in planning and reasoning as well. I think they're doing some really interesting stuff with the self-reflection and the fine-tuning, so it becomes a kind of virtuous circle within the paper, and I think they're probably further ahead than a lot of people there. But even if you look at the open source agent frameworks, we started with things like LangChain, now LangGraph is becoming really popular, and then you move into multi-agent collaboration frameworks such as CrewAI. Everybody is on a slightly different slant in this journey, but they're definitely on the right track at this point in time. And by the way, back to my earlier answer: that is software engineering, my friend. It's not doing anything different with the LLM; it's engineering, putting stacks and frameworks around your tool set.
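As a rough illustration of the hybrid architecture Chris is describing, here is a minimal sketch of an MCTS-style planner in which an LLM proposes candidate actions and scores partial trajectories while the tree search does the planning. The `llm_propose_actions` and `llm_self_critique` functions are hypothetical stand-ins, not MultiOn's actual implementation, and the DPO-style fine-tuning on the trajectories the search discovers is omitted.

```python
import math
import random

# Hypothetical stand-ins for the LLM calls described in the discussion:
# the LLM proposes candidate web actions and critiques partial states,
# while Monte Carlo tree search supplies the lookahead.
def llm_propose_actions(state):
    """Stub: return candidate next actions (e.g. clicks, form fills)."""
    return [f"{state}->a{i}" for i in range(3)]

def llm_self_critique(state):
    """Stub: return a score in [0, 1] estimating how promising a state is."""
    return random.random()

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Upper-confidence bound: balance exploiting good branches and exploring new ones.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(root_state, iterations=100):
    root = Node(root_state)
    for _ in range(iterations):
        # 1. Selection: walk down the tree by UCB until an unexpanded node.
        node = root
        while node.children:
            node = max(node.children, key=ucb)
        # 2. Expansion: the LLM proposes candidate actions from this state.
        for action in llm_propose_actions(node.state):
            node.children.append(Node(action, parent=node))
        leaf = random.choice(node.children)
        # 3. Evaluation: the LLM self-critique stands in for a random rollout.
        reward = llm_self_critique(leaf.state)
        # 4. Backpropagation: push the score back up the tree.
        while leaf:
            leaf.visits += 1
            leaf.value += reward
            leaf = leaf.parent
    return max(root.children, key=lambda n: n.visits).state

if __name__ == "__main__":
    print("best first action:", mcts("book_restaurant"))
```

The point of the sketch is the division of labor: the language model supplies priors and evaluations, while the search supplies the lookahead that the model on its own lacks.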
To that point, Brian, I do want to hear Volkmar's take on why algorithmic was his pick. You have to hold us to our answers, and he's going to go next.

My background is that I built self-driving cars for seven years, and there was always this decision between how much software engineering we can do and how much we can train into a model. In many cases it's what Chris just said: it's often a packaging of different technologies together. Where we are right now is that we have this really powerful tool, the LLM, so we have some basic form of world understanding, we have the world model, and now we're trying to make something do stuff we haven't seen. It's not "just predict the next thing you do on OpenTable"; you're in an unknown, open world where you need to explore different choices. I think the next step will be that you run this brute-force search, and then once you have those choices, you actually train a model. That's my expectation, because that's the path I've been on with driving: we always came up with some heuristic and a huge data corpus, tried something out, and in the end it was always, "now that we've figured out what the underlying problem is, let's train a model to make this more efficient in execution." In the end the model is just an approximation of an extensive search. That's why I said algorithmic: I believe the algorithms we will build are effectively those graph searches, tree searches, and so on, which ultimately feed into a simpler representation that is easier to compute in real time.

I was kind of disappointed by the paper, if I'm honest, and I'll tell you why; Brian is dreading what I'm about to say. The whole example was the OpenTable example. Now, unless I'm wrong, and I don't think I am, isn't MultiOn the company that claimed they were the agents behind the strawberry man, the "I rule the world Mo" Twitter account? That would have been the agent example I wanted to see in the paper.

That was actually a question I was thinking a lot about, because they talked about reinforcement learning as part of this, and one of the interesting things I've seen in the market over the last few months is a light backlash against LLMs within the ML community, particularly from people who have worked a lot in reinforcement learning. You've even heard people talk about LLMs being a detour on the path to AGI, and as progress has slowed down a little, I've seen the folks who operate in those reinforcement learning spaces start to pop their heads up and say, "hey, the only way we're going to make progress around here is with some of these other techniques." So maybe I'll start with this question: if we fast-forward to a world where agents are a much more significant part of the software we all use every day, do we think LLMs are the most important part of that? Or, Chris, to your point about this paper making extensive use of lots of other techniques, do we think a bunch of other techniques are going to rise back to prominence as we actually try to make these things do stuff? I'll stop there and see if anybody has a take.

Yeah, I definitely think RL is going to come back into this. They were using RL in that paper, and they were also using things like DPO, but I think it's going to come back in a bigger way. I keep thinking back to AlphaGo and the DeepMind team winning at Go; they were using similar techniques to those you see in this paper.
But if you take a deep learning algorithm today on your own machine and get it to play a simple game like Snake, or play the Atari games like DeepMind did, very simple architectures, CNN and DNN type things, absolutely rock those games. If you get an LLM to play, and it doesn't matter whether it's wrapped in an agent or not, it is the worst playing of Snake I've ever seen from frontier models. GPT-4o is terrible at it, Claude is terrible at it; they're all terrible at these games, while really simple RL, CNN-style deep learning architectures rock them. So as we try to solve and generalize, I think some of the techniques that were really successful in the past have to come back in the future, and I'm pretty sure that's where a lot of people are going at the moment. We're going to see software engineering, we're going to see improvements in architecture, we're going to see improvements in algorithms; it's going to stack and stack, and hopefully all of these techniques come together into a hybrid architecture. But when you take LLMs and put them into an old gaming-style environment, they absolutely fail today.
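For context on the contrast Chris is drawing, below is a minimal sketch of the kind of small convolutional Q-network that excels at Atari-style games. It follows the classic DQN layout; the 84x84 input, four-frame stack, and four actions are assumptions borrowed from the standard Atari setup rather than anything discussed in the episode.

```python
import torch
import torch.nn as nn

# A small CNN-based Q-network in the classic DQN style: it maps a stack of
# game frames to one Q-value per action, and is trained from pixels and
# reward rather than from text.
class DQN(nn.Module):
    def __init__(self, n_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),   # one Q-value per action
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.net(frames / 255.0)  # scale raw pixel values

# Greedy action selection from a stack of 4 grayscale 84x84 frames.
q_net = DQN(n_actions=4)
obs = torch.randint(0, 256, (1, 4, 84, 84), dtype=torch.float32)
action = q_net(obs).argmax(dim=1).item()
print("chosen action:", action)
```

A network like this has only a few million parameters and learns the game's decision loop directly, which is exactly the kind of thing a next-token predictor is not trained for.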
Do we think there will be general-purpose agentic systems in the short term, say the next couple of years, or is everything going to be task-specific? One of the convenient things about the OpenTable example, Chris, is that "go book a reservation" is a very easily definable objective, and that means you can pull in a bunch of these other techniques in ways that are harder to make fully generalizable. So when we look at agents, do we think we'll make a lot of progress on generalizable agents over the next year or two, or is everything going to stay in task-specific land? Skyler, it looks like you've got some thoughts on that.

No, I don't think we'll have general agents within two years. There will be some areas, and this might even lead into our next topic, areas around language and creativity, where these systems will surpass some humans' abilities. But the world runs on much more boring, mundane business processes, and there's still a lot of ground to cover to get those systems to a level of trust that people will actually use. It's one thing to have these methods create a funny picture or write a funny story; having LLMs execute financial transactions on your behalf is a different ball game, and we're not going to be there within two years. I may be proven wrong; you can timestamp this.

That's okay, we're always accountable for our predictions on this show.

So Brian, I think where we may go is this: right now we're going through examples, OpenTable and another twenty like it, and I think we'll get into a tooling phase where, with some human intervention and guidance, you have tools that can explore a domain, say a web page, and figure out how to interact with it, and then you go through some pruning process which may be manual. I think we'll get to more automation so that it's ten or a hundred times faster to build these things, but as Chris said, there will be a software engineering component to it. Until we're fully autonomous, where you just point at something and say "learn," that will take a while. And then the question is where the information comes from: is it through trial and error, or could we even just read the source code of the web page? We have source code encoding business processes; I can just hand you my billion lines of code of an SAP adoption.

For the second story: the CEO of the company Procreate, which builds and designs illustration tools, came out on Sunday night and released a video in which he said, I think he actually used the word "hates," that he hates generative AI, and that they were never going to include gen AI capabilities inside their product. The reaction from their community and the design community broadly was super excited and supportive of that statement; as of the time of recording, the video has almost 10 million views on Twitter. I have a bunch of different reactions that hopefully we can pick apart here, but one of the things that was most striking to me is the way two different creator communities have reacted to the arrival of LLMs. I have friends and colleagues who are software engineers, and with LLMs for code, people are generally pretty enthusiastic: they look at it as a great productivity tool and get more work done than they were ever able to before. I also have friends and colleagues who are writers, who work in Hollywood, who are creatives, and who look at the arrival of this technology like the Grim Reaper. These are wildly different responses from the two communities. Chris, maybe I'll throw it over to you first: do you have any sense of why these communities are responding so differently to this technology?

"Never" is such a strong word; that would be one of my first reactions. Never? So far? No feature at all? "I'm never, ever going to stream video content because I believe physical is more important." Well, you know what, you're out of business, Blockbuster. That said, I applaud them. They make tools for their particular audience, and their audience doesn't want this, and I think that's going to be a unique differentiator. I'm just not sure how it stands the test of time; the industry is moving fast, and different audiences have different needs. I'm pretty sure that if I use Procreate there's no chance I'm ever going to produce anything of any artistic quality, because I have no artistic talent.

But you're not the target audience.

I am not the target audience, but I am grateful for AI-generated art, because it allows me to produce things I would never be able to produce otherwise: PowerPoint slides and so on. If they're focused on creative professionals, and creative professionals don't always want gen AI in their tools, I understand that. You've got your audience, you've got your target, and that's fine, and I think there will always be an audience for that, but I think the tide of time will push against them.
And I think that's going to be a very strong artisan statement to make.

Before we move on, Chris, what sort of PowerPoint art are you doing? That was my question.

Honestly, it's almost always unicorns with rainbow-colored hair. That is my go-to for CEO presentations; every CEO loves a picture.

Sure, that resonates with me. But Skyler, Volkmar, I'm curious whether either of you has a take on the community's reaction to these two different sets of tools.

I think we're in a world where we have artists and craftsmanship, and we're going through a phase of automating that artistry and craftsmanship, so the bar will be really, really high, and there will always be unique art. Still today I can buy a photograph, I can buy a copy of a Monet, some of the greatest artists in the world, and hang it on my wall, but there is still a demand from people to have art which is theirs, and I think that will stay. We've seen this across the progression of time: horses used to be a form of transportation and now they're a hobby, old cars are going the same way, and hopefully at some point that happens with airplanes. If I can automate and industrialize the creation of these unique pieces, industrialization wins; it always wins. That doesn't mean those tools and those artists and that craftsmanship shouldn't be supported, but it will shrink dramatically, because the capability becomes accessible to everybody. We used to have typists; now everybody can type, and the typists are gone. It will be the same thing here.

One of the things I found interesting is that point about craft. A lot of people choose their life's work because they like the craft: they chose to be an artist or a developer because they like doing that work, and having a tool come in and do all of it robs some degree of value from the things they do day in and day out. Another place I was thinking about tension around this dynamic is the relationship between management and practitioners. One of my observations is that management is often particularly enthusiastic about adopting these tools because of the productivity benefits: I can get more done, reduce my cost, drive more revenue, whatever it might be, because those are the results they're running their entire organization to deliver, and in some cases, as they've gotten more senior, they're a step removed from actually doing the craft, so the loss of the craft feels like less of a consequence. To practitioners, it's "this is my thing, and this tool is coming along and doing it for me." So I'm curious whether, within your own teams or in the work you're doing with clients, you've observed any tension between management and practitioners in terms of their level of enthusiasm for this technology.
I'm not sure about tension between management and practitioners. There might be some tension I've witnessed about which flavor or which version: leadership says "we're going to use this one," and behind the scenes somebody is using a different tool, and there's some back and forth on that. So it's not necessarily the adoption, but maybe the channel or the tool, where I've seen a bit of this-one-or-that-one. That's what I've observed.

I think it's also a question of, when you look at craftspeople, there's the 20% of the work you love and the 80% you hate, and often the toil is the majority. Ask a data scientist: 80% is data cleaning, and do you think they like data cleaning? No. If the tools take on the toil, the useless work, and make people more productive, then you shift more into the work you actually like and appreciate. From the engineering perspective, and I'm mostly talking about software engineers here, I think it's actually an improvement. Nobody likes Jira ticket reviews and writing comments and all that stuff; if that can be automated away, that's an improvement in people's lives. Or I don't need to go to Stack Overflow and hunt for that algorithm, I can just ask the model to write it and I'm done, and I get to work more at the architectural level. From a management perspective, they want to get productivity out, but productivity in an engineering process in many cases means convincing people to do pieces of work that are necessary for the product but that everybody hates. To a certain extent it's an improvement on both sides.

That's a great point. It's probably not the safest description, but I always tell the teams I work with that we share those things: everyone should mentally come to terms with the fact that some percentage of your job is the work none of us wants to do, and we're at least going to spread it around the group. A lot of the teams I work with operate on ibm.com and do a lot of work around content, and that property has tens of thousands, hundreds of thousands of pages. We're trying to do much more with automation and with how we connect content together, and it turns out that to do that, all your tagging has to be really good across the entire property. The amount of time we would otherwise spend cleaning up the metadata on some chunk of the website is "kill your calendar for three days for a whole chunk of the organization." If instead we can build a really good classifier and similar tooling, that kind of thing lands as a huge relief and lets us focus on the work we actually signed up to do. Within my team, a lot of what we're doing is looking at this type of tedious work, work that is important and has to get done, to your point, but that nobody really wants to spend their day on, and asking whether we can automate as much of it as possible so we can actually focus on the work we want to do.
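As a concrete illustration of the kind of tagging classifier Brian mentions, here is a minimal sketch using TF-IDF features and logistic regression. The example pages and tags are made up for illustration; in practice you would train on pages whose metadata is already known to be correct and run the model over the untagged backlog.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: page titles or extracted text, plus the tag
# a human editor already assigned. Real training sets would be far larger.
pages = [
    "Granite models for enterprise code generation",
    "Hybrid cloud pricing and migration guide",
    "Quantum computing research roadmap",
    "Deploy watsonx on Red Hat OpenShift",
]
tags = ["ai", "cloud", "quantum", "ai"]

# TF-IDF over unigrams and bigrams feeding a linear classifier.
tagger = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
tagger.fit(pages, tags)

# Predict tags for pages whose metadata is missing or suspect.
print(tagger.predict(["Fine-tuning foundation models for customer support"]))
```

A model this simple obviously is not the whole answer, but it captures the shape of the workflow: train on the pages you trust, score the rest, and send only the low-confidence cases to humans.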
But when it comes to using LLMs for the core thing we're doing, everybody is still a little skittish, honestly, at least in some areas. It's not so much on the software engineering side of our teams as on the creator side, so some of these announcements resonate with me, because I see the same reaction from some of the folks I work with.

I don't think it's just the tedious stuff, though. For prototyping and ideating it's really good, and I don't think it matters whether you're producing content, code, or images. Sometimes you think, "I have an idea, is this going to work? It's going to take me a lot of time to build that up," so you get the LLM or the image generator to produce something, you get an idea of what it looks like, and then you start pruning it and building the idea out. Speaking more from the software development side, that's how I work. At the moment I'm trying to create a distributed parameter service for training LLMs, and there's no chance I would be able to just sit and code that straight up myself; I need an LLM to help me figure it out a little, and then I'll engineer my way to where I need to be. It's the same with image generation: if you're doing a concept and you need that unicorn with rainbow-colored hair, get the image model to produce it, decide "that doesn't quite work in context, I need this," and then go draw your pretty unicorn. Prototyping is a really important use case.

And Chris, when you're doing that prototyping, you can have a dialogue with the machine, and you get major refactorings done in seconds, because you can just say, "I want this other thing, split this into four classes," or "collapse them." Think of the amount of work that would normally take; that's all the tedious stuff, the refactoring of code. We have IDEs to do it, but they kind of suck, so if you can get an LLM to do it, it's just amazing. In an hour somewhere on a plane you can write massive amounts of code and experiment with it.

Brian, before we leave this topic, I think we need to remind ourselves that you asked an art question of three nerds. I'm safe in saying that, right? So just to put a disclaimer here: we're talking about inevitability and tools and all of that, because that's where our brains go, but it would be really fascinating to have this conversation with artist representation, with the artists themselves.
One of the reasons I raised it is that I do have friends who do both of these things, and I've observed how different the reactions are from them and from the communities they operate in. There are a bunch of interesting economic factors at play too: I think there is less concern about real industry disruption within the software engineering community than there is on the creative side, so there's a core underlying economic anxiety that isn't quite the same in the two places, even though you're really just dealing with different types of models improving productivity in different domains. It will end up landing pretty differently. So it's a fair point that we didn't fully represent the other side of this, but it's a super interesting topic. And on the point about "never": there are so many tools where you use them as part of a workflow and you don't even know what the underlying technology is. If you want to take a background out of an image, do I know whether that's gen AI or something else, and do I even care? So in some of those places I think, "never, really?" But it will be interesting to see how this space evolves over the next couple of years.

Earlier this week AMD announced the acquisition of ZT Systems. As everybody knows, the hardware space has been one of the biggest winners, if not the biggest, in the early days of the gen AI and LLM cycle. We've all talked a ton about Nvidia, but AMD is obviously making a big play in this space as well. Their CEO, Lisa Su, was on CNBC earlier this week talking about the acquisition, and the way she framed it is that AMD has historically invested a lot in silicon, has been doing more on the software side, and with this acquisition is starting to bring together a stronger set of capabilities from a systems perspective. So Volkmar, as a way of kicking things off: why is this market moving from silicon to systems, and why are these almost vertically integrated systems so uniquely important?

If you look at the AMD offering, AMD acquired ATI a decade or two back, and that's the heritage of their AI accelerators; they've been roughly head-to-head with Nvidia over the years, each owning some spaces. What Nvidia did very well over the last couple of years is look not only at the GPU itself but at many GPUs in a box, and then, when you go into training, at multi-box systems, so you need many machines and the integration between them. If you look at the acquisitions Nvidia made, they acquired a company providing the software stack to run very large-scale clusters, which became the Base Command product, and they acquired Mellanox, the leader in reliable, high-performance network communication. So AMD is sitting there asking, "what do we do?" They don't have a consolidated story for how to put a 10,000-GPU training system on the floor, so they're kind of locked in the box, and they're not yet at the scale where they can compete on the training side.
That's also, I think, the reason Nvidia owns something like 96% of that market. When you're trying to train, you can pretty much only use Nvidia, and then you've already done all the coding on Nvidia systems, all the operators are implemented and performance-optimized for CUDA, because otherwise you couldn't have trained the model; running it afterwards is comparatively trivial. Switching ecosystems is really hard. Nvidia went down the route of the DGX systems, building full racks with all the network communication included, and AMD is just now catching up: catching up on the network against Mellanox, with the Ultra Ethernet announcement, and catching up on how to get these big systems into the industry at scale and into the cloud providers. So acquiring ZT Systems, a boutique shop that makes very large-scale infrastructure deployments happen, is a logical conclusion.

You mentioned training a lot, so as a follow-up: one observation I have about the GPU market in particular is that it feels more vertically integrated than the world of CPUs, at least somewhat. Would you agree with that characterization? And if so, is building out the unique set of requirements around the training stack the underlying force behind why this market behaves the way it does, or do you see that story differently?

The training-system market is traditionally a very esoteric market, the high-performance computing market. At IBM we built the number one and number two Top500 supercomputers with Blue Gene, the L, P, and Q systems, and the follow-ons. Suddenly we're in a world where that is no longer the domain of labs that drop a hundred million dollars and get a computer; every company that wants to train a network at scale needs similar technology. So after 20 or 40 years of HPC being an esoteric field of, say, 50 supercomputers in the world, suddenly it's a commodity, and every startup says it needs one.

We should all have a supercomputer.

Exactly. "Oh, I need a supercomputer, you don't have one?" There's the unfinished-basement joke, and the "I'm GPU-poor, I only have a hundred." If you want to play in that market you need to actually offer a solution, and AMD has traditionally been in the desktop market and the enterprise market with their GPUs; they sell silicon, but they never built these systems. Nvidia, ostensibly a GPU vendor, has amazingly captured something like 85% of the dollars spent in the data center: your Intel chip and a little bit of memory, good luck, and everything else, the switches, the Ethernet cards, the GPUs, they take. So for AMD to get something deployed at scale, they need an offering that is on par. I think Intel with Gaudi is in slightly better shape because they have partnerships going back 50 years with Dell and Lenovo and others; it will be easier for them to get into that market because they already have an ecosystem, and that's not the case for AMD.
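To make the software lock-in Volkmar describes a bit more concrete, here is a minimal sketch of multi-node data-parallel training with PyTorch's DistributedDataParallel. The model is just a placeholder, but the moving parts are the ones that tie training to the whole system rather than the chip: the NCCL collective library, the interconnect it runs over, and a launcher such as torchrun that wires up ranks across machines.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # NCCL is the GPU collective library; the launcher (torchrun, Slurm, etc.)
    # provides RANK, WORLD_SIZE, MASTER_ADDR and LOCAL_RANK in the environment.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model; DDP wraps it so gradients are all-reduced across GPUs.
    model = torch.nn.Linear(4096, 4096).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):
        x = torch.randn(32, 4096, device=local_rank)
        loss = model(x).pow(2).mean()
        loss.backward()        # gradient all-reduce happens across the whole cluster here
        opt.step()
        opt.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    # Example launch: torchrun --nnodes=N --nproc-per-node=8 this_script.py
    main()
```

Every backward pass triggers an all-reduce of gradients across the cluster, which is why the interconnect, the switches, and the cluster-management software end up mattering as much as the accelerator itself.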
This is why I don't get the acquisition, Volkmar. Say I were an apple company, not Apple, an apple company, and the whole market bought Red Delicious apples because they're great apples, but my company sold Granny Smiths and nobody ate Granny Smiths. Why would I buy a company that makes better packing boxes for my apples? That's my problem with it. If I'm spending five billion dollars, spend the five billion on getting better GPUs and go compete with Nvidia. That's where I don't quite understand it.

I think Nvidia figured out a way of actually delivering and deploying to partners, and to a certain extent AMD got locked out of that space, so they need to find a way to market. If you look at the training space, a huge percentage of training is actually happening with the hyperscalers; companies want to put Nvidia cards on their premises, but in many cases, in the early beginnings, they go to the cloud. ZT Systems delivers to the hyperscalers, so for AMD it's a way to get into the hyperscalers with a full solution: "we give you the whole thing," which takes down the risk for the hyperscalers.

I'm not sure people do want to use Nvidia. Nvidia has this market locked, and Nvidia is awesome, they make great GPUs, but at the same time Apple seems to be doing well in the desktop and laptop market with their own chips and with MLX as a framework, so custom Apple silicon seems to be working out. Google invested in their own ASIC-based chips with TPUs, and you see other people moving into ASICs as well. I think there is space for a low-cost alternative to Nvidia chips, and there's a market for it, because otherwise other companies and hyperscalers wouldn't be investing in it. That's why I'm saying I don't get it. Nvidia by far makes the best GPUs across the board, they're an incredible company; I just think if I were a competitor I would try to find an adjacent space that isn't the packing boxes.

In the training market right now, Nvidia is just the only choice you have, and I think that's primarily where AMD is trying to break in. In the inference market there will be, as you said, Apple, Qualcomm, a ton of chip vendors, and a plethora of startups in Silicon Valley trying to build super-low-power parts. But in the training market, if you look at where AMD is going and the wattages they're putting down, above a thousand watts per GPU in the next generations, Nvidia is effectively the only game in town, and I think AMD wants to put something up against that.

For pre-training, maybe, but not necessarily fine-tuning.

Fine-tuning you can in many cases do in a box; you don't need a huge system. But in the pre-training market you do, and right now you buy Nvidia or you buy Nvidia; Gaudi isn't there yet, AMD isn't there yet. So I think this is effectively an attempt, and who knows how it plays out; thank God I didn't have to make the decision. But I think this is an
attempt to break into that large-scale training market and deliver very, very large HPC systems. Companies run 100,000-GPU training clusters; building one takes a year, it's a massive investment, it's in the billions of dollars, and if you want to capture some of those revenues you need to be able to deliver it. It's not "we collect three engineers and they put up a supercomputer"; this is a construction process, and with this acquisition AMD finally has a chance of bringing in the folks with the hard hats as well, because you need to put in power and cooling and all of that. They don't have that experience in-house right now, so I think they're buying the competence.

That point about competence came out a lot in the discussion after the acquisition: this is a company with real capability in exactly that, building out some of the largest clusters in the world. And it's an interesting theme I've heard at every level of the gen AI stack at different points over the last year or so; you're almost rate-limited by the amount of expertise in the market. You heard it on the hardware side, you heard it on the training side, and you heard it for a while even on the prompt engineering side, when people referred to prompts as magical incantations and only a certain group of people really knew how to prompt a model correctly. What I've observed over the last two years is that as this has blown up, some of those skill shortages feel less acute: more people know how to train models, more people are getting competent working with models, more people are attracted to the hardware side because of what's happened over the last couple of years. So I'm curious: how much do you feel our progress in AI is still rate-limited by raw expertise across the world, and how much has that improved over the last year or two? Skyler, maybe I'll kick it over to you first.

I have this conversation pretty regularly with our director, and I would say it's not necessarily the overall amount of skills, which I think is monotonically increasing, but how it's distributed across the globe; that's becoming more extreme. It's something we're experiencing: we're IBM Research Africa, we represent a billion people, but the talent here is likely to emigrate, and what does it look like to keep that talent here and build that culture here? So yes, skills are increasing, but at very different rates across the globe; that would be my short summary. It's something we talk about on a regular basis: what does capacity in generative AI look like on a truly global scale? That's probably another session entirely in itself.

I was not expecting that; that's a fascinating perspective.
Chris, Volkmar, thoughts?

I think there's such a gold rush, and it's a new technology, so a lot of it is about trying things out; every day there's something new, so you need people who are really passionate about it, who spend their waking and half-sleeping hours on it. The skill set will develop over time. I feel like we're repeating the gold rush of the early web era, when it was "oh my God, you can write a web service, isn't that amazing," and now everybody can do it. We're in this uptick with an extreme supply shortage, and it's deep: when you just plugged a computer into a network it was relatively easy, but now training is different, you need to understand the math, and most engineers hate math, which is why they like computers. There's a set of skills that needs to be built up, and it takes time to roll through the universities and produce people who are truly practitioners: first the education, then five years of toying around with it. So I think for the next ten years, given the speed of change, we'll be in this world of supply shortage everywhere. On the flip side, coming from the systems corner, it's nice to see that we're finally building big computers again. I really like that we're moving away from the cloud providers doing everything for us and looking at system design with a fresh angle; I think that's good for the industry. We were locked in, and there are maybe five companies in the world who still know how to plug a computer into a network and a power socket, so it's good that we're going through a renaissance of computer architecture and design.

I'm the total opposite. I think people are learning the skills and doing a great job of it across the globe, but at the end of the day, if you want to train a large language model you need an awful lot of GPUs and access to an awful lot of data, and that's outside the reach of the average human being. There's a lot of really great talent that won't be able to practice the craft, because the access to GPUs needed to learn, say, the effect of this data, just isn't there. You can learn by fine-tuning and training very, very small models, but we know that for the larger models the capabilities emerge at higher scale, and the scale now is tens of thousands of GPUs. That's what's locking out the average practitioner. Personally, I want to see more distributed compute, more access to GPUs and skills, and then, to Skyler's point, that will open up a really talented set of people distributed across the globe to make great contributions in this area. But at the moment it's going to be concentrated in the big tech companies, because they're the ones with the GPUs.

Chris, I want
to fight back on your fighting back; that's why we do this, right? If a researcher comes to me and says the only way they can make their case is that they need 10,000 GPUs, that's not a good argument. That researcher needs to be able to make their case with two GPUs. Start the conversation by making the case on the two-GPU example; show that, and then we can talk about the hundred, the two thousand, the hundred thousand and one. I don't think it's fair to say "I can't make progress unless I have 10,000."

I agree, Skyler, but again, we're sitting in a company that has tens of thousands of GPUs, so a researcher can make the argument with two GPUs and then you can give them access to scale. The average person might get so far with two GPUs and then say, "huh, I don't have the money," and go do something else.

So we're moving to a world of universal basic compute? I feel like that's been a little bit memey recently, so we'll call it a day there. Volkmar, Chris, Skyler, thank you all for joining; great discussion today. And for those of you listening to the show, you can grab Mixture of Experts on Apple Podcasts, Spotify, and podcast platforms everywhere. Until next week, thank you all for joining; we'll see you next time.
