NEO 1X Robot, OpenAI chips, The AI Scientist, and the future of prompt engineering
Tim Hwang: My opinion is that prompt engineering is never going to die. It's a forever thing. Kate Soule: Anyone who's worked with large language models has experienced some of the pain, dark art, black magic of... if I shout loudly enough at my model, maybe like literally if I type in all caps, maybe this time it will do what I'm asking it to do.
Tim Hwang: The creepy factor is big, but these robots are also pretty cool if you can get them to work. Kaoutar El Maghraoui: I would love to have one actually in my home, cleaning dishes and cooking. Tim Hwang: How many scientists are going to be out of a job in the next 10 to 15 years? Shobhit Varshney: I'm just looking forward to a world where we start using the word "we" when AI is actually starting to do something meaningful for us.
Tim Hwang: All that and more on today's episode of Mixture of Experts. I'm Tim Hwang, and I'm joined today, as I am every Friday, by a world-class panel of engineers, researchers, product leaders, and more to hash out the week's news in AI. On the panel today: Kate Soule, a Program Director of Generative AI Research; Shobhit Varshney, a Senior Partner Consulting on AI for the U.S., Canada, and Latin America; and Kaoutar El Maghraoui, Principal Research Scientist, AI Engineering and AI Hardware Center. So as always on Mixture of Experts, we're going to start with a round-the-horn question, and that question is: will prompt engineers even exist in five years? Kate, yes or no? Kate Soule: No. Tim Hwang: Shobhit, yes or no? Shobhit Varshney: Not at all, man.
Tim Hwang: Uh, okay, alright, and how about you, Kaoutar? Kaoutar El Maghraoui: I think it's gonna evolve to a different role. Tim Hwang: Okay, alright, well let's get right into it. The prompt for this first story that we want to cover today is that we've just had a slew of sort of subplot, B-side kind of announcements coming out from all the companies.
They haven't been the most kind of prominent things they've been announcing, but it has really kind of created a little bit of a pattern. I think, Kate, you flagged this for us. Which is that a lot of the companies have all been working on prompt automation, right? So Anthropic announced a Meta Prompt system that helps generate prompts for you.
Cohere is launching a prompt tuning feature, which takes a prompt that you have and improves it automatically. And then Google recently acquired a company called Prompt Poet, which offers very much the same functionality. Um, and so this is a big deal, right? If you're familiar with LLMs, in the past a lot of the work has gone into making a good prompt. Um, and, uh, I think the big thing about this is the future of basically taking the human out of the loop, the idea that you won't need prompting anymore. Um, and I guess, Kate, as someone who kind of threw this topic to us, do you want to just explain for our listeners why that's important? Right, like what changes when that happens? Kate Soule: Yeah. Uh, and I, I like what you did there, Tim, the prompt for today.
Uh, so look, I think anyone who's worked with large language models has experienced some of the... pain, dark art, black magic of... if I shout loudly enough at my model, maybe... like literally if I type in all caps, maybe this time it will do what I'm asking it to do, right? Uh, which can be a really frustrating process and doesn't make like, logical sense, like I think we're all rational beings and ideally there would be a really rational and structured way to try and prompt these models. So I'm really excited to see a lot of work come out, which is trying to...
not take a human entirely out of the loop, but take a human out of the loop of finding these phrases and tokens and words and patterns that seem to be more effective for a given model to perform the task in question. So, you know, being able to, for example, search a broader space of natural language and try and identify, okay, if I frame my question this way, now I can get an improved level of accuracy. I think that is going to be really powerful overall, just to improve productivity and reduce some of the stress when working with models. Tim Hwang: Yeah, for sure. And now, Kaoutar, in your response you agreed with everybody that maybe prompt engineering is not long for this world, but you did say that you feel the role will shift.
Um, do you want to tell us a little bit more about what you're thinking there? Kaoutar El Maghraoui: Yeah, sure. So there have been a lot of recent developments in prompt engineering that are leading to significant changes, particularly in how prompt engineers interact with large language models, like Kate mentioned. Take, for example, the meta prompting from Anthropic.
And the development here shifts the focus of prompt engineers from crafting these individual prompts to designing systems that guide the AI to adjust its own behavior. So prompt engineers may increasingly focus on creating frameworks for meta prompting, or refining the logic that underpins it. And this creates a more robust role where engineers manage how prompts evolve in real time. And if you look, for example, at prompt tuning from Cohere, the Prompt Tuner, it enables users to fine-tune and optimize prompts specifically for different applications.
And, you know, the implication here is that prompt engineers may transition from manually crafting prompts to overseeing or curating automated tuning systems. So this kind of democratizes prompt creation, and it could reduce some of the technical barriers to entry, pushing prompt engineers to focus on more complex or high-impact tasks where deep expertise is still required, such as, you know, designing industry-specific models or optimization at scale. And there is also the Prompt Poet acquisition by Google.
So here, you know, this acquisition emphasizes automation in the generation and optimization of prompts. And the implication is that it further blurs the line between AI systems and prompt engineers. So as AI systems like Prompt Poet evolve, the role of the engineer may shift towards a more supervisory role, where you're supervising these AI systems that continuously optimize themselves. So human prompt engineers might focus more on edge cases or creative tasks or model-specific customizations. So I think the implication overall here is a shift from a manual to a supervisory role.
I don't like to say that, you know, we're going to completely remove the human from the loop here, but there will be an increased focus on optimization and an expansion of the skill set for prompt engineers. They will need a broader set of skills, including model training, dataset curation, the integration of LLMs into broader AI pipelines, and also some niche specializations. So to sum up, I think prompt engineering is likely evolving from a hands-on, manual role into a more, you know, supervisory role, where engineers focus on higher-level design, optimization, and supervision of these automated systems.
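To make the "automated tuning" idea discussed above concrete, here is a minimal, hypothetical sketch of the kind of loop that tools like Cohere's prompt tuner automate. Every name in it (`call_model`, `score`, `CANDIDATE_EDITS`, `tune_prompt`) is invented for illustration; a real system would call an actual LLM API and score candidates against labeled examples rather than these stand-ins.

```python
import random

# Candidate rewrites a tuner might try; real systems search a much larger space.
CANDIDATE_EDITS = [
    lambda p: p + "\nThink step by step.",
    lambda p: p + "\nAnswer in one short sentence.",
    lambda p: "You are a careful expert assistant.\n" + p,
]

def call_model(prompt: str) -> str:
    # Placeholder for a real LLM call.
    return "model output for: " + prompt

def score(output: str) -> float:
    # Placeholder metric; a real tuner would score against a labeled eval set.
    return -len(output)

def tune_prompt(seed_prompt: str, rounds: int = 10) -> str:
    """Greedy search over prompt variants, keeping the best-scoring one."""
    best = seed_prompt
    best_score = score(call_model(best))
    for _ in range(rounds):
        candidate = random.choice(CANDIDATE_EDITS)(best)
        s = score(call_model(candidate))
        if s > best_score:
            best, best_score = candidate, s
    return best
```

The point of the sketch is the shape of the workflow: the engineer curates the edit space and the scoring metric, and the loop, not the human, does the trial-and-error prompting.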
Tim Hwang: Yeah, that makes a lot of sense. And it's sort of interesting that the kind of process that's happening in the movement to AI agents will also sort of happen in the prompt space, right? Which is, rather than doing everything yourself, you're just sort of monitoring the system as it goes and keeping it together. Shobhit Varshney: Yes, I think the prompts will get more and more personalized to that particular person. And over time, there will be a lot more context that will automatically get pulled in. So the center of gravity is going to keep moving towards more hyper-personalization, to Shobhit as an individual.
Uh, so the way the prompt, when I say something to a model, the way it expands it out and makes a meta prompt out of it, that'll be super hyper-personalized to the context, the memory of everything that I've done in the past, right? Uh, like, I feel like being a good prompter to these LLMs at work has made me a much better parent talking to my eight-year-old daughter.
Tim Hwang: Explain it clearly, think through it step by step, you know. Shobhit Varshney: Yes, I have to talk to my daughter saying, Anya, you just turned nine, you are a big girl now, and then I walk her through a chain of reasoning, and I get the answer I'm expecting her to arrive at: that no, I should not have ice cream before I sleep. Tim Hwang: Got it, right? Exactly.
That's the desired outcome. Shobhit Varshney: Absolutely, and that's a two-way feedback training, right? And now we're at a point where, say it's 8 p.m., and if I say, Anya, her response is going to be, "Papa, I'm almost done eating." Because she understands that there's a pattern: when she's eating and taking more time, I'm probably going to be checking in and seeing if she's eating properly or not, right? So she has a lot more context on how to respond to Shobhit, right? But if my wife is calling her name, her response is going to be slightly different. So I think the hyper-personalization of these meta prompts, that's a direction we will be looking at going forward.
Tim Hwang: Yeah, for sure. And I guess, Kate, maybe to turn it to you before we move to the next topic, one thing that I did want to bring up is, you know, when we think about prompting with humans, we encode in language, right? What's sort of interesting is that the prompting that we've done is both to kind of, like, help us understand how we're interfacing with the system, and then also to direct the system. I think...
I don't know if you buy this, but many of the optimizations may use tokens that don't even look like, you know, normal grammar, right? Like, it could just be a random string of numbers and letters that actually gets the best results out of the system. And so I'm kind of curious: do you feel like prompts over time will become more and more obscure to us, right? Because it turns out the optimal encoding for the language model may actually not be something that's particularly human-readable or easily understandable at all. And so there's almost this very interesting trade-off between optimization and readability. Just wanted to get your thoughts on that.
Kate Soule: Yeah, well, I think to answer that question, it's important to recognize that there are really two different sides of innovation happening around this area. So one is improving our ability to prompt the models, but the other is improving the model's ability to take a structured and more reasonable prompt. So, you know, instead of talking to Shobhit's eight-year-old daughter, can I talk to a software developer that understands, you know, structured inputs and can provide very structured responses? So if we only innovated on the prompt optimization side, where we're trying to create new tokens and, you know, keep the model frozen, then yes, I think we could get to a point where we're starting to see non-human-readable prompts.
But I think we're also seeing, like with OpenAI's structured outputs, more and more structure being baked into these models, to make it more standardized and systematic how we work with them. And ultimately, you know, I think that's where the real value gets unlocked and where a lot of really exciting workflows could develop, especially in agentic patterns. If we can really start to focus on having a very structured, formulaic, maybe not perfectly human-readable way to work with these models, where it's not, you know, storytelling when I read what the model is doing, but a very formulaic way of working, I think that's ultimately where we're going to end up. Tim Hwang: Yeah, it'll be so funny, because what you're describing is that we're reconverging towards code, right? Like, structured language as a way of getting systems to do what we want them to do. Kate Soule: Yeah, we started structured, created a bunch of unstructured, and now we're like, wait, there were actually some good things there that we should maybe bring back. Tim Hwang: So I'm going to move us on to our next topic.
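A minimal sketch of the structured-output pattern Kate describes, under assumptions: rather than parsing free-form prose, the application defines a schema up front and validates the model's reply against it. `EXPECTED_FIELDS` and `parse_structured` are invented names for illustration, and `reply` stands in for a real LLM response; production systems would enforce a full JSON schema on the model side.

```python
import json

# Hypothetical schema: the fields and types the application expects back.
EXPECTED_FIELDS = {"sentiment": str, "confidence": float}

def parse_structured(model_reply: str) -> dict:
    """Parse a JSON reply and check it carries the expected typed fields."""
    data = json.loads(model_reply)
    for field, ftype in EXPECTED_FIELDS.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"field {field!r} missing or not {ftype.__name__}")
    return data

# A well-formed reply parses cleanly; a free-form one fails fast instead of
# silently producing garbage downstream.
reply = '{"sentiment": "positive", "confidence": 0.92}'
parsed = parse_structured(reply)
```

The design choice is the point of the discussion: the contract lives in the schema, not in clever prompt wording, which is what makes the "reconverging towards code" observation apt.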
Um, uh, we spend a lot of time on Mixture of Experts talking about software, we talk a lot about enterprise, but I think one of the most interesting, viral, if you will, AI moments of the last few weeks was the launch of a humanoid robot called NEO from a company called 1X Technologies. Um, and specifically, they're working on humanoid robots that are designed to be at-home assistants. So, you know, this demo, basically, if you've seen it, and if you haven't, it's worth looking up on YouTube or wherever, is a humanoid robot helping out around the home, right? Cleaning dishes, helping to clean up, and otherwise kind of assisting with tasks.
Um, and, you know, again, I kind of wanted to ask the question, and I think it's always an important question to ask in the world of AI, which is: how much of this is going to be a reality? How much of this is just a really cool demo? Um, and maybe most importantly, would you buy one for your own home? But we can address that at a certain point. Um, Kaoutar, I'm kind of curious about your thoughts, if you saw the demo, what you thought about it.
And, uh, you know, whether you think something like this is really going to be a reality. And in part, right, I think the question is whether or not this is a real, affordable thing from a hardware standpoint; there are a bunch of very practical, you know, bits-and-atoms kinds of questions here that I would love to get your take on. Kaoutar El Maghraoui: I would love to have one actually in my home, cleaning dishes and cooking, as someone who spends like an hour on one of the tasks I hate the most. Of course, the demo from 1X was very impressive. And I think 1X is among the most prominent companies in the emerging field of humanoid robots. But will humanoid robots become a reality, or are they still a pipe dream? You know, humanoid robots have been the focus of science fiction for a long time, and transitioning from dream to reality comes with significant challenges.
So the argument for humanoid robots is that they can fit into environments designed for humans, use existing tools, and interact more naturally with people. However, I think there are still several challenges that need to be addressed. First, there is the mobility aspect: building a robot with human-level dexterity or mobility has proven very difficult. While there has been some progress, there is still a lot that needs to be done. Technologies like soft robotics and advanced actuators are making strides here, but are far from a robot that can perform all human tasks autonomously.
The other challenge, you know, is energy efficiency. These robots require significant power to function, which limits their practical use. NEO, for example, and other similar projects are working to make these robots more energy efficient, but issues around battery life and energy consumption are still bottlenecks. Uh, the other thing is cognitive and social interaction. Beyond just the physical tasks, you know, these robots must navigate the complexities of human life, interactions, perceptions, and developing a robot that is capable of interpreting social cues, responding appropriately, and making decisions in real time is still an ongoing research area. There is still a lot of work around AI and reasoning. So I think it's going to take time for us to get there. And another challenge, I think, is the economics of this.
Building something that is affordable, versatile, and reliable is still a major hurdle. And for many industrial and service applications, simpler robots or specialized machines are more efficient and cost-effective than a general-purpose humanoid robot. So the complexity and cost of these humanoid robots, especially in their design, still limit adoption to especially niche markets.
So I think there are, you know, challenges. What's the reality versus the long-term vision? At present, it is a transitional phase. Existing prototypes are far from ubiquitous, but there are really nice demos showing a lot of promise. I think we are still not there in terms of mass-market tools and adoption, but it's not just, you know, a technological pipe dream. I think it's gonna happen.
That's my thinking. But, you know, for the full realization, it's gonna take years, if not maybe decades, before they really become a reality. Tim Hwang: Yeah, that functionality gap is very interesting to think about. Like, I love the idea that for a period of time people are purchasing these, but it turns out there's not a whole lot you can do around the home with them, so they end up just being like all the lonely Pelotons you see in people's houses, this really expensive piece of hardware that just kind of sits around. But it's just funny, because it's like a humanoid guy, basically, um.
I guess, uh, I don't know, Kate, Shobhit, if you've got a kind of view on this, if you're a little bit more skeptical or if you kind of agree that like, yeah, maybe, I don't know, Kaoutar, you didn't put a date on it, but like in our lifetime, you know, we'll see this become like a practical reality. Shobhit Varshney: Yeah. So, um, I'm a big geek and I will, I will go and buy stuff that I think is, is, is awesome, right. So I'm Tim Hwang: You're going to have the Peloton, uh, NEO robot in your house.
Shobhit Varshney: So, um, I feel the same way about this as about the argument over one massive model that's just absolutely stunning and can do everything, like a GPT-4o model or the Claude models, right, versus the argument that a smaller set of niche models for specific use cases is a lot more efficient and targeted at a particular use case, right? I'm in the camp of: I would rather have a device that is helping me with a particular task, and it does an incredibly good job at that task.
As an example, uh, I use the Roborock S8 MaxV Ultra, whatever the highest end of their robots is, that does vacuuming and mopping and goes back and cleans itself up and dries itself off, comes back again and finishes off that last little bit of scrubbing that it missed somewhere. More specialized tools helping us augment what humans aren't good at, I think that's the future direction in the short run. It'll take a while for us to get to something that solves for all the constraints that we just discussed before you get to a point where a humanoid replica of you can actually start doing things. So I think in the next five years, it's specialized tools that do a particular task incredibly well, are cost-optimized, handle something repetitive, and nail that particular use case. I'm more in that camp.
Kate, do you think the same? Kate Soule: I completely agree. If you think about how model specialization has progressed, uh, you know, we see the same exact trends you pulled out. So I'm 100 percent in the same camp. It also reminds me of the common story you hear: if you asked someone back in the horse-and-buggy days what they wanted, they always said they wanted a faster horse, and then, you know, Ford came along and released the first cars. And I think we're in a bit of that scenario right now, where it's like, I just want more human time, offloading the things that I don't want to do as a human. So sure, create some humanoid robot, but really, can we rethink what the right way is to make humans more superpowered, not just create more "humans" that we don't have to worry about feeding, or other potential, uh, labor issues.
Shobhit Varshney: Okay, that sounds more like, say, how we solved the dishwasher paradigm, right? Kate Soule: Yeah. Shobhit Varshney: We figured out that there's an optimal way of washing dishes, and it does an incredibly good job at a very low price point, and it nails it, right? So we have changed the way the human workflow used to work, right? Earlier, as a human, I would take a dish, rinse it, and keep it somewhere else. We did not try to optimize that particular workflow. We said there's a better way of solving this particular niche use case.
It's very custom-optimized, and it nails it. So I'm in that camp with you: I think we'll get to a point where smaller machines that do a particular task really well will win. For example, in our pool, we have a skimmer that just skims and removes all the dirt from the top. Now, a human would take a net and try to clean it up piece by piece. That's not the optimal way of solving that problem. So I'm with you that the workflow, the human workflow, has got to change. And then we optimize.
That's before we get to a point where a humanoid can solve for all the problems we discussed around cost, flexibility, dexterity, and things of that nature. Tim Hwang: Yeah, and for what it's worth, I think you also just can't discount, like, the creep factor, right? Like, I do feel like it's a little bit spooky to have, like, a, you know, a large human in my house. Um, and I do think that will be part of adoption; it almost leans in favor of these more specialized applications, because they don't raise that fear. Uh, I don't know.
We'll have to see in practice whether or not X1, or 1X, sorry, is able to pull this off. Kaoutar El Maghraoui: Yeah, I think it's an interesting development, you know, and it all comes down to what people are able to consume and to the capabilities. Of course, specialization versus generalization is always going to be a tension. But of course, if we can combine both, that would be great. Um, it's like what these LLMs are doing: we still need special models, but, you know, the evolution of LLMs is still important. Having these large models that can do a variety of things, but then specializing them for certain tasks.
Can we make the same argument for these, uh, humanoid robots? You know, they can do a variety of tasks, but maybe you can press a button and tell it, now I want you just to focus on cleaning the dishwasher or the pool, or something like that: take a subset of that model that is specialized, within that humanoid. I think that would be cool to have. Tim Hwang: Yeah, I mean, ultimately, you're going to have, like, you know, the humanoid robot being the one that does the maintenance for all the other smaller robots. It's just going to be robots all the way down. Kaoutar El Maghraoui: It's like a hierarchy over here.
Tim Hwang: Yeah, exactly. Shobhit Varshney: Kaoutar, just the way you framed that, I think you're looking at a Transformer robot... Kaoutar El Maghraoui: Exactly. Something... Shobhit Varshney: ...a vacuum cleaner, so it can do that one job really, really well. That'll be the world we live in. Kaoutar El Maghraoui: That would be cool. Yeah.
Tim Hwang: Um, so I'm gonna move us on to our next topic. Um, so there's a fascinating paper that was shared by a friend of the pod, Kush Varshney, who, if you're a listener, has been a recurring guest on this show. Um, and what I love about some of these papers in machine learning is that they pick the most dramatic name for their paper. And so the name of the paper is The AI Scientist, and it has a long subtitle about, you know, effectively using AI to automate end-to-end science.
Um, and it's a proposed system that tries to really push the limits of whether or not large language models can help out with scientific discovery in a fully automated way. And this is a big deal. I mean, you know, you think about how societal progress happens, right? These technological breakthroughs are really critical. And so, you know, one way of thinking about it is that we've got this kind of bottleneck with the researchers, the brilliant minds that we have.
And so, you know, the hope is basically: can we augment that process, can we accelerate that process with AI? That has been a real focus. Um, you know, what I always worry about with these papers is that the results look almost too good and the ambition is too great. Um, but, Kaoutar, I know you looked at this paper in some detail. I'm curious whether you came away feeling like, yeah, they really hit upon something here that could be the kernel of something new, or whether you feel like, you know, ultimately the way AI fits into science is going to look a little bit different from what they're proposing here.
Kaoutar El Maghraoui: Yeah, I enjoyed reading the paper. I think it really put forward a very nice way of thinking about this automated AI scientist, which also made me worry, you know, what's going to happen to scientists in the future? Um, so it presents, you know, this very nice framework where large language models generate research ideas, write code, run experiments, visualize results, and even write papers. And they also showed some very interesting papers that were, uh, you know, generated by this AI scientist. Uh, one thing that... Tim Hwang: Yeah, it just needed to do the, uh, the poster session at the conference.
Kaoutar El Maghraoui: It makes you even worry, you know, what's going to happen to conferences in the future, and with some of the papers, are they really generated by real scientists, or is this all, you know, LLM-generated? Um, so these advancements could significantly impact scientific discovery, reducing the cost and also increasing the speed of research. So there could be some benefits to this, especially if you look at it as augmentation for human research. The thing is, the controversy surrounding this paper largely comes from methodological concerns, especially when you look at, uh, you know, the reliance on automated review systems to evaluate scientific quality.
And that kind of raised some concerns for me. Uh, you know, the question here is whether such reviews can truly assess the novelty, creativity, and rigor of the work. Uh, and also, one thing I'm skeptical about is whether AI could really fully replace human intuition in scientific discovery, especially when you're dealing with more abstract, uh, or interdisciplinary fields.
I think AI is still not there yet when you're really looking across multiple fields and, you know, kind of mimicking that human intuition. And I think another thing, uh, is also the broader ethical and social implications of automating scientific research. So there are a lot of concerns here, but I think from a scientific perspective, it's a very nice piece of work. Um, but it has a lot of implications, of course, ethical ones, and also around the automated review process that they have.
So... Tim Hwang: That's right. Yeah, I'm curious, Kate, I mean, as a researcher yourself, how do you feel about all this? You know, it's very interesting, for example, seeing, like, engineers be like, well, they're never going to learn to code as well as I do. So I know there's kind of a tendency to push back on it, but I'm curious how you think about these types of experiments. Are they, like, fun toys? Like, would you use these? Like, would you read the papers produced by these AIs? Kate Soule: Yeah, well, I'm honored you call me a researcher, but I certainly work with a lot of amazing researchers here at IBM Research, even if I'm not one directly.
But, you know, I actually question, and as a non-researcher this might be a naive opinion, whether there isn't something that, uh, LLMs can do well in terms of understanding what's been done in the past, with related literature, on a much broader scale than what's humanly possible to go through and analyze and read, and then try to find similar methods or approaches to apply to a new, related problem. Um, I don't know if, Kaoutar, you have any thoughts on that, if that's maybe a jump too far. Kaoutar El Maghraoui: No, I think I agree. You have a point there.
So there might be stuff that they're discovering that scientists are not able to discover, because they're pulling from a wide variety of sources. Uh, but I think we still need a human in the loop here to validate and verify, uh, you know, these experiments, and then take them to the real world and try them and see the results. So we cannot just take the results from, you know, these LLMs and apply them directly. I think there still needs to be some verification, though probably these systems will get better and better as, you know, we use them more for scientific discovery.
Tim Hwang: Yeah, I think one of the interesting things here is that, uh, you know, some of the people I know who research this space think a little bit about the burden of knowledge, which is that there's just more and more knowledge and more and more papers. And, you know, part of the hope with some of these systems is simply that there are a lot of findings that could exist purely in finding connections between papers, connections that people are just not making. And so that ends up reducing it more to a search problem, right? I think what's kind of interesting here is the idea that you then want them to run the experiment, you want the AI to do the empirical stuff. You know, I think there's a question about how far beyond just the question of search you need to go.
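The "search problem" framing Tim describes can be sketched with a toy example: treat each paper as a set of terms and rank pairs of papers by overlap, surfacing candidate connections nobody has made yet. Real systems would use learned embeddings and citation graphs; the papers and terms below are invented for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two term sets."""
    return len(a & b) / len(a | b)

# Toy corpus: each "paper" reduced to a bag of key terms.
papers = {
    "A": {"protein", "folding", "attention", "transformer"},
    "B": {"attention", "transformer", "translation"},
    "C": {"protein", "crystallography", "x-ray"},
}

def closest_pairs(corpus: dict) -> list:
    """Rank all paper pairs by term overlap, most similar first."""
    ids = sorted(corpus)
    pairs = [
        (jaccard(corpus[x], corpus[y]), x, y)
        for i, x in enumerate(ids)
        for y in ids[i + 1:]
    ]
    return sorted(pairs, reverse=True)
```

Even this crude version captures the idea of reducing discovery to search: the highest-scoring pair points to two papers whose shared vocabulary suggests a connection worth a human's attention.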
Shobhit Varshney: Yes. I think it's just like any workflow from an enterprise perspective. We help a lot of clients with their R&D research and things of that nature, right? Coming up with a new formulation for a new food item or a perfume, or, like, product research for the next, uh, car, you know, so on and so forth, right? Battery research, whatnot. Across all of them, just like any other workflow in an organization, you figure out all the steps that are needed. When you hire somebody brilliant from MIT to come join your team as an intern, you're giving them a specific task to augment what a senior researcher who's been in the field for a decade has been doing, right? So, Kaoutar, you would plan it out, saying, hey, here's a task I'm going to give you: go research this particular topic. I think we'll start to incrementally see more and more AI helping out on specific tasks across the research spectrum, end to end.
Just like any other workflow, I don't think it'll completely be taken over by AI. I think it's augmenting intelligence rather than replacing it. So I think in that good tandem between humans and AI, we will also start getting better at knowing what to ask for help with. So, for example, you might want to generate a knowledge graph across a whole bunch of different research papers to figure out if somebody overseas, in a different country, had some novel idea that you just didn't think about, right? What I'm really interested in is a conference where each one of us would have our AI representatives going and talking to each other, right? Just imagine a collaboration between a team of researchers with their AI counterparts in, uh, in Israel, talking to their counterparts in the U.S., and they're exchanging ideas, and you come up with a new theorem and say, hey, I think we came up with this new idea that we should do X. I'm just looking forward to a world where we start using the word "we" when AI is actually starting to do something for us. Tim Hwang: Well, one of the big dramas in academia, of course, is who gets to be first author. Like, I wonder if in the future you'll get into this big struggle with some LLM collaborator of yours that's trying to take all the credit from you. And, you know, we'll have that drama play out, but it'll just be funny because it'll be, you know, humans and AIs. Kaoutar El Maghraoui: So I think it'll be competition between models, who's writing the best paper, and, uh, an AI conference completely generated by AI and reviewed by AI.
Tim Hwang: That's right. Yeah, exactly. Angry that your paper was unjustly turned down.
Reviewer number two, you know. Shobhit Varshney: I would say there are certain parts of the whole research spectrum that we don't think about quite yet. We're so focused on doing our actual novel research. But take something like peer review.
I'll give you an example of what we're doing with some of our utility companies. When utilities want to increase the price of their electricity in a particular state, they have to file a rate case. They have to make the case and say, here's why I think I should increase it by X cents, right? Five cents. We're helping these utilities create that whole submission package.
So we're looking at everything that they and all their competition have submitted. It's all openly available online. So we research and help create the first package itself. Then, once you know who's going to be on the panel assessing it, we can go look at every question those people have ever asked. In the peer-review analogy, we'd know that when Shobhit is the reviewer, he typically asks more about the ethical concerns of a paper, and so on. Each one of us has a pattern in how we ask questions, right? So we reverse engineer what the judges on the panel would ask, and then we change the documentation so the submission itself addresses those questions proactively.
Then, when you actually have to go present your case in person, that's essentially an interview. So we prepare the witness based on the kinds of questions that person has asked everywhere else, and on the right chain of thought to walk through.
So I think there are aspects of research that researchers don't want to do, where AI will be really helpful in augmenting. Do you think that'll be helpful, Kaoutar? Kaoutar El Maghraoui: I think so, definitely. Yeah, of course, as humans we're limited, and if we're augmented by AI, we're going to be superhumans, hopefully in the right direction.
So... Kate Soule: Well, and I think it gets back to what we were just talking about, right? Are we going to have AI literally try to become its own researcher and just replicate what a human can do? Or are we going to have AI specialize in parts of the process, run that process faster and better, and support humans in new, more efficient workflows? It's the same question, just now, without the robots, focused on the scientific method. Tim Hwang: The other news story of the week was that a report finally came out confirming the long-running rumor that OpenAI is going to invest in producing its own in-house chips to support its work. Part of this is its integration and collaboration with Apple, but more generally, something that's been rumored for some time now looks much more in the realm of certainty: they really are investing in this in a really, really big way. Kaoutar, you're the most natural person to ask about this. Why would OpenAI want to do this? Semiconductors are wildly expensive and very hard to pull off.
You know, my understanding is that China, the whole country, has been trying to reproduce the Taiwanese semiconductor industry and has been only moderately successful at it. So why is OpenAI making such a big bet on hardware? Kaoutar El Maghraoui: I think the CEO of OpenAI, Sam Altman, has made the acquisition of more AI chips a top priority for his company, and he has even complained publicly about the scarcity of these AI chips. So given the rising chip costs, the supply chain challenges, and the need for specialized hardware, especially hardware optimized for OpenAI's models, it seems to me that this is a strategic move.
Designing their own chips could enable OpenAI to tailor hardware to their specific workloads, improving performance, efficiency, and scaling potential. However, of course, there are challenges here, financial ones especially, given the complexity of semiconductor design and manufacturing. By creating these in-house chips, OpenAI can reduce its reliance on third-party manufacturers like NVIDIA, which controls a significant portion of the AI hardware market, almost 80%. So it's going to give them more control over their supply chain and allow them to specialize and optimize for their unique workloads, potentially improving efficiency, performance, and scalability.
While semiconductor development is a challenging and costly endeavor, I think this move could enable OpenAI to differentiate its hardware and scale its operations effectively. I think they've thought a lot about this, and it's a strategic move for them, but also a way to diversify.
Tim Hwang: Totally. I mean, it's wild, because what you're saying is basically: what's cheaper than trying to get H100s? Literally building your own semiconductor supply chain, which is a really crazy thing to say. Um, Kate, Shobhit, if you've got thoughts on this, one big question is, do we think it's going to be successful? I can almost see the argument for it, but man, if it isn't a high-risk sort of thing, right? Kate Soule: I mean, certainly high risk.
I really want to emphasize one point that Kaoutar brought up, which is that there are tremendous opportunities when we look at this next generation of AI and hardware co-design: making sure we're developing these models and the hardware that runs them in tandem, to really unlock new performance levels, new efficiencies, and lower cost.
There's tremendous opportunity there. So, you know, I think it makes a lot of sense to start to put some skin in the game, so to speak, given that there are just a ton of ways they could continue to innovate once they have better control over hardware design. Tim Hwang: Yeah, for sure. And Shobhit, maybe you're the ideal one to wrap up this section and close out the episode, since you think a lot about what this all means for business, for the enterprise. Can you paint the picture a little more? Because I think the semiconductor stuff is often very abstract.
But as Kate is saying, there are some very practical implications for our experience of these kinds of technologies and systems. So I'm curious: what does the everyday look like if OpenAI is really successful here? Shobhit Varshney: NVIDIA is a great partner of ours. We do a significant amount of work together; we have joint clients. Yesterday, I spent the entire day with NVIDIA. We're doing a lot of work around where they can go work with enterprises beyond the hyperscalers themselves. They got into quite a bit of detail behind the covers, explaining to us the intellectual property they've built, the differentiation.
They have a significant moat today, not just at the chip level but in the way they architect the entire end-to-end flow. On total cost of ownership, you're going from a massive data center down to one box. Just the wiring in existing data centers is more expensive than that one box from NVIDIA. And Jensen made this famous statement: even if their competitors, who are also their customers, made their chips free, the total cost would still be lower on NVIDIA. So they've done an incredibly good job of driving higher efficiencies and more throughput, 5x, 10x, on the same kind of footprint.
So I think it'll take a while for a company like OpenAI to do everything that surrounds the chip itself, just like it took Tesla a while to figure out how to actually productionize end to end when they came to market. Creating the core of the car, that piece was great; the researchers could solve for that.
But the whole manufacturing and supply chain, and the total cost: how do you get a car to actually be a $30,000 car that people want to buy? It'll take a while for OpenAI to get there. And in my view, that's going to distract them a little bit from their core business. They should, in my view, be focusing more on how to add more intelligence, like what Ilya just did with SSI, raising a billion dollars, or what the Claude models are doing with more responsible AI. I think there's still a lot more focus needed on solving that side of the problem for enterprises.
The cost will come down over time; that's just the way the economics work. Look at how the cost of computing on NVIDIA has dropped over the last decade. So I think OpenAI's focus should still be on the problems that need resolving before they start vertically integrating end to end. Tim Hwang: Yeah, it'll be fascinating to see. And as I said, I think this will not be the last time we talk about this issue.
So I'm not overly sad that we ran out of time on it today, because we'll pick it up in the future. And that's all we have time for today. Shobhit, Kate, Kaoutar, thanks for joining us on the show.
And for all you listeners out there, if you enjoyed what you heard, as always, you can get Mixture of Experts on Apple Podcasts, Spotify, and podcast platforms everywhere. We'll see you next week.