Cursor hype, Perplexity introduces ads, and AI at the US Open
Tim Hwang: So is AI going to wipe out all sports journalists? Aaron Baughman: No matter the sport, you know, we're always working with the same constant. That's the human. Tim Hwang: Seems to me that paid search at Perplexity poses some really big questions. Trent Gray-Donald: It's all about simple economics and who's incented to do what, and are you the customer or are you the product? Tim Hwang: Should I be using Cursor? Kush Varshney: You shouldn't ever ask a question of an LLM these days, at least, that you don't already know kind of the answer to yourself. Tim Hwang: All that and more on today's episode of Mixture of Experts. I'm Tim Hwang and I'm joined today, as I am every Friday, by a world-class panel of engineers, researchers, product leaders, and more to hash out the week's news in AI.
On the panel today, Aaron Baughman, IBM Fellow; Kush Varshney, IBM Fellow; and Trent Gray-Donald, IBM Fellow. So to kick us off, the U.S. Open is this week.
Um, and as usual at Mixture of Experts, we're, of course, excited about the tennis, but we're really excited about, like, the AI, and I really want to talk about the role of AI in the U.S. Open. Uh, but first to kick us off, because I personally am a huge tennis fan, let's just go quickly around the horn. I want everybody's nominee for the best tennis player of all time. Um, Aaron, we'll start with you. Aaron Baughman: Yeah, so that's a great question. Easy answer.
Ben Shelton. Tim Hwang: Great. Kush. Kush Varshney: Leander Paes. Tim Hwang: All right.
I like that one. Very good. Very good.
And, uh, and Trent, how about you? Trent Gray-Donald: Oh, I, I prefer squash. So Jonathon Power. Tim Hwang: Okay, great.
Well, thanks. Well, uh, I asked that question to kind of kick us off on the discussion today. Of course, the U.S. Open is happening right now as we record this episode. And as usual on Mixture of Experts, we're excited about the tennis, but what we really want to talk about is the AI. Um, and Aaron, in particular, I wanted to kind of have you on the panel and for you to kick off this section, because I understand that you've been experimenting with using language models to generate both long and short form stories for the Open. Um, and I wanted to talk a little bit about what you're discovering, like what's working really well in these experiments that you've been trying out. Aaron Baughman: Yeah. Well, thanks for having me.
It's really fascinating to watch how we apply these AI technologies, in particular these agentic architectures with a diversity of large language models deployed out at scale, to the U.S. Open that's happening right now. So if you go to www.usopen.org and even go to News, you can see a lot of our stories that are created with both humans and large language models together. But in general, we have two different types of projects. One of them is creating hundreds of match reports, pre and post, long and short form, for 255 of these different matches. And then the second project that we have is called AI Commentary, where we take stats and we transform that into a different data representation like JSON, and then input that with the prompt to get out text, and then that's voiced over with text-to-speech and embedded into these highlight videos. Tim Hwang: Yeah. That's really cool. And tell me a little bit more about how that works exactly.
So how do you go from a game to, like, a report about a game, right? Presumably there has to be a feed about, like, oh, this person just, you know, had a great serve, for instance. Um, like, how do you do that conversion? Right. Because I think what's interesting is you're going from video and visual to a written medium, and I'm kind of curious about how you guys approach that problem. Aaron Baughman: Yeah, it's really neat. This is all about message-driven architectures, where whenever we get a score, you know, for example, when a match ends, then we get a message, and within seconds, less than seconds, we'll then take that message and we'll pull in from about 14 different feeds that have raw data that describes the players, the match, where they are.
And also what they've done in the past. And we also forecast what's going to happen in the future. And we take all of that and we turn it into a representation that a large language model can understand, right? And we put it into the context of a prompt. So it could be JSON elements that describe, you know, with key values, what's happening in tennis. So, like, how many aces is somebody getting, or how many breaks has somebody won in the match, right? And then all of that is packaged together, and then we push that into the scaled-out architecture that we have, Granite, for example, um, and we pass it in with the prompt, and then the output would be a fluent text that describes the scene that's either just happened or that's coming up. And it's really cool to see it, you know, live as it happens. And there's all sorts of fact checking that happens, and quality checks, novelty pieces, and similarity, to make sure that it's up to par. And I use the word par on purpose, because we also do some things for golf as well, uh, which, um, is part of our over-three-year story, you know, that has evolved into the U.S. Open.
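To make that flow concrete, here is a minimal sketch, in Python, of the stats-to-prompt step Aaron describes. The feed fields, the prompt wording, and the `generate` stub are all hypothetical; the real pipeline fans in around 14 feeds and runs on watsonx with models such as Granite.

```python
import json

def build_commentary_prompt(match_stats: dict) -> str:
    """Package raw match stats as JSON key-values and wrap them
    in an instruction prompt, as in the AI Commentary pipeline."""
    stats_json = json.dumps(match_stats, indent=2)
    return (
        "You are a tennis commentator. Using only the match data below, "
        "write two fluent sentences describing what just happened.\n\n"
        f"Match data (JSON):\n{stats_json}\n"
    )

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a call to a hosted LLM
    (e.g., a Granite model served on watsonx)."""
    raise NotImplementedError("plug in your model client here")

# Example message, as might arrive seconds after a match ends.
message = {
    "player": "B. Shelton",
    "opponent": "F. Tiafoe",
    "aces": 14,
    "break_points_won": 3,
    "score": "6-4, 3-6, 7-6(5)",
    "result": "win",
}
print(build_commentary_prompt(message))
```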
Tim Hwang: That's great. And Trent, I saw you nodding on the mention of Granite. I don't know if you've got a connection to Granite as a project, but, uh, I'm wondering if you can kind of paint a picture of where you see some of all this going, right? So if we're doing these experiments in golf, sorry, these experiments in tennis, should we expect to see in five years that, like, you know, a lot of sports coverage, a lot of sports summaries and commentary really will be AI generated? Or do you think this is more of, like, a sports-specific thing, for instance? Trent Gray-Donald: I, I think that this is just the beginning of a lot of different initiatives. Uh, the reason I'm nodding is that Aaron and I actually, so I run the watsonx service, uh, that does the inferences that Aaron calls.
So he's basically calling my service when he does the work. Tim Hwang: This is your baby. Trent Gray-Donald: Well, right, but I, I'm always, like, I'm the plumbing, right? Tim Hwang: Yeah, sure. Trent Gray-Donald: He does all the interesting domain-specific work around tying together all the data sources, and it just ends up, you know, coming into our service. So he and I worked together on figuring out, okay, how do we make sure that we can handle the capacity and the latencies and all those different things.
Right. But in general, with how Aaron's built it, I see this whole agentic universe. Uh, I mean, there's everything from highly scripted through to letting the LLMs do what they'll do. And there's obviously a big middle; there's a lot of different points in that spectrum. And I think that for live events, for more and more human things like sports, we're going to start seeing increasingly interesting agentic architectures emerging that will extend beyond a given sport into more and more. I could, I could see that.
I think the interesting question is always, uh, can you find the right unique snippets to tell people? Like, one of the jokes that we have is, we're big baseball fans, and we're listening to the play by play, and they come up with these ridiculous statistics: this is the third player since 1943 who stood on their left foot and wiggled their ear. Tim Hwang: That's right. Yeah, yeah.
I've come to expect that. I mean, I watch a lot of, like, um, soccer, right? And it feels like the commentators, just to fill space, just have this remarkable bank of, like, the most edge-case statistics you could think of.
Trent Gray-Donald: Well, exactly. The question is, can we capture and distill that? Like, obviously there's a lot of data mining going into producing those right now. It's, okay, how do we connect those and make it engaging and interesting and human? Tim Hwang: Yeah, for sure.
Um, and I guess, Kush, curious about how you think about this. I know one aspect of your, uh, fellow work, right, is that you think a little bit about AI governance, which ultimately is kind of, like, how do we think about the influence of these systems on people? Um, and, you know, I think one response always is, like, okay, well, what is a sports journalist supposed to do in the future, in this world where a lot of the work that they currently spend time on is, you know, generating this coverage, generating this commentary? Uh, curious about your thoughts on how that all looks, right? Because I think as Aaron has already said, there's, like, ways of getting humans and machines to kind of work together on this front. Um, but I would love to hear a little bit about how you sort of see that relationship evolving, and is there a role, right, uh, I guess for humans in sort of an AI-enabled, you know, sports future? Kush Varshney: Yeah, I think we're going to talk more about this towards the end as well.
I mean, different sorts of human collaboration. And, um, the way I think about it, it's not so much of, uh, what is it about the job that we're trying to automate away and these sorts of things, but really the question of the dignity of the humans involved in this. Because, um, if you're the human and you're subservient to the AI, I mean, you have no dignity left in many ways. Um, so what are kind of the workflows that we can set up such that, uh, you get a better product, still are getting the advantages of automation, but still leaving the dignity of the human intact?
And, uh, one way to think about it is, uh, like, if you remember, um, House MD, Dr. House, the TV show, and, um, he had his, uh, whatever, residents, and, um, they were, like, doing stuff for him, conducting tests, whatever. Um, but, uh, it was very much an adversarial sort of relationship, so, um, like, they were always trying to prove him wrong. And if we can get the AI systems to be in that mode, working with the humans, then the human still stays kind of with the agency, with the dignity, but still gets the benefit of, uh, of all of the AI technologies. So, uh, I think something like that, um, could play out.
I mean, as we go forward with, uh, with a lot of different AI-human collaborations. Tim Hwang: Yeah. I love the idea that in the future, there's going to be, like, a sports commentator that has specifically an agent that generates those, like, weird statistics that Trent was mentioning.
It's just, like, an expert on finding and identifying those as the action kind of evolves. Well, before we move on to the next segment, uh, Aaron, maybe we'll close this segment with you, because, uh, this is some of your work that's getting some shine at the, uh, at the Open. Um, I'm kind of curious if there's, like, you know, sports that are going to be easier or harder to do this kind of work that you're doing at the Open with. Um, you know, I think a little bit about, like, you know, is this something where theoretically any sport is going to be easy and amenable for the kinds of story generation that you're working on? Or are there certain aspects of, you know, say, tennis or golf, uh, that really make it sort of, like, ideal for your application case? Um, I guess what I'm asking ultimately is, like, did you pick this largely because, like, you love tennis, uh, or are there actual kind of scientific reasons for why this ended up being a really good test case? Aaron Baughman: Yeah.
You know, um, the no free lunch theorem, where, you know, there's not a perfect solution for every problem, I think is applicable here, um, because, you know, every sport has a pro and con, right? And it all comes down to what data is available, and what is the scale, and what's the use case that fans want to see. So I wouldn't say that there's a perfect sweet spot in any one singular sport. There's always a challenge. Yeah.
Um, some of the challenges I think we've already discussed here: it's just making sure that we have meaningful stories and stats that bubble up. And some of the things that we do is we use, like, standard deviations around, let's say, aces, right? Because you can't say that, um, a pure number of aces is significant. It all depends on how many sets have been played, the gender of the match, who's playing. So we have to break that down and apart. And if we go to, like, racing, we go to football, we go to soccer, you know, it's all very similar; you apply the same mathematical techniques to the stats that then can bubble up.
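As a rough illustration of the normalization Aaron is describing, a stat can be flagged as story-worthy when it sits far from a baseline distribution conditioned on the match format; the numbers and threshold below are invented.

```python
from statistics import mean, stdev

def is_notable(value: float, baseline: list[float], z_threshold: float = 2.0) -> bool:
    """Flag a stat when it is more than z_threshold standard
    deviations away from its baseline distribution."""
    mu, sigma = mean(baseline), stdev(baseline)
    return abs(value - mu) / sigma > z_threshold

# Hypothetical baseline: aces per best-of-five men's match. In practice
# the baseline would be conditioned on sets played, draw, and players.
aces_baseline = [8, 11, 9, 13, 10, 7, 12, 9, 10, 11]
print(is_notable(27, aces_baseline))  # True: bubble this stat up
print(is_notable(11, aces_baseline))  # False: unremarkable
```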
Um, one of the other challenges, and I would say one of the other areas that's real exciting, is getting human and machine working together, because there's this pendulum of how creative do you want these large language models to be, as opposed to how prescriptive do you want to be, you know, with this few-shot learning, for example. And we tend to go somewhere in the middle, but it's all experimental. You know, it's almost like theory of mind, right? We want to be able to predict what action a human editor is going to take, so that we can meet their expectations whenever we generate, um, said text. And so no matter the sport, you know, we're always working with the same constant. That's the human. Um, and then the other constant is data, right? We need to make sure that we have access to the data.
Um, but it's fun, right? And it's very impactful. And it's a way to bring people together irrespective of creed, gender, and race. And, um, it's just really exciting to use a lot of Trent's work and a lot of Kush's work and bring it together for the world to see. Tim Hwang: Yeah, for sure. And I think this is where the magic happens, right? It's like, AI can be very abstract for people.
It starts to become very clear if it's got an application like this, right? It's like, oh yeah, I already love this thing, and AI is really helping me, you know, enjoy it more. It makes a huge difference. Aaron Baughman: Yeah, yeah, yeah. And I do encourage you to check out, you know, usopen.org, so you can see our work live in real time and listen to commentary, read the match reports.
I mean, it's fascinating to watch the field evolve. Tim Hwang: I'll introduce this by talking a little bit about Perplexity. Um, so Perplexity is one of these leading companies in the sort of generative AI movement.
Um, what they are largely providing is kind of language models as an interface for search. So the idea is, in the future, you'll be able to have much more sort of conversational search experiences, uh, than you have right now with something like a Google or something, right, where you kind of type in a search query and you get, like, a bunch of responses, um, back. And Perplexity has been, kind of in my mind, one of the best products in the space. It's, like, one of the few ones that I actually pay for and that I actually use on a week-to-week basis. And there was an interesting news story that just popped up in the last few weeks, where Perplexity announced that it would be finally moving towards a model where they roll out paid search. So the background on all this is, in the past,
you have had to subscribe to Perplexity, and you pay them a monthly fee. Um, but now they're saying, hey, we're gonna monetize by allowing people to, uh, buy ads on our platform. So if you go, say, search for, you know, what is the best exercise machine, uh, you might see an ad from something like a, like a Peloton.
Um, and so this is, like, a big shift. I think one of the big hopes about this technology was not just that conversational interfaces would be better, but that we might move away from ads to a world of subscription. Um, and as a result, maybe have a little bit more sort of faith, trust, confidence in the search results. Um, and so I guess I kind of want to ask, maybe Trent, I'll start with you, because you're our new addition to the roster of folks at Mixture of Experts. How do you feel about this? Like, does it make search less trustworthy? Like, should we be concerned about this kind of shift on the part of Perplexity? Trent Gray-Donald: Well, in my view, yes, absolutely.
I'm a big fan of, say, follow the money. And it's all about simple economics and who's incented to do what. And are you the customer, or are you the product? And it's very simple. As you shift to paid search, you become more of the product instead of the customer. And so my usual reaction is that this is not going to bode well for us as consumers. Tim Hwang: Yeah, for sure. I just remember, I mean, famously, there's this essay that was written by, uh, Larry and Sergey, right, who founded Google.
And it's, like, their essay that they wrote, I think, when they were still at Stanford, and they're describing the PageRank algorithm. And at the very end, they're like, and no search engine should ever use ads, because it would be the most terrible thing for a search engine to do. And of course, lo and behold, right, like, Google is, like, a 90 percent ad-based company. But I guess it's very hard, Kush, isn't it, to kind of avoid these incentives, right? Like, the problem with subscription is that people need to pay to use your product. Um, and so it does kind of limit user growth and all these other things. Um, is there any way you think of escaping ads as a business model in the space? Kush Varshney: Um, I'm really not sure.
I mean, but, uh, one thing that I did want to point out, maybe a little bit counter to what Trent is saying, is, um, that the investment into an ad-based sort of approach should actually also lead to investment into certain technologies that do help with trustworthiness. So source attribution is a big problem with LLMs. You don't know kind of where the information that's in the generative output came from.
And, um, if that's part of the monetization, then there will be a lot more investment into the techniques, the scalability, of the source attribution sort of things. And that can actually increase the trust. Um, maybe not necessarily always just for the, uh, the ad-driven sort of platforms, but in general, because the more and better techniques we have for knowing where the information came from, you can then, uh, go back, trace through, um, different possibilities for hallucination, things like that. So I think, uh, incentives can kind of work in weird, roundabout ways. So, um, the ad-driven aspect maybe will or will not do a good thing for trust, but maybe that'll lead to investment into certain things that do.
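As a toy picture of the source-attribution problem Kush raises, one could score each retrieved source against a generated sentence and attach the best match as a citation. Production systems use much stronger methods; this lexical-overlap version, with invented sources, is only a sketch.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))

def attribute(sentence: str, sources: dict[str, str]) -> str:
    """Return the ID of the source sharing the most words with the
    generated sentence: a crude proxy for attribution."""
    sent = tokens(sentence)
    return max(sources, key=lambda sid: len(sent & tokens(sources[sid])))

sources = {  # hypothetical retrieved passages
    "doc-1": "The treadmill remains the most popular home exercise machine.",
    "doc-2": "Rowing machines offer a full-body, low-impact workout.",
}
answer = "A rowing machine gives you a low-impact, full-body workout."
print(attribute(answer, sources))  # doc-2
```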
Trent Gray-Donald: Yeah, I think so. I agree in theory, but in practice, what incentive does Perplexity have in providing attribution in a better way? And do they just start obscuring it? And who's to, like, who's got the leverage to not have it obscured, right? I mean, that's always the fundamental thing, is there's always: we could, but we don't. Tim Hwang: I mean, I think there's also maybe another element, which is, I don't know if you buy the argument that, like, in a world of chat-based search, the trust problem is particularly bad. Right. Because, like, in the past with Google, you have ten blue links.
You can say, well, why are you giving me this link versus that link? Uh, at the very least, we can maybe agree that, like, oh, all the sponsored links should have, like, a little label and a box around them or something like that. But in a world where it's just, like, a paragraph, I guess you can offer citations, but who's going to actually click through those, you know? Um, but I'm curious if you want to kind of respond to Trent's thinking there. Kush Varshney: Yeah. I mean, first I'll respond to Tim.
I mean, I never click on the I'm Feeling Lucky button, because, I mean, I always want to see the ten results. Right. That's right. Yeah. Um, but, uh, yeah, I mean, I think the point that I was trying to make is, um, that whoever's paying for their stuff to appear needs to be assured that, yes, the right thing is coming.
So, uh, if you're still using a language model in between, then even the ad needs to get through the language model to appear in the output. Just making sure that that happened, um, is going to be needed. And, uh, that same technology can then be used to trace other information or other facts or other stuff. So, uh, what I'm saying is, uh, the reason it needs to be there for an ad-based business is because the people paying for the ads need to have some sort of guarantee that their stuff will appear. Tim Hwang: Aaron, I'm not going to let you get away with being quiet on this segment. Curious if you've got any thoughts, uh, if you're on team Kush here or team Trent, or, or neither, I suppose.
Aaron Baughman: Yeah. I mean, I think that the mixing of trying to drive revenue with trust and transparency could be potentially dangerous. You know, it could be used for, um, you know, potentially, um, alternative methods here. But it is about balance. You know, um, I read this article a while ago about Goldman Sachs, where they said that there's too much AI spend and too little benefit, but in order to keep AI as an industry solvent, there needs to be revenue, and there's a large revenue gap, you know, today, and it could potentially be growing. Right. I, um, I know on this, uh, Mixture of Experts, we talked about the $600 billion gap with Sequoia, you know, a while ago, and, and so that really stuck with me.
But on the other hand, we need the trust and transparency to maintain users and demand, because once people lose that trust, they're not going to use these systems, or at least I wouldn't, right? And one point I did want to make is that lots of the users for Perplexity, it seems, are, you know, very highly educated, you know, they're high-income earners, as of now. Um, and so if you can influence, right, that group of people to walk down a certain way, then that can influence, you know, um, lots of other people, because they tend to be sort of the leaders in fields. And so just making sure that, you know, Perplexity, (a), they publish their papers that describe, you know, their algorithms, the systems, that we can easily access and read, much like Google did, I think is important, and creating this digital passport, you know, that describes where the data is coming from, um, so that it's at least available.
Um, and then it's up to us as a group, IBM Fellows, you know, to educate: hey, if you're using these AI systems, you know, you need to do your own due diligence as well. You know, um, still maintain your posture and, you know, your own belief system, um, and understand that you're using these tools to help you, but you still need to be a critical thinker. Tim Hwang: Yeah, that's well warranted. I mean, I think, just to put myself in the shoes of Perplexity, if they were here in this conversation, I think they'd say something like, well, why are we being held to such a high standard? I mean, Google's been monetized on, you know, ads for all these years, and people still use it with no problem.
You know, why is AI sort of, like, special in that respect? Um, and I suppose part of the worry here that, Aaron, you're bringing up, which I think is good, is, you know, this goes to, I guess, whether or not you think that people will be critical thinkers with regards to the technology, right? Like, maybe the AI, uh, makes it all a little bit too easy for us, um, in a way that, you know, maybe, like, actually limits in practice how much people will actually click through to the links. I mean, I certainly don't, right? Yeah. Aaron Baughman: Yeah, I will say that whenever I'm driving and I'm using, like, a map software, whether this is Google Maps, I completely forget where I'm going.
And I probably couldn't retrace where I went, because I don't pay attention, right? So there's a danger of not being a critical thinker, because the information just becomes so easy to get. And I think we all just need to be careful. Tim Hwang: That's right.
Yeah, I had an incident a few weeks back where I, like, left my phone in the restaurant. I hopped into the car and started driving. And then I was like, I don't exactly know how to get back to the restaurant now. It's very embarrassing, so. Um, any final thoughts on this, Trent? Trent Gray-Donald: Uh, I think some really good points there. And I think Kush's point, that the advertisers are going to want to see where their money is going, is actually an interesting loopback, an incentive that pushes towards, uh, being a little more transparent.
But at the same time, like, we're used to Google coming back with a list, and it's up to us. The problem with the chat is that it's more opinionated, and, for lack of a better term, it's got that humanness to it, where it feels much more like somebody's just talking to you. And we all know that LLMs talk with authority, and they talk with tremendous confidence, even when it's not warranted. And so it's going to be interesting to see how we as humans develop the right filters. Like, we all know how to deal with the Google page.
Okay, you scroll past the first four items and then you, or whatever it is. Tim Hwang: Right. Trent Gray-Donald: It'd be interesting to see how we build defenses here, and whether they're harder to build. Tim Hwang: Yeah, I think that is going to be a big open question.
I think we're going to have to learn as a society, right? It's going to just be like, when the first ten blue links emerged, right? That was also a whole process, and so it feels like we're turning that wheel again. Trent Gray-Donald: Yeah, exactly. Tim Hwang: Um, well, great.
I'm going to move us on to our third story of the day. Um, so former Tesla and OpenAI leader Andrej Karpathy tweeted out his love for this product called Cursor, um, and has set off a kind of whole new discourse around the role of AI in software engineering and programming. Um, and the unique thing about Cursor, in contrast to something like Copilot or Cody, which is, like, another company that's operating in the space, is that it's basically, like, an entirely separate product, a standalone IDE. They basically forked VS Code and said, okay, we're going to rebuild it from the ground up using this, this AI stuff. Um, and, you know, I think one of the most interesting parts of the discourse, if you will, if you follow Twitter, it's a waste of time, but, like, if you do follow it, um, is that people were sort of making the argument that,
you know, um, Cursor is particularly interesting because it's trying to get past the kind of paradigm that Copilot set down, right? So when Copilot launched, the idea was, oh, well, autocomplete is the way we should think about kind of AI assistance in software engineering. Whereas Cursor is playing around with all sorts of things, right? They're playing around with diffs on your code, they're playing around with, um, you know, chat interfaces, which you've seen elsewhere. But, like, I think they're actively trying to push beyond kind of autocomplete as a paradigm. And I guess I'm kind of curious, maybe Kush, I'll turn to you: you know, do you sort of buy that? Like, is Copilot kind of old school already? Like, is it already becoming, like, the, you know, version 1.0 of how we thought about AI in software engineering? And, you know, do you think that, like, we're going to look back in 10 years and no one's going to even think about using, like, a Copilot-like interface to integrate LLMs in their workflow? Kush Varshney: Yeah, that's a great question.
And, um, I mean, I think it relates to what we've already been talking about today. So, um, do you trust this thing? Um, are those autocompletes the things that you can verify yourself? Because, um, you shouldn't ever ask a question of an LLM these days, at least, that you don't already know kind of the answer to yourself. But, um, some folks in my team, they've been doing some user studies and asking people what are the features that they would actually want to benefit from in the AI-for-code space, and what we're finding is that, um, it's actually the code understanding problem. So when you're given a dump of a new code base, um, like, just making sense of it, that's the biggest problem. Like, it's whatever, like, thousands or millions of lines of code and all sorts of weird configurations. And let's say you don't even know the language.
Let's say it's COBOL or something like that. How do you just get a sense of where things are, like kind of how this is organized, what it does? And that sort of thing, I think, is, uh, an even more powerful use, because, um, once you're at the level of kind of knowing that this is a line or this is a block that I need to write, um, you're already, like, well versed with what you need to do. So yes, it can speed things up, but even getting started, I think, is, uh, a bigger problem. Tim Hwang: Yeah, it's funny to think, actually, that we've had so much focus on, like, the AI literally generating code, but kind of what you're saying is, like, the future of AI in software engineering is, like, better documentation. It's, like, the thing that is always kind of, like, difficult to do and no one wants to spend time on doing.
Um, Trent, I guess, like, doing the watsonx stuff, I'm sure you're kind of, like, interested in kind of that interface. I don't know if you sort of agree with Kush here, that, like, it's really kind of almost this understanding and documentation layer that ends up being the most important thing. Trent Gray-Donald: Absolutely.
I mean, one of my day jobs is, uh, I'm Chief Architect for watsonx Code Assistant. So, very, very much my day job. And I view this as a very, very young space. And everybody's trying different interfaces and different ways to do it. And, like, I see all the statistics. And the number of people who are using chat, or the chat-like things that Cursor makes easy, is very large, and it's definitely one of the first features asked for. Everybody thinks for a little while they want to do code gen, and there is a constituency that does want to do that, but most people actually revert back to the, can you just tell me what the hell my code's doing and help me put it together?
That is a big part, and then figuring out how to do that and getting the appropriate amount of context. Now we have LLMs with larger context windows, and we're getting better and better techniques to build intelligent prompts. But this is going to keep evolving. And then the bigger one, to be honest with you, is going to be the evolution towards agentic, uh, where it's much more planning and discussing in the large. And the question is, okay, is it going to be human in the loop, or is it just going to be prompt and send it off?
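A bare-bones sketch of the "tell me what my code's doing" flow Trent describes: gather source files, pack as much as the context budget allows, and ask the model a question over that context. The file selection, the character budget, and the prompt shape here are all assumptions for illustration, not how any particular product does it.

```python
from pathlib import Path

CONTEXT_BUDGET = 8_000  # rough character budget; a hypothetical knob

def build_code_summary_prompt(repo: Path, question: str) -> str:
    """Pack source files into the prompt until the budget runs out,
    then append the user's question."""
    parts, used = [], 0
    for path in sorted(repo.rglob("*.py")):
        text = path.read_text(errors="ignore")
        if used + len(text) > CONTEXT_BUDGET:
            break  # real systems would rank files by relevance instead
        parts.append(f"# File: {path}\n{text}")
        used += len(text)
    return "\n\n".join(parts) + f"\n\nQuestion: {question}\nAnswer:"

# Usage: hand the result to whatever model client you have.
# prompt = build_code_summary_prompt(Path("./my_repo"),
#                                    "What does this code base do?")
```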
Tim Hwang: Like, I just want an app that does this, and it just goes and does it. Trent Gray-Donald: And I think, going back to the dignity comment, it's having these human-in-the-loop moments where you have a helper that says, you're trying to do this big thing; I think I've broken it down to these six steps.
Human, do you agree? And you look and say, oh man, okay, it went right off the rails at step four here. Let's, let's fix that up. Tweak, tweak, tweak. Off we go.
And there's the whole Devin and whatnot universe, right? Everybody's experimenting, from really, really tiny little baby steps to the other end, which is, hey, let it all fly. And exploring this problem space is going to be fascinating for the next several years, because nobody's quite figured it out, and the models are getting that much better. So I'm super excited about where this all goes, and I really welcome seeing the exploration that Cursor is doing around innovating on interface. Tim Hwang: For sure.
I think it's, like, yeah, very exciting. And, you know, almost the joke will be, like, everybody in the future will be like an engineering manager. Basically, it turns everybody into an EM, you know, over time. Um, I guess, you know, Aaron, I don't know if, are you a VS Code guy? Like, I'm kind of curious. Like, I think one of the bets of Cursor, which I think is very intriguing, is, you know, people are very comfortable once they've set up their IDE.
Like, it's almost like setting up your office. Like, you want it to be comfortable, and you want to know where everything is, and you want the bindings to be a particular way. It's kind of wild what Cursor is kind of attempting in the market, which is, well, these AI features will be so killer that you would be willing to abandon all that, right? Or at the very least, like, get over the hump of having to, like, spend an afternoon just kind of twiddling it to get it comfortable. Um, yeah, I'm kind of curious, I mean, as someone who, like, builds these systems, works on these technologies, like, you know, is that prospect attractive to you? Like, have you tried Cursor? Would you jump onto Cursor? Um, I actually don't know what your daily setup looks like, but part of it to me is just, like, is that value proposition strong enough to get people to make that shift? Aaron Baughman: Yeah. I mean, so, yeah, so, you know, I write code every day, and the VS Code IDE is, you know, what I use, my tool of choice. First, I'm a big fan of paired programming and paired testing, of having multiple people work together, maybe on a singular task, or a group of people working on an experiment, because it does a couple of things.
One, it improves code quality, engineering quality. That's the scientific process. But it also creates long-lasting teams, right, that stay together for years and years. And continuity of people on a team, I think, is important. Um, and so relegating software and science to maybe prompt engineering, to me, has some cons.
Um, of course, the pros are, you know, it accelerates productivity, you know, it can help us code complete, it can create different types of comments so we can understand code. So there's certainly a place. However, I do think that we want to make sure that our engineers and scientists still understand code, can write algorithms, can create code for new programming languages, uh, new compute paradigms, for example, quantum, right. That's a new paradigm where I don't think an LLM, uh, well, maybe with Qiskit, may be able to create, you know, Python code.
Uh, but there's all these new languages that are popping up, and, you know, LLMs have to be trained on something, you know, on some kind of pile of data. And if a human can't create that pile of data in a trustworthy way, then I think some of the creativity and skill of the engineer, you know, might be lost. So, you know, the hype around Cursor, I think, is real, and it's a very powerful product. But I would encourage, uh, perhaps, folks to say, let's put a time limit on the amount of which we can use some of these tools, so that we can maintain our sharp blade, um, you know, whenever we really need to do some engineering, so we don't all become just prompt engineers, right? But that's sort of my caution and, uh, my thought. And yes, I do use, you know, watsonx Code Assistant, um, you know, pretty much every day through the plug-in on VS Code, and it helps a lot. It's really good.
It creates, you know, different types of comments. I also use, um, Google, right? I'll go on Google and I'll use the gen AI feature to give me ideas on how to write code better. But I always try to limit myself and my team to say, hey, let's do 20/80 or 50/50. And let's make sure we're still communicating as a team. Um, you know, so that human interaction, to me, um, is important.
Tim Hwang: Yeah. That implies almost two really interesting things. Like, one of them is, in the future, there'll be almost, like, a screen time for these features, or it'll basically be, like, you've hit your limit for the week. No more, no more AI for you. Um, the other one also is, like, I think, you know, particularly, there's been some discussion about, oh, well, are these systems eventually going to get so good that it actually, like, replaces a lot of jobs of engineers? But it almost feels like there'll be, like, this constant pressure to learn more and more obscure languages.
Because those will be the areas that basically AI can't touch, because the data sets are more obscure. Um, which I think will be really interesting to see. Trent Gray-Donald: There are definitely, I mean, there's no surprise or secret that IBM's been around for a while and has created languages that may be a little long in the tooth, like COBOL or PL/I or whatnot. And sure enough, the amount of data, the amount of code on the internet that most of these models are trained against is very, very small, and they can't do these languages at all.
And so one of the things that we've done, of course, is we have more COBOL code, we have more PL/I code, we have more of these things, so we can build better models for that. And companies are approaching us with, hey, we built this weird esoteric language, can you help us do the same there? So while there is a barrier, wherever there's a barrier, there's typically financial incentives to do something about it. So esoteric languages are going to be a bit of a barrier, especially at the free. It's going to be tough. Tim Hwang: Well, great. I want to tie up today, because we actually have a very unique pleasure of having the three of you on this episode. As you may have overheard when I introduced these three guests,
they are all IBM Fellows. Um, and for those of you who don't know the IBM Fellows program, I didn't know much about it, but it's a crazy program. Basically, the idea is to bring together some of the brightest minds in technology, um, to work on projects at IBM. And I think they've included, I was looking on the website, a U.S. Presidential Medal of Freedom winner,
five Turing Award winners, five Nobel Prize winners. Um, and I figured we'd just kind of take the last few minutes for people to hear a little bit about sort of the program, what you've learned, um, and, you know, where you think the program might go in the future. And I guess, uh, Aaron, maybe I'll toss it to you, because you kind of kicked off our first session, so I'll bring you back into the conversation here. But I'm curious about kind of how your experience with the Fellows program has been, and what you've learned. Aaron Baughman: Yeah, I mean, becoming an IBM Fellow is one of those seminal moments, uh, where it's, um, it's very surreal, you know, when it happened. And, you know, my first thought was, wow, you know, I really hope that I can live up, you know, to those who came, um, you know, before me, and that I can also be an example to those who will come after me. Right.
So, you know, I'm sort of, you know, in the middle, and I want to make sure that I can keep the projection of what's happening and what's going to happen. And I take that with a big responsibility: that we both need to ensure that we keep up to date with science and engineering, push it forward in a responsible way, but also, um, usher in the next generation of IBM Fellows that will come after us. And the process, you know, of becoming a Fellow, um, I found very rewarding, because it helped me at least to reflect back on all the people who helped me achieve something that I didn't know was attainable.
Um, and then being with Trent and Kush, you know, is one of those things where it's, like, wow, you know, I always knew and followed their work. And I did not know they were going to be IBM Fellows until it was announced. And so it was just, it was great to hear that, wow, I'm in the same class as Trent and Kush. You know, it couldn't be better, you know, in my view. Tim Hwang: Yeah, that's great. Uh, Trent, Kush, any other reflections? Trent Gray-Donald: I think it's very important that companies in the technology space have leaders who are, effectively,
pure technologists, and can be the right balance to the business at times, where one of the unspoken, or what's actually spoken, things about Fellows is they are supposed to be a little bit of a check and balance on what we can or what we should be doing in a given space. Tim Hwang: Right, you're, like, the keepers of the technical flame, you know, yeah. Trent Gray-Donald: Because sometimes that's necessary. And, uh, it's a huge honor to have become a Fellow. And definitely, the number of people who've come before, that I very much look up to, is very large. Kush Varshney: Yeah, no, I mean, it is extremely humbling.
Uh, if you look at the list of, uh, all these people, as you mentioned, Nobel Prizes and, uh, inventing all sorts of things that we take for granted, whether it's DRAM or, um, I mean, all sorts of different things. Um, and it's just crazy to be thought of, uh, in that same light. And, uh, I mean, it's been a few months now, um, I guess, uh, for the three of us. And, uh, I mean, one thing that I've learned is, just in some places that I've traveled, both within IBM and outside, people do look up to this position; it is something that people look to as an inspiration. And, um, I hadn't thought of it that way. And, um, I think it's, just like, uh, Aaron said, a responsibility, and, uh, like Trent said, it's, I mean, um, a way to have this check and balance as well.
So all of that in one role, I mean, and it is just crazy. So, um, yeah, I think, uh, the three of us are going to do our best and keep this tradition alive. Tim Hwang: That's great. Well, look, it's an honor to have the three of you on the show. Um, I hope we'll be able to get all three of you back, um, on a future episode of Mixture of Experts. Um, but that's where I think we'll wrap it up for today.
So thanks, everybody, for joining, um, and thank you to all of you listening, um, joining us again for another week of Mixture of Experts. Um, if you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, and podcast platforms everywhere, and we'll see you next week.