AI at Wimbledon, ChatGPT for coding, and scaling with AI personas
Hello, and happy Friday! You're listening to Mixture of Experts. I'm your host, Tim Hwang, back from the kitchen. Each week, Mixture of Experts brings together an amazing group of, well, experts to tackle, debate, and explain the biggest trends in the fast-moving world of artificial intelligence.

This week on the show we're going to cover three stories. First, with the Euros, Wimbledon, and the Copa América all happening in a matter of weeks, we talk about AI and sports. Will AI have a role in shaping the nature of the game, and if so, how? "They might just see gameplay, whereas we see data, and an opportunity to derive these types of insights, to help find that signal in the noise, to provide those serendipitous moments that connect people to the game." Second, a new study out in the IEEE Transactions on Software Engineering reveals new data about ChatGPT's performance on coding tasks. What does it tell us about the future of coding assistants? "I worry a bit about this over-reliance on AI tools for problem solving, especially as you're learning in the early stages of programming." And third, a fascinating paper out on arXiv, entitled "Scaling Synthetic Data Creation with 1 Billion Personas," from Tencent's Seattle lab. Does it provide a way forward for resolving data bottlenecks, and what can we use personas for in the future? "How confident are we about the coverage of these 1 billion personas, and do the underlying large language models really understand being a Maasai warrior?"

As always, I'm joined by an incredible group of panelists who will help us navigate what has been another action-packed week in AI. Today we've got three panelists: Skyler Speakman, a senior research scientist at IBM; Kaoutar El Maghraoui, a principal research scientist working on AI engineering at the IBM AI Hardware Center; and, joining us for the very first time, Aaron Baughman, IBM Fellow and Master Inventor. Welcome to the show, everyone. [Music]

First up: it's been a very busy season if you're into watching sports.
Wimbledon, the Euros, and the Copa América are all happening basically this week and last week. I know we've talked on the show largely about AI as a kind of internal business process, but particularly with all the sports in the air, I've been thinking this might be a really good opportunity to talk a little bit about the ways AI might reshape sports itself. Aaron, I want to start with you as our new panelist, just to pick on you a little: you were actually at Wimbledon. As someone who works in AI day in, day out, why did you go, and what did you think? I'm sure you experienced this tennis tournament thinking, oh, actually, there might be a lot of ways for AI to apply. So, just as an initial place to start, give us the report on Wimbledon.

Yeah, it's always fascinating to watch how we operationalize a lot of these AI techniques, in this case with our partner, Wimbledon, and I was lucky enough to go. We've been doing this with them for almost 30 years now. This year we focused a lot on generative AI, but we also don't want to forget about classical AI, because both of them are very important, and we use many different techniques to do it. But to actually be there during the tennis, in the thick of it, is very interesting, because there are many aspects: how is the technology performing, how is the consumer acceptance of the tech, and then how is the back-office acceptance as well. It's always nice to watch people around the world use it. We get billions of users every single year who use our systems. You
know, in this case for the generative AI, just at the halfway mark people had spent 2,319 hours looking at and reading the generative content that we produce.

And if I can ask you to back up a little bit: our listeners won't necessarily be familiar with what you were working on. We'd love to hear a bit more about the technology you were mostly focused on this year and what people were doing with it.

Yeah, so we're looking at bringing the game to you in a personalized way. What we like to do is mix in different aspects: we like to rank players, we like to predict who might win a match, and then we want to create content to catch you up, so that if you join mid-tournament, we have these digestible nuggets that consumers around the world can view to understand what's happening in the match. And it helps, because they might just see gameplay, whereas we see data, and an opportunity to derive these types of insights, to help find that signal in the noise, to provide those serendipitous moments that connect people to the game.

Yeah, for sure. I was talking with a friend recently about this. As someone who got into football (soccer, that is) during the pandemic, my experience of the sport has largely been a visual experience, and it's so interesting to me that, having never gone to a game, I'm a huge fan, I watch all the time, but my primary experience is intermediated through social media and what I see on TV. It sounds like there's been a similar exercise to figure out how AI plays a role in that interface, from the fan and the viewer, to get more out of the game. And I guess, Aaron, I'm
curious about any lessons learned, things you thought worked really well in this work.

Yeah, so we used lots of different sensors around the court to gather data. We use the Hawk-Eye system, which has up to nine different cameras that track the ball and the players, and we get all sorts of stats streaming to us. But there's just this deluge of information, and it's hard for people to comprehend it. So one of the lessons learned, I think, was to create these digestible narrations, pre-match and post-match, about the players, so people can go in and quickly read up on their favorites. Another aspect we learned is that sometimes it's nice to inject information that somebody wouldn't ordinarily know about, or read about, or even think about. It's nice to watch that happen and spread. So that's one of the pillars. Very quickly, the other pillar would be on the operations side: it's always great to have humans and machines and algorithms working together to create a symbiotic experience that can be used whether you're on mobile or on site as a fan. So it's really evolving into this sort of Moneyball 2.0.

Yeah, I think the power of AI in sports really is transformative, and AI here plays a multifaceted role, not just on the commentary side and the user experience. There are lots of different applications where we can see the power of AI: things like performance analysis, and athlete training, where using wearable technology you can have sensors that collect data on the athletes' movements, their biometrics, etc., and analyze the data to provide insights into where
the athlete can improve. Video analysis: analyzing footage of training sessions and games, assessing techniques, identifying weaknesses. Game strategy is also a big application here. Health and injury prevention: algorithms can assess and diagnose injuries through image analysis. Fan engagement and experience, of course, is the fun part of it, like Aaron was talking about, with the personalized content, with chatbots and virtual assistants, and even augmented reality. You can have a truly immersive experience with AR and VR: imagine watching a game as if you're there. I think that can be a lot of fun. And with all the game and event management, there are also areas where you can use AI for scheduling, logistics, crowd management, ticketing. So there are lots of areas here where AI, and also gen AI, can really play a transformative role, and I can only see this growing.

That's right. I love the idea that in the future you'll be able to get whatever commentator you want, generated algorithmically on the fly. Like, I want George Washington to narrate my sports game, and having that audio generated on the fly would be really interesting. I also think this point about the back end is really interesting: all the operations it will help with. And Skyler, I know you were interested particularly in the idea that teams that can really manage all this data will have a huge advantage in the future, and it'll be a wonderful world where someone managing a top tennis player will also be trying to get H100s to run their own fine-tuning runs. So maybe two questions along
those lines, both for Aaron. First, do you know whether any of the tennis players have used this? Are they looking at the narratives that were generated? Has it reached the player side? I know we're talking about consumer-facing tech at this point, but have any of the players commented? And the second one is: when is IBM going to bring this technology to esports? The data there is almost already in a more usable format, but there can be just as much hype and excitement and drama in some of these newer esports, and I think there's a great opportunity to bring this technology to electronic gaming. That's one of my favorite pastimes. So, Aaron, any comments on those?

Yeah, so first, great questions and suggestions. Do players actually use some of our information? It's really funny: some of them do, some of them don't. Some of them are very superstitious: if they were to look at one of our predictions, it would mentally affect the way they play the upcoming game, so some coaches do not let their players look at some of our features. And some properties don't even allow us to show any sort of predictions during a game, or sometimes even gen content, because it might influence play. But on the other side of the coin, some players have used it, and they do look at the stats that we boil down. We also had a project with the US Open, where we worked with some ATP players: we would help them train, so they would see videos of themselves playing, and we would find highlights of how they played. It was like a dashboard, a developmental center. So there's that aspect. And I was curious: do any of you play tennis or sports, even if not very well,
and would you use these kinds of insights?

I think I would try to use them, especially to help with my performance. I'm not an active sports person, but I'd hope they could help me improve my technique and things like that. But another question: is there any downside to this? I worry a little bit about bias and fairness with certain athletes. These are always red flags with the use of AI. AI systems can inherit biases present in the training data; could this lead to unfair treatment of athletes? For example, biased scouting algorithms might overlook talented individuals from underrepresented groups. So there are some dangers. I don't know, Aaron, is this something you think the current algorithms take seriously, or is it still early on, and we're just evaluating the technology right now and maybe starting to look seriously into these concerns: privacy issues, bias, fairness?

Yeah, fantastic topic, one that could take hours to talk through. So yes: fairness, transparency, and explainability, which in gen AI we might approach with chain of thought, to understand what's output from the models. But a quick story. In tennis we used to measure, and still do measure, the excitement of videos. We'll look at signals like sound, gestures, and score, and we quickly found that somebody who's an amateur playing golf might have a really exciting shot, but because they're not a very popular player, there aren't a lot of people around them to make a very loud cheer, whereas a top-five-ranked player might make a routine shot that isn't that exciting but gets a huge cheer, because there just happen to be a lot of people there. So we take that into account, and we'll debias with
postprocessors, based on the different restrictive traits that we have, because it's real, and we work to make sure we can debias these cases. There are many debiasing methods, and in the gen AI space I think we're just beginning on that front. And a question for you all: we try to balance creativity with factualness in this generative content. How do you think the field can do a better job at that, with respect to hallucination when you're more creative, versus needing fact checkers, and so on?

I think one thing that will emerge, if we continue to collect this data, is that you'll be able to ask how exciting the current top star was when they were just starting out. You'll be able to go back in time ten years and look at that top star when nobody was following them, but they were still making the great shots. We probably can't do that now, because we don't have as much historic data, but eventually you'll really be able to watch entire careers play out over time. At least with your example of the player who's not popular now but made a great shot, you'll be able to ask that same question. Michael Jordan didn't make the varsity team as a high school sophomore: that type of perspective. But we're not going to be able to do that with the snapshots of data we have currently.

Yeah, and I'm hoping some of these tools will actually help teams see around corners. Some of the most interesting times in sports are when someone comes up with an entirely new strategy that totally changes the nature of the game, and hopefully with data there's a chance to identify a lot of things that we might not have otherwise in the past.
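A minimal sketch of the crowd-size debiasing idea Aaron describes: a raw excitement score is blended from signals like crowd sound, player gestures, and point importance, and a postprocessor then rescales the sound signal by crowd size so an amateur's great shot in front of a tiny crowd isn't buried. All of the field names, weights, and thresholds below are illustrative assumptions, not IBM's actual pipeline.

```python
# Sketch of crowd-size debiasing for excitement scoring. The signals,
# weights, and normalization scheme are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class ShotSignals:
    cheer_db: float          # peak crowd noise, in decibels
    gesture_score: float     # 0..1, e.g. from pose analysis
    point_importance: float  # 0..1, e.g. break point vs. routine rally
    crowd_size: int          # estimated spectators near the court

def raw_excitement(s: ShotSignals) -> float:
    """Naive weighted blend of the observable signals."""
    loudness = min(s.cheer_db / 100.0, 1.0)  # clamp to [0, 1]
    return 0.5 * loudness + 0.3 * s.gesture_score + 0.2 * s.point_importance

def debiased_excitement(s: ShotSignals, typical_crowd: int = 5000) -> float:
    """Postprocessor: rescale the loudness term by crowd size, so a great
    shot cheered by 50 people can outrank a routine shot cheered by 15,000."""
    loudness = min(s.cheer_db / 100.0, 1.0)
    crowd_factor = min(s.crowd_size / typical_crowd, 1.0)
    adjusted = min(loudness / max(crowd_factor, 0.05), 1.0)
    return 0.5 * adjusted + 0.3 * s.gesture_score + 0.2 * s.point_importance

# Amateur: spectacular shot, tiny crowd. Star: routine shot, packed stadium.
amateur = ShotSignals(cheer_db=55, gesture_score=0.6, point_importance=0.5, crowd_size=50)
star = ShotSignals(cheer_db=95, gesture_score=0.3, point_importance=0.2, crowd_size=15000)
print("raw:", raw_excitement(amateur), raw_excitement(star))
print("debiased:", debiased_excitement(amateur), debiased_excitement(star))
```

With the raw blend, the star's packed-stadium cheer wins; after the crowd-size postprocessing, the amateur's shot ranks higher, which is the inversion Aaron's anecdote describes.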
So I'm going to wrap up this section. Aaron, maybe one question to throw back to you: it seems like you've done a bunch of work in tennis, Wimbledon, the US Open, and I'm curious, as an AI researcher, is there a dream sport you'd really want to apply some of your techniques to, or one that hasn't really been investigated? I assume one of the reasons for tennis is that it's a lot more controlled: you can set up a bunch of cameras, and there's a defined place where everything happens. But from almost a CS point of view, what's the next most exciting sport to get AI-ed, and why?

Yeah, so we've focused a lot on tennis and golf. We're doing some fantasy football, which borders on e-gaming, and we did do some e-gaming with Overwatch, which was very interesting. But I think it would be exploring the intersection of gaming with that of a sport, because I really enjoy the challenge of e-gaming: the physics engine can change, you can get new skills and new abilities on the fly, you can get power-ups. It's different, and your models have to adjust very quickly, maybe quickly online-learn a new hero that's transported into the game. So that's interesting. And in a real, live aspect, one of my favorite sports to watch is basketball. I would love to look at that and analyze more of the team aspects in play, and then also look at the Olympics. I saw an article where, I believe it's NBC, they're going to be using generative AI to recap some of the matches, so I'm very curious what they're going to do and how that's going to be received by the population. So that's my
answer, and I'm sticking with it. [Music]

There's been, of course, a lot of hype around the ability of generative AI to assist with software engineering, and a lot of excitement about the idea that at some point AI might just do the coder's job entirely, end to end. Copilot, one of the most successful products of the gen AI era, is a great example of this. And there's a great paper that came out just last week in the IEEE Transactions on Software Engineering, entitled "No Need to Lift a Finger Anymore? Assessing the Quality of Code Generation by ChatGPT." The idea is basically to say: okay, we know these systems can code, but how good are they at it? So it benchmarks ChatGPT against a number of different coding challenges to assess how well it generates code. There are two interesting findings I wanted to discuss with the group today. The first is that these coding platforms, ChatGPT in particular, turn out to have huge variance in their ability to do coding tasks: for tasks in this benchmark labeled hard, it only gets them right about 40% of the time, while for easy tasks it's up to something like 89%. And I'm curious, for folks on the call who all code and presumably use things like Copilot: there's been a narrative that these coding assistants are basically just Stack Exchange++, that they just help you search the internet and get an answer for easy things. Do you buy the skepticism of the paper, which is to say that for the really hard tasks we're still just not seeing LLMs or generative AI really advance the state of the art, or accelerate our ability to solve truly hard CS problems? I'm curious what you all think
about that: whether it's just a temporary thing, or a ceiling we're all running into. Kaoutar, do you want to respond to that? I know you might have a view on this, particularly when it comes to the coding tasks and their relationship to hardware.

Yeah, I really enjoyed reading the paper. It's a very nice study evaluating ChatGPT on coding challenges, which revealed mixed performance, like you showed, influenced also by the training data cutoff and the inherent limitations of existing models. For simple tasks it's doing fantastic, and I think it'll continue to do fantastic. For complex things it still has limitations. Gen AI today still struggles with understanding the broader context of a project, which leads to suggestions that don't really fit the overall design or architecture, especially when you have complex system designs with multiple components and multiple APIs that need to interact with each other. That requires reasoning, and there are still limitations with gen AI when it comes to reasoning. So the complexity of the context, the contextual understanding, and the integration challenges, integrating multiple components together and the interfaces through which they communicate, are still a bit of a challenge for gen AI. As we improve the contextual understanding and the accuracy of gen AI models, I think we'll see better results, better integration, and better user experiences with these coding challenges, but there's still a lot of research that needs to be done. And things like best practices in software engineering, clear system design, and prompt engineering can also enhance these tools. But I think we're still in the early stages.

A fun contextual
story for this: we just hosted about 40 high school students here at the lab, to show them what industrial research looks like, and they were asking our software engineers questions: do you use a copilot, do you use generative AI in your code? And the answer was, a little bit, but to a person, everyone down our line kept referencing Stack Overflow. So I think your comparison is right: what you really want is a nice integration with the tools we're already used to, like a Stack Overflow sitting inside your IDE, letting you code so much more smoothly. It was a great example, because the high school students weren't as familiar with this thing called Stack Overflow, and our software engineers were saying, no, exactly, this is what we use. That was a point where the old and new hit together. Those types of resources are so incredibly useful, and it will be interesting to see how much these code generators are really taking from Stack Overflow. In that paper they did a really cool analysis: they broke down the coding questions from before, I think, 2018 and the ones after 2018 (don't quote me on the date), and the model did very well on the old questions and very poorly on the new ones, suggesting the LLM is not keeping up with the most recent content and experience on Stack Overflow. It was really cool to see that breakdown: the LLMs doing great on older, established questions, perhaps with answers already sitting on Stack Overflow, and not so well on the more recent coding challenges that came up after training. I think we're going to keep seeing these comparisons between what exists on Stack Overflow and what's been incorporated into the LLMs, but it's a really cool place to see it play out.

Wouldn't that require
maybe frequent retraining or readjustment of the models?

Yeah, I think it can be a solved problem. I was just giving a hats-off to the researchers who understood that nuance in the coding ability of the models and said, wait a minute, this model was trained at roughly this time; let's see if we can ask it coding questions that didn't exist, at least in the common Stack Overflow universe, at that time, and then give the performance breakdowns. But yes, retraining, and constantly taking new information into account, would be a way to address that.

Yeah, and I think this is one of the really interesting challenges it brings up, because pre-training, updating the training data, is actually a cost-intensive task: you have these models that don't get pre-trained every single day. So there's this kind of weird thing the paper suggests, which is that if you're working with older languages, you're actually going to be in trouble with these models, and the best way to survive, to avoid being automated, is to migrate. There's more pressure to adopt new languages, and those new languages are simultaneously the ones the model isn't very good at assisting with. So it imagines this kind of bifurcated world: a bunch of older systems that AIs can basically automate most of the coding for, and then a frontier of code that essentially can't get automated away. And that has really interesting implications for where we'll see the impact of the technology.

I think with this dependence on AI, this over-reliance on AI tools, the danger could be a decline in problem-solving skills. Would
that be a problem? Especially for the young generations of programmers: if, for all the simple tasks, you go ask ChatGPT to write the code for you, how is that going to impact learning? Usually you learn coding from these simple examples, and then you build on top of that to get to more complex problems. So I worry a bit about this over-reliance on AI tools for problem solving, especially as you're learning in the early stages of programming. And as you build, maybe that's going to require new skills that we need to develop as programmers, coders, and software developers of all sorts: figuring out how to use these tools more efficiently, and how to know whether a solution is plausible or whether I need to change and tweak it. Another thing I see is: what does it mean for debugging when there are issues? If I've relied heavily on these code pilots to write code for me, will I be able to debug properly when things fail, or should I also rely on AI to help me debug? So it's interesting to see how the interplay of all these different things will come into play, and the role of humans and coders here. I don't know all the answers; maybe you have some insights. There is a downside to this, and of course lots of advantages in terms of enhancing coding productivity, but there are challenges we have to think about as well.

Yeah. I've seen in the field that many people will consult different types of code assistants, because there are many different models specialized around different types of tasks, and this agentic architecture, where you have a mixture of experts that you bring together, such as many different large language models, is almost like coding by crowd, in an automated way. So now it seems like these developers and scientists and
operations experts have to have the analytical capability to discern the best technique among these different opinions, because you're going to have many different opinions and coding styles, and perhaps even languages, being sent to you. One important principle I always try to follow is that there's no free lunch: there's no perfect algorithm suitable for solving every single problem; it depends on the context of the problem at hand. With that in mind, I think the human really understands the context: their audience, what they're trying to build, where they can deploy it, whereas these code assistants, at least today, know a limited amount of the context, and therefore it's important to get multiple large language model opinions on what you should or shouldn't do. One area I have a lot of interest in is automatic transpilation of code. Say you're running an application in one language, let's say Python; maybe it could be transpiled into Rust on the fly, where it might be less memory-intensive, or you could have a human look at it and say, yes I agree, no I don't, and take that analytical approach. But it's all emerging, and I'm really excited about the future and what we at IBM can also do with InstructLab, using skill building to help with this, as was mentioned before: the timeliness of the data the model can understand, with in-context learning, fine-tuning. There are many approaches, and there are going to be many more in the future.

Yeah, I think one big thing LLMs will probably play a big role in is documentation: the average state of documentation for code is very, very poor, and I feel like one enormous use
case, even outside of coding assistance, is just taking a piece of code and making sure it's well documented. That will be a huge improvement in quality of life for this kind of work.

I love that, because documentation is always an afterthought, and you never have time to do it. That would be a huge help.

Yeah, that's right. It'll be so funny if the biggest thing isn't automated code; it will just be making sure someone's doing a good job documenting everything. [Music]

Well, great. I want to take us to our last topic of the day. There was another wild paper. If you're a weirdo like me, just browsing arXiv for fun, this is one of the papers that popped up recently and caught my eye, and the way it did that is basically by doing SEO with the title: "Scaling Synthetic Data Creation with 1 Billion Personas." With a name like that, you've got to click it, you've got to download it, you've got to read it, and some of us even need to print it out, like Skyler here. It's actually a pretty simple idea, but I think this particular group of experts would be really good to tackle it. To give the overall background: there's often a need to generate synthetic data, because collecting real data from the real world is very expensive and comes with all these operational difficulties, so people are always trying to come up with ways of creating data from scratch that they can generate on the fly and use to train their models, because it relaxes that bottleneck. These researchers out of Tencent's Seattle lab said, well, maybe one fun way of doing this is to instantiate what they call personas, which is like a personality: your job is as a dog catcher, or your job is as a professional coder at IBM. And
their observation is that we can get these different personas to do different tasks and output data for us, and those tasks will generate very different responses. It turns out that if you ask a dog catcher to generate code for you, it will look different from the code you get if you prompt the model to say you're an expert coder. The argument they make is that with all these personas we have a scalable way of generating lots of training data, and they run a couple of experiments showing that you can use the synthetic data to train an LLM to do math problems effectively. And maybe just to kick it off — Kaoutar, with you on the line — I think the big question here is: nowadays, how much is compute the bottleneck, and how much is data the bottleneck? Because this feels like a world where, if you just have lots of compute, you can generate all the data that you need. But a few episodes ago we were just talking about how difficult it is to come by compute, so I'm really interested in your take: what is the bottleneck right now in the machine learning workflow?

Yeah, that's a very good question. Of course, with generative AI, compute is a big bottleneck right now — especially the matrix multiplications, which take a huge amount of compute with current accelerators and hardware. As for data, it depends on the industry. In certain industries we don't have much data — especially areas like Industry 4.0, where you have machines and you need to understand their operations. Sometimes there's a lot of noisy data, or you have sensors and you probably haven't collected data from those sensors for an extended period of time — long enough to train a good model to predict anomalies or do things like that. So in certain industries there's a huge lack of data. When it comes to text, on the other hand, we have an abundance of it online right now, but that text is sometimes not properly formatted, or there's a lot of noise and redundancy in it. So I see both of them as bottlenecks; it depends on the industry, the sector, the use cases. Data can be a huge bottleneck if you don't have enough of it, or you have tons of noisy data and need to curate the right data and the right context to build the model — and in that case, synthetic data generation can be a huge help. Of course, compute is still a bottleneck, especially with the hardware shortages we have in accelerators and with these large models. We've talked in other episodes about new approaches — matmul-free approaches, in-memory computing, neuromorphic computing, and so on — trying to reduce that compute bottleneck. So I see both as bottlenecks, depending on the context, the use case, the industry.

Yeah, for sure. My reaction to the paper was basically that it just makes all the existing bottlenecks more bottlenecky, right? It turns out the great way to get data is more compute — so now there's even more pressure, people want even more chips.

Exactly — it's a chicken-and-egg problem.

Yeah, that's right. I guess, Skyler, I'll turn to you as someone who has printed out the paper — yes — although as someone who has printed out many papers and not read them; I don't want to imply that you've read it. But is this it — have we solved the synthetic
data problem? How do you like this approach — what do you think about it?

I do like the approach. It's quite creative, and they scaled it in a way I probably wouldn't have. I've actually used ChatGPT to write bedtime stories for our kids, right there alongside them, and case in point: they play Minecraft, so they'll basically say "write a story, but make it about Minecraft." You've now essentially created the persona of a Minecraft player responding to the prompt. We've been doing that at an individual scale, and this paper has taken it up to the billion-persona level, keeping track of all of those generated outputs in order to try to get that diversity. Very cool from that angle. But I want to spend a bit of time on that very important word at the end there: diversity. How confident are we about the coverage of these one billion personas, and do the underlying large language models really understand being, say, a Maasai warrior? The Maasai are a tribe here in Kenya, and yes, you can ask the large language model to take on that persona — but whether the generated output from the persona of a Maasai warrior matches reality, I don't know how well they covered that. Still, hats off to the authors for the idea of taking generated text from all of these different kinds of personalities and actually putting a number behind it: a billion. Very cool. But we have not solved the synthetic data question yet, and I think the most obvious question that comes up is: how do we know those personas are well represented in the underlying model? Those are some of my thoughts on that piece.

Yeah, for sure. And we're in kind of an interesting place — and I see, Aaron, you're about to come in — where the only way we could validate whether these personas are accurate is to have real-world data on them. So there's this weird chicken-and-egg issue: I don't know how valid they are, but in order to validate them, we might well need the very data we're trying to generate. Aaron, do you want to jump in?

Yeah. I just saw a really interesting stat: the average human can read about — what is it — a million words in a year, and these algorithms can read about six orders of magnitude more in a single month. They're just thirsty for this data. And that projects — don't quote me on this — to around 2030 or 2032, when we're going to run out of useful data in many different domains, so being able to synthesize data is very important. But if you stratify the data in so many ways, as in this paper, the danger is: are you watering down all the different personas? Could we agglomerate them — prune them a little bit — because they're not really that much different? After all, a billion personas is almost close to the human population of Earth. So that's one area. And then I do think what would be interesting to hear about is the notion of a Turing test 2.0.
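The pruning idea Aaron raises — with a billion personas, many descriptions are bound to be near-duplicates — can be sketched in a few lines. This is a purely illustrative, hypothetical sketch, not anything from the paper: a real system would compare persona embeddings at scale, while here we use simple word-overlap (Jaccard) similarity just to show the shape of a greedy deduplication pass.

```python
def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two persona descriptions (0..1)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def prune_personas(personas: list[str], threshold: float = 0.5) -> list[str]:
    """Greedy dedup: keep a persona only if it stays below `threshold`
    similarity to every persona already kept."""
    kept: list[str] = []
    for p in personas:
        if all(jaccard(p, k) < threshold for k in kept):
            kept.append(p)
    return kept

personas = [
    "a maintenance engineer at a chip factory",
    "a maintenance engineer at a car factory",  # near-duplicate, gets pruned
    "a Maasai warrior and cattle herder",
    "a dog catcher in a large city",
]
print(prune_personas(personas))  # the near-duplicate engineer is dropped
```

Raising the threshold keeps more personas; lowering it prunes more aggressively. Of course, the deeper question Skyler and Aaron raise — whether the surviving personas actually cover reality — is not something a similarity score can answer.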
Right — how do we ensure that these new personas really do pass muster? That they actually produce the transparency we need — explainability, chain of thought, fairness? Because we are going to be splitting up data in different ways, and there could be side effects. So I was curious what folks thought.

I think the idea of trusted personas, or pruning personas, that Aaron mentioned is very important. Can we distill all of these personas into a few that we can trust — the ones that give us the best accuracy? Because we're talking about a very large number of personas here, billions. Another thing: how do we tie this into real problem-solving scenarios or industrial use cases? What does that mean if, for example, I'm trying to build a foundation model for factory failure diagnosis? Is there a persona for that? Can we talk about different skills — the engineer, the maintenance person, the chip designer? All of those could be personas that we bring together so they contribute different skills and collaborate within the LLM or foundation model to solve a particular problem — like having multiple experts working together. So I think the idea of taking this to the real world to solve real problems could be profound, with lots of implications. I like the scaling and the diversity aspect that Skyler mentioned, but this needs to be validated, especially for solving real-world problems.

Final thoughts? — I think it's very exciting where we are and where we're going. The combination of generative AI techniques with classical techniques is critical. I've seen the term "AI sandwich" floating around, where you might use neurosymbolic pieces around these generative AI pieces — neural networks have been around for a long time. And one last thought: I think Mother Nature is the ultimate teacher, and we have a lot to learn from our own brains. I'm excited about what's next.

Great, thank you. Well, as always, there's more to talk about than we have time for. Kaoutar, Skyler, Aaron — thanks for coming back on the show, and we'll hopefully have you for a future episode. Thanks for joining us. If you enjoyed what you heard, a reminder as always that you can get us on Apple Podcasts, Spotify, and podcast platforms everywhere — and thanks to you all out in radio land.