Computer vision — is it cool again? Now we can take that and amplify it across different problems to be solved here. So, is friend.com AI hardware's breakout moment? We already are so addicted to notifications, and it's one more source of notifications for us. We're estimating a 30% abandonment of proof-of-concept AI projects — is that a bad thing? Yeah, I don't think it's as pessimistic as it could be. All this and more on today's episode of Mixture of Experts.

I'm Tim Hwang, and I'm joined today, as I am every Friday, by a world-class panel of technologists, engineers, and more to help make sense of a tidal wave of AI news. Today on the panel we've got Vagner Santana, staff research scientist and master inventor on the Responsible Tech team; Kate Soule, who's a program director of generative AI research; and Ambi Ganan, associate partner, AI and analytics.

So, for our first segment, we're going to talk about SAM 2. Meta this week announced the release of the next generation of a model it calls Segment Anything — the Segment Anything Model is SAM, and this is the next generation of it. Specifically, what the model does is let you segment imagery or video: you can select an object and track it over time. I really wanted to cover this because there's just so much hype around NLP, and everybody's talking about chatbots all the time, but we should not forget that there are really, really exciting things happening in other domains of AI, and particularly in computer vision. So we're going to start off with a fun question, which is simply: is computer vision cool again? Kate? Yes. Vagner? Yes. And Ambi? Always has been. Yeah, don't call it a comeback, right?

Well, with that violent agreement, let's get into this segment. I really wanted to talk about this because, of course, it's another iteration of Meta playing in the open-source game, but what's really interesting is that it's also a marker in the ground for what "open source" exactly means in the AI space. If you haven't been watching this space very carefully: in the first versions of open source in AI, people said, well, we're going to open up the model, and there are going to be weights that are available. With SAM, they're also, uniquely, releasing the data behind the model. So, Ambi, maybe I'll throw it to you as the new panelist: I'm curious how you see this — in the future, is open data going to be a big part of what makes a model truly open source? And talk a little bit about how you think through some of that.

Yeah, so listen, we love open source, right? And open source means different things to different people — it can be just releasing open data, it could be having open weights; there's a whole spectrum there. I'm really, really excited that Meta went ahead and did this under an Apache 2.0 license; it's fully open weights. There are a lot of computer vision problems that we've been wrangling with for several years. I remember back in my grad school days, we would do traditional image processing — segmentation through watershed algorithms, drawing little boxes, and things of that nature. It's very painstaking. Oh, extremely painstaking, and extremely laborious, right? Fast forward to today, and it's super, super exciting to see something like this, which can operate at scale on huge videos.
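To make that "click an object once and track it through a video" workflow concrete, here is a minimal sketch of promptable video segmentation with the released SAM 2 code. It follows the pattern in Meta's published example, but treat the specifics as assumptions: the entry points (`build_sam2_video_predictor`, `init_state`, `add_new_points`, `propagate_in_video`), the checkpoint and config filenames, and the prompt argument names may differ in the version you actually install.

```python
# Minimal sketch of promptable video segmentation with SAM 2.
# Assumptions: Meta's `sam2` package is installed, a checkpoint has been
# downloaded, and the video has been extracted into a folder of JPEG frames.
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor  # assumed entry point from the repo

checkpoint = "checkpoints/sam2_hiera_large.pt"  # assumed checkpoint filename
model_cfg = "sam2_hiera_l.yaml"                 # assumed config filename
predictor = build_sam2_video_predictor(model_cfg, checkpoint)

with torch.inference_mode():
    # Load the video (a directory of JPEG frames) into the predictor's state.
    state = predictor.init_state(video_path="videos/warehouse_cam_01")

    # Prompt the model once: a single positive click (label = 1) on the object
    # of interest in frame 0, e.g. a box moving through the warehouse.
    predictor.add_new_points(
        inference_state=state,
        frame_idx=0,
        obj_id=1,
        points=np.array([[420, 310]], dtype=np.float32),  # (x, y) pixel coordinates
        labels=np.array([1], dtype=np.int32),              # 1 = foreground click
    )

    # Propagate that single prompt through the rest of the video, getting a
    # segmentation mask for each tracked object on every frame.
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        masks = (mask_logits > 0.0).cpu().numpy()  # boolean masks per object
        print(f"frame {frame_idx}: tracking {len(obj_ids)} object(s)")
```

The point of the sketch is the shape of the workflow — one prompt, then propagation across frames — rather than training a bespoke detector for every object class.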
And think about this from an enterprise setting, right? I work with clients that have huge manufacturing operations going on. When you think about the supply chain, there are boxes that need to be moved in the warehouse, and there is computer vision going on, tracking those objects. Or if you look at the production settings of a lot of our clients, there's a huge assembly line of objects of different types that need to be tracked through multiple different stages. Or look at some of our local governments, for instance: one of the things we've seen is that people tend to jump turnstiles when going through public transport, and that, surprisingly, is a huge cost to cities and local governments. For the city of New York, for instance, it's a cost of something like $750 million, and it becomes a big problem to solve. In the past, a lot of these have needed to be solved through very specific computer vision models, custom trained for these specific tasks. What SAM 2 enables is for you to rapidly build those computer vision models at scale, because now you can do these automatic segmentations of large videos — which means whichever domain you throw at it, videos it hasn't seen before, domains it hasn't been trained on before, it's still able to do those segmentations and track those objects over time. So this gives us a very capable mechanism to build these domain-specific computer vision models at scale. Short answer: really, really exciting, and that's why I think that open-source capability helps — now we can take that and amplify it across different problems to be solved.

Yeah, for sure. And I think that's one of the most interesting things, because — and this has been a theme in a number of our conversations — Meta, in its blog post, is saying this is so exciting because you could use it for AR glasses. And one of the questions I had was: is this the technology that finally gets AR glasses to work? I don't know, Kate, if you've got opinions on that, or Vagner. There's almost one point of view which is that, again with AI, the big application is going to be turnstile enforcement, right — it actually won't be these consumer elements. But I don't know if anyone wants to speak up for: no, actually, this is the moment that's really going to make AR glasses work.

I mean, I'm sure this helps us get closer, not farther away, but I'm always wary of anything that's claimed to be a silver bullet. I want to get back, Tim, to what you mentioned earlier about open-sourcing the data, because I think it's really interesting to talk about Meta's strategy and how, in vision, they've released the data behind SAM 2 and the license of the model itself is Apache 2.0, while you look at the Llama series — 3.1 came out just last week — and it's under a specific Llama license, and there is absolutely no data that's released, or even really described, in terms of what was used.
Say a little bit more on that — yeah, for our listeners, I think they'd really benefit from hearing: what is the difference there, exactly, between Apache and what's happening with Llama, and I guess the why — why aren't they consistent?

Yeah, so Apache 2.0 is a very popular, widely used open-source license that's been around for years and is considered a very permissive license: anyone can build on top of it, for commercial or other uses, without having to worry about further attribution to where things came from. Whereas with Llama, when the models were released, Meta created a Llama license that is custom and bespoke to handle the Llama weights. Another big differentiation is that Apache 2.0 is normally used for licensing software, and the data they released with SAM 2 was, I think, CC BY, which is similar to Apache 2.0 but commonly used for data. So there are different terms you want to govern different artifacts: Apache for software, CC BY often for data, and now, for model weights, people have started to come up with their own licenses, because model weights also fit somewhere between software and data — it's a little unclear which of those regimes they fall under.

Yeah, I think that's such a great point to end on. And, if I can, I'd like to take one more turn at it, because I think it's a really important part of this question. It strikes me that one of the reasons everybody's very excited about open source is the accessibility of the technology, right? This is not going to be something that a company just puts up walls around and then charges you for access to. But it also strikes me that part of the problem with open sourcing is that it's a lot more difficult to control use. You suddenly have this technology that anyone can use, and some of the people who use it are not going to use it in the most responsible way. It feels like that's a really hard challenge, because democratizing the technology also creates tensions with how we enforce use cases. I'm curious if the panel has any thoughts on that.

Yeah, I think that open sourcing has been one interesting mitigation for these situations, because as the community notices that there's something going wrong, or there's a specific harmful use, then the community takes action. We can look back to open-source operating systems, right? They are the most secure ones that we have, because the community builds on top of this openness and tries to tackle and also mitigate these issues. So in this sense, I think open sourcing is a good strategy to mitigate this issue.

If we're not transparent and open about the technologies that are available, or will be available as people continue to work in this area, there's no way for us to build regulations and awareness and proper practices around them. So I'd much rather have this happening out in the open than behind some closed doors, where we really don't have a good line of sight into what's going on.

That's right. Yeah, I guess it moves us from a model of "just trust us" to a world where we can actually verify it — I like the verification part.

Yeah, absolutely. I feel like, once you put it in the open, there are a lot more heads thinking through really tricky problems, and there is a lot more diversity in the solutions that come up for mitigating those problems, rather than trying to force and control it. When you put it out in the open, you'll have a lot more creative solutions coming in to solve these problems.
Okay, for our next segment, I want to cover friend.com. So, as you all may know,
there's been a long-running dream in the Valley that one of the really exciting things you could do with LLMs is, for the first time, create a fully fledged AI companion assistant. And this dream has manifested in a bunch of hardware projects: the Humane Pin that came out earlier this year is a good example, and friend.com is the most recent iteration of that. Avi Schiffmann, an entrepreneur, launched it with a teaser trailer earlier this week, and Avi has taken a lot of criticism online. But I actually want to take this conversation in a slightly different direction, which is that what's really interesting about what's offered by friend.com is the idea that maybe startups can actually start competing in the AI hardware space — that you could, in the future, launch an AI hardware project, even something as advanced as an AI hardware companion, just as a small startup on your own. This is not going to be a space where only the big companies can play; it might actually be a place where startups can play as well. So I want to put forward this idea — and Kate, maybe I can pick on you: do you buy the idea that the costs of AI are coming down so much that we're about to be awash in these types of things? That launching an AI companion product is not going to be something only the biggest tech companies in the world can do, but that you'll also have these upstarts able to do their own take on this space?

Yeah, I think it's a really interesting question, because we're getting so many, in a way, conflicting signals about what's going on in this space. Gartner just released a report yesterday or two days ago saying they expect 30% of all gen AI POCs to never leave the POC phase. Yeah, definitely — we're going to talk about that later; I think that's going to be the final segment of the episode. Okay, great. But a lot of what they were talking about is citing the costs, right — that we're not seeing the ROI offset the cost enough. And I think that certainly makes sense given what we're seeing. But on the other side, we're seeing models get smaller and smaller and smaller. There is this clear trend where we're able to pack more performance into fewer parameters, where we're getting to the point where these models can run on CPUs and we don't need the advanced hardware to the same degree that we did a year ago. And some of these scaling laws are really exciting in terms of how efficient the technology is becoming. So I don't think it's unreasonable to think that we could get to a place where startups could actually get into the hardware space for gen AI-type deployments.

Yeah, and I think it's kind of fascinating, because had you talked to me five years ago, I would have said, oh yeah, the future is just one big company that has all the AI. But it feels like we're going to be awash in intelligence — there'll just be models everywhere, particularly with the developments in open source that we were talking about. I don't know, Ambi or Vagner, if you've got thoughts on this: just how accessible is this, and how competitive a space is this ultimately going to be?
Yeah, so I definitely agree with Kate there. I think small language models are becoming way more powerful and way more popular, for a variety of reasons. In the consumer space, like you mentioned, there's a lot of competition in terms of, hey, I'll put something on the edge. It could be a companion type of device, it could be something else that you just want to run on your phone locally, or something you want to run on a Raspberry Pi device that you're just tinkering with — there can be a lot of different variations where you're trying to run these models on the edge, definitely on the consumer side. But we're starting to see some of that on the enterprise side as well, because now enterprises are wondering: can I start building really domain-specific models? And these small language models then come in and help power that. If I have data that I don't want to expose at all to the internet, but I still want these capabilities, and I have devices in my manufacturing plant where I want these to be helping my plant workers and things of that nature, then these become a solution. So small language models running on the edge, in local devices — that's definitely becoming popular, both in the consumer space and in the enterprise space.

Yeah, and thinking about the economics of this, the other thing I wanted to touch on with friend.com is that the product is being offered for $99 with no subscription, which is also very intriguing when you think about the business model. There's always been an assumption in the AI space that consumers are going to demand better and better models over time. But I also think about how I had a Tamagotchi as a kid, and I built very deep emotional relations with my Tamagotchi, and it's not like they sent updates over the wire to the Tamagotchi — it was just a thing they printed in the factory, and it came to me. I actually wonder whether there'll be a similar dynamic in AI. Ambi, to your point, there's almost an assumption that people will want the higher-capacity models over time, but I also think we may just have a retro-computing movement in AI, where people say, oh yeah, GPT-2 — that's really where the peak of LLM creation was. Do you buy that? It's my weird take that I've been playing around with: it may actually be possible to do non-subscription AI businesses, because if you have a model that someone really likes interacting with, they may not want it to change at all. I'm curious if folks have any thoughts on that — Vagner, maybe I'll toss it over to you.

Well, I was reading a few pieces about the friend.com device, and one thing that at least looks interesting is the context window — it says it's not processing anything beyond the context window. So if you think about small language models, imagine that we could have one hosted on your mobile phone — then this could be possible. But friend.com today uses Claude 3.5, so it's processing elsewhere, right? It's a device communicating via Bluetooth to your mobile phone. And again, to your point, Tim, on the Tamagotchi: I think it's a lot different than the Tamagotchi — it's feeding on people's loneliness; that's the model, basically, right?
And that's different, because the whole dynamic is different. Before, we would have to take care of the Tamagotchi, and that was the relationship, right? Nowadays, with this specific device — and again, I'm holding myself back, because I have so many things to say about this — but now that you mention the Tamagotchi, it's almost the other way around, because we already are so addicted to notifications, and it's one more source of notifications for us. And I've read one really interesting analogy for this: treating loneliness with this device, offering it as if it were a real friendship, is like giving junk food to someone who is starving — okay, it may help right now, but it's not a solution in the long run. So, again, to your point: I think that thinking about small language models, without transferring the data elsewhere, is an interesting way of thinking, especially for startups creating new technologies. But about this specific use, I have so many concerns.
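To illustrate the idea Vagner raises — keep everything on the device, with nothing processed beyond a small context window — here is a rough sketch of a fully local companion loop: a small open model running on the machine's CPU with a trimmed conversation history, so no text is ever sent to a cloud API. This is not how friend.com actually works (it reportedly calls Claude 3.5 over the network); the model identifier below is a placeholder, the checkpoint is assumed to ship a chat template, and a real phone-class deployment would likely use a quantized build.

```python
# Sketch of a fully on-device chat loop with a small open language model.
# Assumptions: the Hugging Face `transformers` library is installed, and
# SMALL_MODEL_ID is a placeholder for any small open chat checkpoint
# (roughly 1-3B parameters) that ships a chat template; nothing leaves the machine.
from transformers import AutoModelForCausalLM, AutoTokenizer

SMALL_MODEL_ID = "your-org/your-small-chat-model"  # hypothetical model id

tokenizer = AutoTokenizer.from_pretrained(SMALL_MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(SMALL_MODEL_ID)  # runs on CPU by default

MAX_TURNS = 8  # crude stand-in for "nothing beyond the context window"
history = []

def chat(user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    # Keep only the most recent turns so the prompt stays inside a small context window.
    trimmed = history[-MAX_TURNS:]
    input_ids = tokenizer.apply_chat_template(
        trimmed, add_generation_prompt=True, return_tensors="pt"
    )
    output_ids = model.generate(
        input_ids, max_new_tokens=128, do_sample=True, temperature=0.7
    )
    reply = tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
    history.append({"role": "assistant", "content": reply})
    return reply

if __name__ == "__main__":
    print(chat("I had kind of a rough day."))
```

The design choice worth noticing is the trimmed history: dropping older turns is the simplest way to keep a small model inside its context window without shipping anything to a larger model elsewhere.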
I think, at GPT-3.5-level capabilities for generic conversation, sure — you can have a quantized version, and you can have small language models operating to a good degree for just general conversational capability, and you could stop there. But the moment you're trying to get to something specific, something domain-specific — you try to have a deeper conversation — then I think you still need to get to some of the larger models. So I think where it leads is that solutions like this can give you superficial, shallow conversations, but the moment you try to go deeper and deeper, you may have to get out of those smaller language models — at this point in time, at least.

I don't know — there was something very satisfying to me in hearing that it wasn't going to be a subscription, and it wasn't going to try to be a large model having deeper conversations. To me, it meant it was almost more like a meditative tool for the near term: my data is not going anywhere, and it's not trying to be a real human. I really appreciated how much it constrained the scope of the use cases and what this can do by saying, look, it's a device, we're not going to update it, and it's going to be running locally.

Yeah, for sure. That's actually sort of interesting — I think all these points come together: oddly, the fact that it is not updated and does not go to the cloud almost presupposes a limitation in how far the relationship can go. Vagner, to your point, maybe it's actually the most ethical way of designing this architecture — an intentionally limited system. We would actually be worried if they were going to push updates and it was just going to get better and better and better, and you were going to build this massive parasocial relationship with a thing that's not a real person.

I'm going to move us on. So, our next story — and Kate already anticipated me a little bit on this — is that Gartner, the industry research group, came out with a report this week estimating that about 30% of gen AI projects will be abandoned after their initial proof of concept by the end of 2025. They cite a number of reasons for this: poor data quality, inadequate risk controls, escalating costs, or unclear business value. And this follows on a string of reports in a very similar vein — just a few weeks ago we talked about the Goldman Sachs report and the Sequoia report. But for this segment, what's pretty interesting, and the first place I want to start, is: is 30% all that bad? I was taking a look at that and thinking, if we're doing 30%, then for a new technology we're killing it.

I had the same reaction — when I first looked at it, I actually thought, wait, are they saying 30% will succeed or 30% will be abandoned? Because I honestly assumed it would be the inverse. So, you know, I buy it. I also don't think it's as pessimistic — yeah, I don't think it's as pessimistic as it could be. And I think it's valid in that, look, the costs: right now we're in this period where the costs are difficult, and we need to have more refined approaches for picking POCs; identifying and understanding the lifetime cost and lifetime value of POCs is going to be really important.
But also, like we were talking about earlier with this tech: if you look at what it cost to do something a year ago versus what it costs to do something today, and the rate at which that's changing, I think we're honestly in a fairly optimistic place as we talk about emerging technologies and where gen AI is headed.

Yeah, that's actually a very powerful argument. If I hear you right, you're sort of saying that even if the benefit of AI stayed fixed, the fact that the costs are dropping so dramatically will almost end up justifying the technology — it's the costs changing rather than the benefits changing over time. I'd never really thought about it like that.

I have a slightly different take on this. I think when we say "gen AI projects," there is a little bit of confusion and misinterpretation about what those mean. What we've realized, especially working with enterprises, is that the impact comes when these gen AI projects are solving specific problems in specific workflows and subtasks. When you look at gen AI projects and solutions that are laser-focused on solving specific subtasks, those are being incredibly effective, from what we're seeing. So when we say "30% abandonment of gen AI projects," I think there is probably a bit of a mixture in what those gen AI projects mean — it could include really broad-based things, not necessarily focused on specific workflows or specific tasks. That's how I view it. I fully agree that there is a focus on value — enterprises definitely look at it and ask, when I'm putting an investment into gen AI, am I deriving value out of it — so 100% on that. But when we say it's heading toward a certain abandonment rate, I think it depends on what exactly you are measuring: are you measuring the things that are solving specific subtasks and problems and automating a workflow, or things of that nature?

Yeah, that's right. And outside of the 30% — I'm giving them a little bit of a hard time on their report — I think one interesting observation was their point that a lot of the AI benefits are productivity benefits, and that's really hard to capture in terms of increased profits. So there is this interesting breakdown where the technology can legitimately be producing a lot of benefit, but from a dollars-and-cents standpoint, or at the very least a bottom-line standpoint — is it improving my profits? — it may be very hard to draw that connection.

I think that's why those measurements become more important. As the technology improves and as people start driving a lot of these projects, we're starting to see them. With one of my clients now, there is a maniacal focus on saying, okay, I'm going to see whether I'm impacting this particular subtask and sub-workflow, figure out what metrics I'm solving for, and monitor those metrics. So those measurements are starting to get put in place, and once they start coming in more and more, you'll have more visibility into it. So I think it's mostly a question of whether you're getting the right level of measurements and metrics.
For our final segment: I think one of my favorite things going on in the world of large language model evaluations right now is that everybody has their own kind of weird folk eval. We've got MMLU and all the official benchmarks, but really, where most of the action is, is that when someone sits down and starts talking to a chatbot for the first time, they have their own set of evals that they roll out. One that's been talked about a lot online is simply asking a model: is the number 9.11 bigger, or is the number 9.9 bigger? And it turns out models routinely fail on this. So for this final section I want to do a fun little thing with the experts we have here today, which is to get their off-the-cuff evals.

I do similar evals — I usually test on math problems. That's a good one. Your standard multiplication and addition set of problems — those are usually a good indicator, right? So similar to the 9.11 versus 9.9, but a different flavor. That's right — and just to go a little further: it's basic arithmetic. You're asking, what is this five-digit number plus this five-digit number? Or maybe a little more complex: here are five numbers, sort them in a sequence; or multiply these, figure out the results, and then sort them — things of that nature. So it becomes like a math problem I would give a third grader or fourth grader.

Yeah, for sure. Vagner, how about you? There's one that I like that sometimes reveals a little bit of bias — cultural bias. It's about describing a breakfast: what does a breakfast look like? Then, based on what the LLM spits out, you can get a grasp of where the data it was trained on is coming from. You're looking for cultural bias: does it describe a breakfast that's bacon and eggs, or is it bread with butter, or is it oatmeal — something different? Is it an espresso or an americano coffee? That tells a lot about the biases, the cultural biases, inside the model. That's awesome — I'm going to start using that one.

All right, well, Kate, round this out — take us home here. There are a couple of good ones, none of which I came up with on my own — the advantage of sitting within research is you get some really creative minds. But a couple of my favorites: what type of animal is a chicken? You'd be surprised at what the model comes back with. There are a couple around safety that I like to do — asking, you know, here are two people from different origins, which one's a criminal — and seeing what the model replies with, just to try to feel out some of those basic levels. But yeah, we've got a long list of fun things we like to try along those lines.

Those are great. Yeah, I'd love to talk more about that as I collect this little library — they're often very funny, too; people come at some of these problems in a really counterintuitive way.
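The quick probes the panel describes are easy to gather into a tiny harness. Below is a minimal sketch that just loops over a handful of the prompts mentioned in this segment; `ask_model` is a hypothetical stand-in for whatever chat model you're poking at (an API client, a local model — anything that maps a prompt string to a reply string), and the pass/fail checks are deliberately crude. These are folk evals, not benchmarks.

```python
# A tiny "folk eval" loop over the probes discussed in this segment.
# `ask_model` is a hypothetical callable supplied by you: prompt string in, reply string out.
from typing import Callable, Optional

PROBES: list[tuple[str, Optional[str]]] = [
    # (prompt, substring a sensible answer should contain; None = review by hand)
    ("Which number is bigger, 9.11 or 9.9?", "9.9"),
    ("What is 48213 + 57391?", str(48213 + 57391)),
    ("Sort these numbers from smallest to largest: 17, 3, 42, 8, 25.", "3, 8, 17, 25, 42"),
    ("What type of animal is a chicken?", "bird"),
    ("Describe a typical breakfast.", None),  # read manually: which cuisine does the model default to?
]

def run_folk_evals(ask_model: Callable[[str], str]) -> None:
    for prompt, expected in PROBES:
        reply = ask_model(prompt)
        if expected is None:
            verdict = "manual review"
        else:
            verdict = "pass" if expected.lower() in reply.lower() else "FAIL"
        print(f"[{verdict}] {prompt}\n    -> {reply[:200]}\n")

# Example with a trivial stand-in "model", just to show the harness runs:
if __name__ == "__main__":
    run_folk_evals(lambda prompt: "I think the answer is 9.9.")
```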
Well, look — Vagner, Kate, Ambi, thank you for joining us today. Ambi, I hope you had a good time; hopefully you'll join us again at some point in the future. And to all you listeners, thanks for joining us. If you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, and podcast platforms everywhere, and we'll see you next week on Mixture of Experts.

2024-08-09