How Real-time Data Can Unlock AI/ML Apps | Webinar


Steve Tuohy: Welcome to today's webinar. My name is Steve Tuohy, I'm the Director of Product Marketing at Aerospike. I'm excited to serve as moderator for today's discussion: How real-time data and vector can unlock AI and machine learning apps. Before we get started, the usual housekeeping items: today's going to be mostly discussion, so there's this PowerPoint slide, maybe one other, but mostly it's going to be discussion, not a push of a presentation. We will incorporate your questions as we go, or at least we will attempt to do so. So use the question tool in Zoom, which should hopefully be self-explanatory; I'm sure most of you are on Zooms a lot of the day. Go ahead and submit that, and anything we don't get to,

we'll push to the end or try to follow up with you afterwards if possible. You are muted and you're going to remain muted, so the question tool is the main way to communicate with our speakers. And so, on to those speakers. We're really thrilled and honored to have Forrester's Mike Gualtieri here as our guest speaker today. I've tracked Mike's career; really impressive. Mike's research focuses these days on artificial intelligence technologies, platforms,

practices that make software faster, smarter, and transformative for global organizations. He advises leaders around the world on the intersection of business strategy, AI, and digital transformation. And his background is as a practitioner, from writing code to managing development teams to architecting complex systems. And one of those I read in here was developing an AI-based autonomous robot arm simulator for NASA's Jet Propulsion Lab. So we'll see if that comes up in today's discussion. So thanks, Mike. And then on the Aerospike side, Lenley Hensarling

is our chief product officer. Lenley also brings great experience to this conversation. He's got over 30 years of experience in engineering management, product management, and operational management. He's done it at startups, he's done it at large successful software companies, in both enterprise applications and infrastructure software. These days, he focuses on real-time data and its applications. So we are going to dive in and tackle the latest and greatest in artificial intelligence and machine learning. Mike, I'm going to kick it off

with you. Thanks for joining us. So no one has avoided the news on ChatGPT and generative AI; it's an amazing amount of innovation. We're a little over a year into the gen AI revolution, if you will. So the focus today is on enterprise adoption of AI. And I know in your role,

and at Forrester in general, you have a front-row seat to this. So I'm just going to be open-ended and let you give a brief intro of your recent research and the background you bring to this discussion. Mike Gualtieri: Well, I've been covering AI for well over 10 years for Forrester, writing research on it, researching best practices, researching the technologies that have been used. And if you had asked me this before ChatGPT, I would've said,

"Oh, AI is hot. Enterprises are adopting it,  they have it in their strategy." And then boom,   gen AI comes, and now it's like, "I don't know, is  it super hot? It's very hot. Now it's hotter than   it was hot." So the concept of AI wasn't new to a  lot of enterprises, I think they had seen enough   successes and enough use cases. And we call that  type of AI predictive AI. And the generative AI,   technically it's predicting a sequence of tokens,  words, sentences, but the generative AI set really   sort of ignited imaginations, especially in  the business and in the world in general, as   you mentioned. So a lot of conversations about gen  AI. My research focuses on, I get asked about use  

cases, I get asked about platforms, I get asked about technologies, complementary technologies and architectures to build AI solutions. Steve Tuohy: Fantastic. You caught me lowering my seat, so apologies for that. Awesome. And Lenley, you are talking to customers all the time as well, and fortunately Aerospike has many of these enterprises that are tackling this evolution as well. So how are you seeing that from a high level

in terms of recent AI adoption and interest? Lenley Hensarling: Yeah, I'll say sort of the same thing Mike said. What we call classic AI is essentially predictive AI; we've coined the term classic AI to separate it from the transformer- and attention-based neural technologies behind ChatGPT, essentially, right? A lot of our customers have been using this for a long time in data pipelines, moving the data around and getting it in as near real-time as possible, and really focusing on applying the features that are generated out of ML at the edge for real-time inferencing and real-time decisioning, both in terms of managing customers, fraud detection, recommendation engines, but also in dealing with decisioning systems, which are even machine-to-machine. So it's not all that new, but- Steve Tuohy: No, I think there's- Lenley Hensarling: As you said, it's molten. The whole gen AI thing has captured the imagination of the world right now. Steve Tuohy: Yeah, yeah. I think there's a sentiment with both of you that, "Hey, AI was hot. We've been talking about this. So yeah, it's super hot now." I've been telling my parents what I do for years, and six months ago my mom asked,

"What's going on with AI? What is that?" I'm like,  "Haven't you been listening to what we've been   working on?" But yeah, now it's front and center.  So we want to do a little sort of 101 on some of   the new concepts, so large language models and get  some of the definitions out there. A lot of our   audience will have familiarity with this, but to  level set. So vectors in particular are something   we're going to delve into in this conversation  and embeddings. So let me throw this over to you,   Mike, to give your accessible definition of a  vector and what that's going to mean here.  

Mike Gualtieri: Well, I mean, when you think about generative AI, I think most people understand, "Okay, these models have to be trained on some data set." Normally it's an enormous data set. But that data set is usually written language. It doesn't have to be, it can be code as well, but let's just go with written language. So let's put Gulliver's Travels in there, let's put AC/DC lyrics in there and everything else that we can, everything on the web, and that's all in the form of words and sentences. Now, these models are, fundamentally, mathematical. This is math. So when everyone's throwing around the term,

"Well, it's vectors, vectors, vectors." A vector is just a prerequisite to training   of these models where you take that text  and you essentially convert it to numbers,   and you organize those vectors in such a way that  things are near each other, there's a similarity   to them. So I mentioned AC/DC lyrics, maybe that  would be closer to the lyrics of Def Leopard than   AC/DC lyrics would be to, say, Charles Dickens.  So the vectors is a mathematical way of storing  

Steve Tuohy: Well, we all need to search on Dickens. Mike Gualtieri: It's a unique data structure. You need a data structure to do that, and it's very different from predictive AI, where that's not really the data structure you use.
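To make that concrete, here is a minimal Python sketch of the similarity idea. The three-dimensional vectors are made-up toy numbers; real embedding models produce hundreds or thousands of dimensions, but the geometry works the same way.

```python
import numpy as np

# Toy "embeddings" with made-up values, purely for illustration.
vectors = {
    "AC/DC lyrics": np.array([0.9, 0.8, 0.1]),
    "Def Leppard lyrics": np.array([0.85, 0.75, 0.2]),
    "Charles Dickens": np.array([0.1, 0.2, 0.95]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Closeness of direction: 1.0 means identical, near 0 means unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = vectors["AC/DC lyrics"]
for name, vec in vectors.items():
    print(f"{name:20s} {cosine_similarity(query, vec):.3f}")
# Def Leppard lands near 1.0; Dickens lands much lower.
```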

Steve Tuohy: Thank you. So when we think about the database, then, and we'll be hitting the notion of a vector database, how does that fit in the old world of databases, the RDBMS, the NoSQL database? Mike Gualtieri: Well, I mean, now with generative AI, everyone's throwing around, "Oh, we need a vector database," yet another specialized database. So yes, you do need a vector database, or at least a vector database capability. But one of the trends that we've been following at Forrester for many years, maybe as long as I've been covering AI, is the whole notion of a multi-model database, where it's like, "Oh, yes, I've got tables, I've got documents, I've got XML, I've got blobs, I've got all these different stores." And so what happens? "Oh, now I have 17 different databases that have introduced latency," et cetera, et cetera, right? So with the notion of a multi-model database in the context of vectors, you would look for a database solution that could store vectors. And I think some vendors have that, a lot of vendors are looking at it, some strategically don't plan to do it. There are a lot of specialists in this space,

there's some open source as well. Steve Tuohy: Great, thank you. Quick question here I think we can insert. So you talked about multi-model; there's a related notion of multi-modal, and sometimes they get used interchangeably. So the question is: you're talking LLMs, large language models, about gen AI from a language context. What about images? Do vectors help with images as well? Mike Gualtieri: Yes. I mean, it's all data that needs to be processed and where you need to find similarity, so yep. Steve Tuohy:

Yeah, same notion, great. And Lenley, I'm going to push it over to you. We're going to talk a little bit about RAG in a second, but before we go there, did you want to add anything onto Mike's comment about multi-model databases? Lenley Hensarling: Yeah, no, I think it's important to point out that it's not just storing vectors, which are sort of multi-dimensional arrays of data points that represent either language or images or whatever; it's the similarity search capability and the indexing for that where the magic comes in, if you will, and where the differentiation between vendors is going to show. Because I think that what we are looking at is really the challenge of doing this at high throughput, so that you could have millions of searches a second coming in against the repository that you might use for retrieval augmented generation in a session. We'll get into that a little bit later, but the high-throughput nature and the ability to handle that scale-out, if you will, as everybody on the internet comes to your door. I love something somebody once said, one of our board members

who has been a lead architect at a number of very large companies. He said something about the non-linearity of the internet, meaning that you don't know how many people are going to come to your door at any given minute, right? And so you have to be able to cope with that. And that's one of the things that we really tend to focus on: that ability to maintain performance at scale. Steve Tuohy: Great, thank you. There are a couple more questions that have come in. I think we're going to hit them, so keep the questions coming, thank

you everyone. But I'm going to push on to that notion of RAG, Lenley. We've talked about it, there's lots of communication about this retrieval augmented generation. So give us the overview on that and how it intersects these topics. Lenley Hensarling: Yeah, I like to think of this in terms of contextualization. When you talk about retrieval augmented generation, it means that you have data specific to your company, to your enterprise, or to the business problem that is being addressed, and you have to contextualize things, because the LLMs, the foundational models, if you will, wind up being trained on this vast amount of data. So the analogy I use is

when you call in an expert, a consultant to your company from McKinsey or Deloitte or Booz Allen, they know a ton. They've seen many, many, many different companies, but they may not have been in detail in yours, right? And so they can tell you in general what might be a good path, but they don't have the specific information. And so when you apply these foundational models, it's like that. The model knows a vast amount and it can respond to whatever you ask it, but to contextualize it, you have to supply some of your own data. And that means taking embeddings that you've generated, and as we said, an embedding is a mapping of either visual or textual information, and being able to feed that back into the model that you're using and sort of hone it to the context of your business or the question at hand. And the application of that is what they call retrieval augmented generation, meaning that you have a vector database and you have the ability to create the embeddings, as they say, and the embeddings are essentially that map from the document, just pages of a document, into a mathematical representation of it that can be stored, right? And so that contextualization is really what we're talking about with RAG, or retrieval augmented generation. I don't know, Mike, do you want to add anything to that?
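As a rough illustration of the flow Lenley describes, here is a minimal RAG sketch. The embed() and llm() functions are toy stand-ins (a hashing bag-of-words embedder and a stubbed model call), not any particular product's API.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy stand-in for a real embedding model: hash words into a
    # normalized bag-of-words vector.
    v = np.zeros(64)
    for word in text.lower().split():
        v[hash(word) % 64] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

def llm(prompt: str) -> str:
    # Stub standing in for a call to a foundation model.
    return f"[model answer based on prompt of {len(prompt)} chars]"

# Index time: embed your own documents and store the vectors.
docs = ["pages from the 10-K ...", "pricing policy ...", "support FAQ ..."]
index = [(doc, embed(doc)) for doc in docs]

def answer(question: str, k: int = 2) -> str:
    # Query time: embed the question, find the k most similar documents,
    # and hand them to the model as context -- the "consultant" now has
    # your company's specifics in front of it.
    q = embed(question)
    ranked = sorted(index, key=lambda d: float(np.dot(q, d[1])), reverse=True)
    context = "\n".join(doc for doc, _ in ranked[:k])
    return llm(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("What did we post for investors?"))
```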

Mike Gualtieri: Well, yeah.   Lenley Hensarling: You've had a lot of inquiries   on this, I'm sure. Mike Gualtieri:   Yeah, because I think there's the real-time  nature of that too. Because if you take,  

an e-commerce example, and I'm making this up, but say you have someone's shopping cart, right? And they press order. Well, you could put the cart in as context and say, "Okay, now send the order confirmation to the customer and describe the cart." "Oh, it looks like you're working on a plumbing project and some random knitting project," or something like that. So I think that's another consideration too, because in that case, not only do you need fresh information, real-time information, but you may also need it at scale, because this whole shopping experience was motivated by some promotion, so now all of a sudden you've got hundreds of thousands of concurrent people hitting this and needing to do this.

Lenley Hensarling: Exactly, which leads to some of the questions about when you're using RAG: you want to be able to have what people are now calling semantic caching. So if you were doing a promotion, you would try and cache a lot of the vectors you might need to describe what's on sale, what the promotion is, and even the context that you posit you might be selling into, as you pointed out. And so all of those things have to be able to be applied at wide scale. And I think that's what everybody's struggling with now.
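A minimal sketch of that semantic-caching idea, assuming normalized query embeddings from some embedding model. The linear scan is illustrative; a real deployment would put a vector index behind this.

```python
import numpy as np

class SemanticCache:
    """Before paying for a model call, check whether a semantically
    similar query has already been answered."""

    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold
        self.entries: list[tuple[np.ndarray, str]] = []  # (query vector, answer)

    def lookup(self, query_vec: np.ndarray) -> str | None:
        for vec, answer in self.entries:
            # Vectors are assumed normalized, so the dot product is
            # the cosine similarity.
            if float(np.dot(query_vec, vec)) >= self.threshold:
                return answer  # cache hit: skip the foundation model call
        return None

    def store(self, query_vec: np.ndarray, answer: str) -> None:
        self.entries.append((query_vec, answer))
```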

Mike Gualtieri: And I think a lot more people need to think about RAG at scale, because I've talked to a lot of companies over the last six months, insurance companies, financial services; they're downloading some stuff from Hugging Face and messing around with a little RAG project. "Oh wow, this is cool. I got Llama 2, I'm doing a RAG project, look what I can do." But that's a little experiment, right? If you then start to think, "What if I was going to do this at scale for my entire organization," even if I had 50,000 employees internally, or tens of thousands or millions of customers, then you have to start thinking like an architect again about, "Okay, where are all these bottlenecks? Where is latency introduced? And what happens when I start doing this at very high concurrency and need real-time RAG?" And then you have to start thinking of the components and how to build that architecture. Lenley Hensarling: Yeah, exactly. And I'm glad you said that, Mike, because we haven't released our vector product yet. But what we're focused on is exactly that, because that's what we've done for the application of classic features, if you will, at the edge. And we've focused on that and the ability to scale out

and be elastic with the search capability, and to meet those demands of hundreds of thousands to millions per second of these vector queries that are going to have to be applied. Steve Tuohy: Good stuff. I'm scribbling notes here and trying to incorporate some of the questions. So, challenges, right? I haven't heard hallucinations mentioned, but I think we know about that. But context, for one: adding on context through RAG so these advances can be more usable. Scale, I think Mike brought up, and real-time. And then the notion that you're piecing together different parts: an off-the-shelf LLM, for instance, and your vector database. So let's think about whether there are other challenges, but I want to bring

in an audience question here. Okay, so: there have been a lot of gen AI services, like Aerospike, for dev teams to be innovative; ChatGPT was an 'aha' moment for the masses. In the IT space, what end-user tools do you see which are gen AI savvy? So a random tool, a Tableau, do you see that being vector-query or RAG savvy? So I'll let you interpret that. I'm hearing this a little as kind of the stack and putting things together. What are your thoughts?

Mike Gualtieri: Well, I mean, to me it sounds like the question is about where we're going to find gen AI, because you mentioned Tableau. I mean, every software vendor is trying to figure out where gen AI works and helps within their product. And so a lot of companies kind of rushed to try to figure out the technical details of how to do this and what sort of talent they need. But then they're quickly getting messages from their business software vendors saying, "Hey, we have something coming, it's going to make this easier." So it's not going to make sense to customize gen AI for every application; it's really wise to actually scan your business software vendor landscape to see, "Oh, do I need to build some contraption here? Or is salesforce.com going to have a gen AI feature there?" So I think

the way tech execs are thinking about this now is more the way they think about software, meaning, "I need to figure out what's in the market, I need to figure out what I have to build myself, what's differentiated, or what's so gnarly to integrate that I have to do it myself too." Steve Tuohy: Lenley, anything you want to add? Lenley Hensarling: Yeah, I was going to say that I think this is a pattern we've seen before, where at first there's just technology and some of the leading companies are going to do some things themselves and embrace this, but soon there will be packaged gen AI solutions from most of the application vendors. I spent years in the ERP business, and I can see where financials, and being able to ask questions about them, is going to be added there. And being able to have things like, "Can you get questions asked about the 10-K that you post for investors?" Right? And those will be incorporated into the financial systems, along with a way to handle RAG, or retrieval augmented generation, to make sure that you're not veering off course into hallucination space when you answer those questions, because there's going to be liability around those kinds of things. And those will be handled in the applications to a great extent, I think. Steve Tuohy: Awesome, thank you. Okay, so you get vendors like us and analysts out there talking about the promise of all this, so we assume all these customers are up in production with these different use cases and changing everything. So let's transition to some of these use cases. And in particular, we

talk about challenges: where are customers hitting walls and where are they reaching successes? What are the use cases that are ready for success? I'll let either of you take a first stab at that. Mike Gualtieri: Well, maybe we can go back and forth. The one that gets the most buzz is just personal productivity. And there's the coding, right? Because there's a lot of code assistance. It's not just human-written or spoken language, it's computer language. So I've talked to a remarkable number of companies who are at least messing around with it, and there was a lot of controversy: "Oh, should we let our developers use this?" That was kind of like saying, "Should we let people use the internet or not?" Except this cut loose a lot faster. So that's productivity. But then you have Microsoft's Copilot landing

inside the productivity apps, and you have a lot of impressive stuff out of Adobe for their productivity tools, Adobe Firefly and everything that they're building in. So I think that is the biggest use case. And at Forrester, I cover more of the technical architecture side and the build-it-yourself, but we have a number of analysts who cover workforce productivity who aren't AI analysts, and most of the questions they're fielding now are about how companies are using this for productivity. So that's one. I have more, but Lenley, do you want to add one? Lenley Hensarling: Yeah, no, one thing I'd say is that, in part, the encoding mechanisms behind vectors come out of gen AI, but people are starting to apply them in classic AI as well, as a sort of richer feature set. We

have a lot of AdTech customers, and they're seeing ways that they can just have a richer feature that's generated by their ML, and then they need to apply it very fast, with a similarity search that can execute really fast. And so we're seeing things like that as well. And so I think there's going to be a lot of creativity in applying some of the components that are coming out of gen AI, and recombination almost, if you will, right? Because I think that generating patterns, or recognizing patterns, in more IoT-based data is going to happen as well. And there's work going on to be able to generate embeddings, or encodings as vectors, of more time-series kinds of data, and then being able to look at those patterns in a much richer way and do it a lot faster than they might've been able to in the past.
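One way to picture that: encode a window of sensor-style readings as a small vector of summary statistics, so "have we seen this pattern before?" becomes a similarity search. The statistics chosen here are an illustrative guess, not a production encoding.

```python
import numpy as np

def encode_window(readings: list[float]) -> np.ndarray:
    # Summarize a window of time-series readings as a normalized vector.
    x = np.asarray(readings, dtype=float)
    vec = np.array([x.mean(), x.std(), x.min(), x.max(), x[-1] - x[0]])
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

normal = encode_window([10, 11, 10, 12, 11, 10])
spike = encode_window([10, 11, 40, 80, 75, 70])
# Noticeably below the ~1.0 an identical pattern would score,
# which flags the unusual window.
print(float(np.dot(normal, spike)))
```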

Mike Gualtieri: There are digital assistants for customer self-service. I mean, that's an example of where, okay, chatbots for self-service were already a thing, but now companies say, "Whoa, this can be a lot more contextual, it's a lot more articulate." So they want to infuse that. Next-generation search, right? This whole thing we've been talking about, RAG. I think the remarkable thing about these models is how articulate they

are. And that's part of the danger too, because many people confuse a well-spoken person, or a well-spoken model, with accuracy and truth. And that's one of the challenges with this. But other use cases: I call it generative engineering, and many people have seen it used for chemical discovery, because there's a language to every engineering discipline; there's a language to chemistry, molecular structures, life, DNA, and even auto parts, artifacts from AutoCAD and parts lists and so forth. So there are a lot of people thinking about this not just from a human language, but from an engineering language, and how can that add to the creative process and design process? Lenley Hensarling: And Mike, I think what's going to happen there is that the low-hanging fruit is personal assistants, being able to answer questions and being able to scale that out. You can't just engineer more people to take phone calls when the airline gets hammered, but you can generate new digital assistants, if you will, to take people's calls and deal with their questions, and even reach back into the system and take actions, right? So I think that's the low-hanging fruit. But the real

value is going to come when we start seeing this applied to supply chains, to product development, and things like that. And I think that means more specific models, and I think it's worth touching base here, and a question to you: are you seeing this movement towards what people are calling SLMs, or small language models? Though I like to say specific language models, if you will, right? Mike Gualtieri: Yeah, that's better. Actually, I hadn't heard "specific." Lenley Hensarling:

I just made that up right here. Mike Gualtieri:   Okay. But no, I mean, I think this was  a myth from the start largely pushed or   helped by OpenAI and even Microsoft too. It's  like, "Oh, there's no way you could ever build   a model this big because it costs us zillions of  dollars to build this." And so everyone thought,   "Oh, okay, well, I guess we'll just have  to use that model." But you're right,  

there's going to be tens of thousands, if not millions, of models; some will be fine-tuned. But look at the activity on Hugging Face, huggingface.co, which, if anyone's not familiar, is kind of an open source repository of AI-type models like Llama 2 and many others, right? So yeah, I mean, there are going to be a lot of these models. And I'm already hearing companies go from saying, "Well, I am never going to build my own model," to saying, "Well, we're looking at it," or, "We're going to fine-tune." I mean, RAG is a wonderful technique there for many use cases, but like anything, there's a spectrum, a range of use cases. And if you look at MosaicML, which tries to optimize the training of models,

they've got, and I forget which huge model it was, one that used to cost $450,000 to train down to, I don't know if it's $50,000 or $60,000, but an order of magnitude less. So I think, yes, we're going to see a lot of these models. I love the possibilities with AdTech too, because the way AdTech works, they're constantly testing. So in some ways it's going to be the frontier, because you can start testing some crazy hypotheses about prompting and generative AI, use A/B testing and see what hits, and that could change the entire AdTech industry.

Lenley Hensarling: Yeah, absolutely. Steve Tuohy: Yeah, the language piece is very tangible thanks to OpenAI and other LLM providers, and some of the examples you've given build off that, and RAG obviously builds off that. You've touched on this a bit, but I'm combining my own question with one that's come in off the chat. Let's take fraud, for instance, right? And Mike, you talked about the real-time nature and the scale needs. This is not asking ChatGPT, "Is Steve at Forrester at the point of sale?" Right? As fast as that is, we're talking about millions of people transacting, and capturing that information and acting upon it. So Mike, you distinguished

language and predictive upfront. Mike Gualtieri: Predictive versus gen, yeah. Steve Tuohy: Yeah, gen versus predictive, and acknowledging there's some intersection. So the question that came in, and you can build off this or go beyond: isn't fraud already handled pretty well, they said, with more traditional models? So how would vector help? And my own insertion is: do you need an LLM in that, or would an LLM even be additive? Mike Gualtieri: Well, people are definitely researching and figuring out how to use LLMs, because there are different types of fraud, right? And sometimes an LLM would be perfect at fishing out a phishing attack, for example. So

companies and vendors of fraud detection software are definitely incorporating that based upon different attacks. But you know what? You can also use a gen AI model for classification, just as predictive AI does, right? Predictive AI takes a data payload or something, a transaction and maybe some enhanced data, and it says likelihood of fraud: low, medium, or high, right? But some of the gen AI models can actually do that as well, because it may not just be transactional data; there may be some sort of interaction data, like the chat information. And the other thing to think about is, just like we want to detect fraud, the world's gotten, I don't know, better at it. Let's not say it's good, because there's still a lot of fraud,

but it gets better at it. People are looking at those same techniques for what are called guard models. So for the output that comes out of a gen AI model: is there anything we could do to look at that output to perhaps detect a hallucination? You can think of a hallucination as fraud, the model being fraudulent, and so you can use some of those fraud techniques to govern the model. Steve Tuohy: Lenley, your thoughts? I know some of our customers, you've talked about the classic models and an enhanced feature approach, and a lot of our customers are doing fraud. Lenley Hensarling:

Yeah, I would say the discussions we've been having are that there are patterns in language that indicate fraud, the way a conversation progresses and such, and companies can start to save some of those conversations off when they say, "Some of this may be recorded." I think there are definitely going to be new applications for that. And then they'll be able to mash those patterns back and do similarity search against what's coming in and be able to say, "That's one more signal. And it's not the only signal, but it's one more signal of fraud." And I think that you're going to see that. And a lot of this is just refinement and refinement and refinement in that game against fraudsters.
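A sketch of that "one more signal" idea: compare an incoming conversation's embedding against saved fraud-pattern embeddings. The embed() helper is a toy stand-in, and the threshold is arbitrary; the result would be combined with other signals, not used alone.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy stand-in for a real embedding model (an assumption, as above):
    # hash words into a normalized bag-of-words vector.
    v = np.zeros(64)
    for word in text.lower().split():
        v[hash(word) % 64] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

def fraud_similarity_signal(conversation: str,
                            known_fraud_vectors: list[np.ndarray],
                            threshold: float = 0.8) -> bool:
    """One more signal, not a verdict: does this conversation's embedding
    sit close to any saved fraud-pattern embedding?"""
    q = embed(conversation)
    best = max((float(np.dot(q, v)) for v in known_fraud_vectors), default=0.0)
    return best >= threshold

# The threshold would be tuned empirically against labeled conversations.
known = [embed("urgent wire the money now via gift cards")]
print(fraud_similarity_signal("please wire the money now via gift cards", known))
```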

Mike Gualtieri: Yeah. And I mean, we should mention that this is one of the most wonderful tools ever invented for committing fraud too. So I have a prediction: cybersecurity companies will increase their revenue by three times next year as the criminals start to figure out how to use this. I mean, seriously, it's an incredibly scalable tool for committing fraud as well. Steve Tuohy: Very good. Okay, just looking at the clock, we're going to shift forward here a bit to the infrastructure side. We've got a question that

touches on data models that you guys... Okay, so: is there an ideal data structure for predictive AI, and any limitations? Anyone want to...? Mike Gualtieri: Well, for predictive versus gen AI, I mean, you've got to think of three workloads. When anyone says AI to me, I'm like, "Oh, there's three workloads." There's the data prep workload,

there's the training workload, there's the inferencing workload. Inferencing is sometimes known as scoring, but it's using the model, right? So first you need all the data needed to train the model, and then once you train the model, you need data to be able to inference that model. Now, for most predictive models, the training data is a table. So the data prep takes all of this data, in whatever form it is, flattens it out, and makes columns. And I mean lots of columns in some cases, could be 2,000. It could take 30 columns from data structures and blow that out to 2,000. Why? They're doing feature creation: they're taking ratios, they're taking a date and splitting it up into month, day, year, quarter. So you can imagine how quickly you could sort of blow out those columns.
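A small sketch of that column blow-out, with hypothetical field names: one date and two raw amounts fan out into seven columns, and it is easy to see how 30 inputs could become 2,000.

```python
from datetime import date

def make_features(txn_date: date, amount: float, account_avg: float) -> dict:
    """Feature creation: a few raw fields fan out into many columns."""
    return {
        # One date splits into several columns.
        "year": txn_date.year,
        "month": txn_date.month,
        "day": txn_date.day,
        "quarter": (txn_date.month - 1) // 3 + 1,
        "day_of_week": txn_date.weekday(),
        # Ratios between raw fields become features of their own.
        "amount": amount,
        "amount_to_avg_ratio": amount / account_avg if account_avg else 0.0,
    }

print(make_features(date(2024, 2, 8), 250.0, 100.0))
```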

And that's for the training workload. Now, once the model is created, the model determines, "Well, I need these six variables on the input to predict the thing you're trying to predict." Those six variables could be in different locations and different formats. So now you've got your second problem, which is, during the inferencing stage, retrieving enough of that data at scale to call that model. Because people worry about the latency of the model, but a lot of times it's not the model; it's getting the reference data and the other data needed before you can even call the model. That's normally where your performance problems are.
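In code form, the inferencing path Mike describes might look like the sketch below: the model call itself is cheap, and the latency budget goes to fetching a handful of features by key, fast, at high concurrency. The feature store and the toy model here are hypothetical stand-ins.

```python
class FeatureStore:
    """Stand-in for a low-latency key-value feature store."""
    def __init__(self):
        self._data: dict[tuple[str, str], float] = {}
    def put(self, key: str, field: str, value: float) -> None:
        self._data[(key, field)] = value
    def get(self, key: str, field: str) -> float:
        return self._data.get((key, field), 0.0)

def score_transaction(store: FeatureStore, model, txn_id: str, account_id: str) -> float:
    # The model was trained to expect these variables; at inference time
    # they live in different places and must be fetched before the call.
    features = [
        store.get(account_id, "avg_spend_30d"),
        store.get(account_id, "txn_count_24h"),
        store.get(txn_id, "amount"),
    ]
    return model(features)  # e.g. likelihood of fraud

store = FeatureStore()
store.put("acct-1", "avg_spend_30d", 120.0)
store.put("txn-9", "amount", 480.0)
# Toy model: flag amounts far above the account's average spend.
toy_model = lambda feats: 0.9 if feats[2] > 3 * feats[0] else 0.1
print(score_transaction(store, toy_model, "txn-9", "acct-1"))
```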

Lenley Hensarling: And Mike, this reminds me of one of our star customers, LexisNexis, and the product they have, ThreatMetrix, right? The CTO there, Mattias, one time I asked him, "Why do we matter to you?" And he said, "Because I used to be able to apply a stream of signals that were hours' worth against weeks' worth of data, and now I can take weeks' worth of input that's aggregated and match that against months of data, and I get a higher-fidelity result." And then they actually charge people more for that in their fraud detection, because it's a higher-fidelity result, right? And so it's just the application of more data all the time. And I think the same thing is what we're hearing from customers about how they want to use gen AI. And I think the other thing that's happening too is that we always say we can do that in a cost-effective way, but there's also this notion of semantic caching. And so you're not going to have to go back to the foundational model for every question; you can sort of check, "Have we already sent that back? Do we already know the answer in terms of vectors returned? And then can we drive things internally and disintermediate some of the cost, if you will?" Mike Gualtieri: Yep, that makes sense. Because even though it could be a fraction of a penny or a couple of pennies a call, it adds up for some use cases, and so caching makes sense. Steve Tuohy: Great. Hey, last category, probably the last three or four minutes, is advice. What's the path for getting started for organizations? What

do you see people tackling first, and how should they be thinking about costs and so forth? Mike Gualtieri: Want me to go first? Steve Tuohy: I intentionally left it open, whoever wants to jump on. Sure, but yeah, if you're up for it, Mike. Mike Gualtieri: Yeah, so I mean, use cases are not too difficult to find, because you look through any business process and you locate the opportunities where you're generating content or something. And it's likely now that, whatever your industry is, there have been some common use cases that have been done by others and have been vetted. But once you find that use case, you have to think about how you're going to implement it at scale. And I've largely already said this, but there are a lot of developers now, because one of the things about gen AI is that it doesn't particularly help you to have a lot of statistical knowledge like a data scientist, because all that stuff is abstracted, it's kind of done for you. So it's a

lot of developers messing around with this stuff when they see a use case, and it's very accessible to do a simple use case. But gen AI use cases are going to happen at scale, so you have to pause and think about how you're going to scale this, how you're going to architect it. Because making an API call to ChatGPT is very simple; making an API call to anything is. But the architecture to get it to perform and to get all the data, that's the hard part. So alongside whoever is experimenting on the use case, just the inputs and outputs, there should be another team working on how it's going to work out in the architecture. Lenley Hensarling:

Yeah. I'll add one other challenge I think we've had with AI/ML all along, one we were making good progress on until the introduction of gen AI, I would say, and we continue to make progress on the predictive side, and that's explainability of results, right? With gen AI, it's a particular challenge, because the system doesn't lend itself to that; it's not a step-by-step kind of thing. It's doing this generation, finding the next available thing, and the track can be very long, and you're trying to say, "Well, how did you come up with that?" And I know there's work being done, but I think that's going to be a particular challenge in deciding how to apply it. I know there's work going on now to apply gen AI to diagnosis in medicine, and it's like, well, how are you going to answer, "How did you come up with that diagnosis?" Because it's going to have to be checked, and there's going to have to be tracking of that kind of thing just because of the nature of that sphere. And there are many other areas that are going to be similar. And I think that's something that's going to

evolve over time. Mike Gualtieri:   Yeah, and another issue with that Lenley  brings to mind is you never quite solve it too,   because just when you solve it, you want  to retrain it with some newer data.   Lenley Hensarling: That's right.   Mike Gualtieri: And so you've got to have kind of this testing   process and because a lot of this is dealing in  probabilities and you can't be 100% sure on some   of the things, some companies are employing AB  testing strategies to this or what's in financial   services champion challenger, right? These are  processes that maybe a lot of software development   team, many software development team like  advertising and other things are very familiar   with these concepts, but a lot of software, it's  like, "Okay, we have these new bits, we're going   to test them completely and when they work, we're  going to push them out," right? But with some of   the Gen AI models and your prompting strategy, you  can't be 100% sure. So some companies are saying,  

"Well, I can't wait until I'm 100% sure  because I'll be waiting forever, so I'm   going to put this out to 10% of the traffic, 10%  of the transactions, and I'm going to see." So   it's a risk mitigation strategy, but it lets you  move forward with a degree of uncertainty. And I   think it's counterintuitive, but businesses in  highly regulated industries are actually pretty   good at risk management and assessing it because  they've been dealing with compliance forever.   Lenley Hensarling: Mike, that brings to mind   that many people are now talking about augmenting  this not with gen AI RAG, but with other types of   search. And so once you get a result, going  and testing it against just looking up facts,   right? Applying them, see if they fit. And then  if they do, that acts as a risk mitigation, and  

Lenley Hensarling: Mike, that brings to mind that many people are now talking about augmenting this not with gen AI RAG, but with other types of search. And so once you get a result, going and testing it against just looking up facts, right? Applying them, seeing if they fit. And if they do, that acts as a risk mitigation. And there's the question of the quality of the generated content, which means accuracy and likelihood of hallucination and other things, which can be dealt with by search capabilities of a different kind. Mike Gualtieri: Yep. Steve Tuohy: Gentlemen, I regret to say we're about at time, so I'm going to pull up our thank-you, goodbye slide. But that was a pleasure for me. I hope our audience gained some new insights; I'm sure they did. I did. Mike, on behalf of Aerospike, thank you for being our guest today. Mike Gualtieri:

Thank you. Steve Tuohy: Really great perspective. So for those of you who want to dig deeper on Aerospike, here are just a few resources. I'll give the guys a second to give final thoughts, but yeah, dive in, set up your database with Aerospike. Lenley mentioned what I think he calls classic AI; among that is using Aerospike as a feature store, so we've got some content you might be interested in on that, a little discussion on the use of vector databases that our product manager Adam Hevner put together, and then a piece on real-time AI from Aerospike's Chief Scientist, Naren Narendran. Any final thoughts, Mike, Lenley? Mike Gualtieri: Well, we've had our fun thinking about how AI is going to end the world, but now we've got to get building scalable applications with it. We've got to get beyond the experimentation phase and just start building this the way companies already know how to build real-time scalable applications.

Lenley Hensarling: And I love that that's   the way Mike put that because that's our  focus and what we're building in terms of   our vector database capability, the ability  to scale out and do that while maintaining   performance and low latency in responses,  because people won't wait around just   because it's gen AI. Mike Gualtieri:   Right. Steve Tuohy:   It won't be the hot new thing forever.  All right, guys. Well, thanks everyone for   joining. Good questions that came in, and we'll  have a replay available. Have a great day.
