AI and the future of search ElasticON AI


Hi everybody, welcome to ElasticON AI. It's exciting to have ElasticON again in San Francisco; our first conference was at a pier a few blocks down the road, so it's really exciting to be back here doing these ElasticON events. We're going to do many of them across the world, and this time we're going to focus on AI. So let's get started.

Let me start with some reflection. I'm the CTO at Elastic now; I took a vacation as CEO for four years, but now I'm back to actually working as a CTO. I've been in search for years: I was one of the first users of and contributors to Apache Lucene, I worked on an open source project called Compass that was built on top of Lucene, and afterwards Elasticsearch. I've watched the growth in search and the capabilities we developed over the years, and one of the things I'm proudest we've been part of is this evolution of search as a technology, especially in the open source space, beginning with Apache Lucene and obviously with Elasticsearch, which has been around for more than 10 years now.

I still remember that when we were trying to do search, you had requirements like filtering all the products in a certain price range so people could find them from a search bar, and the way you would do it with Lucene was to pad the text with zeros, because it only did text search. So we had to figure out how to take numbers and index them in a new data structure that would let us do efficient filtering at search time. Afterwards there was a need to do faceting: really fast aggregations based on the search you're doing, a sort of analytics if you will. That was implemented pretty badly in the early days; we had to un-invert the inverted index, load it into memory, and do all the computation there. Over time we built a whole new columnar store into Apache Lucene, called doc values, to do these aggregations in a really fast and efficient way.

Then this thing called the iPhone came out, and suddenly geo search became all the rage; people wanted to constrain the searches they did for restaurants and other things on their phone, so we had to figure out how to do geo search. In the beginning it was a pretty silly distance-based calculation in memory; I admit I basically read a white paper and implemented it on a train ride or something like that. But afterwards we implemented a whole new data structure in Apache Lucene called BKD trees, which today probably makes Lucene and Elasticsearch one of the most popular, if not the most popular, geo search databases in the world, and an extremely efficient one. So there have been tons of investments in search over the years, and a big part of it now is around vector search, which we're going to talk about in a few seconds.

Search has changed, and it's exciting to see. One of our deeper beliefs is that search unleashes data: you have data that comes in many different shapes and forms, and you need to be able to explore it. Search is the most human and effective way to interact with data; put a search box in front of someone and they know how to interact with it. And AI really empowers that experience.
A big part of this for us is doing it together with a community of users: billions of downloads, community members developing all of it together and moving this search technology forward. And obviously, over the past year an inflection point happened and AI, or gen AI, came along. This is the slide where you ask an AI to generate a slide; you have to have one in every talk. I think it's really weird that the last one looks like a Roman dinosaur or something, but I guess we're men, so we have to think about the Roman Empire.

I remember, about a year ago, it almost felt like it happened in a day. People were saying: search is dead, gen AI is the thing, it's going to kill search, Elasticsearch is not relevant, Lucene is such an old technology, nobody uses it, it only does text search. And I thought: really interesting, let's talk a bit about it. By the way, that argument probably applies to a lot of different things; you could put humans there, if you talk to Max or others. But we're focusing on search here. We really wanted to understand: does search have a place in a gen AI world of workflows and large language models? And yes, it seems like search is more important than ever. It took us time to figure out what that place is, and I'd love to take you through some of the journey we took and what we plan to develop in the future.

First of all, gen AI is built around something called a large language model, and one of my favorite terms that has emerged in this space is grounding. All of these large language models need to be grounded (all of us probably need to be grounded a bit, but definitely large language models do). What does it mean to be grounded? Well, they can hallucinate. If you don't give them context, if you don't try to shape them towards giving you the right answer, they can really go off the rails, and we've seen some examples of it. And it's really tricky: if you put one in front of your customers, or inside your company, you need to think very deeply about what the engagement with a large language model looks like, and how to shape and constrain it.

It's also not powered by your data. You're building a business that probably has fast-moving data that keeps changing; your business generates data, whether it's internal company data or customer data, and that data is not powering the gen AI capabilities. A large language model is extremely smart, but it's not being driven by your company's data. And how do you think about security and privacy? Some roles in the company can see certain types of data and other roles cannot. How do you make sure people don't have access to data they don't need, and that people who do need that access actually have it? Because this is a very empowering capability.

So over the last six months or more a concept emerged called retrieval augmented generation, or RAG. What does that mean? It means that when you ask a question, hopefully in a very natural-language way, you can feed that question to the large language model you're using.
But what you probably want to do is give it context. For example, we're developing an AI assistant for some of our solutions, which we will show you, and the first thing we tell the LLM is: hey, you are a very helpful, amazing, nice large language model that really wants to help people fix their security problems. You really want to encourage the LLM. So that's context; that's how you shape the LLM. But beyond the shape of the answer, there's also the content you feed into it, and that context is critical to the success and the relevance of the answers the LLM is going to give you. It comes from your business data, whether that's your customers' data or your company's internal data that keeps changing extremely fast, and you want to give the best context possible.

And that context is limited. The context window is a pretty constrained space; it depends, and it keeps increasing, but it's not infinite. It's a bounded number of tokens: 4,000 tokens, 20,000 tokens. So the ability to provide the most relevant context will make or break your users' experience with an LLM.

Now, you can put your business data in a relational database, in a data lake, or in a document database, but all of these systems are missing a critical capability. Because you're bounded by the size of the context you can provide to the LLM, the relevance of the results you feed into that context from your data matters enormously. You can imagine that if you give the LLM a slightly different context, you get a not-so-helpful AI that doesn't give your users the answers they want; that's going to make a big difference to the people using it. So the relevance of the data you fit into that context window matters a lot. And guess what: search engines are pretty good at providing very relevant results and being part of that context. We've taken that set of capabilities, the ability to build an inverted index, BM25, vector search, all of those things we do so well at Elastic, and packaged it under something called ESRE, the Elasticsearch Relevance Engine, for you to use; you'll see some examples down the road.

This is critical: your business data moves fast and it scales to terabytes, petabytes of data, and you can't feed an LLM petabytes of data. So you need to take the natural-language question, translate it into something you can run against your business data, get the most relevant results into a very small window, feed that to the LLM, and then get the answer back. That's a critical part of the gen AI workflow we're seeing today, and obviously search plays a huge role in that space.
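To make that workflow concrete, here is a minimal sketch of the retrieval-augmented loop just described, written against the Elasticsearch Python client. It is illustrative only: the index name, field names, token budget, and the `complete` helper standing in for whatever LLM API you use are all assumptions, not part of ESRE itself.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

def retrieve_context(question: str, max_chars: int = 8000) -> str:
    """Fetch the most relevant passages for the question (BM25 here;
    vector or hybrid retrieval would slot in the same way)."""
    resp = es.search(
        index="company-docs",                     # hypothetical index
        query={"match": {"body": question}},
        size=5,
    )
    passages = [hit["_source"]["body"] for hit in resp["hits"]["hits"]]
    # The context window is bounded, so trim to a budget: only the
    # most relevant passages make the cut.
    context, used = [], 0
    for p in passages:
        if used + len(p) > max_chars:
            break
        context.append(p)
        used += len(p)
    return "\n---\n".join(context)

def answer(question: str) -> str:
    # Shape the model ("you are a very helpful assistant...") and
    # ground it in retrieved business data, as described in the talk.
    prompt = (
        "You are a very helpful assistant. Answer ONLY from the context "
        "below; if it is not there, say you don't know.\n\n"
        f"Context:\n{retrieve_context(question)}\n\nQuestion: {question}"
    )
    return complete(prompt)  # stand-in for your LLM client call
```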
So what is ESRE? ESRE is a toolkit of capabilities and APIs for developing these gen AI workflows. It can act as a component in a retrieval augmented system, or it can drive the workflow itself. It has tons of choice in models; you can do text-based search, vector search, and hybrid search between them, plus the ability to filter data and build security models. And it's enterprise-ready: we've been building search for years now, so we have document-level and field-level security, the ability to take the role of the person searching, together with their geolocation and which department they belong to, and use all of that information to create a boxed window of data containing only what they're allowed to see, and then drive relevance out of it. That's a really hard problem, and it's what we've been building for many years.

At the same time, after gen AI happened, another thing happened: "all you need is vector search," and vector search became a big thing. Vector search is a critical component of these RAG or gen AI workflows, and Elasticsearch and Apache Lucene are a great vector database solution. But next to the vector database you also need the ability to create embeddings: when you take data and index it, you generate embeddings (those vectors, basically) so you can search it, and then when you run a query, you generate embeddings again. So a big part of it is how you generate embeddings and which model you use; that's a whole workflow as well, and we'll show examples in a few minutes. And then there's the whole world around it: how do I take the role of the user, or the department they belong to, and filter to only the data they need to see? How do I do aggregations and filtering? How do I take the geolocation of my customer or user and filter the data to only what they need to see? Because that context window is extremely constrained, and even as it grows, from 20,000 tokens to 100,000 or 150,000 tokens, that's not a step function in size; the relevance of the data you put into that context window will always matter significantly. So really, a great gen AI workflow is a great LLM plus a great relevance engine to feed that context window.
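Since the talk keeps returning to the point that role, department, and geolocation should constrain what the model ever sees, here is a minimal sketch of that pattern: a kNN vector query combined with a filter, so restricted documents never become candidates for the context window. In production Elastic enforces this with document- and field-level security at the role level; the query-time filter below just illustrates the shape of the constraint. The index, field names, and the `embed` helper are hypothetical.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Hypothetical: an embedding for the user's question, produced by the
# same model that embedded the documents at index time.
question_vector = embed("how are senior engineers compensated?")

resp = es.search(
    index="internal-docs",            # hypothetical index
    knn={
        "field": "body_embedding",    # assumed dense_vector field
        "query_vector": question_vector,
        "k": 10,
        "num_candidates": 100,
        # The filter runs during the vector search itself, so documents
        # outside the user's "boxed window" of data are never candidates.
        "filter": {
            "bool": {
                "must": [
                    {"term": {"allowed_roles": "manager"}},
                    {"geo_distance": {
                        "distance": "100km",
                        "office.location": {"lat": 37.77, "lon": -122.42},
                    }},
                ]
            }
        },
    },
    size=5,
)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```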
And with that, I'd like to invite Matt to do a demo of some of the things I've just spoken about, and also show some customer examples. [Applause]

So I'm going to be doing two demos today that build on each other. If you think back to the architectural diagram Shay showed, in that top-left corner there's this question of how we pose a natural-language question both to Elasticsearch and then out to the LLM with context from Elasticsearch. A lot of the work we do is around making sure we can interpret that natural-language query and provide relevant search results back. So the first demo is an approach to semantic search that is actually capable of interpreting those kinds of natural-language questions and giving you good results, and the second demo takes that and pulls it all together into a fully functioning retrieval augmented generation workflow like the one in the full diagram.

We'll start off with BM25 search: "largest city in Illinois." This is basically a search over a Wikipedia data set, and with BM25 matching you get your typical keyword match; you can see how it's picked up on the keywords of the query, and those are the things that are matching. If I switch over to the index that's been processed with ELSER, which is basically a pre-processing stage through a transformer model that changes a bit how the documents are stored in Elasticsearch, we get a much more semantically relevant result for the query at hand.

I'll show a few different examples of this. "Banks that went bankrupt": there we start to see even more of the challenges with pure keyword matching. It's picking up some of the specific keywords but not really understanding the meaning behind the words, and if you're trying to feed this into an LLM, you really need to understand what that query means, because queries are going to start coming in in much more natural-language form. If you compare that again to what ELSER brings back, we get much better results that have really understood the nuance of the query. Maybe the last example I'll do here is something like "a funny actor": again you get a couple of keyword-based results, and then with ELSER you get something that works much, much better.

The idea here is to provide semantic search out of the box with Elasticsearch. What's unique is that this model is commercially available with Elasticsearch: it ships with the software, and you can use it on any content you already have indexed, so there's really no work for you to do to get these kinds of search results and this kind of relevance out of content you already have. You don't have to go out and find the right embedding model for vector search or anything like that. This is one of the nice attributes of a late-interaction model: they're zero-shot, meant to work on any kind of text you throw at them, and as you can imagine, if we're going to ship a model with Elasticsearch, it needs to be something we're fairly confident will work on whatever content you have in your application.

Maybe the very last example I'll give before moving to the other demo is one in a foreign language, German. If we go back to BM25 and type "hello world" in German, how are we going to match a keyword like that? ELSER today works in English, but we also have support, which I think we'll talk about a little later, for bringing in third-party models, like a model you find on Hugging Face. So we have another version of this running through E5, a multilingual semantic search model, and you can see it actually interprets the German-language query and matches correctly to the English-language content.

All of these things are precursors to building a really nice retrieval augmented generation system, which we'll skip to now. The example here, to set context, is the idea of building a retrieval system for your internal employees, what we call workplace search: basically searching over all the internal data in your intranet and answering questions for employees. One of those questions might be something like: does Elastic own my side hustle?
That's an interesting query in a couple of ways, primarily because the system needs to know how to interpret the term "side hustle": what do we match that to? You can see here at the bottom the search results that came out of our internal system, a SharePoint drive, some content in Teams. Those are the actual search results, and those are what get fed in as context to the large language model to ultimately generate the answer here. As you can see, the results themselves come out of Elasticsearch and are then passed through to the Azure OpenAI Service to generate the long-form answer, grounded in the proprietary data inside Elasticsearch. You could probably ask that question directly to a large language model, but it's not clear which intellectual property policy it would be answering from; we want it to answer from the one specific to our company.

Another example, maybe one of my favorites, is "what is NASA?" If you were to ask ChatGPT that, I think we all know what kind of answer we would get: the Space Administration. But at Elastic, NASA stands for the North America, South America sales region. That's very specific to our company, and it comes out of a document giving an overview of our sales organization.

And then a final example, one that gets to the point Shay was just ending on: we've worked really hard to build capabilities like document-level and field-level security based on access control. In an internal intranet system, some people are going to have access to some documents and others to others. So if you ask a question like "how are senior engineers compensated?", in this case, because I'm logged in as an engineer, it doesn't actually have information about the entire compensation matrix and everything that goes into it. But if I change my role to manager and re-execute the query, we're able to find the right content through the search results, feed that into the LLM, and actually provide the answer.

So this pulls all of it together, but I think the primary thing to take away is that in order to build a retrieval augmented generation system like this, you really do need a smart system that can interpret these natural-language queries, so that the context we provide is actually the right kind of grounding. We've spent a lot of our time not just building vector search into Elasticsearch but also thinking deeply about what it means to get really highly relevant results out of Elasticsearch with whatever approach you want to take: vector search is often one of them, hybrid search is another, and things like ELSER are really showing the possibilities of what we're capable of here.

With that, we'll switch back to the slides and talk briefly about what we're seeing from a couple of customers. You can imagine the use cases for these kinds of queries; certainly as I speak to customers it's all across the board, from HR, like the example we just gave, to customer success and e-commerce search. And what you might notice about all the queries on the screen here is that, in order to answer them, they all require proprietary information inside the business.
We're going to have to bridge the gap between what the LLM was trained on and the actual specific information inside your company, and that's really what we're aiming at here.

A few things that a couple of our customers have been doing. Cisco (Sujith is here and will be speaking later today; he'll take you through this in a lot more detail) has been working on internal enterprise search experiences for quite some time, and they've been along this entire journey: every time there's a new advancement in any of these technologies, they're one of the first people we hear about doing interesting things with it. So we've been really excited to see what they've done with their enterprise search system for their employees and the rate of adoption they've had for the technologies we've brought to Elasticsearch. Similarly, another company called Relativity does eDiscovery search, and they're looking to use a RAG-type system to retrieve all of the information in an eDiscovery case, but also pass it to LLMs for things like summarization and analysis of those documents. Again, it all ultimately comes back to a RAG-type implementation. And with that, I'm going to hand it back to Shay to talk about what's next.

So, I want to pause for a second and talk about what Matt showed. When we think about gen AI, we immediately jump to thinking we need to use a large language model, ChatGPT or other services. But there has been amazing progress in this broader concept of transformer models, and you don't necessarily have to figure out how to use ChatGPT, how expensive it is, or how to constrain it. There are a lot of amazing models that make a step-function difference in the capabilities of search, like ELSER, that you can access today and use as if you were using regular BM25 search in Elasticsearch. That includes models we develop internally (ELSER is an example) and models developed by others, like E5, a publicly available model developed by Microsoft that is extremely capable, especially for multilingual search. So you don't have to wait as a company to make a material difference in the experience your users have in that search box, and that makes the search box much more valuable. You saw the Wikipedia example, and I completely agree that Robin Williams is the funniest actor ever, so I approve of ELSER's relevance model.

Next, I want to show you a bit about where we're heading. Obviously there's tons of work going on in Elasticsearch and elsewhere, but here are a few points that communicate the spirit of the direction. The first one: we're going to make Lucene the best vector database in the world. We have been making significant investments to make Lucene a great vector database, and we're making even more as we speak, as a company, together with the Lucene community. The same way we made it a great numeric search database, geospatial database, and columnar store, we're making Lucene the best vector database out there.
Lucene is more than 20 years old. It's one of my favorite, and I think one of the most romantic, open source projects in the world; hopefully Elasticsearch will live up to it at some point. It rhymes a lot with other open source projects like PostgreSQL in that it has survived the test of time. It was released in 1999, it has evolved, and it's an honor to be part of that evolution. We're going to make the investments to make it a great vector database; you'll get a chance to hear about all the things that are happening, from SIMD instruction improvements to better product quantization and more. The end result is that it's a great vector database today, and it's going to be the best vector database in the future.

Next, we're going to make Elasticsearch the best at semantic search and retrieval augmented generation. You've already seen that it's pretty damn good already. We've been building the components and investing in them for years, but to be honest, I think we can do a better job of connecting them all together. As we were building them there wasn't a general theme around gen AI, and the workflow wasn't obvious, so the components go deep but they don't combine and work well together. I'll give you an example; we had a whole offsite about how to simplify things, and I want to walk you through some of it.

Today, this is what it takes to use ELSER, and I'm going to walk you through the APIs. The first step is to install ELSER. Because ELSER is a built-in model, we know how to fetch it, download it, and install it onto our machine learning nodes, so that's not a problem. Next we need to start ELSER, which is fine: once installed, you can start and stop it so it doesn't take resources, and you decide when to use it. Afterwards you need to define a mapping; this is the schema of your data. And here it already starts to sound fishy, because there's a type there called rank_features. Rank features is a capability we built previously that gives you the ability to store sparse vectors. The name makes sense in its own context, but not in the context of gen AI and vectors: it's not called sparse vectors, but we've used rank_features as the way to implement sparse vectors.

Next we need to put an ingest pipeline. The way you use a model at ingestion is that you take the data and generate embeddings; those vectors, sparse or dense, are what end up being stored (and then at query time you have to do it again). Elasticsearch has a concept of an ingest pipeline, which does additional work when you index data. Here you need to define the model ID, so I have to remember the name of the model I used, and which field it's going to read, and then some magical configuration. It all makes sense on its own, when each piece is built in isolation in a very generic way. Next I index data, and I need to remember to tell it to use the ingest pipeline I just configured, because I don't want data indexed without the embeddings being generated. Then I can search, but when I search I need to use a text_expansion query, reference the ml.tokens field that was generated from those embeddings, and remember the model ID yet again. And if I then want to use something like RRF, I need to express my query as one sub-search, combine it with another, and define something called rank on top of my sub-searches.

I think you can see that a lot of the things we've developed go deep: rank features is great at storing sparse vectors, ingest pipelines make sense, but they don't combine well together. The thing is, we didn't know what the workflow was; now that we know the workflow, this is more complicated than it should be.
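For reference, here is roughly what that "before" sequence looks like with the Python client. It's a sketch of the flow described on the slides, with index and field names assumed; the exact pipeline configuration varies by ELSER version.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# 1. Install the built-in ELSER model, then 2. start (deploy) it.
es.ml.put_trained_model(model_id=".elser_model_2",
                        input={"field_names": ["text_field"]})
es.ml.start_trained_model_deployment(model_id=".elser_model_2")

# 3. Mapping: sparse vectors hide behind the rank_features type.
es.indices.create(index="docs", mappings={"properties": {
    "body": {"type": "text"},
    "ml": {"properties": {"tokens": {"type": "rank_features"}}},
}})

# 4. Ingest pipeline: generate the token weights at index time.
es.ingest.put_pipeline(id="elser-pipeline", processors=[{
    "inference": {
        "model_id": ".elser_model_2",          # remember the model ID...
        "field_map": {"body": "text_field"},   # ...and which field it reads
        "target_field": "ml",
        "inference_config": {"text_expansion": {"results_field": "tokens"}},
    }
}])

# 5. Index data, remembering to route it through the pipeline.
es.index(index="docs", pipeline="elser-pipeline",
         document={"body": "Chicago is the largest city in Illinois."})

# 6. Search with text_expansion, naming the tokens field and model again.
resp = es.search(index="docs", query={"text_expansion": {
    "ml.tokens": {"model_id": ".elser_model_2",
                  "model_text": "largest city in Illinois"}
}})
```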
So here's where we're heading: you define the mapping and just say which model you want to use. Because ELSER is built in, we'll just use those mappings. We call this "model as an analyzer": if you've been using Elasticsearch, the same way you say a field is English, you're just going to say this field uses a model. A model becomes like an analyzer, using a type called semantic_text, which is super simple: dense vector, sparse vector, whatever; the model knows what needs to be done. I index the data, and because that's part of my schema, Elasticsearch does the right thing: it understands it needs a model, generates the embeddings, and indexes them. You don't have to configure ingest pipelines or do all the other work. Obviously we're reusing all the components we've built, but they need to be packaged much more simply. And then when you search, it's a simple match query: we already know the field is semantic text, we know what needs to be done, and behind the scenes we can replace it with a sparse-vector search like text expansion and just make it work. This looks almost exactly like Elasticsearch from 10 years ago, which is pretty impressive: it does so much more than it used to, but it's as simple as it was 10 or 11 years ago when it had just gotten started.

And if you want to use something like RRF, we're going to introduce the retriever model. By the way, all of this is still in the design phase, so I hope our engineering team won't kill me if we end up changing some of the names. I like the retriever name; my suggestion is to replace "query" with "golden," so we can say we have a golden retriever. This retriever model is a new framework emerging in the search world, where you need more than just executing queries: you need something like a retriever search pipeline, where you combine multiple search requests and, for example, do linear scoring across them, a BM25 request together with a sparse-vector one, or something along those lines. This is all something we're in the midst of designing and developing as we speak, but you can see how simple using ELSER becomes: exactly like using an analyzer in Elasticsearch, and you don't have to worry about anything. A lot of the work has already been done; there's still a lot left, but it's about combining these things together, and we can do it because now we understand the flow, we understand how it's being used in a practical sense.
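Sketched as API calls, the simplified flow described here would look something like the following. Since the speaker says this is still in the design phase, treat every name (semantic_text, the model parameter, the retriever syntax) as illustrative rather than final.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Mapping: the model is declared like an analyzer; no pipeline needed.
es.indices.create(index="docs", mappings={"properties": {
    "body": {"type": "semantic_text", "model_id": ".elser_model_2"},
}})

# Indexing: embeddings are generated automatically from the mapping.
es.index(index="docs",
         document={"body": "Chicago is the largest city in Illinois."})

# Searching: a plain match query; Elasticsearch rewrites it to the
# right sparse- or dense-vector query behind the scenes.
es.search(index="docs",
          query={"match": {"body": "largest city in Illinois"}})

# Hybrid ranking via the proposed retriever framework: combine a BM25
# retriever with a semantic one and fuse the result lists (e.g. RRF).
es.search(index="docs", retriever={"rrf": {"retrievers": [
    {"standard": {"query": {"match": {"title": "Illinois"}}}},
    {"standard": {"query": {"match": {"body": "largest city in Illinois"}}}},
]}})
```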
Here's another example. That one used sparse vectors; what happens when you use dense vectors? This is an example with E5, which Matt showed earlier. ELSER is a sparse-vector implementation, so it uses the existing inverted index, while E5 is a dense-vector model, so it uses vector search. First of all, we have a tool that can reach out to Hugging Face, take E5, download it, and install it into Elasticsearch; that's the first step here. Afterwards you need to create a field of type dense_vector, and you need to define the dimensions of that dense vector, which similarity it will use, and which text field it maps to: that's the mapping. Then, if you remember, you need to define an ingest pipeline to generate the embeddings at ingestion time using that model, and then ingest the data while specifying the ingest pipeline. The search now uses a kNN query, because you're using dense vectors, but again you need to define k and the query vector builder, and you need to remember the name of the model, because it has to generate embeddings at query time just as it does at ingest time. And here's what an RRF implementation looks like: there's a query, and there's kNN, but kNN is not really a query, it sits at the same level, which again is a bit confusing. Why is it not a query? Is it something else? Is it a retriever? So all of this is great: the kNN query implementation is awesome, the ability to run an E5 model within Elasticsearch is amazing, the ability to combine things and generate embeddings, great. They all work really well on their own; they just don't mesh in a great way.
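Again as a reference sketch with the Python client, this is roughly the dense-vector flow just described. The model ID, field names, and dimension count are assumptions (E5 comes in several sizes; e5-small produces 384-dimensional vectors), and where the inference processor writes the vector varies, hence the copy step.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Model import happens with a separate tool (eland) on the command line,
# roughly:  eland_import_hub_model --hub-model-id intfloat/e5-small ...

# Mapping: you spell out dimensions and similarity by hand.
es.indices.create(index="docs-e5", mappings={"properties": {
    "body": {"type": "text"},
    "body_embedding": {
        "type": "dense_vector",
        "dims": 384,                 # must match the model (assumed e5-small)
        "similarity": "cosine",
    },
}})

# Ingest pipeline: embed the text field at index time.
es.ingest.put_pipeline(id="e5-pipeline", processors=[
    {"inference": {
        "model_id": "intfloat__e5-small",        # assumed ID after import
        "field_map": {"body": "text_field"},
        "target_field": "ml",                    # vector lands under ml.*
    }},
    {"set": {"field": "body_embedding", "copy_from": "ml.predicted_value"}},
])

es.index(index="docs-e5", pipeline="e5-pipeline",
         document={"body": "Hello world, nice to meet you."})

# Search: a kNN query that re-embeds the query text with the same model.
resp = es.search(index="docs-e5", knn={
    "field": "body_embedding",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {"text_embedding": {
        "model_id": "intfloat__e5-small",        # remembered yet again
        "model_text": "Hallo Welt",
    }},
})
```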
So here's where we're heading: again, everything is in the schema. You define the same semantic_text type, and now it uses the E5 model. The model carries all the information it needs inside it, the dimensions and all of that, so we can understand from the model exactly what's needed; it's in the schema, tied to that field. We index the data without defining ingest pipelines or anything like that; it just indexes, and knows what to do thanks to the mapping. Then we do a match query, and guess what: we know we can replace it behind the scenes with a kNN query, because that's exactly what needs to be done when you query a dense-vector field. And using the same retriever model, we can make sense of the fact that one leg is a query and another, for example, is kNN, and make it work.

That's where we're heading, and I hope you see the spirit of it. What our team has done is extremely impressive: we invested heavily in building great vector search capabilities in Apache Lucene, combined them, and exposed all of these capabilities in Elasticsearch, as you've seen in the demos. But now we have an opportunity to significantly simplify that experience and make it as simple as Elasticsearch was for regular search more than 10 years ago. So we want to make Lucene the best vector database out there, and we're going to make Elasticsearch the most comprehensive and simple search platform in the world.

What's the third part we're focusing on? Open. We've always been an open company, and Elasticsearch needs to be extremely open when it comes to the ecosystem of gen AI tools around it. We're going to work with every single LLM out there, public or private; that's not a problem, because we're building the pluggability to do it, and you're not going to be tied to a specific implementation. We're going to host LLMs, and also language models that are not that large, like ELSER and E5 and others; we'll host all of these models as well, to give you access to everything in a very simple, easy-to-use system. And we're going to work with all the various workflow systems and open source projects, like LangChain (you're going to hear from some of them today), Hugging Face, LlamaIndex, and OpenAI. We're going to be a system that is open to everything else that happens: we will take your private, real-time, privacy-oriented data, store it in Elasticsearch, and expose it to any other system in the gen AI ecosystem, and we'll be able to run some things ourselves too. You can progressively decide how much you want to rely on external tools versus Elasticsearch. But the core of what we do is to be the best search engine in the world; we have done it for many years, we think it's critical in gen AI workloads, and that's exactly what we're focused on building today.
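As one example of that openness, here is a minimal sketch of wiring Elasticsearch into a LangChain RAG pipeline. The store class and its parameters reflect LangChain's Elasticsearch integration as I understand it; these integrations move quickly, so check the current package docs, and the URL, index name, and embedding model here are assumptions.

```python
from langchain_community.vectorstores import ElasticsearchStore
from langchain_community.embeddings import HuggingFaceEmbeddings

# Embed with a Hugging Face model (E5, as discussed in the talk).
embeddings = HuggingFaceEmbeddings(model_name="intfloat/e5-small")

# Point LangChain's vector store at an Elasticsearch index.
store = ElasticsearchStore(
    es_url="http://localhost:9200",   # assumed local cluster
    index_name="company-docs",        # hypothetical index
    embedding=embeddings,
)

# Index a document, then retrieve the most relevant ones; the retriever
# can be dropped into any LangChain RAG chain as the grounding step.
store.add_texts(["NASA is the North America, South America sales region."])
retriever = store.as_retriever(search_kwargs={"k": 3})
docs = retriever.get_relevant_documents("what is NASA?")
```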
Next: I've talked a lot about Elasticsearch, the work we're doing in that stack, and the openness around it. Now I'd like to invite Ken to talk about what it really means to have ELSER and Elasticsearch in our solutions for security and observability.

Thank you, Shay, and good morning; great to be here. As we've talked about a lot this morning, Elastic has been investing in AI and ML for many, many years, and we've been doing it along two dimensions. A lot of what we talked about today sits on the first timeline: things we do to enable developers to build search applications and generative AI applications with these technologies, foundational building blocks like vector search, like ELSER, and like ESRE, which we recently launched. But we've also been investing on a second timeline, and that is incorporating these technologies into our own solutions. Elastic has two out-of-the-box solutions, for observability and for security, and we've been embedding these technologies there, so you can use things like anomaly detection in observability and ML-powered detection rules in security. Today I'm going to talk about a new thing we've been doing called the Elastic AI Assistant.

Like a lot of companies these days, we launched an AI assistant earlier this summer; everyone's been doing copilots and AI assistants. But I think the worlds of security and observability are kind of unique: they're uniquely primed for disruption, and to benefit from the power of generative AI. You have lots of specialized knowledge in these spaces; you have practitioners who rely on a lot of domain-specific knowledge and who are getting increasingly difficult to hire; and you have people relying on pattern matching and processes that are fundamentally more soundly and efficiently handled by machines than by humans. So I think you're going to see a lot of disruption in observability and security, because the tasks people perform there are better handled by machines.

Three such tasks are detection, diagnosis, and remediation. We're already seeing this with detection: the whole promise of AIOps, for example, was about automating detection, helping you use anomaly detection to automatically find issues in your software, and you're seeing it in security with ML-powered detection rules. But you're also going to see it in diagnosis and remediation. A lot of what we do today is give security analysts tools for threat hunting, but that act of threat hunting can be automated as well. Same thing in observability: a lot of the diagnosis SRE professionals do is about looking at the different signal types and using tools to correlate issues across them to get to root cause, and that can be automated too. And once you automate that, you can start getting towards remediation, and automating remediation. I used to think that world was far away; I don't think it actually is, and I think we're starting to see it today in some of the tools we're providing at Elastic.

Earlier this summer we launched something called the AI Assistant for Security. It does some simple things, like query conversion to help you translate queries between different languages, but it starts to help with remediation as well: it looks at a critical security issue, helps you interpret it, and then suggests a playbook or runbook for how to remediate it. There's a talk later today by James and a couple of folks on my team where they'll dive deeper into this; I encourage you to check it out and learn more about the AI Assistant for Security.

The other assistant we launched is the AI Assistant for Observability, and rather than talk about it, I'm just going to show it to you. This is something we launched just a couple of weeks ago, and it's in tech preview, so I encourage you to give us feedback; we're still working on perfecting it so we can get it to GA, but I think it's really exciting.

I'll walk you through a scenario. I'm an SRE professional and an alert has fired, telling me that a log threshold has been breached; you can see the alert here, and there's been a spike in errors related to this set of logs. One of the things we do right now is automatically run a log spike analysis, so that when you open the alert, we decorate it with additional information and you get more details about why it fired. The log spike analysis is essentially a machine learning process that looks at the deviation, compares it to a baseline, and determines some things you should look at: it does pattern matching over the different field names and field values and prioritizes what you need to look at. This is cool, and it gives you lots of information, but it's traditional machine learning; it's not special, it's stuff we've been doing for years.
But what is new here is this box down at the bottom called possible causes and remediations. This is powered by the AI Assistant, and what I can do now is ask it to analyze what I've done up above and tell me what it thinks the possible root cause is and how to remediate it. What it does is pass all that context to an LLM: the entire log spike analysis gets passed to the LLM, which determines the possible root cause. In this case, powered by GPT-4, it's telling me it believes the issue is related to something called pgbench, and it gives me some steps for remediating it. And if you look at the initial log spike analysis, you can see that in fact a bunch of the issues high on that list related to pgbench, so it makes sense that this is what the LLM determined as the root cause.

That's cool, but I can take it one step further. I can continue this chat and ask, for example, "what is pgbench?", and it passes the context to the LLM and comes back and tells me pgbench is a benchmarking tool for Postgres. OK, that's cool, but fairly simple; I could do that in ChatGPT. Maybe it's nice that I have it in the context of my application, but there's nothing really new there. So let's get to something that really is special. What if I wanted to ask: did this incident have an impact on my revenue on June 7th, 2022? How on Earth could ChatGPT or GPT-4 know that? It doesn't know the operational history of my systems, it doesn't know how to interpret this question, and it doesn't know how to assess revenue impact.

Well, now we're going to see ESRE in action: the retrieval augmented generation workflow. We prompt GPT-4 with some additional information, including how to ask us, the AI Assistant and Elasticsearch, for more information. What happens is that it determines it needs a list of service names, so it invokes the function get_apm_services_list, and we pass back a list of service names. Then GPT-4 determines it probably needs information about a particular service called the checkout service, and it asks for APM time series data, which gets interpreted and passed back, generating two charts: one for checkout service throughput and one for checkout service failure rate. And what you can see is that, in fact, there was a dip in throughput on the checkout service, as well as a spike in the failure rate. It assesses what happened and determines that yes, on June 7th there was an impact to the checkout service, which it believes did have an impact on revenue. All of this happened as a back-and-forth between our assistant, powered by ESRE, and the LLM.

Now let's go one step further. What if we ask: can you visualize my revenue data for June 7th, 2022? Again, there are a lot of additional things it needs to do; it needs to understand how to interpret revenue: what is revenue, and how do I assess it? So again it goes back and forth and asks some additional questions.
It determines that it needs to invoke a function called recall, which uses the ELSER model over our knowledge base to find out how to get revenue information from us. It then determines that it needs to invoke the Lens function and plot a chart in Lens that shows revenue, and it finally gives us a blurb saying: yes, you can visualize your revenue data, and the incident did have an impact, as you can see here. This is a chart produced in Lens that you can drop into any dashboard. So all of this is demonstrating ELSER and ESRE in action, as used in the Elastic assistant, going back and forth with the LLM to provide additional information. I'm excited about this, and you can learn more about each of these demos later today; we'll talk more about the assistant for security and the assistant for observability in sessions later today. So thank you.
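The back-and-forth Ken describes is a standard tool-calling loop: the model is told which functions it may request, and the assistant executes each request against Elasticsearch and feeds the result back until the model can answer. Below is a minimal sketch of that loop using the function names from the demo; the `llm_chat` helper stands in for whatever chat-completions API you use, the tool implementations are stubs, and everything here is illustrative rather than the assistant's actual implementation.

```python
import json

# Tools the model is allowed to request, mirroring the demo. The
# right-hand implementations are hypothetical stand-ins.
TOOLS = {
    "get_apm_services_list": lambda args: ["checkoutService", "cartService"],
    "get_apm_timeseries":    lambda args: query_apm(args["service"]),
    "recall":                lambda args: search_knowledge_base(args["query"]),
    "lens":                  lambda args: render_lens_chart(args["spec"]),
}

def run_assistant(question: str) -> str:
    messages = [
        {"role": "system", "content": "You may call the listed functions "
                                      "to fetch data before answering."},
        {"role": "user", "content": question},
    ]
    while True:
        reply = llm_chat(messages, tools=list(TOOLS))  # stand-in LLM call
        if reply.get("tool_call") is None:
            return reply["content"]                    # final answer
        name = reply["tool_call"]["name"]
        args = json.loads(reply["tool_call"]["arguments"])
        result = TOOLS[name](args)                     # run against our data
        # Feed the tool output back so the model can continue reasoning.
        messages.append({"role": "tool", "name": name,
                         "content": json.dumps(result, default=str)})
```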
If you leave today remembering just one thing: don't run benchmarks in production. That's always a good rule; your SREs will thank you. But I want to touch on Ken's point. Many years ago we built a search engine, and we didn't really think about logs and threats and things along those lines, but suddenly we saw Elasticsearch being used to store and explore logs in the SRE, DevOps, and observability market. Then we started to see security people using Elasticsearch to do threat hunting, and that was pretty amazing, and we asked ourselves why. A big part of it was that search is amazing, as a technology and as an experience, for exploring logs and observability data points. It's really liberating; people had been confined by their older tools, which weren't open source, weren't open. Same thing in security: threat hunting used to take ages with previous SIEM tools, and suddenly you could search your security threats very quickly and get answers, instead of chasing the bad guys for days trying to figure out what happened with Log4Shell or something along those lines. Then we went on a journey as a company to build complete observability and security solutions, and those experiences are pretty amazing compared to 10 years ago, when you just had a search box where you could find a list of threats or a list of logs.

And what happens now with gen AI feels like coming back to the beginning, because suddenly the importance of search becomes even more critical in these solutions, and a company that does search extremely well will be able to build the best AI assistant for observability and for security. I think that's a very exciting future for us when it comes to our solutions, for our ability to give you the best observability tool and the best security tool that you deserve.

So what is the future of search for us? It's all about search. We will build curated experiences in security and observability, where we've made tons of investments, but search is a critical component of gen AI workflows, and sometimes all you need is search. We are a leader in search: we've built the most popular search engine in the world today in Elasticsearch, which is very humbling, and we're going to work very hard to make sure that remains the case. We're using the most popular information retrieval library in the world today in Apache Lucene, and we're going to make sure Lucene becomes the best vector database out there; we're making those investments. And we're going to make sure it's easy and simple to use, the way it was 10 years ago. These things are very complicated, with models and vectors and embeddings, but I hope you can see that we can simplify that experience the same way we did 11 or 12 years ago when we first released Elasticsearch. So thank you very much, and I'd love to invite Jeff from Hugging Face, a great partner of ours, to give a talk. [Applause]

2023-10-13
