Build a RAG Based LLM App in 20 Minutes! | Full Langflow Tutorial
In this video, I'll show you how to build your own artificial intelligence application that uses RAG (retrieval-augmented generation) without writing a single line of code, and in just a few minutes. The way we'll do that is with something called Langflow. I have it open on my screen, so you can see what the finished application will look like. Notice that it's all visual: we can pull in pre-built components, connect them in a really intuitive way, and then run the entire application right from this platform. Let me run it for you and give you a quick demo, and then I'll show you exactly how we build this.

So the demo is running, and just to give you a bit of information, this app is going to represent a chatbot for something like a restaurant. Really, it could work in any context, but the idea is we'll have something like a PDF that has some commonly asked questions. If you're a restaurant or a store, you probably get asked the same things all the time: when are you open, where are you located, do you accept credit cards or cash, whatever. That's all inside a PDF document. We're going to pass that to the LLM, and then the LLM will be able to answer questions based on that document. It's also going to store our conversation history, so it will remember what we asked last.

Let's ask it a question. Notice that I can even put in my name here, so that we can see who's asking the question. I'll say, "What time are you open, and do I need a reservation?" Let's see what it says, even though I spelled that incorrectly. There you go, we got a response: we're open from 11:00 a.m. to 10:00 p.m.,
blah blah blah, and while reservations are recommended, we always try to accommodate walk-ins. Let's ask it something else. Let's say, "Can you tell me about the specials and the menu?" And there we are, we have a reply that's giving us some information from the PDF: a variety of vegetarian and vegan dishes, they have a kids' menu, and there are specials based on the chef's selection. Now if I go back to the PDF and look at our specials, for example, it says, "Yes, we feature specials that include dishes and chef's selection." "Do you have a children's menu?" "Yes, we have a children's menu." "Do you accommodate food options?" Et cetera. It took all of those different responses and combined them together. I know it seems simple, but this is something you can really extend and make quite cool. So let's have a look now at how we build this.

Okay, so to get started, we need to install Langflow. Don't worry, it's free, it runs locally on our computer, and we can install it using pip so long as we have Python version 3.10 or above. Once we do that, we're going to create a vector store database using something called Astra DB from the company DataStax. DataStax is actually the sponsor of this video. Don't worry, they're free, you don't need to pay for them, and they've teamed up with Langflow to make this that much better. After that, we're going to use OpenAI: we'll create an OpenAI API key and use it to connect, so that we can have a really powerful LLM. But if you don't want to use OpenAI, you can use really any model you want, and you'll see in Langflow that you can connect to a bunch of different ones.

So let's go through the first step, which is installing Langflow. To do that, we're going to copy this command from the documentation, which I'll link in the description, to install the pre-release of Langflow. We'll go back to VS Code and open up some kind
of terminal. In this case, I've just created a new folder, and this is where I'm going to have my PDF file. We'll talk about the PDF in one second, but for now let's paste this command in. If you're on Mac or Linux, it's going to be pip3 install langflow --pre --force-reinstall, and if you're on Windows, it's just going to be pip. Okay, so we'll run that command. I've already got it installed, so I'm not going to do that. It will take a few minutes to run because there are quite a lot of packages that need to be installed. Then we can run the command langflow run in our terminal. When we do that, it should start Langflow on localhost and give us a browser window where we can start creating our flows.

All right, so after you wait a second, you should get some kind of output: starting Langflow, whatever the version is, and then it will open it at this address. I've got it open here in my browser. You're not going to see any flows because you haven't created any yet, but in my case I've created a few, so they're here. Now we can just press New Project and make a new flow. You can work off a template, or you can just click Blank Flow, which is what we're going to do for now.

The way these flows are stored is as JSON, so you're able to import and export flows with a simple JSON document. I'll leave a link to a GitHub repository in the description that will have the JSON for the flow we're about to build, as well as the PDF document and any other information you need, in case you want to go right to the finished product and not build it out yourself.

Anyways, we're now inside a flow and we can start building out our application. But I do want to mention that we need some kind of PDF to provide to the app. In my case, I have this restaurant Q&A. I'll leave this in the GitHub repo in case you just want to download this one,
but you can also make your own, or use a PDF you already have, and feed that into the app. Just make sure you have some kind of PDF ready and accessible, because we're going to need it as we build out this flow.

Okay, so let's start building the flow and go through all the different components, see how to connect them, and build this application. I'm going to grab a few components from the sidebar and bring them in, and we'll talk through it as we go. I'm going to go into Inputs, and notice that you can also just search for things up here. For inputs, I'm going to grab a Text Input, a Prompt, and a Chat Input. Let's zoom out a little so that we can see all of these. If you hover over them or look at them, it tells you exactly what each component is meant for.

What I want to do is get some text input, which is the user's name. This is how we'll understand which user is asking which questions and store that in some kind of chat memory. For the Text Input, I can change the name of it, so I'm going to call it Name by clicking on Edit. The value is something the user is expected to input, so we don't need to put anything here. Then we'll take that and pass it as the sender name. Notice I'm taking the output, which is whatever they type in, and that's going to be the Sender Name on my Chat Input. The Chat Input is going to be the question that the user types, so they can type whatever they want here. I'll show you how we run that in one second. Then we're going to take that and pass it into our prompt. The Prompt here is kind of like a prompt template, where we can write out a few different variables that we want it to accept, and then structure how we want to ask the LLM a
question. So what I'll do is click on this link here and delete it, because I actually don't want that. I'm going to click on Template and write a simple template, and I can embed prompt variables using curly braces. I can say something like, "Hey, answer the user's question based on the following context," then "The context is this," and pass context inside curly braces, and now that's a prompt variable this can accept. Next we'll say, "The user's question is this," and that will be question. Then we'll have the message history as well: "And this is the message history," and that is history. We can adjust this if we want to make it better, but let's say: hey, answer the question based on the following context, then context, message history, question. Okay, looks good to me. Let's check that, and notice these variables now appear here in our prompt, and I can pass values into them.

The chat input goes in as the question. The context we need to provide ourselves: that's going to come from our RAG, retrieval-augmented generation, and we're going to use a vector store database for that. The history is going to come from our chat history. So how do we do that? Let's bring in our history. I believe it's Message History we're using... or actually, it's going to be Chat Memory. This retrieves stored chat messages for a specific session ID, and the session ID is going to be the name of the user who's interacting with the model. So we're going to take the name and pass it as the session ID, and link that up right here. The idea is that whenever we change the name, it changes the memory we're using. If we were talking as Tim and then we change it to Sarah, we get the memory for what that user was typing before. So it kind of remembers who's
saying what. That's the idea here. So we're going to take that, and that's going to be our history. Now that we have that, we can pass the prompt to an LLM, pass its result to a chat output, and test this before we implement the RAG component, which we'll do in a second.

Okay, so how do we do that? We need to pass this to an LLM. If we go to Models, you can see there are a bunch of different models we can use. I'm going to use OpenAI, just because that's the simplest for this tutorial, but you can use Ollama, Vertex AI, or the Hugging Face API; there are a bunch of free options here. One option you could use is Ollama: you just need to set it up locally, and then you can run your own local LLM. Again, in our case we'll use OpenAI, but you can literally swap this LLM for any LLM on the left side that you want to use. So we'll take the text and pass that as the input to the LLM. Now we need to get an OpenAI API key, so let's do that and then move on to the next step.

In order to get this API key, we do need an account with ChatGPT, or rather with OpenAI. The URL is platform.openai.com/api-keys. Now, they've changed
this recently so that you do actually need to fund the account, so you'll need to put a dollar or two into the account to be able to use this. Again, if you don't want to use this, you can use any other LLM or API you want, but OpenAI right now is really just the simplest way to do this. You need to go to API Keys, create a new secret key, and then copy that key. So we'll make a new key here. I'm just going to call this "tutorial", and then you can give it scopes, but in my case I'm just going to give it access to everything. It's going to give me a key that I'll delete afterwards. Let me copy that, and then we can use that key.

Just to show you, if you want to see whether you're actually able to use this or not, you can go into Usage and it will show you how much usage you have. I've used this a little bit today. Then you can click on Increase Limit, and that brings you to this page right here, where you can set a monthly budget and go through all of this. You also have the ability to buy credits and fund the account. So I can go to Buy Credits and fund the account with a certain amount of money; in my case I just put in $5. You can do that from your credit card, you can have it auto-recharge, et cetera. They've changed a little bit how this works, but if you're getting an error saying that you've reached your quota, it's because you need to fund the account with a little bit of money for it to work properly. If you guys know another way around that, that's great, but in my case that's what I had to do.

Okay, so now that we have the API key, I'm going to paste it here into the OpenAI API Key field. What I can do, if I want, is make a new variable. I've had some variables here before, but I'm going
to make a new one. I'm going to call this "OpenAI key new", paste the value here, and make it of type Credential, so it's automatically going to be encrypted. Then I can just use that variable, so I'm going to go with that, in case I need to use it later on in my program.

Okay, so now we're passing the prompt to the LLM. Next we need to take the output from the LLM and display it to the user. To do that, we go to Outputs, and you'll see we have a Chat Output. This displays a chat message in the interactive panel, so we'll take the text and connect it to the message, and the sender name will simply be AI. I know it's a little bit small, but you get the idea: we have our chat output, LLM, prompt, chat input, name, and chat memory. Now I can click on Run, and you can see that I have a name field, so I can enter something like Tim and ask it a question. I can say, "Hey, how is your day going?" When I hit Enter, you can see it shows my name, and then it responds, "My day is going well, thank you for asking. How about yours?" Perfect. So we have a basic LLM pipeline set up.

Now what we need to do is implement RAG, retrieval-augmented generation, so that we load in the PDF and are able to answer questions based on it. In order to do that, we need to set up a vector store database, so let me talk about how that works.

All right, so I'm now on the DataStax website, where they're talking about their Astra database, which is a NoSQL and vector database specifically for generative AI apps. This is a really fast way to retrieve relevant information when we're building things like retrieval-augmented generation pipelines, which is what we're doing. I want to quickly explain how this works and what a vector database actually is. When we build this RAG app, what we're trying to do is take the relevant pieces of
information from all of our context, in this case a PDF document, and inject that into the prompt alongside whatever the user asked. That way, the LLM has relevant data to answer the question. However, in order to do that, we need to be able to retrieve the relevant data really quickly, and we need to get the best match we possibly can. So the process goes like this: the user types something in; we go to the vector database and search for things relevant to what they typed; we get that information and inject it into the prompt; we also pass the user's question; and then the LLM gives us a response. But first, of course, we need to take all of our data, turn it into vectors, and put it into the vector database. That's what we'll be using DataStax and their Astra database for. It runs on Apache Cassandra, and again, it just provides a really quick way to get relevant information and inject it into the prompt. When I start building this out I'll talk about it more, but for now, go to the link in the description and create a free account, and I'll show you how we can spin up a new database that we can then connect to from Langflow.

Once you've made your account and signed in, you should be brought to a page that looks like this. We're going to look for the button that says Create a Database, click on it, and go with the serverless vector database, which is exactly what we want for this app. You can name this anything you want; just keep in mind that you can't change it. I'm going to call this "langflow tutorial". Then we can choose our provider. I'm going to go with Google Cloud, in US East. Obviously, if you want to use this in production, you can upgrade your account, but in our case we just want something simple and free that we can test with for the video. Okay, so it's going to make the database, and that will take a few minutes. Once it's done, I'll be
right back, and then I'll show you how we get the information we need to connect to this from Langflow.

All right, so now that this is loaded, we're good to go. We're going to be looking for the token and the endpoint here, as well as a collection name, but we can deal with that later. Let's leave this open, go back to Langflow, and start creating this connection. What we're going to do is go to the chat input, and once we have that input, pass it to a vector search. That's going to look through our database, find any relevant pieces of information, and inject them into the prompt. But before we can do that, we need to build the database with that information, so let's do that down here.

I'm going to zoom in a bit. On the left-hand side, we go to Vector Stores and take Astra DB, which is what we're using. Here we're going to put in the API endpoint, the token, et cetera. So we go to DataStax Astra, and I'm going to copy the endpoint and make a new variable for it. I'll go here and click on the variable option. Let me just delete the ones I had before to make sure we don't get mixed up. Okay, get rid of all of those, and let's make a new one and call it "endpoint". We'll make this Generic, because we don't need to hide it, and then save the variable. So let's set our endpoint here for the API endpoint. Sorry, for some reason that went into our token field. Then we need our token, so let's generate the token, copy it, close this, and go back. Let's make a new variable for the token and call it astra_token, paste the value here, and make this a Credential. Okay, so let's set that as the Astra token. Then for the collection name, we're going to make a new variable as well. Let's do that here, and we're just
going to call this "collection", and the value is just going to be "PDF", because we're storing some PDFs, and this will be Generic. Now we'll set that as our collection. So now we have our token, our endpoint, and our collection. Next we need to pass some inputs to this, and we need to pass the embeddings. The input is going to be the file, or files, that we want to load, and the embeddings are going to be how we vectorize them.

Let's start with the embeddings. On the side, click on Embeddings and go to OpenAI Embeddings. Leave this as the default text embedding model, and for the OpenAI key we can just set it to "OpenAI key new", the variable we created. We drag this and connect the embeddings. Now we pass our inputs. For the inputs, we're going to get a file, so we can just search for one here. This is a generic file loader, so let's load in our file. In my case it's on my desktop, in this folder called restaurant Q&A. Obviously yours is wherever you've put it, but locate it and load it in. Then we pass this into our text splitter, so let's find that; this is Split Text. Let me give us some more room here.

All right, so what Split Text does is take the file and split it into a bunch of different chunks. It then passes them to the Astra database, where they'll be converted into vectors. So rather than passing the entire file at once, we split it into small chunks, then pass those to the Astra database along with our OpenAI embeddings. The embedding model is a special model that makes this kind of search possible: it converts things into vectors, which lets us compare the vectors for similarity and get back the results we want. Now, we're going to be
using the same embedding model for creating the database as for searching in the database, which you'll see in a second. So I'm going to take my file and pass it as the input to Split Text. We don't need to change any of these settings, but we can if we want to. Then we take this record and pass it to the inputs of the Astra database. And that's it: this will now create a new collection in the Astra database called PDF, or whatever we named it. It does that by taking in this file, splitting it into chunks, passing them here, and then embedding all of them as vectors. Okay, perfect. Now I'm just going to take this embeddings component and make it a little easier to view, because we're going to use these embeddings up here as well. I know this is difficult to see; I'm going to try to zoom in as much as I can.

Okay, so now, after we have the chat input, we want to pass it into a vector search, where we go and look for all of the relevant information. Let's move our prompt up here and get our vector search. We're going to use Astra DB Search, which is right here. Let's zoom in so we can see this. For the token, this is going to be our variable, the Astra token. For the endpoint, it's our endpoint variable, and for the collection name, it's the collection variable. That's why we made variables. The input value is going to be the chat input, so whatever the user typed is the input to our Astra DB search. And the embedding is going to be the one we had here, so we'll just connect those embeddings. Give me a second, I'm going to clean this up so it's a little easier to see.

Okay, so I've cleaned this up a bit so it's more organized. Now we're going to take the output from our Astra DB search, and we're going
to connect that all the way up here to the context for our prompt. Remember, we have the context, the history, and the question. We already have the chat history and the question; the last thing we needed was the context, and that comes from our database search.

Okay, so let's quickly run through this at a high level. First, we get the name from the user. We pass that into the chat input, and that's the name of the person who sent the message. Then we ask them to type something in; this is their question. We also connect the name to chat memory. The chat memory is there so that we can store the last, in this case, five messages that the user typed. Then we pass the chat input into the prompt. As well as that, we pass the history, which is our chat memory, and the context from our vector search. So our chat input goes to the vector search, we embed it using the OpenAI embeddings, and we look for any relevant information from our PDF, or whatever documents we put in here. Then we get that and pass it to the prompt. That's what the database search does: it searches inside this vector database for relevant information and gives it to us, and it gets injected into the prompt. Then we take the prompt and pass it to OpenAI, OpenAI gives us some output, and we print that out. Meanwhile, over here, we have our file. We load in the file, split it into a bunch of different chunks, and pass that to our Astra database so we can create this database, and we do that using the same embeddings that we use for our database search.

So now if we want to run this, we can click on Run. Let's erase all of this and ask it some kind of question: "Hey, can you tell me the hours the store is open?" Let's see if this works. So I was just playing with this, and it's working properly. One thing to note: you do need to ask it one prompt first that it's not
going to give you a good response for, because it needs to build that vector store first. Before the vector store is built, it can't give you a valid response, because it doesn't have any context. You're not seeing it on the screen, because I cleared it and did it again, but the first question I asked didn't get a valid answer. The second question worked, because by then it had run this secondary component chain down here, which built out the vector store.

The last thing I want to show you is that we can change our name, and when we do, it resets the message history. I'm now talking as Joey, and I'm going to say, "Hey, have I asked you anything yet?" You'll notice it says no, and the reason is that the last person talking to this was Tim, not Joey. When we change the name, it changes the message history, which lets each person have their own log based on who's asking questions. So it says, no, you haven't asked anything yet, and then we can say, "Okay, cool, thanks," whatever, and that gets tracked in Joey's message history, whereas Tim's message history will be different. It's just an added feature I wanted to include in this project.

The last thing I want to quickly mention is that you can import and export these flows as JSON. I can go here, change the name, obviously, and export this. I can give it a name, something like "RAG", and download it as JSON. Then, if I already have one I want to import, I can go back here and load it up. Let me go to my downloads, take this, and drag it in. Sorry, I need to actually be in the correct place. If I drag it here, you can see that it just loads it in. That's how easy it is to load in the JSON and share it with other people. So what I'm going to do is take this JSON
and I'm just going to leave it in the GitHub repository linked in the description, so you can just drag it into Langflow and play around with it. There you are, guys. I hope you enjoyed this video. If you did, leave a like, subscribe to the channel, and I will see you in the next one.
2024-04-30 01:38
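
Langflow handles all of this visually, but if you're curious what the flow is roughly doing under the hood, here's a minimal Python sketch of the same idea: split a document into chunks, turn each chunk into a vector, find the chunks closest to the user's question, and drop them into a prompt alongside that user's message history. To be clear, this is not Langflow's code or the Astra DB API. The names split_text, embed, ToyVectorStore, and build_prompt are made up for illustration, the embedding is a toy word-count vector standing in for the OpenAI embeddings, the FAQ text is a stand-in for the restaurant PDF, and the step that sends the prompt to an LLM is left out.

```python
import math
import re
from collections import Counter, defaultdict, deque

# Same three variables the video's prompt template uses.
PROMPT_TEMPLATE = (
    "Answer the user's question based on the following context.\n"
    "Context: {context}\n"
    "Message history: {history}\n"
    "Question: {question}"
)


def split_text(text, chunk_size=120):
    """Group sentences into chunks of at most roughly chunk_size characters."""
    sentences = re.split(r"(?<=[.?!])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > chunk_size:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def embed(text):
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database collection."""

    def __init__(self):
        self.rows = []  # list of (vector, chunk) pairs

    def add(self, chunks):
        self.rows.extend((embed(chunk), chunk) for chunk in chunks)

    def search(self, query, k=2):
        # The query is embedded the same way as the stored chunks.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


# Chat memory keyed by user name (the session ID), keeping the last five messages.
memory = defaultdict(lambda: deque(maxlen=5))


def build_prompt(store, name, question):
    """Fill the template with retrieved context and this user's history."""
    context = "\n".join(store.search(question))
    history = "\n".join(memory[name]) or "(none)"
    memory[name].append(f"{name}: {question}")
    return PROMPT_TEMPLATE.format(context=context, history=history, question=question)


# Hypothetical FAQ text standing in for the restaurant PDF.
FAQ = (
    "We are open from 11am to 10pm every day. "
    "Reservations are recommended, but we welcome walk-ins. "
    "Yes, we have a children's menu. "
    "We accept credit cards and cash."
)

store = ToyVectorStore()
store.add(split_text(FAQ, chunk_size=60))
print(build_prompt(store, "Tim", "What time are you open?"))
```

Asking again as a different name starts from an empty history, which is the same behaviour as switching from Tim to Joey in the flow. A real setup would swap embed for an actual embedding model and ToyVectorStore for a real vector database, and that's where the quality of the matches comes from.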