- [Jon] Hello, I'm Jon Roberts. I'm a Senior Go-To-Market Database Specialist. I'm joined today by Ayan.
He's also a Go-To-Market Database Specialist, and together we'll cover Amazon Aurora ML. So you may be familiar with Amazon Aurora, especially Postgres, and the capability of storing vectors inside the Postgres database and being able to index and search on them, but you may not be familiar with the capabilities for interacting with machine learning models in Amazon Bedrock, or even how that actually works. And so, today, we're gonna show you how you can leverage SQL and Amazon Bedrock to create generative AI applications and bridge the gap between what you may already know in the SQL programming language and what's possible with the large language models available in Amazon Bedrock. So with that, I'm gonna hand this over to Ayan to let him kick off the presentation. - [Ayan] Thank you, Jon. Jon already introduced me.
But just to give a bit more: my name is Ayan Majumder. I'm a Go-To-Market Solution Architect at AWS supporting the FSI and Tmax segments. I'm based in Minneapolis. And now let me- Jon, are you able to see the screen, the deck? - [Jon] Yeah, we can see it just fine. - [Ayan] Okay. Awesome. So, this is our agenda for today.
First, we'll discuss Amazon Aurora and its key benefits. Then, we'll discuss Amazon Bedrock and how it works. We'll also discuss how we can ingest data into Knowledge Bases for Amazon Bedrock.
Next, we'll discuss Amazon Aurora ML and how it integrates with Amazon Bedrock. We'll also showcase this in a demo. And lastly, we have a Q&A session. Also, if you have any questions during the presentation, please feel free to drop them in the chat.
Either Jon or I will answer those questions. Moving to the next slide. So what is Amazon Aurora? Amazon Aurora is a fully managed relational database engine that is compatible with MySQL and PostgreSQL. It provides powerful performance for intensive applications and critical workloads at one-tenth the cost of commercial databases, so it's cost effective.
Amazon Aurora also offers high availability, which allows you to build your applications with Multi-AZ features and perform cross-Region replication. The recovery time is typically less than a minute or so in case of disaster. It's simple to use and follows a pay-as-you-go model.
Aurora also decouples storage and query processing. It maintains six copies of data across three Availability Zones, two in each Availability Zone, to protect against an Availability Zone failure, and it can promote any reader node to a writer node in case of failure. Amazon Aurora also has a self-healing feature: it automatically detects and repairs data corruption, ensuring your database remains highly available and reliable. It has a faster recovery capability, which allows the database to quickly resume normal operations after a failure so your application continues to run smoothly. It also supports elasticity, so it can quickly scale up or down depending on your workload demands, without any disruption to your applications. So these are the other benefits we get with Amazon Aurora.
In terms of backup, Amazon Aurora supports automatic backups. You get a daily backup based on your backup window, or the maintenance window, I would say. It backs up your entire database instance and your transaction logs. And you get to choose how long you want your backups to be retained; the maximum is 35 days.
And to ensure durability and availability, Aurora keeps multiple copies of your backups across Availability Zones. So in case one of the Availability Zones fails, you can easily restore your database instance from the backup. And there is very minimal impact on database performance during the backup. With a Multi-AZ deployment, there is no I/O suspension at all, because all the backups are taken from the standby instance, not from the active instance. Next: fast cross-account database cloning.
Another feature of Amazon Aurora is cross-account database cloning. When you create a clone of an Aurora database, a new Aurora database with its own compute instance is created, but it connects to the same underlying storage volume as the original database. So this process is not copying the data; it's pointing at the same storage. And after the clone is created, any updates, deletes, or edits made on the original database are only visible to the original database.
And any updates made to the clone database are only visible in the clone database. The data that was present at the time of the clone creation is still visible to both the original database and the clone database. A typical use case: if you want to test your application on real production data instead of creating synthetic test data, you can use this feature.
And once the testing is done, you can delete the clone database to save some cost. Additionally, this feature is helpful if you want to, let's say, reorganize your database, run additional analytics workloads against your transactional data, or save a point-in-time snapshot for analysis without impacting the production system. In those cases you can use this cross-account database cloning feature. Moving to the next slide.
So, Amazon Aurora zero-ETL integration. Why do we need that? Customers across industries today are looking to implement near-real-time analytics on their transactional data. A common pattern we have observed is moving data from an operational database to an analytics data warehouse, which normally requires building custom, expensive extract, transform, load (ETL) pipelines. With Aurora zero-ETL integration with Amazon Redshift, you can bring together the transactional data of Aurora with Redshift's analytics capabilities, and you do not have to maintain a complex or expensive data pipeline to move the data from Amazon Aurora to Amazon Redshift. You can also replicate the data from multiple Aurora database clusters into the same or a new Amazon Redshift instance to derive holistic insights across many applications.
Any updates made in Aurora are automatically and continuously propagated to Amazon Redshift, so you always have the most recent information in real time. As you can see in the diagram, we have a couple of data sources, transactional applications and analytic applications. Data comes into Amazon Aurora, and then you set up the zero-ETL integration in the console itself. It's a few clicks.
And then you can move, or I would say copy, the data from Aurora to Redshift. And once the data has landed in Amazon Redshift, your analysts can do analytics or create visualizations using Amazon QuickSight. Amazon Aurora also supports blue/green deployments, which allow you to update the database while minimizing risk and downtime. As you can see in the diagram, we have two environments: one is the blue environment, one is the green environment.
The blue environment is the current production environment, and the green environment is a kind of staging environment that mirrors the production environment. Using this deployment, we create a mirror copy of the current production environment, keep the two environments in sync using logical replication, and then make changes to the database in the green environment without affecting the production environment. And once you're done with the testing and good to go, you can do the switchover: the green environment switches over to become the new production environment. This switchover typically takes less than one minute and does not require any application changes. So blue/green deployments can be used for a variety of tasks, like upgrading the DB engine version, changing database parameters, or making schema changes.
So next is Amazon Bedrock. What is Amazon Bedrock? It is also a fully managed service that offers a choice of high-performing foundation models from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon, and it helps you build and scale generative AI applications in the easiest way. You can access all these FMs using a single API, and you can also customize these foundation models with your organization's data using techniques such as fine-tuning or Retrieval-Augmented Generation (RAG).
Since Amazon Bedrock is serverless, you do not have to manage any infrastructure, so you can securely integrate and deploy generative AI capabilities into your applications. It also provides enterprise-grade security and privacy. How does Amazon Bedrock work? As you can see in the diagram, you can use the Amazon Bedrock playground to experiment with the foundation models and select the one that best suits your requirements, and then you can customize that FM with a technique like Retrieval-Augmented Generation, or use it as-is, straight from the foundation model.
Then you send your prompt to the model using the Bedrock API and receive the model's response in your application. So it's roughly a four-step process for how Bedrock works. Now, Knowledge Bases for Amazon Bedrock is another feature that comes with Amazon Bedrock. Many companies use Retrieval-Augmented Generation to provide their foundation models with the latest proprietary information. Retrieval-Augmented Generation, or RAG, involves retrieving data from a company's internal sources.
Using that data, it enhances the prompt given to the FMs, or foundation models, and this helps you create more relevant and accurate responses. Knowledge Bases for Amazon Bedrock is a fully managed RAG capability. It allows companies to customize the responses of their foundation models by incorporating contextual and relevant data from the company's own data sources.
Knowledge Bases for Amazon Bedrock automates the end-to-end RAG workflow, from ingestion through retrieval and prompt augmentation. It eliminates the need for you to write custom code to integrate the data sources and manage queries. It also supports built-in session context management for multi-turn conversations, that is, conversations that involve more than just a single question and response. And all the information retrieved from the Knowledge Base comes with source citations to improve transparency and minimize hallucination. As you can see in the diagram, the user submits a query, similar information is retrieved from the Knowledge Base, and then the user prompt is augmented with that additional context from the Knowledge Base.
The prompt, alongside the additional context, is then sent to the model. We have the Anthropic Claude models, the Amazon Titan Text models, AI21 Labs, all the foundation models. And once you send all the details to the foundation model, you finally get the response back from it.
That's step number six. So this is the end-to-end flow of how Knowledge Bases for Amazon Bedrock work. But how do we ingest data into the Knowledge Base? Data ingestion plays an important role. This is how you create the end-to-end RAG workflow using Knowledge Bases for Amazon Bedrock. The first step is to select the data sources.
It can be S3, which supports incremental updates. We also have four other data source types in preview. One is a web crawler: if you have a public website and you want to use it as a data source, you can use the web crawler. It also supports Confluence, Salesforce, and SharePoint as data sources. And once you define the data source, the service converts the data into text and splits it into manageable pieces, which we call chunks.
It can be fixed-size chunking or the default chunking. Each chunk is then converted into an embedding using the selected model, like, as you can see, the Amazon Titan or Cohere embedding models. Then you store these embeddings in a vector store. As of now the default is OpenSearch Serverless, but if you have an existing vector store like Pinecone, Redis, MongoDB, or Aurora PostgreSQL, you can also use that as the vector store option. So you pick the model you wish to use to create the embeddings as well as the vector store for those embeddings.
This is an example of how embedding works. As you can see, we have insurance-domain-specific data, and we break this data into elements that hold individual meanings; that process is called tokenization. Those elements can be words, phrases, paragraphs, or entire documents, whatever makes sense for your use case. You then take those elements and pass them to the LLMs to derive numerical vectors. Essentially, these numerical vectors are arrays of numbers for each element, and these vectors keep elements with related meanings close to each other in a multi-dimensional vector space.
And this is called vectorization. You can then store these vectors, or embeddings, in a vector store, which we discussed on the previous slide. The diagram also shows, on the right-hand side, that auto insurance and car insurance are semantically similar, so they are positioned close to each other in the vector space. And when you perform a semantic search, it will include the semantically similar context in the prompt.
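To make that concrete, here's a toy illustration in pgvector syntax (the numbers are made up, and real embeddings have hundreds or thousands of dimensions): semantically similar phrases get nearby vectors, so their cosine distance is small.

```sql
-- Toy 3-dimension vectors standing in for "auto insurance",
-- "car insurance", and an unrelated phrase; the <=> operator
-- returns cosine distance (smaller means more similar)
SELECT '[0.90, 0.10, 0.20]'::vector <=> '[0.85, 0.15, 0.25]'::vector AS auto_vs_car,
       '[0.90, 0.10, 0.20]'::vector <=> '[0.05, 0.95, 0.70]'::vector AS auto_vs_unrelated;
```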
So that's all from my side. I think the next topic will be covered by Jon. Jon, over to you. Jon, are you there? - [Jon] Sorry, I was speaking on mute.
So yeah, I've got the screen taken over now. Can you see the code okay? - [Ayan] Yes, Jon. - [Jon] Okay, great. So, I like to see things in code, and I think it helps make it more concrete exactly how this actually works.
And so I thought I'd start with a simple example of actually using Amazon Bedrock with the Anthropic Claude model. Okay? The way this works is that you first provide a prompt for the large language model. So you say, please provide a summary of the following text, and you're passing in some sort of text value here. To do that, you take that text and build the request body, which has to be in the form of a JSON object. There are some elements you have to provide in that JSON object, like stop sequences, Top-P, Top-K, and so forth.
That prompt could be any kind of text string. It could be that insurance example, you know, detailing coverage or something on insurance. That's what this is used for: the model creates embeddings, or in this case, creates a summary, using the Anthropic Claude model.
And it is saying I basically want a summary of the text that I provided. Now, Amazon Bedrock is serverless, right? So you don't have to create an EC2 instance or virtual machine to install and run the large language model. We do it for you through Amazon Bedrock. So it makes it very easy to use.
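The script on screen isn't reproduced in this transcript, but a minimal sketch of what it might look like, assuming Claude v2's Human/Assistant prompt format and that AWS credentials and a region are already configured, is:

```python
import json
import sys

import boto3

# Bedrock runtime client (serverless; no model hosting on our side)
bedrock = boto3.client("bedrock-runtime")

# Claude v2 expects a Human/Assistant prompt format
prompt = (
    f"\n\nHuman: Please provide a summary of the following text: {sys.argv[1]}"
    "\n\nAssistant:"
)

body = json.dumps({
    "prompt": prompt,
    "max_tokens_to_sample": 300,
    "temperature": 0.5,
    "top_k": 250,
    "top_p": 1,
    "stop_sequences": ["\n\nHuman:"],
})

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2",
    contentType="application/json",
    accept="application/json",
    body=body,
)

# The completion text comes back under the "completion" key
print(json.loads(response["body"].read())["completion"])
```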
There are models not just from Amazon, but also Anthropic, AI21 Labs, Cohere, Meta, Mistral AI, and Stability AI. So it's not just Amazon models that are available, and that's a growing number of models from those companies. So what does that look like? I can say, python3 summary "hello world", passing it to Anthropic Claude, and it should give me a summary of this.
And it's telling me, yeah, this is a common phrase used as an introductory or test message in computer programming, right? And that was the large language model output, from Anthropic Claude in this case. The other common type of large language model output is embeddings, which Ayan talked about earlier and explained exactly what that is. It's basically creating numbers from text input, right? And those numbers represent the meaning of the text input. So, for instance, the embedding for orange will be similar, or close in numbers, to the one for pear, because those are similar things, even though very few letters are shared between those two words. So it's not a text string comparison; it's the meaning of the words.
And these numbers are stored as vectors in vector databases, or vector stores; we just mentioned a whole bunch of common ones. Postgres is one of those: it just stores vectors inside the database. The terminology used for a vector's size is dimensions, right? It's like an X-Y, two-dimensional space, but most of these vectors have hundreds or thousands of dimensions, and they're stored inside Postgres looking like an array. So those would be like elements in an array, but in a vector they're considered dimensions.
So they're, again, used for RAG applications, and databases can store, index, and query these vectors very efficiently. And this demo will show actually querying the vectors and how you can index them to improve performance.
So this is the code that does the vector embeddings. You can see it's very similar to the previous example, but this time I'm using Amazon Titan embeddings, invoking that particular model and passing in a prompt. The prompt I'm going to use is, again, hello world, but it's the embed script this time. The Titan embeddings accept up to five megabytes of input. And the output, this has been snipped, but it's 1,536 dimensions.
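Again as a hedged sketch (the model ID and response key are the Titan text-embeddings defaults; the argument handling is an assumption):

```python
import json
import sys

import boto3

# Bedrock runtime client; assumes AWS credentials and region are configured
bedrock = boto3.client("bedrock-runtime")

# Titan text embeddings take the raw text under the "inputText" key
body = json.dumps({"inputText": sys.argv[1]})

response = bedrock.invoke_model(
    modelId="amazon.titan-embed-text-v1",
    contentType="application/json",
    accept="application/json",
    body=body,
)

# The response JSON carries a 1,536-dimension vector under "embedding"
print(json.loads(response["body"].read())["embedding"])
```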
So it's very rich in the number of dimensions, and so it can be very accurate when comparing that array, or that vector, with another vector from another string that you provide. And we'll show you in a minute how you can actually query the database to do similarity comparisons of different vectors in your data. So, Aurora ML: this is the key, the glue, that matches up Aurora, like Aurora Postgres, with Bedrock.
It makes it simple for you to integrate your SQL code and your SQL data with machine learning, whether that's Amazon Bedrock, SageMaker, or Comprehend. In this demo, we'll talk about just Bedrock. But you can interact with these tools all with familiar SQL. So it gives you great capabilities for interacting with and creating machine learning applications, all with familiar SQL, making it very quick and easy, and all integrated with the security and governance that you would expect from an AWS service. So the process, or actually the demo here: we will invoke Amazon Bedrock models from Postgres, and we will query for the model input as well.
Like, you can use SQL to select particular columns and use that data as the input when invoking the models. And then the output, those vectors, or even summaries, you can store inside Postgres. For vectors, you can use pgvector, an open source extension that's available with Aurora Postgres, to store and index this data inside of Aurora Postgres. Okay? So, we'll move on to the demo real quick. And let's go here. Now this is a recording of a workshop that we have available, and the workshop is freely available.
You can use it in your own AWS account, or you can use it in a lab environment where you work with an account team to get that set up for you. The application uses EC2, Bedrock, Aurora Postgres, and a Streamlit application. And what it's doing is querying a database that has movie data, so that you can get movie suggestions based on your input, using vectors. So like I said, you can use this in your own AWS account. Or if you want a workshop session, we can provision the resources for you, so that way you don't have to use your own AWS account.
The first step here is setting up Amazon Bedrock model access. This is where you go to Amazon Bedrock in the console and request access to particular models. For this workshop, we're using Amazon Titan and also Anthropic Claude.
So you have to go to Model access, Enable specific models, and then you get the Titan embeddings, which is what we create the vectors with. This is all shown, and you can follow along very easily, right? We'll use the embeddings for the similarity comparisons, and then we'll use Claude to do summaries, like movie reviews. Now, this has been edited for time, because for the Amazon models, the request for access is granted immediately, but for Anthropic, the third-party models, it takes a little bit of time.
It may take up to five minutes. It says "in progress" there. But give it a couple of minutes, or up to five minutes, and it will be available. So I snipped that for this demo so we can jump ahead and move on to the next part. The next part is setting up an EC2 instance. The CloudFormation template already created the EC2 instance for us, so it's just a matter of going to EC2 and connecting to it.
And it's pretty simple, because instead of a terminal session, I'm just using the UI provided in the AWS console to connect to it. So it makes it very simple. And from here, we're now in the EC2 instance with a shell. So that's it for the EC2 part and the Amazon Bedrock setup, right? So let's get started with actually creating embeddings. What we'll do first is create the embed.py Python script,
very similar to what we showed earlier, where I create a client using the boto3 package within Python, for the Bedrock runtime. And here what I'm doing is using the Amazon Titan model that I just enabled in my AWS account and asking for an embedding from it. So, the point of this is I can pass in a string, like hello world, to get the actual embeddings, the numerical representation of the words hello world. When I first saw that, it really made a lot of sense to me.
It's like, okay, now I get it. I'm changing these words into these numbers and then doing comparisons. In this next part I'm going to download the database backup file and then restore it into Postgres. It's a small database with movie data, movies that are available from this movie database. So now I'm gonna go into psql and create the pgvector extension, which is just called vector, and also aws_ml. aws_ml is the extension for Aurora ML, and vector, the pgvector extension, is one that's available for all Postgres.
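In psql, that step looks roughly like this (CASCADE pulls in the helper extensions that aws_ml depends on):

```sql
-- Enable pgvector for vector storage and indexing
CREATE EXTENSION IF NOT EXISTS vector;

-- Enable the Aurora ML integration (exposes the aws_bedrock.* functions)
CREATE EXTENSION IF NOT EXISTS aws_ml CASCADE;
```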
So this is some of the data. This is actually ID number 11, which is the Star Wars movie, and it shows the different columns, like an overview, the keywords, genres, and so forth, right? It's all in separate columns. The credits column is actually an array of movie actors. So, with this query, just regular SQL, I'm creating a single column with all that data to make it easier to pass as a prompt to go get embeddings, right? It has the title and all that information about the movie. I'm gonna take that and pass it to my Python script for the embeddings.
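As a rough sketch of that single-column query (the actual workshop column names may differ):

```sql
-- Flatten the descriptive columns into one prompt string per movie
SELECT 'Title: '     || title
    || ' Overview: ' || overview
    || ' Keywords: ' || keywords
    || ' Genres: '   || genres
       AS prompt_text
FROM movies
WHERE id = 11;  -- Star Wars
```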
So, I go back out to my shell and pass in that whole Star Wars, Princess Leia, and so forth string. And those are the numbers associated with creating an embedding using Amazon Titan, right? If you look at those last three numbers, in a minute we'll see those same three numbers again. But this time, we're going to do it within the database. And this is the magic of Aurora ML.
So, I'm using the aws_bedrock.invoke_model_get_embeddings function, passing in a model ID, the content type, JSON, and that same Star Wars input. And then I can create that same embedding that I created earlier from that same SQL against the database, but I'm doing it within the database, rather than having to jump out of the database and run Python or interact with something external. If you notice those last three numbers, I did that just to show you that it's the same value, because I'm passing the same input, whether I do this within SQL or outside in Python directly, since we're using the same Bedrock model, the Amazon Titan model here.
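A sketch of that in-database call (positional arguments annotated with comments; the prompt string is abbreviated here):

```sql
SELECT aws_bedrock.invoke_model_get_embeddings(
    'amazon.titan-embed-text-v1',                        -- model_id
    'application/json',                                  -- content_type
    'embedding',                                         -- json_key to extract
    '{"inputText": "Star Wars ... Princess Leia ..."}'   -- model_input
);
```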
So another thing you can do is update the database and add a vector data type column to the movies table, so each movie gets embedded. I put it in a stored procedure to make it simple, to encapsulate the logic, but it's a pretty straightforward SQL statement. It's that same SQL statement we did earlier, where I'm taking all the data from the columns and putting it into a single column for each movie, and I'm passing that string over to Bedrock in that SELECT aws_bedrock.invoke_model_get_embeddings call to get my embedding. Then I'm doing an update to the database to set the embedding value for each row. And just for commit control, after every 10 records, I throw a commit in there, so that way we can move forward. So we can do this real quick for the Star Wars movie, number 11 again.
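The procedure itself isn't shown in the transcript; a rough sketch under those assumptions (table, column, and procedure names are all illustrative) might be:

```sql
-- Add the vector column (1,536 dimensions to match Titan)
ALTER TABLE movies ADD COLUMN embedding vector(1536);

CREATE OR REPLACE PROCEDURE generate_embeddings(p_id int DEFAULT NULL)
LANGUAGE plpgsql
AS $$
DECLARE
    rec record;
    n   int := 0;
BEGIN
    FOR rec IN
        SELECT id, title || ' ' || overview AS prompt_text
        FROM movies
        WHERE p_id IS NULL OR id = p_id
    LOOP
        UPDATE movies
        SET embedding = aws_bedrock.invoke_model_get_embeddings(
            'amazon.titan-embed-text-v1',    -- model_id
            'application/json',              -- content_type
            'embedding',                     -- json_key
            json_build_object('inputText', rec.prompt_text)::text
        )::vector
        WHERE id = rec.id;

        -- Commit every 10 records, as described above
        n := n + 1;
        IF n % 10 = 0 THEN
            COMMIT;
        END IF;
    END LOOP;
END;
$$;

CALL generate_embeddings(11);  -- just the Star Wars row
CALL generate_embeddings();    -- the whole table
```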
It's calling that stored procedure. And see, it's almost instantaneous, right? It runs over to Bedrock, invokes that model, gets the result, and stores it in the database. So now I'm gonna do the same thing, but for all movies. And because my table's a little bit large, I edited that out for time's sake here. So it did the entire table, with all the records inside of it. So now, moving on: how do I query this? How do I optimize the performance? And what does that look like? What we're gonna do next is turn timing on, so that way we can see the timing of the different queries we execute.
And let's first do this explain plan. This is using the Euclidean distance, which compares those multidimensional vectors, right? I'm going to create the embedding dynamically for the input text lightsabers. So I'm looking for something that's similar to lightsabers using the Euclidean distance, and this is the explain plan for that particular query.
It's doing a full table scan, a sequential scan, on the movies table. And if I then execute the query, which uses that ORDER BY syntax to get the rows most similar to my lightsabers embedding, I can see the top six records here: Lego Star Wars is the number one most similar to the lightsabers query, and Fanboys, it looks like, is the last one.
And it took 302 milliseconds to get that query result. Now I create an index on that particular column for Euclidean distance; that's what vector_l2_ops means. That is an index operator class that's only used with the Euclidean distance operator. If I do that same explain plan again, you see the operator, less-than, dash, greater-than (<->); that's the Euclidean distance operator. You can see now it's using the Euclidean distance index scan, and the query performance is greatly enhanced. Before, it was 300 milliseconds, and now it's significantly faster, with the same results, at 56 milliseconds, because it's using that index.
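A sketch of that pair of statements (index name and index type are assumptions; the workshop may use ivfflat or hnsw):

```sql
-- Index for Euclidean (L2) distance; the <-> operator can only
-- use an index built with vector_l2_ops
CREATE INDEX movie_embedding_idx ON movies
    USING hnsw (embedding vector_l2_ops);

-- Top six movies nearest the "lightsabers" embedding
SELECT id, title
FROM movies
ORDER BY embedding <-> aws_bedrock.invoke_model_get_embeddings(
    'amazon.titan-embed-text-v1',
    'application/json',
    'embedding',
    '{"inputText": "lightsabers"}'
)::vector
LIMIT 6;
```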
Now there are other types of operators; there's cosine distance as well. Cosine distance uses the less-than, equals, greater-than operator (<=>) instead. It's a little bit different. And if you look at this query, it took 292 milliseconds to execute, because it's not picking up the index.
And I can show that by doing the explain plan on that again, and it uses a sequential scan again, right? Because the index that was created was only for the Euclidean distance operator. So now I'm gonna create an index on that same movie embedding column, sorry, using vector_cosine_ops, which is the index operator class used for the cosine operator. If I run the query explain again, you see now it's using an index scan on movie_embedding_idx1, which is for the cosine distance. And if you look at the query performance, you can see it's again much quicker, 63 milliseconds, where it took longer last time.
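That second index, sketched with the name mentioned in the demo:

```sql
-- Cosine distance (<=>) only uses an index built with vector_cosine_ops
CREATE INDEX movie_embedding_idx1 ON movies
    USING hnsw (embedding vector_cosine_ops);
```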
And next is the negative inner product, which is similar to Euclidean and cosine distance. We don't really have time today to dig into why you would pick one over the other. But you can see that the operator is less-than, then a pound sign, or hash, then greater-than (<#>).
And again, if you create an index for that operator, you get better performance. Before, it was 297 milliseconds; the vector_ip_ops operator class is the one used for the negative inner product. And you can execute the query again and see that it's- Sorry, the first explain plan there I messed up.
I meant to cancel. I cancel that query, and then the explain shows it's now using the index scan on idx2. And the query performance, instead of being 300 milliseconds, is now down to 57 milliseconds.
Very, very fast. So, a quick summary of what we just went over: the Euclidean distance index is different from the cosine and negative inner product ones, and they all have different operators.
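As a compact reference (operator classes are pgvector's; the idx2 name follows the demo):

```sql
-- Operator  Metric                    Index operator class
--  <->      Euclidean (L2) distance   vector_l2_ops
--  <=>      cosine distance           vector_cosine_ops
--  <#>      negative inner product    vector_ip_ops
CREATE INDEX movie_embedding_idx2 ON movies
    USING hnsw (embedding vector_ip_ops);  -- the idx2 used above
```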
So, you have to make sure you're using the right operator for the index that you have in your database when you do these kinds of similarity comparisons. The next part that I want to cover is similarity searches. Well actually, this is, I'm sorry, a stored function that you can use to make it easier to do these similarity searches. There's a parameter here for the type of operator, whether Euclidean distance, cosine, or defaulting to negative inner product, and it uses that same Titan embedding model, taking the search query that you pass in to go get the embedding.
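A hedged sketch of such a function (the name, signature, and defaults are assumptions based on the description):

```sql
CREATE OR REPLACE FUNCTION find_similar_movies(
    search_text text,
    metric text DEFAULT 'ip'   -- 'l2', 'cosine', or 'ip' (negative inner product)
)
RETURNS TABLE (movie_id int, title text)
LANGUAGE plpgsql
AS $$
DECLARE
    q vector;
BEGIN
    -- Embed the search text with the same Titan model used for the table
    q := aws_bedrock.invoke_model_get_embeddings(
        'amazon.titan-embed-text-v1',
        'application/json',
        'embedding',
        json_build_object('inputText', search_text)::text
    )::vector;

    -- Pick the distance operator that matches the requested metric
    RETURN QUERY EXECUTE format(
        'SELECT id, title FROM movies ORDER BY embedding %s $1 LIMIT 6',
        CASE metric WHEN 'l2' THEN '<->' WHEN 'cosine' THEN '<=>' ELSE '<#>' END
    ) USING q;
END;
$$;
```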
And this is not required. It just makes it easier to consume, so the code isn't sitting there like your Python code. But you can see it goes and gets the embeddings for me; it just shows the capabilities of Postgres stored procedures and functions. Next, the movie summaries. There are multiple reviews for each movie, and this is for Star Wars.
I believe there are six different reviews in total here for the movie. You oftentimes want to take these kinds of reviews and summarize them, right? Because you may have six rows, like in this example, or you may have thousands of rows. It's like, I just wanna get an overview of my data. You could do that for insurance or, I mean, any kind of use case, for your documentation, and so forth. Give me a summary of all this data that I have available, to make it easier for me to understand.
You see that on https://www.amazon.com with reviews of products listed on the website. But you can see there are six rows there.
All the different movie reviews. So what I'm gonna do is create this simple query that takes those multiple reviews, combines them into one big column, and then passes that to Claude v2 to give me a summary of that text, right? So it took those six reviews, turned them into one row, and got a summary of all the reviews of that one particular movie, right? "The text includes reviews and reflections on the 1977 film Star Wars, a groundbreaking cinematic experience," and so forth. So it makes it very simple for you to create those summaries, all with SQL. Then, again, I can create this as a stored procedure, I'm sorry, a user-defined function, where I'm passing in a movie ID and getting a summary of all the reviews. It's the same SQL as before, but I'm now encapsulating it all in a user-defined function.
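A sketch of that aggregation-plus-summary query (the reviews table, column names, and prompt wording are assumptions; Claude v2 expects the Human/Assistant format):

```sql
SELECT aws_bedrock.invoke_model(
    'anthropic.claude-v2',   -- model_id
    'application/json',      -- content_type
    'application/json',      -- accept_type
    json_build_object(
        'prompt', E'\n\nHuman: Please summarize these movie reviews: '
                  || string_agg(review_text, ' ')
                  || E'\n\nAssistant:',
        'max_tokens_to_sample', 300
    )::text                  -- model_input
)
FROM reviews
WHERE movie_id = 11;  -- Star Wars
```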
So this is that same call for the Star Wars movie, passing it over to Claude in real time to get the summary from Claude v2. And this takes about 10 to 14 seconds to execute. But it's the same result, 'cause it's the same input to the model. So to make this easier to visualize, let's move on to creating a Streamlit application. This Streamlit application lets you type in search criteria for the movie you're looking for, comes back with the most popular result and a summary of it, and then the next five movies that are most similar to it.
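The app's code isn't shown in the transcript; a minimal sketch of its shape, with hypothetical helper functions wrapping the SQL from earlier, might be:

```python
import streamlit as st

# Hypothetical helpers wrapping the SQL shown earlier:
# find_similar_movies calls the similarity function,
# get_movie_summary calls the Claude summary function
from movie_db import find_similar_movies, get_movie_summary

st.title("Movie search")
query = st.text_input("What kind of movie are you looking for?")
metric = st.selectbox("Distance operator", ["l2", "cosine", "ip"])

if query:
    movies = find_similar_movies(query, metric)  # top six matches
    top = movies[0]
    st.header(top.title)
    st.write(get_movie_summary(top.id))          # summary generated live by Claude
    st.subheader("Similar movies")
    for movie in movies[1:]:
        st.write(movie.title)
```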
To get this started, I have to install some Python packages, and since that takes time, I edited that out. Then I can start the actual Streamlit application, which is pretty simple; it's just streamlit run. And there you get the URL. In this case, we're using the external URL because it's on the internet.
Search for- I think in this case, we're looking for lightsaber, same as what we were doing earlier inside the database with SQL code. So you can see there, the Star Wars movie came back rather quickly, and it's still running because it's getting that summary in real time, going over to Claude to give me a summary of this. The operator here is Euclidean distance. And it shows you the top five movies; notice the order of those.
The Empire Strikes Back is number two, and the last one is Fanboys. Now I change the operator to cosine distance. I get the same result; Star Wars is number one again, but at the bottom, the next five movies change order a little bit.
And this is where, as a developer, looking at your application and trying to understand what works best for you, you have these options within pgvector to use different operators to get different similarity searches. And you can see this one, the negative inner product, which is a combination of Euclidean distance and cosine, gets a different result, a different top movie, and it takes a little bit longer because there are more reviews for that movie to summarize. But then it shows you the next five movies, and they're also in a different order, right? So again, you can play with that in your application to decide which operator makes the most sense for what you're trying to do. So it poses some questions. Did you notice that it ran a little bit slower, right? The reason why is that it was getting the movie summary dynamically. And notice the difference between the results with the different operators.
And then, you can improve performance by taking that summary and storing it in the database, right? You could batch that up and, as you get new reviews of movies, update the database with new summaries on a separate, offline thread.
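A sketch of that caching idea (get_movie_summary here stands in for the user-defined summary function described above; names are illustrative):

```sql
-- Precompute and store the Claude summary so the app can read it
-- instead of calling Bedrock on every search
ALTER TABLE movies ADD COLUMN review_summary text;

UPDATE movies m
SET review_summary = get_movie_summary(m.id)
WHERE EXISTS (SELECT 1 FROM reviews r WHERE r.movie_id = m.id);
```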
So that concludes the demo. I hope you enjoyed it. I know it's fast, and we don't have time to cover everything associated with it, but moving on- Ayan, you're taking it from here, right? - [Ayan] Yeah, if you can share the deck. Or do you want me to share the deck? - [Jon] I can share the deck. - [Ayan] Okay. Awesome. - [Jon] That's right.
- [Ayan] So if you would like to learn more, contact your account team or account SA. If you need professional services or partners, you can also reach out to them; AWS also provides professional services to create this end-to-end pipeline. And we have already shared this link in the Chime chat, so if you want to run this workshop on your own, you can use that link. It's a public link, self-paced, kind of an immersion day, as we call it.
Jon, moving to the next slide, please. Yeah, so I would request you to please scan the QR code and provide us with your valuable feedback, because this feedback helps us conduct future events. We really appreciate your participation. And if you have any questions, you can ask them now. You can unmute yourself, and we are happy to answer. Jon, do you want to add anything? - [Jon] If anyone wants to, and we have a little bit of time, I think, if you have any movie queries you want to suggest, we can try them out in real time, because I did provision this workshop here.
So I can go to it, and we can do some queries if you'd like. Or if you don't, that's fine too. - [Ayan] Yeah, if you want to showcase one sample maybe, Jon? - [Jon] Oh, sure. So, let's see, action comedy with car crashes.
Crash is number one. And it's getting a summary right now. And then the other movies: Collide, Cop Car, Baby Driver, Final Destination.
And let's change it to negative inner product. Looks like the same movie; Crash is number one again.
And these movies changed a little bit. Well, I guess no one is asking for a particular query to search or try out on this application. Please, if you could fill out that survey, I'd love it if you could do that.
Ayan and I would both really appreciate that. And Ayan, you shared the link to the workshop itself, correct? - [Ayan] Yeah, that's correct, Jon. Yeah. - [Jon] Okay. Well, if there are no questions, I guess that concludes today's webinar. Thank you so much for attending.
And yeah, thank you very much. - [Ayan] Thank you, everyone.