AWS re:Invent 2020: Zulily drives shopping with Amazon DocumentDB and Kinesis Data Analytics

Welcome to DAT215. My name is Joseph Idziorek, and I lead product for Amazon DocumentDB (with MongoDB compatibility). It is my absolute pleasure today to introduce you to Sergey Podlazov, director of engineering for shopping experiences at Zulily. Sergey is going to talk about how Zulily drives discovery-based shopping with Amazon DocumentDB and Amazon Kinesis Data Streams.

I'm excited about this talk for two reasons. First, I love seeing customers like Zulily being successful with Amazon DocumentDB. Second, Zulily does discovery-based shopping at scale. In the world of e-commerce, discovery-based shopping is a complicated challenge that requires both really flexible technologies and databases and really creative thinking from engineers. You will also notice in this talk that Zulily uses multiple purpose-built databases, like Amazon DocumentDB, Amazon ElastiCache for Redis, and a search engine, all to deliver a differentiated e-commerce experience. Each of these databases enables Zulily to deliver a unique set of capabilities, so they don't have to compromise on functionality, performance, or scale by trying to run all those workloads in a single database. So without further ado, let me please welcome Sergey, and thank you for joining us today.

Have you found yourself going to some e-commerce websites more often than others, where you just keep coming back because they give you a warm and fuzzy feeling? I'm Sergey Podlazov, director of engineering in shopping experience at Zulily. In this presentation, I'm going to show you how to build a fun and engaging search experience, and I'm going to let you in on a secret: how you can do it faster than your competitors.

Let's quickly cover the agenda. First, I'm going to say a few words about Zulily and introduce the company to you. Then we'll dive right into defining a great, fun, and engaging search experience. We'll cover alternative solutions, we'll talk about the technical architecture and dive deep into it, and we'll
close off by covering the technical impact and the customer impact of this feature. So without further ado, let's get started.

Zulily was founded in 2010 on a novel value proposition: amazing deals on unique products for moms, kids, and babies. Today we've scaled this model. We launch thousands of products every day, and they only remain on our site and in our apps for up to 72 hours. Our customers can find just about anything on Zulily: clothing, shoes, pet stuff, home decor, toys, you name it. Anything you want, and please note I said want and not need. Zulily is different in the sense that our customers come to us for inspiration. Our customers are moms and caregivers, people who always have a perfect present, people who always decorate their house during the holidays and always have a thing or two to give away during a coat drive. It's for these customers, and to support our business model, that our engineering team built a high-scale e-commerce platform that provides our customers inspiration as they browse.

Now, a great search is an important part of this experience. In fact, it's so important that two-thirds of our customers use this feature. It's the second most popular feature on the site. Just think about that. So the team wanted to define what a great search experience is all about, and we came up with four pillars.

Pillar number one is suggestive. We want to share with our customers the things that other customers are looking for, what's currently trending in our app or on the site, and we want to bring all of that inspiration to other customers. Pillar number two is relevancy. In the context of Zulily, and please remember we have products on the site and in the apps for up to 72 hours, relevancy means that when we suggest something to our customer, we'd better have it in stock; we haven't run out of this product. The third pillar is diversity. Zulily carries national brands, for example Crayola, as well as some well-known boutique brands; Milkbarn would be an example. We want to ensure that when we give
suggestions to our customers, we cover both national and boutique brands. Finally, pillar number four is personalization. In today's e-commerce world, personalization of course means that we want to display content to our customers that's relevant to them as individuals.

So now that we've established what a fun and engaging suggestive search experience is, let's look at what Zulily built. But before we look at the new feature, I want to show you what the old search experience was, and you can see it on your screen right now. As you can tell, there is actually not a whole lot going on on the screen. We have some trending searches, and that's about it. It doesn't provide inspiration, and for a company that provides inspiration, that's not good. So the UX team decided to change all of that, and they came up with an amazing new look and feel for the search landing page, which you can see on your screen now. It's visually rich, it's engaging, it's using graphics. It doesn't just talk about trending searches; it covers search keywords but also brands and categories. Equally important, what you cannot see from these wireframes is something that happens in the background, and that is what I mentioned when we talked about the four pillars of fun and engaging search: the relevancy aspect. In the background, this feature checks for inventory in our fulfillment centers, and it only shows suggestions to customers when we have inventory on hand and can fulfill those orders.

Okay, so now we're getting into the engineering part. Just like on the previous slide, before I dive into the new architecture I want to start by showing you what we used to have in place. If you look at the screen, it's basically customers using our mobile apps or the site, and the apps interact with Elasticsearch, getting the search results as well as trending search keywords from an Elasticsearch cluster via a REST API.

Okay, so now the question
is: how do we build this new suggestive search experience? At Zulily, the engineering team lives by the following principle: design for tomorrow, build for today. So what does it mean? It means that when we design a new feature, we want to ensure that it is extensible and scalable, that we can add on to it in the future, that the feature can take more traffic, and that we don't have to change our entire system to make that happen. At the same time, when we build this feature, we want to ship it as quickly as possible. We're in e-commerce, and this year has been really tough with the pandemic going on, so we wanted to really get this out as quickly as we could so we could serve our customers better.

Following the principle of design for tomorrow, build for today, the Zulily engineering team came up with the following approach. We decided to implement the suggestive search feature as a microservice. It takes the events from our clickstream analytics as input, and the output of the service is trending search keywords, brands, and categories, all validated against the available inventory. And because we built this feature as a microservice, it's well decoupled from the rest of the system. Okay, so this is the high-level architecture. Now let's move on to the next slide.

Before we actually dive deep into the architecture, I want to talk about what the service does at a high level. The first step is that it needs to extract search keywords from the clickstream. Now let's define what clickstream is. When our customers use the apps and the site, they take actions: they view pages, they click on buttons, they scroll. All of that is recorded and funneled into a data stream implemented on top of an AWS Kinesis data stream, and all of those actions live in that stream. However, in our case we're not interested in all of those events; we're only interested in the search events. So the first thing that the K-top service needs to do is filter
out those search events from our clickstream.

Step number two: when a customer enters a search keyword, "apple" as an example, they may be thinking Apple computer, Apple phone, electronics, but they only type "apple." So it's our job to translate that search keyword to related brands and categories so we can make our search suggestions richer. This is the second step in the K-top service, where we do a lookup and bring in related brands and categories for every search keyword that we find in our clickstream.

Step number three: we have to check the available inventory. If you remember, that's one of the four pillars, the relevancy pillar, of our great, fun, and engaging search experience. So we do that, and then finally, step number four: we have to save the results in a fast data store where they can be accessed by our customers in the apps and on the site. So, four steps.

Okay. In every system design you face alternatives. You have these forks in the road where you have to choose between A and B, and on this project the team faced three major choices.

Number one: how do we extract search events from our clickstream data? We considered a couple of alternatives here. First, we could have gotten them from our data warehouse, implemented in Google's BigQuery. The second alternative was to get them directly from the source, from our clickstream data implemented in Kinesis. The advantage of the first approach, going to our data warehouse, was that those search events were already filtered there; we had a table that we could just run a SELECT against to get them. However, that table is separated from the source; there are several hoops that the data needs to go through before it lands there. Additionally, we were concerned about using a data warehouse to power a more dynamic experience. So we looked at Kinesis Data Analytics on a Kinesis data stream as our tool of choice here, because we knew that we could easily
filter our search events from this huge data set: hundreds of millions of user actions, page views, etc. So we decided to go with Kinesis Data Analytics on top of the Kinesis data stream.

Fork in the road number two was whether we go serverless or server-based. In AWS, for serverless we have Lambda, among other things. If we went with a server-based implementation, we could use EKS or ECS; there are a variety of choices. In this case we didn't really see any advantage in using servers. We felt that Lambda was completely adequate to meet our needs and we didn't have to worry about the infrastructure, so we went with Lambda.

Fork in the road number three was whether we go with MongoDB or DocumentDB. At Zulily, the engineering team uses MongoDB for a number of use cases. It's a technology that we're very familiar with, we have a lot of experience with it, and so we felt comfortable making that choice. However, we also looked at DocumentDB, a managed service in AWS. DocumentDB provides basically all of the functionality, at least the functionality that we needed for the feature we were building, and we would not have to worry about the infrastructure. I vividly remember a conversation I had with one of the engineers on the team. I said, "Look, Matt, do you want to be a MongoDB expert or do you want to be a search expert?" He said he wanted to be a search expert, and that settled the debate. We chose DocumentDB as our data store.

Okay, so now we're on the slide that's the meat of this presentation. This is the component diagram, and I'm going to walk you through it step by step so you can learn how we put the service together. We're starting from the clickstream, implemented in an AWS Kinesis data stream. Again, our customers interact with the apps and the site, and all of those actions are recorded and sent to the Kinesis data stream. On top of this Kinesis stream we implemented a Kinesis Data Analytics filter. This is a really cool technology because basically it allows
you to write a SQL statement, or what looks like a SQL statement, against your live data stream. It's incredible. You literally write the SQL statement, a few lines, you specify the destination for your data, and you're done. There is no server to worry about; it just starts working. When you set up Kinesis Data Analytics, you have to specify the destination for your data. In this case it looks at all of the events in the clickstream, keeps only the search events, and sends them to another Kinesis data stream. This is actually a common pattern when you build queue-based systems: you will often be chaining up your data streams (it could be Kinesis, it could be Kafka, it could be anything) and injecting the business logic between those streams. This is exactly what we're doing here: taking all of the clickstream events, applying a filter, and capturing the results in another Kinesis stream.

Okay, so now we've got our search events. What's next? Next we have to look up brands and categories for the search keywords that the customer entered, and we chose to use Lambda to perform this operation. On the diagram it's illustrated as the event transformer. This Lambda function gets triggered when new events appear in the Kinesis data stream. That integration is really simple, and that was another benefit of choosing Lambda here, because to get Lambda triggered by a Kinesis data stream you literally don't need to write any code, unless you do infrastructure as code. So our Lambda function gets triggered when new events appear in the search events data stream, it performs a lookup of related brands and categories, and it saves the results in DocumentDB, in the enriched events data store.

All right, so we've got our top search keywords, top brands, and top categories. How do we get them to our customers? Well, we still have to do the inventory check, remember, and for that we've got another Lambda function. This Lambda function runs on top of DocumentDB,
and if you think about it, how Lambda can get triggered by events or run on a schedule is a really cool feature. It's very useful: we can use serverless compute with different technologies, and they all work together nicely. In this case Lambda runs a query against DocumentDB just like it would run a query against MongoDB. It grabs all the top search keywords, brands, and categories for the last X hours, and then it performs the inventory check. By making a call to an API, it ensures that we have those products in stock, so that when we make a suggestion to a customer we can fulfill our obligation; we can fulfill her order. This is the second Lambda function. After it performs the inventory check, the Lambda function saves the data in a fast data store, in this case ElastiCache for Redis, and this data is available to our customers through the search API.

And this is the architecture end to end. Just to quickly recap: we start with all the events in the Kinesis stream, apply a filter using Kinesis Data Analytics to identify our search queries, look up related brands and categories for those search queries, and save the results in DocumentDB. Then we have another Lambda function running on top of DocumentDB every X minutes that brings up all of those saved top search keywords, brands, and categories, performs an inventory check, and saves the results into ElastiCache.

All right, so let's talk about the advantages of the approach. Why is this cool? We went from the UX design to production in just 10 weeks. Just think about that: 10 weeks. And this was in the middle of the pandemic, when we knew that we wanted to get this out as quickly as possible so we could serve our customers better. So, a very quick implementation. Why was it quick? To a large degree because we used available managed services and did not have to worry about infrastructure. The team was focused on the business logic, not infrastructure setup, and this was key to
how fast we could implement this feature. We also followed the design principle that I mentioned. We designed for tomorrow, so we made smart choices about scalability and extensibility and decoupled the service from the rest of the system, and yet we built for today: we were able to deliver it fast to our customers.

Now of course, if we build something but customers don't appreciate it, the impact is not that great. So how did customers react to the service? Customers actually liked it quite a bit. In fact, 75 percent of our customers use the search suggestions that we built for them. Ironically, it has a side effect: fewer customers perform raw searches, because they feel the inspiration that our suggestive search gives them. And this is again what Zulily is about. We're different; we want to give people inspiration as they interact with our store. Also, as an important metric, we're seeing higher customer retention after we implemented this feature. So all in all, from the customer perspective, I would say it was successful.

None of that would be possible without the team who built this feature, and I want to give them kudos here. Thank you so much, Nate, Suraj, Ram, Divish, and Wendy. You were behind this feature that created this new, fun, and engaging experience for our customers. Thank you very much.

So let's quickly recap what we've learned here today. I shared with you a few facts about Zulily, and then we defined what a fun and engaging search experience is all about. We looked at alternative solutions that the team considered when building this feature. We walked through the technical architecture, and finally we closed off with the technical impact and customer impact.

Before I say goodbye today, I want to share with you a quote from one of the engineers on the team, Wendy. She said, "As Zulily developers, we come up with ideas and try different technologies to get the job done." And the reason she said that
is because it exemplifies the engineering culture and the innovative spirit of the engineering team at Zulily. We face challenges and problems that we need to solve, and we have the autonomy and the power to choose the best technologies for the job, do the design well, and deliver features quickly.

In this presentation I've only scratched the surface of several AWS technologies, specifically DocumentDB. I encourage you to watch some additional sessions to learn more about it and how you can take advantage of it as well. I want to thank you very much for your time today. Thank you for watching. I'm Sergey Podlazov, director of engineering in shopping experience at Zulily. If you have any questions, you can contact me at askpadlazoff zulily.com. Also, we're hiring, so if you want to join a dynamic team that builds great solutions just like the one I showed you, please visit our careers website at zulily.com/careers, and maybe we have an opportunity for you. Again, thank you very much, and please complete the session survey. Goodbye.
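
The four steps walked through in the talk (filter search events, look up related brands and categories, check inventory, save to a fast store) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Zulily's code: the event schema (`type`, `keyword`), the `RELATED` lookup table, the `has_inventory` callback, and the dict standing in for Redis are all hypothetical, and in the architecture described above the filter runs in Kinesis Data Analytics rather than in Lambda. Only the shape of the trigger event (base64-encoded payloads under `Records[].kinesis.data`) follows what Lambda delivers for a Kinesis event source.

```python
import base64
import json
from collections import Counter

# Hypothetical stand-in for the brand/category lookup (step 2).
RELATED = {
    "apple": {"brands": ["Apple"], "categories": ["Electronics"]},
    "crayons": {"brands": ["Crayola"], "categories": ["Toys"]},
}


def decode_kinesis_records(event):
    """Decode the base64-encoded JSON payloads of a Kinesis-triggered Lambda event."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event["Records"]
    ]


def filter_search_events(events):
    """Step 1: keep only search events (assumed schema: type='search' plus a keyword)."""
    return [e for e in events if e.get("type") == "search" and e.get("keyword")]


def enrich(event):
    """Step 2: attach related brands and categories to the normalized keyword."""
    keyword = event["keyword"].strip().lower()
    related = RELATED.get(keyword, {"brands": [], "categories": []})
    return {
        "keyword": keyword,
        "brands": related["brands"],
        "categories": related["categories"],
    }


def top_keywords(enriched, k):
    """Rank keywords by how often they were searched and keep the top k."""
    counts = Counter(e["keyword"] for e in enriched)
    return [keyword for keyword, _ in counts.most_common(k)]


def build_suggestions(event, has_inventory, k, store):
    """Run all four steps; `store` is a dict standing in for the fast data store."""
    enriched = [enrich(e) for e in filter_search_events(decode_kinesis_records(event))]
    # Step 3: drop keywords that the (hypothetical) inventory check cannot confirm.
    in_stock = [kw for kw in top_keywords(enriched, k) if has_inventory(kw)]
    # Step 4: save the validated suggestions where the search API can read them.
    store["trending_keywords"] = in_stock
    return in_stock
```

Because the inventory check runs after the top-k cut, the result can hold fewer than k suggestions; a real service might over-fetch candidates before filtering.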

2021-02-08 18:11
