John Lewis Partnership's Data Warehouse Migration Journey with BigQuery (Cloud Next ‘19 UK)


Hello everybody, and welcome to Next — good afternoon. Just a quick introduction: we're here to talk to you about BigQuery and John Lewis's journey into BigQuery. I'm Christiane Ashby, a customer engineer with Google Cloud, and I'm joined by Tom from John Lewis business intelligence and Cory, who's a Technical Account Manager with Google. What we're going to cover today is some business trends that Google sees generally in the marketplace — trends our customers and other enterprises are seeing — how those trends apply to John Lewis and the Partnership, and their journey modernizing their data into the cloud. Then I'll bring Tom up to share some success stories from their journey so far.

I want to look at some trends in data analytics. Generally, we see customers and enterprises globally struggling to keep up with what we call digital natives. We see increasing demands for new insights and an ever-increasing pace in analytics and in the requirements for analytics. We see huge demand for volume and for capacity — more on volume later — and we see a legacy in the enterprise of a number of different platforms and data silos. And of course, on everybody's lips today is the question of enterprise security, with different regulatory compliance requirements coming in — GDPR and so on — so pretty much every business needs to think about security in new ways on a regular basis. Quick show of hands: who in the audience thinks these ring true for you? Good — most of you, which is what I'd hoped, so we're on the right track at least.

We talked about volume of data. The sheer volume of data handled globally by IT is growing hugely: by 2025, IDC reckons the global datasphere will grow to 175 zettabytes. I had to look up what a zettabyte was — 10^21 bytes, or a trillion gigabytes. That's quite a lot of hard drives.

Moving on from there, think about how digital disrupts retail. Every CEO is trying to change, and trying to change to meet the priorities of the business: they've got to look at shareholder value and profitability; they've got to think about acquiring and retaining valuable customers; and in an increasingly challenging labor market they need to think about how to invest in employees and in obtaining new employees and new expertise for the business. Those priorities are borne out in the data points we observe on the move to a digital-first business, if businesses can achieve some of these impressive statistics.

So how do we actually help businesses go on that journey and start to deliver some of those numbers? Faced with these challenges, every business is trying to implement change, and data warehousing is one of the key parts of any enterprise change. But how do we stop your data warehouse becoming a data skip? Garbage in, garbage out doesn't help anybody, and existing technology and legacy applications can fall down when faced with the sheer volume of data we talked about. As you've probably seen from the rest of this conference, and from your experience with Google in general, our approach with compute and our entire platform is to provide elastic scalability and full management, and that accelerates adoption.
It also means ease of use across the platform is as good as it can be, and our data platform is no exception. BigQuery and all of our other data environments provide fully managed, continuously scalable platforms on which to run your data management, combined with advanced analytics. GCP broadens the reach of that data analytics, and the intention is that you don't need expert data scientists to do the mundane data analytics tasks any more: if we can democratize those tasks to everybody in the business, you can get to insight faster, you can get more value from your data, and you can use your data scientists to generate more value for the business.

So what is BigQuery? You've heard about it a lot already. It's an enterprise data platform based on a columnar database, and it can scale from zero to petabytes as needed. Like everything in Google Cloud it's encrypted by default, and it's resilient to enterprise standards. It's a true no-ops, serverless platform, and it can deal with both real-time streaming and batch data. It has built-in capabilities for machine learning within the confines of a friendly, industry-standard SQL interface, so you don't need to learn R or other complex machine learning languages to get insights out of your data using ML. And, more recently, it has in-memory capabilities to improve your visualization and BI pipelines.
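As an aside on that ML-in-SQL point: the sketch below shows roughly what training and evaluating a model inside BigQuery can look like with BigQuery ML, driven from the Python client library. The dataset, table and column names are hypothetical, and this is a minimal illustration rather than anything John Lewis described using.

```python
# Minimal BigQuery ML sketch: train a logistic regression model with plain SQL,
# then inspect its evaluation metrics. Dataset/table/column names are made up.
from google.cloud import bigquery

client = bigquery.Client()  # uses application default credentials

train_sql = """
CREATE OR REPLACE MODEL `my_dataset.repeat_purchase_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['purchased_again']) AS
SELECT
  customer_tenure_days,
  orders_last_90_days,
  avg_basket_value,
  purchased_again          -- 0/1 label column
FROM `my_dataset.customer_training_data`
"""
client.query(train_sql).result()  # blocks until the model is trained

eval_sql = "SELECT * FROM ML.EVALUATE(MODEL `my_dataset.repeat_purchase_model`)"
for row in client.query(eval_sql).result():
    print(dict(row))  # precision, recall, log_loss, roc_auc, ...
```

The point being made in the talk is simply that this is ordinary SQL — there is no separate ML runtime to stand up.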

BigQuery sits as part of the overall data platform on Google Cloud, providing storage and analytics capabilities. It's part of this whole ecosystem, from capture and processing through to visualization and prediction at the other end. So, enough about Google Cloud — I want to pass you over now to Tom Lomax from John Lewis, who's going to talk to you a bit about the journey John Lewis have been on to make data available across their business.

I hope you can all hear me all right, and I hope you're enjoying the conference so far. My name is Tom Lomax. I head up the data engineering team in the John Lewis Partnership, and I lead the team that's responsible for migrating our data from a lot of our legacy systems into the cloud. I'm going to talk about three things this afternoon: why retail is such a challenging environment right now; the tech issues we're facing, and how Google Cloud Platform and BigQuery are really helping us with those; and then some insights into some exciting projects where we're using GCP and BigQuery.

I'm hoping you are all already customers of Waitrose and the John Lewis Partnership. Can I see a quick show of hands — who is a Waitrose or John Lewis customer in the audience? Brilliant, that's really good. And who has seen the Christmas advert with Excitable Edgar? Brilliant — if you haven't, I'm sure you definitely will, probably many times between now and Christmas Day. For those of you less familiar with our brands: John Lewis is our general merchandise department store, a brand whose proposition is really built around service, and Waitrose is our food grocer, with a high-end food proposition.

Andy McInnis, our Chief Technology Officer, spoke in the keynote this morning, so I hope you've already seen him once today. He talked about the fact that the John Lewis Partnership started as an industrial experiment in democracy, and that we are all Partners in the John Lewis Partnership. That means we all have skin in the game as a business, and we really care about taking an innovative approach to how we do things. For the last 20 years these two brands, Waitrose and John Lewis, have really operated as two divisional propositions, which means the data strategy in the business has been linked to those two brands. You can imagine what that has meant from a data point of view: we've got data in lots of different silos across the business, in these two directorates, or two parts of the organization. Now imagine the guy who has to join all that up. I'm that guy.

Linking back to Christiane's point about digital disruption: it continues to be absolutely massive in retail at the moment, but John Lewis is not behind the times from a digital perspective. The John Lewis website is one of the most popular general merchandise trading websites in the UK, and waitrose.com, our grocery website, has been in month-on-month growth continually for the last few years.
A real example of how the business is listening to our customers, and using the data we gather about them to innovate in what we do, is around new customer propositions. If you visit either our Waitrose shops or our John Lewis shops, you will start to see a real crossover of the brands and of their propositions in the coming months. Yesterday, in Southampton on the south coast of the UK, we launched one of our new concept stores at John Lewis, and there are lots of different propositions on sale there, including lots of Waitrose propositions: the Waitrose Cookery School, for example, is now being promoted within the John Lewis brand, and John Lewis products are being sold within the Waitrose brand. Again, that creates a whole new subtlety around how you share data across the organization.

There's different product information, pricing information — all those things need to be considered. But it all comes back to knowing our customers better.

UK retail — the UK high street — has changed massively in the last 150 years, but some things remain relatively consistent. Take yourself back in time 150 years to do your grocery shopping: you would have made a list, you would have gone to the grocer's — probably a corner shop or a small local store — and handed over your list of the food products you wanted to buy. Somebody would then shop those products for you, and they would probably be given to you in a bag or delivered to you on a bicycle. Skip forward 150 years and it's not that different: you make a list, probably on your computer, you hit submit, and somebody delivers your groceries the next day or the next morning — just probably not on a bicycle, although in some places maybe still on a bicycle.

The first half of 2019, though, saw some huge shifts: 2,868 British stores closed, so we can see a real shift in how the high street is performing. That's driven by price wars and the competition in food grocery, with margins being significantly different; it's also a massive shift towards the online proposition, and things like the high rate of returns in apparel are really changing the balance of how people buy fashion. Some brands have really suffered — and this is why we need to be absolutely on top of our game — Mothercare, Karen Millen, Jack Wills, Bathstore, Patisserie Valerie, Debenhams: all brands that have gone into administration this year. That's why we're doing things very differently.

So how do you set yourself apart in tough times on the high street? By using the data we have to personalize our products and our proposition, as I've just mentioned; predicting what customers want to buy before they even know they want to buy it; curating the range and assortment of products; and then using real-time marketing and data to advertise those products to people at just the right time, when we think they're going to buy. A lot of that comes from joining up data from the offline world and the online world. And what we've really realized is that using the same customer base across our two brands is where we've got huge opportunities to grow.

So, our data challenge, coming back to my earlier point about the two brands we have in the business: we have four major data silos that we are trying to join up using Google Cloud and BigQuery — Waitrose data, which includes waitrose.com; John Lewis data, including johnlewis.com; our customer data, which we've always kept somewhere slightly different; and our financial data. And we have a hybrid collection of on-premise systems and cloud systems, from a variety of data warehousing vendors, that we've put in place over the last decade or so.

Why we thought about going to Google and GCP is really for some of the same reasons I suspect you're thinking about doing the same: how, without a huge investment in refreshing our legacy systems and legacy platform, can you jump to the cloud and see some huge efficiencies? Part of the reason we chose Google was a long-standing relationship with them. The John Lewis Partnership has been one of the biggest G Suite customers for a number of years: we went live with G Suite in 2014, really scaled it, rolled it out to all 80,000 employees of our business in 2017, and since then we've seen real growth into digital and into GCP.

This slide represents, at an extremely high level, what we're calling the Partnership Data Platform. It shows a number of key Google products, and we have a principle that if there is a Google product — a blue hexagon, as I call them — it should be our go-to, the first place we look for a capability. We're looking to ingest the data, store it, conform it, analyze it and then visualize it, ideally using the Google suite of products. What this doesn't show — and this is a journey for us, and we're at the beginning of it — is that there will probably be other products from other vendors augmented onto this diagram at a later point. Now I'd like to hand over to Cory to talk a little more about our journey so far.

Thanks, Tom. My name is Cory, I'm a Technical Account Manager, and I've been working with the John Lewis Partnership for the last two years. I get the fun part here, which is to talk through the 'how'.

That is: how we're helping John Lewis move, as Tom mentioned, from a siloed, fragmented organization into a centralized, democratized, data-driven company. Before we dive into the migration framework and the program we're using, I want to take a step back and cover some of the basics, because these are so important to any successful data warehouse transformation.

First and foremost, use agile delivery principles. These are, by their very definition, long and complicated programs. Warehouses are complicated beasts — at John Lewis we've got half a dozen different data silos we're working with. These are systems that have been built organically over the years and, full disclosure, sometimes we don't even know where some of the data is sourced or where it comes from. So you have to build a team structure that's able to be responsive and react to changes, because changes will happen along the way.

With that in mind, you also have to define KPIs and agree with your stakeholders what the key measurements and milestones are, because you need to keep returning to those throughout the migration; as we go through this framework, they will help inform the business decisions that happen along the way. There are two main areas: business and technical drivers. From a business perspective, think about things like the value generated from your data and cost savings — maybe you're going to be decommissioning some legacy systems and saving on license costs. From a technical perspective, maybe you're looking at query speed, response time, the concurrency of users able to access the data you're trying to exploit, or system availability. Those are all great metrics you can establish up front.

The third principle is setting the governance for wider adoption in your business. As Tom said, we're on a journey here: we've built the platform and we're now onboarding more and more systems and users onto it, and we operate a sort of paved-roads principle, where we have established design patterns and blueprints that developers can come to and use as they come onto the platform. That's not to say we don't balance best practices with developer choice: if there's functionality or tooling that isn't available, we can run a proof of concept or a spike and then maybe incorporate that back into the established design patterns.

Once you have those three fundamentals in place, it's on to picking a good first mover for a proof of concept, and that's where the data warehouse migration framework comes into play. To be honest, this isn't rocket science — it isn't the first time you've seen discover, plan, migrate with a validation loop — but we're going to walk through the steps and how we've applied them to some of the John Lewis migrations.

First, under prepare and discover, you identify use cases. We define a use case as the schema, the data, the pipelines and the visualization tools for an end-to-end business process. As Tom mentioned, we have a number of silos; we look within one of those systems and start identifying those use cases.
At this point you also need to understand the non-functional requirements that go with those use cases. Is there a requirement for data freshness? Is there a requirement for concurrency of users? What are your security requirements? That last one is always important to keep in mind up front. Then, as you identify those use cases, you can start modeling the underlying dependencies and interdependencies. Once you've got the use cases and dependencies modeled, you can also start calculating a rough total cost of ownership — really understanding, back to those KPIs, what the business value is and what technical milestones you're trying to meet.
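As a purely illustrative aside on this discovery step — not something shown in the talk — the kind of use-case inventory being described, with its non-functional requirements and upstream dependencies, can be captured very simply; every name and number below is hypothetical.

```python
# Hypothetical sketch of a discovery-phase use-case catalogue: each use case
# records its sources, non-functional requirements and dependencies so it can
# later be prioritized into a sprint backlog and costed.
from dataclasses import dataclass, field


@dataclass
class UseCase:
    name: str
    source_tables: list[str]
    downstream_tools: list[str]
    freshness_sla_hours: int        # how stale the data may be
    concurrent_users: int           # expected peak query concurrency
    contains_personal_data: bool    # drives the security/GDPR review
    depends_on: list[str] = field(default_factory=list)


catalogue = [
    UseCase("weekly_store_sales_report",
            source_tables=["pos.transactions", "ref.stores"],
            downstream_tools=["Data Studio"],
            freshness_sla_hours=24, concurrent_users=50,
            contains_personal_data=False),
    UseCase("customer_segmentation",
            source_tables=["crm.customers", "pos.transactions"],
            downstream_tools=["notebooks"],
            freshness_sla_hours=168, concurrent_users=5,
            contains_personal_data=True,
            depends_on=["weekly_store_sales_report"]),
]

# e.g. everything that needs the security team involved early:
needs_security_review = [u.name for u in catalogue if u.contains_personal_data]
```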

It's really important — and I'm as guilty of this as anyone else — that we're not migrating for migrating's sake. It's very easy to say: what's the point of this program? Oh, we're going to take this data and put it in the cloud, and then magic is going to happen and fix everything. In reality, we're looking at underlying business cases, and that's really crucial to keep in mind when you're establishing the value of the migration — and you keep returning to it; you'll see that over and over again through these steps.

Then you move into assessing and planning. Having done that discovery work, this is where you prioritize your sprint backlog and break the work down into sprints and iterations — these use cases will probably span multiple sprints. There are a few different ways you can prioritize. With John Lewis we had an early migration opportunity: an on-premise system was due for license renewal, and rather than renew that support contract we opted to use it as the first candidate to migrate to the cloud. Probably one of the key things for a good first mover is taking a workload that is known. Some of these legacy workloads are complicated, they've probably grown organically, and you may not know that much about them; this one had been recently developed, so we were able to move it in its entirety up front. It's also important to pick a less risky one, because you're going to be learning as you go along, and it's very important to leave time to allow for those learnings along the way.

So you've prioritized your backlog; then go back to those KPIs and refine them for success. Understand, within a sprint, what the definition of done is: the checkpoints for testing, for integration, and for everyone's least favorite four-letter word — docs. That's part of the definition of done, so understand what those KPIs and metrics are. At this stage it's also really important to identify whether you have the skills in your team to deliver this change, or whether you need to engage a partner. That partner could be someone else within your organization who has the skill set, or it could be an external partner — and we'll talk about some of the partners we've worked with at the John Lewis Partnership; with the uppercase-P and lowercase-p it all gets very confusing between partners and John Lewis Partners.

Then you get to the actual migration path, and there are two main forks in the road.

The first is migrate-and-offload; the second is a full migration. Migrate-and-offload is where you leave a hybrid solution in place: you still have that legacy platform there, but you're also running additional capacity out of the cloud. This is where establishing those KPIs and understanding why you're migrating really matters. If this is a system that has become capacity-constrained and is really complicated, and you don't quite know what's in that can of worms and don't want to open it yet, you can move some of the use cases into the cloud, get those running, and then address the others. Within John Lewis we call this 'strangling' a system: you take pieces of it and continue to run them in the legacy system, but spin up capacity in the cloud to run other pieces. A full migration is pretty much what it sounds like: a big-bang decommission of a legacy warehouse. Case in point: for the first migration we did onto the Partnership Data Platform we wanted to decommission the on-premise system, so we brought everything across in one go, and we're now running entirely out of the new Partnership Data Platform.

Regardless of whether you're doing migrate-and-offload or a full migration, a similar set of steps is involved. You build out any new blueprints and design patterns; if you need to run a spike or a proof of concept to understand whether this is a viable solution, you build that out. You migrate the schema and the historical data — a step that can take anywhere from minutes to months; if you've got multiple petabytes of data, it may take months to move it across. Once that data is there, you can open up the platform to your data science teams and data analysts and have them start optimizing their queries, particularly if your landing zone is BigQuery. There are a few sessions available over the next two days around tuning and optimizing for BigQuery — I really suggest you take advantage of those — covering features like partitioning, clustering and nested tables. So you've got the data and you've optimized the queries; then you move the downstream applications — the visualization tools — migrate the data pipelines, and move into that verify-and-validate loop.
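As an editorial aside on the "migrate the schema and historical data, then optimize" step above: the sketch below shows roughly what a one-off historical load from Cloud Storage into a date-partitioned, clustered BigQuery table looks like with the Python client. The bucket, table and schema are hypothetical, not anything John Lewis described.

```python
# Hypothetical sketch: batch-load historical CSV extracts from GCS into a
# BigQuery table partitioned by order date and clustered by store — the kind
# of layout that keeps later queries cheaper and faster.
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my_project.sales.orders_history"   # made-up destination table

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    schema=[
        bigquery.SchemaField("order_id", "STRING"),
        bigquery.SchemaField("store_id", "STRING"),
        bigquery.SchemaField("order_date", "DATE"),
        bigquery.SchemaField("basket_value", "NUMERIC"),
    ],
    time_partitioning=bigquery.TimePartitioning(field="order_date"),
    clustering_fields=["store_id"],
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

load_job = client.load_table_from_uri(
    "gs://my-legacy-extracts/orders/*.csv",    # made-up bucket of exported files
    table_id,
    job_config=job_config,
)
load_job.result()  # wait for the load to finish
print(f"Loaded {client.get_table(table_id).num_rows} rows into {table_id}")
```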
Data sets Russ a lot of business a lot of information, on our businesses, transactional, data, and. Really, what we were doing is we were quickly joining, up those data sets applying, analytics. Machine. Learning, and then making it an elastic, model. So. The, first example I'd like to talk about is about joining, up online and, install, data. This. Was, sales, data in Waitrose, so customer, data what people bought in the Waitrose shops, what. They have shopped online in, Waitrose for. Us this is two quite big disparate data sets, it. Was combining, that data feeding. Into analytics and then using it for. Segmentation. And advertising. Reasons, now, spoiler. Alert this, will be talked about in more detail at a session this afternoon in, this room at 4:15, by my colleagues guy and Andy who are thinking in the audience so if you can put your hands up guy is here and Andy. Is over here so I'm not going to talk about that one in in, super detail because, I know that they will talk I will they will feel bad if I steal their thunder and I know they'll probably talk about it in a fraud of richer level of detail to do come along and during that session but, in summary it works, and. What. Took weeks to process data is, now refreshed daily, so it's a big success story so.

The other example I want to talk about is a more complex one: the John Lewis brand propensity model. This is a similarly big, historical set of John Lewis transactional data, and it's used to build complex propensity models across multiple brands and multiple categories, to understand what people have bought and the likelihood that they will buy another brand in the future. In collaboration with Datatonic — one of the partners Cory just mentioned — it was used to develop a machine learning pipeline on Google Cloud Platform capable of generating multiple propensity models for millions of customers in a scalable way. Previously this was a slow and very resource-hungry process; by working with that partner we've identified ways to massively speed it up. It's also scalable across different categories: it was built for one part of the John Lewis business, and it's now easy to reapply it to another part. This slide is a quite simple aggregation of the Google Cloud tools that were used, but the pipeline really has allowed us to generate massive insight and then do direct messaging to customers about the brands we think they are most likely to buy next. As with the first example, what normally took about two weeks to run a model for one brand we can now run in a matter of hours for thousands of brands — a massive leap forward in capability and scalability from where we were previously. And we're now scoping more wholesale data migrations across the whole of the John Lewis Partnership: those four big silos of data I talked about are on that journey, migrating all of that data warehousing into the cloud.

I'm going to touch now on some points you might want to think about if you're planning a GCP migration — some of the learnings we've had over the last year or so and how we've overcome them. We really did work with proofs of concept, and we did that deliberately to demonstrate and deliver value quickly for the business. Whether it's migrating a small set of data to do some analytics, or joining it up with some other data sets, I would highly recommend you do some proofs of concept, think about how you're going to do that first, and really think about the outcomes and benefits you want to achieve.

The next point is around understanding your customers: who needs to consume this data?

Is it data scientists, or analysts in the business? How are you going to surface it to them? If it's out of BigQuery, are you going to make BigQuery directly available to them — something you might not have done previously? Are you going to let people access it through something like Google Data Studio, or through the Sheets connector directly into BigQuery? Really think about how you want them to access the data, and whether you want them doing the work in BigQuery, using the analytics engine in BigQuery, or just being a consumer of that data.

As Cory mentioned, find the right partner to work with you. We've worked with a number of partners; at the moment we're working really well with Appsbroker, who are here today, and we've used them across the board — from giving us insight into the experience they've had in other organizations, through the architectural work, to some of the delivery and analysis work.

Engage your security team early. Engage your security team early — that's a massive learning for us. We have iterated as we've gone along — it's very much an agile delivery process, not a waterfall one — and we didn't necessarily have all of the security artifacts in place from the get-go, so we've had to form a really strong working relationship with our enterprise architecture team and our security team, and they have been absolutely invaluable in keeping up the pace of our delivery.

That leads on to upskilling your team. Although we have worked with partners, we have incredible people in our business — in our cloud team, our delivery team, our enterprise architecture team and our security teams — and bringing them in to collaborate and work together on this project is what has made it really successful.

But these are new technologies, and new technologies for a number of our people, so we are upskilling and training those people as we go along — some of that with our partners, and some of it through specific training. That's invaluable, because once those partners move on we need people in our business who can take these things forward, iterate them to the next level, and get even more value from them in the future. So, thank you for listening — I'm going to invite Cory and Christiane back up on stage and invite some questions.

Thank you, Tom. We've got quite a bit of time — about 15 minutes — for questions, for any of the three of us. Go ahead; somebody in the front here first. We've got two microphones going around the room.

Thank you. I understand you successfully managed to migrate your data warehouse, and you briefly mentioned Data Studio. I wonder, did your analysts manage to switch their workloads, and the visualization side of the job, to cloud-based solutions like Data Studio or Looker? — I can talk about that one. As we move more data into BigQuery, we're really trying to unlock Data Studio for quite simple, quite basic reporting and data visualization. We haven't fully replaced our legacy data warehouses, so we haven't fully replaced our legacy reporting and visualization tools either. Where I think we will go with this is that Data Studio becomes the main place people go for reporting and visualization, recognizing that analysts may need another product or two that go deeper. We looked earlier this year at which products those might be, and we haven't finally selected the one we're going to go with, but one of the ones you mentioned, Looker, is a good example that would certainly be in the running, particularly as it's in the process of being bought by Google — not quite there yet. We recognize that different analytical and data visualization tools suit slightly different use cases. As Data Studio is part of the general Google product set, we think we can get a lot of mileage out of it for lots of people in our business, before a subset of those people need a more powerful analytics or visualization tool. Thank you.

As with any other enterprise, we're in a situation where we have a mix of batch and streaming data sources that we have to handle in this ecosystem, and there's this lingering suspicion that streaming sources are a second-class citizen on BigQuery: you have to pay for streaming inserts, and streaming data isn't available on a more-or-less real-time basis because BigQuery operates on this principle of slots and so on. Could you expand on that a little — is streaming really going to become a first-class citizen, given that more and more of our sources are moving towards streaming? — I'll take that one, and I'll move over here so I'm in the light, sorry. It's a very good question, and I think you've got to think about the ecosystem of data platforms as a whole within Google Cloud.
For some use cases — the more operational ones — you may want to look at more real-time engines, or you may want to look at BigQuery, and there are changes in place, like the BI Engine and the visualization tools, that can accelerate real-time reporting. But realistically you need to find the right storage medium and the right structure for your data set and your workload. As an example, we're doing more and more work to federate the data available in BigQuery out to other storage platforms — you can already query Bigtable from BigQuery, for example — and we're looking at extending that federation into a broader environment where you choose the appropriate storage technology, with BigQuery still the analytics tool on top of that storage. That's the direction of travel: looking at the ecosystem of platform products. If you look at the industry as a whole, we've moved over time from SQL and RDBMSs being the answer to everything to a more polyglot world. That has caused silos and it has caused challenges, but being able to identify how best to use those different capabilities and storage media is part of a data analyst's and a data architect's work now, and the GCP environment is no different in that respect. Does that help answer your question? — Just to jump in as well: there's a lot on the BigQuery roadmap to improve functionality for streaming inserts and that sort of thing, so watch this space — the cloud moves incredibly quickly, and it will get better.
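For readers unfamiliar with the streaming path being discussed, here is a minimal sketch of the (separately billed) streaming-insert API via the Python client, which makes rows queryable within seconds of arrival; the table and payload are hypothetical.

```python
# Hypothetical sketch: stream individual events into BigQuery as they arrive,
# rather than batch-loading them. Streamed rows are billed per volume inserted
# and become available to queries almost immediately.
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my_project.clickstream.page_views"   # made-up destination table

rows = [
    {"event_time": "2019-11-21T12:00:01Z", "customer_id": "c-123", "page": "/basket"},
    {"event_time": "2019-11-21T12:00:03Z", "customer_id": "c-456", "page": "/checkout"},
]

errors = client.insert_rows_json(table_id, rows)  # the tabledata.insertAll streaming API
if errors:
    # Per-row failures are returned here rather than raised as an exception.
    print("Failed inserts:", errors)
```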

Hi, thank you for that. In your agile development, how do you deal with, or incentivize, your product owners to actually handle technical debt and non-functional requirements, rather than just continually adding new features? — Very good question. Tom, are you talking about John Lewis specifically, or BigQuery developers within Google? — I'm talking about doing a data migration to a new platform, not specifically John Lewis, if you've got insights into how you incentivize people to be good citizens and not just build up technical debt and ignore the governance. — I'd say that's a challenge. There's also something for us around getting certainty about which particular version of encryption you're using, or which version of some third-party software solution you're using. Really, for us, it's about having rigorous delivery sprints and not allowing things to creep into those sprints that take us off our path. It's about having really clear goals and a really clear vision, keeping people focused on the task, and if there is a decision to be made about a new feature or a new capability, it comes through the right governance process: we decide whether we add value to this project or this piece of work by adding it in now, and potentially carrying forward a bit of technical debt, or whether it's best to keep things on track, deliver, and then iterate at a later stage.

If I could jump in as well: one of the things we're working on with John Lewis is establishing a culture of SRE. For those who haven't heard the term before, SRE — site reliability engineering — is Google's approach to DevOps. It's about establishing SLOs and SLAs, the service level objectives, understanding which measures are important to your business, tracking those and setting an error budget, and being able to actually work with the product manager and say: no, we're going to hold back this functionality — I know you really want it — because we're going to focus on system stability, or upgrades, or some of that technical debt that's so important to stay on top of.

And if I could add one thing: agile is not a synonym for throwing away the rules. One of the challenges I've seen in adopting agile is that it's sometimes taken to mean you can do anything, anything goes. Actually, I'd argue the opposite is true. In an agile development lifecycle you've got more sprints and a backlog building up, and you need to develop governance that applies, and ground rules that enable your project delivery to continue. So, as Cory said, KPIs, documentation — all of those things don't go away; in fact they become more important, because they're assets that you have to build out and understand, probably across a broader group of your project team.
Across a broader, group. Of your project team so. I think don't throw away the rules think about how that agile delivery will progress and, think, about how to set out those guardrails. To, avoid scope. Creep to avoid building up technical debt before you've even, relinquished. Your world does, that make sense. Okay. Thank you anybody, else, some. Over this side I think yeah. Good. Afternoon I'm. Interested. In hearing how you empowered. People. Who might not be data. Analysts, professionals. In. Using, the data that is available bigquery. Studio. To. Find the insights, that they need for the job without systematically. Relying on the small. Team of data analysts I. Would. Probably summarize, that in demonstrating. The art of the possible so. I think. Over. A number of years people, become, very. Understanding. Of how, long it takes to extract some data from a system, or the, fact the way they're going to receive it is going to be on a shared, network drive or an Access database or, something like that so I think people are quite conditioned, to, things, being quite, difficult taking.

Things take time, require a request to the data team, and that request gets assessed in a backlog, with eventually some reporting or some value coming out of it. People are so conditioned to that that if you can demonstrate value very quickly to those end users, once you've got data into something like BigQuery, you can very quickly gain momentum — it's really powerful. We've had a number of experiences where people have been blown away by the fact that they can now access this data, it's surfaced to them, and it performs in a way they previously didn't think was an option. What they would have done before was go to a legacy database or a legacy reporting tool, export to an Excel or CSV file, and then work on it locally. If we can demonstrate that you can come straight from BigQuery — or consume conformed data from the warehouse — into something like Google Data Studio, that's where you really demonstrate the value. It leans a little into that proof-of-concept model: if you can show people, you gain momentum and then get buy-in to do bigger and bigger proofs of concept and deliver more work. — And I think it's key to get the data analyst team on board early, so they can start identifying those use cases where they can say: go and do it, we'll give you the power to do this — and they can then be very strong advocates for those use cases. There's still a way to go, but that's definitely the direction of travel, and we've seen it work pretty well with a number of customers.

Do you have any tips on controlling the cost of using BigQuery? — Interestingly, I was listening to the previous presentation in this room, which focused on how you manage cost in BigQuery, and I'll let one of my esteemed colleagues cover it in a little more detail, but it's certainly something on our mind — not only the run costs once you've moved the data there and start to query it.

We're certainly very keen that the queries you run are effective, well-written queries, only querying the correct amount of data, with the data loaded and partitioned appropriately — so you're managing it from a run perspective. There are also challenges around predicting what your storage and annual run costs are going to be, so we've done a lot of analysis on how much data we're storing today, how much data we're going to move across, and how much querying of that data we'll be doing once it's in BigQuery, to allow us to generate some as-is and to-be cost views. It's something you really have to do the analysis and the legwork on: how much you query today — and be mindful that in the future, if you're getting the benefit, you're most likely going to be querying more and doing more data analysis. But that's most likely the right thing to do, if it lets you make smarter business decisions.

Cory, you've got some points on how to control it too? — Yes. There are two main cost areas for BigQuery. There's storage, which you pay for as you consume it. Then for querying there's on-demand, where you just pay for whatever query processing you use, and flat-rate reservations, where you get a consistent, predictable monthly fee. It used to be that the buy-in for flat rate was quite high, to be honest — you needed quite a lot of BigQuery usage for it to make sense. We measure capacity in slots, which are roughly a mix of CPU and memory, and the minimum has now dropped to 500 slots. And just launched into beta this week is the Reservations API, which you can use to add additional capacity to your reservation yourself. So the cost and the threshold have come down significantly, it provides predictability, and you can self-manage.

One last point on that, in the context of a migration journey: don't forget where you came from. It's all very well saying BigQuery will enable a whole load of things, but if you've identified KPIs, done some work up front, and worked out what your query rate and total cost of ownership are in your existing data systems, then when somebody comes along a year later and starts beating you up, saying you're spending a lot more on BigQuery, you can say: yes, but I'm still spending less than I was before, and enabling more insight — and that's a good place to be. Whereas if you didn't capture that data up front, you've got nothing to measure against, and you're constantly battling the question of why you're spending more rather than pointing to the value you're enabling. As long as you can demonstrate that, it's one other way to look at the cost question.
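As an editorial aside on keeping on-demand query costs in check — a common pattern rather than something the speakers walked through — you can dry-run a query to see how many bytes it would scan, and cap the bytes a job is allowed to bill. A minimal sketch with the Python client; the table name is hypothetical.

```python
# Hypothetical sketch: estimate a query's scan size with a dry run, then run it
# with a hard cap so a runaway query fails instead of running up the bill.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
SELECT store_id, SUM(basket_value) AS revenue
FROM `my_project.sales.orders_history`          -- made-up table
WHERE order_date BETWEEN '2019-01-01' AND '2019-12-31'
GROUP BY store_id
"""

# 1. Dry run: nothing executes and nothing is billed; BigQuery reports the scan size.
dry_cfg = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
dry_job = client.query(sql, job_config=dry_cfg)
print(f"Would scan ~{dry_job.total_bytes_processed / 1e9:.2f} GB")

# 2. Real run with a cap: the job errors out if it would bill more than 10 GiB.
capped_cfg = bigquery.QueryJobConfig(maximum_bytes_billed=10 * 1024**3)
for row in client.query(sql, job_config=capped_cfg).result():
    print(row.store_id, row.revenue)
```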

I think we've got a couple of minutes left, so probably one or two more questions. One more over here — yep, go for it, thank you.

I have a question for Cory. You mentioned that when you want to migrate your feeds, you pick a use case and make your way based on that. But there are many use cases that are built on top of some dodgy data feeds, and on top of those sit some dodgy scripts creating reports or functionality. So my question is: how do you go about that — do you throw everything away and build from scratch, or...? — I laugh because, as we were going through this, you always find that use case that also goes through a spreadsheet, which then goes through this team, which is then put into a pivot table, and all of a sudden that's part of a critical process for some other team. You're always going to find those, and I would really encourage you to treat this as a chance to refresh and retool. If you don't have the time or the capacity to invest in it, then migrate as-is as much as you can, but look to come back and renew it. I hate to say once-in-a-generation, but you don't get these sorts of refresh opportunities that often, so I'd encourage you to take advantage.

I'm very aware of time, so I'm afraid we have to call it a day here, but we're going to wander over to that side of the room — I think you'll all be leaving that way — so we'll be in that corner and able to take a few more questions in person. Can I also make a plea: please do leave some feedback using the app — you can do that for all the sessions. We do use it, we all see it, and it does feed back into future events. Thank you very much for your time, everybody.
