Real World architecture considerations for Azure: how to succeed and what to avoid - BRK2215
Good morning, everyone, and welcome to our session. The session today will be on real-world architecture. My name is Thiago Barbosa. I'm one of the FastTrack engineers for the Azure team, working for Microsoft. With me today I have Will, from the FastTrack for Azure team as well, and we have the pleasure of having Jamie Yates from Flybe to talk a little bit about his interaction with the FastTrack team, his own experience with our team, and how we helped them design their own architecture. OK, so first of all I'll start by talking a little bit about the session objectives. Then Will is going to walk you through some of the ways that we work with our customers on the FastTrack for Azure team, helping them design their architecture in Azure. Then he is going to cover some of the things that you should be doing when working with Azure: some of the general good practice, and some design patterns that our customers should typically be implementing, on infrastructure, application, and data persistence as well. We will have some prizes at this point, so don't leave the room before that, OK? We have some questions and you can win some prizes there. Then I'm going to cover some of the things you should be avoiding. There are a lot of things that show up at a lot of customers, and we will give you some tips and tricks so you can avoid all of these pitfalls. I'll also walk you through some of the common anti-patterns, and then we will have Jamie on stage to talk a little bit about his experience with the FastTrack team. At the end we'll have some time for Q&A and some time to close the session. OK, so we have our objectives for this session. The first one is basically to show you how FastTrack for Azure engineers work with customers and help customers architect and design their solutions so they can be highly scalable, available,
reliable, and durable solutions, when that is, of course, the case for your application. The second thing that we are going to cover in this session is that we are going to share some proven best practices from our experience working with customers all around the world. We are going to give you some tips and tricks that you can use, and hopefully you can take some of these learnings and implement them yourself. The next thing that we want to show you, and we have Jamie here for that, is how customers are actually leveraging the FastTrack for Azure team and other teams at Microsoft so they can basically get into Azure more easily. There are a lot of things that you need to think about when you migrate to Azure, so we are going to give you a real-world example of how that works. And of course we want to inspire some confidence in your architecture design, so we are going to give you some of the learnings we got from the field, and we hope you can take all of these and implement them in your architectural solutions. The last thing that I need to mention is that this is a level 200 session. We will talk about all of these challenges and give you some tips and tricks, but we will not necessarily talk about code. If you want that, we have a FastTrack for Azure booth downstairs in the Expo area, so you can come there and talk to us, and you can always book a session with us: myself, Will, and probably ten-plus engineers who are there. Follow the blue jackets, right? So, Will, this is your time. Thank you very much. OK, so, a little bit about FastTrack for Azure, our approach to architecture, and how we look to build solutions with customers. But before we do that, let's just take a quick show of hands in the room for some audience participation straight off. So,
who here has actually had an engagement where FastTrack for Azure has helped you do something with your service? One percent, maybe. Who in the room actually knew about FastTrack for Azure and had heard about us before you arrived here? That's not bad, it's 25 percent. So 75 percent of the people in the room just didn't even know what FastTrack for Azure was. So if you haven't put your hand up for the last two... OK, the people who just put their hand up: we're targeting you today. We want all of you down at the FastTrack for Azure bar later on, and hopefully we'll give you enough of a taster of how to work through real-world architecture situations that you can't wait to come down and join us. OK, so what do we do when we work with customers? This isn't exclusive to FastTrack. It applies to a lot of teams within Microsoft, but we like to think we're subtly different in terms of how we approach things.
So firstly, when we kick off, we'll start to look with customers at what their architecture looks like. We'll go through any documentation that you have, we'll talk to customers, and we'll apply a review and guidance framework. We've got a bespoke guidance framework that takes in a number of aspects around different pillars of services and different solutions, and we combine this together. If you take a look at the bottom left (you can download the slides), you'll see it's open source, so it's up on GitHub. You can go and take a look at that and you'll see the kind of things that we talk about, the kind of questions that we'll ask, and the kind of recommendations that we'll make. They might be subtly different from the ones that you're used to, because they'll typically be around informing customers about one of these four or five pillars from the Architecture Center. So, things like design patterns: if you're trying to do something that's already been done before, why not go and pick up an established way of doing it and apply that to the situation that you've got? Anti-patterns: Thiago is going to talk at length about some of the things that we've found in the past which you should not do, things which, if you do them in any cloud environment, regardless of whether it's Azure or not, will probably blow up in your face in some way, and he will tell you why. We'll talk you through a couple of reference architectures. We've got a couple of slides on where you start when you're going to begin an architectural design with Azure. The world of Azure can be quite big and scary. There are a lot of different things and a lot of different services out there, and if you're not sure where to begin it can be quite intimidating. So how do people walk through that road? Where
do you start? With reference architectures. You want to start with best practice, and we'll walk customers through how you can actually use these. And then the pillars of software quality. That's something which was founded by our patterns & practices team, and we'll go through what those different pillars are in a moment. But first, a quick quote. Whenever we're discussing architecture with a customer, you're almost always going to hit a stage where there isn't alignment on what people within the customer, or different teams within the customer, are trying to do. So there are some, not necessarily heated discussions, but honest disagreements. That very often forces people to think about the objectives and what they're trying to do with the solution, so we can architect something which allows them to achieve the objectives that they've got. Frequently we get people who come in to us and they actually don't really know what they're trying to achieve, so they don't really know what they're architecting for. Are they architecting for reliability? Are they architecting for performance? Are they architecting for both? And what levels do you need? What SLA is it? I'll drop into some of those in a moment, but there are three key pieces that we start to talk about. Number one is the purpose: what is the solution going to do? What's its reason for being? We tend to express that in high-level functional requirements. Agile processes are great as well, so you can express that in user stories and in various tasks. They then tend to derive into the success criteria, which is: how does it need to do it? From an architectural perspective, where you're designing solutions that run in Azure, this is the critical piece. If you don't know...
Let's take it to a show of hands: how many people have tried to build something without knowing what its actual requirements are behind the scenes? How many people bear the scars of what happened when you tried to do that? Yeah, a fair few. So that's where we would encourage people to begin, all the time. If you don't know what they are, go back and find them out and challenge people, because you can't build a building without knowing what the building's going to be used for and what it needs to be sustained against. How big does it need to be? How many people need to come through it? Does it need to withstand hurricanes? Does it need to withstand extreme temperatures? Those are the kind of things you think about when you're architecting a building, and software architecture in Azure is no different. Whether it's infrastructure, platform, or software as a service, all of the same rules apply; you just apply slightly different patterns. And then lastly we talk about stakeholders: how many other internal or external customers might be involved? Do we have an SLA commitment to meet with them? All of that derives into the business objectives of the solution, and that's the foundation that underpins what we discuss. Above that we can talk about the pillars of software quality, which we'll give you in a moment, or we can talk about functional aspects of the software, and we tend to address those separately. We can do what we call a deep dive or a solution review, which will focus on just one aspect, so maybe on your implementation of SQL Database, rather than a high-level architecture overview, which is the umbrella of your whole solution. So these are the areas which you would want to consider, and generally they align with patterns & practices. If you go into patterns & practices, you'll see that those left-hand areas are very similar
to what patterns & practices publish as the pillars of software quality. We add a couple of additional ones within FastTrack, so the kind of other general observations. All of our engineers have been building software for quite a long time, working with other customers in other places, and sometimes we see things from prior experience which aren't necessarily documented. Cost design is a critical one. You architect not just to a performance level, a resilience level, a security level, and potentially a deployment level; you also need to architect to hit a cost barrier. Any ISVs in the room? Anybody who sells software or builds services that other people consume? Yeah. So this is super critical for you, because if you get cost design wrong, you're not going to make any money from the service. OK, so, pop quiz. We've got some prizes. The first one is a nice simple one to get you into my English narrative of using a lot of double negatives all the time. Which of these possible approaches does FastTrack for Azure not do when we talk design architecture with a customer? We'll give you the four. Do we not apply the FastTrack guidance framework? Do we not default to a really awesome foo-bar architecture? Do we not use patterns & practices and reference architectures? Or do we not understand what the customer is trying to achieve? OK, so can those people who have their hands up just keep them up for a moment. We'll try and get some T-shirts handed out. Yeah, I think with that many it's B, right? So what do we do, what do we want to understand, and what's the general good architecture practice? This is taking the next step. We've got our objectives, and we understand what the system is trying to do and what the customer is trying to achieve. Now we can actually look to make something tangible. We understand our goals, so let's turn them into something visible.
So where do we begin? My apologies for this. Can you see this at the back OK? Yeah? OK, so you might want to go and grab the download, but I will walk through these. What's actually in the small boxes is not particularly important. If you can read the middle box then that's great. We start at the beginning with the simplest and cheapest option that will get the job done, from a reference perspective. For those of you at the back, what's at the top left is a simple web app. It's basically an Azure App Service app with a SQL Database behind it. At the bottom left, that's a single VM on premium storage to give us an SLA: the simplest possible architecture at the lowest cost. Then, if we've got requirements for the system to be largely scalable, for it to maybe be tiered (maybe it's an N-tier application you're deploying, maybe it requires some infrastructure behind it that's not PaaS-deployable), or maybe you're looking to do a virtual data center implementation, we can start to add additional complexity based on the requirements we've got. That middle stage at the top is a simple reference architecture. Again, it's a standard one that's up on our site, and it adds queuing, Redis caching, and a CDN to the web app, for high levels of scalability and burstability. If we look at the bottom, that's an infrastructure-style environment, and that begins to add in multiple tiers. That's actually a typical design of a SharePoint farm, but it will apply to any N-tier app that hosts database services behind it. Then, if we've got a requirement for a super-HA service, if this needs to be bulletproof, we'll start to add things in like multi-region HA failover using Traffic Manager, for example, and then you'll start to see you've got... it
is quite small, actually, looking at that slide. On the far right-hand side there's a pair of yellow boxes, and that's a SQL Server Always On geo-distributed cluster. That service would survive the loss of a single region and would just keep operating. Particularly poignant given the events that have gone on over the last month or so, so let's get that one out in the room to start with. The one at the top might be South Central, and the one at the bottom will be somewhere else. OK, so, general good practice. We've banged on a lot about determining requirements, but that's absolutely critical. You need to know what your cost envelope is, and you need to know what those non-functional requirements are. If you don't, you're going to architect something which doesn't meet them, and you won't know until something has either failed or you've deployed something which missed the target. Find your target and shoot for it. Secondly, understand the Azure Storage performance envelope. Did anybody see our keynote announcements around storage yesterday? Yes, a couple of people in the room. So, understand what those different tiers can do and what their boundaries are. Effectively, we're at the stage now where you can architect any storage solution in Azure you want. There's the Ultra SSD tier, you know, 160,000 IOPS per volume. That's big disk storage, very, very fast. Equally,
at the other end, 500 IOPS per volume for standard storage. That's very, very cheap, reliable storage, and somewhere in between you've got the balance that your solution is going to need. Understanding that envelope and which tier you want to use is really important. Then, scale out. I can pick a bigger instance if I just want to scale up. If I'm running a VM or an App Service or something like that, and I just want to add an extra box, and I can stand a reboot of my service, that's great: just go and change it, we get a bigger box, it'll reboot and come back on. If I've got a service level that doesn't allow the downtime for a reboot, I need to scale the service out by adding multiple instances. Also, scaling out by adding multiple instances gives you another option in terms of cost: it allows you to scale back in. When the services are quiet, you can shut down the additional instances, and your cost baseline comes down. The size of the smallest instance that you've got is the minimum, and you simply add multiples of that to add additional levels of scale, up to whatever you need or the maximum threshold of the service. There are very few workloads that need anything beyond the size of scale that a standard App Service plan can do, which I think is 80-odd cores across ten instances. If you need more than that, you can have multiple instances of that as well, as a sort of stamp approach. OK, general good practice continues. We see a lot of people build out solutions that can't be load tested or are extremely difficult to load test. Consider how you're going to test, performance test, and throughput test your applications at the very beginning. Normally that means we can expose some kind of API, or maybe we can do something with BrowserStack to
do some kind of web-based interaction testing or automated testing there. But think about it: all applications are different in terms of what they look like and how they can be tested, but it's important that you're thinking about that. There are a couple of other key points we'll touch on, things like dependencies. When you're building out a service, do you understand what its dependency graph looks like? How many people have a service with a fully documented dependency graph, and understand exactly which services call each other and what calls off their service? One. Yeah? What, sorry? Because you've automated the process of pulling it. Yeah, great. That's the other way around, you do it from the bottom up. But only one person, and they automated it. If something goes wrong, you're going to spend a lot of time tracing which component failed and why. If you've got that ready and the service goes down, you can be watching and monitoring those services, and you can maybe swap something else in or take action before your service potentially fails. Then lastly on this slide: architect for business continuity and disaster recovery. How many people have a fully documented disaster recovery plan for their applications? How many people have tested it? Keep your hands in the air if you've tested it. How many people were successful when they tested it? How many people test that regularly and have tested it within the last three months? About three to four percent of the room. And I know that one's a little bit disingenuous, right? Some people test on changes and maybe haven't deployed for three months, et cetera. But you can see a general rule there: 90% of the people in this room don't have a disaster recovery plan that they've tested inside the last three months or so. That's critical, right? That's something I bear the scars of, from an enterprise environment that had a recovery
failure from an ERP system, actually a subsystem. That's absolutely critical. Write it down, document it, automate it if you can, and test it to death. So, building out services. Moving away from looking at the dependency graphs and what comes in, how do we know what we're going to build, assuming that we understand everything now? There are a number of different compute options within Azure. Starting from the left-hand side, we have lots of control when we deploy virtual machines and, potentially, virtual machine scale sets. We're generally building pets with virtual machines, while with scale sets we're spinning up cattle that we can tear down. As
we go across to the right, the compute options become slightly more restrictive. Not completely restrictive, but slightly more restrictive. But they become more scalable or agile, because we're managing them for you. As we go across to the right, through something like Service Fabric, for example, we can do patching and extensions, we can manage the platform, and you just deploy stuff to it. Then right over to the very extreme examples, which would be Functions, Logic Apps, and Service Fabric Mesh. Anybody here interested in Mesh? Yeah. Anybody been to see the product team on the expo floor? They're down there at the moment. Head down, see them, and speak to them if you need to. Mesh is fully managed: just give me something to run and let us run it. Same with Functions: give me some code and let me run it. Logic Apps: give me a piece of designed logic and we'll scale it out and run it for you. Looking towards that right-hand side really allows you to architect solutions much more quickly, whereas looking to the other side (I realize it's actually to my left up here), running virtual machine environments allows you to move legacy stacks. You can move stuff that you can't put into PaaS. OK, so a few other patterns and practices that we mentioned. I'm going to go through a couple of infrastructure ones, a couple of app dev ones, and a couple of data ones, and then I'm going to hand back over to Thiago. The infrastructure ones first. Can anyone recognize all four of those? A couple of people. These aren't the only ones, they're key ones that we use. I think two people put their hands up, so let's go through them quickly. Federated identity: who logs on using either AD FS or Azure AD B2B into any kind of service? Every single person in the room should have their hands up at this point, if you logged into the Ignite registration portal. Yes.
So when you went on and logged on with Facebook, or for us as MSFT employees with Azure AD, you were using Azure AD B2C. Yeah, I intentionally didn't frame the question as "did you like the portal?" So basically that's a pattern which is federated, so you didn't have to go and create your own accounts. The gatekeeper pattern. The gatekeeper is basically where I put in a security instance, an instance of something which is dedicated only to security and nothing else: a reverse proxy layer, something like a WAF tier or an appliance, that basically protects my application from attack. If that externally facing instance gets compromised for some reason, you don't have the keys to my service, because you're not running directly at the web layer or at the API layer. It's a very complicated diagram behind the scenes that you can see there, but actually that's all it really is: you stick something in the way, people attack that, and it doesn't hold any precious data. It's just responsible for forwarding and sanitizing requests. A classic example would be something like Application Gateway with the WAF tier. OK, valet key. Anybody using that for file uploads? One. With valet key, instead of giving you a storage key so you can upload things, I give you a shared access signature, so you can only upload that one blob. Then I don't have to give the entire world my entire storage key. You'll see that implemented in a lot of ways. It's actually a security pattern, to limit the attack surface of giving somebody a key, but it can be used in applications as an offload and performance pattern. If I give you something that means you can upload directly to Azure Storage, and all I have to do is give you a link that you can post to, your application doesn't need to maintain an open connection to our web server, and it will make the application dramatically more scalable when handling long file uploads.
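The valet key idea described above (hand the client a narrowly scoped, short-lived token instead of the account key) can be sketched in a few lines. This is only an illustration of the concept using the Python standard library. It is not the Azure shared access signature format, and `SERVER_KEY`, `issue_upload_token`, and `is_upload_allowed` are hypothetical names made up for this sketch.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical secret held only by the server; it stands in for a storage account key.
SERVER_KEY = b"demo-only-secret"


def issue_upload_token(blob_name: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return a token that permits writing one named blob until it expires."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at) + ttl_seconds
    payload = f"w|{blob_name}|{expires}"
    signature = hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def is_upload_allowed(token: str, blob_name: str, now: Optional[float] = None) -> bool:
    """Check the signature, the blob scope, and the expiry of a token."""
    current = time.time() if now is None else now
    parts = token.split("|")
    if len(parts) != 4:
        return False
    permission, scoped_blob, expires, signature = parts
    payload = f"{permission}|{scoped_blob}|{expires}"
    expected = hmac.new(SERVER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False  # tampered with, or not issued by this server
    # The signature is valid, so the expiry field is the integer the server wrote.
    return permission == "w" and scoped_blob == blob_name and current <= int(expires)
```

The point of the design is that the token is useless for any other blob and stops working after its expiry, so a leaked token exposes far less than a leaked account key would.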
And then, lastly, the static content hosting pattern. Did anybody see the announcements around Microsoft CDN? Yeah. Obviously we've had other CDN platforms for quite some time, so Verizon and Akamai. But effectively, with the static content hosting pattern you can either choose to just pre-render content and put it on storage, so you don't have to spend the CPU cycles rendering it every single time it gets displayed (some people use output caching to do this), or you can simply stick a CDN in front of your endpoint and let that just handle the caching and detail for you. It's not appropriate in all situations, and highly dynamic sites have some intricacies there. But in general, static content hosting is a great way to reduce load on any hosting infrastructure. OK, so, next piece. We've got another prize set. If I want to offload the overhead of uploads and downloads of blob, table, or queue storage directly to Azure Storage, which pattern do I need to understand? Is
it valet key, static content hosting, federated identity, or the gatekeeper? Hands up. Absolutely. You're going to give me a hard time. Yeah, go on, just these two. Again, get your hands in the air, please. Yeah, absolutely. OK, so, application patterns. Some of those had a kind of bent towards applications, but in particular there are some application patterns which we think are really important to know. For the infrastructure guys, this is probably still quite handy to know, because a lot of the applications you will be hosting will be doing this behind the scenes. Queue-based load leveling: use it if I've got workloads that do this in terms of spikes and troughs in load. A classic one for this is ticketing sites. Who's been in the situation where they wanted to get tickets for an event of some sort, and when the tickets get released you go there and, no, the site's dead? Yeah, a couple of people. Because they didn't implement that. If they implement that, the workload will stay roughly constant. You'll just get a message saying you're in a queue, your request is being processed, and it's going to take a little bit of time because we're experiencing high demand. Until eventually the queue fills up and then the site falls over. So when that happens, we implement the second pattern, which is competing consumers. If I've got a really, really long queue and only one instance processing it, the instance becomes the bottleneck. Simply add additional instances of the consumer, and those instances will compete for messages on the queue. One will just grab the next one, another will grab one, the next will grab one, and eventually the queue will come down. It's really powerful in applications that use auto-scaling. I normally just have one consumer. When my queue gets longer, I add another load of instances, process everything on the queue, and then shut down the additional instances so I'm back down to one again. It's a really compelling way to make really highly scalable applications. It's not that difficult to implement.
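As a rough illustration of competing consumers draining a queue, here is a minimal in-process sketch. It assumes Python's standard `queue.Queue` as a stand-in for a real message queue and threads as a stand-in for scaled-out instances. `run_consumers` is a hypothetical helper written for this example, not code from the patterns & practices samples.

```python
import queue
import threading


def run_consumers(messages, consumer_count):
    """Drain a shared queue with several competing consumers.

    Returns a dict mapping each consumer id to the messages it handled.
    """
    work = queue.Queue()
    for message in messages:
        work.put(message)  # producers only enqueue; this is the load-leveling buffer

    handled = {consumer_id: [] for consumer_id in range(consumer_count)}

    def consume(consumer_id):
        while True:
            try:
                message = work.get_nowait()  # each message goes to exactly one consumer
            except queue.Empty:
                return  # queue drained, so this extra instance can shut down
            handled[consumer_id].append(message)  # stand-in for real processing
            work.task_done()

    threads = [
        threading.Thread(target=consume, args=(consumer_id,))
        for consumer_id in range(consumer_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return handled
```

Calling `run_consumers(range(100), 4)` spreads the hundred messages over four consumers, with every message handled once. How the work is split between consumers is not deterministic, which is the expected behaviour when consumers compete.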
All of these are up on our patterns & practices site, and there are also examples in code of how to implement them. OK, and then lastly, one thing that we tend to see people start to lean towards once they've done the top two is command query responsibility segregation. That's about separating reads from writes, so you can direct them to different models. When people design stuff, if we start to look at data patterns, you'll start to see things like contention between read and write models. People usually have very highly optimized databases. Normally they're optimized for reads, because there's usually not that much write throughput. If you've got databases that need to do both high-throughput reads and high-throughput writes, the optimizations that you'd normally make to a database are different for each. So what makes sense is to have two stores, one that you write to and one that you read from, each optimized for those methods. You can even have completely different methods to write to them: one might be using Entity Framework, one might be using some kind of direct query methodology, or maybe one's writing to something like an event store. We'll talk a little bit more about the event sourcing pattern in a moment, but something's writing through to events, and then when you're reading, you're reading from a nice view which is cleanly optimized and has all the aggregations that you need. OK, a couple of these around reliability and performance next. Throttling is a classic one. Has anybody seen the situation where, with Azure Storage, if you put more than 500 IOPS through a single disk, you start to see interesting performance characteristics? Yeah, one or two. Anybody on this side? A couple. So that's because of throttling.
We design to a performance level of X, we state what that performance level is, and if you go beyond that, requests will simply start to return "too many requests" messages back from the storage infrastructure. You need to understand that pattern, to be honest, even if you don't want to implement it. But understanding it means that you can start to do additional things, so you can protect multi-tenant services from abuse by a single tenant, for example. I've got tenants, you've all got separate keys, and I use something like API Management to throttle requests based on the incoming key. If one tenant runs amok, we can throttle their requests so they don't deny the service to the other users. Circuit breaker is the opposite of the retry pattern. The retry pattern is really important, to be able to know that things either work or don't work, and to retry them when you expect something to work again. Circuit breaker is the opposite: use a circuit breaker to say, "I know that's down. There's no point in retrying that operation because it is never going to work. Wait until it's back up and then retry it." It's worth looking at the implementation and the source code up on the patterns & practices site. It sounds complex to start with: should I retry, why shouldn't I retry, should I fail over, should I do something else? There's a really nice implementation to walk through. And then we've got cache-aside. Cache-aside basically deals with how to do write-back caching in an environment which has multiple consumers. I go to fetch my data, and I look in the cache first to see if the data that I need is present. If it's not present, I go and fetch it from the store and put it in the cache. So
I write it in on query. If something then writes to the original store, it notifies the consumers of the cache that the data has changed, and they flush it: they just knock it out of the cache whenever it gets updated. The effect of that is you'll never see stale data, because the next request to read it will read it from the store and put it back in the cache for you until it changes, and then it'll get knocked back out. So you're only ever going to have one read of a piece of data until it changes. It's a really, really elegant approach, used hugely with something like Redis Cache. OK, more prizes. Which popular queuing pattern is dependent on the implementation of another pattern as an enabler? Is it competing consumers? Is it circuit breaker? Queue-based load leveling? Is it throttling? This one's slightly more difficult, actually. You kind of needed to be listening to where we were going. OK, a couple of hands over there. Does someone want to call that out? Keep your hands up. Yeah. So, competing consumers occurs when you're seeing throttling based on the length of the queue and you need to add additional consumers to process that queue. Let's talk data. Again, a quote from Satya: everything's connected to cloud and data, and all of this stuff is going to be run by software. The data platforms are obviously underpinned by software, and what we do with data is software processing. Everything is underpinned by the data. Let's talk some data patterns. Polyglot persistence: use the appropriate type of data store for the job. There are teams within Microsoft who, quite frankly, probably won't particularly like people calling this out, but we do it because it's right for the customer. You don't have to default to a SQL database store for everything just
kind of the de facto choice when you put web applications out there or put API applications out there. If you've got a choice on the store you can use, have a look at the other options that are there. SQL is right in a number of scenarios where you need a relational database, but I've seen people putting data, for example large binary objects, into SQL tables when all they needed was an index lookup against a blob. Well, blob storage is a perfect choice for that. If you need to index across it, that's SQL in terms of the store. If you
compare that in terms of cost with blob storage, SQL's up here and blob storage is just down there. They all have different performance characteristics, they all have different cost characteristics, and different scalability characteristics. The more binary data you can keep out of a relational store, the better. Then there are document-centric stores, semi-structured stores, blob storage, data lake. What am I going to use those for? A document-centric store is something like Cosmos DB with its document API. If I'm storing JSON documents, it probably makes sense to store them in a store that's optimized for storing JSON documents. And then we've got graph stores, so the two icons we've got up there are for the Azure AD Graph API and, again, for Cosmos DB's Gremlin API. If I'm storing information about relationships between people or objects, if I'm storing edges and vertices, it makes sense to use a graph store that's optimized for storing graphs. Or I use more than one, and that's what polyglot persistence is all about: having an adaptive model that can use the appropriate type of store for the job and can choose on demand what I want to use. Okay. So we mentioned CQRS earlier, and bringing data into potentially a different model for reads and a different model for writes. I kind of hinted at events. So I store something, and I raise an event that it changed. I've got a reader that's capturing that data that's being written and writes it away to update a view somewhere that my application can read against. That model is called event sourcing. And it's really powerful in that I can have multiple stores that can be updated at the same time. I can have multiple things subscribing to a change feed. The change
feed in Cosmos DB is great for that, right? It's basically a log replay of everything that's ever been stored in Cosmos, and you can just subscribe to it. Bang, bang, bang: I've got 10 stores that need keeping in sync, so subscribe to the change feed, write them out, job done. Event Hubs is also commonly used for something like that, or Service Bus, to actually push stuff, serialize it, and process it. And then lastly, once you've got event sourcing, what we're really doing here is using a materialized view. Because I'm optimizing for query, I'm going to do pre-aggregations, maybe. Materialized view is a really old pattern, really, really old; it has just come back to light in modern data warehousing and modern cloud analytics. It simply gives me a view which is optimized for the lookups and aggregations of a query, but instead of executing that query at runtime, I have
the result of the aggregation pre-stored. That could be an OLAP cube. It could be anything which is called path analytics, really. But you can use it in an application, or just simply to speed up reporting in an infrastructure-style environment. Okay, so last comment from me, last prize. Which pattern would you use when you want to maintain a very fast store of custom query aggregations when your primary store isn't optimized for that? Is it A, event sourcing? Is it B, materialized view? Is it C, index table? Or D, microservices? Do you want me to take those ones now, because I'm going to hand over? Okay. Yeah, so thank you, Will, for covering all of these best practices. I'm going to walk you through some of the things that we commonly see our customers implementing. There are multiple things that we could put into these slides, but these are the most common patterns that we see customers trying to implement. First of all, talking about scalability. One of the things that we see customers do multiple times, and this is maybe more on the application developer side: people tend to share a lot of objects, and in terms of reusing code that makes sense. There are also some specific objects where, if you don't share them and create a new instance each time, you will basically end up using up TCP connections, you will end up in TCP exhaustion, and in terms of scalability you will be limited. An example of that is HttpClient. You have many, many other options there. So one of the issues we commonly see is this one. Another topic, and one that always comes up when we have our design review sessions with customers, is: okay, I want to reduce costs, so I want to share production environments with quality environments. A perfect example of that is using App Service. You know that we have these different
deployment slots. People see that and say: okay, I have the deployment slots, I can just swap between my production workloads and quality, or whatever you want to call it. So can I use that? And we typically say: well, keep production completely isolated, because in the case of App Service you'll be using the same App Service plan, so it's running on top of the same hardware. If you are running load testing on your quality environment, that's something that you do not want to do, because you'll be impacting production workloads as well. Moving on and talking about performance. One of the things that we see very often is people saying: I have a solution that worked very well on premises because everything was co-located, right? Now my application is kind of slow, so we placed a cache for static content, using a CDN or any other service, and we are also using Redis Cache, for instance, for not-so-static content. One of the things that I recommend here is this: although in Azure you can just click a button and caching is enabled, it's not that easy. You need to have the right policies and the right eviction rules, otherwise you will end up sending stale data to your customers, and this is something that we see very often as well. Okay. So, the cloud latency envelope. Will already touched on this very lightly. When you are deploying your applications on premises, everything once again is co-located, so you don't have to think about a slow connection between your database and your front end or API. But when we are moving to cloud scenarios, and specifically when we have geo-distributed scenarios, you need to think about this. Your application needs to be aware that sometimes you may have a bigger
latency, because one of the regions is having some issue, or you have an issue with your application in a specific region and you are redirected to a different place.
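One common way to make an application aware of this latency envelope is to treat a timeout as a transient failure and retry it with an increasing delay, in the spirit of the retry pattern covered earlier in the session. The following is a minimal Python sketch under stated assumptions, not code shown in the talk: `call_with_retry`, its parameters, and the `flaky_lookup` function are hypothetical names, and the operation is assumed to signal a timeout by raising Python's built-in `TimeoutError`.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on TimeoutError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")


if __name__ == "__main__":
    calls: List[int] = []

    def flaky_lookup() -> str:
        # Hypothetical remote call that times out twice, then succeeds.
        calls.append(1)
        if len(calls) < 3:
            raise TimeoutError("remote region is slow")
        return "ok"

    print(call_with_retry(flaky_lookup), "after", len(calls), "calls")
```

The `sleep` parameter is injectable only so the backoff can be exercised without real waiting. A production version would normally cap the delay, add jitter, and pair the retries with a circuit breaker so that a dependency that is known to be down is not hammered.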
If your application is not aware of these kinds of things, you will end up having some timeouts in your applications, and that's why the patterns that Will mentioned earlier can help you. So prepare your applications for this as well. In terms of resiliency: Will mentioned early on, in the first couple of slides, that customers typically come to us without even knowing what their application is supposed to do. So once again, go back and check that. The other thing is the SLA. If I asked many of you what SLA you want to provide your customers in your applications, you'd probably reply something close to 100%, right? Please. Yeah, ideally you would have 100%. But after a small conversation with the customers, you see that cost is a concern, or is a huge concern, and they don't want to deploy in multiple regions, and they are using services that don't have the level of SLA, or where we don't guarantee the level of SLA, that they are looking for. So this is something that we advise you to look into, because if you want a high SLA and you are going for a single-region deployment, maybe that's something that you need to rethink. Another point here is the lack of a strategy for resiliency within services. We do have a lot of services in Azure where we do all this work for you already: we have redundancy, and if it fails in one region we can take care of that. We have these global services as well, but your applications need to be aware of that. So take that into consideration: use the services in Azure that already provide these features, such as Traffic Manager and geo-redundant storage. All of these already provide some kind of resiliency for you, so leverage
these services in your applications as well. Otherwise you need to implement all of these things yourself.
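To make the earlier SLA point concrete, the arithmetic behind "a high SLA on a single-region deployment" can be sketched in a few lines of Python. This is an illustration under stated assumptions, not figures from the session: the availability numbers below are made up for the example and are not quoted Azure SLAs, the function names are hypothetical, and the multi-region formula assumes regions fail independently and that failover itself is perfect.

```python
from math import prod
from typing import Iterable

HOURS_PER_YEAR = 365 * 24


def serial_availability(slas: Iterable[float]) -> float:
    """Every component must be up, so the availabilities multiply."""
    return prod(slas)


def redundant_availability(sla: float, regions: int) -> float:
    """At least one of N independent copies must be up."""
    return 1 - (1 - sla) ** regions


def downtime_hours_per_year(availability: float) -> float:
    """Expected unavailable hours in a non-leap year."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Illustrative numbers only: a web tier and a database in one region.
    single_region = serial_availability([0.9995, 0.9999])
    two_regions = redundant_availability(single_region, regions=2)
    for label, value in [("single region", single_region), ("two regions", two_regions)]:
        print(f"{label}: {value:.6%}, ~{downtime_hours_per_year(value):.2f} h/year down")
```

The point of the sketch is the shape of the result: chaining services in one region always gives a lower composite figure than the weakest individual service, while adding a second region raises it, at the cost the speakers mention.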
So another thing here is the single points of failure. Even if you have a low SLA, you should not have single points of failure. Let's say you have a virtual machine that runs a batch job that doesn't have an SLA, or has a very easy one: if it doesn't run right now, you can run it an hour later or two hours later, and there's no issue there. But let's say the virtual machine goes down and, specifically on that day, you actually need that batch job to run at that specific hour, and you are not prepared for that. So think about the SLAs that you are going for, and try to avoid single points of failure in your solutions. A very common place that we see this is on virtual machines. I have a quote here; it's quite long, but basically the idea is that you should always design your solutions for failure, and you should have a recovery plan for that. As Will mentioned, I was very glad, and actually a little bit surprised, to see a lot of hands in the air when Will asked whether you had backups and disaster recovery plans and whether you test them. I was very glad to see some hands in the air, because typically what we see in the customers that we work with is that most of them don't have a major incident response plan. If something fails, you don't know exactly how to proceed to restore the services, and you don't know who is accountable for restoring the service. This is something that you should always have in place, and if you don't, it is one of the things that I would recommend working on before moving to Azure or when moving to Azure. Another thing around data is the data reconciliation and consistency strategy. Your
applications and your services need to be aware that, if you have data replication in two different regions, you may have some differences in the data that you have in these multiple replicas. Your application needs to work with this kind of thing and be aware of these kinds of scenarios; otherwise you would be expecting that all the data is there in every data store at the same point in time, and maybe it isn't. Typically, this is a mistake we see as well, and Will already mentioned it: your disaster recovery strategy is untested. You have a major incident response plan, but you have never tested it, so you don't know if it actually works. I had an interesting conversation with one of our customers last week about this. He said: yeah, I have the disaster recovery plan, everything's working. I said: have you tested it? And he said: well, actually, I haven't. So how do you know that it's actually working? Well, I don't. Okay. That's one of the things: if you have a major incident response plan, that's good, but if you've never tested it, it's probably not working. So it's something that you should validate, and validate very often. Moving forward: deployment and DevOps. How many of you are still deploying services manually to Azure? Okay. Okay, so this is one of the things that we always recommend to our customers. We have multiple ways of automating deployments in Azure: you have the Azure CLI, PowerShell, ARM templates; you have multiple tools that you can use to integrate and to deploy your services to Azure. So use what is now rebranded as Azure DevOps, or if you want to use any other third-party tool that you are already using, use it. This is something that we do see a lot in our customers: lack of continuous integration, so
everything is deployed manually. So try to avoid that. And also the lack of telemetry. We see that a lot of customers are not getting any insight on the application itself. They may get some insights on the infrastructure, because Azure provides them or because you are running some monitoring tools, but you don't have any visibility on the application itself. If something is going wrong with your application, and let's say you have a billion users all around the world, and one user, or ten users, or 100 users are having this issue, maybe it's time that you fix it. Otherwise you'll probably get the same issue across all the users you have. Having telemetry will help you fix the issues before they reach every customer you have. Moving forward, we have some of the anti-patterns that Will already mentioned earlier. The first one here is the busy database. Some customers tend to have a lot of logic running on the database, which is not bad for some scenarios, but for other scenarios a lot of time is spent on the database, so the user experience on the front-end side suffers: your website is slow, your application is slow, because there is a lot of work going on in the background on the database. One of the things that you can do, depending on the workload of course, is to move processing from the database server to other application tiers that may be able to scale a lot faster. As I said, there are some scenarios where having this kind of work running on a database makes sense, and others where it doesn't. Then another anti-pattern here is the busy front end. The busy front end is when you have a lot of work running on your front ends, and typically you can offload most of this work to other back-end services, maybe even leveraging queues as well. We
do still see a lot of customers trying to bring all the code to the front-end side, and sometimes it doesn't make sense. You can offload this work to an Azure Function or something like that, or even a back-end service that you build yourself. The third anti-pattern we have here is something that is very common as well, where people tend to get all the data from a specific table, for instance, and in the end they will not use all that data. They go and grab all the columns and rows from that specific table, and this is just a large amount of data that will go into your services, and then, using a LINQ query or something like that, you just throw it away afterwards. So why do that? Please just go and grab the information you need. Filter it down and grab only the things you need; it'll make your services a lot faster. The last one is something that we have been recommending for a lot of years: blocking calls are still an issue in some applications. So use the asynchronous methods in your applications. This will allow some threads in the UI to be released to do other kinds of work, so this will give a better experience for your customers. Okay, let me very quickly show you an example here. You can see this, right? This is just an example of an architectural design we commonly see in our customers, typically when you don't have very high SLAs or when you are not present in multiple regions. Basically we have this web front end running on App Service, we have an API, and we have a database that is replicated to a different region. So what happens if the
connection, or there is something wrong with the database in the primary region, and we are redirected to the second region? In this scenario, I have just two samples here that basically run exactly the same code, but the architecture is slightly different. In this case we fail over in this failover group: the first database will start having some issues, so we will be redirected to the second database. We are using failover groups, and this secondary database is in a different region. Let me just go and start the test here. I'm running the test for 20 seconds, and you can see the amount of requests that we are doing. We have total requests: we have around 70 requests right now, and the average request time is around 300 milliseconds. We are going to different regions, so it's not bad, but it still has an impact on the latency, and if your application is not prepared to handle this kind of latency you will end up having some issues.
Doing this in a different way, and we call it the right way, but this is something that you can go and grab directly from the Architecture Center, so it is in our documentation. I will run the same thing. In this case we have Traffic Manager in front of it, which is basically working at the DNS level, and in this case we have a failover region. So I'm always going to, in this case, West US, if I'm not mistaken. I'm going to that specific region, and if something goes wrong I'm redirected to this second region, but my front end, API, and database will all be there. So depending on where I am, I will have some additional latency, but not the same amount of latency that I have using the first example. Let me run this. You can see that using this scenario I can make a lot more requests, and it's exactly the same code running. And the average request time is a lot different. A slight change in your design may well change a lot. Okay, let me go back to the presentation, and I guess... Thanks, Tre. Good morning, everyone. My name's Jamie Yates and I'm a data architect for UK airline Flybe. This morning I'm going to talk to you a little bit about my experience architecting a data and insight solution on Azure. I'll tell you a little bit about the process that I went through, and we're going to take a quick look at the resulting architecture. So first of all, who are Flybe? Flybe are Europe's largest independent airline, and we fly around eight million passengers each year between eighty-one airports across the UK and Europe. We specialize in regional flying, and many of our customers are business travelers. We've got a number of partnerships with other airlines such as Virgin Atlantic, British Airways, Emirates, and Etihad, where our customers use Flybe services from regional airports
across the UK to fly to the larger airports in the UK and then fly further afield. The challenge set for me was to architect a modern data and insight solution. Like many other organizations, Flybe was becoming a victim of the insight gap. The insight gap describes a situation where organizations have more and more data, but they don't have more and more time to make sense of this data. This was certainly the case for us. So we needed a solution that would cater for large volumes of data, both structured and unstructured, and allow us to take this data and put it into the hands of our business so they could use it to make data-driven decisions. There are three main sources of data in the airline. Number one is our customers. Our customers make reservations, they participate in customer journeys, they create customer contact records and social media interactions. Number two, our flights. During our peak season we operate around 500 flights per week, and this generates a large amount of operational data, including flight plans, journey logs, and aircraft data from flight sensors. Number three, our aircraft maintenance, repair, and overhaul operation is hugely complex and highly regulated by the Civil Aviation Authority, and they've got strict requirements around traceability. So there we need to retain huge amounts of data as well. We've got all of this data, but our legacy data architecture is out of date. It's hosted on premises in an aircraft hangar in the United Kingdom, and it has received very little investment in the past ten years. In addition to this, we've also got an increased need for data governance with the introduction of the new European General Data Protection Regulation, which came into force in May 2018. To allow us to start to address all of these issues, we started off by defining our data strategy. The data strategy was put together to define our organization's comprehensive
vision on how to exploit data, and it laid out the four following goals. Number one: to put in place an effective data governance program. This would ensure that our data was stored securely and handled with great care, in particular our customer data. Number two: we needed to control the flow of data between business applications and data repositories. Now,
largely due to the failings of the legacy architecture, we had a problem with shadow IT, and much of the data flowing around our organization was doing so with very little oversight or governance. The third goal was that we needed to create a centralized repository of trusted business data. Far too much of our data was siloed in source systems which had very poor capabilities around reporting and analytics. And number four: we needed to get those tools into the hands of our business so they could use the data effectively. So we had the data strategy in place, but it was completely technology agnostic, so the next step for us was to start looking at the technologies to fulfill the required capabilities. Our strategy roadmap spans around five years, and we were able to identify technologies in Azure that provided everything we needed, from foundational things like data warehouse and data lake to more advanced capabilities around machine learning and predictive analytics. Azure also provided a great set of tools that we could use as part of our data governance program, in particular data access controlled by Active Directory. It also gave us a complete set of controls we could put in place so that we could limit and restrict data where required, ensuring that our data was stored securely. Our finance guys were really keen to understand the costs around the new technology, so we set about doing a financial comparison, looking at Azure against other cloud vendors, and it compared favorably, proving to be 20% cheaper. It also delivered some significant cost savings when compared to running our legacy on-premises hosted solution, when taking into account total cost of ownership. Finally, we had also already had a little bit of experience in using Azure. We'd previously migrated things like our websites onto the platform, and today Azure hosts our website,
which serves 17 million users and around 176 million page views each year. When I first started talking to Microsoft about building this solution, I was put in touch with Glenn Small, who's an engineer within their FastTrack team. Together, Glenn and I reviewed our data strategy, and we started looking at some of the technology choices we'd made. We
paid a particular amount of attention to the areas where there was considerable overlap between services offered in Azure. A great example here would be our choice of data lake technology. Originally we thought we were going to use Azure Data Lake Store, but having looked at our requirements, we realized that actually Azure Blob Storage was going to be satisfactory for our purposes. Making those choices early on within our project saved us a great deal of time with the implementation. Okay. As part of the FastTrack experience, Glenn ran a number of bespoke sessions with our teams from BI, data, and infrastructure, and this allowed us to look at different technologies, opportunities, and problems. I'd say these sessions were great for addressing people's concerns, remembering that our legacy architecture had been in place for many years, and for ensuring that all of the people within our delivery teams were comfortable with the design and technology choices. The sessions were also run in a very interactive way, and it would normally be one of the guys from our team interacting with the Azure portal or perhaps reviewing demo resources. So just a quick look at our architecture blueprint here. This is probably quite familiar to those of you who work in this area. Up on the left-hand side over there we've got our source systems, our business applications. We also interact with a number of third-party applications as well, and at the bottom we've kind of got the wide world, where we take in data from industry organizations such as SITA and IATA. We're also going to places like social media so that we can start to glean some insights into our customers. In the data management platform area we've got a number of non-Microsoft technologies around data integration, service bus, and file transfer, but we migrated these onto Azure using Azure compute and storage, and
they're hosted in the same data centers as our data repositories, ensuring that we've got low-latency data transfer. At the bottom we're using Azure Service Bus, and this allows us to decouple our integrations. It also allows us to serve up messages to multiple subscribers. We've done quite a lot of work in this area, mainly because we often find that we have different technologies incoming and we wanted to bridge the gap. Now that we've done that work, we're quite ready to receive messages from any sort of messaging technology. Across in the data repository area we've got Azure SQL Data Warehouse, and this really is the engine room of our data architecture. It's massively parallel processing technology. It allows us to ensure that the data is served up for the end users very, very fast.
Below there, in the data lake, we're using Blob Storage, and this provides storage for the raw data but also for curated unstructured data, which we sometimes serve up into our analytics platform. At the bottom there, for data marts, we're using Azure SQL Database. This enables us to put the data into the hands of the business and give them a data model they can control, and this enables them to do their own analysis, often enriching data from the centralized data repository with data they've obtained from other places. Over in the analytics area, the Power BI service is the main front end on data in our business. We received a great response from our business having started to roll out Power BI. The things they particularly like are probably the ease of use, so things like Quick Insights and natural language data querying, where they can just generate a visualization very quickly and pin it onto a dashboard. Below there we've got Azure Analysis Services, and we use this to host our larger and more complex data models. Again, the in-memory compute there allows us to ensure that the data is served very quickly, ensuring a good user experience. And at the bottom we're using Microsoft R Open. I personally think that the R language is a great data-focused language. It allows us to create advanced analytics routines, and thanks to the integration Microsoft has put in place, we were able to actually deploy those into Power BI so they can be consumed by our general users. We also take the output of some of those routines and push that back into the data warehouse so it can be used for future reference. So
at the bottom there, helping with our data governance program, we're using Azure Data Catalog. This enables us to keep track of all of the data stored within our repositories, and it also allows us to line it up with business understanding through the use of the business glossary function they've got. So in conclusion, even though we're quite early into this, I think that this platform is going to give us everything we need to turn Flybe into a truly data-driven business. I'll hand you back to Tre. Okay, so that's a nice summary of the FastTrack program in action with a customer, helping them go through architecture, and what the end result was. Please come down and see us. We're not quite at the end yet; we've still got some additional prizes to give out for Q&A. But come down and see us: we can help you build solutions on the Azure platform faster, and we can get your architecture right with you and set you up for success. I'm going to go to Q&A in just a moment. That's cool, but please hold the question. Definitely, yeah. So we go through the discovery phase, which is largely what we talked about earlier on. We've talked about the solution enablement phase, which is kind of how we do the guidance, but there is also product team skin in the game from Azure engineering to help make your solution successful on the Azure platform. And I think we've got 500-plus customers at the mom