Tips & Tricks for getting data of any shape and size into Azure Data Lake using - THR3018
All right, morning everybody. Thanks for coming along. I thought I might kick off one minute early, just to give you an extra minute and then burn it by talking about how I'm giving you an extra minute. My name is Damien Brady. Today I am going to be talking to you about DevOps and where to start. The first question I got from somebody was: is this specifically .NET, or specifically Azure DevOps, or anything like that? The answer is no. This is completely agnostic of tech. This is ideas about where you can go next in your DevOps journey, whether you're just starting or whether you're halfway through, doing well, and just looking for the next thing to do. So it's very generic and it's very high-level kind of advice.

As we go through, I'd like you to think about your current situation and think about whether any of these tips and suggestions resonate. These can be suggestions for what to look at next.

DevOps has a lot of definitions. At Microsoft, we define DevOps as the union of people, process, and products to enable continuous delivery of value to your end users. The bit that I want you to focus on here is value. We're not talking about continuous delivery of features or continuous delivery of tasks from your task board. We're talking about continuous delivery of value, and value is the most important thing when it comes to DevOps. If you are just pushing features out the door and then forgetting about them as soon as you've finished putting them in production, then you don't know whether you're delivering value. So having that feedback loop is super, super important, and this really comes down to focusing on the customer, or focusing on the end user: is what you're doing valuable for that customer, for that user who is using the application?
That's where your focus needs to be initially. The other definition that I keep seeing of DevOps is all about how DevOps is not a process, it's not a set of tools, it's nothing like that: DevOps is a culture. And I think that's true. This is a really popular opinion, by the way; that's just a Bing search for that phrase specifically. I agree with it, but I also think it's not terribly useful if your organization is trying to get started with DevOps. If you've ever been somewhere where they tell you your culture has now changed, that's not really how it works, right? What you can do is encourage certain behaviors and certain practices and discourage others, and that can lead to an improvement in culture. You can't tell people to change the way they feel about things.
DevOps is a culture, sure, but that's not necessarily where you can start to actually make a change. The other piece of advice here is: rather than doing everything in one hit, so don't stop work and try to automate the whole thing to get a DevOps process in place, find what hurts the most in your organization and fix that.

And to the stream of people coming in, by the way, there's a book over there as well. I believe Jeffrey has some, and there are plenty at Build as well; the Azure DevOps booth, I believe, has a ton of them too. So if you miss out now, there's plenty more around. But, keep going.

I said that this was going to be general advice, and it is, but I thought a nice way of categorizing the different tips and the different things that I'm going to be talking about is to actually look at the product, Azure DevOps. I mentioned before that DevOps was about people, process, and products. You can't just start using Azure DevOps services and have that fix everything, but the services in Azure DevOps are really good for categorizing the kind of changes that you might want to make in your organization. So this is how I'm going to structure this, with some tips around these things. Some of these you may be implementing really well, some of them you may not be, but this is a nice way of categorizing them.

So let's start with Azure Boards, which is the Azure DevOps service for managing your work: what work you're going to do next, what work's been done, the bugs, the tasks, the epics, all of that kind of stuff. And this is really what I was talking about, where this needs to be based on the customer, right? If your work planning is based purely on the
requirements that somebody who never speaks to the customer thinks you should implement, then you're likely to be implementing the wrong thing, right? So you really need to be focusing on the customer, and to do that effectively you actually need to talk to the customer. This is something that the Azure DevOps team itself puts into practice. There are no layers of support and layers of communication between the developers writing the code and the customers, or the users, using that code. If you have these layers of communication, these are barriers, and it means that by the time the developer actually gets that request, you don't really know what was being asked for anymore. That can be solved with a phone call, with an email, with some direct customer communication. So don't put these artificial barriers in place: that reduces communication, and that means the work you're doing is less likely to be valuable. Direct communication with the customers is really important.

The other thing that's really important is actually working out whether the work you're doing is valuable, pivoting away from it if it's not, and pivoting towards the stuff that is valuable. This is all about experiment-driven requirements. This is a concept that is deeply uncomfortable for some organizations: they start doing some work, evaluate whether it's actually solving the problem it's supposed to solve, and if it's not, they move in a different direction or turn that feature off. This
works quite well with other techniques like feature flags, where you can roll out a version of an implementation or a version of a feature, and if it works, that's great, you keep rolling it out to everybody else, and if not, you roll it back or you turn it off. This allows you to actually test things in production, which sounds like a bad idea, but testing in production is the only way of testing whether something is actually valuable.

In Azure DevOps itself, if you log in and click on your name or your profile photo up the top left, there is Preview features. These are all feature flags, and this is what we're doing: we're experimenting with these new features, and we have metrics around these features to determine whether our guesses about whether this is going to be valuable are actually true or not. So we're not just pushing features out and assuming they're going to work. We're pushing features out a bit at a time and making sure that our assumptions are correct and that we're actually delivering value. So these are just little tips about how you can plan your work a little bit better.

Let's move on to something a little bit more technical, and we'll talk about, not Azure Repos as such, but Azure Repos is where we keep our source control. I didn't put a bullet point for this, but it goes without saying: everything should be in source control. Your code, your infrastructure definitions, the way you build that infrastructure, your build definitions, your release definitions. Thankfully, Azure Pipelines now allows you to put your release definitions in YAML alongside your code as well. So all of this stuff should be in source control.

Starting from that, there are a few techniques that have really been shown to be valuable in a good DevOps organization, and one is trunk-based development. The short version of this is that everything that is eventually going to get to production should
be on your trunk or master branch, or at least one main branch. So rather than having a dev branch for the work that you're doing, then merging that back into a QA branch and rebuilding it and deploying that, and then a master or production branch and rebuilding and deploying that, you have this one trunk. It means that you are pushing code to this main branch constantly. It means that the main branch needs to be healthy, and that there's one artifact, one build, that you push all the way through your environments to get to production. So by the time you get to production, you have tested that build, that code that everybody has been working with, so much that it's very likely that what goes into production is going to work.

There are a few things around this theme, but trunk-based development is a really good way of trying to work. trunkbaseddevelopment.com is also a fantastic place to go to have a look at all of the arguments for and against, and why it actually results in good outcomes. There is proper research behind this as well; it's not just people's opinions.

To do this effectively in a larger team, it doesn't mean don't branch at all, and it doesn't mean everybody pushes directly into master. You can have feature branches and branches for different parts of code, but try to do them as short-lived feature branches. Anything longer than a few days in a feature branch starts to drift away from what master looks like. So you have five teams working on different sets of code, and then when you merge you get this merge hell and it's really, really difficult to put things together, right? You want to merge as early as possible, so short-lived feature branches that come in.

One of the other tips I'll give you as well is this whole idea in DevOps of shifting left. So if you think about your pipeline
as an idea, through to writing the code, building, running tests, deploying it, and monitoring what's in production, there's kind of a left to right. It's much, much cheaper to fix an issue that you find as far left in that pipeline as possible. If you find a bug on your development machine, that is way cheaper to fix than finding a bug in production, right?

Shifting left can work this way as well: pull request checks and pull request policies, or branch policies, mean that you can run a build before the code from your team actually even gets into master. You can make sure that the code that people are submitting is not going to break your build. You can run whatever stuff you want in that pipeline as early as possible, and that will alert you to problems much, much sooner, which makes them cheaper to fix. So shifting left, and moving as much stuff as early in the pipeline as you can, is really handy. Pull request validation is really important.

Then there are some of the technical things you need to do. You may need to start refactoring some of your code to enable this stuff. If you try to get everybody in your organization to work on your master branch right now, that's likely to be very difficult, especially doing short-lived feature branches. Everybody knows maybe there's a week's worth of work in something, or six weeks' worth of work in something. You don't want to break the master branch while that work is happening, but you also don't want to keep a branch separate from master while that work is happening. So that means refactoring such that you can keep writing code against that master branch without breaking things: abstracting things away and writing a new implementation rather than writing over the top of existing code, and having feature flags that point the code in one direction, so you deploy to production pointing
it at the old code until the new code is ready, and then you try and flick it over and see if that works as well. That can require a few changes in the architecture of your application too.

All right, let's move on from the code and talk about continuous integration and continuous delivery. Azure Pipelines is the tool that you've probably heard about a few times at Build so far. This is all about building your application and then deploying it in an automated way as much as possible. The key here, and this is something that I really try to get everybody to do, is to deploy as frequently as possible with as small changes as possible. The reason is, if you have six months' worth of work that you're deploying all in one hit and something goes wrong, what went wrong? And even if you do identify what went wrong, if you go to a developer and say, "Remember that code that you wrote two months ago? There's something wrong with it": I don't remember what code I wrote two months ago. I don't remember what code I wrote this morning, right?

So if you can release these things as soon as possible, there are two advantages. One, there's less risk because there's less change; if something goes wrong, it's easier to identify what it was and easier to identify what fixes it too. So smaller changes means smaller risk. And faster delivery of these changes means that if something does go wrong, you are already on a cadence where you can fix that and push it out really quickly. So rolling forward is much, much easier: fixing the bug and rolling forward.

This means having test environments in your pipeline as well. If everything is on your master branch and everything gets deployed really, really quickly, you want your pre-production environments to be relatively deterministic, relatively accurate, in terms of: is this going to work in production as well?
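As a rough sketch of the feature-flag approach mentioned above, where production keeps pointing at the old code until the new code is ready, something like the following could work. The names here (`FLAGS`, `new_pricing`, `price_old`, `price_new`) are made up for illustration and are not from the talk or from any Azure DevOps API; a real system would most likely read the flag from external configuration rather than a dictionary in code.

```python
from typing import Callable, Dict

# Hypothetical flag store (an assumption for this sketch). In practice the
# value would come from configuration, not a hard-coded dictionary.
FLAGS: Dict[str, bool] = {"new_pricing": False}


def price_old(cents: int) -> int:
    """Existing implementation that production currently relies on."""
    return cents + cents // 10


def price_new(cents: int) -> int:
    """New implementation, written alongside the old one rather than over it."""
    return cents + cents // 10 + 50


def price(cents: int, flags: Dict[str, bool] = FLAGS) -> int:
    """Route to the old or new code path depending on the flag."""
    implementation: Callable[[int], int] = (
        price_new if flags.get("new_pricing", False) else price_old
    )
    return implementation(cents)
```

With the flag off, `price(1000)` goes through `price_old`; passing `{"new_pricing": True}` flicks the same call over to `price_new` without a separate long-lived branch.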
Having test environments that look nothing like production is not very valuable. Having test environments means that you can catch things that go wrong before you hit production. So a failure in staging is not a failure, it's a success, because it didn't break in production, it broke in staging. So having these environments is really important.

And some kind of automation as well. This is a tricky one: people try to automate absolutely everything, and I think that should be your aim, but don't prevent any movement forward until everything is automated. Find something that is difficult right now and automate that, then find the next most difficult thing and automate that, and eventually you'll get to a point where everything is automated. You might still want somebody to click the button to deploy to production, but the deployment to production will actually succeed because everything is automated.

Finally, the other tip I have around here is the idea of idempotent deployments.
This, at its basic level, pretty much means that if something goes wrong halfway through a pipeline or halfway through a release, you can just hit deploy again and it'll try it again. It doesn't matter if something fails halfway through; it just means that you can start again and do it in one hit. I've been in plenty of organizations where a deployment fails halfway through, maybe there are some problems with disk space, maybe a folder was locked or something like that, and you have to kind of work out where you are in that deployment because you're just applying a patch. What SQL scripts have been run? What files need to be copied now? Is something corrupt halfway through? It's much, much easier if you can get your builds and deployments so that they're idempotent, which means if something goes wrong you can find out what the issue was, fix it, and just hit go again. So that's a worthwhile aim as well.

All right, the final part of this, sorry, not the final part, the second-final part: Azure Artifacts is the service in Azure DevOps which is all about packaging your applications and storing versions of packages. You might have heard some announcements around that: we've updated the pricing, so it's consumption-based pricing now rather than per user, which is great, and there's a free tier as well, so you can use Azure Artifacts for free.

Just out of a bit of interest as well, Azure DevOps currently stores 155 petabytes of artifacts in Azure Artifacts. So this thing scales pretty well. To put that in perspective, we were talking about it in the team and they did the math: if you think about that in two-terabyte hard drives, you need 76,000 of those. Or if you go over to the other booth there's an Azure Box, I think it's called, where you can bulk upload stuff. Those things are a petabyte, so you need 155 of these. It's like three quarters of a 40-foot shipping container for
the stuff that's in there. I found that interesting. It's not at all relevant to what I'm about to talk about, but I found that really good.

The idea with the artifacts, the stuff that you build, the actual versions of your application, is that they really need to be immutable. I've worked in plenty of organizations and spoken to plenty of teams where they will build version 1.1 of their application, and then they'll find a bug, so they'll fix that bug and then rebuild version 1.1 of their application. And now you have a problem: you have two artifacts that have the same version. Which
one's which? That 1.1 version should always be a unique, immutable version of your application. If you need to rebuild, re-version it as 1.1.1. That's really important for knowing what has been deployed where, for that whole provenance story of what work I was doing, what code was changed, what build was produced, and what release went into what environment. You can follow that all the way through. If you can change those builds after the fact, then you just don't know. So that's really, really important.

That also means that you need to externalize your configuration. When you deploy this same artifact to your test environment, your QA environment, your staging environment, and your production environment, you need to be able to make changes. Obviously the database is not going to be the same in your test environment as it is in your production environment, but if that connection string is hard-coded in your code, it means that you need to rebuild the application, and so suddenly you're not testing the same thing all the way through your environments. So any changes to the way the application works need to be externalized from that build so that you can maintain this immutable artifact. Whether that's configuration files, app settings, environment variables, or even stuff in the database, it doesn't really matter, as long as it doesn't affect that immutable build. So that's important as well.

This really leads to deployment consistency, and that's what we want. When we deploy this artifact, that binary, this set of binaries that you have built, to an environment, and then
you promote it to the next environment, you don't want to rebuild, end up with different code, and be deploying it to a different environment. You want to be able to test the same thing through all your environments, so by the time you get to production you know it's going to work, because you've deployed it to the same kind of environment three or four times before, so that there are no surprises.

All right, the final part of Azure DevOps is Azure Test Plans, and this one is really all about testing your application and making sure that what you actually produce and what you put into production is as safe and as good as possible. In terms of testing, one thing that we found really valuable is automating as much testing as you can. Again, this pushes against a lot of organizations' cultures, where they have a QA department and they have it in their heads that they absolutely need a human to press that button, or a human to copy those files over. It's fine if you deploy automatically to staging, that's all right, but when we get to production we drop all of the automation and we do it manually. Now, this is definitely an anti-pattern. Doing it like that, you have proven that your deployment works all the way through to this pre-production environment, and then you scrap all that work and you introduce human error to do that final stage, the most important stage.

The other thing that frequently happens is they require somebody to actually do that work at the end, or hit the button to say, "Yes, we have gone through and we've tested all these things and we're going to deploy to production." And you'll find that 99.9
percent of the time it just gets deployed to production. Nobody's really adding any kind of value in that final step; they're just hitting the button to go to production. So why not automate that?

The Azure DevOps team itself, over the last 10 to 12 years, has moved a lot of their manual testing, in fact almost all of their manual testing, over to unit tests. It took a long time, but it means that they're following a lot of the stuff that I've mentioned here. All of their tests are automated, and they will pass all the time. They run 85,000 unit tests for every single pull request in their short-lived feature branches, so they know that these things are going to work before it actually goes to master. They're running 85,000 unit tests, it's all automated, and they don't need to worry about human error in this case.

The other thing that's important is that all of the tests pass all the time; otherwise it doesn't get to production. If you have a test that fails some of the time, that is worse than a useless test, it's worse than having no test at all, right? Because if you look at it and say, "Oh well, some tests fail, but that's fine," then what good are they doing for you? They're not testing anything, right? They're just giving you a false sense of security, with code coverage or something like that. So all the tests must pass all the time, otherwise it doesn't get through to production.

And then finally, one thing I just want to point out as well is that if a bug makes it to production, that's not just a failure in the code, that's not just a bug you've introduced in that code: that is a failure in your whole pipeline. What that means is that there should be something in your pipeline, whether that's a unit test or an additional step in your build or deployment pipeline, to prevent that from happening again.
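One possible shape for that "something in your pipeline" is a regression test that reproduces the production bug and then runs on every build. The sketch below is an assumption for illustration only: `parse_quantity` and the bug it describes are invented, not taken from the talk.

```python
import unittest


def parse_quantity(text: str) -> int:
    """Parse an order quantity typed by a user.

    Hypothetical history: an earlier version called int(text) directly and
    raised ValueError in production on input such as "1,000".
    """
    return int(text.replace(",", ""))


class ParseQuantityRegressionTest(unittest.TestCase):
    def test_thousands_separator_is_accepted(self) -> None:
        # Reproduces the production failure: this failed before the fix
        # and passes after it, so the same bug cannot slip through again.
        self.assertEqual(parse_quantity("1,000"), 1000)

    def test_plain_number_still_works(self) -> None:
        self.assertEqual(parse_quantity("42"), 42)
```

Running this with `python -m unittest` as a required step in the build means the pipeline, not a person, is what stops the bug from reaching production a second time.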
We've all heard, in terms of testing, or many people have heard, of red, green, refactor. Rather than just fixing a bug, you write a test to prove that the bug exists, then you fix the bug, and then you run the test again and make sure that it now goes green. So now you have a test over that stuff, and you know it's not going to go wrong next time. You can do the same thing with your pipeline. If you've deployed something to production and it's broken, fix your pipeline so that it won't happen again, and now you have coverage over that thing and you know it's not going to happen next time. So anything that gets into production that's bad is a failure in your pipeline, not just in the code.

All right, so that's just kind of a whirlwind of stuff that I've dropped on you there. Hopefully some of that stuff resonated, and I want you to really think about this definition that we talked about. It's not just the products; that's just how I categorized this advice. It's the people buying into this idea of continuously delivering value, and it's all about value. So focus number one on the customer: it's a customer-centric thing, or a user-centric thing, to have real success in DevOps.

A few resources as well. On Docs there is DevOps at Microsoft, which is an awesome resource, and it covers a lot of how we do it at Microsoft and the lessons that we've learned over the last 10 or 12 years. We deploy 80-something thousand times to Azure every single day using Azure DevOps, so that's really useful as well. And then I also have a Channel 9 show, if you're interested in me talking to smarter people about solving problems, called The DevOps Lab. Finally,
in the other section, just past GitHub, near the big glowing sphere thing, there are a couple of Azure DevOps booths, and they are staffed with people from the product teams. So if you want to know about Azure DevOps the product, they are the people to ask. They're also people who I thank quite frequently, and lately they've actually been getting a lot of people going over and thanking them for the work they've done, which is fantastic. So I would highly recommend going and having a chat to them as well. I believe there are still a few more books here, and there are some stickers there and there. Other than that, thank you very much for your time, and I hope you have a great day.