Recording Day 1 Azure Stack HCI Days 2021
Hello and welcome to the second Azure Stack HCI event. This time it is Azure Stack HCI Days, because we have two days. My name is Carsten Rachfahl, and I'm joined by Manfred Helber, who is so nice to share his studio with me so that we can do the next two Azure Stack HCI Days here together. So Manfred, some words about you.

Yeah, welcome also from my side. I'm very happy that Carsten is here for the second time for the Azure Stack HCI Days. Last year it was the Azure Stack HCI Day, and before the Azure Stack HCI Day we met several times at your Cloud and Datacenter Conference, which was an in-person event, a real-life event. Because of the situation we all know, Carsten had to switch to a virtual event, and I already had this video studio that Carsten mentioned. So I'm very happy that we are using it for Carsten's Azure Stack HCI Days and also for several events I'm doing throughout the year, and I'm really looking forward to the many interesting tracks you have on your agenda.

Yes, we will talk about all the tracks and the speakers, and of course the sponsors, very soon. But first I want to say some words about the Cloud and Datacenter Management Conference. Unfortunately we couldn't do it last year, in 2020, because of the pandemic. This year it is also not possible, and to be honest, next year there will also be no Cloud and Datacenter Management Conference. We are planning now for 2023, so we hope that by then the coronavirus is something we have got used to, and everybody or nearly everybody is vaccinated or has had the virus and is immune. So next year no Cloud and Datacenter Management Conference, but I hope in 2023, and I hope we will do another Azure Stack HCI Day in 2022. Manfred, you start with a new show on the first of October. I think you will mention it in your presentation, because you do a new thing.

Yeah, there will be an Azure Stack HCI show. This show is presented by Sven
Langenfeld, who is a Microsoft employee, and myself, and we have great guests in the Azure Stack HCI show, for example Carsten in the first show. Maybe some of you know Eric Berg. He is also a Microsoft MVP, and he will also be in the first show. In the third show, I think, Cosmos Darwin will present. As the name says, the show is all about Azure Stack HCI. If you are interested and want to watch it, you can find it on YouTube: when you search on YouTube for Manfred Helber, my name, you will find my YouTube channel, and there you will find the show. It will be online on the 1st of October from 12 o'clock to 1 p.m., and the YouTube link will be available on Monday. So it will be listed there on Monday, but my channel is there already.

And we also do a little thing together, the On-Prem Show. There was a long summer pause in the On-Prem Show. That's mainly my fault, because I was in a bit of a corona low and didn't do a lot of webinars and other things. So we will start again, I think also in October. In October, yeah. I think in the middle of October is our next On-Prem Show. There is a date, yeah, and it's also streamed on YouTube.

Okay, so let's switch to the slides and do some logistics. For your information, we will talk about the sessions very soon, but first: the sessions always start on the full hour. After Manfred and I have talked a bit about the Azure Stack HCI Days here, the first technical presentation will start at 1 p.m., or as we say in German, 13:00, and a session will last up to 55 minutes. I talked to the speakers, and they are free in how much time they use, but a session should be at least 45 minutes, and then we have some room for Q&A. So if you have questions and you are live in the event, please ask your questions in the chat. Manfred and I will answer them, and we have some other helping hands here, and if questions are
still open, we will ask them directly to the speakers so they can answer them.

Yeah, so we have two small issues. We are using a Teams live event. You have successfully logged in, so it works, but we have a delay in this live event: we are talking here, and you will see the content about 30 seconds later. So if you write down a question, it may take about one to one and a half minutes until we see your question in the Q&A area. We already have a question from Jan. He's asking if there is an interactive chat. Yes, you have found it, Jan. And he asks, if yes, where: it's here. What's not really there is the interactivity, because we always have to publish a question. We can publish it or we can answer it. This is the reason why, when you create a question and send it, we will see it, and it takes a while until you get an answer, because we take the question to the speakers, or Carsten and I answer it here, or we will write down an answer if we are not speaking at the moment and find time to go into a question. So it should be interactive, absolutely, but please keep in mind that there's a small delay.

Yeah, and we will have some questions for the presenters anyway, because Manfred and I also have questions. Then, all sessions are recorded and will be available later. Of course I will inform you. I hope most of you ticked, when you registered for the event, that you want to get the newsletter. If you're on the newsletter, I can send you a mail when the sessions are available. They will also be available on my YouTube channel. I'm not quite sure yet whether we publish the whole event or whether I cut it up into smaller pieces, hourly sessions. And keep your fingers crossed: this is a very long Teams live event. We start now and it runs up to midnight, so that's eleven and a half hours. I have never done such a long live event. Right, me neither. So I hope we will have no technical issues. You never know, we are dependent on the
internet, and in my area the internet is very scarce today, because today is a day where the provider is doing some work.

Okay. Then at the last session, the closing that we do tomorrow evening, I will have a raffle. I asked the sponsors about prizes, but then I thought: how do we get the prizes to the winners? If you win from outside of Germany, maybe in the US or wherever you are, it could be very hard to send them over to your place. So I decided: let's do Amazon gift cards. When you win the raffle, I get your mail address, of course, and then I send you a gift card, and you can use it for whatever you want. And we will have another prize. I also do an Azure Stack HCI course, a five-day course. I also have a Storage Spaces Direct course and a Hyper-V course, but these are the Azure Stack HCI Days, so I will give away a free participation in my course. If you are only an English speaker, not a German one, we will find a solution for that. Usually I do my courses in German, but maybe we find some more people who are interested, and then I will do a five-day English course. You can participate in the course over Teams or live in the nice town I live in, Hallenberg. Manfred was also there. It's a nice town. It's a small town, but it is nice. Yeah.

Okay, let's switch over to our speakers, and if you have questions about this, we will also answer them. We have an amazing lineup of speakers. I think I have four slides of speakers. There are 30 different speakers. Actually there are now 29, because one of our speakers got ill. He informed me on Monday, I think, and I had to shift the agenda a little bit, so I'm the next speaker today. Usually it would be Jan-Tore from Norway, but he is not up to presenting because he is a little bit ill.

So let's talk about the different speakers you see here. Fabian Bosca is from the company Fujitsu. I hope that's
correct, Fujitsu. Manfred, is that correctly pronounced? I hope so, yes. He will present today about the great solutions Fujitsu has for Azure Stack HCI, and they are especially talking about stretch cluster scenarios. The title of this presentation is, I have it here, "Deep dive: building stretch clusters with PRIMEFLEX for Microsoft Azure Stack HCI".

Then we have two presenters from Dell, another sponsor: Michael Wells and Lisa Clarke. Both of them are also Microsoft Azure MVPs, as Manfred and I are MVPs too, and their session will be "Turning hybrid cloud dreams into reality with Dell Technologies". I like the title very much, and I'm looking forward to that session.

Then we have Rob Hindman. Rob is from Redmond, and Manfred and I have known Rob for quite some years. He is also responsible for the Cloud and Datacenter Management MVP program that Manfred and I are in. Rob is our MVP contact, let's say. He's presenting, let's have a look, I think tomorrow. Tomorrow, yes, the last one: "Reboot in seconds with Kernel Soft Reboot", with Christina Coletti. We have her later. Oh, she's here on the slide. Christina Coletti is also a program manager from Redmond. They will talk about rebooting in seconds with a Kernel Soft Reboot. I'm looking forward to that, because it will shorten the reboot time of an Azure Stack HCI cluster a lot. Right, absolutely.

Sorry for interrupting you, Carsten. We have one message in the chat asking whether there is already audio or not. Because of Jan's answer we can be very sure that there is audio: he heard our message that there is a chat but that we have to publish the questions and so on, and he replied to this information, which we gave via audio. But maybe you can send some Q&A messages to us saying whether the audio is okay for you, whether you can hear everything and see everything, because if there are any
issues, we can use these first minutes to optimize things until the first session starts. It was an anonymous message, so I don't know who is behind it and who has issues with the audio. I assume this is an individual problem on that specific machine, but it would be perfect if some of you could give us feedback so we can make sure that everything is fine.

Yeah, and Jan also said it would be perfect if the attendees could chat with each other. That's unfortunately not possible with Microsoft Teams live events. That's only possible in a Teams meeting, right? Yes. And then we wouldn't see all the questions, because usually there is so much going on in the chat that you don't see the questions anymore. Right.

Okay. Then Andrew Hansen has a session with Tina Wu. She is on another slide. He's also a program manager in Redmond, in the storage and file systems team, and in that session we will hear about ReFS and thin provisioning for Storage Spaces Direct. Then we have Roy Casserby. Where's Roy? He's one of the presenters about Secured-core servers. And David Shot, also live from Redmond, is presenting about SDN, software-defined networking.

Let's go through a bit quicker, because we have only 15 minutes to the first session. Then we have Alvin Morales. Alvin, George, Prasit and Zakib, and Payman, Trung and Mike are also presenting live from Redmond or from the Microsoft product groups. Alvin is doing, with other presenters, "GPUs for highly available VMs". I'm looking forward to that session a lot. George is, I think, doing an SDN session. Prasit is in two sessions, one about GPUs and one about Secured-core servers. Zakib is also in the Secured-core session. Payman is in the GPU session. Jan-Tore, unfortunately... I didn't want to remove him from the speaker list, because he was a speaker until Monday, when he got ill. So I hope you get better soon, Jan-Tore, we
miss you here. Trung is doing a session, let's have a look... where's Trung? Trung is Network ATC. He is doing Network ATC with Dan, who is on the next slide, so we will see how you can automatically configure your network in Azure Stack HCI. Tina we already mentioned. She is doing the session about thin provisioning in Storage Spaces Direct with Andrew Hansen.

Thomas Maurer is a former MVP, and we know him very well. He's from Switzerland, so I think he is the only Microsoft presenter who is not from Redmond. Thomas is talking about Arc, the new thing from Azure where you can manage everything, right? Jaromir Kaspar: Jaromir is an ex-Microsoft employee. He is very well known for WSLab, which is now called MSLab, and he will show us how to deploy Azure Stack HCI on hardware, I think with MSLab. Dave is an MVP fellow from Canada, and he will talk about Defender in Azure Stack HCI. Another MVP, a friend of ours from Belgium, will talk about SMB over QUIC, a new technology where we don't use TCP for SMB and other protocols but the new QUIC standard. That's new in Windows Server 2022, the Azure Edition, right? Yeah, it's in the Azure Edition.

Let's see. Manfred you already see in the picture. Manfred is also an MVP in the Cloud and Datacenter Management group. Manfred will talk about how we deploy Azure Stack HCI with Windows Admin Center, and his session is after mine, at two o'clock. I'm presenting about stretch clusters at one o'clock, where Jan-Tore's session was. Then we have Helmut, another MVP, from Austria, and he will talk about "From SMB to data center: the scalability of Azure Stack HCI".

Now we are on, I think, the fourth speaker slide, and we have Udo Walberer. Udo is also known to you if you were a participant in the last Azure Stack HCI Day in November. Udo is from Lenovo, and he will do the Lenovo session, let me see: "Making Azure Stack HCI solutions easy: Lenovo ThinkAgile MX". And then we have Cosmos,
Cosmos Darwin, of course. Cosmos will present about the Azure Stack HCI roadmap today at five, and I'm really looking forward to hearing what Cosmos will tell us about the future features and the roadmap of the product we are talking about for the whole two days. And Jason: Jason is also a program manager from Redmond, and he will talk about the new VM Fleet. I am a huge VM Fleet fan. Petra, another employee in my company, and I test all our Azure Stack HCI and Storage Spaces Direct installations with VM Fleet, and there is a new VM Fleet coming. Very interesting, and he will show us all about it. Then we have Dan Cuomo. Dan is from the networking team in Redmond, and he will present with Trung about Network ATC, so automatically deploying your network. Then we have Matt McSpirit. He will tell us a lot about Azure Kubernetes Service, and a colleague of his has another session: Mike, we had him on the first slide, Mike Kostersitz, will tell us about the architecture of Azure Kubernetes Service. And then we have Jeff Woolsey, last but not least. Jeff is also well known as a speaker in the community, and he will talk about the new things in Windows Server 2022. That was the speaker list. It's quite a long list.

Yeah, so we have 10 minutes left. Here you see the agenda. It's live on the Azure Stack HCI Days site. We are now in the first session. It's not really a session, it's a welcome to all of you. We will start at one with my stretch cluster session, then Manfred, then we have our Dell session, the sponsor session. By then it's eight in the morning in Redmond, and the Redmond speakers will join us, starting with Cosmos about the roadmap. Then we have Dave from Canada on Windows Defender Advanced Threat Protection. Then we have the networking session. Today we also have the sponsor session from Fujitsu, where we learn a lot about the PRIMEFLEX
servers. I hope we will also learn a bit about the new server generations that are coming, also in the Dell session. Then we have thin provisioning for Storage Spaces Direct and ReFS improvements. And then something I think many of you are looking forward to: GPUs for highly available VMs. They will talk about what's coming for GPU support in Azure Stack HCI 21H2 and 22H2. Then we close the day with a Secured-core server session, and then it's nearly midnight and we will close up.

Tomorrow we will start at one o'clock with Jaromir. He will show us how to deploy Azure Stack HCI 21H2 with MSLab. Then we have Helmut with his "From SMB to data center" session. Then we learn about Azure Arc and what we can do with it on premises, I hope, from Thomas Maurer. Then we have Udo with the Lenovo session, "Making Azure Stack HCI solutions easy". Then we have a round table, and I'm really looking forward to the round table, where of course Manfred and I, Didier, Jeff Woolsey, Helmut Otto, Jaromir Kaspar and Matt McSpirit are here to answer your questions. I hope we get a lot of questions. You can ask anything related to Azure Stack HCI and, of course, Server 2022, and I hope that we will have a very lively discussion about these things. After that we have Jeff Woolsey with a session about Windows Server 2022, and then two sessions about Azure Kubernetes Service, something I really look forward to, because, to be honest, I'm playing around with Azure Kubernetes Service on premises, but there's still something missing in my head, so I hope I learn a lot here. Then we have a "What's new in software-defined networking" session, then the VM Fleet session with Jason, and then the Kernel Soft Reboot: reboot in seconds, not in minutes. And then we have the closing with the raffle, and then everybody can sleep. I think we both will be very done by then, right? I assume so, yeah.

Yeah, I'm really looking forward to this reboot session, because the
reboot will be faster than the delay we have here. Yeah, I hope so. Do you think it's under 30 seconds? I don't think so. I assume with VMs, but not with hardware. We will see.

Okay, and then of course I have to thank our sponsors, because without the sponsors these free events are not really possible. So a big thanks to Dell Technologies, who are supporting the Azure Stack HCI Days, to Fujitsu and to Lenovo, and of course a big thank you to Manfred that we can do this here with his nice equipment. I think so far everything is okay, or do we have...?

Yes, and we had a lot of feedback that audio and video are fine. Many wrote that. So my guess is that it may be an individual issue with the audio settings at the endpoint, because Teams works in the cloud, so if one person can receive our audio and video, then everybody should. We have a few questions in the chat already, and one of them is very strategic, so I'm not sure whether it was meant for us to answer or whether somebody wanted to ask the audience. Maybe this anonymous attendee can add that information. The question is: do you see Azure Stack HCI as a virtualization platform, a cloud platform (ignoring Azure Arc), or both? If the intention was to ask the audience, please resend the question with that information. Then we can publish it and try to get some input from the audience. If the question was for us, I think it's perfect to discuss it tomorrow in the round table. We have only a few minutes left now, so we should keep it until tomorrow. I have also already seen a question about GPU support, and my recommendation would be to put this GPU question to the session later where we talk about GPUs. And we have one question about where to watch sessions that were missed. As Carsten mentioned, everything is recorded and will be published in the
next days, so you cannot watch it immediately, but some time later, I think next week. Yeah, I hope to get it up next week. The sessions can only be recorded by Teams. We do it with a Teams live event, so we will get a recording when Teams does not have a problem. So if there are sessions that are very important to you, watch them live if that's possible for you. I know the time span is very large, from midday in Europe to midnight, but if it's important to you and you want to get your questions in, watch them live. Usually there are no problems with the recording of Teams live events, but if there are and we didn't have a recording, that would be terrible. So live is better, of course, than watching the recordings. This is also the reason why we only have one track: we have two days with one track instead of one day with two tracks, so you can really watch every session. And maybe there's a session that is not for you, so you have time to eat something and so on. Remember, the sessions always start on the full hour. If there should be any technical issues, if something doesn't work or the screen freezes or something like this, then stay on the call. We will come back. We have a lot of backup equipment here. So stay in the session, we will do our best, and then you can receive the information.

Okay, you can also ask questions in German. Yes, we are very capable of German. I hope our English is okay. I will not talk German here, because if I switch to German in my head, I have problems doing English again. And we can translate these questions. I did this with some of your questions before.

So we have two minutes to go, then I will start my "Stretch cluster notes from the field" session, and questions on that are very welcome. For me it's a little bit small to read. Yeah, you have to ask the questions to me. So I think this start went well. We have nearly done the first
half hour, so only 11 hours to go today. Yeah, and it seems that for the attendees everything is fine. To mention this again: if you want to have something discussed with the audience, we can publish the question. I didn't publish any of the questions so far, because they were feedback about audio and questions that we will take into the later sessions. But if you want to discuss something with the audience, add the information that it would be great if we published it. Maybe we cannot publish every question, but we will do our best. As soon as I publish a question, everybody sees it, and then you can interact with each other.

So Manfred, I have a technical question: do I switch? No, I use this microphone, right? Yes, you use this microphone, because you are on my camera here. So you only lean back and start with "Stretch cluster notes from the field". I have a question for the audience: do you like to have the speaker at the side of the slides, or do you want the full view of the presentation? I have to start the presentation now. If we do demos, of course, I will switch to full screen, and if the feedback is that a full-screen view of the slides is always better, then I will stay on the full-screen layout. I will take this feedback and react based on it. Okay.

So it's 1 p.m. in Germany, and I will start with my session. Good morning to everyone out there. First, something about me. You are in the session "Stretch cluster notes from the field". My name is Carsten Rachfahl. I'm one of the Cloud and Datacenter Management MVPs, for 11 years now in fact. I got my 11th award some days ago and haven't even had the time to tweet and blog about it. I'm also an Azure MVP, now in the third year. I like being in both categories. As I always say, Cloud and Datacenter Management is the old on-premises world and Azure is the new Microsoft world, and I'm in between, a bit more
on-premises still than in Azure, but both. And I'm also a Veeam Vanguard.

So why am I doing this session? In the Azure Stack HCI implementations I'm doing, the stretch cluster is really the feature that most customers want. If they don't do Storage Spaces Direct and do Azure Stack HCI instead, it's mainly because of the stretch cluster feature, because in Germany, and I think it's the same for Austria and Switzerland, we use a lot of stretched clusters. If you look at the number of people who live there, the most stretched clusters per capita are maybe in Germany.

So why use a stretch cluster? What are the reasons for it? It is of course more complex than a normal cluster or even standalone nodes. The data in the VMs is very valuable today, and in some environments, and there are a lot of those, you can't lose hours or even minutes of work. We have companies today that work 24 by 7, so the whole 24 hours, seven days a week, including Saturday and Sunday, and some of them even 365 days a year. I'm currently at a customer who has this requirement, where IT really has to work 365 days. You really need full availability, and a stretch cluster can help with that.

We want to protect ourselves against local disasters. Think about a natural catastrophe: a volcano erupts, like we have now on La Palma, or you have a thunderstorm, a hurricane, earthquakes and so on. So you may have a stretch cluster where there is a large distance between the two sites. But it is also a stretch cluster if you have two rooms separated by one wall, and you want to be prepared for one of the rooms losing power, burning or whatever. So in a stretch cluster the distance can be very small: two rooms directly connected,
or the rooms are on the same campus, maybe 100 or 500 meters apart, but it can also be a very long distance. You want to protect yourself against the total unavailability of one of these sites, so if one site is hit, the VMs start on the other site. That's very important: it's not something where you constantly replicate the VMs and the memory of the VMs. It's a replication of the data. If a VM runs in site A, the data is also replicated to site B, and if you have a disaster in, say, site A, the VMs can be started in site B. So you have a short outage of your application, but it will be back after maybe some minutes, on the other site.

There are some requirements for a stretch cluster. We have to have at a minimum four nodes, two in each site. So the smallest Microsoft Azure Stack HCI stretch cluster has to have four nodes, not two: two in each site. We have to have a network between the sites that can constantly replicate the changes, the churn. If we have a long-distance stretch cluster (on the next slide we will see an example from Microsoft where they have one site in London and one site in Paris, which is very far away), then you have to have a connection between the two over which you can constantly replicate your churn. If you have VMs on one site that do a lot of writes, you have to get those writes to the other site. A 100-megabit internet connection will not help you there. You have to have gigabit or even 10 gigabit, and that is not so easy, especially for long-distance stretch clusters. On a campus it shouldn't be a problem to have 10 gigabit between the two sites.

Then, what's very important is the witness placement. We have to have a third site, or Azure. I will talk about the witness placement a bit later, because it is really important. We have four nodes, and four is an even number of nodes, so the cluster must have a witness, and the witness can't be in the first site or in the
second site. This is very important, and many stretch cluster projects don't have this third site or don't have the right connection to Azure. Azure is also an option you can choose. And then: Azure Stack HCI is an Azure product, but every host has to be joined to an Active Directory, and I'm not talking about Azure Active Directory. I'm talking about the classic Active Directory. That can be on premises. Of course it can also be hosted in VMs in the cloud, but the nodes have to be in an Active Directory. For example, for live migration, moving a VM from one node to another, you need tickets from Active Directory. That is a must today, otherwise you can't live migrate.

So here is an example from Microsoft. Microsoft always thinks big, because with the Azure data centers, when you have your redundancy there, they are, I think, at least 300 miles apart. So the London-Paris example is something Microsoft thinks of: the cluster is stretched over a large distance. But that is not a requirement. London could also be just the name of one of your sites: one room is called London, and the other room on the same campus is called Paris. But we can also have a cluster stretched over both European cities.

So when we have this cluster, we have VMs running in the London part, on these two nodes. We have a storage pool where these VMs live, so their data is spread over these two nodes. When we create a replicated volume, all data that one of those VMs writes to the CSV, the Cluster Shared Volume, is replicated to the other site, synchronously or, with such a distance, asynchronously. There we write it into a volume that is offline. It's not accessible. But when a disaster hits, this volume will be brought online. So if we have an outage here in London, for example a fire or a power loss, these two nodes here, because this is one
cluster, one cluster stretched over two sites, will notice over the heartbeat that the other nodes are not available, and then they will bring up this CSV where the data is and start the VMs that are in this CSV.

This looks a bit like an active-passive design. That's one of the possibilities, but more often we have an active-active design. So we have other VMs running on this site in another volume. This volume can also be stretched to the other site, and then if this site fails, the VMs that were running here will be started there. So it can be an active-active or an active-passive construct.

Normally I would have an eight-minute demo here on how to install an Azure Stack HCI stretch cluster. In the next session Manfred will show the installation in detail, so I will only show you the important part: what is different between a stretch cluster and the normal cluster that Manfred will show. What you see is Windows Admin Center, and this is the current version, 2103.2.
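
The witness requirement mentioned in the talk (four nodes are an even number, so a witness is needed, and it must sit in a third site or in Azure) can be illustrated with a small vote count. The following is only a rough sketch under simplifying assumptions: every node and the witness hold exactly one vote, a partition needs a strict majority of all votes, and the dynamic quorum behaviour of a real failover cluster is ignored. The function name `survives_site_loss` and the site labels are hypothetical and are not part of any Microsoft tool.

```python
from typing import Dict, Optional


def survives_site_loss(nodes_per_site: Dict[str, int],
                       witness_site: Optional[str],
                       failed_site: str) -> bool:
    """Return True if the surviving votes form a strict majority.

    Simplified static model (an assumption for illustration): one vote per
    node, one vote for the witness, no dynamic re-weighting of votes.
    """
    total_votes = sum(nodes_per_site.values())
    surviving_votes = sum(
        count for site, count in nodes_per_site.items() if site != failed_site
    )
    if witness_site is not None:
        total_votes += 1
        # The witness only helps if it is not located in the failed site.
        if witness_site != failed_site:
            surviving_votes += 1
    return 2 * surviving_votes > total_votes


if __name__ == "__main__":
    sites = {"London": 2, "Paris": 2}
    # "Azure" stands for a witness outside both sites (third site or cloud).
    for witness in (None, "London", "Azure"):
        for failed in sites:
            ok = survives_site_loss(sites, witness, failed)
            print(f"witness={witness!s:7} failed={failed:7} keeps quorum: {ok}")
```

In this model, a 2+2 cluster without a witness never keeps a majority after losing a site, a witness placed in London does not help when London itself fails, and only a witness outside both sites lets either site survive the loss of the other.
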
Under Add, and I will do that now, we can add a cluster. Then we have create new server cluster (there are other possibilities too), and then we have the choice to install a Windows Server cluster or Azure Stack HCI. Here you see that we have all servers in one site or servers in two sites, and that's the difference. When you see Manfred's demo soon (let's stop here), he will not choose two sites. He will install the two nodes in one site. Then everything is the same until minute six here, after the cluster creation.

Here in minute six, when we continue after the cluster creation, because we chose two sites at the beginning, we have to assign two names for the sites. Here I chose East and West. Those are the two sites. Then we have to assign the nodes to the individual sites, so two nodes are added to East and two nodes are added to West. Then in the storage part, which you will see in Manfred's session, it will create a storage pool in site A and a storage pool in site B. As far as I know, that's the only case where we can have more than one storage pool in an Azure Stack HCI cluster, and Storage Spaces Direct does not support two pools.

So I go back to my presentation and continue with the possible designs, which we already talked about a little. We have two possible designs. First, the active-passive design. We have a stretch cluster over two sites, site one and site two, with a minimum of two servers in site one and two servers in site two. All our active volumes, our stretched volumes, are in site one, and they are replicated with Storage Replica to site two, but there they are offline, and we have VMs running only in site one. So these blue and red VMs (it's not really blue, it's purple) are running in site one, and site two is passive. It's just waiting for site one to fail, and then it will bring up the volumes
automatically and also start the VMs. But to be honest, I have never installed an active-passive design at a customer site; so far they have always had active-active designs. In an active-active design we have one stretched volume here that is replicated from site one to site two, and we have one active volume here in site two replicating to site one. Of course VMs are actively running on the hosts of one site and VMs are also actively running on the other hosts, and now each site is watching over the other site. If there is a failure, the volume from the site that fails will be brought online on the other site and the VMs are started. And of course you can have multiple volumes. You don't have to have one volume on one site and one volume on the other site; there can be multiple stretched volumes on each site.

In a stretched Azure Stack HCI cluster we also have the possibility, you will see that in the demo later, to create a volume that is only present in one site. It does not have to be stretched. That is useful, for example, if you have applications that have redundancy built into the application, like an Exchange DAG or a SQL Always On cluster. They already have redundancy in the application, so you put one set of VMs in a volume that is only present in one site and the other set in a volume that is only present in the other site, and then you use the in-application replication.

So, the possible nodes in a stretched Azure Stack HCI cluster in two sites. We already learned we can have four nodes, so two nodes in each site. Here in this picture we have a site in Frankfurt and a site in Hanau, two German cities that are not so far apart, let's say maybe 15 or 20 kilometers, so we can do that. And of course you can have a six-node cluster, three nodes in each site, and you can have an eight-node cluster, four nodes in
each site, and you can have ten, twelve, fourteen, or sixteen nodes, so five, six, seven, or eight nodes in each site. What you can't do is stretch a two-node cluster or a three-node cluster, or have an odd number of nodes. So you can't do a seven-node cluster with three in one site and four in the other site; there is always the same number of nodes in both sites.

And if you look at this, we always have an even number of nodes, so we have to have a witness, because we need a tie breaker for our cluster design. That's our next topic, the witness design. We have again our picture from before: we have an active-active cluster with two nodes here and two nodes here, and we need a tie breaker. So let's have a witness, the green one here, in site one. If we place our witness in site one we now have five votes. If something happens to our site two, two of the five votes are gone, so we have three of five, and more than half of the votes are available. So what will happen? The replicated volume from site two will be brought up in site one and the VMs are started there; everything is fine.

So let's do the other site: site one fails, and of course then our witness is also gone. We have two nodes running in site two. They are perfectly fine, but they now have only two votes out of five, which is below half of the votes, so they are not in the majority, they are in the minority. What do they do? They shut down the workloads they have, and we have no VMs running. But you build a stretch cluster to have all the VMs running in a disaster, not none running because the cluster is in the minority and shuts down its workloads. So how do we do that correctly? We have to place our witness in a third site. And what's very important: each site can communicate with the witness without the other
site. So we have to have a direct connection from site two to the witness and also from site one to the witness, not going through site one to the witness. If site two can only reach the witness through site one, because all the internet connections are in site one, that's not a valid design and you can't build a stretch cluster that way. If site one goes down, remember the slide before, your witness is also not reachable. It's running in your third site or in Azure, but you can't communicate with it, and that's the same as it being down, so the cluster will go down. So we have to have the witness in a separate site and each site can communicate with it. Now site two fails, and what will happen? The cluster will bring the VMs online in site one. The other way around is the same: we have three of five votes, so still the majority, and our VMs will be brought up on the other site.

So what witness options do we have with a stretched Azure Stack HCI cluster, or with any Azure Stack HCI cluster? We have the same witness options as we had in Storage Spaces Direct. We can choose a file share witness, where we have a share somewhere, I think at least an SMB2 share, not SMB1. Or we can have the cloud witness (we have another picture). Cloud witness is an offering in Azure: you have a blob that is your cloud witness, and it's very cheap. I heard from a lot of customers in the past who have done or are doing Storage Spaces Direct: "No, no cloud witness, we don't want to be connected to the cloud." With Azure Stack HCI you have to be connected to the cloud, because Azure Stack HCI has to inform Azure at least every 30 days how many cores are running, because Azure Stack HCI is billed through Azure. So you have to have a connection to the cloud, and then you can also use a cloud witness. If you don't have a third site handy for a file share witness, you can of course use a cloud witness, and usually it's much cheaper than building a third site just for the witness. And
what's not supported in Azure Stack HCI and Storage Spaces Direct, even if they are not stretched, is a disk witness. So in Azure Stack HCI, stretched or not stretched, and in Storage Spaces Direct, only the file share witness and the cloud witness are supported, not a disk witness.

So let's do a small demo of how we install the cloud witness. Where is my... there it is. I go to another node, oh no, I do it here, the others already have a witness. So we go to Settings here. I'm now in an Azure Stack HCI cluster, it's installed. (Yes, we'll talk about your questions at the end, that's a very good one.) I have already installed it, it's registered in Azure, and now I go to the settings. Manfred will show you how you register your Azure Stack HCI in the cloud, right? Okay. So if we go here into the settings of the Azure Stack HCI cluster and go to Witness, we see we have no witness, and we could choose a cloud witness or a file share witness. I will add a file share witness, and then we have to give it a share. Here I have a share; we see there are already two clusters registered under this share, and I will now do the third. So I go back here, paste it in, and click on Save, and now we have our witness set. The witness of course has to be placed in a third site or in the cloud. So now our cluster has a witness, and if something fails it has the possibility to have a quorum. We say the cluster has quorum: it has more votes than the other side. And here you see it's also registered in Azure, and Manfred will show that in the next session.

So let's go back to my slides, this was a very small demo. For the replication of the stretched volumes Microsoft uses a feature that has been available in Windows since Windows Server 2016. It's called Storage Replica. You can use Storage Replica if you have Windows Server 2016, 2019, or 2022 Datacenter, and of course it's included in Azure Stack HCI, the new
operating system. It's a feature for disaster recovery, so it protects you from disasters. Here we see where the storage replication is embedded in the I/O stream to the disk, the I/O path from an application to a disk device. It's agnostic to file system filters, to the file system we use, to snapshots, BitLocker, everything. In fact it sits between the volume manager and the partition manager, so in fact we replicate partitions, not volumes. Here every I/O to the disk is replicated to the destination site and then also written to a disk there. So it's very simple, it's block based: everything that is written to the volume is then transmitted over SMB3 to the other site.

There are three scenarios we can do with Storage Replica. There is a server-to-server scenario: if you have a Hyper-V server with a local volume, you can replicate it to another Hyper-V server, where the volume is offline. If a disaster strikes here, you have to bring this volume online manually and register all the VMs that are in the volume, because this is not a cluster. Server 2 doesn't know about your VMs in that volume, it doesn't even know the volume, so this is a manual task. We have another scenario where we have two clusters using Storage Replica, and it's the same: we have one cluster here and one cluster here, and if this volume fails you have to manually bring up the volume and register the VMs in the second cluster, because the second cluster doesn't know the roles, only the first one does. By the way, you can do this with Storage Spaces Direct, with two separate Storage Spaces Direct clusters. But what we want is a stretch cluster, where we have one cluster including both sites and we have our volumes that are replicated. If this site fails, the cluster already knows this volume, knows that it can bring it up, and will bring it up automatically. It also knows all the virtual machines, because this is the same cluster and every node
knows every role and the status of every role. So it brings up the volume and starts the roles automatically.

So how does it work? We have two types of replication. First, the synchronous replication. We have an application, for example our VM, that is writing a block to our source node. In Azure Stack HCI we have a source node and a destination node that are replicating the volume. So the application, the VM, writes a block to the source node, and the source node writes the block into a separate log volume, which is also a CSV, and at the same time transports the data to the destination node over SMB3, so SMB3 is used as the transport. On the destination side it is also written into a log volume, and then the destination acknowledges the successful write of the data to non-volatile storage. The log volume is on flash, so once it is written to the log volume the data is there even if the server loses power. Then the write is acknowledged to the source, and then the application gets its acknowledgement. So now the data is written, and of course how long it takes from the write until the acknowledgement depends on how far apart your two servers are. If they are in the next room it's very fast. If there are 300 kilometers between the sites you have maybe 20 or 30 milliseconds of delay on your acknowledgement, and that's too much for a lot of applications. After the acknowledgement is done, the data is copied from the log to its actual destination, the data volume. So as you see here, we have four volumes for a stretched data volume, four volumes instead of one.

And we have some requirements: the latency between the sites should be below five milliseconds, and the log volumes have to be on flash storage. If we can't meet these requirements, if the delay is longer than five milliseconds, we also have the possibility to use asynchronous replication. How does that work? We have our application, it writes a block, the block is
written to the log, and the write is acknowledged. Then the data is transferred to the destination node, written to the log there, acknowledged, and then written to the data volume, but we don't wait for the other site. So with asynchronous replication the acknowledgement is immediate and you don't have the delay, but your data is not on the other site immediately; there is a delay of maybe two seconds or 30 seconds or whatever. It's a continuous replication, but you don't have the guarantee that every write is on the other site. That's also not bad, because if you have a disaster on the source site you are maybe missing, let's say, 30 seconds of your data. That's much better than restoring a backup from the night before that does not include maybe six or eight hours of data you have already changed in the VM.

Okay, so let's do a demo where we create a stretched volume. We can do it here: we go to Volumes in Windows Admin Center, and you see we already have four volumes in total, and these volumes are the very important cluster performance history. (Where is my mouse? Here, yeah, it's very small.) If you create a stretched Azure Stack HCI cluster, you get your cluster performance history already stretched, because it's important and we don't want to lose the data from all VMs, from all disks, from all volumes. So Microsoft creates a stretched volume, and here you see the four CSVs we get with that. Now we create a new stretched volume. Click on Create, it will gather some information, and we now have the possibility to create a volume in a single site, for my Exchange DAG or my SQL Always On cluster. No, we want to create a replicated volume across two sites. Then we can specify where it is online, in West or East. If it's West, it will be created in West and replicated to East, or the other way
around. So if you create multiple volumes, you would spread them over the sites for an active-active design. We just do it from East to West, because East is first here. Then we can choose which replication mode we want, synchronous or asynchronous, and of course you can change that later, from synchronous to asynchronous or from asynchronous to synchronous, after the fact. I will call it "stretch volume", and now we can choose the resiliency. This is a four-node cluster, so we only have the possibility to do a two-way mirror in each site. If you want to do something like a nested mirror, a nested resiliency, we have to do it in PowerShell, but for the demo here we do a two-way mirror. And I say it's a 400, not terabyte, it's a 400-gigabyte volume. Then we have some advanced options here. You see our volume is called "stretch volume", so it will create the replication group names from the volume name, and there is a log volume size. As a best practice it turned out you should use at least 100 gigabytes, or better 200 gigabytes, for your real-world stretched volume, so the proposed 40 gigabytes is far too small. Use 200 or even 500 gigabytes if you have a 10-terabyte volume. We can do some other things here, but we can't do BitLocker and so on.

Now I press Create. What will Windows Admin Center do now? It will first create the stretched volume in site East, then it will create the stretched volume log in site East. Then it will create the stretched volume replica in West, and this is offline, and it will create a stretched volume replica log in West. That will take a bit, and it will also install and run Storage Replica. So we will come back to this one, or rather I go to another cluster that is prepared for that. If you look here and
look at the volumes, we have eight here. You see we have our cluster performance history, our stretched volume, stretched volume log, stretched volume replica, and stretched volume replica log. If we click on Storage Replica we can look at the replication status of the Storage Replica partnerships. There should be one partnership for the cluster performance history and one partnership for our stretched volume, and it is continuously replicating. We can look into some details here. We see it's synchronous here, and if we want we can go to Settings, where we can modify the partnership settings. So we could switch to asynchronous, or from asynchronous to synchronous, and we can increase the log size. It's a 200-gigabyte volume, so I would advise to do more here. Is it? No, I think here it's only 100 gigabytes. Okay, but I think you got the impression of how it works.

So let's go back to the presentation. I still have 10 minutes, maybe I need a little bit more. There are some PowerShell cmdlets to help us with the replication. To be honest, at the moment you can't delete a replicated volume with Windows Admin Center, or at least you can't delete it fully, so you need some PowerShell cmdlets, for example to delete the partnership. I hope it will be added to the Windows Admin Center module, but at the moment you need some PowerShell. And there are plenty of PowerShell cmdlets for Storage Replica. For example, we have Test-SRTopology, where we can first test whether a replication would work, especially in a stretch cluster that spans a long distance. We can look at the replica status. There are the replication groups: with Get-SRGroup you can see them and look at their status. And there are the partnerships, so you can also use Get-SRPartnership, and there is a Set, a New, and a Remove, of course. Then, if your synchronization stops for whatever reason, for example your internet went
down for two days, then of course your synchronization stops because the logs are full, and you may want to resync manually. There is a PowerShell cmdlet for that. Then a very important thing: you have to specify which network cards are used for the replication. Microsoft has some very strict requirements for the storage replication networks, and for your replication groups you have to specify which networks are used. Very important. Then we can delegate admin rights, so you don't have to be a domain administrator if you want to take care of the storage replication; there are possibilities to grant rights and revoke rights. And we can limit the bandwidth, how much bandwidth the storage replication can use, in bytes per second. Usually we have to have a very fast replication network that you don't share with other stuff, so you don't have to limit the replication bandwidth. You have to guarantee that your replication can always happen, otherwise you have some problems.

So let's move the VM and the volume. That was a question when I first introduced the stretch cluster before it was available. We knew last year that the stretch cluster scenario would be available, and people asked me in webinars: "Can you move a VM around without a failure?" Yes, you can do that, and I want to show you. So I go back to the demo. Here in this cluster we have a running replication, and I have deployed one small benchmark VM into this stretched volume. If we look in the Hyper-V settings here, you see we have it in the stretched volume. There are our disks, in C:\ClusterStorage, stretched volume, small benchmark. And now I see I missed something: I didn't include the VM in the cluster. I will add that, so I have to connect to the cluster. I'm on two, the cluster is called azhci to cluster. So this is live, and I open the cluster. The good thing is all the old tooling is still working. And if I go to Roles, there
is nothing, and I will configure a role, just a virtual machine. The cluster will look on the hypervisors for VMs that are not included in the cluster yet. So I add the small benchmark VM, and then we have it in the cluster. Here we are: it's running on the first node, you see that here. Now to our stretch cluster: I go to Windows Admin Center (clicked wrong) to show you the servers, and under Servers, under Inventory, we should see where each node is, since we have an East and a West site. No, we can't see that here, right. So in Windows Admin Center we go to Nodes, and here are the sites: the first node and the second node are in East, and the third and the fourth node are in West. So our VM is running on node 1 in East, in the volume called "stretched volume", and the owner of the stretched volume is also node 1, where the VM is running.

Now I want to move the VM to the other site, so let's do a live migration here: Move, Live Migrate, and Select Node. If I choose Best Possible Node it will move it automatically to node 2, because it's in the same site, and if I then live migrate again and let it choose where to move, it will move it back to node 1. So the VM will stay in the same site where the storage is. And if you move the VM to the other site yourself, which is now the case, I now have the VM running in the other site, the cluster will move it back to the site where the storage is. So the VM is following the storage; after a while it will be moved back automatically. So now we have the VM running in site West and the storage is in site East, and we can see that if we look at Storage Replica. Here it's much easier; I like Windows Admin Center and I always use it unless it doesn't have the feature I need, like moving a VM on purpose, which you can't do in Windows Admin Center in a cluster at the moment. So here you see we have our replication from node 1 to node 3.
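
The placement behaviour seen in this part of the demo can be written down as a small toy model. This is only a sketch in Python and not how the cluster service is implemented: the node names, the `SITES` mapping, and the helpers `best_possible_node` and `is_cross_site` are made up for illustration. The only rule it encodes is the one described in the demo, that an automatic move prefers another node in the site that currently owns the storage.

```python
# Toy model of "the VM follows the storage" in a four-node stretch cluster.
# Node names and both helpers are hypothetical; this is not a cluster API.

SITES = {
    "node1": "east",
    "node2": "east",
    "node3": "west",
    "node4": "west",
}


def best_possible_node(current_node, storage_owner, sites=SITES):
    """Pick a live-migration target, preferring the site that owns the storage."""
    storage_site = sites[storage_owner]
    # Every node except the one the VM is running on right now.
    candidates = sorted(n for n in sites if n != current_node)
    same_site = [n for n in candidates if sites[n] == storage_site]
    # Prefer a node next to the storage; otherwise fall back to any other node.
    return (same_site or candidates)[0]


def is_cross_site(vm_node, storage_owner, sites=SITES):
    """True when the VM runs in one site while its volume is owned by the other."""
    return sites[vm_node] != sites[storage_owner]


if __name__ == "__main__":
    # VM on node1, volume owned by node1: the automatic choice stays in East.
    print(best_possible_node("node1", "node1"))
    # VM moved by hand to node3 in West while the volume is still owned in East.
    print(is_cross_site("node3", "node1"))
```

In this model a VM that was moved by hand to West is flagged as cross-site, which is the state the demo is in right before the replication direction is switched.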
And now I click here. Let's see what happened with the VM: it's now running on the other site. So I say here Switch Direction, and maybe it asks me for something. Here we see the VM, and it's switching direction, and we will see two things. On a hardware cluster (this one is a virtual machine cluster) the I/O will halt for five to six seconds, and then we will have I/O again. Here it maybe takes longer because this is a nested virtualization cluster. Then the I/O will be there again, and then it will halt for another five to six seconds. Here unfortunately it takes longer, but I've done this multiple times in the real world: you will have an I/O halt of five to six seconds and then another one, but your VM keeps running. After, let's say, 30 seconds the storage is on the other site. You have two small I/O pauses of five to seven seconds, then the VM is running with its storage on the other site, and now it will rebuild the replication, replicating some blocks back to the other site.

Okay, let's go on, I'm nearly done, with the time at least. Now some important networking requirements. Microsoft was a little bit late in documenting the network requirements for a stretch cluster. Here we see an eight-node stretch cluster, four nodes in site A and four nodes in site B. Inside the sites you can do SMB3 over SMB Direct, so with RDMA, not for the storage replication, excuse me, but for the software storage bus layer. When your VM writes into an extent and you have a three-way mirror, the data must also be written to other nodes, and for that we use SMB Direct preferably. Switchless is also supported inside a site, so you can connect the four nodes with a lot of cables without switches. For the replication network between the sites we need a fast, low-latency network. Low latency is important because if you do synchronous
replication and you have a high-latency network, it adds the milliseconds to every I/O. Microsoft does not support RDMA for the replication traffic. Indeed, Microsoft wants to have separate adapters for the replication traffic with no other traffic on them. I have some support cases where that was clarified: no other traffic than the replication traffic on those adapters, and only SMB over TCP. And we have to have layer 3 routing. Microsoft does not support a layer 2 IP network between the sites. Even if there is only one thin wall between the two sites and you could perfectly well have a layer 2 network, Microsoft requires a routing instance, so you have to have routing in your switches or in your firewall for the replication traffic. I'm talking to Microsoft about this, because in a small stretch cluster with short distances it would be much easier to have a layer 2 IP network instead of doing routing in switches and so on. And we have to add our Storage Replica network constraint for the replication group, otherwise the replication goes over another network and does not prefer the routed networks; usually it uses the management network or so. You want to have your replication traffic on the separate networks, that's very important. I will maybe do another webinar about that, because over the last months I've done some stretch clusters.

What about ... in a stretch cluster? Azure Kubernetes Service: at the moment Azure Kubernetes Service is not supported in a stretch cluster scenario, and that's a huge miss, because if we do Azure Kubernetes Service on Azure Stack HCI, people want to stretch it. They want to have high availability or even disaster recovery possibilities for the containers. Software-defined networking doesn't work: it's not supported in a stretch cluster scenario, it's only supported in a non-stretched cluster
scenario. BitLocker works. I have a customer where I implemented BitLocker on the stretched volumes, but you have to do it by hand, with PowerShell and some commands you have to run on the host in a remote desktop session. It does not work remotely, where you use Invoke-Command or something. Integrity checksums: I actually don't know, I have some questions for the product group. You can't configure integrity checksums for the data part of your stretched volume, so in Windows Admin Center there's no possibility for that. Manfred will show you that you can do that very nicely, BitLocker and integrity checksums, if you create a volume in a non-stretched scenario. Deduplication, same: I don't know if it's supported, at least you can't configure it with Windows Admin Center, but at the moment I would say don't use deduplication.

So now the disaster demo we skip, because we have only some minutes left and a lot of questions. Okay, then we go to the lot of questions. Helmut did a great job, he already answered a lot of them, but maybe it's interesting to take them, or some of them, on audio. Yeah, so ask away, we have, let's say, six or seven minutes. Okay, so let's see which questions we have here and where we will start. There are some questions about Windows Admin Center; I will take them in the next session, I think this is a good idea, so we don't forget them. There are comments about Windows Admin Center; we will take them later, we have them here. So there was a question when you were presenting, Carsten: can we change the resiliency from two-way to three-way mirror? That's a great one, thank you for that question. So the technical answer or the marketing
2021-10-04 14:31