Microservice Architecture and Serverless Technologies Implementation in Health Tech

Hello everyone, I'm Milos Kowatschki. I'm an Engagement Manager at Vicert. I was previously a Software Engineer and now I'm managing customer relations and making sure that projects are delivered successfully. With me, I have Kresimir, who is our lead DevOps Engineer. Could you tell us something about yourself?

Sure, my name is Kresimir Majic, and I've been in IT for about 10 years. I started out as a Linux Engineer, then graduated to a SysAdmin, and later on I transitioned to a DevOps Engineer, which I instantly fell in love with because I like automating stuff and putting myself out of work.

Great to hear that.

Speaking of serverless technologies, we are currently living in an era where more than 83 percent of enterprise workloads run in the cloud and more than 30 percent of each IT company's budget goes to the cloud environment. In recent years there has been a trend to refactor infrastructure to work on a microservices architecture, in parallel to the monolithic architecture that was more or less the most common until now. So what are the greatest challenges to tackle when refactoring from a monolithic architecture to a microservice architecture?

In my opinion, the biggest problem is trying to determine how complex you should go. If you look at DevOps job postings, for example, they're basically filled with Kubernetes requirements, and I think Kubernetes is, first, overly complex and, second, a relic from the past. In the past you needed to orchestrate containers but all you had was bare-bones servers, so you made this cluster that was Kubernetes, and it basically shifted the workload around those servers. Then they somehow just pushed that up to the cloud. So now you had cloud servers, but you still had Kubernetes splitting the workload across them. Since we now have the option of creating serverless containers and simplifying that a lot, I simply don't see the need for most of the current setups that employ Kubernetes.

I think it's overly complex, I think it should be abandoned, and the problem is that people usually don't want to tackle Kubernetes, I mean people outside of the DevOps world. It's difficult to get developers involved in how your pipeline works and how the infrastructure is tied together when they have to untangle the huge, complex mess that Kubernetes can very easily become. So basically I avoid it whenever I can.

And how complex is tackling the network infrastructure in that respect? We know that communication between internal services can get really complex. So how do you transfer all that to the cloud environment, and how do you ensure that the networking works properly and that you have the bandwidth and latency that the software requires?

Well, the way AWS is structured, they have these huge, well, they're not even really data centers. When talking about data centers you imagine rows of servers, and the way current AWS data centers look, they're like a container. How do I put this: have you ever seen those shipping containers?

Yeah.

That's basically it. They just have, you know, the power input, and the container is filled with CPUs and memory and storage and whatnot, so you can't really walk among racks of servers; it's no longer like that. Since all of it is packed so tightly together, the bandwidth and everything is not as costly. The latency is not as big of an issue as it was before, when you actually had to go from one server to another. It's all so tightly packed now that you really don't have the impression that you're going to a different server, that you're leaving your own server and going to the other one that hosts another instance. AWS does charge for bandwidth, in a different way, and there are ways around it. But sorry, what was the question again?

How do you integrate networking with services? How complex is that?

I'd say that you have to think about the infrastructure from the start and not try to cram in whatever you came up with afterward. So this is basically cloud-native thinking, a cloud-native approach. If you know what the requirements are and how you're going to pull them off, then it's not a problem. The problem arises when you don't know how to do something and then you try to hack it, and in the end it becomes messy.

Yeah, that makes sense. Concerning the databases, what I read is that Aurora Serverless, for example, is mostly for variable workloads. So if you have, let's say, a stable number of queries going to the database, is serverless something to consider, or would you rather go with RDS?

I would definitely try to go as serverless as possible; that's basically my motto. Whenever I can abandon an old way of thinking, I try to do that. But that mostly depends on what our requirements are. If we have a service that needs access to the database throughout the day, then we wouldn't really be able to use Aurora. However, if you have a service that basically does a dump of data maybe at midnight or, I don't know, once per day, then it's no problem integrating it and spinning up the Aurora DB to serve that need. Then maybe you can spin it up for several hours, for example when the doctors are checking exams or something like that. You would spin it up daily when the data is being dumped, and then maybe for two or three or six hours when some work is actually being done, but that means you can keep it down for, I don't know, 18 hours a day and all of the weekends. Besides, you can even build in a switch, like we're using with containers now: if somebody does a commit, it spins up the instance and keeps it up for maybe two hours or something.

I presume that in the case of patient portals, for example, which are quite common in healthcare, something like Aurora Serverless would be a good fit, because you don't have a stable number of requests from patients. It's a very variable kind of workload, so it should suit that, right?

Yeah, I guess it's difficult to estimate ahead of time what the workload is going to look like. Now, if you can do that, great, but in our situation some of our developers are in a different time zone, so we have both U.S.-based and Europe-based developers. That fact alone contributes to the amount of uptime that's required for the project to work properly, and it kind of goes against the logic of being fully cloud-native and using serverless as much as possible, because if we need the database up so that our developers can work on it and then the U.S.-based developers can work on it as well, that's basically two shifts out of every 24 hours. However, that requirement is only there for the dev environment, so we can turn off the database in staging, in production, in testing or whatnot in order to save. Well, obviously not production, but you get what I'm saying.

Yeah.
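As a rough illustration of the scheduling idea discussed above (database up for a nightly dump and a few working hours, down for the rest of the day and on weekends), here is a minimal Python sketch. The window times and the function names are assumptions made up for this example, not the project's actual configuration, and the code only decides whether the database should be running; it does not call any AWS API.

```python
from datetime import datetime, time

# Hypothetical daily windows (local time) during which the database must be up.
# These values are illustrative and not taken from the project in the interview.
WINDOWS = [
    (time(0, 0), time(1, 0)),    # nightly data dump
    (time(9, 0), time(14, 0)),   # doctors reviewing exams
]


def database_needed(now: datetime) -> bool:
    """Return True if `now` is a weekday moment inside one of the windows."""
    if now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return any(start <= now.time() < end for start, end in WINDOWS)


def weekly_uptime_hours() -> float:
    """Total hours per week the database is scheduled to be up."""
    per_day = sum(
        (end.hour * 60 + end.minute - start.hour * 60 - start.minute) / 60
        for start, end in WINDOWS
    )
    return per_day * 5  # weekdays only
```

With these example windows the database is scheduled for six hours on each weekday, which leaves it down for the remaining 18 hours and for the whole weekend. A scheduler or pipeline step could call `database_needed` periodically and start or stop the database accordingly.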

Now, many of our clients are concerned about whether they can achieve compliance, both HIPAA and 21 CFR Part 11, for data at rest and data in transit when using serverless technologies. Are these technologies capable of meeting those standards?

Basically, absolutely. I believe there's also a step above healthcare requirements, which is the government's secret requirements. That's not a technical term; I just mean that when AWS was developing their services, they weren't thinking only about providing compliance for the healthcare industry but about one step above that, which is the military and the government. Of course they do have separate data centers for that type of request, to serve those customers, but the point is that they absolutely can do that, and they need to be able to do that in order to provide the service properly. So it's just a matter of whether you are going to implement it in a proper way. Can you actually deliver? Because they are certainly able to provide everything you need to be compliant.

So what you're saying, practically, is that instead of building that security on your own, you can use the security that was built into AWS from the ground up, and you won't have that kind of concern at all. And the more you run your workloads on AWS, the more security concerns you are passing to the cloud environment. Because I know that, for example, when you run an EC2 instance, you have to take care of security at the instance level.

Yeah, so the responsibility for security goes two ways. AWS covers their part, making sure that nobody can access the physical premises where their hardware is located, but there's also your responsibility. Not meaning you exactly, but any DevOps guy or even a developer who develops software that's going to be stored on that hardware. So they secure their part, but you also need to secure your part. Imagine you had a car and you parked it in a garage, and the owner of the garage guarantees that nothing is going to happen to your car while it's in that garage. However, when you take it out, you're on your own; there are no guarantees at that point. So even though they allow you to create, let's say, the most secure thing possible, you can still make it publicly accessible without a password, and there's no real way of stopping that. That's completely up to you. So you need to know what you're doing, but the point is AWS will support you in pretty much every way possible. I mean, they will allow you to be as secure as needed, so there will be no constraints from their end.
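To make the "your part" of that shared responsibility a little more concrete, here is a small Python sketch of one check a team might run on its own side: scanning a bucket policy document for statements that allow access to everyone. The function name is hypothetical, and this is a deliberately simplified assumption of what such a check could look like. It only inspects the `Effect` and `Principal` elements of the policy JSON and ignores `Condition` blocks, so it is not a substitute for the provider's own access analysis tooling.

```python
def allows_public_access(policy: dict) -> bool:
    """Return True if any Allow statement applies to every principal ("*").

    Simplified illustration: Condition elements are not evaluated.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear unwrapped
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            return True
        if isinstance(principal, dict):
            aws = principal.get("AWS")
            values = aws if isinstance(aws, list) else [aws]
            if "*" in values:
                return True
    return False


if __name__ == "__main__":
    # Hypothetical policy that opens every object in a bucket to anyone.
    example = {
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::example-bucket/*",
            }
        ]
    }
    print(allows_public_access(example))
```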

Okay, that sounds reasonable. Concerning the costs, most of our clients ask about them: they would like to know how cost-effective going serverless is and whether it will significantly reduce what they currently spend on infrastructure and on maintenance.

You can definitely save a lot of money by implementing serverless properly, and the key word here is properly. If you keep serverless infrastructure up 24/7, then you're going to be losing money. Even that is not a deterrent in some cases, because the amount of simplification it allows is worth it: you no longer need to spin up entire servers, you no longer need to maintain them, you can just keep one container up and throw out anything you don't need, and it's much easier to deploy. However, in order to get effective price reductions you need to think about how to turn stuff off. Basically, that's the biggest advantage of serverless, because it can spin up quickly and take itself down when needed, so you can schedule your infrastructure. I don't know, if you have consistent office hours, you can schedule it to start five minutes before you go into the office. If you really want to show off, you can time it so that pulling your key card through the entry of the building is the trigger that starts up your infrastructure, or even makes you coffee or whatever, if your coffee maker is plugged into the network. I mean, the possibilities are endless, but you get what I'm saying.

Yeah, the key point of serverless is to shut it down. That's how you make savings.
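The arithmetic behind "the key point is to shut it down" can be sketched in a few lines of Python. The hourly rate below is a made-up placeholder in arbitrary units, not a real price, and the office-hours schedule is an assumption for illustration only.

```python
HOURS_PER_WEEK = 24 * 7


def weekly_cost(hourly_rate: float, hours_up: float) -> float:
    """Cost of a pay-per-hour resource that runs `hours_up` hours in a week."""
    return hourly_rate * hours_up


def savings_fraction(hours_up: float) -> float:
    """Share of the always-on bill avoided by running only `hours_up` hours."""
    return 1 - hours_up / HOURS_PER_WEEK


if __name__ == "__main__":
    rate = 1.0            # hypothetical price per hour, arbitrary units
    office_hours = 9 * 5  # e.g. nine hours a day, five days a week
    print("always on:", weekly_cost(rate, HOURS_PER_WEEK))
    print("scheduled:", weekly_cost(rate, office_hours))
    print("saved    :", round(savings_fraction(office_hours), 2))
```

Under this pricing model the saving depends only on how many hours the infrastructure is down, which is why the schedule matters more than the hourly rate.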
I know developers are sometimes concerned about cold starts. Could you elaborate on that aspect?

Definitely. Currently our pipeline is pretty much being told to retry every 30 seconds just to check whether the new deployment is ready. So, for example, let's say we deployed a new container that holds the API, the back end, and we need to migrate the DB. What we do is place a timestamp as a response in that container, and the pipeline knows which timestamp it gave to the container, so when it gets that timestamp back as a response, that means the container is up and the pipeline can now do the work it's supposed to. I believe it takes about two minutes on average for it to get deployed and for the networking to propagate and set everything up, and that's basically how long we need to wait. I am not sure about Aurora, but I think it's somewhere along those lines; it's all under five minutes. The point is, when it's automated you don't really care how long it takes, you just need to time it properly. Now, that's all assuming that the workloads are evenly split or predictable.

Okay. And when you're building this, how would you prioritize things? I know that many companies are currently transforming their workloads to be cloud compatible, and they often ask where exactly to start on the infrastructure side. What is the easiest way to start, and how would you prioritize what goes first and what goes next?

Whenever I'm building something I try to make it as simple as possible, with the least possible number of moving parts, so that it doesn't scare away people who try to get involved. People always say that DevOps is not a position, it's a mentality. I don't really agree with that, but I get what they're saying. It's not easy. If you have, let's say, a full-stack developer, or at least somebody who knows both the front end and the back end, it's very difficult for them to grasp all the concepts of building infrastructure, networking, and security as well, and then to build infrastructure and pipelines and automate all that. It's just too much work for one person. So somebody can maybe scratch the surface in all of those areas, but not really excel at all of them.
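Returning to the deployment check described earlier in this answer (the pipeline hands the container a timestamp and retries every 30 seconds until it gets the same timestamp back), a minimal Python sketch of that loop might look like the following. The function and parameter names are hypothetical, and `fetch_marker` stands in for whatever HTTP call the real pipeline makes; the actual implementation was not shown in the interview.

```python
import time
from typing import Callable


def wait_for_deployment(
    fetch_marker: Callable[[], str],
    expected_marker: str,
    interval_seconds: float = 30.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the running container echoes back the expected marker.

    Returns True as soon as the marker matches, or False once
    `max_attempts` polls have been made without a match.
    """
    for attempt in range(max_attempts):
        try:
            if fetch_marker() == expected_marker:
                return True
        except OSError:
            pass  # container not reachable yet; treat as "not ready"
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return False
```

Because the old container may still be answering while the new one starts, comparing against a marker that is unique to the deployment (rather than just checking for any successful response) is what tells the pipeline that the new version is the one serving traffic. Passing `sleep` as a parameter is only a convenience so the loop can be exercised without real delays.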

When I build stuff I try to make it with the least possible number of outside dependencies, basically anything that would scare people away, and I try to, I don't know, for example, hold all the variables in one place and not make people go through five or six different UIs, where one would show you monitoring, another would show you logs, and the third would be the API interface, you know, Swagger or whatnot. I try to simplify it as much as possible. I prefer monorepos, where everything is in just one repository and there are no outside dependencies, so whenever I can, I offer that. Even when the project is done, I try to review it and see if there's anything we can throw out. Does this variable consist of other variables, so that maybe we can compose it on our own? If it has, for example, the AWS account ID and the region, well, I already have both of those variables. I don't need this one as well, so I can throw it out and just use the two previous variables to compose the third one. Basically stuff like that, so just minimize everything. As Elon Musk would say, "The project is not done when there's nothing to add but when there's nothing to throw out."

And concerning development, many developers say that it's hard to debug code when you're working with a serverless environment. How difficult is that really, and are there any best practices for easier debugging and an easier development time?

That's a great question. What we've settled on is basically abandoning local development stages, so we do not rely on our developers' computers to run anything. From the start we opt for a development stage that's entirely in the cloud, and we even try to mimic production as much as possible so that there's no split. Let's say we're done with the development phase and we're going to push it to testing or QA. We don't want a huge shift when we do that, or when we go from testing to staging, where all of a sudden we put on some security measures and restrictions and whatnot. We try to avoid that. We basically try to build it production-ready from the development stage so that everybody knows what to expect later on. So we encrypt the buckets, the databases, everything; we try to hold the logs for a bit longer; we make backups as frequently as production would have them; and, I don't know, basically mimic production in every way while still trying to keep costs low. That way our developers are allowed to use whatever they're comfortable with. They can be on Windows or Mac or any variant of Linux they prefer. They can use any tools they want; they can basically do whatever they want, because their development stage is in the cloud. It also allows day-one pushes to production, because you just need to clone the repo and push something there, and we usually do pull requests between branches to promote stuff from one stage to another. So in theory, if you hire a guy, on day one they could add a comment in code, commit that, and then just do the pull requests through to testing, staging, and production, and their comment would be visible, you know, in production.

And when it comes to QA, is it easier or more difficult to QA in the serverless environment?

I guess that depends. Working from home kind of introduced another barrier, let's put it that way. I mean, I definitely love working from home, but usually, when you're in the office, you have a single IP as the source, and then you can restrict your infrastructure to just that IP, and you're pretty safe because you know that only people from the office are allowed to access it. That's that.
Now you either need VPNs or you need some other security measures to limit who can access your stuff and how. You can use HTTP headers, you can use VPNs, but it's not as simple as it was before, when you knew the office IP was static. It's always the same, you just limit access to that IP, and you're good. So I guess in that regard it is a bit more difficult, but it's definitely not an issue; it's a minor thing, really.

Okay. When it comes to optimizing the microservice architecture, what are the best practices that we have encountered so far?

Well, in my personal experience it was usually related to how the processes work. So I try to log everything. For example, in a container startup I usually use Nginx as the front, and everything goes through it, and if I use, for example, OpenSSH to allow SSH access to that container, we're talking Fargate here, that will be the only other process that handles a port on that container. Everything else is handled through Nginx. I remember this example: it took like four minutes for my container to spin up and I couldn't figure out why. So I tried to log everything that happened in it. The Dockerfile would basically spin up my OpenSSH, it would spin up the back-end service, it would, I don't know, set up the OpenSSH service, create the configuration, everything, and in the end it would start Nginx, and then Nginx would be the initial process. All of that happened in two seconds, maybe. However, it still took like four minutes for my container to be visible, and I just couldn't understand it. So I started logging the AWS health checks that arrive from the load balancer to see what was going on and why it was taking so long. It turns out that, obviously, there is a number of health checks that need to pass, going from AWS to our container, just to make sure that it's responsive and ready to receive traffic. So first I minimized all that, but even with just two consecutive passing health checks, which are I believe five or ten seconds apart, it still took like three minutes. When I realized that this had nothing to do with me, I finally contacted AWS to ask them what's the deal with the container startup time, why does it take so long to spin it up, and they basically confirmed that it takes their infrastructure about two to three minutes to tie everything together and allow access to the container.
Basically, those health checks would only start after about two minutes or so, and that's why it took that amount of time to spin up. So there was nothing more I could do to improve that; that was as fast as it could be. And, yeah, I guess you need to know a lot about how stuff works in order to eliminate everything you can. So I wouldn't really say that I have a process; it's just observing how stuff works and seeing if there's anything that could be eliminated.

Okay, thanks. In healthcare we are seeing that many applications are used to process images, and I would like to see how we utilize serverless architecture to accomplish that.

Okay, so what we have here is a diagram of how our infrastructure looks for one of the projects that we worked on. What this is about is having images that need to be reviewed. Our AI goes over the images and tries to diagnose them, and whatever the outcome is, we also want to review that later on. So we need to have doctors going over it to see if they agree with the assessment. We need to have technicians go over the images to see whether maybe one was blurry, maybe the camera didn't work right, whatever the issue may be. All of it needs to be reviewed. The workflow itself is file-based, which allows us to have modular projects that go across teams. The endpoints are basically just S3 buckets. They fill up a certain bucket with their images, they just dump them, or the camera dumps them, it doesn't really matter. The point is that the event of a file being created in a bucket is a trigger. So once a file arrives, let's say we have a prefix and a suffix that we filter on. Let's say a JPEG image comes in and it goes to a folder that belongs to one client, one hospital or clinic or whatnot. That would trigger a certain event that would go either to SNS or to SQS directly. Basically it would create a message that would trigger a Lambda, and that Lambda would pick up those images, or a single image, store them in a bucket for safekeeping, and also call the API to put that in the database, to make the database aware of the images that are now in place. The API would then allow the front-end application that's served through CloudFront to have access to that image set, and then we would also generate an email that contains the link to the exact location where those images can be seen. So once everything is tied together, an email goes out and notifies whoever needs to be notified, with the exact deep link, so that you can just click it and visit the site that has the images ready. We also tie Cognito into the whole thing, so that you need to be logged into our Azure DevOps project, or basically just Azure, and if you have sufficient permissions to view this, it will allow you through.

So you have a Lambda function that is triggered once the file is added to the S3 bucket, and then that file is processed, right? And then we trigger another Lambda function that again does some processing from the SQS.
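The prefix-and-suffix filtering on incoming files can be illustrated with a short Python sketch. It assumes the function receives an S3 event notification directly, in the standard `Records[].s3.bucket.name` / `Records[].s3.object.key` shape; when the event is routed through SNS or SQS, as in the setup described, the same payload would first have to be unwrapped from the message body. The function name, bucket name, and keys below are hypothetical, and in practice the filter is usually also configured on the bucket notification itself.

```python
from typing import List, Tuple
from urllib.parse import unquote_plus


def matching_objects(event: dict, prefix: str, suffix: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs from an S3 event that match prefix and suffix."""
    matches = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(prefix) and key.endswith(suffix):
            matches.append((bucket, key))
    return matches


if __name__ == "__main__":
    # Hypothetical event with one image and one unrelated file.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "incoming-images"},
                    "object": {"key": "clinic-a/scan+1.jpg"}}},
            {"s3": {"bucket": {"name": "incoming-images"},
                    "object": {"key": "clinic-a/notes.txt"}}},
        ]
    }
    print(matching_objects(sample_event, "clinic-a/", ".jpg"))
```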
The main logic is that when images arrive a Lambda is triggered that does its magic: it pulls the images, makes a copy, stores them in the DB and whatnot. And there's another Lambda that is pretty much just generating the email links. You don't want to cram too much stuff into your Lambda. You certainly can do that by having multiple different triggers for the same Lambda, but you don't really want to complicate Lambdas too much and have them determine "am I called from this path or that path" and then differentiate how they're going to behave. It's certainly possible, but I like to keep things simple, so in my view it's better to have separate Lambdas where one Lambda does one thing, and, you know, depending on how many things you need done, you have that many Lambdas. The point is not to overcomplicate a single Lambda.

The idea is to keep minimal functionality in each function so that it is easy to maintain, and also to have separation of concerns in general.

Exactly.

Okay, thanks.
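The "one Lambda does one thing" principle can be sketched as two small, independent Python functions, one that registers newly arrived images and one that builds the notification with the deep link. All names, the URL layout, and the in-memory set standing in for the database are assumptions invented for this example; the project's real handlers were not shown.

```python
from typing import Iterable, List, Set
from urllib.parse import quote


def register_images(keys: Iterable[str], known_images: Set[str]) -> List[str]:
    """Function 1: record newly arrived image keys and return only the new ones."""
    new_keys = [key for key in keys if key not in known_images]
    known_images.update(new_keys)
    return new_keys


def build_review_email(recipient: str, base_url: str, image_set_id: str) -> dict:
    """Function 2: build the notification containing a deep link to the image set."""
    link = f"{base_url.rstrip('/')}/review/{quote(image_set_id, safe='')}"
    return {"to": recipient, "subject": "Images ready for review", "body": link}
```

Neither function needs to know how it was triggered or what the other one does, which is the property the speakers are describing: each can be deployed, tested, and changed on its own.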

2022-06-06
