Healthcare Claim Reimbursement using Apache Spark
Hello everybody, this is Salim Syed. I work as a principal architect at Optum. Optum is a leader in healthcare services. Today we will be discussing our journey to the claim reimbursement rewrite using Spark.

Today we will discuss a brief business overview of the claim reimbursement system and why we chose Spark to rewrite our old system into the new system. We'll discuss the claim ETL rewrite using Spark first, followed by the migration gains and challenges. Then we'll discuss the claim reimbursement system rewrite using Spark. Then we will talk about the Delta Lake adoption and the benefits and performance gains that we have seen, followed by expanding the horizons and tips for a successful migration.

Here is a brief overview of the claim reimbursement system. In the top right corner, at the zero level, there is a contract and there is a coverage. The coverage is the healthcare coverage of a patient with the insurance company. It tells how much is the patient's responsibility, how much is the copay and coinsurance, which services are covered or not, things like that. The contract is the document between the provider, which is the hospital, and the payer, which is the insurance company. The contract talks about how much of a payment needs to be made for a service. All of that is captured in the contract, so our software can use that contract to evaluate how much payment is to be made for a given claim.

Over here on the left you see the patient. The patient walks into a hospital when he is sick, and first the coverage is documented. Then the patient gets admitted and gets treated, and after the care is provided the patient is good to go. After that, the hospital accumulates all the charges and services done for the patient, prepares a claim for those services, and sends that claim to the payer. The payer evaluates this claim against the contract that is already negotiated and makes a reimbursement, and then that reimbursement check goes back to the hospital.

There is a possibility that this reimbursement is wrong or underpaid, and in that case the hospital would be losing revenue. So there is a process by which the hospital can figure out that there is an underpayment and make an appeal, and as part of the appeal the payer would re-evaluate the claim and adjust the payment. Essentially, this is where our software [inaudible] comes in. It captures this underpayment and helps the hospital make the appeal. So you can say that it essentially brings in money, in the sense that there are losses happening behind the scenes that would go unnoticed unless our software is in place in the hospital and working.
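To make the business overview concrete, here is a minimal Python sketch of the underpayment check. It is illustrative only: the `Contract` and `Claim` shapes, the fixed rate per service code, and the `find_underpayment` function are assumptions made for this example, not Optum's data model, and real contracts carry far more complicated rules than a rate table.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Contract:
    # Hypothetical simplification: one negotiated rate per service code.
    rates: dict


@dataclass(frozen=True)
class Claim:
    claim_id: str
    service_codes: tuple   # services billed on the claim
    paid_amount: Decimal   # what the payer actually reimbursed


def expected_reimbursement(claim: Claim, contract: Contract) -> Decimal:
    """Sum the contracted rate of every billed service (uncovered services pay 0)."""
    return sum(
        (contract.rates.get(code, Decimal("0")) for code in claim.service_codes),
        Decimal("0"),
    )


def find_underpayment(claim: Claim, contract: Contract) -> Decimal:
    """Return the shortfall the hospital could appeal, or 0 if paid in full."""
    shortfall = expected_reimbursement(claim, contract) - claim.paid_amount
    return max(shortfall, Decimal("0"))


contract = Contract(rates={"XRAY": Decimal("120.00"), "ER_VISIT": Decimal("450.00")})
claim = Claim("C-1", ("XRAY", "ER_VISIT"), paid_amount=Decimal("500.00"))
print(find_underpayment(claim, contract))  # 70.00
```

The expected amount here is 570.00 against a payment of 500.00, so the sketch flags a 70.00 shortfall as the amount that could be appealed.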
As for why we chose Spark [inaudible]: it can be cost-effective, essentially because, compared to a DBMS where you might be paying a license fee, here there is no license fee; open source Spark is completely free. There is also a possibility of cost saving if you can adopt dynamic scaling in a cloud environment. Spark needs a lot of resources, so if you run a long-running cluster with all that hardware, it is not going to save cost; you will need to adopt dynamic scaling. But there is definitely a possibility of cost saving.

The other reason we chose Spark is that it is easy to adopt, because there are only a few concepts to master, and Spark does such a good job of handling them behind the scenes that you would not really be worried about the distributed execution that is happening. You just write plain, simple Scala code, and things like lazy execution and distributed execution happen behind the scenes for you. The API is very fluent; it will be natural and easy for you to use. It is a very common skill set, so it is easy to get somebody to start working on Spark.

The whole development process can be IDE-based, because your test cases run just like any standard test cases, without you feeling that there is a cluster of machines executing them. You don't need any virtual machine to run your job, like we had to do in the olden days. The code that you write is going to be batch- and streaming-compliant, so you don't have to write two sets of code for the same business logic. It integrates with many systems; it has an amazing set of connectors [inaudible]. There is active community support: for example, if you ask a question on Stack Overflow, you might get a response within a day or so. There are tutorials, the Databricks blog, and the documentation on the site, so that helps you come up to speed.

Let us look into the claim ETL rewrite process. Claim ETL is a system which acquires the claims, cleans them up, processes them, and then prepares them for the reimbursement. So the first piece is acquiring the claims, the ETL.
Here is the system, before and after. As you see on the left side, the "before" is a typical PL/SQL-based system where the claims come in as files. We have written a Java parser which parses the claims, which are in EDI format, and stages them into staging tables. Then we run PL/SQL code to read them from staging and process them [inaudible]. From there onwards the data goes to downstream processes, things like assignment rules, the appeal process, and reimbursement processes, which are not shown in the diagram.

That system was rewritten in Spark, as you see in the "after" on the right-hand side. The main difference is that the PL/SQL block is what is replaced by Spark. In the new system we acquire the claim files the same way, as files, and then we pass them to the same parser from Spark. We do all the business logic over there, and we use a Parquet-based data lake. After the processing is finished, the result goes back to the staging tables, and from those staging tables it goes to the permanent tables.

So what I mean is: Spark processes the claim files and uses the data lake to do a historical lookup and processing. At the end of the processing the result goes back to the data lake, and it is also pushed into the staging tables. From there we run some PL/SQL which moves the rows into the permanent tables. Once the rows land in the permanent tables, there are many downstream systems which depend on those tables, so they will [inaudible].

Let's see the highlights of the ETL rewrite process. We use Spark 2.4, the open source version. Our data lake is in Parquet format, and we are working towards migrating from Parquet to Delta Lake [inaudible]. We currently use a Spark standalone cluster, and we are planning to migrate to [inaudible] somewhere down the line. We are also using tools like Zeppelin and spark-shell for debugging issues.
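The ETL flow described above (parse the incoming claims, look up history in the data lake, write the result back to the lake, and push it to staging for the database) can be sketched in plain Python. This is a toy, single-process illustration, not the Spark job from the talk: the dictionary standing in for the data lake, the list standing in for the staging table, the `ClaimRecord` fields, and the rule that a newer version of a claim replaces an older one are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimRecord:
    claim_id: str
    version: int
    total_charge: float


def run_etl(incoming, lake, staging):
    """Process parsed claims against history and publish the results.

    lake    -- dict claim_id -> ClaimRecord, standing in for the data lake
    staging -- list, standing in for the database staging table
    """
    for record in incoming:
        previous = lake.get(record.claim_id)       # historical lookup
        if previous is not None and previous.version >= record.version:
            continue                               # stale or duplicate: skip it
        lake[record.claim_id] = record             # result goes back to the lake
        staging.append(record)                     # and is pushed to staging
    return staging


lake = {"C-1": ClaimRecord("C-1", 1, 900.0)}
staging = []
incoming = [
    ClaimRecord("C-1", 2, 950.0),   # correction of a claim already in the lake
    ClaimRecord("C-2", 1, 300.0),   # brand new claim
    ClaimRecord("C-1", 1, 900.0),   # older version, ignored
]
run_etl(incoming, lake, staging)
print([r.claim_id for r in staging])  # ['C-1', 'C-2']
```

In the real system the lookup runs against Parquet files at scale and the staging push targets database tables; the sketch only shows the order of the steps.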
Let us discuss the gains and challenges that we have seen. In this figure, the x-axis is volume and the y-axis is time. The orange box is for the Spark load and the gray box is for the PL/SQL-based load. You see that the Spark timing is much more stable: the volume goes from 4 million to 20 million, and still the Spark timing did not go up that much, because it could scale well. The PL/SQL time, as you see, goes much higher as the volume grows, and then we had failures.

So from a medium to a high volume scenario, Spark always beats the PL/SQL system. What I am not showing in this diagram is a low volume scenario, let's say half a million records, where the PL/SQL load is actually faster than the Spark load, because Spark perhaps has some overhead of distributing and planning the work. So in that scenario PL/SQL wins, but the difference in timing is only in minutes, so it doesn't really matter that it is faster for low volume. So in all cases we felt that Spark was ahead on the performance and scalability aspect.

Continuing on the gains: as time passes, the code base, I mean the technology that we have used, becomes old and loses its shine, and there was technical debt mounting on the old code base. As part of the rewrite we got rid of all that technical debt and the issues that we had. There is also a possibility of saving cost, as I said before, by using dynamic scaling. A side effect of this rewrite is that the Parquet data is splittable and is available for any processing which can use it. So it is a great infrastructure for doing analytics and machine learning kinds of workloads. That is one advantage that you can use for other things that you could not do before.

There is going to be a challenge for any new system. If you write a new system, then you realize that the subsystems around that system also matter: the operations team and the support team have built tools around the old system and they are used to it, how to use the whole system and how to track the data state. All of that needs to change, so you need to consider things well beyond your code base. You need to consider all the side tools, all the processes and procedures, and those need to change. So you need to develop new tool sets and train people to use those tool sets.

A data lake is a little bit different from a database, because a database has indexes, so you can access data really quickly, but a Parquet-based data lake is not that quick to access. It works extremely well for large volumes, but if you are debugging something and you want one record, then you might have to wait, let's say, a minute just to see that record. So that becomes challenging sometimes. And then people need to learn new skills to be able to use Zeppelin and Spark SQL, which is very much like SQL, still.

More challenges: there are custom tools and scripts which were developed around the old system, so those also need to be refactored. Cost saving is possible, but only in a dynamic environment, as I said. If you have an infrastructure which is not elastic, where the billing cycle is monthly or yearly, then you would be paying a fair amount, because Spark needs a lot of resources, and if you are maintaining those resources for a long period of time then you are paying for them. But if you can scale dynamically, then there is a possibility of cost saving. So do not promise cost saving unless you are in the cloud; that is the challenge. Then there is the adoption process: you may have multiple teams in the development cycle, and one team may have experience, but to adopt Spark you need a wider skill set.

The last one is possible data inconsistency. This is a very specific case. What I mean to say here is that the data is copied into two sources. So if your process depends on them being in perfect sync, then there might be issues arising from that. If the push succeeds in one system but the push to the RDBMS fails, there could be a sync issue: the two will be out of sync, and that may cause issues. So if you are developing along the line where the data is duplicated, then you want to make sure that it is resilient, that a failure in one system does not affect the other system. If you tie them down to be perfectly synced, then there will be issues.
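One way to read the resilience advice about data that is copied into two stores is sketched below in plain Python. It is only an illustration under stated assumptions: the `lake` dictionary, the `FlakyDatabase` stand-in, and the retry backlog are hypothetical and are not how the talk's system is built. The point it shows is that a failed push to the second store is recorded for a later retry instead of failing the write to the first.

```python
class FlakyDatabase:
    """Stand-in for the relational store; fails while `available` is False."""

    def __init__(self):
        self.rows = {}
        self.available = True

    def upsert(self, key, value):
        if not self.available:
            raise ConnectionError("database unavailable")
        self.rows[key] = value


def write_result(key, value, lake, database, backlog):
    """Write to the lake first, then try the database; never fail the lake write."""
    lake[key] = value
    try:
        database.upsert(key, value)
    except ConnectionError:
        backlog.append(key)          # remember what is out of sync


def reconcile(lake, database, backlog):
    """Replay pending keys from the lake; keep the ones that still fail."""
    still_pending = []
    for key in backlog:
        try:
            database.upsert(key, lake[key])
        except ConnectionError:
            still_pending.append(key)
    backlog[:] = still_pending


lake, database, backlog = {}, FlakyDatabase(), []
database.available = False
write_result("C-1", 570.0, lake, database, backlog)   # lake updated, database missed
database.available = True
write_result("C-2", 120.0, lake, database, backlog)
reconcile(lake, database, backlog)
print(database.rows == lake, backlog)  # True []
```

The two stores are allowed to drift apart for a while, and the backlog is what brings them back together, which is the opposite of tying them down to be perfectly synced on every write.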
So the first part was the ETL. Once the ETL is done, the data goes to the tables, and then the reimbursement software kicks off. This is the part where we will be talking about the claim reimbursement rewrite.

This is a very complicated system. As I was saying, the contracts are defined using very complicated business rules. Because of that complexity we switched to Spark, so that we can handle the complexity which could not be handled with typical SQL, because the SQL was too complicated, and the database [inaudible]. There were also issues of scalability and stability.

Here is the old and the new system. On the left-hand side you see the old system, which receives data from an entity called the clearinghouse, which is the Optum clearinghouse, for receiving claims from all the [inaudible]. Those claims go from the clearinghouse into a file system and database and ultimately come to our system, where PL/SQL procedures as well as some Java, so a combination of PL/SQL and Java, process the claim [inaudible], and the result is pushed to our tables. From there it goes into the appeal process and further processes.

This system was rewritten, and it was rewritten in a way that not just the PL/SQL code was changed; the whole architecture was changed, because we were reorganizing [inaudible] with this process. On the right, what we are doing is acquiring all the claims through the same clearinghouse, but there is direct communication between the clearinghouse and our system, and the communication happens through Kafka. We have defined Kafka topics: claims are sent to a particular topic, and our application listens to that topic to receive the claims. This is a streaming system which receives the claims from the Kafka topic, processes them, and pushes the result back into another topic. At the same time it pushes the result into a data lake, which is Delta Lake, for analysis, and it also pushes the data back into [inaudible]. Let's see in detail how this is done.
Since this is a proprietary system [inaudible], what we wanted was to make sure that the capability is exposed as a streaming API and also as a batch component, and that it can be [inaudible] as a shared library. We also wanted to be able to use [inaudible] design to tackle the complexity. The crux of these constraints is that we wanted this core capability to be exposed and reused in as many ways as possible, so we wanted as little dependency on any technology as possible. At the same time we also wanted scalability and performance; we did not want it to be constrained.

Here I am showing you three versions of the integration. In all three versions you see the claim reimbursement API. This is the core capability of reimbursement [inaudible]. It receives its input in terms of claims as objects [inaudible]. It is a shareable library, so we can plug it into many different delivery mechanisms to achieve the claim reimbursement.

The top one is a streaming mechanism where, as I showed you, the data comes from the clearinghouse through Kafka. We are using Spark to stream this data in parallel and convert the data, which comes in JSON format, into domain objects. Those domain objects are fed into the claim reimbursement API, which does the reimbursement, and the output goes back into the Kafka topic as well as to the Delta Lake.

Similarly, there is a batch mechanism, which is essentially almost the same as the streaming mechanism with a different source. The data is passed into the claim reimbursement API, which does all the heavy lifting of the reimbursement, and then the output is written out [inaudible].

The third one I am showing is a non-Spark one; it is a typical REST API. We have requirements where users or even other systems could be doing a live interaction, staying online and expecting the reimbursement result quickly. For that kind of delivery mechanism, the same core software, the claim reimbursement API, is exposed as a REST API.

So as you see, in all the different integration frameworks the common piece is the claim reimbursement API, which is the core software, whereas the data piece is handled by a different component in each case. The scaling is also a little bit different: in the top two cases Spark takes care of scaling.
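The idea of one technology-neutral core behind several delivery mechanisms can be sketched as follows. The talk's core library is written in Java; this Python sketch only mirrors the shape. The `reimburse` function, its flat-rate rule, and the two adapter functions are hypothetical names invented for the illustration, and the request/response adapter merely stands in for a REST endpoint without using any web framework.

```python
import json


def reimburse(claim: dict, contract: dict) -> float:
    """Core capability: no dependency on Spark, Kafka, or HTTP.

    Hypothetical rule: pay the contracted rate for each billed service.
    """
    return sum(contract["rates"].get(code, 0.0) for code in claim["services"])


def run_batch(claims, contract):
    """Batch delivery: apply the core to a whole collection of claims."""
    return {claim["id"]: reimburse(claim, contract) for claim in claims}


def handle_request(body: str, contract: dict) -> str:
    """Request/response delivery: one JSON claim in, one JSON result out."""
    claim = json.loads(body)
    return json.dumps({"id": claim["id"], "expected": reimburse(claim, contract)})


contract = {"rates": {"XRAY": 120.0, "ER_VISIT": 450.0}}
claims = [
    {"id": "C-1", "services": ["XRAY", "ER_VISIT"]},
    {"id": "C-2", "services": ["XRAY"]},
]

print(run_batch(claims, contract))
print(handle_request('{"id": "C-3", "services": ["ER_VISIT"]}', contract))
```

Neither adapter contains any reimbursement logic, so adding another delivery mechanism would not touch `reimburse`, which is the property the talk is after.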
Let us see a sample code, which is the streaming code. On the top I am showing the two things imported over there, which is the claim reimbursement API [inaudible]. The claims are read from the queue, and they arrive in JSON form, so they are converted from that format into [inaudible]. As you know, for the reimbursement we need the claim, as I said, and we also need the contract. The contract is read in lines 9 and 11, directly from the database. So the contracts are read and then cached, so that they are available in all the executors [inaudible].

Once we have the claim and the contract, in line 15 we are simply calling the claim reimbursement API, passing the claim and the contract, and asking for the reimbursement, and then the claim reimbursement API does the calculation [inaudible].

So the responsibility of Spark here is to get the data in a streaming mechanism and scale the processing. The responsibility of the claim reimbursement API is to get the logic done, and the responsibility of the Kafka system is to move data from one system to another. The three together provide the kind of scalability we need and handle the complexity.

Let us see the highlights of the implementation. As I was saying, the core library is written as a library in Java, so that we could expose it across many different technologies, like streaming [inaudible]. So the business logic is in Java, and we chose Java because it is compatible with [inaudible]. We specifically and intentionally tried to avoid writing the business logic in Spark SQL, because once we use Spark SQL, that software can only run in Spark. For that reason we did not use Spark SQL to write the core logic. But we did use Spark SQL [inaudible]. And [inaudible] if you are using the API that way, you may have to use Scala collection [inaudible].

We use Spark heavily for its scalability and stability, as well as its connectivity to many different data sources, so it is a central component of this system.

We carefully designed the claim structure in a way that all the components of the claim are placed together as a single object, so that we do not have to do any joins to build the claim. The object is built during the ETL process and contains all the information of the claim, just to avoid joins, and the contracts are read and cached from the database directly. So essentially the calculation process ultimately becomes a map operation, because the whole object is available to you and you can just call the API. That takes you a long way in performance. You do have to do a lot of planning to be able to design that object beforehand.
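The code on the slide does not survive in the transcript, so the following is a sketch of the walkthrough's shape rather than the original. It is plain Python, not Spark: a `queue.Queue` stands in for each Kafka topic, a dictionary stands in for the contracts cached on the executors, and `Claim`, `load_contracts`, `to_claim`, and `reimburse` are hypothetical names. What it shows is the sequence described above: decode JSON into a domain object, look up the cached contract, and call the core API once per claim, which is a map.

```python
import json
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    claim_id: str
    contract_id: str
    services: tuple      # everything needed is nested in the claim, so no joins


def load_contracts():
    """Stand-in for reading contracts from the database once, then caching them."""
    return {"K-1": {"XRAY": 120.0, "ER_VISIT": 450.0}}


def reimburse(claim: Claim, contract: dict) -> float:
    """Stand-in for the core reimbursement API (hypothetical flat-rate rule)."""
    return sum(contract.get(code, 0.0) for code in claim.services)


def to_claim(message: str) -> Claim:
    """Convert one JSON message into a domain object."""
    raw = json.loads(message)
    return Claim(raw["claim_id"], raw["contract_id"], tuple(raw["services"]))


def process(in_topic: queue.Queue, out_topic: queue.Queue, contracts: dict) -> None:
    """Drain the input topic, reimburse each claim, publish each result."""
    while not in_topic.empty():
        claim = to_claim(in_topic.get())
        amount = reimburse(claim, contracts[claim.contract_id])
        out_topic.put(json.dumps({"claim_id": claim.claim_id, "expected": amount}))


in_topic, out_topic = queue.Queue(), queue.Queue()
in_topic.put('{"claim_id": "C-1", "contract_id": "K-1", "services": ["XRAY", "ER_VISIT"]}')
process(in_topic, out_topic, load_contracts())
result = json.loads(out_topic.get())
print(result)  # {'claim_id': 'C-1', 'expected': 570.0}
```

Because each claim carries everything it needs and the contracts are already in memory, the per-claim step has no joins and no shared state, which is what lets an engine like Spark spread it across executors.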
Let us see the results, comparing the old system and the new system. This experiment contained 80 million records, which is like a hundred [inaudible]. We used the two systems, the old and the new. A typical claim reimbursement system has two steps. One is the reimbursement itself: you get the claim and the contract and calculate. Once the reimbursement is calculated, step two is to push that data out into other sources, for example into a database, so that it can be reported and shown on the user interface. So there are two components to it, and I am calling them out separately because they have different scalability.

The first step, which runs from system to system, or from one streaming source to another streaming source, is extremely scalable. As you see, 80 million claims were processed in just [inaudible], and the complexity is extremely high here, but it just scales. Pushing into a database, on the other hand, is not that fast. Actually, here we were lucky that this operation is merely an insert, because of the nature of the data; if it had been an update or a delete operation, it would have been slower still. Even with the insert operation it is almost two times slower.

So this system gave us a throughput of almost 333,000 claims per minute, and that was achieved with 20 virtual CPUs and [inaudible] memory kind of hardware.
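As a back-of-the-envelope check on the quoted figures, the sketch below works out what 80 million claims at roughly 333,000 claims per minute implies. The elapsed time and the per-CPU rate are derived here from the stated numbers, not quoted from the talk, and they assume the throughput figure refers to the full 80-million-claim run on the 20 virtual CPUs.

```python
TOTAL_CLAIMS = 80_000_000          # stated size of the experiment
CLAIMS_PER_MINUTE = 333_000        # stated throughput ("almost")
VIRTUAL_CPUS = 20                  # stated hardware

minutes = TOTAL_CLAIMS / CLAIMS_PER_MINUTE
hours = minutes / 60
per_cpu_per_second = CLAIMS_PER_MINUTE / 60 / VIRTUAL_CPUS

print(f"{minutes:.0f} minutes, about {hours:.1f} hours")
print(f"about {per_cpu_per_second:.0f} claims per second per virtual CPU")
```

Under those assumptions the run works out to roughly 240 minutes, or about four hours, for the whole data set.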
If we compare that with our old system, there is no exact one-to-one comparison, because we never ran the old system on such a high volume. We could not say, you know, my database has 80 million records, go ahead and calculate all the claims [inaudible]. But we have done [inaudible], so if you compare that way, essentially the new system is 50 times faster.

From a cost point of view, the new system is also much better than the old one. As I said, if you are on the cloud (we are going there, but we are not there yet), then with typical hardware having 20 CPUs [inaudible] we could be finishing 80 million claims [inaudible]. There is no direct comparison with a DBMS here, because a DBMS cannot be scaled dynamically that easily. It cannot really scale resources up and down to save cost. In Postgres it is possible, but the scaling is not that graceful [inaudible]. At the same time, we have not seen such a high volume there, so there is no exact comparison.

But as you see, there is a possibility of cost saving. If you move all your complex workload into Spark and use your DBMS [inaudible], then you would not be spending on one of the resources [inaudible]. We can scale down resources there and save cost, and spend a little extra here [inaudible].

So that is the discussion of the reimbursement rewrite. Let's move on to the next topic, which is the Delta Lake adoption.

As I was showing you, we currently have a Parquet data lake, and we are working on replacing it with Delta Lake. We are using open source Delta Lake for the time being and not the Databricks one. In the open source version you do not have Z-order, so what we did is we partition the data by one key and we order the data by another. We chose the partition key and the order key consciously, depending on the most accessed keys: if you are accessing your data lake most of the time by two keys, then those are the two keys to use for partitioning and ordering. What happens is that any lookup, any kind of insert or update on those keys is extremely fast, whereas any other operation on any other key is going to be slow.

There are a couple of caveats here. If you are using the open source version outside of the Databricks environment, then Z-order does not work. Z-order is a great feature of Delta Lake where you can order your data by a couple of keys. Without Z-order you can order your data by one key, so access by that one key is going to be fast, but with Z-order you have the option of using more than one key. You can actually have three or four keys there, but the Z-order efficiency reduces as the number of keys increases, so you would not really use too many keys; a couple is good.

The OPTIMIZE command also works only in the case of Databricks. To get around that, you have to rebuild your Delta Lake, basically. We are rebuilding it once in a while to achieve the equivalent of OPTIMIZE. But we are planning to go to Databricks [inaudible].

When choosing the keys for partitioning and ordering you need to design carefully, because that impacts a lot. Any access beyond those keys is still going to be slow [inaudible], but access with these keys is going to be extremely fast. As you see here, all the queries on those two keys are extremely fast. This is a code sample showing how to build a Delta Lake: essentially, you take the data, partition it by the key, and then save it with an order-by clause as Delta. The result is a Delta Lake which is going to be extremely fast.

Let's see what the performance looks like. Here is a comparison. You see the first several rows, read, insert, update, merge, join: all of those, when done on those keys, are extremely fast. A simple read operation was taking 18 minutes, and now it takes 10 seconds. Similarly, an operation on one column was taking three seconds, and now it takes one second. Things like insert, update, and merge [inaudible], and joins are fast too. The thing which is not fast is the last row: any operation which is not on those keys is going to be equally slow, and you can always reduce this time by increasing the resources. So you may not be challenged by this, in case most of your processing is happening by those keys, if you have a few keys which are used for [inaudible].
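The effect of partitioning by one key and ordering by another can be illustrated without Spark or Delta Lake. The sketch below is a plain-Python analogy under stated assumptions: `provider_id` and `claim_id` are hypothetical key names, a dictionary of sorted lists stands in for partitioned, sorted files, and `bisect` stands in for the data skipping that a sorted layout makes possible. A lookup on the two chosen keys touches one partition and a few rows, while a lookup on any other field has to scan everything.

```python
import bisect
from collections import defaultdict


def build_lake(rows):
    """Partition rows by provider_id, then sort each partition by claim_id."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["provider_id"]].append(row)
    lake = {}
    for provider_id, part in grouped.items():
        part.sort(key=lambda r: r["claim_id"])
        # Keep the sorted key column next to the rows so lookups can bisect it.
        lake[provider_id] = ([r["claim_id"] for r in part], part)
    return lake


def lookup(lake, provider_id, claim_id):
    """Fast path: pick one partition, then binary-search the ordered key."""
    keys, part = lake.get(provider_id, ([], []))
    i = bisect.bisect_left(keys, claim_id)
    return part[i] if i < len(keys) and keys[i] == claim_id else None


def scan(lake, field, value):
    """Slow path: any other field forces a scan of every partition."""
    return [r for _, part in lake.values() for r in part if r[field] == value]


rows = [
    {"provider_id": "H2", "claim_id": "C-7", "status": "PAID"},
    {"provider_id": "H1", "claim_id": "C-3", "status": "UNDERPAID"},
    {"provider_id": "H1", "claim_id": "C-1", "status": "PAID"},
]
lake = build_lake(rows)
print(lookup(lake, "H1", "C-3"))
print(len(scan(lake, "status", "PAID")))
```

In Spark the analogous building blocks would be things like `repartition`, `sortWithinPartitions`, and `DataFrameWriter.partitionBy`, though the talk does not say which calls the team used.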
So after the rewrites we are now expanding our horizons; we are looking forward to using new technologies. We are not challenged by high volume operations: if the volume increases, it is a matter of scaling. We are not dependent on SQL optimization and at the mercy of a DBMS [inaudible]. For high and medium complexity workloads, Spark is capable of doing well without any performance issues. Integration with streaming workloads is much, much easier as compared to [inaudible]. Analytics and machine learning on the data lake is looking much, much closer. And we are also looking forward to saving cost.

Here are some tips from the two migrations that we did, which might come in handy for you.

Running Spark on premise does not save cost, as I was saying before, so cloud migration is a core component of [inaudible]. That way you can promise cost savings, which is very attractive to the business.

Consider changing the production support procedures and other tool sets, which are not a direct part of your software but which come into play as you deliver the software to production, along with the other things that come into play for support.

Definitely use Delta Lake; do not use [inaudible], because there is hardly a reason for that. Delta Lake is fast, pretty much equivalent to [inaudible] operation.

Spend some time on designing your schema. An optimal schema design will take you a long way in designing the application, so spend some time over there.

We use Dataset to a larger degree as compared to DataFrame. The reasons are that we are writing complicated logic and we want compile-time safety. Our team is Java coders, so even if they do not know Spark syntax, they can do it in Java or Scala syntax if you are using a Dataset. And if you have any reusable library, then you can just use it. One downside of Dataset is that it is a little bit slower than DataFrame, because it has to serialize and deserialize the data. So sometimes you will actually need a combination of both.

When you tune this logic, be cognizant of the whole application, the whole data pipeline. If you are tuning only one piece, you will remove the bottleneck from that piece, but at the same time you might [inaudible]. When you promise faster scalability and performance, consider the whole pipeline; consider that you are speeding up one piece, and after that the bottleneck may move to something else. So consider all that before setting an expectation.

We have worked with open source Spark, so I would say do not be scared to use it for a great enterprise application. Just go for it; it is pretty stable. Even though you do not have [inaudible], you will still be okay using it.

Thank you very much for attending my session. Please provide me your feedback, and [inaudible] my email [inaudible]. Thank you very much.