Apache Kafka is our topic this week, and it's an interesting tool because it looks like so many different things in programming. Sometimes it looks like a database, sometimes it looks like a message queue, sometimes it solves the problem of overnight batch jobs taking too long to run, and sometimes it looks like a data integration suite of tools. If you're dealing with large amounts of data, or large numbers of departments trying to cooperate, I'm going to go out on a limb and say it's essential to at least know about Kafka and what it does. Even if you don't use it, it's something you want in your mental toolbox. So I've brought in a friend of mine, a Kafka expert and consultant, Neil Buesing, and we talk about all those different ways of understanding Kafka and all the things it can actually do for you. We talk about when it's the wrong solution to your problem and when it's the right one, and how you actually introduce it into an organization successfully. So if you have data problems, integration problems, or processing speed problems, let Neil add some knowledge to your toolkit. I'm your host, Chris Jenkins. This is Developer Voices, and today's voice is Neil Buesing.

[Music]

With me today is Neil Buesing. Neil, how are you doing?

Doing great, how are you?

Very well. I'm very glad to see you. We've crossed paths many times in the past, but I've never got you on record before, so this should be fun, and maybe a little scary. We'll have to see.

A little bit of fear never harms; it just puts the edge on things, right?

Exactly. You are a consultant and developer in the world of Kafka, which is how we met. I've got a background in Kafka too, but for those that don't know, let's start here: what's Kafka, and why do we care about it?

Kafka is a very fast distributed system that basically allows you to build real-time jobs to do work. The core of Kafka that I like to get into is that it's simple, in the sense that Kafka itself is pretty dumb. It is an immutable log of events; it is a record of what's happened. The first thing I usually try to tell people about it, especially those that are already developers working in the Java space or in a coding language I can speak to effectively, is that you're trying to record what has happened. Give me an event, tell me something you did, versus giving me a command and telling me what to do. So from a Kafka standpoint, I want to record events that I have done and let other people react to them and do their work independently of me, and Kafka does this very well. Kafka liberates the ability to write applications where what I have done, what my program has completed, is completely independent from what other people have to do as a result of it. And by making the infrastructure in Kafka high throughput, low latency, durable (meaning if something goes down, things still work and data isn't lost) and available (meaning if part of the system goes down, the data's still accessible), as an application developer I don't have to worry about any of that. Kafka liberates me from having to build infrastructure to send messages, send events, to others, and I get to focus on the business logic. The paradigm shift, as I like to tell people, is to write an application that says what it has done, and let other people do something, or not do something, as a result.
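As a concrete sketch of that "record an event, don't send a command" idea, here is a minimal Java producer. Everything in it is hypothetical: the broker address, topic name, key, and payload are invented for illustration, not taken from anyone's real system.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderCreatedPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Record the fact that something happened ("OrderCreated"), rather than
            // telling a specific downstream service what to do about it.
            producer.send(new ProducerRecord<>(
                    "orders",                                              // hypothetical topic
                    "order-1001",                                          // key
                    "{\"type\":\"OrderCreated\",\"orderId\":\"order-1001\"}"));
        }
    }
}
```

Whoever cares about orders (shipping, pricing, analytics) reads the topic on their own schedule; the producer neither knows nor cares, which is the decoupling being described.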
There's a lot to Kafka; it's very hard to summarize and put together in a one-minute section, so I will ramble and give it two, and maybe flounder on the way there. But at its simplest, and I think this is how Jay Kreps described it when he developed it, it's an immutable log. If you think of it that way, and you don't put anything more to it, then it's pretty easy to grasp and understand. The complexity comes with all the integration that you're going to do with it. There is stateful processing with Kafka Streams or other streaming platforms, there are integrations with Kafka Connect; those are things built on top of the core Apache Kafka components. But at the heart it's an immutable log where you record stuff that's happened, and you don't have to worry about whether someone read it or didn't read it; that's up to them.

I'm going to ask you my favorite Kafka question, which usually divides the room.

Let me guess. I think I know what the question will be; let's see if you can bring it.

So, you said you're storing large amounts of data with high availability and durability. This is really sounding like a database.

Yep, I knew that was your question.

Is Kafka a database?

My answer is no, but it has the same building blocks that many databases have, and it can give you characteristics of a database when you need them. So it can emulate aspects of a database, but I think that's more from a practicality standpoint than a theoretical one. I think you could argue that Kafka is a database theoretically; it meets the ACID-type requirements that a database usually strives for. But it's not going to be a database where I do relational queries. I'm not going to query Kafka on its own to find out events that have happened; I need to build something from that. I'm not going to be able to use it for any type of search, any type of business use case, by itself. So is it a practical database that I can use on its own? No. But is it a distributed system that has database aspects that ensure I don't have to build certain things myself? Yes. I don't have to worry about durability, Kafka takes care of that. I don't have to worry about availability, Kafka takes care of that. So from that standpoint I can see why the conversation comes up. But as a consultant at a client, if someone says "can I replace my MySQL with Kafka?", I would say no. What's your use case? What's your business problem? Maybe we can replace MySQL, because you're not using it the way a relational database is typically used; you're using it as a low-level event queuing mechanism, and then yeah, we could replace it. But not if you're searching for users who bought a product three days ago.

Okay, this is perhaps something you're world-class qualified to answer then, because this is all very abstract; let's pin it down. When someone calls you in to talk about Kafka, what problems are they actually trying to solve with it? What are they building?

As a consultant in this space for many years, I've been brought in to introduce Kafka for a variety of verticals. If you're brought into a company that hasn't used Kafka before, they're trying to get to real-time streaming. The typical use case is: we have data that we're refreshing in a system, and that happens overnight, or the data is two hours old, four hours old, and we can't react to it in real time. So typically it's "we want to move to Kafka or a similar technology", but 99% of the time it's Kafka that's the tool of choice in this space by most companies. The thing is, "I need to take this effort that I'm doing and speed it up", and that's in verticals in healthcare, financial, retail, and others. But that's typically it.
Once you get Kafka in place, which may take a while, you use it as a conduit, so your first task typically is moving data. Now you have data accessible in Kafka as it's streaming through, so the benefit you now have is that you've exposed that data for other services that could benefit from it. A typical use case would be taking data from legacy databases and streaming it through Kafka into, say, an analytical database, a Pinot or a Druid, and that allows people to do analytical-type queries on the data: how many people are buying iPads at noon at this brick-and-mortar store in Minnesota? Which is in the U.S.; we were just talking about that, but I realize we started recording after that conversation. So we've got these analytics, and now we're flowing data through Kafka to get it to this other analytical database, but now developers can say, "I could build my own microservices listening to those events", to maybe determine that there's a price issue in a product because all of a sudden it started to have more purchases. I could build alerting systems, event-driven systems, reacting to that data as it flows through Kafka. That's typically after clients have had experience using Kafka and integrating it with their business requirements; that type of question is part of the consulting work, and it's usually the fun part: you now get to expose them to new use cases they can do with Kafka that they couldn't do before. Typically I do that in Kafka Streams, which, as I mentioned, is part of Apache Kafka but is not the core Kafka that people tend to talk about when they ask "what is Kafka?", so I didn't go into that, which I'm sure we probably will.

I think we'll touch it.

There are clients, and the one client I like to talk about, because we did a talk at Kafka Summit so I know it's public, was the work at Centene with Brian Zelly. Brian and I did a talk at the 2019 San Francisco Summit where we built an event system with Kafka. We were building a routing ecosystem, because we wanted to make it easier for people to put their data into Kafka. It's a lot easier to tell people to come to Kafka to get their data if we already have it. Teams will say, "Hey, you have data that I'm interested in; I'm going to write consumers, I'm going to learn about the Avro data model that you have on your topics, I'm going to ingest that and do business logic on it." But if you go to a team and say "we need your data so we can let other divisions, other parts of the company, do something", the response is "what's in it for me?" So don't make it hard for them; make it easy. We built a system that made it easier for them to bring data into Kafka, and then over time people start using the product and getting access to the data. It's all about exposing data so people can do their business, and if it's in Kafka, streaming, they can do it in real time.

So if you can publish your business data and then subscribe to it in another place, you quickly find there are lots of other reasons to subscribe to the same data for different purposes.

Yes, that's a good way to put it.

Some people wonder, and I'm sure you can answer this: if it's publish/subscribe, is it a message queue?

Not yet; there's a Kafka Improvement Proposal out there to sort of solve that problem. It is not a message queue, but it is like a message queue in a lot of ways.
Typically, when I built IBM MQ systems back ten years ago, the idea for the one retail business problem that I solved was that one application produced the messages to the queue, totally async, didn't care about any response, and another one picked them up. I could have replaced that with Kafka in a heartbeat; the queuing was pretty much an event system: I submitted orders that I created, and now I want someone to fulfill them, for example. But there are certain paradigms with queuing, like "I need global ordering of processing", which you can't get with Kafka unless you really, really want it. Kafka is built for throughput, so its assumption is that you don't need global ordering; you typically just need ordering of individual messages. If you update your address and I update my address, then you update your address again and I update mine again, I need to make sure my events are processed in order and your events are processed in order, but hopefully mine and yours don't have to be processed in order between each of us.

Yeah, you and I exist on different timelines.

Yeah.

And the same for the shopping basket that I'm trying to fill, right?

Correct. And the other thing a lot of people do with queues is that they want the response mechanism; there are things like temporary queues, where I'm going to publish something, but I want you to publish a message back to me after you've completed it. That's very hard in Kafka. It's meant to be totally decoupled, and when there is a sense of coupling that comes back to the caller, that's where it usually becomes not the best pattern to use Kafka. I've done a talk on this; actually one of the most controversial talks I think I've done was called "Synchronous Kafka". It's typically not the pattern you want, but there are frameworks that add the ability to get a response back to the actual producer: I want to produce a message, I want you to create an order for me, I want it to be asynchronous, but I want you to tell me, the person who published it, that the order was created. That's not how Kafka typically works, and that type of queuing design is a harder one to implement in Kafka.

I sometimes think, not always, but sometimes, that it's a sign you're not really architecting your application correctly, because you've opted for a queue or Kafka or some mechanism that says "let's be asynchronous, let's fire and forget", but you haven't really embraced the fire and forget; you're pretending it's still a function call with a return value.

Yeah, that can be the thought, and typically you try to solve it without Kafka, by different means. But you'd be surprised: you're brought in, they've invested in Apache Kafka, and part of investing in new technology is asking "what can I remove from my technology stack as a result?" Saying "you shouldn't remove your JMS queues, your Rabbit queues should remain" gets the reaction "so I'm investing in more technology and I can't liberate myself from something else?" There are times when you have a small set of use cases where you should keep Rabbit or JMS, but if it's just one thing, there are ways to do that in Kafka, and I will work with clients to decide: do we want to do that, what's the risk of doing that, or do we want to continue using a technology that was more designed for it? Those are architecting trade-offs; that's part of the job. But it's doing that with the right questions, rather than "you can do everything in Kafka, Kafka is great, let me show you how to do it." Which it is, by the way.
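To make the per-key ordering point from a moment ago concrete: a partition is an ordered log, and the producer picks the partition from the record's key, so all of one user's updates stay in sequence while different users' updates can interleave freely. A simplified, hypothetical sketch follows; the real default partitioner hashes keys with murmur2, so this stand-in only illustrates the idea.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Why records with the same key stay in order: the key maps to one partition,
// and a partition is an append-only, ordered log.
public class KeyToPartition {
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);          // stand-in for murmur2 hashing
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int partitions = 6; // hypothetical topic size
        // Both of Alice's address updates land on the same partition, so they are
        // consumed in the order they were produced. Bob's updates may land on a
        // different partition and interleave with Alice's, which is fine.
        System.out.println("alice -> partition " + partitionFor("alice", partitions));
        System.out.println("alice -> partition " + partitionFor("alice", partitions));
        System.out.println("bob   -> partition " + partitionFor("bob", partitions));
    }
}
```

Keying address updates by user id gives exactly "my events in order, your events in order, but not ordered between us".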
I know you're a great fan of Kafka, but you've opened yourself up to a very juicy question there: when do you get people calling you in saying "we want to use Kafka" and it's the wrong choice?

A lot of times it's around microservices where the infrastructure is still very REST-oriented, very much request/response, and you've built a very command-centric set of microservices, or macro services, or however you model it, depending on terminology and where people are in their journey. If you're in a "create an order for me and let me know when it's done" type of thinking, that's going to be a hard shift into Kafka, versus "I created an order and I'm going to tell you about it; if you're shipping, that's on you; if you're pricing, maybe that's on you, that's a different service." If you've adapted to that well, then you're better off. And there are times where you need to be synchronous; usually we develop something that is front and center to users on their phone or on their computer, and they need a response back on what they're doing, so there is a synchronous nature to it. If you look at a ride-share application, all that data is typically flowing through Kafka to make sure that user gets their ride in a reasonable amount of time, but they're not using Kafka from their phone. Their phone is doing an HTTP request, maybe server-sent events are involved, maybe even WebSockets if you're lucky, but it's not Kafka; Kafka is behind the scenes. So you're going to have that other integration that needs to happen, and if people try to move Kafka to the front end and they haven't decoupled those systems, that may be the case where they're not ready to bring in Kafka yet; they haven't necessarily got to a more event-type way of thinking.

Right.

And that's usually when they either aren't ready, or they need to pivot more in their architectural design to bring Kafka in and use it effectively.

That's often the hardest thing with technology, and I think it's one of the reasons why we tend to see a lot of new things that are incremental improvements on old things: it's not adopting new technology that's hard, it's adopting new ways of thinking about solving problems that's hard. It's always reminded me a lot of functional programming.

Exactly, yeah. Kafka Streams is built on Java, which is an object-oriented language with functional aspects to it, and Kafka Streams is very functional; it uses the lambda functions of Java very beautifully, to where it is pretty elegant. But it is very much the idea of functional design, reacting and thinking, and that paradigm shift for me was hard, even though I did learn functional languages in college. In the industry I've never done things functionally, because those weren't the technologies that were prevalent when I started, and they still aren't; there are functional aspects to things, but the dominant languages weren't functional in their primary design. Python is not functional initially, or people could argue with me; Java certainly isn't. So the paradigm shift of Kafka is kind of the same way, and that is the challenge, that is the discussion of working with people in the enterprise to figure out how to think differently. I think the idea of microservices, which is now a decade-old concept or older, definitely makes it easier for people to think about and leverage Kafka; if it weren't for that, I don't think people would be able to grasp and use Kafka effectively.
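As a brief aside on that lambda-driven style: a Kafka Streams application in the DSL reads as a chain of small functions over the stream. Everything below is hypothetical (topic names, the filter condition, the payload shape); it's only meant to show the shape of the code.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class UkPurchasesApp {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Each processing step is a small Java lambda over the stream.
        builder.<String, String>stream("purchases")                        // hypothetical topic
                .filter((productId, json) -> json.contains("\"country\":\"UK\""))
                .mapValues(json -> "{\"flagged\":true,\"purchase\":" + json + "}")
                .to("uk-purchases");                                        // hypothetical topic

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uk-purchase-filter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```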
That's interesting. So it's not that they were co-designed, but you think the two approaches are very sympathetic?

I think so. Kafka, for those that are new to it, has, from the consumer side, this concept of a consumer group.

And a consumer is a read-only connection? We should start there.

Correct, a consumer is a read-only connection, except that it has to keep track of where it has read, so there are some writes that go on, for those that want to know how it works. But if I have two consumers, they can share the work: two consumers in the same group. If you start building those into more complicated applications, say I'm reading from orders and I'm reading from inventory and I have two consumers doing them independently but I built them in the same app, then if I do an upgrade to one component and not the other, I still have to bring both down to bring up that macro service, and it becomes more of a challenge of orchestrating the work that you're doing on top of these systems. If microservices weren't there, I think that would be more of a challenge to people's way of thinking. The idea is that I write small applications. One of the things that comes up a lot in my design of microservices for clients is that I don't use frameworks anymore if I have a choice; I write Java code. Kafka Streams is a framework, to me; it does all those things that most frameworks need to do, that people go to them for. But I don't use frameworks because my applications are very tiny. Most of my Streams apps don't make any external calls; they don't make any REST calls, they don't serve any inbound requests from consumers or web devices. So it can be a couple of hundred lines of code. Why do I want to bring in a framework that gives me its opinions, which I now have to learn, and its infrastructure, in order to bring up an application? So you write smaller code, very independent; you truly can write microservices. When I wrote microservices ten years ago, it was me pretending to write microservices; they were small monoliths, or they were macro services. I tried to put everything into them, so I needed a framework to do dependency injection, to do security. Now, for most of that, I don't. If I need to build a front end that serves web requests, I'm probably going to bring in a framework, because I don't want to write the security around HTTP, the cross-site scripting checks; I'm going to bring in something to help me do that. But whether I'm using Spring or just doing a simple Kafka producer and consumer, I'm writing the least amount of framework code that I possibly can.

Because it should be a fairly straightforward thing: write some stuff as a series of events, read it back as a series of events. That should be lightweight code.

Exactly.

But that harks back to something you said: you said you built that system and gave a talk, because if people could write things in easily, it would open up Kafka to them. What was it that was hard about writing stuff into Kafka?

That's a good question. I don't consider it hard to write into Kafka, but I do have to create a connection to a Kafka broker, and that is not HTTP; it is a proprietary connection that requires setting up infrastructure of open ports and security around it. The Kafka client library does all the work for you, but there's still something new that a person needs to bring in.
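For a sense of what that looks like in practice, here is a hypothetical consumer sketch (broker address, group name, and topic are made up). The client library speaks the broker protocol; the application only supplies configuration and a poll loop. Run a second copy with the same group.id and the two instances split the partitions between them; give it a different group.id and it receives its own full copy of the stream, which is the publish/subscribe side discussed earlier.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OrderReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Connection and identity are configuration; the client library handles
        // the (non-HTTP) broker protocol, offset tracking, and group membership.
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-fulfillment");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // React to the event; reading it has no effect on other groups.
                    System.out.printf("order %s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```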
Then there's the data format, how you're storing the data, and the fact that there's no gatekeeper to validate their data; they're publishing directly to Kafka. You usually then write other streams applications to validate and check the data, and someone has to build that. Most of these clients were already publishing their data to RESTful services, so we built a RESTful front end that then pushed it to Kafka and validated it. They would publish JSON and we would convert it to Avro. For those that don't know, Avro is a binary serialization of bytes that is smaller than writing JSON, and it is well supported in many libraries; the orchestration around it maybe not so much, but the binary format I can do in any language. It saved about 30 percent in storage, and it was strongly typed; in other words, I knew whether something was a decimal versus an integer, whereas in JSON you don't know what kind of number a number is. So we did all that for them, because we needed to make sure the data was useful to those that were consuming it. It wasn't that it was necessarily hard; it just wasn't their problem. It wasn't these other teams' responsibility to get their data to others in a streaming way, so we did it for them and tried to give them the interface they were most comfortable with.

Right. So that raises the question: how much of your work is the architectural side of this, how you build this kind of real-time streaming app, versus "this is how, in detail, you write the code that does it", versus being a kind of DBA, "this is how to actually live with it in production"?

Every week is different, every client is different. The work for Centene, the part we talked about, was a four-to-six-month pilot project to get it up and running, so I was very much hands-on writing code, and the architect was Brian, a great guy who had that vision. It was taking his vision, putting it on paper, and implementing it in Kafka Streams, in Kafka, and developing that. So I wasn't really an architect from an application-building standpoint; I was the Kafka architect. I was the one determining how we wrote the producers, how we wrote the consumers, how we leveraged Avro, which keeps coming up because it was a pain point on many things there, the data modeling, the data governance of it. For other clients that are pretty much new to Kafka, where it's "we have a business problem of trying to get from twelve hours to minutes; help us do that end to end", then it's finding a team below you to do the daily development work in Kafka, because there's a lot to do: how many applications do we need to build, how do we deploy those applications. You'd be surprised at how much a Kafka developer at that level needs to become a DevOps person just to make sure we can get our applications deployed, because I'm going from clients that deploy a handful of applications to now hundreds, or twelve instances of a consumer so it can be performant, five to ten different consumers, when before they would write maybe two applications to do that.

Is that your preference, architecting it very granular, or does it naturally lean that way?

I think it naturally leans that way; that's a good question. From my standpoint, the more successful projects are the ones that are more granular. I think that usually indicates that people have decoupled the problem set better, which makes it easier to adapt to Kafka.
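Returning briefly to the Avro point from a little earlier: an Avro record is written against an explicit schema, so every field has a declared type, unlike a bare JSON document. A small, hypothetical sketch using Avro's GenericRecord follows (schema and field names are invented); actually producing it to Kafka would additionally need an Avro-aware serializer, typically paired with a schema registry, which is out of scope here.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

public class AvroSketch {
    private static final String SCHEMA_JSON = """
            {
              "type": "record",
              "name": "Purchase",
              "fields": [
                {"name": "productId", "type": "string"},
                {"name": "quantity",  "type": "int"},
                {"name": "unitPrice", "type": "double"}
              ]
            }""";

    public static void main(String[] args) {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);

        GenericRecord purchase = new GenericData.Record(schema);
        purchase.put("productId", "ipad-11");
        purchase.put("quantity", 2);
        purchase.put("unitPrice", 799.00);

        // Unlike JSON, the schema says exactly what each field is (an int is an int,
        // not "some kind of number"), and the binary encoding doesn't repeat field
        // names per record, which is where the storage savings come from.
        System.out.println(purchase);
    }
}
```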
Right. So if you find that you're deploying a lot more services and they're communicating asynchronously with each other, it sounds like your deployment management and your monitoring and your observability suddenly become much bigger issues than perhaps they were before.

Yes, and getting DevOps involved, or becoming a DevOps person yourself, becomes a big priority. I'm actually repeating, well, I shouldn't say repeating, I'm doing another talk this fall at Current, the Kafka Summit, on Kafka Streams metrics and doing observability of your Kafka Streams application so you can make sure it's working well. Three years ago, during the pandemic, I did a virtual version of it, and I'm now trying to modernize the talk and give it again, because the observability piece is huge. People don't know when they need to invest in it, and they don't know how much time it takes to really do that, and every organization is different in their observability journey: what tools do they use, how easily are developers able to bring tools in and observe their applications. But yeah, it's a critical piece.

So a lot of the journey is about getting visibility into the behavior of the apps; it's about more than actually writing the apps themselves. Okay, so is that the price you pay for adopting Kafka, or is that the price you pay for adopting a genuinely distributed real-time application?

I think it is a price you pay for building a distributed application. I do think Kafka adds more, because there aren't necessarily standards around doing it, and because it's a distributed system; you have to invest in doing it the Kafka way too. But if you were using another technology, you would have to do theirs as well. So, I mentioned KIPs, Kafka Improvement Proposals. There's one out there, it's in the 700 range, I can't remember exactly where, that I think is a very important KIP; I wish I had the number now that we're talking about it. It's the ability for clients to push their metrics to the brokers and give the brokers visibility into how the clients are doing. For example, many people use cloud-provided services for Kafka and the monitoring tools that come with them, or they've built Kafka on-prem or self-managed and they've built their tooling very well around monitoring it: they have all the connections set up, they have the dashboards. Now you go to them and say we need to monitor the health of our clients too. "Isn't monitoring Kafka enough?" No, it isn't. If a consumer is lagging, you don't know why, and it's not even easy to find out. So you have to monitor each application, and that's usually where people go, "oh my gosh, I thought I could just monitor Kafka." So there's a KIP out there that allows clients to push their metrics to the brokers, so the brokers can then make them available to the monitoring tools. Then I don't have to write Prometheus scrapers, if I use Prometheus and Grafana, to go and scrape the metrics from each consumer, potentially doing each one differently; I can just expand what I'm looking at in the broker metrics themselves. But that requires a change to Kafka, to the client libraries, to allow clients to push their "how well am I doing" metrics to the brokers, so that it becomes easier to display what's going on. So, right, we need that KIP.
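To illustrate what "monitoring each application" means today: every Kafka client already exposes its metrics locally; it's just on each team to surface them. A hypothetical helper, assuming a consumer like the one sketched earlier, that picks out lag-related metrics:

```java
import java.util.Map;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

public class ClientMetricsPeek {
    // Print lag-related metrics (e.g. records-lag-max) for one consumer instance.
    // Today each application has to expose these itself, via JMX, a Prometheus
    // endpoint, or a log line; the KIP discussed above would let clients push
    // them to the brokers so they show up alongside the broker-side metrics.
    static void printLagMetrics(KafkaConsumer<?, ?> consumer) {
        for (Map.Entry<MetricName, ? extends Metric> entry : consumer.metrics().entrySet()) {
            MetricName name = entry.getKey();
            if (name.name().contains("lag")) {
                System.out.printf("%s / %s = %s%n",
                        name.group(), name.name(), entry.getValue().metricValue());
            }
        }
    }
}
```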
I'm a big advocate of that KIP. If monitoring is easier, people do it better; if monitoring doesn't require all these extra steps, people do it, and then you're not being called at 3am as a result.

Which would be very nice, if we could just get to that world where everything was cheap and easy to distribute and we no longer wake up at 3am, except for small children. So what you're saying is, to put it in an equivalent space: if I had some kind of queuing mechanism, I would monitor the health of the queue, but I also end up needing to monitor the readers to see how they're progressing through the queue. It would be nice if the readers could report back to the queue, and then I've just got one thing to monitor.

Yeah.

Okay, we've talked a bit about DevOps and monitoring; let's talk a bit more about programming. Tell me about Kafka Streams: what's it for, and what headaches do people have dealing with it?

Kafka Streams is the best Java library out there, if people ask me. And since no one's going to ask me, I'm going to state it.

If you disagree, please come onto the podcast and discuss your favorite Java library.

Exactly. It is a pure Java library that allows you to do stateful processing with Kafka, and there are plenty of alternatives out there: you can use Flink, you can use Spark, you can use Apache Beam. These are all technologies for doing stateful processing of events with Kafka. Stateful usually means I need to join data. Take credit card fraud: am I going to enrich it with the user data, am I going to compare it to other uses of the credit card at the same time to do alerts? Was this credit card used in the UK and France at the same time, given the physical locations? I need to alert on that, but that requires knowing both events, which didn't happen at the same time. So I need state; state needs to be stored somewhere, for days, hours, or forever, for me to enrich my data and do something meaningful.

Yeah.

Most of the technology in this space requires infrastructure to set up and do that, so I have a set of servers running Flink that I schedule my jobs to, or Spark, or even Apache Beam and Google Dataflow; there are tons of options out there. Kafka Streams' approach is that it's just a Java library, 100% self-contained. You spin up Java code; if you're at a client where you do Java work today, you can use Streams. It uses the RocksDB database internally for the state that it needs. So we talked about "is Kafka a database", and the answer is not by itself, but it has those components. I can't search a topic for an event; I need to store it somewhere where I can search it, and that's what RocksDB does. RocksDB is the availability of that state; Kafka is the durability. Kafka Streams will put the data in what's called a changelog topic, so if my JVM crashes, my pod dies, someone physically pulls out my hard drive and throws it away, I can bring up that Java application on a pod and it will rebuild its state from Kafka. So Kafka is the durability, Kafka is the part of the database that makes sure your data is not lost and you can rebuild quickly; RocksDB is the availability aspect, ensuring I can search for the stuff I need. And the Kafka Streams state is part of the Java library, which makes it fast in the sense that I'm not having to make any calls out to do work; all the work is done in there. If you write Streams effectively, you're letting Kafka's data come to you; you're not going to Kafka, the data's coming to you, to do your enrichment, to do your joins.
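Here is a hedged sketch of that kind of enrichment in the Kafka Streams DSL; topic names, the JSON layout, and the helper method are all invented. The users topic becomes a table backed by a local state store (RocksDB by default) with Kafka holding the state it can be rebuilt from; the orders stream is re-keyed by user id, so Streams routes each order, via an internal repartition topic, to the instance that already holds that user's state.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public class OrderEnrichmentApp {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Latest value per userId, kept in a local state store that can be
        // rebuilt from Kafka if this instance dies.
        KTable<String, String> users = builder.table("users");        // hypothetical topic

        // Orders arrive keyed by orderId; re-key them by userId so each order is
        // shipped to wherever that user's state lives, instead of this instance
        // calling out to fetch the user data.
        KStream<String, String> orders = builder.stream("orders");    // hypothetical topic
        orders.selectKey((orderId, orderJson) -> extractUserId(orderJson))
              .join(users, (orderJson, userJson) ->
                      "{\"order\":" + orderJson + ",\"user\":" + userJson + "}")
              .to("orders-enriched");                                  // hypothetical topic

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-enrichment");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    // Stand-in for real JSON parsing; assumes the order payload carries a userId field.
    private static String extractUserId(String orderJson) {
        int i = orderJson.indexOf("\"userId\":\"") + 10;
        return orderJson.substring(i, orderJson.indexOf('"', i));
    }
}
```

If this instance dies, another one rebuilds the user state from Kafka and picks up where it left off, which is the durability-versus-availability split Neil describes.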
That's the paradigm difference of Kafka Streams from the other technologies, and as a Java developer of twenty-odd years it was very easy and natural for me to use it, adapt it, and bring it into clients who are already in that space.

Yeah, I can imagine. Going into new clients, if you don't have to set up extra infrastructure on top, that already is a huge advantage. But for a library like that to be useful, it needs to worry about being distributed and highly available.

Correct, and Kafka is the distributed, highly available aspect behind it as well. If you run jobs on a Flink farm and you need to do a lot of work, you may spin up ten CPUs and they distribute work between them. Kafka Streams distributes work through Kafka. So if I have an order event and I need to join it to a user event to enrich that order with user information, and that user is on a different Streams pod because of how I built the app, then in my old-world thinking ten years ago I would say, "hey, the JVM that has the order: go grab the user data and let me enrich my order." That's the RESTful thinking. In the Kafka Streams world it's totally different, and it's so elegant at it, which is one of the reasons I love Kafka Streams: I have the order, but the other worker has the user, so it's not me going and grabbing the user data, it's me re-keying the order so the one that has the user will get the event instead. I'm moving my order to where my user is and letting that worker do the work. From a high-level standpoint, Kafka Streams uses Kafka to distribute the work across all its workers, so if I have twelve Streams apps running, they each have one-twelfth of the orders and one-twelfth of the users, and they're shifting work to where it needs to be, not going out and bringing the data back to themselves.

Yeah. I think a good mental model, which probably comes from the name, is that you've got different streams all flowing gradually together to form a larger river.

Yeah.

Okay, so what makes all this hard, Neil? That's a nice wide question.

Oh man, what makes this hard? A lot of times it's trying to bring in the people that understand the business problem you're trying to solve and get them to understand the technology well enough to trust you that you need to do something a certain way, and it's also getting the developers to really understand the business domain. That's no different for Kafka than it is for any technology in the past. I've suffered through ORMs; I suffered through that model extensively ten, fifteen years ago, getting object models and database models to make sense. That's always the gap; I just think there's a lot to Kafka, so that gap is wider. It takes more to get up to speed in Kafka; distributed systems are hard, and so is understanding that. I think it's also a paradigm shift of thinking, like going from object-oriented programming to functional, or from SVN to Git; there's a paradigm shift there. And, since I mentioned it, my son is using SVN. When I went from SVN to Git, I felt like, "I'm too old for this, I'm never going to get it", and then all of a sudden I got it, and I'm like, "oh my gosh, this is liberating, I understand how Git works." Yes, there are a lot of nuances to it, but I like using the Git example because I think others who switched from a previous tool to Git had that same shift.
It's like, "I'm never going to get it", and when you get it, you're like, "this is the best thing ever, this is the way you'd build distributed code sharing, you decentralize it", and all of that. I feel Kafka, with event thinking, is similar. There is the "I'm never going to get it" type of approach: I can't move from command thinking to event thinking, I'm always in the RESTful mindset that I need to get a response back in order to do my work, and you feel like, "oh my gosh, I need to go back and find a different passion." But once you get it, you're like, "oh my gosh, this is great, this is liberating." So the challenge is trying to get business people to get that feeling that they're getting it, and the benefit they're going to get from it. As an architect, as a developer, as someone advocating for Kafka, it's trying to show them the business benefit of using this technology, and that is hard. I think you go with "well, others are doing it"; you can say others are successful in doing it, so trust me that you will be successful, and let me help you get there. At least you have the ability to reference the success: what is it, 80% of the Fortune 100 companies are using Kafka? The numbers are high, so it is a successful tool that people use. So it's trying to get that shift, and once they see it, once they get that paradigm shift, I think things become a lot easier. But that switch was hard for me, and I see it being hard for others. If it's not hard for you, let me know; I would love to learn how you got there so quickly, because I want to be able to teach it better, because it is an important thing. It really is.

I think, as architects and architecturally minded developers, we go around looking for ways of doing things that are fundamentally simpler, and we're prepared to suffer a lot more to get there. So some of us are naturally attracted to beating our heads against Git or functional programming or Kafka in order to reach that promised land where things drop into place and suddenly it's so much simpler you wouldn't go back to the old way.

And there's a challenge there, because you worry: are you going to get to where it's simpler, or did you just make a hard problem equally hard, but now on a new technology stack?

Yeah, and as a consultant, or as a developer, you don't want that. My success is measured by people saying, "okay Neil, we don't need you anymore, we're happy", versus "okay Neil, you made it very hard for us to move on; we're not happy, but we don't need you anymore." You want people to feel the benefits of using Kafka, to be happy that they made the transition. That's the goal, but you do worry about it: did I just turn a complicated problem into a distributed complicated problem, so now I have more things to worry about?

Yeah. I'm going to pick an example: Angular version one. That was a new way of doing things which just turned out to be a different awful way of doing things.

That is one where, well, I didn't do any Angular, but there was a week where I thought, okay, I want to learn Angular enough to be able to talk about it, and that was one of those things where, (a) I didn't have the time, and (b) I wasn't getting it. I'm kind of glad, because it did look like it was one of those things that would be a continued challenge. The front-end space is a whole different complicated one to talk about; I'll let someone else talk about that one.
Yeah, but it's one of the most fun ones, because you get to see people happily using the thing you built. And that relates to what you were saying: you want your clients to be happy they went down that road, and now they're using it for non-technical reasons, they're using it to make the business move forward. Do you have any favorite success stories that you're allowed to share?

Well, I like talking about the Centene one because I know I'm allowed to share it, since we did that talk on it. I hear from them that things keep working well, from the conversations I've had, so that's definitely a great success. Beyond that, it's a hard question, because you don't want to misspeak about what someone's doing with it.

Yeah, that's bad.

To that point, I've definitely been at clients where the step of getting DevOps, CI/CD deployment processes, and Kubernetes in place was the challenge. I've mentioned this at Open Source North, which is a conference locally here; I said in a talk there that you're going to need to invest in those, and ideally you invest in them before you go to Kafka. If you're an organization that's new to Kafka, new to microservices, new to Kubernetes, new to cloud, you're going to want to peel something off and try to get all those other things in place first. I've been at one where we were doing Kubernetes and Kubernetes was relatively new; the organization was using it, but not to the level you'd probably want. Whether to deploy Kafka on VMs, bare metal, or pods was also another conversation; it's just too much. Those that were successful, and I'll refer back to the Centene one because I know I can, are the ones where you have a champion from the business side who has a vision that's solving a business problem; that's where you're going to get success. When the champion is a technology evangelist within the organization wanting to use their technology of choice, which is Kafka in this example, it may not always lead to the best visible success within the organization. So the best for me is having that business advocate, like Brian was at Centene, and me being the technology advocate, and bringing those together; you need both to be successful. If you have only the technology side, or the business isn't trying to find a counterpart to evangelize the technology, then you're probably going to flounder, or I think you will. And I would say one thing about that that is critical: organizations that have people who really understand the data model, the domain, are critical. You don't want just everybody putting bytes into Kafka. If you can't come up with an enterprise solution to make sure people can use that data, access that data, know how the data is available and how the data relates, you just have more data somewhere that no one knows how to get to. That is critical.

Yeah, but do you think sometimes there's an element of "if you build it, they will come"? If there were a way to get business data in a structured, reliable, high-quality manner, then people would start using it, but you can't always put the whole organization on the same page on the same day.

Yeah, that's a fair point. You need a pilot, or something that shows the value and gets more people to embrace it and use it. That's probably a better way of putting it.
Do you think, going back to the abstract then, that the most important thing for a technology like Kafka is making data available to different parts of the organization, which Kafka is good at? Is it being highly available, or is it being real time, or something else?

Well, the answer is yes to both. It has to be real time, but I think that comes with the highly available too. If I'm not able to get data to where you can react to it pretty much when it happened, then I'm not going to get this company to the level they need to be; they're going to lose to their competitors if it's not real time. Everybody's expecting real time. When a credit card gets used and the alert doesn't come in until six hours later, because email isn't reliable, or texting, that doesn't help me; I can't react to it, I'm frustrated. So if I'm building a system that isn't immediate, customers are frustrated and business owners are frustrated, so that real time is needed. If you don't have availability, though, I think you don't really have the real time; you need to make sure it's there. So I have a hard time separating the two.

Well, let me put it from another angle: are the people you're talking to mostly saying "our system just isn't fast enough", or "our system keeps crashing", or "I know the data's over there, but I can't get to it because of integration"?

I think it's that last one: my data's there and I can't get to it. That's probably the one where they come to us: "I can't get to my data." That's been a very common pattern at the last few clients I've been at: never mind real time, we can't even get to it at all. Pulling data out of a legacy system: if you're an enterprise consultant, you're going to be working on "how do I get data out of a legacy system so people can use it?" You have a lot of mainframe data that people can't use because it's too costly, so they don't open it up: MIPS are too expensive on my AS/400 or my mainframe, so you can't access it. "Well, I don't care if it's real time or not, I need to get to it." "Nope, you can't access it." So getting access to data is a front-and-center problem in many organizations. Maybe that's the primary one: I need to get to the data, and if I can also make it real time in the process, then that's an extra win.

Yeah, I think Kafka is one of those technologies where it can solve an immediate problem and have some accidental extra benefits along the way. So where do you think we'll be five years from now? Do you think Kafka will be more mainstream? Do you think the techniques of Kafka will be more mainstream, but not the technology? What's your prediction?

Well, let me roll back five years, or seven years, and then answer that. The 2017 time frame is when I went all in on Kafka. I said, I'm going to invest in learning this well, because there's a business and personal benefit from doing so, and I co-founded a company solely in this space, so there's definitely value there. When I did this five years ago, I thought, if Kafka is the leader for five years, if I can be in this for five years, I get a return on the investment.
In other words, I felt that it was worth the effort. I felt that in five years it would become more like other technologies, where more people know it, to where being an expert in it isn't as beneficial because it's more mainstream, more common. I didn't think it was going to be the next technology that faded, because it's too hard to bring in and then replace; you're not going to replace it with a similar technology, someone's going to have to reinvent the technology to give you a reason to move away from it. But I thought it was going to be more mainstream, and it is so hard to find Kafka developers; it is a hard part of consulting to find people that know it and want to be in it. So from that standpoint, if I go by the last five years, I think the next five years are very strong for Kafka, because if you know Kafka you're in high demand and people will seek you out. I get LinkedIn inquiries, some really obscure ones, on Kafka-related work; it's there, and I don't see that changing. I don't see businesses moving away from it, like I said, because I don't know what the next thing is that would replace it, and the community aspect of Kafka is so rich, so vibrant; the people that are improving it are top-notch people making the product better, and it's open source that people continue to leverage. So five years from now, I guess my thoughts from five years ago will now be on a ten-year horizon, where it's no longer people seeking you out. I'm guessing more will move to cloud services, where they need fewer Kafka operational experts, but the need for people to build event systems, real-time system thinking, ways to extract data and make it available, that will remain. The operational side will probably diminish, because things are improving to make it easier to run, and more people are leveraging experts and organizations to run it for them.

Yeah, I could see that. So whether you're dealing with Kafka or functional programming or one of those other ways of changing your thinking, you think the future is bright for those new minds?

Do I think the future's bright? For people who can make the mental shift to build applications in that way, yes.

Do you think it's worth it as a career investment, not just as mental roughage?

Well, mine is definitely a biased opinion, and you could probably say I have my tunnel vision on, but I think so. The mental thinking of it is independent of the technology, and it has made so many things available today that weren't available ten, fifteen years ago. Ride shares wouldn't happen, credit card alerts at the level they're at now, a lot of the real-time user experience that people get on their phone; distributed systems are behind all of that to make it happen. I think it's vibrant, and I don't see that changing. You keep thinking, is the space going to become saturated? Like I said, I thought it would be by now, and it's not even close in my mind.

I think there are a few headline businesses doing this really well, but the majority of businesses aren't even close to that way of thinking yet.

Yeah, a lot are in there, and I think you hit the nail on the head: there's a lot in it. Like I said, something like 80% of the Fortune 100 companies are using it, I hope my math is right, but are they using it well? Are they the ones that are showcasing it? There's still a lot to learn there.
There's a lot for me to learn on it too. As with any developer, if you look at code you wrote five years ago, you should have improved to where you're not happy with that code; at least, that's the goal. And that's not to say the code is wrong, it's to say that I've become better.

And I think that's important. We're constantly improving, and hopefully so is the industry along the way. A positive note to end on, and I think I'll let you get back to improving the world in your corner of it.

I will try.

Thanks for joining us.

Thank you.

And that's all from Neil for now. Looking at the calendar, it's currently September 2023. Neil and I are both going to be at a conference together in a few weeks' time; that's Current, over in San Jose, so I'm looking forward to seeing him in person. If by some happy coincidence you're going to be there too, do come by and say hi. If you're not going to be there but you still want to say hi, the internet has your back: my contact details are in the show notes as always, and that would be a great time to leave a comment or a like, or hit subscribe and make sure you catch our next episodes. I think that's all for now. Until next week, I've been your host, Chris Jenkins, this has been Developer Voices with Neil Buesing, and thanks for listening.

[Music]