Eastern webinar - Technology Radar Vol. 24 Sneak Peek Webinar 1
- I would like to welcome all the audience who have taken out this time, some of them are definitely early risers, especially in Europe, right? The U.K. I welcome everyone to the Sneak Peek Tech Radar Webinar. My name is Vanya. I am the Head of Technology for ThoughtWorks India, and I would request my fantastic panelists to introduce themselves.
Sudarshan, you wanna go first? - Yeah, sure. Thanks Vanya for the introductions. Hello everyone. Good morning, good afternoon wherever you are. My name is Sudarshan.
I'm a Tech Principal from ThoughtWorks India, based in Bangalore. I'm part of the group called Doppler, which helps put together the Radar along with Birgitta. I've been on the Doppler for two, three years now, and yeah, this is Radar Edition 24, so looking forward to talking to you all. - Thank you Sudarshan.
Birgitta? - Yeah, hi, my name is Birgitta Boeckeler, I'm based out of ThoughtWorks in Berlin, Germany. I'm also a technical principal, so a developer by trade, in reality at the moment mostly an architect by day, I would say. And I've been part of the Doppler group, which curates the Radar, for a year now, so this is my third Radar edition as part of that group. And also, I am one of the people who has never been at one of the face-to-face meetings to put this together.
So this was our third remote Zoom week to put this together. - Thank you. Thank you Birgitta and Sudarshan for introducing yourselves. I think without any further ado, let's take a glance at what this Radar is about.
So the Radar is a document that we have been publishing for the past 11 years, yes, it's 11 already. It's opinionated advice on what's relevant in our software development world, right? ThoughtWorks is a global software consulting firm, so we see and deal with a diverse set of problems, and hence this advice on languages, frameworks, tools, platforms and techniques emerges from the collective knowledge across the ThoughtWorks ecosystem. And this advice ranges from Adopt to Hold.
Adopt Ring is a strong advice zone, right? Where we see that as the way to be, right? We have had many successful encounters with that technology, and we urge you to consider that item as a strong recommendation when need be. Trial is the ring which denotes something that's worth pursuing. We only put things in Trial which we have used in production, so that's definitely there.
It's not a whim or a fancy; it takes actual production use to make something go into Trial. Assess is the ring where we put things that are worth exploring, right? Which organizations can check out to see how they affect their enterprise. And last but not the least, Hold is the ring where we advise to proceed with a lot of caution. More often than not, these items are anti-patterns or misinterpretations of certain practices, right? And that is when we put them on Hold. And without any further delay, we would like to dive into the blips that we have handpicked for you for the Sneak Peek Tech Radar Webinar. And the first one is Webpack 5 Module Federation.
And for that, I'm gonna give it to Birgitta. - Thanks Vanya. Yeah, so Sudarshan and I picked a few blips from the edition that is going to get released next week, some that we found interesting and we try to also pick them from a lot of different areas. So hopefully, we'll have something for everybody because the Radar covers quite a broad range of things, I would say. And here we're starting with a frontend.
And more specifically, web development. So we've had micro frontends as a technique, as an architectural pattern, on the Radar for more than four years now. And since then, many of our teams have applied this style successfully, where you try to go beyond just having microservices in the back end, and you also break up your frontend monoliths and serve the respective pieces of web frontend together with back-end microservices, and then you stitch them together to serve them in the browser. And the style and the available techniques and frameworks have also matured a lot over the years, and Webpack 5 Module Federation is a really highly anticipated Webpack feature, highly anticipated by teams working with this architectural style of micro frontends, because it tackles one of the biggest challenges with them, where a lot of teams have hand rolled their own things to make that work, and that is how to deal with the dependencies of all your micro frontends when they ultimately get served up as one big cohesive application in the browser.
And what comes with that is also how to really deploy them independently, because that is one of the ideas that comes with the whole microservices approach. So one of the goals is to be able to deploy independently, so you can really have those autonomous teams. And there are some challenges there with micro frontends when they depend on each other, when they depend on the same dependency, like React, or when you have things like a design system and the respective libraries that you want to deploy.
And with Webpack 5 Module Federation, you get a new option, a new way of doing this more effectively in the browser, where you can say which modules your micro frontends share with each other, so you maybe don't load those separately; which ones you wanna load from the build, which ones you wanna load from remote. And we actually also have another blip on the Radar that tries to solve this problem, which is importmaps. And I would say, when you look at both of those, importmaps and Webpack 5 Module Federation, they both offer more choices when you're making trade-offs between having independent deployments, caching flexibility, also (radio interference drowns out speaker) if you have cached a dependency, so both of those, (radio interference drowns out speaker) really, can you still hear me? - Yes, the audio is a little choppy (group chattering). - Yeah, I think it's bad. - Yes it's bad.
- (radio interference drowns out speaker) hopefully. (laughs) And maybe one of the differences is that, of course, when you use Webpack 5 Module Federation, you have to be aware that you're locking yourself into actually using Webpack, whereas importmaps is more independent of that, and it is also actually a suggestion to become part of the W3C standards. But at the moment, it's only supported by newer versions of Chrome.
So if you're using importmaps at the moment, you'll also have to use (radio interference drowns out speaker), there's a framework called SystemJS that helps you do that. So both of these things, importmaps and Webpack Module Federation, are still quite new. I know that in Germany, (radio interference drowns out speaker) introduction. So it's exciting to see things maturing in that micro frontend space when it comes to these dependencies.
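To make the shared-dependency idea a bit more concrete, here is a minimal sketch of what a Module Federation configuration can look like in a webpack.config.ts. The micro frontend name, the exposed module, the remote URL and the version ranges are all hypothetical, purely for illustration:

```ts
import webpack from 'webpack';

const config: webpack.Configuration = {
  mode: 'production',
  plugins: [
    new webpack.container.ModuleFederationPlugin({
      name: 'checkoutApp',
      filename: 'remoteEntry.js',
      // modules this micro frontend exposes to the shell and to other micro frontends
      exposes: {
        './CheckoutPage': './src/CheckoutPage',
      },
      // remotes this micro frontend loads at runtime instead of bundling them
      remotes: {
        shellApp: 'shellApp@https://example.com/shell/remoteEntry.js',
      },
      // dependencies shared across micro frontends, so React is only loaded once in the browser
      shared: {
        react: { singleton: true, requiredVersion: '^17.0.0' },
        'react-dom': { singleton: true, requiredVersion: '^17.0.0' },
      },
    }),
  ],
};

export default config;
```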
- Fantastic. Thank you Birgitta for that great introduction to Webpack 5 Module Federation. Moving on, the next item that we have is from the platforms quadrant, called Redpanda, and for that I'm gonna invite Sudarshan. - Thank you Vanya. So we started with the frontend and now we are moving into a totally different part of the landscape, the streaming landscape.
So if you look at the streaming landscape currently, by far the most dominant platform is Kafka. And in some places you'll have Pulsar, or if you're going for more cloud native solutions, you might have Kinesis, or Azure Event Hubs, or something like that. Redpanda is another entrant in this space; it's a fairly new entrant, but it's a very exciting one. So one of the big concerns people have with the use of Kafka is that it's operationally quite complex to run: it has all the ZooKeeper processes and the Kafka processes, and to operate the cluster effectively, you have to understand how they interoperate, how the consensus works, leader elections, so on and so forth.
So it's actually quite complex to operate. Redpanda aims to address this concern by essentially running as a single binary. It doesn't have an external dependency like ZooKeeper; it implements the Raft consensus protocol within itself. And because it implements it within itself and runs as a single binary, it's a lot simpler to operate. Interestingly, Kafka also acknowledges this; they have KIP-500 to address this problem. And I think last week there was a Confluent blog which basically said they have merged it to trunk, which means the next version of Kafka might actually be a lot simpler to operate.
So going back to Redpanda, like a lot of systems actually do this, they actually implement the consensus themselves. But the big concern you might have as a consumer of that product, is do they actually satisfy the correctness properties that you would expect? Like if nodes go down, do they still guarantee correctness, do they still guarantee the log is complete, so on and so forth? Most systems tend to do some form of automated testing like using tools like Jepsen to verify the correctness. Redpanda does the same. They've actually taken Jepsen and extended it to cover longer histories, longer log histories, and stuff like that, to especially verify the correctness.
The one other thing which Redpanda does, which is likely better than Kafka, is it has much better tail latencies. Tail latency is actually a very common problem in distributed applications. What that really means is that your application can perform well on average, like your latencies are pretty good on average; and they might also be pretty good at the 90th percentile.
But once you go to the 99th percentile, you'll actually see most applications have a sharp upturn in their latencies, and that's because it's really hard to engineer for low tail latencies. And Redpanda has actually been engineered from the ground up for that, and it performs very much better from that tail latency perspective. And the way they do that is they use a thread-per-core architecture, which is a more modern architectural pattern to get better tail latencies. They use the Seastar framework, which was created by the ScyllaDB folks. You might know ScyllaDB, a rewrite of Cassandra in C++, and it uses the same pattern. So they basically use the same framework, and along with the thread-per-core architectural pattern, they also apply a few mechanical sympathy techniques.
But the end result of all this is they get almost 10x better tail latencies. Redpanda implements the Kafka API, and that's a very interesting choice, because essentially what that means is, if you're using Kafka in your ecosystem, you can essentially treat Redpanda as a drop-in replacement. You can actually point your application at a Redpanda cluster, and things should mostly just work. This is of course a very sound business decision, right? But I found this interesting because we also have a separate theme in the Radar about how the Kafka API has become a thing by itself.
I think the Kafka ecosystem has become so powerful that if you are a new database, you almost have to build a native Kafka connector, like most of them do. And the more the ecosystem embraces this, the more difficult it is for competitors to come in unless they can implement the API as well, so I think the Kafka API is also a very interesting thing. But anyway, Redpanda implements the Kafka API, which means they are a drop-in replacement.
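As a rough illustration of the drop-in idea, here is a minimal sketch using the kafkajs client in TypeScript. The broker address, topic and message are made-up placeholders; the point is simply that an existing Kafka client can be pointed at a Redpanda broker without code changes:

```ts
import { Kafka } from 'kafkajs';

// The broker address and topic name are placeholders; a Redpanda node speaks
// the Kafka protocol on the same kind of address a Kafka broker would.
const kafka = new Kafka({
  clientId: 'orders-service',
  brokers: ['redpanda-broker.example.com:9092'],
});

async function main(): Promise<void> {
  const producer = kafka.producer();
  await producer.connect();

  // Publish a single message, exactly as you would against a Kafka cluster.
  await producer.send({
    topic: 'orders',
    messages: [{ key: 'order-42', value: JSON.stringify({ amount: 99 }) }],
  });

  await producer.disconnect();
}

main().catch(console.error);
```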
Most streaming platforms have some stream processing capability, like Kafka Streams. An interesting capability Redpanda offers is transformations using WebAssembly, which is a very interesting choice. And the reason they do it is so that users can develop transformations in the language of their choice; you can write your transformers in any language like Golang or Rust, and once they are compiled down to WebAssembly, they use an embedded V8 engine to run the transformations.
Of course, this pattern works only if you are transforming a single message; if you're doing proper transformations which merge multiple streams and so on, then you probably wanna go for a full-on streaming engine. But if you're in a situation where you're doing some simple, single-message transformations, something like this makes a lot of sense. An inline transformer, if you will, because you're not using a full cluster and the cluster's resources to do some simple transformations.
I think that's mostly what I wanted to say about Redpanda. One little caveat though: while it's very exciting, it's still fairly new. I think we've not seriously tried it in production yet, but it's very compelling, and I think it's a good platform to keep an eye on. - Yeah, it's in Assess, so something to definitely explore and check if it actually has an impact on your organization. I would like to remind our audience again to feel free to ask questions regarding the items that we are discussing in the Q&A section, and we're gonna try to answer them as much as possible.
We will try to limit ourselves to the questions associated with the items that we are discussing, and not answer the others, just for sanity's sake. But in the interest of time, moving on, we come to a technique. Techniques are often my favorite part of the Radar, and this one is really interesting because it's a pattern that we have all seen on our projects, which is about the conundrum between peer reviews and pull requests, and Birgitta is gonna tell us more about that.
- Yeah, and this is also the first and only Hold blip, I think, that we're going to talk about. So we try to use the Hold ring quite sparingly, we try to give more positive advice, but the things that we put in the Hold ring are really things that we continuously see, still, at a lot of the clients we work with, so we wanna call them out as something that we've seen have bad consequences a lot of times.
So this technique, this blip, we call peer review equals pull request. And what we mean by that is not that pull requests are generally bad, but that we see this a lot where pull requests become kind of like a mandatory thing that everybody has to do, and then teams seem to forget about all the other options you can use to actually do peer review of your code. And it becomes the one thing. And why we see that as a bad thing is that pull requests, if you use them a lot and as a mandatory thing, they can kind of break the team's flow and introduce bottlenecks. So they basically introduce queues to your workflow, right? And the more queues you have, the more this will pile up and really disrupt your flow. And also, the more it becomes a mandatory thing and starts slowing things down, and no developer likes bottlenecks, right, it also becomes less effective, because people just find ways around it when they need to.
When they have cases where they're like, "Ah, in this case, it's not as important, I just have to get this change through," then they just find ways around it, where somebody just clicks on something for the sake of it, and it doesn't even get peer reviewed. So when it's a one-size-fits-all thing, it means people stop thinking about why they are doing this anymore.
And they might be taking shortcuts in the wrong places; they might be taking them on crucial parts where peer review would actually have been really valuable. What are alternatives? Or not alternatives, but what should be part of your whole toolbox of how to do peer reviews? So one of the things that we are big fans of, as some of you might know, is pair programming. That is of course one of the things that you can do to do a peer review just in time, just in the moment when you're writing the code. I also sometimes like to use the term pair development, where maybe you're not programming, you're not creating code with each other all the time, but you have two people working on a story or on a task, and you regularly get back together and put your pieces together and talk about what you've been doing, and that's how you do the peer review. Then also, what I actually like to do when I've worked on a piece by myself, is I pull somebody in and have a conversation with them about the whole thing that I built.
So not commit by commit, where they do it asynchronously and I just throw it over the fence to them and they look at it, but we actually have a conversation and go through it together. And you can actually do this with pull requests as well, so that's another variation of this. I think where we think this is particularly bad is when pull requests are used almost exactly like they're used in the open source software world, so asynchronously, and you throw them over the fence and somebody else looks at your code and writes some comments, and then you ping pong back and forth. So having a pull request, but then also having a conversation about it, is already a lot better than that in my experience.
Show and tells in the team are another way that you can do peer review. In a lot of these cases, you can actually still use the technical pull request feature if you need it for some reason, if you wanna document things, but you find other ways to not make it a bottleneck in the process. - Sure. There are a few questions in the Q&A for the previous blip.
So while we give people some time to digest what Birgitta has told us about the peer review and pull request conundrum, I'm gonna ask a few questions on Redpanda. So Sudarshan, there's a question: is Redpanda open sourced? And if so, is it also written in C++, similar to ScyllaDB? - Yes, so it is not open source, it's actually source open. What that really means is, and many others like CockroachDB are also choosing to do this more recently, essentially the problem that some of these open source database solutions are finding is that the big cloud vendors, particularly Amazon, essentially create an offering around their database and then make money off it. And some of these solutions obviously don't like that, so these more modern licenses are essentially source open, in the sense that you can use it, you can look at the source code, you can use it in your project, for example.
But if you are a cloud vendor and you wanna make money off this, then you need to pay them, or you need to engage with them, sort of thing. And of course they have a bit of a freemium model, in the sense that they have a commercial offering as well. I think earlier when I talked about the WebAssembly transformers, I think that's only available in the enterprise edition, not in the open source one. - Okay, cool.
- And on the C++ question, yes, it's C++, just like ScyllaDB, yeah. - Fantastic. I think there were two questions related to Redpanda, both of them sort of aligned, so we have answered those. Moving on to the pull request and peer review discussion, there is a question about a discussion, as in a small ad hoc meeting, versus a conversation on the PR. Birgitta, do you wanna shed some light on that? People are maybe thinking about this as a small ad hoc meeting rather than a discussion on the PR, like what do you prefer? - Yeah, for example, yeah, so I think that's what I meant by the show and tell, right? So in a lot of the teams that I've worked with, we've had this ritual, this team ritual of getting together once a week with a loose agenda, where maybe in the physical co-located times, you would just put sticky notes on something, right? And then you get together and one person is calling out a current problem in the code or a challenge you have, and another person is like, "Oh, we built this new thing.
And we had to use a new framework there, let us just quickly show you what we did there." And then people can also give feedback. So that would be one option, or, if you don't wanna wait for that weekly meeting, of course, you can make it an (indistinct) meeting.
So yeah, there's a wide variety of options for how you can do peer review, especially in a more synchronous way. And that's what we want to call out with that blip: pull requests are not the only way. Also think about, in the different situations that you have, how important it is in that situation. And I would also say, one reason why in some organizations the technical pull request feature is important is sometimes regulation, right? Like PCI compliance and stuff like that. And for that, I would also say, look at where you really need these regulatory things. It might not apply to everything that you're building.
And then it can usually be really helpful if you architect for that to be maybe a different part of the system than the parts where you don't need those high regulatory requirements, and then see where you can apply different rules. What is the risk profile, and what are the documentation requirements, in different parts of what you are doing? - Yeah, all great questions. There is another interesting one for pull requests and peer reviews, Birgitta, and then we move on. How does working remotely affect the pull request pattern? We are increasingly hiring people we have never met, which increases the risk of a malicious actor introducing malware, or a poor hire introducing bugs.
I think that's a very valid question. Some insights on that? - Yeah, I think there are two aspects to this question. One is that it's a lot easier to give in to this temptation to do mostly asynchronous peer review when you're remote, because it feels like there's more of a barrier if you need to ping somebody to maybe have an audio or video channel, than when you're sitting in the same room and you just walk over to the table and you're like, "Ah, whenever you have a minute, can you come over and I'll show you (faintly speaking)", right? So, (radio interference drowns out speaker). So I think it needs more (crackling sound drowns out speaker), what are the different ad hoc channels you have on the team to set up an audio or video channel.
(radio interference drowns out speaker) increases the risk (crackling sound drowns out speaker) actor. Yeah, I don't know if I would agree, (radio interference drowns out speaker) I don't know if that comes from maybe being forced to open up new doors to the system, because not everybody can be onsite and have the (radio interference drowns out speaker). Yeah, so I'm wondering if that question comes from that perspective. More people are remote, maybe on short notice because of the pandemic, and they're not onsite anymore, so they don't have physical access to things anymore. And then maybe you have to open up security doors or something for people to access it.
I don't know if it comes from that. But yeah, (crackling sound drowns out speaker) you have kind of like (crackling sound drowns out speaker) architecture approach and good security practices to secure your pipeline and your code changes and all of that, which can otherwise be a huge security issue, right? Then I wouldn't necessarily blame remote. But I mean, that's always a problem with, or a concern that people have with trunk-based development: how much do you trust people to push to master without somebody else having looked at the code, right? And there, it also comes down to your safety net. What do you have in terms of checks? I'm going to talk about some of that in one of the next blips as well; in terms of checks, in terms of tests, in terms of everything that you have in your pipeline to detect when something is going wrong. So yeah, you have to balance all of those things.
- Sure, there are more questions. But in the interest of time, I'm gonna move on, and at the end, we'll try to pick up a few more. The next blip that we move on to is from the tools quadrant, which is DBT, and (indistinct).
- Thank you, Vanya. Okay, so DBT. Typically, when you have data analysts who need data, right? Like if they're performing some analysis and they need it in a new shape, often in a data warehouse sort of setup, then typically what happens is they go to data engineers and ask them for the data in the shape they need. And the data engineers will probably go and create a new ETL pipeline, probably using a (indistinct).
The problem here is that this is not a very scalable pattern. One of the key ideas from the last few years in the data (indistinct) is that, from a data democratization perspective, we need to make it easier for consumers of data to get data in new shapes without waiting on engineers to create a new pipeline. That translates to: we don't want our data engineers to be writing ETL pipelines. We want data engineers to instead build data platforms which make it easy for consumers of data, whether they are analysts, scientists, or data engineers themselves, to use the platform to more easily create data in the shapes they want and access the data. So DBT is a tool which enables this vision.
It stands for data build tool. It's a command line tool based on SQL and Jinja, which is a templating language, and it enables users to transform data within the target database. The core idea of DBT is to give the consumers of the data the power to create transformations. So you would essentially use it in an ELT context, where the (indistinct) would already have happened, extract and load, and the data will exist in the target database.
And you would use DBT to then transform the data into the shapes that you need, and then you would consume it. Now, DBT, by the way, we had it in Assess back in 2019 or so, and with this edition of the Radar, we moved it to Trial, because it's just had such a big pickup in the community. We've used it quite successfully in some of our projects, and it does what it's intended for really well, actually.
This is sort of happening now because it's almost like the right tool at the right time and the right place. There are a few different trends it builds upon to enable it to do what it's doing. The first is, you couldn't have done this before, let's say five years ago, because you didn't have a system where you could run transformations on big data and make it work efficiently, right? You needed the right sort of system, the right software, the MPP SQL data warehousing technologies, right? Like Redshift, BigQuery or Snowflake.
You need these sorts of systems, which can apply transformations at scale, for something like DBT to work. That's one. Another trend is the emergence of SQL as the language across the data landscape. Whether you're an analyst working on data warehouses, whether you are a data scientist, probably working on Hadoop-style technologies on data lakes, or even a data engineer, right? Across the entire spectrum, SQL's a tool in everyone's tool belt now.
Everyone needs to know SQL. So having a tool like DBT which uses SQL means it's immediately accessible to all these roles. So I think these are the trends which make DBT the right tool at the right time in the right place. And one of the things we really like about DBT is that it's not just a low-code sort of tool, right? It's not intended to dumb down the problem, give you a UI and drag and drop and stuff like that. It actually encourages good engineering practices, like source control and tests and documentation and automated deployments and stuff like that.
It enforces all those practices and gives a simple interface for people to create SQL-based transformations. It provides adapters for all the big database systems, including even Presto, Athena sort of systems, all the way to the data warehouses we talked about: Redshift, BigQuery, Snowflake, et cetera. But it's not a silver bullet, right? It is essentially a tool intended for batch transformations. So if you are working in a stream transformation context, in an ecosystem where you want streaming transformations, then this might not be the right tool for you.
But having said that, the analytics ecosystem is still very batch driven; it's slowly evolving to become more stream driven, but right now at least, it is very batch driven. So yeah, to summarize, I would say it's a tool worth keeping an eye on; it's proving to be extremely powerful from a data democratization perspective. You've got analysts who are now comfortable using DBT to self-serve their new data shapes, consume them and provide analysis, as part of the whole movement to create platforms rather than (indistinct) of ETL pipelines. This is also, I think, a key piece of the jigsaw puzzle of how you decentralize data platforms, right? Like, how do you do data mesh style rather than the central data lake style? How do you do that? For that you need more platform capabilities, so that it's easy to build your data products.
And in that ecosystem, I think DBT can be a critical piece of the jigsaw puzzle. - Cool. We have one more tool coming up next, in fact, two more tools.
Over to Birgitta, Prowler and Recommender. - Yeah, so these are two tools that we're putting on the Radar this time from the category of infrastructure configuration scanning. I apologize by the way, my internet connection seems to be a bit unstable. So Vanya, please call out if I'm not understandable at all anymore.
So yeah, infrastructure configuration scanning is a technique that we put on the Radar quite a while ago actually, three years ago or so. And the idea here is that there's more and more of a drive towards autonomous teams who are also taking more ownership of the infrastructure they are running, because that's being enabled by the rising abstraction level of the services they're using from the cloud providers. So you can take care of more and more things as a team, in terms of breadth, because things are becoming more abstract and easier. But there are also potentially more pitfalls to avoid, right? You're taking care of this wide range of things that maybe you don't have an expert for on the team. But there are also more and more tools now to help you avoid those pitfalls as you go, and to have experts from outside of the team contribute to the automation in your pipeline, to help you scan what you're doing in terms of infrastructure and give you alerts and checks when you're doing something that might not be a good idea. And these are two examples of that.
So the first one is Prowler, which is an open source tool that you can use to apply this technique when you're on AWS. It's specifically focused on security and compliance, so it has a lot of checks from the CIS AWS Benchmarks, and it also has a number of checks that belong to the category of helping you with PCI DSS compliance and GDPR compliance, the European data privacy regulation. So yeah, like I said, all of those things in the category of security, to really help you put things into your pipeline; as I said, maybe even from the outside, from your security governance you can tell the teams, "Okay, put these checks in when you're changing things in your infrastructure.
We can see when things pop up that you should take care of." The second tool, Recommender, is also from this general category. This is actually a service from Google Cloud, so that's why I put both of them into this category: to give two examples, one for AWS and one for GCP. And this is a service from Google that is not just restricted to security.
So the idea is that they have multiple, what they call recommenders, in different categories. It can help you analyze your usage of compute and tell you where you can potentially optimize things, or it can give you insights into security things as well, like what's going on in your firewall configuration. An example that I really like is their IAM, their Identity and Access Management Recommender. And I think it's also a good example of how, in the Recommender, they're using data about what's actually happening in your environment to tell you where you could improve things.
So in the case of the IAM Recommender, they help you implement the principle of least privilege, right? The security principle where you try to only give roles the minimum access that they absolutely need. And the IAM Recommender on GCP will look at what the different roles you have set up are actually doing, and then it will tell you, "Oh, this role has the following privilege and it's actually never using it." And then you can say, "Okay, then I'll just take it out." Because sometimes we don't know what something's going to need and we maybe set it up a bit wider, but then based on that usage, it will tell us where we can reduce the privileges. - Yeah.
Birgitta, there's a question there: Prowler, I'm assuming, is an AWS-specific tool and not available for Azure, is that right? - Yeah, as far as I know, it's AWS-specific, yeah. - Okay, great. Moving on, because we are running short on time, the last blip for the day is a technique called team cognitive load, and for that, I'm gonna hand over to Sudarshan.
- Thank you, Vanya. So team cognitive load, it's a technique in Trial. One of the key problems in the sort of solutions and architectures that we design nowadays, right, is getting the right software boundaries for our components and shaping our teams around them. Typically, what we do is apply domain-driven design techniques in designing the software boundaries.
Then we acknowledge Conway's law and we shape our teams around these boundaries, right? That's typically what we do. So Matthew Skelton and Manuel Pais, the authors of the book "Team Topologies," argue for using team cognitive load as a way of influencing your architecture and team shapes. And we find this a very useful tool in reasoning about the health of our team structures, right? It has helped some of our teams to even validate their intuition.
They recognize a problem, and they can use some of the other ways of thinking to identify it, but using team cognitive load helps to articulate the problem in a different way. And so we felt that it's a technique worth calling out. Essentially, what we mean by team cognitive load is: what does the team have to deeply internalize? In other words, what does it need to keep in its collective head when building, owning and operating the software the team owns, right? Essentially, what does it take to be a high-performing team in the context of the software it owns? So the key idea here is that a team working with a software system that requires too high a cognitive load cannot effectively own or safely evolve the software. So that's the (indistinct).
And so it's actually super important, or useful, to assess team cognitive load in some way, and then take action to minimize it. There are a few different ways you could use to assess it. In the book, the authors talk about something as simple as using a questionnaire for the team as an assessment, or you could use relative domain complexity as a measure: look at all the domains in your system and give a certain complexity number to one of them.
And then, relative to that, assess the complexity of the surrounding domains, and use that to judge what the team cognitive load for those domains is like. This is going to be quite contextual, right? And it's worth spending time in your context to measure the team cognitive load. But what can you do if you find that the team cognitive load is high? For a software delivery team, a team-first approach to cognitive load means that you have to limit the software system that a team is expected to work with. It might be that there are some complex (indistinct) that the team is grappling with, which is leading to that additional cognitive load. So you might then look to extract out some of those complexities and some of those (indistinct) and push them into the platform layer. A great example of that is a whole service mesh setup, right? If you are in a distributed setup, and you have a lot of complexity related to failure modes or retries, mutual TLS and all those sorts of things, pushing it into a service mesh means you've taken complexity out of the team owning the service and pushed it into a different layer.
And that's a great way of addressing this problem. Or you could do the (indistinct) technique of splitting out the complex bits into a separate subsystem, or even change the team shape, right? Bringing in additional capabilities or skills. If you notice, none of these actions are new; it's stuff that you might do if you approach it from a pure architectural lens, or the org transformation lens. These are solid techniques that we have used, but approaching it from a team lens, I think, gives you one more useful way of approaching the problem, and might lead you to some ways of breaking the system up which you may not arrive at if you purely approach it from an architectural or domain-driven sense. So I think it's a very, very interesting technique: using a team-first approach, and using team cognitive load as a parameter for architecting your system.
And we think it can be a very useful north star sort of thing to focus on as we evolve our systems, constantly asking ourselves the question: are we keeping our team cognitive load under control? Yes, good. No, what do we do? Should we break up the system? Should we take one of these actions, so on and so forth? So yeah, I really like this blip. I've tried it in a few places, but I'm really looking forward to trying it at my client. - Also, looking at this cognitive load, usually it's for individuals, when we talk about multitasking, and applying that to a group of people, I always think of that as well with work in progress limits for a team, right? That if the team, as a group, works on too many things at the same time, that also kind of overloads it, right? But I also like this additional thing, this idea from "Team Topologies" to also use it when you consider how you shape the team, and tailor that.
So, yeah, it's very nice. - There's one interesting question in the chat around "Team Topologies." Any example of where team load is too high? Is it across the number of technologies, business domains, or a mix of these, right? How do you measure to determine if the load is too high, or what is the cognitive capacity of a team? I think that's a great question (indistinct). - It is a great question.
And I don't think we have enough data to summarize and say crisply, here are the ways which work. Right now, at least, we are approaching it in a very contextual manner. Like, what are the dimensions for your team which add to complexity, right? Some things may not add to complexity. For example, and this I think is very clear, it may not be lines of code, it may not be the size of the system. It might instead be just configuration, for example, right? Maybe you are deploying it into many contexts and there's a lot of configuration, and that's adding to complexity.
So I think, at this point at least, my sense is that it is still contextual. We are still trying to figure out how best to measure it. I like the idea of using relative measures, even if you are not able to come up with absolute measures, sort of like estimates, right? You may not be able to say this is a 10 on a scale of one to 10, but you can say this is twice as complex as this other thing.
And if you are able to use relative measures, you might be able to better reason through it and say, you know what, this team in my organization has a higher cognitive load than that team, because the complexity of the ecosystem they work with is twice as much. Maybe something like that might be easier to reason about. - And like what I just said with the work in progress: how many topics does a team have in progress at one moment in time? So if you maybe tag their tasks into different buckets, right? And you see they're actually working on 10, they have to work on 10 different things at the same time because of the nature of what they're doing, then that's usually a warning sign. - Absolutely, yeah.
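As a toy illustration of the relative-measure idea discussed here (not something taken from the Radar or from "Team Topologies" itself), here is a small TypeScript sketch where one domain is picked as the baseline, the others are scored relative to it, and a team's total is compared against an arbitrary, team-agreed budget. All names and numbers are made up:

```ts
// Toy sketch: relative domain complexity as a rough proxy for team cognitive load.
interface Domain {
  name: string;
  relativeComplexity: number; // 1.0 = the baseline domain, 2.0 = roughly twice as complex
}

// Sum up the relative complexity of everything the team has to keep in its head.
function teamCognitiveLoad(domains: Domain[]): number {
  return domains.reduce((sum, d) => sum + d.relativeComplexity, 0);
}

// Hypothetical team owning three domains, scored relative to 'payments'.
const paymentsTeam: Domain[] = [
  { name: 'payments', relativeComplexity: 1.0 },
  { name: 'fraud-checks', relativeComplexity: 2.0 },
  { name: 'settlement', relativeComplexity: 1.5 },
];

const budget = 3.0; // arbitrary threshold the team agrees on

if (teamCognitiveLoad(paymentsTeam) > budget) {
  console.log('Cognitive load over budget: consider splitting a domain, pushing complexity into the platform, or reshaping the team.');
}
```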
- Yeah, there's one more interesting question on this. It's around whether this exercise is a one-time view of the team, or something dynamic that keeps on evolving over time. - You know the answer to that, right? (both laughing) You know it. It is absolutely not static, but how often you do it, I mean, it's like any other assessment, right? If the effort of getting the right assessment is too high, it's gonna happen less frequently. If the effort is trivial, it can be continuous, yeah. - Makes sense, makes sense. In the interest of time, I'm gonna move on to the two themes that we have to discuss.
So thank you both for sharing information about these blips. We will go to two themes now, which should also be available in the upcoming Radar document. And Birgitta is gonna talk to us about the first one: Consolidated Convenience over Best in Class. Birgitta. - Yeah, so the themes are just a little side thing we usually do after we've discussed all of the blips. Like, are there things that we noticed, any patterns, any additional things we wanna say for an edition? And they are more ephemeral, right? So the blips usually stay there for you to see, and the themes are for this particular edition.
And this one is maybe a bit generic. What it specifically refers to is developer-facing (indistinct). So artifact repositories, source control systems, CD pipelines, all of those things for developers. And how we're seeing this is, we're at a moment in time right now where a lot of organizations are trying to kind of walk this trade-off of the one-stop shop.
So there's kind of that convenience of, we're on Azure or we're on AWS, so let's also use the services that they have to set up our pipelines, to use artifact repositories and so on, because it's more convenient. But on the other hand, a lot of these tools from the larger platforms that you're already using for other things, that are now adding these services on top of what they already have, are not necessarily the most mature ones in that space yet. So you kind of have to decide: okay, do I use this because I have it all in one place and that's more convenient, or is this maybe not mature enough for me yet? And there seems to be a trend that more and more of those things can be in one place.
For example, we put AWS CodePipeline on Hold in this edition, because a lot of our teams have had very bad experiences with it. Azure DevOps is actually in Trial at the moment, I think, but the more teams are using it, the more disputed it is: some teams have very good experiences, some teams less good experiences. We finally put GitHub Actions into Assess, which is also a CD pipeline implementation; it's been suggested by ThoughtWorkers to put on the Radar for at least a year, I think, and there's always been this, "Ah, is it mature enough? Is it too generic?" So there's been a lot of back and forth. These tools are getting more mature, but you should still be a bit cautious when you think about, "Oh, we're on AWS. So let's also use that for all the developer tooling, whatever they have."
So we're kind of at a point in time where, yeah, everybody has to make that trade-off a little bit. - We can move on to the next theme, which Sudarshan is gonna talk about: Platform Teams Drive Speed to Market. - Thanks, Vanya. As Birgitta said, these themes are just covering, summarizing a lot of our conversations internally.
They may not end up as blips, but we will definitely have touched upon the topic multiple times in various blips. And in every Radar, platform teams come up repeatedly; it's a very, very common blip, theme rather. So yeah, platform teams are not a new concept, right? It's been around for a long time; even back in 2017, I think, we had it in the Radar, and it even made it into the Adopt ring.
But yeah, I mean, we have a lot more learnings now on what makes them successful and what the failure modes are. It is very clear platform teams do drive speed to market, it's absolutely true. Organizations use them, and we've used them successfully, to reduce operational complexity, accelerate application development, and improve time to market; all of this good stuff.
But just as there are many positives, there are as many failure modes, and it's easy to slip into one of them. We are getting more experience, that's the good part; but painfully, right, with failures.
We've got the one-platform-to-rule-them-all failure mode. We talked about cognitive load before. Think of a platform which has to do the (indistinct) bits, it has to do your runtime observability, it has to do your (indistinct). If you try to do too much, then the cognitive load is obviously too high.
And apart from the quality of the service, the team is just gonna struggle, right? Another failure mode is big platform upfront, where it may take years to deliver value because you're just trying to do too much; you're trying to get to a perfect state before getting your users to use it. Putting platforms behind a ticketing system was one blip we talked about, I don't know whether it actually made it. You build a platform, but then you put it behind a ticketing system, which means you lose all the things it was originally intended to solve.
The self-serve, the ease of use, and all that stuff. Another failure mode we are seeing is layered platform teams, which basically means you take your application stack, create layers, and you have a platform team for each layer. So you end up putting logic in there, and your actual feature teams end up sort of being disengaged. So multiple failure modes, right? But what's helping us is that we are realizing some of the things to do to avoid these pitfalls. And one of the most powerful things is just to apply product thinking techniques here, right? Have a proper product owner for it, talk to your customers, understand what the scope of the platform you're building is, how do you make it usable for your customers? Apply all your product thinking techniques, and there's a good chance that you'll avoid at least some subset of these failure modes.
And I would say that, in our experience, we're also coming to some realizations about the interaction patterns between platform teams and the various consumer teams; like how, at various points in the maturity lifecycle, different patterns come into play. You might start with a more collaborative pattern initially, where you are working with one of the consumer teams, but over time you'll be moving more and more into a clean self-service, as-a-service pattern. So yeah, basically I guess the summary here is, there's no doubt that this is a very, very useful pattern with lots of failure modes, but we have learned a lot as well, and we know some techniques to avoid those failure modes. - Cool, I think with that, we come to the end of what we had planned for this (indistinct).
But there is one question which is still open, about team cognitive load. Sudarshan and Birgitta, can you please give one example assessment of cognitive load measurement in real life? - Actually, I pasted a link in the chat. No, I've not used it myself, but I think one or two of the teams have used it as an inspiration to derive a different questionnaire. This is, I think, an assessment created by the "Team Topologies" creators themselves.
This could be a starting point, I think. - All right, cool. So we have already crossed the end of our time, but if there are a few more pressing questions, we can definitely stay back to take them. In the meanwhile, I would like to thank my panelists, Birgitta and Sudarshan, for taking out this time to give a sneak peek to our audience, who are waiting for Volume 24 of our Radar. And I would like to tell our audience to stay tuned, because the Radar is going to be out pretty soon. I've been advised not to say what the timeline is looking like, but do stay tuned for more updates about the Radar.
And I think because there are no more questions, maybe we can call it a day. And thanks everyone. Thanks for joining, and have a great day ahead or a good night. (laughs) Bye, take care, everyone.
- Thank you, Vanya. - [Vanya] Thank you. - [Sudarshan] Bye. (upbeat music).