I just got back from JavaOne, and the talk of the conference was JDK 24, which was released during the conference. Now, there are a lot of great new features, 24 to be exact. So if you want to, you can go ahead and take a look at this list. We're going to talk about one of those features today, and that is stream gatherers. So if you click into this, if you've never read a JEP before, this is JEP 485, and it's a really good read.
You don't need to read it cover to cover, but if you go through there, you can see a summary of it. Here are the goals of this JEP. What are we trying to solve for? More importantly, what are the non-goals? What are the things we're not trying to solve for? You get some motivation. Here's what we did before JDK 24. Here is what stream gatherers will fix, and there are some examples in there.
Now, I'm going to go through an example today. I have a blog post accompanying this, if you want to go ahead and read through that. Or you can follow along. We're going to do some coding together and work through this challenge.
Now, the challenge is I have this blog application, and I have categories for each of the posts. And what I want to do is, for each category, show the three most recent posts. So I have a list of posts, and I need to somehow get that out of there. There are ways that you can do this before JDK 24 using stream processing, and you can go through and kind of make it work. But you'll see that there are some challenges when doing this. We get some pretty hard-to-read code.
It's a little bit complicated. What stream gatherers are going to do is allow us to write our own intermediate operations to get the results that we're looking for, as in this case. More importantly, these become reusable operations that we can use throughout our applications, which is really nice.
And we get some other benefits here, which I've laid out in the blog post. So we'll start with a blog post model. We'll have a record for a post.
I'll create some dummy data that will give us categories and posts for each of those. And then we'll walk through kind of an old way to do it, and then a new way to do it using stream gatherers.
In this blog post, there's a link over to the GitHub repo as well. Not a lot of code here, but if you want to go ahead and take a look at this, you can as well. With that, let's dive in and I'm going to open up IntelliJ. You can use Notepad, Visual Studio Code, Eclipse, whatever you feel comfortable using. That's really not the point of this.
We'll just dive in and we'll kind of do everything in one file to keep it simple here. All right. So here we are in IntelliJ. I'm going to go ahead and create a new project called Stream Gatherers. We are going to use Maven. I'm going to use JDK 24. Again, you need to be on JDK 24 or higher to take advantage of this feature. And I'm going to come in here and I'll just call this Gatherers. And I'll go ahead and create the new project.
All right, from here, I'm going to delete some of the code that just comes with our basic project and we'll start from scratch. So I'm going to create a new public static void main, and this is where we'll do the bulk of our work today. Now, we are going to use the idea of a blog post. And to do this, we're going to need a bunch of blog posts as data to work on. So I'm going to create this method here that is going to create a bunch of blog posts.
Now we need a blog post type. So I'm going to create a new Java class. I'm going to call this BlogPost and this will be a record. So in here we'll create a long ID, a title, an author, some content, a category, and finally a LocalDateTime, which is going to be the published date. All right. So this is our record.
If you're new to Java and you haven't seen a record, this is kind of like a class. It is an immutable class, meaning once we create an instance of it, we can't go ahead and change things. That gives us a bunch of benefits. But from a record standpoint, you'll notice this is much less verbose than a class. We don't need to declare all of our fields, getters and setters, a toString, an equals and hashCode, et cetera. So we have this nice compact type called BlogPost, which is going to represent a blog post in our system.
And here, I just didn't want to type all this out. I have a bunch of different blog posts and they each have categories. Let's go down. So here's Spring. So a bunch of different categories and then titles and some content.
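As a rough sketch of the record described above: the component types are assumed from the spoken description (a long ID, strings, and a LocalDateTime), and the class name BlogPostRecordSketch and the sample titles, authors, and dates are invented for illustration rather than taken from the repo.

```java
import java.time.LocalDateTime;
import java.util.List;

class BlogPostRecordSketch {

    // Components follow the order listed in the talk; the exact types are assumed.
    record BlogPost(Long id, String title, String author, String content,
                    String category, LocalDateTime publishedDate) {}

    // Hypothetical stand-in for the sample data method; all values are made up.
    static List<BlogPost> createSampleBlogPosts() {
        return List.of(
            new BlogPost(1L, "Getting Started with Records", "Author A", "Sample content",
                "Java", LocalDateTime.of(2025, 3, 1, 9, 0)),
            new BlogPost(2L, "Intro to Spring Boot", "Author B", "Sample content",
                "Spring", LocalDateTime.of(2025, 3, 5, 9, 0)),
            new BlogPost(3L, "Pattern Matching Basics", "Author A", "Sample content",
                "Java", LocalDateTime.of(2025, 3, 10, 9, 0)));
    }

    public static void main(String[] args) {
        BlogPost first = createSampleBlogPosts().get(0);
        // Accessors are named after the components: title(), not getTitle().
        System.out.println(first.title());
        // toString(), equals(), and hashCode() are generated by the compiler.
        System.out.println(first);
    }
}
```

Because a record compares by value, two BlogPost instances built from the same components are equal, which is handy when checking stream results.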
And this will just make it easier for us to work with some type of collection of data and learn more about stream gatherers. So the first thing that we're going to need is those blog posts. So I'm going to say a list of blog posts, we'll call these posts, is equal to createSampleBlogPosts. So that will get us a bunch of posts. Then what I want to do is start out by just taking a look at how we've used streams in the past. And to do so, let's use something like postsByCategory. And we'll pass in the posts. And we want to get all the posts that have the category Java. Right. So this is going to be a method here in our main class, so we can go ahead and create a method. And that creates a static void.
So we're not returning anything. We're just going to list out all of the posts in a particular category. And let's go ahead and change this parameter name to category. And now we can start to work with streams how we've done in the past. We have this collection of data. In this case, it's a list of blog posts. And what I want to do is kind of filter on that. And that is exactly what streams allow us to do. They allow us to run some intermediate operations on it, like filter, map, et cetera. There's a whole bunch of those. And then we use a terminal operation to collect everything back into something else.
So here's what we'll do. We'll take our posts. We'll call the stream method on it. And that will give us access to these intermediate operations. So if we hit dot, we can start to see things like filter and limit and sorted and collect and gather. We'll get to all those. So I want to just filter these by a particular category. So I'll say, here's my post.
And what I want to do is get the category on that post. And when it equals the category that was sent in, that will filter down to everything that we need. I want to go ahead and sort this. So I'll use a comparator. So we say Comparator.comparing. And we're going to sort on the blog post's published date. So we'll get the published date and we'll say dot, oops, dot reversed. And that will give us those in descending order. Then we can say, all right, I want to limit those to maybe three. And the terminal operation, what are we going to do when we collect all of those things back? I want to use the toList method to return those as a list. So I'll say this is postsByCategory. And that will give us all of those.
So I want to go ahead and print those out. So I'll say System.out.println. And we will say, let's get a new line here and say posts by category, and the category. That looks good.
And then I'll just use postsByCategory, forEach, and print those out. And actually we can clean this up a little here, and we can just say System.out::println. Right. So that will give us that. Let's go ahead and just put a comment here, and we'll say find all posts by category. Okay. Now we'll do that. So this is our first iteration. We're getting a list of blog posts. We are passing this to this method. And we're basically filtering and sorting those down and limiting it to three and then printing those out. So let's go ahead and run that.
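The filter, sort, limit, and toList pipeline walked through above could look roughly like this. It is a sketch under a few assumptions: the record is trimmed to the fields the pipeline touches, the sample data is invented, and the helper returns the list (rather than being void) so the result is easy to inspect.

```java
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.List;

class PostsByCategorySketch {

    // Trimmed-down record: only the fields this pipeline needs.
    record BlogPost(Long id, String title, String category, LocalDateTime publishedDate) {}

    // Hypothetical sample data.
    static List<BlogPost> samplePosts() {
        return List.of(
            new BlogPost(1L, "Java A", "Java", LocalDateTime.of(2025, 1, 1, 9, 0)),
            new BlogPost(2L, "Java B", "Java", LocalDateTime.of(2025, 2, 1, 9, 0)),
            new BlogPost(3L, "Spring A", "Spring", LocalDateTime.of(2025, 2, 15, 9, 0)),
            new BlogPost(4L, "Java C", "Java", LocalDateTime.of(2025, 3, 1, 9, 0)),
            new BlogPost(5L, "Java D", "Java", LocalDateTime.of(2025, 3, 15, 9, 0)));
    }

    static List<BlogPost> recentPostsByCategory(List<BlogPost> posts, String category) {
        return posts.stream()
            // Intermediate operation: keep only posts in the requested category.
            .filter(post -> post.category().equals(category))
            // Intermediate operation: newest first.
            .sorted(Comparator.comparing(BlogPost::publishedDate).reversed())
            // Intermediate operation: at most three posts.
            .limit(3)
            // Terminal operation: collect into an unmodifiable list.
            .toList();
    }

    public static void main(String[] args) {
        System.out.println("\nPosts by category: Java");
        recentPostsByCategory(samplePosts(), "Java").forEach(System.out::println);
    }
}
```

With this sample data the Java category has four posts, so the oldest one ("Java A") is the one that limit(3) drops.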
We see our posts by category Java. And then we see our three posts here. If we went and looked through that collection, these would be the latest three posts. So great. So now what I want to do is take a look at something a little bit more complex, and how we might have solved this problem before JDK 24 and stream gatherers. So let's do one more of these and we'll say this is going to be before JDK 24. So before JDK 24, let's say nested collections, right? So we'll create a new method. We'll call this nestedCollectors and we'll pass in our posts. Okay, we don't have that method defined yet.
So we'll go ahead and create that method. And here we are. So now let's talk about how we might have solved a slightly more complex problem prior to JDK 24. And I'm going to go ahead and paste a couple of things in here. But prior to JDK 24, what if I wanted to group by category, order by published date, and limit to the three most recent posts? So we've kind of done this here, but what I want is all of the blog posts. So for everything in that collection, I want to group by category, and for each category, list the latest three blog posts. So this is a little bit more difficult prior to JDK 24.
I want to, again, start with a stream, but I can't use an intermediate operation here. If I filter, I'm just filtering by one category. So what I'm ultimately doing here is using a terminal operation. So I'm going to say collect. And what I'm going to do with collect is use one of the collectors. So there's this Collectors class. I'm not actually using that one, but if you look at this, the Collectors class has a whole bunch of collectors that implement various useful reduction operations, such as accumulating elements into collections, summarizing elements according to various criteria, etc. So really, this is a terminal operation. It's not meant to be a filtering operation or an intermediate operation, right? But in this case, we kind of have to use it, because we want to group by.
So what I want to do is group by the blog post category, right? And then I'm going to apply another collector. So I'm going to say Collectors.collectingAndThen. So what are we going to do after that? We are going to collect the posts into a list. So let me just comment this. So we have first, group all posts by category, just so we can follow along here. So now I want to collect the posts into a list. So I'll say Collectors.toList. So we're collecting those into a list. And then transform each list by sorting and limiting.
Okay. So now what I'm going to say is I have a list of blog posts. We'll say these are categoryPosts. And with these, now I'm going to stream these. So I'm going to say categoryPosts.stream. Right. And, oops, did I miss something here? Sorry. This is implicit, right? So I'm saying stream. And now what I want to do is sort them the same way that we were doing before. So comparing using the blog post published date, right? And then reversed. And that will give us that sort. Then I want to limit to the latest three. And then I want to go ahead and turn that into a list.
All right. So again, if we're following along, first we group all of the posts by category, and we're using Collectors.groupingBy to do that. Then we're going to say Collectors.collectingAndThen. Now we collect those posts into a list. So for each category, we collect them into a list. Then we can go ahead and say, okay, sort these by the published date, limit to three, and then turn that into a list. So ultimately, we're going to get something back from this. This is going to be a map of string, so the category, to a list of blog posts. So I'll say recentPostsByCategory. Okay. So that's going to give us what we need there.
Now, I've written a little utility function to print these out because we're going to do this a couple of times. So I'm going to put this down here. And all this does is take those recent posts. It says, hey, recent posts by category. For each category, we're going to print out the category and then print out the list of blog posts. So that means I should be able to come up here and just say printRecentPostsByCategory and pass in those recent posts, right? Cool.
So now, let's do this. So now we're saying before JDK 24, nested collections, and we should be able to run that. I'm going to comment this out just so we don't run that one as well. And we're going to run this application. And we should get what we expected, which was, here's the category, here are the three recent posts. Here's the category. Here are the three recent posts. Now, these aren't alphabetized. We could also introduce some code to do that. But for this example, I'm not going to. So that is one way of doing it.
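The nested-collector approach just described can be sketched like this. The trimmed record, the post helper, and the sample data are assumptions made for the example; the print helper only approximates the utility mentioned in the talk.

```java
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedCollectorsSketch {

    record BlogPost(String title, String category, LocalDateTime publishedDate) {}

    // Hypothetical helper that builds a post published on the given day of March 2025.
    static BlogPost post(String title, String category, int day) {
        return new BlogPost(title, category, LocalDateTime.of(2025, 3, day, 9, 0));
    }

    static List<BlogPost> samplePosts() {
        return List.of(
            post("Java 1", "Java", 1), post("Java 2", "Java", 2),
            post("Java 3", "Java", 3), post("Java 4", "Java", 4),
            post("Spring 1", "Spring", 5), post("Spring 2", "Spring", 6));
    }

    static Map<String, List<BlogPost>> recentPostsByCategory(List<BlogPost> posts) {
        return posts.stream()
            // First, group all posts by category (a terminal operation).
            .collect(Collectors.groupingBy(
                BlogPost::category,
                // Collect each group into a list, then transform that list.
                Collectors.collectingAndThen(
                    Collectors.toList(),
                    categoryPosts -> categoryPosts.stream()
                        .sorted(Comparator.comparing(BlogPost::publishedDate).reversed())
                        .limit(3)
                        .toList())));
    }

    // Prints each category followed by its posts; map iteration order is not sorted.
    static void printRecentPostsByCategory(Map<String, List<BlogPost>> recentPosts) {
        recentPosts.forEach((category, posts) -> {
            System.out.println("Category: " + category);
            posts.forEach(post -> System.out.println("  " + post.title()));
        });
    }

    public static void main(String[] args) {
        printRecentPostsByCategory(recentPostsByCategory(samplePosts()));
    }
}
```

Note how the sorting and limiting logic ends up buried inside the finisher of a collector, which is the readability cost the talk is pointing at.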
I'm going to paste in another way, so we don't have to talk through this, and then we'll get to the gatherer stuff. But another way you might solve this is using map, then transform. But again, it's not an intermediate operation, right? First, we are grouping by category. So we are running a terminal operation to collect all of those by the category. Then we convert them to a stream of map entries. Then we convert each entry to a new entry with sorted and limited values. So again, there are ways to do this prior to JDK 24, but they aren't the best way. And these are very specific to this particular instance. So if I wanted to write a group-by-category intermediate operation, that really wouldn't work. There's a spelling mistake here, but that's okay.
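The pasted-in alternative is not shown in the transcript, so the following is only a guess at its shape, built from the three steps described: group by category, stream the map entries, then map each entry to a new entry with a sorted and limited list. The record, helper, and sample data are invented.

```java
import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MapThenTransformSketch {

    record BlogPost(String title, String category, LocalDateTime publishedDate) {}

    // Hypothetical helper that builds a post published on the given day of March 2025.
    static BlogPost post(String title, String category, int day) {
        return new BlogPost(title, category, LocalDateTime.of(2025, 3, day, 9, 0));
    }

    static List<BlogPost> samplePosts() {
        return List.of(
            post("Java 1", "Java", 1), post("Java 2", "Java", 2),
            post("Java 3", "Java", 3), post("Java 4", "Java", 4),
            post("Spring 1", "Spring", 5));
    }

    static Map<String, List<BlogPost>> recentPostsByCategory(List<BlogPost> posts) {
        return posts.stream()
            // Step 1: terminal operation that groups every post by category.
            .collect(Collectors.groupingBy(BlogPost::category))
            // Step 2: start a second stream over the map entries.
            .entrySet().stream()
            // Step 3: replace each entry with one holding a sorted, limited list.
            .map(entry -> Map.entry(
                entry.getKey(),
                entry.getValue().stream()
                    .sorted(Comparator.comparing(BlogPost::publishedDate).reversed())
                    .limit(3)
                    .toList()))
            // Collect the transformed entries back into a map.
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        recentPostsByCategory(samplePosts()).forEach((category, posts) ->
            System.out.println(category + " -> " + posts.stream().map(BlogPost::title).toList()));
    }
}
```

This version needs two stream pipelines and an intermediate map, and the grouping logic still cannot be dropped into the middle of another pipeline.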
So back in the blog post, you'll see I have some reasons why this isn't the greatest approach. One, readability is not all that great. Two, you can't reuse these. Three, you can't use these in parallel streams. So there are some downsides to this. But if you're in a bind and you're on something less than JDK 24, it is possible. So now I want to introduce stream gatherers. And before we write our own stream gatherer to accomplish this same functionality, I want to take a look at some of the built-in gatherers that come with JDK 24. And to do that, I'm going to paste some in. We're not going to walk through all of these, but there are some examples here.
So the biggest thing is that when we start using these intermediate operations in JDK 24, there's now a gather call, a gather method. And this gather method, we can see, takes something called a Gatherer. And this is the new interface in JDK 24. For the Gatherer, if we go back up here, the documentation says libraries that implement transformations based on Gatherer, such as Stream.gather(Gatherer), must adhere to the following constraints. So there are some things to look through in the documentation if you want to take a deeper dive into that. But there is a class here called Gatherers, and this introduces some useful gatherers that weren't there before. So things like windowFixed. You can see windowFixed returns a gatherer that gathers elements into windows, encounter-ordered groups of elements, of a fixed size. So there are things like that. And actually, if we go down to the structure here, you can see there's windowSliding, fold, scan, mapConcurrent. So there are some useful gatherers that are a part of JDK 24.
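To make two of those built-in gatherers concrete, here is a small sketch on plain integers rather than blog posts. The class and method names are invented for the example; Gatherers.windowFixed and Gatherers.windowSliding are the JDK 24 methods named above.

```java
import java.util.List;
import java.util.stream.Gatherers;
import java.util.stream.Stream;

class BuiltInGatherersSketch {

    // Fixed windows: consecutive, non-overlapping groups; the last one may be smaller.
    static List<List<Integer>> fixedWindows() {
        return Stream.of(1, 2, 3, 4, 5, 6, 7, 8)
            .gather(Gatherers.windowFixed(3))
            .toList();
    }

    // Sliding windows: each window drops the oldest element and adds the next one.
    static List<List<Integer>> slidingWindows() {
        return Stream.of(1, 2, 3, 4)
            .gather(Gatherers.windowSliding(2))
            .toList();
    }

    public static void main(String[] args) {
        System.out.println(fixedWindows());   // [[1, 2, 3], [4, 5, 6], [7, 8]]
        System.out.println(slidingWindows()); // [[1, 2], [2, 3], [3, 4]]
    }
}
```

Both are passed to gather in the middle of a pipeline, so further intermediate operations could follow them before the terminal toList.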
And so here's an example. If I wanted a fixed-window example, I can pass my blog posts in. I can stream this. I can limit it to nine. So I only want nine posts. But I want a windowFixed of three. So I only want three of these at a time. And then what I can do is forEach this out and say, here is the batch. So let's take a look at this example. And I'll just paste this in here and send our posts in. Whoops. And then we can go ahead and run this. And we see the posts in batches of three. So here's the first batch. Here's the second batch. Here's the third batch. And again, these may not be something you need, but there are a bunch of useful operations in this Gatherers class that now you can just pass to the gather method. So that is one use of JDK 24.
But what I want to take a look at here is being able to solve the problem that we just did before, which is kind of group by with limit. So here, I want to say public static void groupByWithLimit. And again, this is going to take in a list of blog posts. I actually don't need a limit here. We could pass in a limit. Actually, let's just leave it there. So now what I want to do is use a custom gatherer to create a recent posts by category view.
All right. So we're going to start with posts.stream the same way that we did before. Next, we're going to pass in a gatherer. We'll come back to that in a second. And then once that gatherer is done, we'll use a terminal operation to collect. And in this case, I'm going to use Collectors.toMap so that I can say, hey, I want Map.Entry::getKey. That will be the key. And then I want Map.Entry::getValue. That will be the value. And that will return what we want. So now the question becomes, actually, let's just turn this into a variable. And this is going to be a map.
We'll come back to that. This is recentPostsByCategory. Right. And the key is actually going to be a string. So similar to the one that we pasted in above, we are going to get a string and a list of blog posts. Right. So that will be that. So now comes the challenge: we need a gatherer.
So again, we have those gatherers in java.util.stream.Gatherers, but that's not going to work for this one. I need to write my own gatherer. So I'll do it similar to the way this class does it. We have a private Gatherers constructor. This class is not intended to be instantiated. You have these static methods that will take in something and then, in this case, return something. This is a generic type here. So if we look at the Gatherer interface, this is where understanding how to create a gatherer really comes into play. So there's a bunch of documentation here. Again, this is an intermediate operation that transforms a stream of input elements into a stream of output elements.
And the way that this works is by taking in the type of the input elements to the gather operation, the potentially mutable state type of the gather operation, and the type of the output elements we are going to return from it. So type, mutable state, and return. So that is the Gatherer interface and that is kind of what we are going to do here. So I'm going to follow that pattern and I'm going to create a new Java class and I'm going to call this BlogGatherers, right? And so this makes it reusable. This is going to be a final class. And again, just like the one before, we have a private BlogGatherers constructor, because this class is not meant to be instantiated. Okay? So now what I want to do is create some functions in here. And again, these are related, so I put them in one class, but now anywhere in my application that I need to use this functionality, I can just import these static functions and use them. So I'm just going to copy and paste this one in. Instead of walking through each line of code, we'll kind of talk through it.
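Before the real gatherer, a much smaller one may help show how the three type parameters and the utility-class pattern fit together. This everyNth gatherer is purely hypothetical and is not from the talk or the repo; it uses Gatherer.ofSequential with an initializer and an integrator, and assumes n is positive.

```java
import java.util.stream.Gatherer;
import java.util.stream.Stream;

final class BlogGatherers {

    private BlogGatherers() {
        // Utility class: not meant to be instantiated.
    }

    // Gatherer<T, A, R>: T = input element type, A = mutable state type, R = output element type.
    // Hypothetical example: T and R are the same, and the state is a one-element counter array.
    static <T> Gatherer<T, int[], T> everyNth(int n) {
        return Gatherer.ofSequential(
            // Initializer: creates fresh state for each stream evaluation.
            () -> new int[1],
            // Integrator: called once per element with the state and the downstream.
            (counter, element, downstream) -> {
                counter[0]++;
                if (counter[0] % n == 0) {
                    // push returns false when downstream wants no more elements.
                    return downstream.push(element);
                }
                return true; // keep consuming upstream elements
            });
    }

    public static void main(String[] args) {
        System.out.println(
            Stream.of("a", "b", "c", "d", "e", "f").gather(everyNth(2)).toList()); // [b, d, f]
    }
}
```

A caller only sees gather(everyNth(2)); the state handling stays hidden inside the utility class, which is the same shape the blog gatherer follows.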
So here is a static function. Again, this is going to return a gatherer. And again, if you're not used to generics, don't get too scared by this. There's a little bit going on. But this, at the end of the day, gives us a bunch of pros, right? In our main class, when we want to use this, it's much more readable. Now, I understand, trying to write this function isn't as readable, but once you start to get the hang of how to create these gatherers, it's pretty easy.
Remember that the Gatherer interface defines a type, a mutable state, and a return. So in this case, we are going to be working on blog posts. The mutable state here is going to be a map from each key to a list of blog posts. And then the return type is going to be a map entry of K. So the K is basically the category at this point. So a category and a list of blog posts. So this is called groupByWithLimit. It takes three arguments here: the key extractor, the limit, and the comparator. So this is what we will pass to this particular function. So what we're going to do is return a Gatherer.of. And what we are doing is initializing with an empty map to store our grouped items.
So we're saying, hey, we have a HashMap. Again, it's a category and a list of posts. Then we're going to process each blog post that comes in. So as we process each blog post, we get the key for the blog post. In this case, the category. So we get the category. And then we add this post to its group, creating the group if it's needed. So if we don't have the group already created, say Java is the category, then we create that group and we add that post. And then we continue processing the stream. Then we have a combiner. This is for parallel streams. We just use the first map in this simple use case. When all posts have been processed, we emit the results downstream.
So we are saying, hey, for each key, so each category, and its posts, we are going to sort those. Again, this should look pretty familiar. This is what we've kind of done before. Here's the comparator. Here's the limit. And this is what becomes really reusable, right? We can use this groupByWithLimit however we want. If we wanted to group by something else, or if we wanted to limit to, say, 10, or maybe get all of them by published date ascending instead of descending, this becomes very reusable. And then we just push those entries downstream.
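The four pieces just described (initializer, integrator, combiner, finisher) could be put together along these lines. This is a sketch, not the code from the repo: the trimmed record, the post helper, the sample data, and the recentPostsByCategory wrapper are assumptions, keys are assumed to be non-null, and the combiner here merges both maps instead of keeping only the first one as described in the talk.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Gatherer;

class GroupByWithLimitSketch {

    record BlogPost(String title, String category, LocalDateTime publishedDate) {}

    static <K> Gatherer<BlogPost, Map<K, List<BlogPost>>, Map.Entry<K, List<BlogPost>>> groupByWithLimit(
            Function<? super BlogPost, ? extends K> keyExtractor,
            int limit,
            Comparator<? super BlogPost> comparator) {
        return Gatherer.of(
            // Initializer: an empty map that will hold the grouped posts.
            HashMap<K, List<BlogPost>>::new,
            // Integrator: add each post to its group, creating the group if needed.
            (groups, post, downstream) -> {
                K key = keyExtractor.apply(post);
                groups.computeIfAbsent(key, k -> new ArrayList<>()).add(post);
                return true; // keep processing the stream
            },
            // Combiner (parallel streams only): merge the second map into the first.
            (left, right) -> {
                right.forEach((key, posts) ->
                    left.computeIfAbsent(key, k -> new ArrayList<>()).addAll(posts));
                return left;
            },
            // Finisher: sort and limit each group, then push one entry per key downstream.
            (groups, downstream) -> groups.forEach((key, posts) -> {
                List<BlogPost> limited = posts.stream().sorted(comparator).limit(limit).toList();
                downstream.push(Map.entry(key, limited));
            }));
    }

    // Hypothetical helper and sample data.
    static BlogPost post(String title, String category, int day) {
        return new BlogPost(title, category, LocalDateTime.of(2025, 3, day, 9, 0));
    }

    static List<BlogPost> samplePosts() {
        return List.of(
            post("Java 1", "Java", 1), post("Java 2", "Java", 2),
            post("Java 3", "Java", 3), post("Java 4", "Java", 4),
            post("Spring 1", "Spring", 5), post("Spring 2", "Spring", 6));
    }

    static Map<String, List<BlogPost>> recentPostsByCategory(List<BlogPost> posts, int limit) {
        return posts.stream()
            .gather(groupByWithLimit(
                BlogPost::category,
                limit,
                Comparator.comparing(BlogPost::publishedDate).reversed()))
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        recentPostsByCategory(samplePosts(), 2).forEach((category, posts) ->
            System.out.println(category + " -> " + posts.stream().map(BlogPost::title).toList()));
    }
}
```

The call site reads as group by category, keep two, newest first, and the same gatherer could be reused with a different key extractor, limit, or comparator. Because the state is a HashMap, the order in which categories are emitted is unspecified.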
So I get that this can be a little confusing, but this becomes so much more usable. And once this is in place, once we have this class of all these, like, helpful gatherers in our application, they become very easy to use. So now if we come back over to our main class here, we just want to go ahead and take advantage of that.
So now what I can do is I can say BlogGatherers, because again, that is a class that we are not instantiating. And I can call that static function groupByWithLimit. So I have to pass a couple of things there, right? The first one was, how are we grouping these? In this case, it's by category.
So we'll say group by category. Now I'm going to set the limit. So let's just say, so it's different from the one before, that I pass in a limit of two. And then the comparator. Again, this should look pretty familiar.
We're saying, hey, order these by published date, reversed. And that means I want the newest first, right? So now this becomes much more readable. I can look at this and go, oh, okay, we are grouping by with a limit. That makes sense. I could even do some different things here.
So this becomes, for me, much more readable. And again, we can just call printRecentPostsByCategory. And if we come back up here and call groupByWithLimit. And oh, yeah, we wanted to make this a little bit more flexible. So let's say limit here. Okay? So now let's go ahead and run this again.
And now I have my category. And this time, just two of the posts by published date. So this is really cool. Again, let's paste some code in here. In the final repo, there'll be a whole bunch more of these.
So now I've pasted in a bunch more. What if I wanted to get related posts? I can use this gatherer to process that stream and get a bunch of related posts. Or, that one's actually a helper method, if I wanted to extract tags, or calculate a reading time, or get popular authors, or get a monthly archive, et cetera, et cetera. So there are a bunch of gatherers that we can now write, and we have them all in this BlogGatherers class that we can go ahead and take advantage of.
So I think that's all for today. I just wanted to talk through JDK 24 stream gatherers and how we've used streams in the past. Again, we looked at this example of how we might do that.
And I think we had an original example of, yeah. So in the past, we've used streams with these intermediate operations, and used a terminal operation to collect the results back. But then when we had some more complex examples, how did we do it? We kind of had to hack together these solutions to satisfy those requirements. And now we have a method, gather, that allows us to write our own intermediate operations. No matter how simple or complex they may be, we have this option to go ahead and introduce our own functionality, which is really exciting. So I think that's it for today.
If you are excited about stream gatherers, let me know in the comments below. Let me know what types of gatherers you're writing. Oh, and I didn't mention this, and maybe I'll see if I can find a link and put it in the description below. But there are already collections of custom gatherers out there, repositories of, hey, here's a bunch of custom gatherers that might be really useful to you. So in some cases, you may not even have to write these gatherers. You can just pull them in, which is exciting. So hey, I had a lot of fun putting this together. If you learned something new today, do me a big favor, friends. Leave me a thumbs up, subscribe to the channel, and as always, happy coding.
2025-04-01 20:25