On the Collectors used in the Stream API


Introduction to Collectors
Stream.collect() is an extremely useful terminal operation that transforms the elements of a stream into a different kind of result, such as a List, a Set or a Map.
Java supports various built-in collectors via the Collectors class.
A very common use case is to convert a list into another list, after filtering the elements.
List<Integer> listOfNumbers = Arrays.asList(1, 2, 3, 4, 5);
List<Integer> listOfEvenNumbers = listOfNumbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(Collectors.toList());
Formal definition
The official Javadoc for java.util.stream.Collector defines it as follows:
A Collector is a mutable reduction operation that accumulates input elements into a mutable result container, optionally transforming the accumulated result into a final representation after all input elements have been processed.
What is a mutable reduction operation?
A mutable reduction operation, such as Stream.collect(), accumulates the stream elements into a mutable result container as it processes them. An immutable reduction, such as Stream.reduce(), instead combines values without modifying them, so when the result is a container it has to create a new instance of that container at every intermediate step of the reduction. This degrades performance.
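As a rough illustration of the difference, the sketch below builds the same list twice: once with the three-argument Stream.collect(), which mutates a single ArrayList, and once with Stream.reduce(), which has to copy the partial result at every step. The class and method names (ReductionComparison, withCollect, withReduce) are made up for this example, and it is not meant as a benchmark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the original article.
class ReductionComparison {

    // Mutable reduction: one ArrayList per partial result, mutated in place.
    static List<Integer> withCollect(List<Integer> numbers) {
        return numbers.stream()
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    // Immutable-style reduction: the partial result is never modified,
    // so every step allocates a fresh copy of the list.
    static List<Integer> withReduce(List<Integer> numbers) {
        List<Integer> identity = Collections.emptyList();
        return numbers.stream()
                .reduce(
                        identity,
                        (acc, n) -> {
                            List<Integer> copy = new ArrayList<>(acc);
                            copy.add(n);
                            return copy;
                        },
                        (left, right) -> {
                            List<Integer> merged = new ArrayList<>(left);
                            merged.addAll(right);
                            return merged;
                        });
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3);
        System.out.println(withCollect(numbers)); // prints [1, 2, 3]
        System.out.println(withReduce(numbers));  // prints [1, 2, 3]
    }
}
```

Both methods return the same elements; the difference is only in how many intermediate lists are allocated along the way.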
How a Collector works
As explained above, a collector collects the elements of a stream into a mutable container. There are five steps to this process:
Step 1 - Supplier provides the empty mutable result container. It is an instance of the Supplier functional interface, which creates the container (often a Collection or a Map) that holds the collected elements.
Step 2 - Accumulator adds individual elements to the result container. It is an instance of the BiConsumer functional interface. It adds each stream element it encounters to the result container. This step is known as a fold in functional programming parlance.
Step 3 - Combiner combines two partial results. It is an instance of the BinaryOperator functional interface, which merges two partial results produced by separate groups of accumulations done in parallel.
Step 4 - Optional Finisher puts the processed elements in the desired form. It is an instance of the Function interface. If required, a finisher maps the accumulated result container to a different final form.
Step 5 - Final Result. The collector returns the final result: either the result container itself or the value produced by the finisher.
A Collector consists of four different operations: a supplier, an accumulator, a combiner and a finisher.
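The steps above can be sketched by hand, without a stream, to make each role visible. This is only an illustration under assumptions: the class CollectorSteps and the method collectManually are hypothetical, and the two input lists stand in for the chunks a parallel stream might process separately.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical demo class, not part of the original article.
class CollectorSteps {

    static List<String> collectManually(List<String> firstChunk, List<String> secondChunk) {
        Supplier<List<String>> supplier = ArrayList::new;
        BiConsumer<List<String>, String> accumulator = List::add;
        BinaryOperator<List<String>> combiner = (left, right) -> {
            left.addAll(right);
            return left;
        };
        Function<List<String>, List<String>> finisher = Collections::unmodifiableList;

        // Steps 1 and 2: each chunk gets its own container and is accumulated into it.
        List<String> partialA = supplier.get();
        firstChunk.forEach(s -> accumulator.accept(partialA, s));
        List<String> partialB = supplier.get();
        secondChunk.forEach(s -> accumulator.accept(partialB, s));

        // Step 3: the two partial results are merged into one.
        List<String> combined = combiner.apply(partialA, partialB);

        // Steps 4 and 5: the finisher produces the final result, which is returned.
        return finisher.apply(combined);
    }

    public static void main(String[] args) {
        System.out.println(collectManually(Arrays.asList("a", "b"), Arrays.asList("c", "d")));
        // prints [a, b, c, d]
    }
}
```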
The Collector Interface
It is defined as follows:
public interface Collector<T, A, R>
T is the element type being processed by the Stream.
A is the type of the accumulated result container which keeps on getting elements (of type T) added throughout the collecting process.
R is the type of the final result returned by the collector, which may be the container itself or a value derived from it by the finisher.
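To see the three type parameters take different types, here is a minimal sketch of an averaging collector where T is Integer, A is an int[] holding a running sum and count, and R is Double. The names TypeParametersExample, averaging and average are invented for this illustration; the JDK already offers Collectors.averagingInt() for real use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

// Hypothetical demo class, not part of the original article.
class TypeParametersExample {

    // T = Integer (stream element), A = int[] {sum, count}, R = Double (final result).
    static Collector<Integer, int[], Double> averaging() {
        return Collector.of(
                () -> new int[2],                                  // supplier: {sum, count}
                (acc, n) -> { acc[0] += n; acc[1]++; },            // accumulator
                (left, right) -> {                                 // combiner
                    left[0] += right[0];
                    left[1] += right[1];
                    return left;
                },
                acc -> acc[1] == 0 ? 0.0 : (double) acc[0] / acc[1]); // finisher
    }

    static double average(List<Integer> numbers) {
        return numbers.stream().collect(averaging());
    }

    public static void main(String[] args) {
        System.out.println(average(Arrays.asList(1, 2, 3, 4, 5))); // prints 3.0
    }
}
```

Returning 0.0 for an empty stream is a choice made for this sketch.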
Predefined Collectors
Java provides static methods in the java.util.stream.Collectors class for the most common mutable reduction operations.
joining()
To convert a list of numbers to a string, separated by commas:
String numbersJoined = listOfNumbers.stream()
    .map(Object::toString)
    .collect(Collectors.joining(", "));
summingInt()
To sum the elements of a list:
int total = listOfNumbers.stream()
    .collect(Collectors.summingInt(Integer::intValue));
groupingBy() and partitioningBy()
Aggregating the elements of a stream by some key is quite common. For example, to group employees by department:
Map<String, List<Employee>> employeesByDepartment =
    listOfEmployees.stream()
        .collect(Collectors.groupingBy(Employee::getDepartment));
Partitioning is another common collector, which splits the elements into two groups based on a condition:
Map<Boolean, List<Employee>> partitionedEmployees =
    listOfEmployees.stream()
        // equals() compares values; == would compare object references
        .collect(Collectors.partitioningBy(e -> CHOSEN_DEPARTMENT.equals(e.getDepartment())));
// To get only the employees in the chosen department:
List<Employee> employeesInChosenDepartment = partitionedEmployees.get(true);
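The snippets above rely on an Employee type and a CHOSEN_DEPARTMENT constant that the article does not define. The following self-contained sketch fills them in with assumed definitions (a two-field Employee class, a String constant and sample data, all hypothetical) and also shows that both collectors accept a downstream collector.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original article.
class EmployeeGroupingExample {

    // Assumed shape of Employee; the article does not show its definition.
    static class Employee {
        private final String name;
        private final String department;

        Employee(String name, String department) {
            this.name = name;
            this.department = department;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
    }

    // Assumed to be a String, matching the Map<String, ...> key type used above.
    static final String CHOSEN_DEPARTMENT = "Engineering";

    static final List<Employee> EMPLOYEES = Arrays.asList(
            new Employee("Ana", "Engineering"),
            new Employee("Luis", "Sales"),
            new Employee("Marta", "Engineering"));

    // groupingBy with a downstream collector: keep only the names per department.
    static Map<String, List<String>> namesByDepartment() {
        return EMPLOYEES.stream()
                .collect(Collectors.groupingBy(
                        Employee::getDepartment,
                        Collectors.mapping(Employee::getName, Collectors.toList())));
    }

    // partitioningBy with a downstream collector: count each side of the condition.
    static Map<Boolean, Long> countByChosenDepartment() {
        return EMPLOYEES.stream()
                .collect(Collectors.partitioningBy(
                        e -> CHOSEN_DEPARTMENT.equals(e.getDepartment()),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(namesByDepartment().get("Engineering")); // prints [Ana, Marta]
        System.out.println(countByChosenDepartment().get(true));    // prints 2
    }
}
```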
count()
To count the items in a list that satisfy a condition, the Stream.count() terminal operation is enough; no collector is needed:
long countEvenNumbers = listOfNumbers.stream()
    .filter(n -> n % 2 == 0)
    .count();
Other predefined collectors are toSet(), toMap(), maxBy(), minBy(), reducing(), averagingInt(), counting() and mapping().
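A few of these can be tried out on a small list of words. The sketch below is only illustrative: the class OtherCollectorsExample, its method names and the sample data are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original article.
class OtherCollectorsExample {

    static final List<String> WORDS =
            Arrays.asList("apple", "banana", "cherry", "avocado", "blueberry");

    // toSet(): distinct word lengths.
    static Set<Integer> lengths() {
        return WORDS.stream().map(String::length).collect(Collectors.toSet());
    }

    // toMap(): word -> length (the keys must be unique, as they are here).
    static Map<String, Integer> lengthByWord() {
        return WORDS.stream()
                .collect(Collectors.toMap(Function.identity(), String::length));
    }

    // counting() as a downstream collector: number of words per initial letter.
    static Map<Character, Long> countByInitial() {
        return WORDS.stream()
                .collect(Collectors.groupingBy(w -> w.charAt(0), Collectors.counting()));
    }

    // maxBy(): the longest word, wrapped in an Optional because the stream may be empty.
    static Optional<String> longest() {
        return WORDS.stream()
                .collect(Collectors.maxBy(Comparator.comparingInt(String::length)));
    }

    // averagingInt(): the average word length.
    static double averageLength() {
        return WORDS.stream().collect(Collectors.averagingInt(String::length));
    }

    public static void main(String[] args) {
        System.out.println(lengths());
        System.out.println(lengthByWord());
        System.out.println(countByInitial());
        System.out.println(longest());
        System.out.println(averageLength());
    }
}
```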
Custom Collectors
A new collector is created by using the Collector.of() method. It receives four parameters: a supplier, an accumulator, a combiner and a finisher.
For example, let's create a collector that joins the elements into a single string, separated by commas:
Collector<Integer, StringJoiner, String> numberJoinerCollector =
    Collector.of(
        () -> new StringJoiner(", "),          // supplier
        (sj, num) -> sj.add(num.toString()),   // accumulator
        StringJoiner::merge,                   // combiner
        StringJoiner::toString);               // finisher
String result = listOfNumbers
    .stream()
    .collect(numberJoinerCollector);
Since strings in Java are immutable, a helper class like StringJoiner is needed to let the collector construct a string. The combiner knows how to merge two StringJoiners into one.
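The combiner only comes into play when partial results have to be merged, typically in a parallel stream. As a small check, the sketch below wraps the same collector in a hypothetical NumberJoinerDemo class (the class and method names are assumptions) and runs it both sequentially and in parallel. Because the source list is ordered, both runs should produce the same string.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collector;

// Hypothetical demo class, not part of the original article.
class NumberJoinerDemo {

    static Collector<Integer, StringJoiner, String> numberJoiner() {
        return Collector.of(
                () -> new StringJoiner(", "),          // supplier
                (sj, num) -> sj.add(num.toString()),   // accumulator
                StringJoiner::merge,                   // combiner
                StringJoiner::toString);               // finisher
    }

    static String joinSequential(List<Integer> numbers) {
        return numbers.stream().collect(numberJoiner());
    }

    // In a parallel stream, each thread fills its own StringJoiner
    // and the combiner merges them in encounter order.
    static String joinParallel(List<Integer> numbers) {
        return numbers.parallelStream().collect(numberJoiner());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(joinSequential(numbers)); // prints 1, 2, 3, 4, 5
        System.out.println(joinParallel(numbers));   // prints 1, 2, 3, 4, 5
    }
}
```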
Conclusion
The definition of collectors was explained in detail in this article. After that, the most common predefined collectors were introduced with examples. Finally, how to create custom collectors, using the Collector<T, A, R> interface, was shown.
Written by

José Ramón (JR)
Software Engineer for quite a few years. From C programmer to Java web programmer. Very interested in automated testing and functional programming.