Guide to Java 8 groupingBy Collector

1. Introduction

In this tutorial, we'll see how the groupingBy collector works using various examples.

To understand the material covered in this tutorial, we'll need a basic knowledge of Java 8 features. We can have a look at the intro to Java 8 Streams and the guide to Java 8's Collectors for these basics.

2. GroupingBy Collectors

The Java 8 Stream API lets us process collections of data in a declarative way.

The static factory methods Collectors.groupingBy() and Collectors.groupingByConcurrent() provide us with functionality similar to the 'GROUP BY' clause in the SQL language. We use them for grouping objects by some property and storing the results in a Map instance.

The overloaded methods of groupingBy are:

  • First, with a classification function as the method parameter:

static <T, K> Collector<T, ?, Map<K, List<T>>>
  groupingBy(Function<? super T, ? extends K> classifier)
    
  • Second, with a classification function and a second collector as method parameters:

static <T, K, A, D> Collector<T, ?, Map<K, D>>
  groupingBy(Function<? super T, ? extends K> classifier,
    Collector<? super T, A, D> downstream)
    
  • Finally, with a classification function, a supplier method (that provides the Map implementation which contains the end result), and a second collector as method parameters:

static <T, K, D, A, M extends Map<K, D>> Collector<T, ?, M>
  groupingBy(Function<? super T, ? extends K> classifier,
    Supplier<M> mapFactory, Collector<? super T, A, D> downstream)
    

2.1. Example Code Setup

To demonstrate the usage of groupingBy(), let's define a BlogPost class (we will use a stream of BlogPost objects):

class BlogPost { String title; String author; BlogPostType type; int likes; } 

Next, the BlogPostType:

enum BlogPostType { NEWS, REVIEW, GUIDE } 

Then the List of BlogPost objects:

List<BlogPost> posts = Arrays.asList( ... );

Let's also define a Tuple class that will be used to group posts by the combination of their type and author attributes:

class Tuple { BlogPostType type; String author; } 

2.2. Simple Grouping by a Single Column

Let's start with the simplest groupingBy method, which only takes a classification function as its parameter. The classification function is applied to each element of the stream. We use the value returned by the function as a key to the map that we get from the groupingBy collector.

To group the blog posts in the blog post list by their type:

Map<BlogPostType, List<BlogPost>> postsPerType = posts.stream()
  .collect(groupingBy(BlogPost::getType));
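
The one-liner above can be tried out with a minimal, self-contained sketch. The BlogPost constructor and getters, the sample data, and the class name SimpleGroupingDemo are assumptions made for illustration; they are not part of the original setup code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SimpleGroupingDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        String getTitle() { return title; }
        String getAuthor() { return author; }
        BlogPostType getType() { return type; }
        int getLikes() { return likes; }
    }

    // Hypothetical sample data, made up for this sketch.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    // The classifier's return value becomes the map key.
    static Map<BlogPostType, List<BlogPost>> groupByType(List<BlogPost> posts) {
        return posts.stream().collect(Collectors.groupingBy(BlogPost::getType));
    }

    public static void main(String[] args) {
        Map<BlogPostType, List<BlogPost>> byType = groupByType(samplePosts());
        System.out.println(byType.get(BlogPostType.GUIDE).size());    // prints 2
        System.out.println(byType.containsKey(BlogPostType.REVIEW));  // prints false
    }
}
```

Note that a key only appears in the result if at least one element maps to it, so the unused REVIEW type has no entry at all rather than an empty list.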
    

2.3. groupingBy with a Complex Map Key Type

The classification function is not limited to returning only a scalar or String value. The key of the resulting map can be any object, as long as we make sure that we implement the necessary equals and hashCode methods.

To group the blog posts in the list by the type and author combined in a Tuple instance:

Map<Tuple, List<BlogPost>> postsPerTypeAndAuthor = posts.stream()
  .collect(groupingBy(post -> new Tuple(post.getType(), post.getAuthor())));
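
Because the key is a custom object, the grouping only works as intended if two Tuple instances with the same type and author compare as equal. The following sketch shows one possible equals and hashCode implementation; the Tuple constructor, the BlogPost constructor and getters, the sample data, and the class name TupleKeyDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

public class TupleKeyDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Composite key; equals and hashCode are based on both fields.
    static class Tuple {
        private final BlogPostType type;
        private final String author;

        Tuple(BlogPostType type, String author) {
            this.type = type;
            this.author = author;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Tuple)) return false;
            Tuple other = (Tuple) o;
            return type == other.type && Objects.equals(author, other.author);
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, author);
        }
    }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        String getAuthor() { return author; }
        BlogPostType getType() { return type; }
    }

    // Hypothetical sample data, made up for this sketch.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    static Map<Tuple, List<BlogPost>> groupByTypeAndAuthor(List<BlogPost> posts) {
        return posts.stream()
            .collect(Collectors.groupingBy(p -> new Tuple(p.getType(), p.getAuthor())));
    }

    public static void main(String[] args) {
        Map<Tuple, List<BlogPost>> grouped = groupByTypeAndAuthor(samplePosts());
        // A freshly built, equal key finds the group created during collection.
        System.out.println(grouped.get(new Tuple(BlogPostType.GUIDE, "Ann")).size()); // prints 2
    }
}
```

Without the two overrides, every Tuple would fall back to identity-based equality, and each post would end up in a group of its own.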
    

2.4. Modifying the Returned Map Value Type

The second overload of groupingBy takes an additional collector (the downstream collector) that is applied to the elements of each group produced by the classification function.

When we specify a classification function, but not a downstream collector, the toList() collector is used behind the scenes.

Let's use the toSet() collector as the downstream collector and get a Set of blog posts (instead of a List):

Map<BlogPostType, Set<BlogPost>> postsPerType = posts.stream()
  .collect(groupingBy(BlogPost::getType, toSet()));
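
Any collector can act as the downstream, not only toSet(). As a small sketch that goes slightly beyond the example above, Collectors.counting() reduces each group to the number of its elements, so the map value becomes a Long. The BlogPost constructor and getters, the sample data, and the class name CountingDownstreamDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CountingDownstreamDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        BlogPostType getType() { return type; }
    }

    // Hypothetical sample data, made up for this sketch.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    // The downstream collector decides the value type of the resulting map.
    static Map<BlogPostType, Long> countByType(List<BlogPost> posts) {
        return posts.stream()
            .collect(Collectors.groupingBy(BlogPost::getType, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<BlogPostType, Long> counts = countByType(samplePosts());
        System.out.println(counts.get(BlogPostType.GUIDE)); // prints 2
        System.out.println(counts.get(BlogPostType.NEWS));  // prints 1
    }
}
```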
    

2.5. Grouping by Multiple Fields

A different application of the downstream collector is to do a secondary groupingBy to the results of the first group by.

To group the List of BlogPosts first by author and then by type:

Map<String, Map<BlogPostType, List<BlogPost>>> map = posts.stream()
  .collect(groupingBy(BlogPost::getAuthor, groupingBy(BlogPost::getType)));
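
The result is a map of maps, so a lookup takes two steps: first by author, then by type. The sketch below illustrates this; the BlogPost constructor and getters, the sample data, and the class name NestedGroupingDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedGroupingDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        String getAuthor() { return author; }
        BlogPostType getType() { return type; }
    }

    // Hypothetical sample data, made up for this sketch.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    // Outer key: author. Inner key: type. Inner value: the matching posts.
    static Map<String, Map<BlogPostType, List<BlogPost>>> groupByAuthorThenType(
            List<BlogPost> posts) {
        return posts.stream().collect(
            Collectors.groupingBy(BlogPost::getAuthor,
                Collectors.groupingBy(BlogPost::getType)));
    }

    public static void main(String[] args) {
        Map<String, Map<BlogPostType, List<BlogPost>>> nested =
            groupByAuthorThenType(samplePosts());
        System.out.println(nested.get("Ann").get(BlogPostType.GUIDE).size());   // prints 2
        System.out.println(nested.get("Ann").containsKey(BlogPostType.NEWS));  // prints false
    }
}
```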
    

2.6. Getting the Average from Grouped Results

By using the downstream collector, we can apply aggregation functions to the results of the classification function.

For instance, to find the average number of likes for each blog post type:

Map<BlogPostType, Double> averageLikesPerType = posts.stream()
  .collect(groupingBy(BlogPost::getType, averagingInt(BlogPost::getLikes)));

2.7. Getting the Sum from Grouped Results

To calculate the total sum of likes for each type:

Map<BlogPostType, Integer> likesPerType = posts.stream()
  .collect(groupingBy(BlogPost::getType, summingInt(BlogPost::getLikes)));
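
To make the sum and average aggregations concrete, here is a small sketch with made-up numbers. The BlogPost constructor and getters, the sample data, and the class name LikesAggregationDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LikesAggregationDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        BlogPostType getType() { return type; }
        int getLikes() { return likes; }
    }

    // Hypothetical sample data: GUIDE posts have 15 and 20 likes, NEWS has 5.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    static Map<BlogPostType, Integer> sumLikes(List<BlogPost> posts) {
        return posts.stream().collect(
            Collectors.groupingBy(BlogPost::getType,
                Collectors.summingInt(BlogPost::getLikes)));
    }

    static Map<BlogPostType, Double> averageLikes(List<BlogPost> posts) {
        return posts.stream().collect(
            Collectors.groupingBy(BlogPost::getType,
                Collectors.averagingInt(BlogPost::getLikes)));
    }

    public static void main(String[] args) {
        System.out.println(sumLikes(samplePosts()).get(BlogPostType.GUIDE));     // prints 35
        System.out.println(averageLikes(samplePosts()).get(BlogPostType.GUIDE)); // prints 17.5
    }
}
```

Note the value types: summingInt yields an Integer, while averagingInt always yields a Double, even when the average is a whole number.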

2.8. Getting the Maximum or Minimum from Grouped Results

Another aggregation that we can perform is to get the blog post with the maximum number of likes:

Map<BlogPostType, Optional<BlogPost>> maxLikesPerPostType = posts.stream()
  .collect(groupingBy(BlogPost::getType,
  maxBy(comparingInt(BlogPost::getLikes))));
    

Similarly, we can apply the minBy downstream collector to get the blog post with the minimum number of likes.

Note that the maxBy and minBy collectors take into account the possibility that the collection to which they are applied could be empty. This is why the value type in the map is Optional.
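
Inside groupingBy, a group only exists because at least one element was mapped to it, so the Optional values are never empty in practice. If the wrapper gets in the way, one option is to unwrap it with Collectors.collectingAndThen, as in the sketch below. The BlogPost constructor and getters, the sample data, and the class name MaxLikesDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class MaxLikesDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        String getTitle() { return title; }
        BlogPostType getType() { return type; }
        int getLikes() { return likes; }
    }

    // Hypothetical sample data, made up for this sketch.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    // Same shape as the article's example: values are wrapped in Optional.
    static Map<BlogPostType, Optional<BlogPost>> mostLiked(List<BlogPost> posts) {
        return posts.stream().collect(
            Collectors.groupingBy(BlogPost::getType,
                Collectors.maxBy(Comparator.comparingInt(BlogPost::getLikes))));
    }

    // Variant that unwraps the Optional once each group is complete.
    static Map<BlogPostType, BlogPost> mostLikedUnwrapped(List<BlogPost> posts) {
        return posts.stream().collect(
            Collectors.groupingBy(BlogPost::getType,
                Collectors.collectingAndThen(
                    Collectors.maxBy(Comparator.comparingInt(BlogPost::getLikes)),
                    Optional::get)));
    }

    public static void main(String[] args) {
        BlogPost top = mostLikedUnwrapped(samplePosts()).get(BlogPostType.GUIDE);
        System.out.println(top.getTitle()); // prints Collectors in depth
    }
}
```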

2.9. Getting a Summary for an Attribute of Grouped Results

The Collectors API offers a summarizing collector that we can use in cases when we need to calculate the count, sum, minimum, maximum and average of a numerical attribute at the same time.

Let's calculate a summary for the likes attribute of the blog posts for each different type:

Map<BlogPostType, IntSummaryStatistics> likeStatisticsPerType = posts.stream()
  .collect(groupingBy(BlogPost::getType,
  summarizingInt(BlogPost::getLikes)));

The IntSummaryStatistics object for each type contains the count, sum, average, min and max values for the likes attribute. Additional summary objects exist for double and long values.
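
The individual values are read through the getters of IntSummaryStatistics. The following sketch shows them on made-up data; the BlogPost constructor and getters, the sample data, and the class name LikeStatisticsDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LikeStatisticsDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        BlogPostType getType() { return type; }
        int getLikes() { return likes; }
    }

    // Hypothetical sample data: GUIDE posts have 15 and 20 likes, NEWS has 5.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    static Map<BlogPostType, IntSummaryStatistics> likeStatistics(List<BlogPost> posts) {
        return posts.stream().collect(
            Collectors.groupingBy(BlogPost::getType,
                Collectors.summarizingInt(BlogPost::getLikes)));
    }

    public static void main(String[] args) {
        IntSummaryStatistics guide = likeStatistics(samplePosts()).get(BlogPostType.GUIDE);
        System.out.println(guide.getCount());   // prints 2
        System.out.println(guide.getSum());     // prints 35
        System.out.println(guide.getMin());     // prints 15
        System.out.println(guide.getMax());     // prints 20
        System.out.println(guide.getAverage()); // prints 17.5
    }
}
```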

2.10. Mapping Grouped Results to a Different Type

We can achieve more complex aggregations by applying a mapping downstream collector to the results of the classification function.

Let's get a concatenation of the titles of the posts for each blog post type:

Map<BlogPostType, String> postsPerType = posts.stream()
  .collect(groupingBy(BlogPost::getType,
  mapping(BlogPost::getTitle, joining(", ", "Post titles: [", "]"))));

What we have done here is to map each BlogPost instance to its title and then reduce the stream of post titles to a concatenated String. In this example, the type of the Map value is also different from the default List type.
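
To see what the concatenated value looks like, here is a runnable sketch of the same pipeline on made-up data. The BlogPost constructor and getters, the sample data, and the class name TitleJoiningDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TitleJoiningDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        String getTitle() { return title; }
        BlogPostType getType() { return type; }
    }

    // Hypothetical sample data, made up for this sketch.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    // mapping() extracts the title; joining() reduces the titles to one String.
    static Map<BlogPostType, String> titlesPerType(List<BlogPost> posts) {
        return posts.stream().collect(
            Collectors.groupingBy(BlogPost::getType,
                Collectors.mapping(BlogPost::getTitle,
                    Collectors.joining(", ", "Post titles: [", "]"))));
    }

    public static void main(String[] args) {
        // prints Post titles: [Stream basics, Collectors in depth]
        System.out.println(titlesPerType(samplePosts()).get(BlogPostType.GUIDE));
    }
}
```

With a sequential stream over a List, the titles are joined in the order in which the posts appear in the list.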

2.11. Modifying the Return Map Type

When using the groupingBy collector, we cannot make assumptions about the type of the returned Map. If we want to be specific about which type of Map we want to get from the group by, then we can use the third variation of the groupingBy method that allows us to change the type of the Map by passing a Map supplier function.

Let's retrieve an EnumMap by passing an EnumMap supplier function to the groupingBy method:

EnumMap<BlogPostType, List<BlogPost>> postsPerType = posts.stream()
  .collect(groupingBy(BlogPost::getType,
  () -> new EnumMap<>(BlogPostType.class), toList()));
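
The same overload accepts any Map supplier. As a further sketch, grouping by author into a TreeMap gives keys in sorted order, which the default map does not promise. The BlogPost constructor and getters, the sample data, and the class name SortedGroupingDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class SortedGroupingDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        String getAuthor() { return author; }
    }

    // Hypothetical sample data; note that "Bob" comes first in the list.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    // The supplier (TreeMap::new) fixes the concrete Map type of the result.
    static TreeMap<String, List<BlogPost>> groupByAuthorSorted(List<BlogPost> posts) {
        return posts.stream().collect(
            Collectors.groupingBy(BlogPost::getAuthor, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        TreeMap<String, List<BlogPost>> byAuthor = groupByAuthorSorted(samplePosts());
        System.out.println(byAuthor.keySet()); // prints [Ann, Bob]
    }
}
```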
    

3. Concurrent groupingBy Collector

Similar to groupingBy is the groupingByConcurrent collector, which leverages multi-core architectures. This collector has three overloaded methods that take exactly the same arguments as the respective overloaded methods of the groupingBy collector. The return type of the groupingByConcurrent collector, however, must be an implementation of the ConcurrentMap interface, such as ConcurrentHashMap.

To do a grouping operation concurrently, the stream needs to be parallel:

ConcurrentMap<BlogPostType, List<BlogPost>> postsPerType = posts.parallelStream()
  .collect(groupingByConcurrent(BlogPost::getType));
    

If we choose to pass a Map supplier function to the groupingByConcurrent collector, then we need to make sure that the function returns a ConcurrentMap implementation, for example a ConcurrentHashMap or a subclass of it.
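
The concurrent variant also accepts a downstream collector. The sketch below counts posts per type on a parallel stream; a count is used here because the concurrent collector is unordered, so the order of elements within a collected group should not be relied on. The BlogPost constructor and getters, the sample data, and the class name ConcurrentGroupingDemo are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;

public class ConcurrentGroupingDemo {

    enum BlogPostType { NEWS, REVIEW, GUIDE }

    // Stand-in for the article's BlogPost; constructor and getters are assumptions.
    static class BlogPost {
        private final String title;
        private final String author;
        private final BlogPostType type;
        private final int likes;

        BlogPost(String title, String author, BlogPostType type, int likes) {
            this.title = title;
            this.author = author;
            this.type = type;
            this.likes = likes;
        }

        BlogPostType getType() { return type; }
    }

    // Hypothetical sample data, made up for this sketch.
    static List<BlogPost> samplePosts() {
        return Arrays.asList(
            new BlogPost("Stream basics", "Ann", BlogPostType.GUIDE, 15),
            new BlogPost("Release notes", "Bob", BlogPostType.NEWS, 5),
            new BlogPost("Collectors in depth", "Ann", BlogPostType.GUIDE, 20));
    }

    // All worker threads accumulate into one shared ConcurrentMap.
    static ConcurrentMap<BlogPostType, Long> countByTypeConcurrently(List<BlogPost> posts) {
        return posts.parallelStream().collect(
            Collectors.groupingByConcurrent(BlogPost::getType, Collectors.counting()));
    }

    public static void main(String[] args) {
        ConcurrentMap<BlogPostType, Long> counts = countByTypeConcurrently(samplePosts());
        System.out.println(counts.get(BlogPostType.GUIDE)); // prints 2
    }
}
```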

4. Java 9 Additions

Java 9 introduced two new collectors that work well with groupingBy; more information about them can be found here.

5. Conclusion

In this article, we explored the usage of the groupingBy collector offered by the Java 8 Collectors API.

We learned how groupingBy can be used to classify a stream of elements based on one of their attributes, and how the results of this classification can be further collected, mutated, and reduced to final containers.

The complete implementation of the examples in this article can be found in the GitHub project.