A Guide to ConcurrentMap

1. Overview

Maps are naturally one of the most widely used Java collections.

Importantly, HashMap is not a thread-safe implementation, while Hashtable does provide thread-safety by synchronizing its operations.

Even though Hashtable is thread-safe, it is not very efficient. Another fully synchronized Map, Collections.synchronizedMap, is not very efficient either. If we want thread-safety with high throughput under high concurrency, these implementations aren't the way to go.

To solve this problem, the Java Collections Framework introduced ConcurrentMap in Java 1.5.

The following discussion is based on Java 1.8.

2. ConcurrentMap

ConcurrentMap is an extension of the Map interface. It aims to provide a structure and guidance for solving the problem of reconciling throughput with thread-safety.

By overriding several interface default methods, ConcurrentMap gives guidelines for valid implementations to provide thread-safe and memory-consistent atomic operations.

Several default implementations are overridden, disabling null key/value support:

  • getOrDefault
  • forEach
  • replaceAll
  • computeIfAbsent
  • computeIfPresent
  • compute
  • merge

The following APIs are also overridden to support atomicity, without a default interface implementation:

  • putIfAbsent
  • remove
  • replace(key, oldValue, newValue)
  • replace(key, value)

The rest of the operations are inherited directly and are basically consistent with Map.
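As a rough illustration of the atomic operations listed above, here is a minimal sketch. The class name AtomicOpsSketch and the sample keys and values are made up for this example; only the ConcurrentMap methods themselves come from the JDK.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; the keys and values are illustrative only.
class AtomicOpsSketch {

    // Runs the atomic operations in sequence and returns the final value for "a".
    static Integer run() {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();

        // putIfAbsent inserts only when the key has no mapping
        // and returns the previous value (or null).
        Integer previous = map.putIfAbsent("a", 1); // null, a=1 is created
        Integer existing = map.putIfAbsent("a", 2); // 1, mapping left unchanged

        // replace(key, oldValue, newValue) is a compare-and-set on the current value.
        boolean swapped = map.replace("a", 1, 10);  // true, a=10
        boolean stale = map.replace("a", 1, 20);    // false, the value is no longer 1

        // replace(key, value) only replaces when the key is currently mapped.
        Integer missing = map.replace("b", 5);      // null, "b" is not added

        // remove(key, value) only removes when the key maps to that exact value.
        boolean removed = map.remove("a", 99);      // false, a is still 10

        System.out.println(previous + " " + existing + " " + swapped + " "
          + stale + " " + missing + " " + removed + " " + map);
        return map.get("a");
    }

    public static void main(String[] args) {
        run();
    }
}
```

Each call is a single atomic step, so no external locking is needed around the check and the update.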

3. ConcurrentHashMap

ConcurrentHashMap is the out-of-the-box, ready-to-use ConcurrentMap implementation.

For better performance, it consists of an array of nodes as table buckets under the hood (these used to be table segments prior to Java 8), and it mainly uses CAS operations during updates.

The table buckets are initialized lazily, upon the first insertion. Each bucket can be locked independently by locking the very first node in the bucket. Read operations do not block, and update contention is minimized.

Before Java 8, the number of "segments" required was relative to the number of threads accessing the table, so that most of the time no more than one update was in progress per segment.

That's why the constructors, compared with HashMap, provide the extra concurrencyLevel argument to control the estimated number of threads to use:

public ConcurrentHashMap(
  int initialCapacity, float loadFactor, int concurrencyLevel)

The other two arguments, initialCapacity and loadFactor, work much the same as in HashMap.

However, since Java 8, these constructors are only present for backward compatibility: the parameters can only affect the initial size of the map.
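A minimal sketch of calling the three-argument constructor follows. The sizing values (64, 0.75f, 8) and the class name ConstructorSketch are arbitrary choices for illustration, not recommendations.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; the sizing values are arbitrary.
class ConstructorSketch {

    static int run() {
        // initialCapacity = 64, loadFactor = 0.75, concurrencyLevel = 8.
        // On Java 8+ the last argument is only used as a sizing hint.
        ConcurrentHashMap<String, Integer> map =
          new ConcurrentHashMap<>(64, 0.75f, 8);
        map.put("one", 1);
        map.put("two", 2);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The map behaves the same whichever values are passed; they only influence how the table is sized up front.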

3.1. Thread-Safety

ConcurrentMap guarantees memory consistency on key/value operations in a multi-threading environment.

Actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object in another thread.

To confirm, let's have a look at a memory-inconsistent case:

@Test
public void givenHashMap_whenSumParallel_thenError() throws Exception {
    Map<String, Integer> map = new HashMap<>();
    List<Integer> sumList = parallelSum100(map, 100);

    // with an unsynchronized HashMap the runs are expected to disagree
    assertNotEquals(1, sumList
      .stream()
      .distinct()
      .count());
    long wrongResultCount = sumList
      .stream()
      .filter(num -> num != 100)
      .count();

    assertTrue(wrongResultCount > 0);
}

private List<Integer> parallelSum100(Map<String, Integer> map,
  int executionTimes) throws InterruptedException {
    List<Integer> sumList = new ArrayList<>(1000);
    for (int i = 0; i < executionTimes; i++) {
        map.put("test", 0);
        ExecutorService executorService = Executors.newFixedThreadPool(4);
        // 10 tasks of 10 increments each, so a correct run ends at 100
        for (int j = 0; j < 10; j++) {
            executorService.execute(() -> {
                for (int k = 0; k < 10; k++) {
                    map.computeIfPresent("test", (key, value) -> value + 1);
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(5, TimeUnit.SECONDS);
        sumList.add(map.get("test"));
    }
    return sumList;
}

For each map.computeIfPresent action running in parallel, HashMap does not provide a consistent view of what the current integer value should be, leading to inconsistent and undesirable results.

As for ConcurrentHashMap, we can get a consistent and correct result:

@Test
public void givenConcurrentMap_whenSumParallel_thenCorrect()
  throws Exception {
    Map<String, Integer> map = new ConcurrentHashMap<>();
    List<Integer> sumList = parallelSum100(map, 1000);

    // every run must produce the same sum
    assertEquals(1, sumList
      .stream()
      .distinct()
      .count());
    long wrongResultCount = sumList
      .stream()
      .filter(num -> num != 100)
      .count();

    assertEquals(0, wrongResultCount);
}
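For readers who want to try this outside a test harness, the same idea can be sketched as a small standalone program. The class name ParallelSumSketch, the pool size, and the 30-second timeout are assumptions made for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical standalone variant of the parallel-sum test above.
class ParallelSumSketch {

    // Increments one counter from several tasks and returns the final value.
    static int sum(int tasks, int incrementsPerTask) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        map.put("test", 0);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < tasks; t++) {
            pool.execute(() -> {
                for (int k = 0; k < incrementsPerTask; k++) {
                    // the read-modify-write happens atomically per key
                    map.computeIfPresent("test", (key, value) -> value + 1);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.get("test");
    }

    public static void main(String[] args) {
        System.out.println(sum(10, 10)); // prints 100
    }
}
```

Swapping in a plain HashMap here would make the result unpredictable, which is exactly what the first test demonstrates.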

3.2. Null Key/Value

Most APIs provided by ConcurrentMap do not allow a null key or value, for example:

@Test(expected = NullPointerException.class)
public void givenConcurrentHashMap_whenPutWithNullKey_thenThrowsNPE() {
    concurrentMap.put(null, new Object());
}

@Test(expected = NullPointerException.class)
public void givenConcurrentHashMap_whenPutNullValue_thenThrowsNPE() {
    concurrentMap.put("test", null);
}

However, for compute* and merge actions, the computed value can be null, which indicates that the key-value mapping is removed if present, or remains absent if it was previously absent.

@Test
public void givenKeyPresent_whenComputeRemappingNull_thenMappingRemoved() {
    Object oldValue = new Object();
    concurrentMap.put("test", oldValue);
    // remapping to null removes the entry
    concurrentMap.compute("test", (s, o) -> null);

    assertNull(concurrentMap.get("test"));
}
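The same rule can be sketched for computeIfAbsent and merge as well. The class name NullRemappingSketch and the keys used are hypothetical; the behaviour shown follows the documented contract of these methods.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; key names are illustrative only.
class NullRemappingSketch {

    static String run() {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();

        // compute: remapping an existing key to null removes the mapping
        map.put("present", 1);
        map.compute("present", (k, v) -> null);

        // computeIfAbsent: a null result leaves an absent key absent
        map.computeIfAbsent("absent", k -> null);

        // merge: a null result from the remapping function removes the entry
        map.put("merged", 1);
        map.merge("merged", 1, (oldValue, newValue) -> null);

        return map.containsKey("present") + "," + map.containsKey("absent")
          + "," + map.containsKey("merged") + "," + map.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints false,false,false,0
    }
}
```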

3.3. Stream Support

Java 8 provides Stream support in the ConcurrentHashMap as well.

Unlike most stream methods, the bulk (sequential and parallel) operations allow concurrent modification safely. ConcurrentModificationException won't be thrown, which also applies to its iterators. Relevant to streams, several forEach*, search, and reduce* methods are also added to support richer traversal and map-reduce operations.
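A minimal sketch of the bulk operations mentioned above, assuming a small map of made-up entries. The class name BulkOpsSketch is hypothetical, and the threshold Long.MAX_VALUE is chosen here to keep everything sequential (a threshold of 1 would instead ask for maximal parallelism).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical demo class; entries k1..k5 are illustrative only.
class BulkOpsSketch {

    static String run() {
        // the bulk methods are declared on ConcurrentHashMap, not on Map
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        for (int i = 1; i <= 5; i++) {
            map.put("k" + i, i);
        }
        long threshold = Long.MAX_VALUE; // run sequentially

        // forEach: visit every entry
        LongAdder visited = new LongAdder();
        map.forEach(threshold, (k, v) -> visited.increment());

        // search: return the first non-null result, or null if none matches
        String bigKey = map.search(threshold, (k, v) -> v > 4 ? k : null);

        // reduceValues: combine all values with the given reducer
        Integer sum = map.reduceValues(threshold, Integer::sum);

        return visited.sum() + "," + bigKey + "," + sum;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 5,k5,15
    }
}
```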

3.4. Performance

Under the hood, ConcurrentHashMap is somewhat similar to HashMap, with data access and update based on a hash table (though more complex).

And of course, the ConcurrentHashMap should yield much better performance in most concurrent cases for data retrieval and update.

Let's write a quick micro-benchmark for get and put performance and compare ConcurrentHashMap to Hashtable and Collections.synchronizedMap, running both operations 500,000 times in each of 4 threads.

@Test
public void givenMaps_whenGetPut500KTimes_thenConcurrentMapFaster()
  throws Exception {
    Map<String, Object> hashtable = new Hashtable<>();
    Map<String, Object> synchronizedHashMap =
      Collections.synchronizedMap(new HashMap<>());
    Map<String, Object> concurrentHashMap = new ConcurrentHashMap<>();

    long hashtableAvgRuntime = timeElapseForGetPut(hashtable);
    long syncHashMapAvgRuntime = timeElapseForGetPut(synchronizedHashMap);
    long concurrentHashMapAvgRuntime = timeElapseForGetPut(concurrentHashMap);

    assertTrue(hashtableAvgRuntime > concurrentHashMapAvgRuntime);
    assertTrue(syncHashMapAvgRuntime > concurrentHashMapAvgRuntime);
}

private long timeElapseForGetPut(Map<String, Object> map)
  throws InterruptedException {
    ExecutorService executorService = Executors.newFixedThreadPool(4);
    long startTime = System.nanoTime();
    // one task per pool thread
    for (int i = 0; i < 4; i++) {
        executorService.execute(() -> {
            for (int j = 0; j < 500_000; j++) {
                int value = ThreadLocalRandom
                  .current()
                  .nextInt(10000);
                String key = String.valueOf(value);
                map.put(key, value);
                map.get(key);
            }
        });
    }
    executorService.shutdown();
    executorService.awaitTermination(1, TimeUnit.MINUTES);
    // average time per iteration, in nanoseconds
    return (System.nanoTime() - startTime) / 500_000;
}

Keep in mind micro-benchmarks are only looking at a single scenario and aren't always a good reflection of real world performance.

That being said, on an average OS X dev machine, we saw the following average results over 100 consecutive runs (in nanoseconds):

Hashtable: 1142.45
SynchronizedHashMap: 1273.89
ConcurrentHashMap: 230.2

In a multi-threading environment, where multiple threads are expected to access a common Map, the ConcurrentHashMap is clearly preferable.

However, when the Map is only accessible to a single thread, HashMap can be a better choice for its simplicity and solid performance.

3.5. Pitfalls

Retrieval operations generally do not block in ConcurrentHashMap and could overlap with update operations. So for better performance, they only reflect the results of the most recently completed update operations, as stated in the official Javadoc.

There are several other facts to bear in mind:

  • results of aggregate status methods, including size, isEmpty, and containsValue, are typically useful only when a map is not undergoing concurrent updates in other threads:

@Test
public void givenConcurrentMap_whenUpdatingAndGetSize_thenError()
  throws InterruptedException {
    // samples the size while another task is still inserting
    Runnable collectMapSizes = () -> {
        for (int i = 0; i < MAX_SIZE; i++) {
            mapSizes.add(concurrentMap.size());
        }
    };
    Runnable updateMapData = () -> {
        for (int i = 0; i < MAX_SIZE; i++) {
            concurrentMap.put(String.valueOf(i), i);
        }
    };
    executorService.execute(updateMapData);
    executorService.execute(collectMapSizes);
    executorService.shutdown();
    executorService.awaitTermination(1, TimeUnit.MINUTES);

    assertNotEquals(MAX_SIZE, mapSizes.get(MAX_SIZE - 1).intValue());
    assertEquals(MAX_SIZE, concurrentMap.size());
}

If concurrent updates are under strict control, aggregate status would still be reliable.

Although these aggregate status methods do not guarantee the real-time accuracy, they may be adequate for monitoring or estimation purposes.

Note that for ConcurrentHashMap, mappingCount() should be preferred over size(), because it returns a long and so can report more mappings than an int can hold, although deep down both are based on the same estimate.

  • hashCode matters: note that using many keys with exactly the same hashCode() is a sure way to slow down the performance of any hash table.

To ameliorate impact when keys are Comparable, ConcurrentHashMap may use comparison order among keys to help break ties. Still, we should avoid using the same hashCode() as much as we can.

  • iterators are designed to be used by only a single thread: they provide weak consistency rather than fail-fast traversal, and they will never throw ConcurrentModificationException.
  • the default initial table capacity is 16, and it's adjusted by the specified concurrency level:
public ConcurrentHashMap(
  int initialCapacity, float loadFactor, int concurrencyLevel) {

    //...
    if (initialCapacity < concurrencyLevel) {
        initialCapacity = concurrencyLevel;
    }
    //...
}
  • caution on remapping functions: though we can do remapping operations with the provided compute* and merge methods, we should keep them fast, short, and simple, and focused on the current mapping, to avoid unexpected blocking.
  • keys in ConcurrentHashMap are not in sorted order, so for cases when ordering is required, ConcurrentSkipListMap is a suitable choice.
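Two of the points above, mappingCount() and weakly consistent iterators, can be sketched as follows. The class name PitfallsSketch and the entry counts are made up for this example, and everything runs on a single thread, so the counts are exact here.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; entry counts are illustrative only.
class PitfallsSketch {

    // With no concurrent writers, mappingCount() reports the exact entry count.
    static long mappingCountOf(int entries) {
        ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
        for (int i = 0; i < entries; i++) {
            map.put(i, "v" + i);
        }
        return map.mappingCount(); // a long, unlike size()
    }

    // Removes entries through the map while iterating over its key set.
    static String removeWhileIterating() {
        ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 10; i++) {
            map.put(i, "v" + i);
        }
        int visited = 0;
        Iterator<Integer> it = map.keySet().iterator();
        while (it.hasNext()) {
            Integer key = it.next();
            // a HashMap iterator would fail fast after this structural change;
            // the weakly consistent iterator simply carries on
            map.remove(key);
            visited++;
        }
        return visited + "," + map.size();
    }

    public static void main(String[] args) {
        System.out.println(mappingCountOf(1000));
        System.out.println(removeWhileIterating());
    }
}
```

The iterator tolerates the modification, but it still only offers a best-effort view, so it should not be shared across threads.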

4. ConcurrentNavigableMap

For cases when ordering of keys is required, we can use ConcurrentSkipListMap, a concurrent version of TreeMap.

As a supplement for ConcurrentMap, ConcurrentNavigableMap supports total ordering of its keys (in ascending order by default) and is concurrently navigable. Methods that return views of the map are overridden for concurrency compatibility:

  • subMap
  • headMap
  • tailMap
  • descendingMap

keySet() views' iterators and spliterators are enhanced with weak-memory-consistency:

  • navigableKeySet
  • keySet
  • descendingKeySet
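As a rough sketch of the view methods listed above, the following uses ConcurrentSkipListMap as the implementation. The class name NavigableViewsSketch and the sample keys are assumptions for illustration.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical demo class; keys 0..5 are illustrative only.
class NavigableViewsSketch {

    static String run() {
        ConcurrentNavigableMap<Integer, String> map = new ConcurrentSkipListMap<>();
        for (int i = 1; i <= 5; i++) {
            map.put(i, "v" + i);
        }

        // the views are themselves ConcurrentNavigableMaps backed by the original map
        ConcurrentNavigableMap<Integer, String> head = map.headMap(3); // keys < 3
        ConcurrentNavigableMap<Integer, String> tail = map.tailMap(3); // keys >= 3
        ConcurrentNavigableMap<Integer, String> desc = map.descendingMap();

        // a later insertion into the map is visible through the existing view
        map.put(0, "v0");

        return head.keySet() + " " + tail.keySet() + " " + desc.firstKey();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [0, 1, 2] [3, 4, 5] 5
    }
}
```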

5. ConcurrentSkipListMap

Previously, we covered the NavigableMap interface and its implementation TreeMap. ConcurrentSkipListMap can be seen as a scalable concurrent version of TreeMap.

In practice, there's no concurrent implementation of the red-black tree in Java. A concurrent variant of SkipLists is implemented in ConcurrentSkipListMap, providing an expected average log(n) time cost for the containsKey, get, put and remove operations and their variants.
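A minimal sketch of those basic operations on a ConcurrentSkipListMap, with keys kept in their natural order. The class name SkipListOpsSketch and the sample entries are hypothetical.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical demo class; the fruit keys are illustrative only.
class SkipListOpsSketch {

    static String run() {
        ConcurrentSkipListMap<String, Integer> map = new ConcurrentSkipListMap<>();
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("mango", 2);

        boolean hasMango = map.containsKey("mango");
        Integer removed = map.remove("pear");  // returns the removed value
        String ceiling = map.ceilingKey("b");  // least key >= "b"

        // keys are iterated in sorted order regardless of insertion order
        return map.keySet() + " " + hasMango + " " + removed + " " + ceiling;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [apple, mango] true 3 mango
    }
}
```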

In addition to TreeMap's features, key insertion, removal, update, and access operations are guaranteed to be thread-safe. Here's a comparison to TreeMap when navigating concurrently:

@Test
public void givenSkipListMap_whenNavConcurrently_thenCountCorrect()
  throws InterruptedException {
    NavigableMap<Integer, Integer> skipListMap = new ConcurrentSkipListMap<>();
    int count = countMapElementByPollingFirstEntry(skipListMap, 10000, 4);

    assertEquals(10000 * 4, count);
}

@Test
public void givenTreeMap_whenNavConcurrently_thenCountError()
  throws InterruptedException {
    NavigableMap<Integer, Integer> treeMap = new TreeMap<>();
    int count = countMapElementByPollingFirstEntry(treeMap, 10000, 4);

    assertNotEquals(10000 * 4, count);
}

private int countMapElementByPollingFirstEntry(
  NavigableMap<Integer, Integer> navigableMap,
  int elementCount,
  int concurrencyLevel) throws InterruptedException {

    for (int i = 0; i < elementCount * concurrencyLevel; i++) {
        navigableMap.put(i, i);
    }

    AtomicInteger counter = new AtomicInteger(0);
    ExecutorService executorService
      = Executors.newFixedThreadPool(concurrencyLevel);
    // one polling task per thread
    for (int j = 0; j < concurrencyLevel; j++) {
        executorService.execute(() -> {
            for (int i = 0; i < elementCount; i++) {
                if (navigableMap.pollFirstEntry() != null) {
                    counter.incrementAndGet();
                }
            }
        });
    }
    executorService.shutdown();
    executorService.awaitTermination(1, TimeUnit.MINUTES);
    return counter.get();
}

A full explanation of the performance concerns behind the scenes is beyond the scope of this article. The details can be found in ConcurrentSkipListMap's Javadoc, which is located under java/util/concurrent in the src.zip file.

6. Conclusion

In this article, we mainly introduced the ConcurrentMap interface and the features of ConcurrentHashMap, and covered ConcurrentNavigableMap for cases where key ordering is required.

The full source code for all the examples used in this article can be found in the GitHub project.