Tuesday, August 17, 2010

ConcurrentHashMap revealed

Java 1.5 introduced some new cool concurrency stuff which is located in java.util.concurrent package. There are lots of good things there including new ConcurrentHashMap class which I'm going to talk about. That class is targeted to be used in concurrent environment and provides significant performance benefits over synchronized version of HashMap. Javadoc doesn't really provide lots of details on how that class works and why can be better than synchronized version of HashMap. I think that understanding of these details is crucial for using that class in a right way, so I've made some research to undercover implementation details and mechanisms used to implement that class.

Map Regions

Firstly I will repeat what is already said in JavaDoc - that map implementation consists of regions. Each region is hash table - an array where each element is associated with some range of hash codes and contains linked list of entries. It means that structurally region is very similar to normal HashMap. All write operations have to acquire write lock on the one whole region.
All read operations on the region are performed without locking, apart from one small exception. That exception happens on attempt to read value of entry and that value is "null". In that case code suspects that compiler and could possibly assign entry to array element before calling entry's constructor (good example and explanation of that problem can be found here). When that happens code tries to acquire write lock and perform reading inside that lock. Such strategy guarantees that initialization of the object is completed before the reference is assigned to the array element. As it is stated in comments in ConcurrentHashMap's source code, chance that comliler will re-order initialization and assignment in that paticular case is extremely low, so chance of "blocking read" is negligible.
Another interesting thing about region is that it has volatile counter which counts number of modifications of that region. That value is used in many methods of ConcurrentHashMap to identify that region’s state is not changed during execution of current method. Basically it is used as a means of solving ABA problem. It doesn't seem right to me that such strategy is applied even on read methods, the most of them are much slower than they can be without that check. There is also some locking (e.g. in size() operation) of individual segments which looks unnecessary as far as result is not going to be accurate anyway. But, on the other hand, that class was created by Doug Lea who is very respectable for his concurrency research, so I suspect I'm missing something...

Iteration through ConcurrentHashMap

Some other stuff, which may raise questions, is relationship between collection and enumeration objects returned by “keySet()”, “values()”, “entrySet()”, “keys()” and “elements()” and ConcurrentHashMap instance which have created them. Implementation of all these objects is based on internal class HashIterator and the noticeable thing here is that instance of that class has some information about the state of ConcurrentHashmap at the moment of it’s (HashIterator’s instance) creation. As an example, there is direct reference to the segment’s data array and on the moment when the HashIterator instance is actually used, given segment could have already decided to create a new array, because the old one appeared to be too small. It means that data returned by these collections and enumerations may not reflect changes in the instance of ConcurrentHashmap which happened after given collection or enumeration was created.
Performance considerations for these objects are exactly the same as for ConcurrentHashMap’s methods - read methods are non-blocking and for update methods lock is acquired on segment level. Interesting, but not really practically useful fact, is that HashIterator isn’t that smart as ConcurrentHashMap itself and doesn’t lock when entry’s value is null. So in theory, it is possible that iterator created by “elements()” method will return null value from “next()” or “nextElement()”.

Building the map

To build a map properly you have to understand what constructor’s arguments mean. Here is a brief explanation in the light of things which are written above:
  • Default initial capacity. That parameter is used to calculate initial capacity of segments. The capacity of each segment is nearest power of two bigger than provided initial capacity divided by number of regions (see Default concurrency level). E.g. initial capacity is 1000 and concurrency level is 3, then number of regions is 4 and capacity of each region is 256 (which is nearest power of two bigger than 1000/4). Default value of that argument is 16. 
  • Default load factor. Is used to identify the moment when the region has to be resized (see section about memory consumption below). No magic, used as is. Default value of that argument is 0.75. 
  • Default concurrency level. Defines number of regions which will be created. Exact number of regions is nearest power of two bigger than provided concurrency level. Concurrency level is always less than 2^16. E.g. if concurrency level is 18, number of regions will be 32. Default value of that argument is 16. Keep in mind, that more regions (better concurrency), means higher memory consumption.
And, probably, the last bit - memory consumption. In these terms ConcurrentHashMap is approximately the same as HashMap, but multiplied by number of regions. Region grows at the moment when number of elements appeared to be equal to loadFactor*currentSizeOfRegion value. New region size is always twice as big as previous one, maximum size of region is 2^30.

Migration to ConcurrentHashMap

Other possible question which can be raised is the migration from synchronized version of HashMap to CuncurrentHashMap. There are several things which has to be evaluated during the migration process:
  • Some methods can use synchronized map object for synchronization purposes. If someone accrue the lock on the object, all it’s methods are locked as well. That’s be biggest fun. There is no simple way to archive such behaviour with CuncurrentHashMap. 
  • CuncurrentHashMap’s size method is slow. In the worst case it spins couple of time and then get the lock on all regions. This is much-much-much slower than HashMap implementation, which returns back just the value of the “size” field. 
  • Iterator of CuncurrentHashMap does not throw ConcurrentModificationException. Do not see how this can cause problem, but just keep it is mind. 
  • Containing the same amount of elements ConcurrentHashMap object requires more memory than HashMap object (see “Building the map” section).

All in all, ConcurrentHashMap is brilliant container. In multi-threaded environments it will have much bigger throughput than synchronized version of HashMap on read and write operations. Such improvement is result of absence of locking on read and using of region-based locking on write operations. There is sill small chance that segment will be locked for a very short time on read, but chance is so small that it can be just ignored. Write operations are locking just one region, other regions can be updated at the same time.

The price for all these improvements are increase in memory usage and some degradation of performance on read operations. More memory is used because each region wastes approximately the same amount of memory as one HashMap. Performance drop on accessing elements is because of ABA checks. Write operations do not seem to introduce any performance degradation, apart from use of locking.

12 comments:

Peter said...

Thank you for the info, exactly what I needed right now

Stas said...

You are welcome, Peter! Glad that it is useful.

suman said...

Hi Stas,
The information was very helpful,
I have one question, iterator of concurrent hasmap may or may not relflect the right changes, it will be helpful why is this behaviour with iterator and how its implemented

Many Thanks

Stas said...

hi Suman,

Not a problem. I've updated post with some info about iterators. Hopefully it answers your questions.

javarevisited said...

Good stuff dude , keep it going.

Concurrent HashMap is indeed best choice in multithreaded environment if number of reader outnumbers writer to avoid contention and for better throughput and performance.
thouch correct Use of ConcurrentHashMap is a topic of discussion.

Thanks
Javin
FIX Protocol tutorial

Dhananjay said...

Thanks a lot for the details.

Is ConcurrentHashMap , good choice over Hashtable , given a scenarion I have a single writer thread and multiple concurrent readers.

Stanislav Kobylansky said...

Yes, it was designed specifically for such cases when you have few write threads and lots of read ones.

Andrey Markhel said...

Заранее извиняюсь за то что комментарий на русском, просто не очень хорошо пишу по-английски, если коммент заинтересует - потом может переведете. Итак, возможно вы писали статью для более ранней версии ConcurrentHashTable, но в сорцах, тех что у меня(java sun jdk 1.6.22) присутствуют более развесистые комментарии, и исходя из них у вас в статье достаточно много неточностей.
1) "That exception happens on attempt to read value of entry and that value is "null". In that case code suspects that compiler and could possibly assign entry to array element before calling entry's constructor (good example and explanation of that problem can be found here). In reality it would mean that the compiler has a bug, because such optimization contradicts Java Memory Model agreement" - Как раз-таки такой реордеринг совсем не противоречит JMM, и это вовсе не баг компилятора, а возможное его поведение(хотя до сих пор такого поведения не обнаружено). Все дело в том, что поле value - не final а volatile, и поэтому НЕСИНХРОНИЗИРОВАННЫЙ читатель может увидеть этот реоредринг(он теоретически возможен, но практически бессмысленен, поэтому видимо, нигде и не реализован)
2) "Another interesting thing about region is that it has volatile counter which counts number of modifications of that region. That value is used in many methods of ConcurrentHashMap to identify that region’s state is not changed during execution of current method. Basically it is used as a means of solving ABA problem." -Это правда, а вот дальше
"It doesn't seem right to me that such strategy is applied even on read methods, the most of them are much slower than they can be without that check"
Во-первых, сначала вы говорили про счетчик modCount, а теперь про просто count, который не счетчик модификаций, а счетчик числа элементов. Который к тому же, в любом случае считывается внутри тела метода(для всех read операций, если count =0, то не надо проходить все сегменты, таким образом одно волатильное чтение нивелируется этой оптимизацией, в слчае write операций, каунт - в любом случае должен быть увеличен на 1, так что дополнительное волатильное чтение не вносит дополнительных расходов.
"There is also some locking (e.g. in size() operation) of individual segments which looks unnecessary as far as result is not going to be accurate anyway" - На самом деле очень тонкий момент, до того как я полез в исходники, я думал так же как и вы, после прочтения java concurrency in practice, что метод size() не возвращает точный результат - НО, если посмотреть в исходники - то получается что сначала считается каунт всех сегментов и если модификаций не было - то возвращается точный результат, если модификации были - то все сегменты лочатся, и опять же будет точный результат, так как без лока что-либо поменять не получится. Я на 100% не уверен, если что поправьте меня.
3) "Interesting, but not really practically useful fact, is that HashIterator isn’t that smart as ConcurrentHashMap itself and doesn’t lock when entry’s value is null. So in theory, it is possible that iterator created by “elements()” method will return null value from “next()” or “nextElement()”" - Опять же неправда, такого быть не может. Вся магия заключена в методе advance(), который вызывается в конструкторе и при next(). Его суть в том, что все null элементы тупо скипаются, и следующий доступный элемент выбирается не null.

А в целом, ваши статьи обычно довольно неплохие, читаю регулярно, иногда черпаю очень полезную инфу. Спасибо.

Stas said...

Hi Andrey, thanks for a good comment, I really appreciate it. Nice to come back to the post which I haven’t touched for such a long and time and thanks for pointing to some inaccuracies. Also, sorry for delay with reply.

Very valid point about the version of JDK. I haven’t really thought about it before :) I have had a look at implementations and there are some differences. There is no much difference between jdk5 & jdk6, but it seems like jdk7 has lots of updates. And, of course, in jdk8 it is going to be completely different beast.

1. That's correct, I will fix it.
But I do not think that final field would help. If that field would be final, situation is still the same. As I understand what is going on there, the case is following. One thread initializes HashEntry and assigns an array element to its reference:

tab[index] = new HashEntry(key, hash, first, value);

Roughly, that action consists of three steps:
1. Memory allocation.
2. Running constructor. (which contains volatile write)
3. Assigning array element to a reference to the object.

Then the other thread is reading array element and its value (volatile read). The problem is that 2 & 3 can be reordered and value will be visible to the other thread before the object is constructed. I think that when I looked at that code first time, I was confused with that volatile write in constructor. Such write guarantees happens-before, but it doesn’t mean that what happened after can't be re-ordered with volatile write. All in all, it means that there is chance that the other thread can see an object, which is not yet constructed, even that there is volatile read.

Regarding to final fields, in that particular case, it doesn't help at all. Here is quote from JLS:

"A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields."

in this case other thread can see reference before object has been completely initialized, so final fields won't help.

2. I wouldn't be so sure about it. Attempt to return the size of concurrent collection is a waste of effort anyway - context switch may happen at any time and nothing can guarantee that returned value is going to be close to reality. Though, there attempts can make it close and hard to say, if it worth it... Seems like that Dough Lee thinks it does, but it's not clear to me why.

3. Sorry, I wasn't clear here, I will fix wording. advance() is all about HashEntry itself, but it doesn't check entry's value. Indeed it won't return null, but it's possible that returned HashEntry has null in value field. Ideally, it should do the same thing as get().

Jitender Jain said...

When a thread(thread1) is writing to a segment1 in CHM, will another thread(thread2) which is a read operation on segment1 will be blocked(as currently thread1 is writing to segment 1) ?

Jitender Jain said...

Thanks Stas, but how this CHM achieves this interally(is there some copy maintained which could be read by get() method ?). And what value will get() will fetch(the old value ?)

Stas said...

"the old value" is not something which you can easily define in concurrency. What you get "happens-before" relationship, so all you can say is that "put" happens before "get". Sorry, if such definition is a little confusing... Basically, it means that all actions on the thread which "put" value (up to the point when "put" was executed), are visible to thread which "get" that value. Such relationship provides consistent view of the state.

Lots of good reading on that topic is available here: http://www.cs.umd.edu/~pugh/java/memoryModel/