Friday, April 13, 2007

Performance comparision between ConcurrentHashMap and synchronized HashMap in Terracotta

Since the introduction of ConcurrentHashMap in Java 5, it has been the better choice over HashMap in highly threaded applications. To see how much better, Brian Goetz from his book "Java Concurrency in Practice" wrote a test to compare performance between the two maps. The test sceanario is this:

For N threads concurrently execute a loop that chooses a random key and looks up value corresponding to that key. If the value is found, it is removed with a probability of 0.02. If not, it is added to the map with a probability of 0.6.

The result is ConcurrentHashMap performs much better when number of threads increases. Below is the graph taken from the book with the author's permission.
Since Open Terracotta supports ConcurrentHashMap, I wanted so see if the performance advantage is still there (and how much) in distributed environment. My test scenario is modeled after Brian's. However, the number of threads are spread over 4 nodes (Linux RH4). Here is my test code:


package tc.qa;

import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentHashMapLoadTest extends Thread {
private static double WRITE_PROBABILITY = 0.6;
private static double REMOVE_PROBABILITY = 0.02;
private static int THREAD_COUNT = 4;
private static int VM_COUNT = 4;
private static int RUNTIME = 5 * 60 * 1000;
private static int KEY_RANGE = 100000;

// roots - in TC world, these are static finals and accessible across JVMs
private Map map = new ConcurrentHashMap();
private AtomicInteger throughput = new AtomicInteger(0);
private CyclicBarrier barrier;

private Random random;
private int op;

public ConcurrentHashMapLoadTest() {
random = new Random();
barrier = new CyclicBarrier(THREAD_COUNT);
}

public void run() {

// ready
if (barrier() == 0) {
System.out.println("Started...");
}

// go
long start = System.currentTimeMillis();
while (System.currentTimeMillis() - start < RUNTIME) {
Integer key = new Integer(random.nextInt(KEY_RANGE));
if (get(map, key) != null) {
if (random.nextDouble() < REMOVE_PROBABILITY) {
remove(map, key);
op++;
}
} else {
if (random.nextDouble() < WRITE_PROBABILITY) {
put(map, key, key);
op++;
}
}
op++;
}

throughput.addAndGet(op);

if (barrier() == 0) {
System.out.println("Map type: "
+ map.getClass().getSimpleName());
System.out.println("Runtime: " + RUNTIME);
System.out.println("Number of threads: " + THREAD_COUNT);
System.out.println("Write probability: " + WRITE_PROBABILITY);
System.out.println("Remove probability: " + REMOVE_PROBABILITY);
System.out.println("Ops per second: "
+ (throughput.intValue() * 1000.0 / RUNTIME));
}
}

private int barrier() {
try {
return barrier.await();
} catch (Exception e) {
e.printStackTrace();
}
return -1;
}

private Object get(Map map, Object key) {
if (map instanceof ConcurrentHashMap) {
return map.get(key);
} else {
synchronized (map) {
return map.get(key);
}
}
}

private void put(Map map, Object key, Object value) {
if (map instanceof ConcurrentHashMap) {
map.put(key, value);
} else {
synchronized (map) {
map.put(key, value);
}
}
}

private void remove(Map map, Object key) {
if (map instanceof ConcurrentHashMap) {
map.remove(key);
} else {
synchronized (map) {
map.remove(key);
}
}
}

public static void getParams() {
WRITE_PROBABILITY = Double.parseDouble(System.getProperty("wp", "0.6"));
REMOVE_PROBABILITY = Double.parseDouble(System.getProperty("rmp", "0.02"));
THREAD_COUNT = Integer.parseInt(System.getProperty("thread", "4"));
VM_COUNT = Integer.parseInt(System.getProperty("vm", "4"));
RUNTIME = Integer.parseInt(System.getProperty("runtime", "300000"));
}

public static void main(String[] args) {
getParams();
int threads_per_vm = THREAD_COUNT / VM_COUNT;
for (int t = 0; t < threads_per_vm; t++) {
ConcurrentHashMapLoadTest thread = new ConcurrentHashMapLoadTest();
thread.start();
}
}

}


I wrapped the put(), get() methods to easily switch the map types. After running this test on 4 nodes with jdk1.6.0_01, I got result as below:

The Y axis is throughput, normalized to throughput of 4 threads in ConcurrentHashMap.
With network overhead and locks contention in Terracotta, ConcurrentHashMap still far outweights HashMap. Notice how performance of HashMap doesn't degrade much when number of threads increases. This all thanks to the way Terracotta replicates only the delta changes in the map to other nodes. No serialization involved.

The gap between ConcurrentHashMap and HashMap in Terracotta isn't big as it was in the case of 1 JVM like Brian's test. I'm not sure I'm comparing oranges with apples here because obviously it's not the same test, just the same idea. We're always striving for better performance overall and each release has proven that.

Check us out and let us know what you think. http://www.terracotta.org

Hung-

13 comments:

Anonymous said...

Hey, I recently added a news widget from www.widgetmate.com to my blog. It shows the latest news, and just took a copy and paste to implement. Might interest you too.

Attila Szegedi said...

BTW, when you say Terracotta supports ConcurrentHashMap, is that the one in java.util.concurrent or its predecessor in EDU.oswego.cs.dl.util.concurrent, or both?

Unknown said...

We're currently only supporting java.util.concurrent.ConcurrentHashMap

Though if you have a strong case of using the one in oswego, I suggest you create an jirra issue to request that.

Neil Coffey said...

I got similar results for ConcurrentHashMap in my comparison with synchronized HashMap. Interestingly, even on a uniprocessor machine, I found better throughput by overestimating the number of simultaneous writes when constructing the ConcurrentHashMap.

Shahzad said...

I am unable to run this example. I am using Windows XP-SP3. The program executes & halts. The example given by Brian in his post @ ibm.com is working. The program didn't output anything, but keep on running. If I use the same example to test the performance of Synchronized Hashmap, then it gave me a very slow response time. The response time goes down to 5 second on 10K operations on 96 threads. While ConcurrentHashMap's response is 0.53 seconds on 10K operations on 96 threads. Any comments.

Javin Paul said...

Hi,
Thanks for this Nice article just to add while discussing about HashMap its worth mentioning following questions which frequently asked in Java interviews now days like How HashMap works in Java or How get() method of HashMap works in JAVA very often. on concept point of view these questions are great and expose the candidate if doesn't know deep details.

Javin
Difference between FIX4.2 vs FIX4.4

William said...

Could you tell me which tool you used to output the throughput-threads graph?
Thanks.

hhuynh said...

I used Google doc. You could make graphs from a spreadsheet data with Google Spreadsheet

William said...

Hung

Thanks

JP@ classpath in Java said...

ConcurrentHashMap is indeed best choice in case of multithreaded environment if numbers of reader is much greater than number of writer to avoid contention and to increase throughput and performance but deciding between SynchronizedHashMap and ConcurrentHashMap in Java is still requires understanding of usecases and actual environment.

GMB said...

Hi, interesting post.
For everyone that wants to run the example code, be careful because the code is not-terminating if the number of threads is > 1.
The problem is that every Thread instantiate his own CyclicBarrier, with as number of parties =
THREAD_COUNT.




So if ThreadCount = 1 the call to :


barrier.await()returns, but if ThreadCount > 1 that call never returns.We must have an instance of CyclicBarrier shared among all threads.

Jeet said...

can we have cloneable concurrentHashMap?

hashmap vs hashtable said...

See here for some more differences between ConcurrentHashMap and synchronizedMap in Java