For N threads concurrently execute a loop that chooses a random key and looks up value corresponding to that key. If the value is found, it is removed with a probability of 0.02. If not, it is added to the map with a probability of 0.6.
The result is ConcurrentHashMap performs much better when number of threads increases. Below is the graph taken from the book with the author's permission.
package tc.qa;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;
public class ConcurrentHashMapLoadTest extends Thread {
private static double WRITE_PROBABILITY = 0.6;
private static double REMOVE_PROBABILITY = 0.02;
private static int THREAD_COUNT = 4;
private static int VM_COUNT = 4;
private static int RUNTIME = 5 * 60 * 1000;
private static int KEY_RANGE = 100000;
// roots - in TC world, these are static finals and accessible across JVMs
private Map map = new ConcurrentHashMap();
private AtomicInteger throughput = new AtomicInteger(0);
private CyclicBarrier barrier;
private Random random;
private int op;
public ConcurrentHashMapLoadTest() {
random = new Random();
barrier = new CyclicBarrier(THREAD_COUNT);
}
public void run() {
// ready
if (barrier() == 0) {
System.out.println("Started...");
}
// go
long start = System.currentTimeMillis();
while (System.currentTimeMillis() - start < RUNTIME) {
Integer key = new Integer(random.nextInt(KEY_RANGE));
if (get(map, key) != null) {
if (random.nextDouble() < REMOVE_PROBABILITY) {
remove(map, key);
op++;
}
} else {
if (random.nextDouble() < WRITE_PROBABILITY) {
put(map, key, key);
op++;
}
}
op++;
}
throughput.addAndGet(op);
if (barrier() == 0) {
System.out.println("Map type: "
+ map.getClass().getSimpleName());
System.out.println("Runtime: " + RUNTIME);
System.out.println("Number of threads: " + THREAD_COUNT);
System.out.println("Write probability: " + WRITE_PROBABILITY);
System.out.println("Remove probability: " + REMOVE_PROBABILITY);
System.out.println("Ops per second: "
+ (throughput.intValue() * 1000.0 / RUNTIME));
}
}
private int barrier() {
try {
return barrier.await();
} catch (Exception e) {
e.printStackTrace();
}
return -1;
}
private Object get(Map map, Object key) {
if (map instanceof ConcurrentHashMap) {
return map.get(key);
} else {
synchronized (map) {
return map.get(key);
}
}
}
private void put(Map map, Object key, Object value) {
if (map instanceof ConcurrentHashMap) {
map.put(key, value);
} else {
synchronized (map) {
map.put(key, value);
}
}
}
private void remove(Map map, Object key) {
if (map instanceof ConcurrentHashMap) {
map.remove(key);
} else {
synchronized (map) {
map.remove(key);
}
}
}
public static void getParams() {
WRITE_PROBABILITY = Double.parseDouble(System.getProperty("wp", "0.6"));
REMOVE_PROBABILITY = Double.parseDouble(System.getProperty("rmp", "0.02"));
THREAD_COUNT = Integer.parseInt(System.getProperty("thread", "4"));
VM_COUNT = Integer.parseInt(System.getProperty("vm", "4"));
RUNTIME = Integer.parseInt(System.getProperty("runtime", "300000"));
}
public static void main(String[] args) {
getParams();
int threads_per_vm = THREAD_COUNT / VM_COUNT;
for (int t = 0; t < threads_per_vm; t++) {
ConcurrentHashMapLoadTest thread = new ConcurrentHashMapLoadTest();
thread.start();
}
}
}
I wrapped the put(), get() methods to easily switch the map types. After running this test on 4 nodes with jdk1.6.0_01, I got result as below:
The Y axis is throughput, normalized to throughput of 4 threads in ConcurrentHashMap.
With network overhead and locks contention in Terracotta, ConcurrentHashMap still far outweights HashMap. Notice how performance of HashMap doesn't degrade much when number of threads increases. This all thanks to the way Terracotta replicates only the delta changes in the map to other nodes. No serialization involved.
The gap between ConcurrentHashMap and HashMap in Terracotta isn't big as it was in the case of 1 JVM like Brian's test. I'm not sure I'm comparing oranges with apples here because obviously it's not the same test, just the same idea. We're always striving for better performance overall and each release has proven that.
Check us out and let us know what you think. http://www.terracotta.org
Hung-
13 comments:
Hey, I recently added a news widget from www.widgetmate.com to my blog. It shows the latest news, and just took a copy and paste to implement. Might interest you too.
BTW, when you say Terracotta supports ConcurrentHashMap, is that the one in java.util.concurrent or its predecessor in EDU.oswego.cs.dl.util.concurrent, or both?
We're currently only supporting java.util.concurrent.ConcurrentHashMap
Though if you have a strong case of using the one in oswego, I suggest you create an jirra issue to request that.
I got similar results for ConcurrentHashMap in my comparison with synchronized HashMap. Interestingly, even on a uniprocessor machine, I found better throughput by overestimating the number of simultaneous writes when constructing the ConcurrentHashMap.
I am unable to run this example. I am using Windows XP-SP3. The program executes & halts. The example given by Brian in his post @ ibm.com is working. The program didn't output anything, but keep on running. If I use the same example to test the performance of Synchronized Hashmap, then it gave me a very slow response time. The response time goes down to 5 second on 10K operations on 96 threads. While ConcurrentHashMap's response is 0.53 seconds on 10K operations on 96 threads. Any comments.
Hi,
Thanks for this Nice article just to add while discussing about HashMap its worth mentioning following questions which frequently asked in Java interviews now days like How HashMap works in Java or How get() method of HashMap works in JAVA very often. on concept point of view these questions are great and expose the candidate if doesn't know deep details.
Javin
Difference between FIX4.2 vs FIX4.4
Could you tell me which tool you used to output the throughput-threads graph?
Thanks.
I used Google doc. You could make graphs from a spreadsheet data with Google Spreadsheet
Hung
Thanks
ConcurrentHashMap is indeed best choice in case of multithreaded environment if numbers of reader is much greater than number of writer to avoid contention and to increase throughput and performance but deciding between SynchronizedHashMap and ConcurrentHashMap in Java is still requires understanding of usecases and actual environment.
Hi, interesting post.
For everyone that wants to run the example code, be careful because the code is not-terminating if the number of threads is > 1.
The problem is that every Thread instantiate his own CyclicBarrier, with as number of parties =
THREAD_COUNT.
So if ThreadCount = 1 the call to :
barrier.await()returns, but if ThreadCount > 1 that call never returns.We must have an instance of CyclicBarrier shared among all threads.
can we have cloneable concurrentHashMap?
See here for some more differences between ConcurrentHashMap and synchronizedMap in Java
Post a Comment