Thursday, December 18, 2008

Link checker with Java - better version

I posted an older entry about checking for broken links with Java. However, that version relies on the third-party library HttpUnit and doesn't handle redirects: if a URL redirects to another URL, the HTTP response code is 302 and the old version considers the link broken.

The new version follows the redirect and checks the target URL as well, so a dead redirect target is also detected.

So here is the new code, with no third-party libraries required:


import java.net.HttpURLConnection;
import java.net.URL;

// (Class wrapper and imports added so the snippet compiles on its own; the post only showed the two methods.)
public class LinkChecker {

    private static boolean isLive(String link) {
        HttpURLConnection urlConnection = null;
        try {
            URL url = new URL(link);
            urlConnection = (HttpURLConnection) url.openConnection();
            urlConnection.setRequestMethod("HEAD");
            urlConnection.setConnectTimeout(5000); /* timeout after 5s if can't connect */
            urlConnection.setReadTimeout(5000); /* timeout after 5s if the page is too slow */
            urlConnection.connect();

            // Follow a redirect (e.g. a 302 with a Location header) and check the target instead.
            String redirectLink = urlConnection.getHeaderField("Location");
            if (redirectLink != null && !link.equals(redirectLink)) {
                return isLive(redirectLink);
            } else {
                return urlConnection.getResponseCode() == HttpURLConnection.HTTP_OK;
            }
        } catch (Exception e) {
            return false;
        } finally {
            if (urlConnection != null) {
                urlConnection.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(isLive("http://google.com"));
        System.out.println(isLive("http://somefakelink.com"));
    }
}
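
As a quick usage sketch (not from the original post), you could loop over a list of URLs and print only the broken ones. The helper name and the URLs below are just placeholders; it is meant to sit next to isLive():

// Hypothetical helper: reports dead links from a hard-coded list.
private static void reportBrokenLinks(String... links) {
    for (String link : links) {
        if (!isLive(link)) {
            System.out.println("BROKEN: " + link);
        }
    }
}

// e.g. reportBrokenLinks("http://google.com", "http://somefakelink.com");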

Simple demo of clustering Ehcache

Clustering a cache between two or more JVMs is quite simple if you use Terracotta underneath. It handles the networking and the shared memory transparently for you: there's no extra API, just a configuration file.

Terracotta also has Maven plugin support, so you can grab the demo project below and run it without downloading anything else.

The demo is a single Java source file, meant to be run in two VMs.

I use a CyclicBarrier to synchronize the two VMs: each await() call blocks until both VMs reach it. One node creates a cache and populates it with two elements; the other node then queries the cache to demonstrate that it can see the shared data.

package demo;

import java.util.concurrent.CyclicBarrier;

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;
import net.sf.ehcache.store.MemoryStoreEvictionPolicy;

public class EhcacheDemo {

    private CacheManager manager = CacheManager.create();
    private CyclicBarrier barrier = new CyclicBarrier(2);

    public void run() throws Exception {
        Cache testCache;

        System.out.println("Waiting for the other node...");
        if (barrier.await() == 0) {
            System.out.println("Creating the testCache...");
            // Args: name, max elements in memory, eviction policy, overflow to disk, disk store path,
            // eternal, time-to-live (s), time-to-idle (s), disk persistent, disk expiry thread interval (s), listeners
            testCache = new Cache("test", 100, MemoryStoreEvictionPolicy.LFU, true,
                    null, false, 60, 30, false, 0, null);
            manager.addCache(testCache);
        }

        barrier.await();
        testCache = manager.getCache("test");

        if (barrier.await() == 0) {
            System.out.println("First node: adding 2 elements...");
            testCache.put(new Element("k1", "v1"));
            testCache.put(new Element("k2", "v2"));
        } else {
            System.out.println("Second node: querying 3 elements...");
            System.out.println(testCache.get("k1"));
            System.out.println(testCache.get("k2"));
            System.out.println(testCache.get("k3"));
        }

        System.out.println("Done!");
    }

    public static void main(String[] args) throws Exception {
        new EhcacheDemo().run();
    }

}


Note that the second node also queries for the non-existent element "k3" as a sanity check.
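
Ehcache's get() simply returns null for a key that isn't in the cache, so real code would typically guard before touching the Element. A minimal sketch (the println formatting is just for illustration):

Element e = testCache.get("k3");
if (e == null) {
    System.out.println("k3 is not in the cache");
} else {
    System.out.println("k3 = " + e.getValue());
}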

So how does Terracotta know what I want to share? I tell it to share the manager and barrier fields from the code above by marking them as roots in the Terracotta configuration file, tc-config.xml:

<application>
  <dso>
    <instrumented-classes/>
    <roots>
      <root>
        <field-name>demo.EhcacheDemo.manager</field-name>
      </root>
      <root>
        <field-name>demo.EhcacheDemo.barrier</field-name>
      </root>
    </roots>
  </dso>
</application>



After you download the project, just run these Maven commands:

mvn install
mvn tc:run


The output would be something like this:

[INFO] [node1] Waiting for the other node...
[INFO] [node0] Waiting for the other node...
[INFO] [node0] Creating the testCache...
[INFO] [node1] First node: adding 2 elements...
[INFO] [node0] Second node: querying 3 elements...
[INFO] [node1] Done!
[INFO] [node0] [ key = k1, value=v1, version=1, hitCount=2, CreationTime = 1229592169906, LastAccessTime = 1229592170081 ]
[INFO] [node0] [ key = k2, value=v2, version=1, hitCount=2, CreationTime = 1229592170070, LastAccessTime = 1229592170145 ]
[INFO] [node0] null
[INFO] [node0] Done!

Sunday, September 21, 2008

Compile and test with different JDK with a Maven project

If you ever need to compile and run the tests of your Maven project with different JDKs, here are some tips:

1. The cheapest way



By default, Maven uses your JAVA_HOME to compile the project and then run the tests. So if you want to run your tests with a different JDK, just point JAVA_HOME at the JDK of your choice. Here's a neat trick in Bash to set an environment variable just for a single command.

Assuming you have environment variables JAVA_HOME_15 and JAVA_HOME_16 pointing to a JDK 1.5 and a JDK 1.6 install respectively, you can then do this:


%> JAVA_HOME=$JAVA_HOME_15 mvn install # run with 1.5
%> JAVA_HOME=$JAVA_HOME_16 mvn install # run with 1.6


2. The Maven way

(hairy, verbose, and messy pom.xml profiles)

This only worked for me with Maven 2.1.0-M1; it failed with 2.0.9.

You have to specify which JDK you want for two plugins: maven-compiler-plugin and maven-surefire-plugin.


<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <verbose>true</verbose>
        <fork>true</fork>
        <executable>${jdk}/bin/javac</executable>
        <compilerVersion>1.5</compilerVersion>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
      <configuration>
        <jvm>${jdk}/bin/java</jvm>
        <forkMode>once</forkMode>
      </configuration>
    </plugin>
  </plugins>
</build>


As you can see, I use the property ${jdk}. Now we just need to define a profile for each JDK we want and set ${jdk} accordingly.


<profiles>
  <profile>
    <id>default_jdk</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <properties>
      <jdk>${env.JAVA_HOME}</jdk>
    </properties>
  </profile>
  <profile>
    <id>jdk15</id>
    <activation>
      <activeByDefault>false</activeByDefault>
    </activation>
    <properties>
      <jdk>${env.JAVA_HOME_15}</jdk>
    </properties>
  </profile>
  <profile>
    <id>jdk16</id>
    <activation>
      <activeByDefault>false</activeByDefault>
    </activation>
    <properties>
      <jdk>${env.JAVA_HOME_16}</jdk>
    </properties>
  </profile>
</profiles>


Above I defined three profiles. The first one is the default; it's a trick to set ${jdk} to $JAVA_HOME so that Maven knows what to use when you don't specify any profile.

Here is how you would activate each profile:


%> mvn install # no profile so by default run tests with JAVA_HOME
%> mvn install -Pjdk15 # run with jdk1.5
%> mvn install -Pjdk16 # run with jdk1.6


Note that the profiles also use the environment variables JAVA_HOME_15 and JAVA_HOME_16. You can, of course, fill in an absolute path to your JDK instead.
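
If you want to double-check which JVM the surefire fork actually used, a throwaway test like this (hypothetical, JUnit 3 style to match the era) will print it:

import junit.framework.TestCase;

// Hypothetical sanity check: prints the JVM version the test fork is running on.
public class JdkVersionTest extends TestCase {
    public void testPrintJavaVersion() {
        System.out.println("Tests running on java.version = " + System.getProperty("java.version"));
    }
}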

So there you have it. Took me a while to figure out.

Wednesday, March 26, 2008

Bug bite

It's a very subtle bug, but it results in a great headache. I didn't notice it until someone pointed it out. Good lesson learned: always have tests when you refactor.

See if you could spot it.


// BEFORE
private LockID[] getAllLockIDs(LockID id) {
    LockID[] lids = new LockID[transactionStack.size() + 1];
    lids[0] = id;
    for (int i = 1; i < lids.length; i++) {
        TransactionContext tc = (TransactionContext) transactionStack.get(i - 1);
        lids[i] = tc.getLockID();
    }
    return lids;
}


// AFTER
private List getAllLockIDs(LockID id) {
    List lids = new ArrayList();
    lids.add(id);
    for (int i = 1; i < transactionStack.size(); i++) {
        TransactionContext tc = (TransactionContext) transactionStack.get(i - 1);
        lids.add(tc.getLockID());
    }
    return lids;
}
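
(Spoiler, in case you want to check your answer.) The refactored loop is bounded by transactionStack.size() instead of size() + 1, so the last entry in the transaction stack is silently dropped. Here is a small self-contained sketch of the same off-by-one, with plain strings standing in for the LockIDs:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OffByOneDemo {
    public static void main(String[] args) {
        List<String> transactionStack = Arrays.asList("t1", "t2", "t3");

        // Mirrors the AFTER version: the loop stops at i < size(), so "t3" is never added.
        List<String> lids = new ArrayList<String>();
        lids.add("id");
        for (int i = 1; i < transactionStack.size(); i++) {
            lids.add(transactionStack.get(i - 1));
        }
        System.out.println(lids); // prints [id, t1, t2] -- "t3" is missing
    }
}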

Thursday, February 07, 2008

Pod Racing: How to synchronize threads across multiple JVMs

Sometimes you want to start a job in multiple JVMs only when all of the VMs have started. This scenario often comes up in multiplayer games. Let's take a pod racing game as an example: the players need to start at the same time, and each game server handles one player (for simplicity's sake).

Starting with Java 5, there's a CyclicBarrier to synchronize threads, and it's a perfect tool for this. However, it only handles threads within one VM. D'oh! Here's where Terracotta comes in: if you make that CyclicBarrier a shared object, Terracotta handles the clustering for you, transparently (no API). All of a sudden, your CyclicBarrier object is visible across all the VMs participating in the game.

Here's an example of how to do it. First, let's take a look at the PodRacer class:

package demo;

import java.util.concurrent.CyclicBarrier;

public class PodRacer {
    public final static int COUNT = 2;
    private final CyclicBarrier barrier = new CyclicBarrier(COUNT);
    private String name;

    public PodRacer(String name) {
        this.name = name;
    }

    public void ready() {
        System.out.println(name + ": ready");
    }

    public void set() throws Exception {
        /* in Terracotta world, all threads in different VMs will block here */
        barrier.await();
    }

    public void go() throws Exception {
        System.out.println(name + ": go");
        Thread.sleep((int) (Math.random() * 5000) + 10);
        System.out.println(name + ": arrived at " + System.currentTimeMillis());
    }

    public static void main(String[] args) throws Exception {
        PodRacer racer = new PodRacer(System.getProperty("racer.name", "unknown"));
        racer.ready();
        racer.set();
        racer.go();
    }
}


Notice that I hardcoded the number of racers to 2, but it can be made dynamic (see the sketch below). If you run this class as a normal Java program, it will block at the racer.set() call, because only one thread arrives at the barrier while two are required for the barrier to be released. And without Terracotta underneath, it doesn't matter how many VMs you start; each of them will just block there.
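
For example, a small tweak like the following (hypothetical; the racer.count property is not part of the original post) would let each launch decide the party size, as long as every VM is started with the same value:

// Hypothetical: read the number of racers from -Dracer.count, defaulting to 2.
public final static int COUNT = Integer.getInteger("racer.count", 2);
private final CyclicBarrier barrier = new CyclicBarrier(COUNT);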

Next, I'll add Terracotta to the mix and share the barrier object. That is done by marking it as a root in the Terracotta configuration file, tc-config.xml:

<?xml version="1.0" encoding="UTF-8"?>
<con:tc-config xmlns:con="http://www.terracotta.org/config">
  <servers>
    <server host="%i" name="localhost">
      <dso-port>9510</dso-port>
      <jmx-port>9520</jmx-port>
      <data>terracotta/server-data</data>
      <logs>terracotta/server-logs</logs>
    </server>
  </servers>
  <clients>
    <logs>terracotta/client-logs</logs>
  </clients>
  <application>
    <dso>
      <instrumented-classes/>
      <roots>
        <root>
          <field-name>demo.PodRacer.barrier</field-name>
        </root>
      </roots>
    </dso>
  </application>
</con:tc-config>


And that's it. When I start two JVMs running the PodRacer class with Terracotta, I effectively have two threads calling racer.set() on the same shared barrier.

To make it even easier to try out with Terracotta, there's a Maven 2 plugin that will handle starting up your project with Terracotta enabled. Here's the pom.xml of my project:


<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>PodRacing</groupId>
  <artifactId>PodRacing</artifactId>
  <version>0.0.1</version>
  <build>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.5</source>
          <target>1.5</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.terracotta.maven.plugins</groupId>
        <artifactId>tc-maven-plugin</artifactId>
        <version>1.0.3</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>bootjar</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <processes>
            <process nodeName="racer1"
                     className="demo.PodRacer"
                     jvmargs="-Dracer.name=Anakin"/>
            <process nodeName="racer2"
                     className="demo.PodRacer"
                     jvmargs="-Dracer.name=Sebulba"/>
          </processes>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <pluginRepositories>
    <pluginRepository>
      <releases />
      <snapshots />
      <id>terracotta</id>
      <url>http://download.terracotta.org/maven2</url>
    </pluginRepository>
  </pluginRepositories>
</project>


Notice how I defined the two racers through the process elements in the plugin configuration. That will start two distinct VMs with Terracotta enabled. The output looks something like this:

[INFO] Starting DSO nodes
[INFO] Starting node racer1: c:\jdk\jdk1.6.0_02\jre/bin/java.exe -Dcom.tc.l1.modules.repositories=file:/C:/Users/hhuynh/.m2/repository/ -Dtc.nodeName=racer1 -Dtc.numberOfNodes=2 -Dtc.config=d:\work\workspace\projects\PodRacing\tc-config.xml -Dtc.classpath=file:/c:/Users/hhuynh/AppData/Local/Temp/tc-classpath37736.tmp -Dtc.session.classpath=/C:/Users/hhuynh/.m2/repository/org/terracotta/tc-session/2.5.0/tc-session-2.5.0.jar -Dcom.tc.l1.modules.repositories=file:/C:/Users/hhuynh/.m2/repository/ -Xbootclasspath/p:d:\work\workspace\projects\PodRacing\target\dso-boot.jar -Dracer.name=Anakin -cp d:\work\workspace\projects\PodRacing\target\classes; demo.PodRacer
[INFO] Starting node racer2: c:\jdk\jdk1.6.0_02\jre/bin/java.exe -Dcom.tc.l1.modules.repositories=file:/C:/Users/hhuynh/.m2/repository/ -Dtc.nodeName=racer2 -Dtc.numberOfNodes=2 -Dtc.config=d:\work\workspace\projects\PodRacing\tc-config.xml -Dtc.classpath=file:/c:/Users/hhuynh/AppData/Local/Temp/tc-classpath37737.tmp -Dtc.session.classpath=/C:/Users/hhuynh/.m2/repository/org/terracotta/tc-session/2.5.0/tc-session-2.5.0.jar -Dcom.tc.l1.modules.repositories=file:/C:/Users/hhuynh/.m2/repository/ -Xbootclasspath/p:d:\work\workspace\projects\PodRacing\target\dso-boot.jar -Dracer.name=Sebulba -cp d:\work\workspace\projects\PodRacing\target\classes; demo.PodRacer
[INFO] ------------------------------------------------------------------------

[INFO] [racer2] Sebulba: ready
[INFO] [racer1] Anakin: ready
[INFO] [racer1] Anakin: go
[INFO] [racer2] Sebulba: go
[INFO] [racer2] Sebulba: arrived at 1202406914274
[INFO] Finished node racer2
[INFO] [racer1] Anakin: arrived at 1202406915125
[INFO] Finished node racer1


Ouch, Anakin sucked!

Download the Maven project and try it out. Have fun.

P.S. More about sharing memory between JVMs here http://unserializableone.blogspot.com/2006/10/share-precious-heap-memory-accross.html or visit Terracotta