I'm just testing out my JavaScript to toggle code sections in my blog. It's a useful feature that helps keep your blog looking short while still providing enough detail.
Hope this works :)
Tuesday, September 18, 2007
Monday, September 17, 2007
Distributed EhCache as second level cache under Hibernate
EhCache is one of the great options for a Hibernate second-level cache. Making it distributed lets multiple web applications share the same cache, which improves overall performance and availability. Terracotta 2.4.3 has built-in support for distributing EhCache 1.3.0 and 1.2.4. I will go through an example of how this can be done.
The stack could be visualized like this:
-----------        -----------
 Tomcat 1           Tomcat 2
-----------        -----------
 Hibernate          Hibernate
------------------------------
           EhCache
          TERRACOTTA
------------------------------
Terracotta is the driving force, though its presence is transparent to your web app thanks to bytecode instrumentation. First, let's take a look at enabling EhCache in the Hibernate configuration file:
hibernate.cfg.xml:
    ... database setting ...
    <property name="cache.use_query_cache">true</property>
    <property name="cache.provider_configuration_file_resource_path">ehcache.xml</property>
    <property name="cache.use_second_level_cache">true</property>
    <property name="cache.provider_class">org.hibernate.cache.EhCacheProvider</property>
    <mapping resource="Event.hbm.xml" />
    <mapping resource="Person.hbm.xml" />
These are the same standard settings that you would use in the non-distributed case: turn on the query cache, point Hibernate to ehcache.xml, and specify a cache provider. The ehcache.xml can be as simple as this:
ehcache.xml:
    <ehcache>
        <diskStore path="user.dir"/>
        <defaultCache
            maxElementsInMemory="10000"
            eternal="false"
            overflowToDisk="false"
            timeToIdleSeconds="300"
            timeToLiveSeconds="300"
            diskPersistent="false"
            diskExpiryThreadIntervalSeconds="120"
            memoryStoreEvictionPolicy="LRU"/>
        <cache name="org.hibernate.cache.StandardQueryCache"
            maxElementsInMemory="100"
            eternal="false"
            timeToIdleSeconds="120"
            timeToLiveSeconds="120"
            overflowToDisk="false"/>
        <cache name="org.hibernate.cache.UpdateTimestampsCache"
            maxElementsInMemory="5000"
            timeToIdleSeconds="120"
            timeToLiveSeconds="120"
            eternal="true"/>
    </ehcache>
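To make maxElementsInMemory and memoryStoreEvictionPolicy="LRU" a bit more concrete, here is a rough sketch in plain Java of what a bounded least-recently-used store does. It is only an illustration built on java.util.LinkedHashMap, not EhCache's actual implementation, and the class name LruStoreSketch is made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy LRU store: holds at most maxElements entries and evicts the least recently used one. */
final class LruStoreSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxElements;

    LruStoreSketch(int maxElements) {
        // accessOrder = true: every get() moves the entry to the "most recently used" end
        super(16, 0.75f, true);
        this.maxElements = maxElements;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put(); returning true drops the least recently used entry
        return size() > maxElements;
    }
}
```

With a limit of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was touched more recently.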
One thing I'd like to point out is that Terracotta efficiently persists heap memory to disk (and faults objects back in as needed), so "overflowToDisk" becomes redundant for StandardQueryCache. As a matter of fact, Terracotta doesn't honor this option. Now that the cache is set up, we need to tell Hibernate in our entity mapping files which entities we'd like to cache at runtime. In my example, I have an Event table (title, date) and a Person table (firstname, lastname) that have a many-to-many relationship through a join table PERSON_EVENT. With that in mind, let's examine Event.hbm.xml and Person.hbm.xml:
Mapping files:
Event.hbm.xml:
    <class name="events.Event" table="EVENTS">
        <cache usage="read-write" />
        <id name="id" column="EVENT_ID">
            <generator class="native" />
        </id>
        <property name="date" column="EVENT_DATE" />
        <property name="title" />
        <set name="participants" table="PERSON_EVENT" lazy="true"
             inverse="true" cascade="lock">
            <cache usage="read-write" />
            <key column="EVENT_ID" />
            <many-to-many column="PERSON_ID"
                          class="events.Person" />
        </set>
    </class>
Person.hbm.xml:
    <class name="events.Person" table="PERSON">
        <cache usage="read-write" />
        <id name="id" column="PERSON_ID">
            <generator class="native"/>
        </id>
        <property name="firstname"/>
        <property name="lastname"/>
        <set name="events" table="PERSON_EVENT">
            <cache usage="read-write" />
            <key column="PERSON_ID"/>
            <many-to-many column="EVENT_ID" class="events.Event"/>
        </set>
    </class>
Event.java:
    public class Event {
        private Long id;
        private String title;
        private String date;
        private Set participants = new HashSet();
        /* getters and setters */
    }
Person.java:
    public class Person {
        private Long id;
        private String firstname;
        private String lastname;
        private Set events = new HashSet();
        /* getters and setters */
    }
The details might be distracting, but if you're familiar with Hibernate, this is about as simple as it gets :)
Now that we're past all the settings, the fun begins with our servlets. I've created 2 servlets: CreateEvents populates our tables with data, and QueryEvents runs the queries and displays the cache hit statistics.
CreateEvents.java:
    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.hibernate.stat.QueryStatistics;
    import org.hibernate.stat.Statistics;

    public class CreateEvents extends HttpServlet {
        /* (1) populate the default data once, when the servlet is initialized */
        public void init() throws ServletException {
            super.init();
            generateData();
        }

        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            /* (2) optionally add one more person: /create?fn=...&ln=... */
            if (req.getParameter("fn") != null && req.getParameter("ln") != null) {
                EventManager.createAndStorePerson(req.getParameter("fn"),
                        req.getParameter("ln"));
            }
            resp.setContentType("text/html");
            resp.getWriter().println("<html><body>");
            /* (3) statistics for the query "from Event" */
            Statistics stats = HibernateUtil.getSessionFactory().getStatistics();
            stats.setStatisticsEnabled(true);
            QueryStatistics queryStats = stats.getQueryStatistics("from Event");
            /* (4) run the query, then print its cache miss/hit counts */
            resp.getWriter().println(
                    "Events created: " + EventManager.query("Event") + "<br/>");
            resp.getWriter().println(
                    "Event query cache miss: " + queryStats.getCacheMissCount() + "<br/>");
            resp.getWriter().println(
                    "Event query cache hit: " + queryStats.getCacheHitCount() + "<br/>");
            resp.getWriter().println("</body></html>");
        }

        private void generateData() {
            HibernateUtil.dropAndCreateDatabaseSchema();
            /* people */
            Long p1 = EventManager.createAndStorePerson("Ichigo", "Kurosaki");
            Long p2 = EventManager.createAndStorePerson("Abarai", "Renji");
            Long p3 = EventManager.createAndStorePerson("Ishida", "Uryu");
            /* events */
            Long e1 = EventManager.createAndStoreEvent("Event 1", "2007-09-30");
            Long e2 = EventManager.createAndStoreEvent("Event 2", "2007-12-01");
            /* participants in e1 */
            EventManager.addPersonToEvent(p1, e1);
            EventManager.addPersonToEvent(p2, e1);
            EventManager.addPersonToEvent(p3, e1);
            /* participants in e2 */
            EventManager.addPersonToEvent(p2, e2);
            EventManager.addPersonToEvent(p3, e2);
        }
    }
Some default data (3 persons and 2 events) is created during the init() phase (1). At (2), I added an option to add another Person to the database so that we can later use it to demonstrate cache invalidation. To get the cache hit and miss statistics, a query statistics object is created at (3). It gives us the hit/miss counts printed at (4) for the query "from Event".
In QueryEvents.java, we ask Hibernate for the lists of events and persons, which shows us whether the cache or the database is used:
QueryEvents.java:
    import java.io.IOException;
    import java.util.Iterator;
    import java.util.List;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.hibernate.LockMode;
    import org.hibernate.Session;
    import org.hibernate.stat.QueryStatistics;
    import org.hibernate.stat.Statistics;

    public class QueryEvents extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            resp.setContentType("text/html");
            resp.getWriter().println("<html><body>");
            Statistics stats = HibernateUtil.getSessionFactory().getStatistics();
            stats.setStatisticsEnabled(true);
            QueryStatistics queryStats = stats.getQueryStatistics("from Event");
            /* (1) query Event */
            List events = EventManager.query("Event");
            resp.getWriter().println(
                    "Events found: " + events + "<br/>");
            resp.getWriter().println(
                    "Event query cache miss: " + queryStats.getCacheMissCount() + "<br/>");
            resp.getWriter().println(
                    "Event query cache hit: " + queryStats.getCacheHitCount() + "<br/>");
            /* (2) query People */
            queryStats = stats.getQueryStatistics("from Person");
            resp.getWriter().println("<br/>" +
                    "People found: " + EventManager.query("Person") + "<br/>");
            resp.getWriter().println(
                    "Person query cache miss: " + queryStats.getCacheMissCount() + "<br/>");
            resp.getWriter().println(
                    "Person query cache hit: " + queryStats.getCacheHitCount() + "<br/><br/>");
            /* (3) reassociate transient Event objects to this hibernate session */
            /* so we can query for participants */
            Session session = HibernateUtil.getSessionFactory().getCurrentSession();
            session.beginTransaction();
            for (Iterator it = events.iterator(); it.hasNext();) {
                session.lock((Event) it.next(), LockMode.NONE);
            }
            resp.getWriter().println("Participants of Even 1: " + ((Event) events.get(0)).getParticipants() + "<br/>");
            resp.getWriter().println("Participants of Even 2: " + ((Event) events.get(1)).getParticipants() + "<br/>");
            session.close();
            resp.getWriter().println("</body></html>");
        }
    }
I ran these two servlets in a single Tomcat to make sure everything was working correctly.
Hitting http://localhost:8080/Events/create (mapped to the CreateEvents servlet) gives:
Events created: [Event 1: 2007-09-30, Event 2: 2007-12-01]
Event query cache miss: 1
Event query cache hit: 0
Since this is the first time we query for events, we get one cache miss: Hibernate went to the database to get the data.
Now we hit http://localhost:8080/Events/query (mapped to the QueryEvents servlet). The result is:
Events found: [Event 1: 2007-09-30, Event 2: 2007-12-01]
Event query cache miss: 1
Event query cache hit: 1
People found: [Ichigo Kurosaki, Abarai Renji, Ishida Uryu]
Person query cache miss: 1
Person query cache hit: 0
Participants of Even 1: [Abarai Renji, Ishida Uryu, Ichigo Kurosaki]
Participants of Even 2: [Abarai Renji, Ishida Uryu]
As expected, the Event query now hits the cache, raising the hit count to 1. And since it's the first time we query for Person, the cache miss count is 1. The participants lists prove that the Event POJOs coming from the cache are valid and can be re-associated with this Hibernate session. Now, if we run the CreateEvents servlet on Tomcat 1 and QueryEvents on Tomcat 2, the distributed cache should give us the same result.
This is where Terracotta comes in. No changes are needed in the Hibernate mapping files or in ehcache.xml, and there is no code change either. All you need to do is run Tomcat with Terracotta enabled, which involves passing 3 options to the Tomcat JVM:
-Xbootclasspath/p:"path/to/Terracotta/bootjar"
-Dtc.install-root=/path/to/Terracotta/install
-Dtc.config=/path/to/tc-config.xml
Detailed instructions can be found here
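If you want a quick sanity check that the two -D options listed above actually reached the Tomcat JVM, a small helper along these lines could be called from a servlet's init(). This is only a sketch: the class TerracottaOptionsCheck is hypothetical and not part of Terracotta, and it cannot verify -Xbootclasspath/p, because that is a JVM option rather than a system property.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which of the Terracotta -D properties are missing in this JVM. */
final class TerracottaOptionsCheck {
    /** The two system properties set with -D in the list above. */
    static final String[] REQUIRED = {"tc.install-root", "tc.config"};

    /** Returns the names of required properties that are unset or blank. */
    static List<String> missingProperties() {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = System.getProperty(name);
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingProperties();
        System.out.println(missing.isEmpty()
                ? "Terracotta properties are set"
                : "Missing: " + missing);
    }
}
```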
Luckily, Terracotta has a nice session configurator tool that helps you set up a cluster of 2 Tomcats (or WebLogic servers). All you need to do is import your WAR file. I created an Events.war file that contains both of my servlets and all the needed jars. I need to configure tc-config.xml to let Terracotta know that I'm using Hibernate and EhCache by adding those modules (1). Also, classes that will be shared need to be instrumented (2). Terracotta also supports session sharing, enabled by declaring your webapp name (3). However, I'm not clustering any sessions in this example.
tc-config.xml:
    <?xml version="1.0" encoding="UTF-8"?>
    <con:tc-config xmlns:con="http://www.terracotta.org/config">
        <servers>
            <server host="%i" name="localhost">
                <dso-port>9510</dso-port>
                <jmx-port>9520</jmx-port>
                <data>terracotta/server-data</data>
                <logs>terracotta/server-logs</logs>
            </server>
        </servers>
        <clients>
            <logs>terracotta/client-logs</logs>
            <!-- (1) -->
            <modules>
                <module name="clustered-hibernate-3.1.2" version="1.0.0" />
                <module name="clustered-ehcache-1.3" version="1.0.0" />
            </modules>
        </clients>
        <application>
            <dso>
                <!-- (2) -->
                <instrumented-classes>
                    <include>
                        <class-expression>events.Event</class-expression>
                    </include>
                    <include>
                        <class-expression>events.Person</class-expression>
                    </include>
                    <include>
                        <class-expression>events.EventManager</class-expression>
                    </include>
                </instrumented-classes>
                <!-- (3) -->
                <web-applications>
                    <web-application>Events</web-application>
                </web-applications>
            </dso>
        </application>
        <system>
            <configuration-model>development</configuration-model>
        </system>
    </con:tc-config>
The session configurator will start up a Terracotta server and 2 Tomcats. We can now access the CreateEvents servlet on the first Tomcat (port 9081) at http://localhost:9081/Events/create and QueryEvents on the second Tomcat at http://localhost:9082/Events/query.
The result I got is:
Events/create:
Events created: [Event 1: 2007-09-30, Event 2: 2007-12-01]
Event query cache miss: 1
Event query cache hit: 0
Events/query:
Events found: [Event 1: 2007-09-30, Event 2: 2007-12-01]
Event query cache miss: 0
Event query cache hit: 1
People found: [Ichigo Kurosaki, Abarai Renji, Ishida Uryu]
Person query cache miss: 1
Person query cache hit: 0
Participants of Even 1: [Abarai Renji, Ishida Uryu, Ichigo Kurosaki]
Participants of Even 2: [Abarai Renji, Ishida Uryu]
If I reload Events/query, the statistics are as expected:
Event query cache miss: 0
Event query cache hit: 2
Person query cache miss: 1
Person query cache hit: 1
To test cache invalidation, note that after hitting Events/query the cache holds a list of 3 persons. If we create a new person by hitting http://localhost:9081/Events/create?fn=John&ln=Smith, that cached list becomes stale. Thanks to Terracotta, Hibernate on the second Tomcat is aware of this and the stale data is invalidated, which leads to a cache miss (instead of a hit) when we reload http://localhost:9082/Events/query:
Event query cache miss: 0
Event query cache hit: 3
People found: [Ichigo Kurosaki, Abarai Renji, Ishida Uryu, John Smith]
Person query cache miss: 2
Person query cache hit: 1
As you can see, the Event query cache hit count continues to rise, while the Person query now has another cache miss because its cached data was invalidated. Hibernate had to hit the database for fresh data.
I hope I didn't bore you with too many details, but I think it's important to show each step. Terracotta is greatly beneficial if you choose to use EhCache as a distributed cache with Hibernate.
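The Person counts above can be reproduced with a toy model in plain Java. Hibernate decides whether a cached query result is still usable by comparing the time it was cached against the last update time recorded for the tables involved (that is what UpdateTimestampsCache is for). The sketch below imitates that idea with a logical clock. The class QueryCacheSketch and its methods are invented for illustration and are not Hibernate or Terracotta code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a query cache that is invalidated by update timestamps. */
final class QueryCacheSketch {
    /** A cached result plus the logical time at which it was stored. */
    private static final class Entry {
        final List<String> rows;
        final long cachedAt;
        Entry(List<String> rows, long cachedAt) {
            this.rows = rows;
            this.cachedAt = cachedAt;
        }
    }

    private final Map<String, List<String>> database = new HashMap<>(); // table -> rows
    private final Map<String, Long> lastUpdate = new HashMap<>();       // table -> time of last write
    private final Map<String, Entry> queryCache = new HashMap<>();      // table -> cached query result
    private final Map<String, int[]> stats = new HashMap<>();           // table -> {hits, misses}
    private long clock = 0;                                             // logical clock, not wall time

    /** Writes a row and records when the table was last modified. */
    void insert(String table, String row) {
        database.computeIfAbsent(table, t -> new ArrayList<>()).add(row);
        lastUpdate.put(table, ++clock);
    }

    /** Returns all rows of a table, from the cache if the cached result is newer than the last write. */
    List<String> query(String table) {
        int[] counters = stats.computeIfAbsent(table, t -> new int[2]);
        Entry cached = queryCache.get(table);
        long updatedAt = lastUpdate.getOrDefault(table, 0L);
        if (cached != null && cached.cachedAt > updatedAt) {
            counters[0]++; // hit: nothing was written since the result was cached
            return cached.rows;
        }
        counters[1]++;     // miss: go to the "database" and cache a fresh copy
        List<String> rows = new ArrayList<>(database.getOrDefault(table, new ArrayList<>()));
        queryCache.put(table, new Entry(rows, ++clock));
        return rows;
    }

    int hits(String table)   { return stats.getOrDefault(table, new int[2])[0]; }
    int misses(String table) { return stats.getOrDefault(table, new int[2])[1]; }

    public static void main(String[] args) {
        QueryCacheSketch cache = new QueryCacheSketch();
        cache.insert("Person", "Ichigo Kurosaki");
        cache.insert("Person", "Abarai Renji");
        cache.insert("Person", "Ishida Uryu");
        cache.query("Person");                 // first query: miss
        cache.query("Person");                 // reload: hit
        cache.insert("Person", "John Smith");  // write makes the cached list stale
        cache.query("Person");                 // reload: miss again
        System.out.println("Person query cache miss: " + cache.misses("Person")); // prints 2
        System.out.println("Person query cache hit: " + cache.hits("Person"));    // prints 1
    }
}
```

In the real setup the counters are Hibernate's query statistics and the two maps live in EhCache regions shared through Terracotta, which is why a write on the first Tomcat is seen by the second.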
You can download the project here and give it a try.
Labels:
ehcache
Thursday, September 06, 2007
Simple directory browser for Amazon S3
If you have used Amazon S3, you might be annoyed at the lack of support for directory browsing. If you hit your repo URL, all you get is an XML file listing your content. So I wrote a small JavaScript that reads that XML file and displays it as a directory tree, just as if you were browsing an FTP site.
For example, say Terracotta has an S3 repo at http://download.terracotta.org. If you click on that link, you'll get an XML file. Now say I want to see what we have under http://download.terracotta.org/maven2: trying to go there gets you nothing but an error.
My JavaScript is on the page http://download.terracotta.org/maven2/index.html
Just right-click on that page and read the source; the JavaScript is pretty simple. It could have been made better or fancier, but I'm no JavaScript guru :)
Maybe it could be of use to you.
Note: S3 doesn't serve an index.html file automatically (after all, it's not a web server), so you have to hit index.html explicitly.
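The core of the idea is small enough to sketch. The version below is written in Java rather than JavaScript to stay with the language used elsewhere on this blog, and it is not the script from the page above. It assumes the listing is the usual S3 ListBucketResult document, where each object appears as a Contents element with a Key child. The class name S3ListingSketch and the sample keys in the usage note are made up.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Turns a flat S3 key listing into the immediate children of one "directory". */
final class S3ListingSketch {
    /**
     * Returns the entries directly under prefix (for example "maven2/"):
     * files by name, subdirectories with a trailing slash, sorted alphabetically.
     */
    static Set<String> children(String listingXml, String prefix) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(listingXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse bucket listing", e);
        }
        NodeList keys = doc.getElementsByTagName("Key");
        Set<String> result = new TreeSet<>();
        for (int i = 0; i < keys.getLength(); i++) {
            String key = keys.item(i).getTextContent().trim();
            if (!key.startsWith(prefix) || key.length() == prefix.length()) {
                continue; // outside this directory, or the directory marker itself
            }
            String rest = key.substring(prefix.length());
            int slash = rest.indexOf('/');
            // Keep only the first path segment; a slash means it is a subdirectory
            result.add(slash < 0 ? rest : rest.substring(0, slash + 1));
        }
        return result;
    }
}
```

Given keys such as maven2/index.html, maven2/org/a/a.jar and maven2/org/b/b.jar, calling children(xml, "maven2/") yields index.html and org/, which is what a directory page for maven2 would need to render.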