Wednesday, July 15, 2009

JPA implementation patterns: Data Access Objects

JPA, short for the Java Persistence API, is part of the Java EE 5 specification and has been implemented by Hibernate, TopLink, EclipseLink, OpenJPA, and a number of other object-relational mapping (ORM) frameworks. Because JPA was originally designed as part of the EJB 3.0 specification, you can use it within an EJB 3.0 application. But it works equally well outside of EJB 3.0, for example in a Spring application. And when even Gavin King, the designer of Hibernate, recommends using JPA in the second edition of Hibernate in Action, a.k.a. Java Persistence with Hibernate, it's obvious that JPA is here to stay.

Once you get over your fear of annotations ;-), you find that there is plenty of literature out there that explains the objects and methods within the API, the way these objects work together, and how you can expect them to be implemented. And when you stick to hello-world-style programs, it all seems pretty straightforward. But when you start writing your first real application, you find that things are not so simple. The abstraction provided by JPA is pretty leaky and has ramifications for larger parts of your application than just your Data Access Objects (DAO's) and your domain objects. You need to make decisions on how to handle transactions, lazy loading, detached objects (think web frameworks), inheritance, and more. And it turns out that the books and the articles don't really help you here.

At least, that is what I discovered when I really started using JPA for the first time. In the coming weeks, I would like to discuss the choices I came up against and the decisions I made and why I made them. When I'm done, we'll have a number of what I would like to not-too-modestly call JPA implementation patterns. ;-)

Do we really need a DAO?

So, let's start with the thing you would probably write first in your JPA application: the data access object (DAO). An interesting point to tackle before we even start is whether you need a DAO at all when using JPA. The conclusion of that discussion more than a year ago was "It depends", and while it is very hard to argue with such a conclusion :-), I would like to stick with the idea that a DAO does have its place in a JPA application. Arguably it provides only a thin layer on top of JPA, but more importantly, making a DAO per entity type gives you these advantages:

  • Instead of having to pick the right EntityManager method every time you want to store or load data, you decide which one to use once and you and your whole team can easily stick to that choice.
  • You can disallow certain operations for certain entity types. For example, you might never want your code to remove log entries. When using DAO's, you just do not add a remove method to your LogEntry DAO (see the sketch after this list).
  • Theoretically, by using DAO's you could switch to another persistence system (like plain JDBC or iBATIS). But because JPA is such a leaky abstraction, I don't think that is realistically possible for even a slightly complex application. You do, however, get a single point of entry where you can add tracing features or keep performance statistics.
  • You can centralize all the queries on a certain entity type instead of scattering them through your code. You could use named queries to keep the queries with the entity type, but you'd still need some central place where the right parameters are set. Putting the query, the code that sets the parameters, and the cast to the correct return type together in the DAO seems simpler. For example:
    public List<ChangePlan> findExecutingChangePlans() {
        Query query = entityManager.createQuery(
            "SELECT plan FROM ChangePlan plan WHERE plan.state = 'EXECUTING'");
        return (List<ChangePlan>) query.getResultList();
    }
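
As an aside on the second point in the list above, a read-only DAO can simply leave out the mutating methods. A minimal sketch, where the LogEntry entity and the finder method are made up for illustration:

public interface LogEntryDao {
    // Deliberately no persist() or remove(): log entries are only ever read through this DAO.
    LogEntry findById(Long id);
    List<LogEntry> findBySource(String source);
}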

So when you decide you are going to use DAO's, how do you go about writing them? The highlighted (in bold) comment in the Javadoc for Spring's JpaTemplate seems to suggest that there's not much point in using that particular class, which also makes JpaDaoSupport superfluous. Instead you can write your JPA DAO as a POJO using the @PersistenceContext annotation to get an EntityManager reference. It will work in an EJB 3.0 container and it will work in Spring 2.0 and up if you add the PersistenceAnnotationBeanPostProcessor bean to your Spring context.
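
For reference, a minimal sketch of the Spring XML needed for that setup could look like the following (the persistence unit name "orderStore" is a made-up example, not taken from the original post):

<!-- Injects the EntityManager into fields annotated with @PersistenceContext -->
<bean class="org.springframework.orm.jpa.support.PersistenceAnnotationBeanPostProcessor"/>

<!-- Creates the EntityManagerFactory from META-INF/persistence.xml -->
<bean id="entityManagerFactory"
      class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="persistenceUnitName" value="orderStore"/>
</bean>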

The type-safe generic DAO pattern

Because each DAO shares a lot of functionality with the other DAO's, it makes sense to have a base class with the shared functionality and then subclass from that for each specific DAO. There are a lot of blogs out there about such a type-safe generic DAO pattern and you can even download some code from Google Code. When we combine elements from all these sources, we get the following JPA implementation pattern for DAO's.

The entity class

Let's say we want to persist the following Order class:

@Entity
@Table(name = "ORDERS")
public class Order {
    @Id
    @GeneratedValue
    private int id;
    private String customerName;
    private Date date;

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    public String getCustomerName() { return customerName; }
    public void setCustomerName(String customerName) { this.customerName = customerName; }

    public Date getDate() { return date; }
    public void setDate(Date date) { this.date = date; }
}

Don't worry too much about the details of this class. We will revisit the specifics in other JPA implementation patterns. The @Table annotation is there because ORDER is a reserved keyword in SQL.

The DAO interfaces

First we define a generic DAO interface with the methods we'd like all DAO's to share:

public interface Dao<K, E> {
    void persist(E entity);
    void remove(E entity);
    E findById(K id);
}

The first type parameter, K, is the type to use as the key and the second type parameter, E, is the type of the entity. In addition to the basic persist, remove, and findById methods, you might also like to add a List<E> findAll() method (a sketch of an implementation follows the base DAO class below). But like the entity class itself, we will revisit the DAO methods in later JPA implementation patterns.

Then we define one subinterface for each entity type we want to persist, adding any entity specific methods we want. For example, if we'd like to be able to query all orders that have been added since a certain date, we can add such a method:

public interface OrderDao extends Dao<Integer, Order> {
    List<Order> findOrdersSubmittedSince(Date date);
}

The base DAO implementation

The third step is to create a base JPA DAO implementation. It will have a basic implementation of all the methods in the standard Dao interface we created in step 1:

public abstract class JpaDao<K, E> implements Dao<K, E> {
    protected Class<E> entityClass;

    @PersistenceContext
    protected EntityManager entityManager;

    public JpaDao() {
        ParameterizedType genericSuperclass = (ParameterizedType) getClass().getGenericSuperclass();
        this.entityClass = (Class<E>) genericSuperclass.getActualTypeArguments()[1];
    }

    public void persist(E entity) { entityManager.persist(entity); }

    public void remove(E entity) { entityManager.remove(entity); }

    public E findById(K id) { return entityManager.find(entityClass, id); }
}

Most of the implementation is pretty straightforward. Some points to note though:

  • The constructor of JpaDao uses the reflection trick proposed by my colleague Arjan Blokzijl to determine the entity class.
  • The @PersistenceContext annotation causes the EJB 3.0 container or Spring to inject the entity manager.
  • The entityManager and entityClass fields are protected so that subclasses, i.e. specific DAO implementations, can access them.
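
Earlier I mentioned an optional List<E> findAll() method. If you decide to add it to the Dao interface, a minimal sketch of its implementation in this base class (keeping the same query and casting style as the rest of the example) could be:

public List<E> findAll() {
    // Note: strict JPQL expects the entity name here; using entityClass.getName()
    // follows the convention of the specific DAO shown below.
    Query query = entityManager.createQuery(
        "SELECT e FROM " + entityClass.getName() + " e");
    return (List<E>) query.getResultList();
}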

The specific DAO implementation

And finally we create such a specific DAO implementation. It extends the basic JPA DAO class and implements the specific DAO interface:

public class JpaOrderDao extends JpaDao<Integer, Order> implements OrderDao {
    public List<Order> findOrdersSubmittedSince(Date date) {
        Query q = entityManager.createQuery(
            "SELECT e FROM " + entityClass.getName() + " e WHERE e.date >= :date_since");
        q.setParameter("date_since", date);
        return (List<Order>) q.getResultList();
    }
}

Using the DAO

How you get a reference to an instance of your OrderDao depends on whether you use EJB 3.0 or Spring. In EJB 3.0 we'd use an annotation like this:

@EJB(name="orderDao")
private OrderDao orderDao;

while in Spring we can use the XML bean files or we can use autowiring like this:

@Autowired
public OrderDao orderDao;
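
For the autowiring to work, the DAO itself also has to be defined as a Spring bean. A minimal sketch, where the bean id and package name are just illustrations:

<!-- Enables processing of @Autowired (Spring 2.5 and up) -->
<context:annotation-config/>

<bean id="orderDao" class="com.example.dao.JpaOrderDao"/>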

In any case, once we have a reference to the DAO we can use it like this:

Order o = new Order();
o.setCustomerName("Peter Johnson");
o.setDate(new Date());
orderDao.persist(o);

But we can also use the entity specific query we added to the OrderDao interface:

List<Order> orders = orderDao.findOrdersSubmittedSince(date);
for (Order each : orders) {
    System.out.println("order id = " + each.getId());
}

With this type-safe DAO pattern we get the following advantages:

  • No direct dependency on the JPA API from client code.
  • Type-safety through the use of generics. Any casts that still need to be done are handled in the DAO implementation.
  • One logical place to group all entity-specific JPA code.
  • One location to add transaction markers, debugging, profiling, etc., although as we will see later, we will also need to add transaction markers in other parts of our applications (a small sketch follows this list).
  • One class to test when testing the database access code. We will revisit this subject in a later JPA implementation pattern.
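
To illustrate the transaction-marker point from the list above, here is a minimal sketch of how Spring's declarative transaction support could be applied to the base DAO. This assumes a Spring transaction manager and <tx:annotation-driven/> are configured; it is not part of the original example:

public abstract class JpaDao<K, E> implements Dao<K, E> {

    // ... fields and constructor as shown earlier ...

    @Transactional
    public void persist(E entity) {
        // Runs inside a transaction started by Spring, or joins an existing one.
        entityManager.persist(entity);
    }
}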

I hope this convinces you that you do need DAO's with JPA. :-)

And that wraps up the first JPA implementation pattern. In the next blog of this series we will build on this example to discuss the next pattern. In the meantime I would love to hear from you how you write your DAO's!


Thanks to the original author and source

Tuesday, July 14, 2009

Clustering a Java Web Application With Amazon Elastic Load Balancing

Amazon recently introduced three great services:
  • Elastic Load Balancing: automatically distributes incoming application traffic across multiple Amazon EC2 instances.
  • Auto Scaling: automatically scales your Amazon EC2 capacity up or down according to conditions you define.
  • CloudWatch: a web service that provides monitoring for AWS cloud resources. Auto Scaling is enabled by CloudWatch.

Let's say you have a tremendous business idea like selling binoculars on the web. You wouldn't believe it (I didn't either), but one of my colleagues told me a story about his friend making a living by selling telescopes on the web. So you have developed a simple web application, it's tested on your laptop, and you want to go public. At first you don't want to invest too much in hardware and licenses, so you just create an Amazon Web Services account and start up a single small instance running Jetty or Tomcat, for a reasonable 0.10 USD/hour fee. This is why AWS sells like hot cakes: no upfront costs, pay as you go.

After a couple of weeks you realize that binoculars sell like hot cakes, and your lonely instance can't serve all the requests. You have two choices: either move to a bigger instance (a large instance costs 0.40 USD/hour), or start up a second or third small instance. Let's say you go the second way (a large instance would soon be saturated as well, so you will need more instances anyway) and you face a couple of problems:

  • You need to distribute the load between the web servers.
  • In peak hours you need three instances, while during the night one server is sufficient to handle the requests; you don't want to babysit the application, starting up and shutting down instances according to the load.

To distribute the load between several web servers, you'll need some sort of load balancer. Hardware load balancers are out of scope as they are quite expensive, and anyway, you decided to use Amazon's virtual environment. You could use round robin DNS (setting multiple IP addresses for the same DNS name), but that gets tricky when you scale up or down: you have to refresh the DNS entries (A records) and you have to choose a reasonable TTL (time to live) value, which influences how quickly your changes propagate on the net.

Most probably you would go with the software load balancing approach and end up choosing Apache with mod_proxy_balancer. Then you face another decision: if you co-locate Apache with your Java web server, you increase the load on the web server and you still have the problem of maintaining a changing number of instances, or a changing IP address (in case Apache's instance restarts), in the DNS entry. If instead you use dedicated instances for Apache, you almost double the costs: you pay 2-3 x 10 cents hourly for the web server instances, plus 2 x 10 cents for the Apaches (if you want to eliminate a single point of failure).

This is where you can introduce Elastic Load Balancing, which costs just 2.5 cents/hour. (When I talk about costs, it's just a rough estimate, as I only count the box usage and not the network traffic; let's say the network traffic is about the same for the different scenarios described above.)


Original Source

Returning Zero-Length Arrays in Java

It's best not to return null from a method that returns an array type. Always returning an array, even if the array has zero length, greatly improves the generality of algorithms. If you anticipate that your methods will return zero-length arrays frequently, you might be concerned about the performance implications of allocating many such arrays. To solve that problem, simply allocate a single array and always return the same one, for example:

private static final int[] ZERO_LENGTH_ARRAY = new int[0];

This array is effectively immutable (it has no elements to change) and can be shared freely throughout the application.
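
As a small illustration of the idiom (the class, field, and method names here are hypothetical):

import java.util.HashMap;
import java.util.Map;

public class PhoneBook {
    private static final String[] NO_NUMBERS = new String[0];

    private final Map<String, String[]> numbersByName = new HashMap<String, String[]>();

    // Returns the numbers stored for a name, or the shared zero-length
    // array instead of null when the name is unknown.
    public String[] getNumbers(String name) {
        String[] numbers = numbersByName.get(name);
        return numbers != null ? numbers : NO_NUMBERS;
    }
}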

Understanding Java's Integer Pool Can Avoid Problems

The next time you come across a situation where two Integers defined with equal values fail an equivalency (==) test, remember this tip. Suppose you have two Integers defined as follows:

Integer i1 = 128;
Integer i2 = 128;

If you then execute the test (i1 == i2), the returned result is false. The reason is that the JVM maintains a pool of Integer values (similar to the one it maintains for Strings), but the pool contains only Integers from -128 to 127. Creating any Integer in that range results in Java handing out an instance from the pool, so the equivalency test works. However, for values greater than 127 or less than -128, the pool does not come into play, so the two assignments create different objects, which then fail the equivalency test and return false.
In contrast, consider the following assignments:

Integer i1 = new Integer(1); // any number in the Integer pool range
Integer i2 = new Integer(1); // any number in the Integer pool range

Because these assignments are made using the 'new' keyword, new instances are created rather than picked up from the pool. Hence, testing (i1 == i2) on the preceding assignments also returns false.
Here's some code that illustrates the Integer pool:

public class IntegerPoolTest {
    public static void main(String[] args) {
        Integer i1 = 100;
        Integer i2 = 100;
        // Comparison of Integers from the pool - returns true.
        compareInts(i1, i2);

        Integer i3 = 130;
        Integer i4 = 130;
        // Same comparison, but outside the pool
        // (not in the range -128 to 127), resulting in false.
        compareInts(i3, i4);

        Integer i5 = new Integer(100);
        Integer i6 = new Integer(100);
        // Comparison of Integers created using the 'new' keyword
        // results in new instances, so '==' comparison yields false.
        compareInts(i5, i6);
    }

    private static void compareInts(Integer i1, Integer i2) {
        System.out.println("Comparing Integers " +
            i1 + "," + i2 + " results in: " + (i1 == i2));
    }
}