Java EE clustering
Muster for the cluster buster
Enterprise-oriented systems must often be both scalable to deal with changing performance requirements and available 24x7 (or at least very close to this level of availability).
Java systems, whether they are J2EE or Java EE 5 (collectively called Java EE in this article), are no exception. In many cases, the best way to provide for ease of scalability and high availability of the deployed system is to employ a Java EE cluster.
A Java EE cluster is a cluster of application server instances. Within such a cluster, each application server instance typically contains the same set of Java EE components (such configurations are known as homogeneous; heterogeneous configurations are also possible with some application servers). The clustered servers may then be fronted by some form of load balancing technology that allows clients to view the whole cluster as a single server. Thus, if any particular server within the cluster fails, this should have little or no impact on the clients. In addition, if more processing power is required, it can be provided by adding servers to the cluster.
As well as the question of whether the members of the cluster are heterogeneous or homogeneous, it is also necessary to consider whether they manage both the web tier and the EJB tier of an enterprise application, or just one of those tiers.
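As a rough illustration of the idea, the sketch below shows a round-robin dispatcher that hides a set of servers behind a single entry point, skips any server marked as down, and accepts new servers at run time. It is not taken from any particular product: the class names `ServerInstance` and `Dispatcher`, and the simple `alive` flag standing in for real health checking, are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ClusterDispatcherSketch {

    /** Hypothetical stand-in for one application server instance. */
    static final class ServerInstance {
        final String name;
        volatile boolean alive = true; // assumed to be maintained by a health check

        ServerInstance(String name) {
            this.name = name;
        }

        String handle(String request) {
            return name + " handled " + request;
        }
    }

    /** Round-robin dispatcher that skips instances marked as down. */
    static final class Dispatcher {
        private final List<ServerInstance> servers = new ArrayList<>();
        private int next; // index of the next server to try

        synchronized void addServer(ServerInstance server) {
            servers.add(server);
        }

        synchronized String dispatch(String request) {
            int n = servers.size();
            // Try each server at most once, starting from the round-robin position.
            for (int attempt = 0; attempt < n; attempt++) {
                ServerInstance candidate = servers.get(next % n);
                next = (next + 1) % n;
                if (candidate.alive) {
                    return candidate.handle(request);
                }
            }
            throw new IllegalStateException("no live server in the cluster");
        }
    }

    public static void main(String[] args) {
        Dispatcher cluster = new Dispatcher();
        ServerInstance node1 = new ServerInstance("node-1");
        ServerInstance node2 = new ServerInstance("node-2");
        cluster.addServer(node1);
        cluster.addServer(node2);

        System.out.println(cluster.dispatch("req-1"));
        node1.alive = false;                              // simulate a server failure
        System.out.println(cluster.dispatch("req-2"));    // still served by the cluster
        cluster.addServer(new ServerInstance("node-3"));  // add capacity
        System.out.println(cluster.dispatch("req-3"));
    }
}
```

The client only ever calls `dispatch`, so a failed server or an added server changes which instance answers, not how the client talks to the cluster.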
One of the most common approaches is often referred to as the "Co-located tier architecture". In this approach all tiers of the web application are deployed onto the same Application Server cluster - Figure 1 illustrates this.
The advantages of the co-located tier architecture are its simplicity and robustness. However, with this simplicity come some limitations. For example, in such an architecture, it is not possible to load balance and fail over the EJB tier separately from the web tier. To some extent, how much this matters may depend on the Application Server being used. For example, it may be that for web applications to take advantage of load-balanced EJBs, those EJBs must be operating within a separate cluster.
By contrast, a multi-tier architecture involves separate deployment of the web container tier and the EJB container tier. These tiers may be clustered separately or jointly. One of the benefits of this approach is the provision of load balancing at each tier in the hierarchy – Figure 2 illustrates this.
Although many architects recommend a multi-tier architecture, others consider it sub-optimal. This is for a number of reasons including:
- Additional effort is required to provide two separate load balancers that handle different technologies (i.e. HTTP and RMI/IIOP) with potentially different algorithms. This is extra work for the system that may be of limited benefit.
- All communication between the web tier and the EJB tier must now use inter-cluster communication. This is less efficient than communication within a server cluster.
- Most vendors implement their server architecture in such a way that the web tier and the EJB tier, when co-located, share a JVM; thus if the EJB tier dies, so does the associated web tier. If a dispatcher or load balancer fronts the servers, the next request will simply be routed to another co-located server. This is a simpler solution to implement and maintain.
- Most Application Servers (such as WebLogic) will optimize the EJB load balancing mechanism so that requests first choose the EJB container in the same server. In this way, load balancing need only be performed manually at the first level of requests.
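The "same server first" optimization described in the last point can be sketched as a selection policy. This is a hypothetical illustration only, not WebLogic's (or any other vendor's) actual algorithm: `EjbContainer`, `LocalFirstSelector` and the `available` flag are names invented for the example.

```java
import java.util.Arrays;
import java.util.List;

class LocalFirstRoutingSketch {

    /** Hypothetical stand-in for the EJB container hosted by one server. */
    static final class EjbContainer {
        final String serverId;
        volatile boolean available = true;

        EjbContainer(String serverId) {
            this.serverId = serverId;
        }
    }

    /** Prefers the caller's own server; otherwise falls back to round-robin. */
    static final class LocalFirstSelector {
        private final List<EjbContainer> containers;
        private int next; // round-robin position for the fallback path

        LocalFirstSelector(List<EjbContainer> containers) {
            this.containers = containers;
        }

        synchronized EjbContainer select(String callerServerId) {
            // 1. A live container in the caller's own server avoids a remote call.
            for (EjbContainer c : containers) {
                if (c.available && c.serverId.equals(callerServerId)) {
                    return c;
                }
            }
            // 2. Otherwise rotate through the containers that are still available.
            int n = containers.size();
            for (int attempt = 0; attempt < n; attempt++) {
                EjbContainer c = containers.get(next % n);
                next = (next + 1) % n;
                if (c.available) {
                    return c;
                }
            }
            throw new IllegalStateException("no EJB container available");
        }
    }

    public static void main(String[] args) {
        EjbContainer s1 = new EjbContainer("server-1");
        EjbContainer s2 = new EjbContainer("server-2");
        LocalFirstSelector selector = new LocalFirstSelector(Arrays.asList(s1, s2));

        System.out.println(selector.select("server-2").serverId); // local container
        s2.available = false;                                     // local EJB tier lost
        System.out.println(selector.select("server-2").serverId); // remote fallback
    }
}
```

With a policy of this shape, the only balancing decision that has to be made up front is which server receives the incoming web request; EJB calls then stay inside that server unless its container is unavailable.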
Java EE Oriented Clustering
As well as the use of load balancers/dispatchers, Java EE clustering facilities tend to be supported at three levels:
- Web tier, via HttpSession clustering.
- EJB component clustering.
- JMS Technology.
However, the Java EE specification does not incorporate explicit support for clustering technologies. Thus the support provided for each of the above can vary widely from vendor to vendor. In addition, an application can be fully Java EE compliant and still not be clusterable. This can result in the following issues affecting a Java EE application when it is to be used within a clustered environment:
- Applications built on a stand-alone Java EE server may not run in a clustered environment. This may be due to unintended dependencies or associations between one tier and another which remain unnoticed until the application is clustered.
- Applications may run within a cluster, but far more slowly than without clustering. This may be due, for example, to the amount of state information that must be replicated.
- Applications may run in one vendor’s clustered environment but not another. This may be due to restrictions imposed by, or support features provided by, one or another vendor.
This last issue relates to the fact that different server vendors may offer different clustering policies, different failover policies and different approaches to the level of failover supported. They may also provide vastly different default optimizations that may allow implementations to work (e.g. utilizing local interfaces if the Web tier and the EJB tier are run within the same JVM).
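To give a feel for the replication overhead mentioned in the second issue above, the following sketch measures how many bytes a set of session attributes occupies once serialized. It assumes, purely for illustration, that session state is copied between servers using standard Java serialization; real servers use their own, vendor-specific replication mechanisms, and the attribute names and helper methods here are invented.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;

class SessionReplicationCostSketch {

    /** Serializes the given state and returns its size in bytes. */
    static int serializedSize(Serializable state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Builds hypothetical session attributes holding a cart of the given size. */
    static HashMap<String, Serializable> sessionWithCartItems(int itemCount) {
        HashMap<String, Serializable> attributes = new HashMap<>();
        attributes.put("userId", "u-1001");
        ArrayList<String> cart = new ArrayList<>();
        for (int i = 0; i < itemCount; i++) {
            cart.add("item-" + i);
        }
        attributes.put("cart", cart);
        return attributes;
    }

    public static void main(String[] args) {
        // The more state a session holds, the more bytes each replication must move.
        System.out.println("small session: "
                + serializedSize(sessionWithCartItems(2)) + " bytes");
        System.out.println("large session: "
                + serializedSize(sessionWithCartItems(2000)) + " bytes");
    }
}
```

An application that behaves well on a single server can therefore slow down noticeably in a cluster simply because it keeps large object graphs in the session.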
Developing an application to execute within a cluster is not a trivial matter, and should be taken into account from the ground up when the system is being designed. There are several design patterns, used extensively within the Java world, that can cause problems within a clustered environment if implemented inappropriately. For example:
Object caches: Object caches are often used to improve the performance of Java applications. However, most cache implementations assume that the part of the application using the cache is in the same JVM as the cache itself. If the application runs in two or more JVMs, the cache may at best become less useful and at worst may cause system failures.
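The problem can be simulated inside a single JVM. In the sketch below each `Node` object stands in for one clustered JVM with its own private cache in front of a shared store (a map standing in for a database); all names are hypothetical. An update made through one node leaves the other node's cache stale until something invalidates it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PerJvmCacheSketch {

    /** Simulated cluster node: a private cache in front of a shared store. */
    static final class Node {
        private final Map<String, String> sharedStore;
        private final Map<String, String> localCache = new HashMap<>();

        Node(Map<String, String> sharedStore) {
            this.sharedStore = sharedStore;
        }

        String read(String key) {
            // Go to the shared store only on a cache miss.
            return localCache.computeIfAbsent(key, sharedStore::get);
        }

        void write(String key, String value) {
            sharedStore.put(key, value);
            localCache.put(key, value); // only this node's cache is updated
        }

        void invalidate(String key) {
            localCache.remove(key);
        }
    }

    public static void main(String[] args) {
        Map<String, String> store = new ConcurrentHashMap<>();
        store.put("price", "10");
        Node nodeA = new Node(store);
        Node nodeB = new Node(store);

        nodeB.read("price");          // node B caches the current value
        nodeA.write("price", "12");   // the update arrives via node A
        System.out.println("node B sees " + nodeB.read("price")); // stale entry
        nodeB.invalidate("price");    // e.g. triggered by a cluster-wide notification
        System.out.println("node B sees " + nodeB.read("price")); // reloaded from store
    }
}
```

How invalidation is propagated between servers (if at all) is exactly the kind of detail that differs between cache implementations and vendors.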
Static variables: Static variables are often used with a number of different design patterns, such as factory and singleton. Again, this approach may assume that all objects using the static singleton use the same singleton object in the same JVM. However, in a clustered environment, various parts of the application may run in different JVMs.
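A typical victim is a singleton that hands out "unique" identifiers. The sketch below is a simulation under stated assumptions: the hypothetical `OrderIdGenerator` is a classic singleton, and because a single JVM cannot easily show two copies of a static field, the demo creates two separate generator objects to stand in for the copies that two clustered JVMs would each hold. Qualifying the identifier with a node name is shown as one possible mitigation, not the only one.

```java
import java.util.concurrent.atomic.AtomicLong;

class PerJvmSingletonSketch {

    /** Classic singleton ID generator; its static state exists once per JVM. */
    static final class OrderIdGenerator {
        private static final OrderIdGenerator INSTANCE = new OrderIdGenerator("");

        private final String nodePrefix;
        private final AtomicLong counter = new AtomicLong();

        // Package-private so the demo can simulate the copy held by each JVM.
        OrderIdGenerator(String nodePrefix) {
            this.nodePrefix = nodePrefix;
        }

        static OrderIdGenerator getInstance() {
            return INSTANCE;
        }

        String nextId() {
            return nodePrefix + counter.incrementAndGet();
        }
    }

    public static void main(String[] args) {
        // Simulated: every clustered JVM initializes its own "singleton".
        OrderIdGenerator jvm1 = new OrderIdGenerator("");
        OrderIdGenerator jvm2 = new OrderIdGenerator("");
        System.out.println("same id issued twice: "
                + jvm1.nextId().equals(jvm2.nextId()));

        // One possible mitigation: make each id unique per node.
        OrderIdGenerator node1 = new OrderIdGenerator("n1-");
        OrderIdGenerator node2 = new OrderIdGenerator("n2-");
        System.out.println(node1.nextId() + " vs " + node2.nextId());
    }
}
```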
Event-based services: Event-based services may be services triggered due to some situation occurring, such as a timer being triggered, or something happening within the application. In many cases such event-based services rely on the fact that they are running within a single JVM for effective operation.
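For instance, a timer-driven job deployed to every server in a homogeneous cluster will fire once per server. The sketch below simulates two nodes reacting to the same timer event, first naively and then guarded by a claim in a shared store. The `Node` class, the run identifiers and the use of a `ConcurrentHashMap` as a stand-in for a shared database table are all assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class ClusteredTimerSketch {

    /** Simulated cluster node that reacts to a timer event. */
    static final class Node {
        private final String name;
        private final Map<String, String> sharedClaims; // stand-in for a shared table
        private final List<String> executionLog;

        Node(String name, Map<String, String> sharedClaims, List<String> executionLog) {
            this.name = name;
            this.sharedClaims = sharedClaims;
            this.executionLog = executionLog;
        }

        /** Naive handler: every node runs the job when its own timer fires. */
        void onTimerNaive(String runId) {
            executionLog.add(name + " ran " + runId);
        }

        /** Guarded handler: only the first node to claim the run executes it. */
        void onTimerGuarded(String runId) {
            if (sharedClaims.putIfAbsent(runId, name) == null) {
                executionLog.add(name + " ran " + runId);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> claims = new ConcurrentHashMap<>();
        List<String> log = new CopyOnWriteArrayList<>();
        Node node1 = new Node("node-1", claims, log);
        Node node2 = new Node("node-2", claims, log);

        node1.onTimerNaive("nightly-report");
        node2.onTimerNaive("nightly-report");
        System.out.println("naive executions: " + log.size());

        log.clear();
        node1.onTimerGuarded("nightly-report");
        node2.onTimerGuarded("nightly-report");
        System.out.println("guarded executions: " + log.size());
    }
}
```

The guard only works because the claim lives outside any one JVM; a flag held in a static field would reproduce the single-JVM assumption described above.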
Java EE clusters are a very effective way of providing high availability for Java systems. However, clustering should be taken into account from day one and incorporated into the design of the system. This means that issues such as whether the tiers should be co-located or not, whether servers should be homogeneous or not, and whether the design precludes clustering techniques should all be considered as part of the basic analysis of the system.