J2EE Caching

Retrieving, assembling, and presenting information

Most Web applications are based on the presentation of information: operations that retrieve, assemble, and present content and data largely outnumber operations that actually modify the information.

For example, browsing detailed product information in an online store occurs far more often than updating the product information. Caching offers critical benefits to the application that include, but are not limited to:

  • Faster response times and increased application responsiveness: Requests are serviced faster, as pre-cached information is served up without having to query the data store and apply complex business rules and calculations to the result set.
  • Increased throughput: Applications designed to take advantage of caching achieve much higher throughput on the same physical hardware, because data does not have to be retrieved and complex calculations repeated for every request.
  • High availability: If back-end components go offline temporarily, previously cached information can still be served up to the client.
  • Scalability: Caching in the presentation and business tiers of the J2EE architecture greatly helps to reduce the consumption of system resources, allowing the applications to be scaled horizontally more easily when the need to support a higher client load arises. The application itself will need to be well designed to ensure horizontal scalability.

Caching and Web Application Architecture


Figure 1 depicts a widely used robust Web architecture. There are several opportunities in this architecture where information can be cached efficiently and served up.

Following is a brief description of the various points at which caching can be accomplished:

  • WebSphere Edge Server in a reverse proxy configuration: WebSphere Edge Server is the initial point of contact for every request within an organization's DMZ. The Edge Server Web Traffic Express (WTE) component caches responses of dynamic content like JSP pages and servlets. An Edge Server adapter module is installed in WebSphere Application Server (WAS), which results in all dynamic content responses being modified by the adapter before returning to the Web server. The WebSphere Edge Server and WAS work together to invalidate cache entries so that stale information is not served up from the cache.
  • Web server caching: Most Web servers allow static files being served to be cached. Caching dynamic content can be done either by the Web server itself or by the WAS Web Server plug-in component. Both of these scenarios are typically limited in terms of cache invalidations and can be used as simple caching mechanisms with time-based invalidations.
  • Presentation and business tier caching: These two logical tiers can be physically located in the same virtual machine, different virtual machines, or even on different nodes. Caching here offers the most flexibility of control over the cache in terms of security, configuration, and maintenance of the cache at run time.
  • EIS caches: WAS caches prepared statements, and most enterprise information stores can be configured to cache the data itself.

The general rule of thumb is to cache information as close to the source of the request (the client) as possible to get the best caching result. Several other factors need to be considered carefully along with response times, including security (caching of sensitive information in the DMZ will have enterprise security architects up in arms), hardware and software limitations, caching requirements, and the level of control required over the caching engine in terms of configuration, monitoring, and so on.

Presentation and Business Tier Caching

Content being cached in these two tiers of the J2EE Web application architecture varies. For example, the presentation tier's dynamic components (servlet and JSP output) are usually prime candidates to be cached in part, like portlet output that forms part of a bigger page. Alternatively, in the business tier, objects that result from a combination of complex queries and business process calculations on result sets are typical candidates for caching.

CACHING CONSIDERATIONS
One of the key elements in designing a successful application caching mechanism is understanding the caching requirements. Some important questions that will help determine the design of the caching mechanism include:

  • What information is to be cached?
  • Will the information be cached in the presentation tier, business tier, or both tiers?
  • How do you invalidate the information in the cache to ensure stale information does not exist in the cache?
Along with these questions, it is also important to understand:
  • What is the size of an object/dynamic content being cached? What will be the approximate size of the entire cache? Will the objects be cached entirely in memory or be shared between memory and disk?
  • What algorithm will be used, if the cache is exhausted, to move content/data out of the cache?
  • Will the cache need to be clustered across nodes in a clustered environment? If so, what mechanism will be used to accomplish this? When clustered, will there be a single remote cache, or will each node maintain its own local copy of the cache that must be kept in sync?
  • How does clustering impact network traffic? Will keeping the caches consistent noticeably increase network traffic and create bottlenecks, or does the cache implement an efficient mechanism that keeps the caches in a cluster consistent with minimal network traffic?
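
The eviction question above can be made concrete with a small sketch. The following is not taken from the sample application; it is a minimal illustration, assuming a least-recently-used (LRU) policy and a hypothetical class name LruCache, built on the standard java.util.LinkedHashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded cache: evicts the least recently used entry when full. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true keeps entries ordered by how recently they were accessed
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Consulted after each insertion; returning true drops the least recently used entry
        return size() > capacity;
    }
}
```

With a capacity of two, putting p1 and p2, reading p1, and then putting p3 evicts p2, because p1 was touched more recently. LinkedHashMap is not thread-safe, so a shared cache would need external synchronization (for example, java.util.Collections.synchronizedMap).
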
SAMPLE APPLICATION
We will use an example to describe asynchronous caching solutions that most Web applications can take advantage of. The sample application builds part of the ACME online store (the sample EAR file with source and configuration instructions is available as a download). We will tackle one simple use case, as shown in Figure 2.

The architecture of the sample application (as shown in Figure 3) is made up of one WebSphere cell containing two nodes: Node A and Node B. Node A runs the WAS instance hosting the application that services the end clients of ACME, allowing clients to browse, view, and buy products from ACME using its Web site. Node B runs a JMS server component (either the Embedded Messaging Server running under its own dedicated JVM or WebSphere MQ) and a WAS instance hosting an application that lets ACME application administrators update product details.

OpenSymphony's OSCACHE is used as the cache engine. Other caching products, such as WebSphere's built-in dynamic cache, can be plugged in to achieve the same result. The OSCACHE design allows caches to be created and manipulated using JSP tags and/or an API. The tag library is used to cache entire dynamic Web pages or parts of them. The API can be used in both the presentation and the business tiers to cache objects that are built by complex processes. The example will use the API to cache product value objects in the business tier.

BUSINESS TIER CACHING
Figure 4 demonstrates how asynchronous caching using JMS and message beans can be employed in the business tier of an enterprise application. In the figure, logical boundaries (the logical components can physically be in the same virtual machine, different virtual machines, or even different servers) are indicated by rounded rectangles. A solid line connecting two shapes represents a request while a dotted line connecting two shapes represents an optional request that may or may not be required.

The most notable steps to caching in the business tier are:

  • Steps 1-3: An HTTP request comes from the browser requesting the product details handled by a controller servlet. The request is forwarded to the business tier (this is a remote request if the Web container and EJB container reside in different virtual machines) through a business delegate.
  • Step 4: In well-designed applications, a session EJB facade handles requests to the EJB container, and a value object is returned to the Web container as the requested data. This allows a framework to be built in which the facade checks the cache on every request in the EJB container. If the requested value object is already in the cache, the data is retrieved from the cache and returned immediately. In effect, the data is not retrieved from the EIS tier and no complex operations are performed on it, which reduces the load on the various servers and on the application itself.
  • Step 5: If the product details are not found in the cache, they have to be retrieved from the EIS store, and all required operations are performed on this data. The resulting product details value object is stored in the cache immediately. Future requests for these details will be served up from the cache.
  • Step 6: The product administrator updates the information using the product update component.
  • Step 7: The product update component updates the data store and publishes an asynchronous JMS message to the product update topic. It is important to ensure that the JMS message is sent on product updates; otherwise, stale data will be pulled from the cache. To make sure this does not happen, the update is processed in a two-phase commit (XA-enabled) transaction.
  • Steps 8-9: The update cache message bean is configured to be a subscriber to the product update topic and is triggered when the product update JMS message is published successfully. The message bean clears the product entry from the cache and optionally reloads the cache with the updated information. If the optional step is not done, the product detail is cached on the next request.
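
Steps 4-5 and 8-9 above can be sketched in plain Java. This is an illustration only, not the sample application's code: ProductDetails, ProductFacade, and onProductUpdated are hypothetical names, a java.util.function.Function stands in for the EIS query plus business calculations, and a direct method call stands in for the message bean's onMessage() handling.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical value object for product details. */
final class ProductDetails {
    final String productId;
    final String description;

    ProductDetails(String productId, String description) {
        this.productId = productId;
        this.description = description;
    }
}

/** Hypothetical facade: checks the cache first and goes to the EIS tier only on a miss. */
class ProductFacade {
    private final Map<String, ProductDetails> cache = new ConcurrentHashMap<>();
    // Assumption: stands in for the EIS query and the business calculations
    private final Function<String, ProductDetails> eisLoader;

    ProductFacade(Function<String, ProductDetails> eisLoader) {
        this.eisLoader = eisLoader;
    }

    /** Steps 4-5: return the cached value object, loading and caching it on a miss. */
    ProductDetails getProductDetails(String productId) {
        return cache.computeIfAbsent(productId, eisLoader);
    }

    /** Steps 8-9: what the message bean would do when a "product updated" message arrives. */
    void onProductUpdated(String productId, boolean reload) {
        cache.remove(productId);           // clear the stale entry
        if (reload) {
            getProductDetails(productId);  // optional eager reload
        }
    }
}
```

If the reload flag is false, the entry is simply rebuilt on the next request, which matches the optional behavior described in Steps 8-9.
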
The caching API is similar to that of any data structure supporting key-value pairs (hashtable, properties, maps, etc.). The OSCACHE API throws an exception if a particular entry is not found, forcing clients to handle exceptions on every retrieval from the cache, which can make client code cumbersome. Utility wrapper classes that provide a client-friendly cache API alleviate this issue.
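
A wrapper of the kind described above might look like the following sketch. It does not use the OSCACHE classes: EntryNotFoundException, ThrowingCache, and FriendlyCache are hypothetical stand-ins that only mimic the "miss is reported as a checked exception" behavior, so that the wrapping idea can be shown with the standard library alone.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Stand-in for a checked "entry not found" exception; not an OSCACHE class. */
class EntryNotFoundException extends Exception {
    EntryNotFoundException(String key) {
        super("No cache entry for " + key);
    }
}

/** Stand-in for a cache API that signals a miss by throwing. */
class ThrowingCache {
    private final Map<String, Object> entries = new ConcurrentHashMap<>();

    Object getFromCache(String key) throws EntryNotFoundException {
        Object value = entries.get(key);
        if (value == null) {
            throw new EntryNotFoundException(key);
        }
        return value;
    }

    void putInCache(String key, Object value) {
        entries.put(key, value);
    }
}

/** Hypothetical client-friendly wrapper: a miss becomes an empty Optional. */
class FriendlyCache {
    private final ThrowingCache delegate;

    FriendlyCache(ThrowingCache delegate) {
        this.delegate = delegate;
    }

    Optional<Object> find(String key) {
        try {
            return Optional.of(delegate.getFromCache(key));
        } catch (EntryNotFoundException miss) {
            return Optional.empty(); // exception handling stays inside the wrapper
        }
    }

    void store(String key, Object value) {
        delegate.putInCache(key, value);
    }
}
```

Callers can then write find(productId).orElseGet(...) instead of a try/catch block around every lookup. A wrapper around the real cache engine would also have to honor whatever contract that engine attaches to a miss.
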

Web Tier Caching

CACHING DYNAMIC CONTENT
The OSCACHE caching tag libraries provide custom tags that allow applications to cache the contents of entire (or selective parts of) JSP pages (and servlets). In this example, we will use JSP pages, but the API can be used to simulate the same behavior for servlets. The OSCACHE custom tag "cache" is specified as:

<cache:cache key='<%= request.getParameter("productId") %>'
    refreshpolicyclass="com.stratus.sample.web.cache.ProductCacheRefreshPolicy"
    refreshpolicyparam='<%= request.getParameter("productId") %>'>
  <!-- RETRIEVE PRODUCT DETAILS AND DISPLAY DETAILS HERE -->
</cache:cache>

The product ID is used as a unique key to store the rendered content for each product. The refreshpolicyclass attribute specifies a class that implements the needsRefresh() method of the com.opensymphony.oscache.web.WebEntryRefreshPolicy interface. This method will be called on every request by the OSCACHE framework to verify whether or not the JSP code between the tags needs to be processed. If the needsRefresh() method returns true, the product details will be retrieved from the business tier. If the needsRefresh() method returns false, the rendered content will be delivered out of the cache. The value of refreshpolicyparam is made available to the ProductCacheRefreshPolicy class to help the needsRefresh() method decide whether to refresh the cache.

INVALIDATING DYNAMIC CONTENT
We looked at how the custom cache tag can be used to cache dynamic content; we will now look at how the needsRefresh() method decides whether or not the cached content for a product needs to be invalidated when a request is serviced.

In the Web tier, the message bean could in principle be replaced by registering a javax.jms.MessageListener on a TopicSession. Although the JMS API allows message listeners to be registered for particular topics using TopicSession.setMessageListener(), this cannot be used here, as most application servers do not support the registration of message listeners in the Web tier due to connection pooling of JMS connections. In the upcoming J2EE v1.5 specification, this rule is mandated. Therefore, we will create a JMS topic subscriber thread that is started when the Web application starts. The thread will block on the TopicSubscriber.receive() method until "product updated" messages are published by the administrative application. The thread maintains and asynchronously updates a static list of updated product IDs based on the incoming messages, and the needsRefresh() method checks this list.
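
The subscriber thread described above can be sketched as follows. To keep the example self-contained, a java.util.concurrent.BlockingQueue stands in for the JMS TopicSubscriber, and its take() call plays the role of the blocking TopicSubscriber.receive(). The class names UpdatedProductRegistry and ProductUpdateSubscriber are hypothetical and do not come from the sample EAR.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical static registry of product IDs updated since they were last rendered. */
class UpdatedProductRegistry {
    private static final Set<String> UPDATED = ConcurrentHashMap.newKeySet();

    static void markUpdated(String productId) {
        UPDATED.add(productId);
    }

    /** Returns true at most once per update: the ID is removed as part of the check. */
    static boolean consumeUpdate(String productId) {
        return UPDATED.remove(productId);
    }
}

/** Hypothetical subscriber loop; the queue is an assumption standing in for a JMS topic. */
class ProductUpdateSubscriber implements Runnable {
    private final BlockingQueue<String> topic;

    ProductUpdateSubscriber(BlockingQueue<String> topic) {
        this.topic = topic;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                String productId = topic.take(); // blocks until a message arrives
                UpdatedProductRegistry.markUpdated(productId);
            }
        } catch (InterruptedException e) {
            // Restore the flag and exit, e.g. when the Web application shuts down
            Thread.currentThread().interrupt();
        }
    }
}
```

In a Web application, such a thread would typically be started and interrupted from an application lifecycle hook; a needsRefresh() implementation would then call something like consumeUpdate() with the product ID it received through refreshpolicyparam.
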

Figure 5 demonstrates how asynchronous caching can be employed effectively in the Web tier of an enterprise application. The business tier is left untouched and is the same as before. The most notable steps with respect to caching in the Web tier are:

  • Step 1: An HTTP request comes from the browser requesting the product details handled by a controller servlet and forwarded to the JSP page handling the product details.
  • Steps 2-3: The cache tag executes the refresh policy and invokes the needsRefresh() method, which checks to see if the requested product ID is in the list of updated products. This list is updated by the JMS topic subscriber thread. Once the check is done, the product ID is removed from the updated list, because the freshly rendered content is cached after the check.
  • Steps 3a-3b: These are optional steps that are performed if the rendered content for the requested product is not available in the cache or the product has been updated.
  • Step 4: If the rendered output for the requested product is available in the cache and the needsRefresh() method of the WebEntryRefreshPolicy returns false, the content is retrieved from the cache and returned to the browser.
  • Steps 5-8: This is the same as explained previously in the business tier section.
  • Steps 9-10: When the product administrator updates the product, the application publishes a product updated message to the topic. Both the message bean in the business tier and the subscriber thread in the Web tier are active subscribers to the product update topic and receive the product updated messages and update their respective caches.
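
The request flow in Steps 2-4 can be approximated in plain Java. This sketch does not use the OSCACHE tag library: ProductFragmentCache is a hypothetical class in which a map of rendered strings stands in for the tag's content cache and a java.util.function.Supplier stands in for the JSP body between the cache tags.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for the cache tag plus its refresh policy. */
class ProductFragmentCache {
    private final Map<String, String> rendered = new ConcurrentHashMap<>();
    private final Set<String> updatedIds = ConcurrentHashMap.newKeySet();

    /** Steps 9-10: invoked by the subscriber thread when a product is updated. */
    void productUpdated(String productId) {
        updatedIds.add(productId);
    }

    /** Steps 2-3: true if the product changed since the last render; the ID is removed by the check. */
    boolean needsRefresh(String productId) {
        return updatedIds.remove(productId);
    }

    /** Steps 3a-4: re-render on a miss or an update, otherwise serve the cached content. */
    String render(String productId, Supplier<String> body) {
        boolean stale = needsRefresh(productId);
        String content = rendered.get(productId);
        if (stale || content == null) {
            content = body.get();            // assumption: stands in for evaluating the JSP body
            rendered.put(productId, content);
        }
        return content;
    }
}
```

The body is evaluated only on the first request for a product and on the first request after productUpdated() has been called for it; all other requests are served from the map.
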

Common Challenges

Having a well-defined application framework is critical, as most or all requests coming into the Web and EJB container should be handled by controllers and facades respectively. This enables the framework to provide automatic caching functionality to business applications that use these frameworks.

Caching large objects requires a significant amount of memory. The amount of memory and disk allocated for caching, the number of object instances that can be cached, and the algorithm used to remove entries once maximum limits are reached all have a tremendous effect on the performance of the cache and of the overall application, and have to be thought out and defined for each application.

The OSCACHE cache engine provides a robust mechanism for updating clustered caches. This is separate from the clustering mechanism of WebSphere or any other application server. OSCACHE uses either JMS or JavaGroups to cluster cache components. In a clustered environment, each node maintains its own local copy of the cache, and when you flush an entry from the cache, a request to clear it is sent to all nodes in the network. This results in lower network traffic than a single shared cache that every node queries on every request.
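
The clustering model described above (local copies, with only flush requests sent across nodes) can be illustrated with a small in-memory sketch. CacheNode is a hypothetical class, and direct method calls between nodes stand in for the JMS or JavaGroups messages a real cluster would use; this is not how OSCACHE is implemented internally.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical cluster member holding its own local copy of the cache. */
class CacheNode {
    private final Map<String, Object> local = new ConcurrentHashMap<>();
    private final List<CacheNode> peers = new CopyOnWriteArrayList<>();

    /** Links two nodes so that each receives the other's flush notifications. */
    void join(CacheNode peer) {
        peers.add(peer);
        peer.peers.add(this);
    }

    /** Reads and writes are purely local: no network traffic per request. */
    Object get(String key) {
        return local.get(key);
    }

    void put(String key, Object value) {
        local.put(key, value);
    }

    /** Flushing removes the entry locally and notifies every peer. */
    void flush(String key) {
        local.remove(key);
        for (CacheNode peer : peers) {
            peer.onFlush(key); // assumption: a JMS or JavaGroups message in a real cluster
        }
    }

    /** Handler for a flush notification from another node. */
    void onFlush(String key) {
        local.remove(key);
    }
}
```

Only flush() crosses node boundaries, so the cluster traffic grows with the number of updates rather than with the number of reads.
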

OSCACHE throws an exception if an entry is not present in the cache. This might be a hassle for clients to deal with; it can be addressed by extending the appropriate OSCACHE cache administrator and cache classes. Depending on the cache settings, OSCACHE will also block threads while a cache entry is being updated. Extra care should be taken to make sure this does not cause any problems.

Two-phase commits should be used to ensure that appropriate update messages are published when the EIS tier is updated. Most EJB containers will handle two-phase commits, although it should be noted that two-phase commits have inherent overhead associated with them.

Choosing a Caching Solution

Here are some questions to consider before choosing a caching product or solution:

  • Does the caching solution impose requirements on the architecture? Caching solutions should not define any sort of special requirement or change the way a request is handled in the J2EE architecture. The solution should be introducible into an existing architecture without modifications to the Web architecture itself.
  • Is it easy to configure and use? Does it provide a graphical interface to configure the cache?
  • How many mechanisms are available to invalidate the cache? A caching solution should at the least support time-, group-, and rule-based invalidation, as well as provide an extensible programming mechanism for invalidating the cache.
  • Does the caching solution provide clustering support? Are there multiple mechanisms for clustering caches, and are they effective and efficient? A caching solution should at the least provide one form of clustering mechanism, and it should ideally result in minimal network traffic.
  • Does the caching solution require additional software to provide the capabilities specified above? A caching solution should not impose any other software requirements.
  • Does the caching solution have a monitoring and reporting mechanism? A caching solution should ideally have monitoring and reporting mechanisms that allow the cache configurations to be further refined and tuned.

Caching engines include:

  • OpenSymphony OSCache
  • WebSphere Dynamic Cache (built into WebSphere Application Server)
  • WebSphere Edge Server
  • JCache (JSR-107)

Conclusion

One of the most important responsibilities for application architects and designers is to provide robust, scalable, and highly available Web applications while imposing minimal hardware requirements. An available and powerful option that goes a long way in accomplishing these goals is the caching of data and/or content at various tiers of the J2EE architecture. To manage caches in real time, asynchronous cache invalidation maintains the health of the cache. It accurately eliminates stale data without redundantly invalidating caches based on fixed-time intervals or periodically checking the state of the cache and invalidating stale entries. Providing application libraries that take advantage of asynchronous caching mechanisms in the initial phases of the design and development process lays the foundation for well-designed, robust applications.

More Stories By Hari Kanangi

Hari Kanangi works as a consultant for Stratus Solutions Inc., helping clients design, develop, and deploy J2EE-based solutions. He is a Sun-certified Java Programmer, J2EE architect, and a WebSphere-certified specialist. Hari has been concentrating on WebSphere, J2EE, and Java-based technologies for the past 5 years.
