How to Diagnose Java Resource Starvation

Using the IBM Thread & Monitor Dump Analyzer for Java

Where do thread dumps go? The IBM JVM checks each of the following locations for existence and write permission, then stores the thread dump in the first one that's available:

  1. The location specified by the IBM_JAVACOREDIR environment variable, if set (_CEE_DMPTARG on z/OS).
  2. The current working directory of the JVM process.
  3. The location specified by the TMPDIR environment variable, if set.
  4. The /tmp directory (C:\Temp on Windows).
  5. If the Javadump can't be stored in any of the above, it is written to STDERR.

Note that enough free disk space (possibly up to 2.5MB) must be available for the thread dump file to be written correctly.
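If you'd like to predict where a Javacore will land before you send the signal, the lookup order listed above can be sketched in a few lines of Java. This is only an illustration of the list, not the JVM's actual code: the class and method names are made up for this example, the z/OS variable is left out, and the free disk space check is not modeled.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper that mirrors steps 1-5 of the list; not part of any IBM tool.
class JavacoreLocationSketch {

    /** Candidate directories in the order the article lists them (steps 1 to 4). */
    static List<Path> candidates(Map<String, String> env, Path workingDir, boolean windows) {
        List<Path> dirs = new ArrayList<>();
        String javacoreDir = env.get("IBM_JAVACOREDIR");
        if (javacoreDir != null && !javacoreDir.isEmpty()) {
            dirs.add(Paths.get(javacoreDir));          // step 1, only if set
        }
        dirs.add(workingDir);                          // step 2
        String tmpDir = env.get("TMPDIR");
        if (tmpDir != null && !tmpDir.isEmpty()) {
            dirs.add(Paths.get(tmpDir));               // step 3, only if set
        }
        dirs.add(Paths.get(windows ? "C:\\Temp" : "/tmp")); // step 4
        return dirs;
    }

    /** First candidate that exists and is writable; empty means step 5 (STDERR). */
    static Optional<Path> firstUsable(List<Path> dirs) {
        return dirs.stream()
                .filter(d -> Files.isDirectory(d) && Files.isWritable(d))
                .findFirst();
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name").toLowerCase().contains("win");
        List<Path> dirs = candidates(System.getenv(), Paths.get("").toAbsolutePath(), windows);
        System.out.println(firstUsable(dirs).map(Path::toString).orElse("STDERR"));
    }
}
```

Run it from the same directory and with the same environment as the application's JVM, otherwise steps 1 to 3 will resolve differently.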

Now let's analyze thread dumps with IBM Thread and Monitor Dump Analyzer for Java. You can download a copy of the latest IBM Thread and Monitor Dump Analyzer for Java from http://www.alphaworks.ibm.com/tech/jca. Uncompress the package to a directory and you'll find a jar file. Here we have jca31.jar, which we'll run with a Java virtual machine.

Usage

<Java runtime environment binary path>java -Xmx[heap size] -jar jca<IBM Thread and Monitor Dump Analyzer version>.jar

To run the latest version of IBM Thread and Monitor Dump Analyzer for Java we need a Java runtime environment of version 5.0 or higher. IBM Thread and Monitor Dump Analyzer for Java can analyze thread dumps taken from Java versions 1.3.1, 1.4.1, 1.4.2, 5.0, or 6.0. If you see java.lang.OutOfMemoryError while you're processing thread dumps, try increasing the maximum heap size (-Xmx) value to give the JVM more memory. The maximum heap size shouldn't be larger than the available physical memory, to avoid performance degradation.

Starting IBM® Thread and Monitor Dump Analyzer for Java version 3.1
java -Xmx500m -jar jca31.jar
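If you're not sure how much heap a given -Xmx setting actually gives a JVM, a tiny check like the following can help. It's a sketch for illustration only (the class name is made up), and the value reported by Runtime.maxMemory() may differ slightly from the -Xmx figure depending on the JVM.

```java
// Hypothetical one-off check; run it with the same -Xmx value you plan to use.
class HeapLimitCheck {

    /** Converts a byte count to whole megabytes. */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() is the most heap this JVM will attempt to use.
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("Maximum heap available to this JVM: " + toMegabytes(max) + " MB");
    }
}
```

For example, java -Xmx500m HeapLimitCheck should report a figure at or near 500 MB.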

Figure 2 shows the first screen from IBM Thread and Monitor Dump Analyzer for Java.

We can click on the open icon or File->Open to open the thread dumps. Select the thread dumps and click on the Open button to analyze them. You don't need to worry about which vendor's Java virtual machine generated the thread dumps. IBM Thread and Monitor Dump Analyzer for Java automatically detects most Java thread dump formats, such as IBM Javacores, Solaris thread dumps, and HP-UX thread dumps (see Figure 3). Please don't look for Javacores on Solaris and HP-UX systems.

Overall analysis can be done by clicking on a row in the Thread Dump List. We can get basic information such as File name, Cause of thread dump, timestamp of thread dump, Process ID, Java version, Free Java heap size, Allocated Java heap size, Memory Segment Analysis, Number of loaded classes in Java heap and Number of classloaders in the Java heap (see Figure 4).

In Figure 5, we have the Command line, Thread Status Analysis, Thread Method Analysis, and Thread Aggregation Analysis.

We see two blocked threads (pink background color) in the Thread Status Analysis.

A thread can be in any of the following states:

  • Runnable: the thread can run when given the chance.
  • Condition Wait: the thread is waiting. For example, because:
    - A sleep() call is made
    - The thread has been blocked for I/O
    - A wait() method is called to wait on a monitor being notified
    - The thread is synchronizing with another thread with a join() call
  • Monitor Wait: the thread is waiting on a monitor lock. This state is no longer used on IBM JVM 5.0 or 6.0.
  • Suspended: the thread has been suspended by another thread.
  • Zombie: the thread has been killed.
  • Blocked: the thread is waiting to obtain a lock that something else currently owns. This replaces the old Monitor Wait.
  • Parked: the thread has been parked by the new concurrency API (java.util.concurrent).
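To see the Blocked state for yourself without taking a thread dump, here's a minimal sketch using java.lang.Thread.State, which is available from Java 5.0. The class name is made up, and treating the analyzer's "Blocked" as the same thing as Thread.State.BLOCKED is my assumption, not something the tool documents here.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: one thread holds a monitor, a second one tries to enter it.
class BlockedStateDemo {

    /** Returns the state observed for the thread stuck at the synchronized block. */
    static Thread.State observeContender() {
        final Object chopstick = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(1);

        Thread plato = new Thread(() -> {
            synchronized (chopstick) {
                held.countDown();              // tell the caller the monitor is taken
                try {
                    done.await();              // keep the monitor until told to stop
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "Plato");
        Thread socrates = new Thread(() -> {
            synchronized (chopstick) {
                // Entered only after Plato leaves his synchronized block.
            }
        }, "Socrates");

        Thread.State observed = Thread.State.NEW;
        try {
            plato.start();
            held.await();
            socrates.start();
            long deadline = System.currentTimeMillis() + 5000;
            do {                               // poll until Socrates reaches the monitor
                Thread.sleep(10);
                observed = socrates.getState();
            } while (observed != Thread.State.BLOCKED && System.currentTimeMillis() < deadline);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            done.countDown();                  // let Plato release the monitor
        }
        try {
            plato.join();
            socrates.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("Socrates is " + observeContender());
    }
}
```

While Plato waits on the latch he is parked by java.util.concurrent, which corresponds to the last entry in the list.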

Let's select all thread dumps and run Compare Threads by clicking on the right mouse button, icon, or Analysis->Compare Threads menu (see Figure 6).

We have four hang suspects. We can disregard the Signal Dispatcher thread since we got the thread dumps with signal 3 (see Figure 7).

Be aware that I use the word "suspect" instead of "perpetrator" because hang suspects are threads that have the same stack traces in multiple thread dumps. It's not unusual to see Signal Dispatcher threads have the same stack trace in multiple thread dumps.
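The idea of a hang suspect can be sketched in a few lines. The code below is only my reading of the definition above (same stack trace in every dump); the analyzer's real criteria may be more refined, and the class name and the map-of-stack-frames representation are assumptions made for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: each dump maps a thread name to its stack frames, top frame first.
class HangSuspectSketch {

    /** Names of threads whose stack trace is identical in every dump. */
    static Set<String> hangSuspects(List<Map<String, List<String>>> dumps) {
        Set<String> suspects = new TreeSet<>();
        if (dumps.size() < 2) {
            return suspects;                   // one dump can't show a lack of progress
        }
        Map<String, List<String>> first = dumps.get(0);
        for (Map.Entry<String, List<String>> entry : first.entrySet()) {
            boolean unchanged = true;
            for (Map<String, List<String>> later : dumps.subList(1, dumps.size())) {
                // A missing thread yields null, which never equals the first stack.
                if (!entry.getValue().equals(later.get(entry.getKey()))) {
                    unchanged = false;
                    break;
                }
            }
            if (unchanged) {
                suspects.add(entry.getKey());
            }
        }
        return suspects;
    }
}
```

A thread that legitimately idles in the same place, like Signal Dispatcher, is flagged too, which is why these are suspects and not perpetrators.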

Let's click on one of Plato's cells that has a method name. In the middle pane, we can see there are two threads waiting. We can click on each of them to see detailed information. In the right pane, we can see detailed information about this Plato thread, such as thread name, state, monitor, Java stack trace, and native stack trace.

In the left pane, we have a table and cells that include method name, thread state, and hang indicator. The red background color indicates that the thread is a hang suspect.

This analysis screen shows that Plato, Socrates, and Aristotle are hang suspects since they have red background colors. Aristotle and Socrates are blocked by Plato because we see Aristotle and Socrates in Plato's Waiting Thread List.

Do you remember the run() method in the Philosopher class, which calls the sleep method for two seconds when two chopsticks are available? So Plato is eating his dinner now. He's not sleeping (see Figure 8).
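As a refresher, here's a rough sketch of what such a Philosopher might look like. It is not the original listing: I've reconstructed it from the description and from the stack traces (run() calls eat(), which calls Thread.sleep() while two Chopstick monitors are held). The constructor parameters, the meal counter, and the choice to have all three philosophers share the same pair of chopsticks are assumptions made to keep the example short and free of deadlock.

```java
// Hypothetical reconstruction; only the names Chopstick, Philosopher, run() and eat()
// come from the stack traces. Everything else is an assumption.
class Chopstick {
}

class Philosopher implements Runnable {
    private final String name;
    private final Chopstick left;
    private final Chopstick right;
    private final long eatMillis;
    private final int meals;
    private int mealsEaten;

    Philosopher(String name, Chopstick left, Chopstick right, long eatMillis, int meals) {
        this.name = name;
        this.left = left;
        this.right = right;
        this.eatMillis = eatMillis;
        this.meals = meals;
    }

    @Override
    public void run() {
        try {
            for (int i = 0; i < meals; i++) {
                synchronized (left) {          // built-in monitor of the left chopstick
                    synchronized (right) {     // built-in monitor of the right chopstick
                        eat();
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void eat() throws InterruptedException {
        System.out.println(name + " is eating");
        Thread.sleep(eatMillis);               // both monitors stay locked while "eating"
        mealsEaten++;
    }

    int mealsEaten() {
        return mealsEaten;
    }
}

class DiningPhilosopherSketch {
    public static void main(String[] args) throws InterruptedException {
        Chopstick left = new Chopstick();
        Chopstick right = new Chopstick();
        // The article uses two seconds per meal; 200 ms keeps this demo quick.
        Thread[] diners = {
            new Thread(new Philosopher("Plato", left, right, 200, 1), "Plato"),
            new Thread(new Philosopher("Socrates", left, right, 200, 1), "Socrates"),
            new Thread(new Philosopher("Aristotle", left, right, 200, 1), "Aristotle")
        };
        for (Thread t : diners) {
            t.start();
        }
        for (Thread t : diners) {
            t.join();
        }
    }
}
```

Whoever enters the outer synchronized block first holds both chopsticks for the whole meal, so a thread dump taken at that moment shows the other two threads blocked on him.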

If we click on Socrates, which is one of the blocked threads, we can see who is blocking Socrates (see Figure 9). We already know that it's Plato. Let's click on Socrates' method cell. At the bottom of the middle pane, it shows that Plato is blocking this thread.

We can take a look at the problem from another perspective: monitor comparison analysis (see Figure 10). Let's start monitor comparison analysis by selecting the monitor comparison analysis icon or the Analysis->Compare Monitors menu. The row for the Plato thread shows that Plato owns monitors. Each cell has a red background color, so Plato is a hang suspect. A thread that is a hang suspect and owns monitors could be a problem. If there are any threads waiting for those monitors, it is a problem. If we click on any of the red cells, we can see there are two waiting threads in the middle pane. Of course you can click on each of the waiting threads to check its detailed information.

Let's close the Compare Monitors window and Compare Threads windows, select a thread dump and select the Analysis->Thread Detail menu to take a look at more details about the thread dump (see Figure 11).

This is the single thread dump analysis (see Figure 12). On the right pane, we can see the overall analysis of this thread dump. Aristotle and Socrates are blocked. Plato has an icon that indicates it owns the monitors.

Let's click on Plato's state cell to display the waiting threads (see Figure 13). We can also confirm that Plato blocks the Aristotle and Socrates threads.

There's one last window I'd like to share with readers. We can close the Thread Detail window, select a thread dump from Thread Dump List, and select Analysis->Monitor Detail menu to visualize monitors and threads in the thread dump. In the Monitor detail view of the thread dump, we can see the picture more clearly if it's not clear enough from the previous windows.

We have Plato in the tree view with two children, Socrates and Aristotle, which means Plato owns monitors that Socrates and Aristotle are trying to acquire. The numbers [2/2] in front of Plato are [TotalSize/Size]: the first 2 is the number of threads waiting directly or indirectly, and the second 2 is the number of threads waiting directly. If Socrates owned a monitor and another thread were waiting for it, that thread would be waiting indirectly on Plato's thread. In that case we would see [3/2] in front of Plato.
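The two numbers are easy to compute once you know who waits on whom. Here's a small sketch under my own assumptions: the owner-to-waiters map, the class name, and the thread name Zeno are invented for illustration, and the code assumes there is no cycle (a deadlock would make the recursion loop forever).

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the [TotalSize/Size] label; not the analyzer's own code.
class MonitorTreeSketch {

    /** Size: threads waiting directly on monitors owned by the given thread. */
    static int size(Map<String, List<String>> waiters, String owner) {
        return waiters.getOrDefault(owner, Collections.<String>emptyList()).size();
    }

    /** TotalSize: threads waiting directly or indirectly (assumes no cycles). */
    static int totalSize(Map<String, List<String>> waiters, String owner) {
        int total = 0;
        for (String waiter : waiters.getOrDefault(owner, Collections.<String>emptyList())) {
            total += 1 + totalSize(waiters, waiter);
        }
        return total;
    }

    /** Renders the label in the form used by the tree view, e.g. "[2/2] Plato". */
    static String label(Map<String, List<String>> waiters, String owner) {
        return "[" + totalSize(waiters, owner) + "/" + size(waiters, owner) + "] " + owner;
    }
}
```

With Plato owning monitors that Socrates and Aristotle wait for, label() gives "[2/2] Plato". Add a thread Zeno waiting on a monitor owned by Socrates and it becomes "[3/2] Plato".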

TotalSize is just like totalsize in the HeapAnalyzer (see Figure 14). It's the total number of threads blocked directly or indirectly by a thread. Size is just like size in the HeapAnalyzer. That's the number of threads blocked directly by a thread. We don't have nested monitor ownership in this thread dump. So total size is the same as size in this test case.

So far we've analyzed thread dumps generated by an IBM Java virtual machine to investigate the resource starvation problem. Next we'll analyze a thread dump from a different vendor. As shown in Figure 15, let's open a sun.log file from Listing 8 using the same method as in Figure 3.

You may have noticed that there are three entries with the suffixes "_1," "_2," and "_3" in the Thread Dump List even though we opened only one file - sun.log. That indicates that there are three thread dumps in sun.log. This vendor's thread dump doesn't have as much information as the IBM thread dumps shown in Figures 4 and 5. This vendor's JVM writes thread dumps to standard out, whereas IBM's Java virtual machine creates separate files, or Javacores. The location of thread dumps can differ quite a bit from one vendor's JVM to another. You might want to check the documentation of your Java virtual machine to find out where the thread dumps are and how to generate them.

Let's select three entries and invoke the thread comparison analysis as we did in Figure 6.

We can see Aristotle and Socrates are waiting on the monitor whereas they're blocked in Figure 7. Figure 16 shows a thread comparison analysis from IBM Java thread dumps/javacores. I personally prefer to see "waiting on monitor" rather than "blocked" in resource starvation problems caused by monitors because it seems more explicit than just "blocked." This is another reminder that each vendor has its own format or definition in its thread dumps.

Let's click on Plato's first method cell to see what else we can find out. From this vendor's thread dump, we can see where monitor locks are (see Figure 17).

We can see this thread locked two different monitors, [0x244dba80] (a com.ibm.jinwoo.starvation.Chopstick) and [0x244dba60] (a com.ibm.jinwoo.starvation.Chopstick), before executing line 62 of Philosopher.java. This is very useful information considering we didn't run a debugger. All we did was send a signal to this vendor's JVM process.

Java stack trace from another vendor's Java virtual machine
at java.lang.Thread.sleep(Native Method)
at com.ibm.jinwoo.starvation.Philosopher.eat(Philosopher.java:72)
at com.ibm.jinwoo.starvation.Philosopher.run(Philosopher.java:62)
- locked [0x244dba60] (a com.ibm.jinwoo.starvation.Chopstick)
- locked [0x244dba80] (a com.ibm.jinwoo.starvation.Chopstick)
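If you're curious, similar "locked" information can also be read from inside a running JVM through the standard java.lang.management API (Java 6.0 or later). This is an alternative route that the article itself doesn't use, and the sketch below makes some assumptions: the class name is made up, and the number in brackets is the object's identity hash code, not the address a thread dump prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that formats held monitors in a thread-dump-like style.
class LockedMonitorReport {

    /** One line per built-in monitor currently held by the named thread. */
    static List<String> lockedMonitors(String threadName) {
        List<String> lines = new ArrayList<>();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isObjectMonitorUsageSupported()) {
            return lines;                      // this JVM can't report monitor usage
        }
        // true: include locked monitors; false: skip java.util.concurrent synchronizers
        for (ThreadInfo info : threads.dumpAllThreads(true, false)) {
            if (!info.getThreadName().equals(threadName)) {
                continue;
            }
            for (MonitorInfo monitor : info.getLockedMonitors()) {
                lines.add("- locked [0x" + Integer.toHexString(monitor.getIdentityHashCode())
                        + "] (a " + monitor.getClassName() + ")");
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Object chopstick = new Object();
        synchronized (chopstick) {
            for (String line : lockedMonitors(Thread.currentThread().getName())) {
                System.out.println(line);
            }
        }
    }
}
```

Like the thread dump, this only knows about Java's built-in monitors and reports nothing for locks you implement yourself.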

We didn't see this kind of information in IBM's thread dump in Figure 13, which also didn't show where the monitor locks were taken. But we can see one more stack frame, java/lang/Thread.sleep(Thread.java:850), in IBM's thread dump, whereas the other vendor's thread dump has only one frame for the java/lang/Thread.sleep method.

Java stack trace from an IBM Java virtual machine
at java/lang/Thread.sleep(Native Method)
at java/lang/Thread.sleep(Thread.java:850)
at com/ibm/jinwoo/starvation/Philosopher.eat(Philosopher.java:72)
at com/ibm/jinwoo/starvation/Philosopher.run(Philosopher.java:62)

Let's click on a method cell of Socrates (see Figure 18). We can see where the thread is waiting to lock the monitor. We don't see this information in IBM's thread dump in Figure 9.

Let's invoke the thread detail analysis of a thread as we did in Figure 11. The analysis shown in Figure 19 is a little different than in Figure 12, the thread detail analysis on IBM's thread dump. We can see two threads are waiting for monitors. In Figure 12, we see two blocked threads.

You can check out the monitor detail analysis and other analysis windows to see any differences between the two JVM providers. Overall there's enough information in both types of thread dumps to diagnose resource starvation problems, especially when they're due to monitor contention.

What if you don't use Java's built-in monitor and implement your own monitor? Let's experiment with this scenario. Don't worry. I'll write test classes for you. Luckily we can reuse DiningPhilosopher.java.

In this revised version of the Chopstick class, we need to add an instance variable and two instance methods to implement our own monitor. An instance of Philosopher, owner, is added to indicate ownership of a chopstick. The methods pickUp() and putDown() are added for exclusive access to a chopstick.

If someone else already has the chopstick, a philosopher will wait for one second and check again. If nobody has picked up the chopstick, he will pick it up. If a philosopher owns a chopstick, putting it down sets the owner of the chopstick to null and notifies a waiting philosopher. See Listing 5.
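In case the listing isn't at hand, the behavior just described can be sketched roughly like this. It is an illustration written from the description, not the original listing: the field and method names follow the text (owner, pickUp(), putDown()), but the method signatures, the boolean return value, and the minimal Philosopher class are my assumptions.

```java
// Hypothetical stand-in so the sketch compiles on its own.
class Philosopher {
    final String name;

    Philosopher(String name) {
        this.name = name;
    }
}

// Hand-rolled monitor: ownership lives in a field, not in the JVM's monitor.
class Chopstick {
    private Philosopher owner;                 // null while the chopstick is on the table

    /** Waits in one-second steps until the chopstick is free, then takes it. */
    synchronized boolean pickUp(Philosopher philosopher) {
        try {
            while (owner != null) {
                wait(1000);                    // releases the built-in monitor while waiting
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;                      // gave up without taking the chopstick
        }
        owner = philosopher;
        return true;
    }

    /** Frees the chopstick if the caller owns it and wakes one waiter. */
    synchronized void putDown(Philosopher philosopher) {
        if (owner == philosopher) {
            owner = null;
            notify();
        }
    }

    synchronized Philosopher owner() {
        return owner;
    }
}
```

Notice that the built-in monitor is held only for the instant it takes to read or change owner. A philosopher who is "holding" a chopstick for two seconds holds no Java monitor at all during that time.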

We need to change the Philosopher class a little bit to invoke the methods pickUp() and putDown() instead of using 'synchronized ()'. You might have noticed that it's much easier to use Java's built-in monitor than to use our own monitor, but there are times when you really want to implement your own monitor for granular control.

A philosopher will pick up the left chopstick first, then the right chopstick. Once he has both chopsticks, he'll eat for two seconds and put them down for the other philosophers. (Not really appetizing, is it?) See Listing 6.

We can follow the same procedures we used above to compile, run, and generate thread dumps during resource starvation.

Now it's time for the diagnosis. Let's open another vendor's thread dumps first to be fair.

Let's click on one of the thread dumps (see Figure 20). This time I redirected standard out to a file, sun.log. Surprise! We don't see any threads waiting on the monitor!! You can compare this with Figure 15 where we see two threads waiting on monitors. There's no monitor information whereas we see information on two monitors in Figure 15.

Let's select three thread dumps and invoke a thread comparison analysis as we did in Figure 6. We can see Plato, Aristotle, and Socrates are just in Object.wait() (see Figure 21). We don't see any ‘waiting on monitor.'
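You can reproduce this observation in miniature. The sketch below is a made-up demo, not the article's test case: a thread waits the way a hand-rolled monitor makes it wait, with a timed Object.wait() in a loop, and we read its state through java.lang.Thread.State.

```java
// Hypothetical demo: a waiter on a hand-rolled lock is not blocked on any Java monitor.
class HandRolledWaitDemo {

    /** Returns the state observed for a thread waiting on a hand-rolled lock. */
    static Thread.State observeWaiter() {
        final Object guard = new Object();
        final boolean[] taken = { true };      // pretend someone else holds the chopstick

        Thread socrates = new Thread(() -> {
            synchronized (guard) {
                while (taken[0]) {
                    try {
                        guard.wait(1000);      // same pattern as a one-second retry loop
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }, "Socrates");

        socrates.start();
        Thread.State observed = socrates.getState();
        try {
            long deadline = System.currentTimeMillis() + 5000;
            while (observed != Thread.State.TIMED_WAITING
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);              // give the waiter time to reach wait()
                observed = socrates.getState();
            }
            synchronized (guard) {
                taken[0] = false;              // hand the chopstick over
                guard.notifyAll();
            }
            socrates.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("Socrates is " + observeWaiter());
    }
}
```

The waiter sits in Object.wait() and nobody owns the guard's monitor in the meantime, so there is no owner for a thread dump to point to.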

We can click on one of Plato's method cells to find more details (see Figure 22). We can compare this with Figure 17. There's no information about which threads are waiting.

Let's close the Compare Threads window, select a thread dump from the Thread Dump List, and select the Analysis->Monitor Detail menu to visualize the monitors and threads in a thread dump. Another surprise! You don't see anything like Figure 14. There is no monitor information at all (see Figure 23).

Can IBM's thread dumps do any better? Let's open them up. In Figure 24 we don't see any blocked threads as in Figure 5.

Let's select all the thread dumps and run a Compare Threads by clicking on the right mouse button, icon, or Analysis->Compare Threads menu (see Figure 25). In Figure 6, we saw blocked threads but we don't see them anymore.

Plato's detailed view doesn't give us any clues (see Figure 26). There are no waiting threads. There's no monitor ownership.

If you implement your own monitor and you encounter resource starvation caused by the monitor, you are on your own! Neither of the Java virtual machines analyzed the application flow or gave us any useful information to diagnose this problem. If possible, we need to use Java's built-in monitor to take advantage of the information available in the thread dumps. Otherwise, you need to run a debugger. You might know that it's very challenging to run a debugger on production systems. In most cases, you're not allowed to run a debugger on production systems without management approval.

Summary
This article illustrated resource starvation caused by monitor contention, implemented two test cases to simulate resource starvation using a revised version of the Dining Philosophers Problem, and demonstrated with step-by-step instructions how easily IBM Thread and Monitor Dump Analyzer for Java can help diagnose a resource starvation problem.

You can diagnose a resource starvation problem with a debugger. What if you can't run a debugger in your environment? What if you can't recreate under your debugger the problem that your client reported? Thread dumps are very useful when a debugger isn't the best choice. What's even better is that we have IBM Thread and Monitor Dump Analyzer for Java to analyze thread dumps if you don't want to read through hundreds or thousands of thread stacks in raw thread dumps.

There are more features in IBM Thread and Monitor Dump Analyzer for Java that I haven't talked about. I hope I can introduce you to them in future articles.

More Stories By Jinwoo Hwang

Jinwoo Hwang is a software engineer, inventor, author, and technical leader at IBM WebSphere Application Server Technical Support in Research Triangle Park, North Carolina. He joined IBM in 1995 and worked with IBM Global Learning Services, IBM Consulting Services, and software development teams prior to his current position at IBM. He is an IBM Certified Solution Developer and IBM Certified WebSphere Application Server System Administrator as well as a SUN Certified Programmer for the Java platform. He is the architect and creator of the following technologies:

Mr. Hwang is the author of the book C Programming for Novices (ISBN:9788985553643, Yonam Press, 1995) as well as the following webcasts and articles:

Mr. Hwang is the author of the following IBM technical articles:

  • VisualAge Performance Guide, 1999
  • CORBA distributed object applet/servlet programming for IBM WebSphere Application Server and VisualAge for Java v2.0E, 1999
  • Java CORBA programming for VisualAge for Java, 1998
  • MVS/CICS application programming for VisualAge Generator, 1998
  • Oracle Native/ODBC application programming for VisualAge Generator, 1998
  • MVS/CICS application Web connection programming for VisualAge Generator, 1998
  • Java applet programming for VisualAge WebRunner, 1998
  • VisualAge for Java/WebRunner Server Works Java Servlet Programming Guide, 1998
  • RMI Java Applet programming for VisualAge for Java, 1998
  • Multimedia Database Java Applet Programming Guide, 1997
  • CICS ECI Java Applet programming guide for VisualAge Generator 3.0, 1997
  • CICS ECI DB2 Application programming guide for VisualGen, 1997
  • VisualGen CICS ECI programming guide, 1997
  • VisualGen CICS DPL programming guide, 1997

Mr. Hwang holds the following patents in the U.S. / other countries:

