Discussion: JVM hangs
Have you ever run into a JVM hang? My JVM consumed more than 90% of system CPU for 4 hours. During that time neither log4j nor "kill -3" worked, which makes it hard for me to do the root cause analysis (RCA).
I've tried to reproduce the case by creating an infinite loop to drive CPU usage very high (almost 100%), but log4j and "kill -3" still worked fine.
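For reference, a busy-loop reproduction along these lines might look like the sketch below. The names (BusyLoopDemo, startSpinners, stopSpinners) are made up for illustration; this is not the actual test code from the post. A loop like this burns CPU in ordinary Java threads, which would be consistent with log4j and "kill -3" still responding.

```java
import java.util.concurrent.TimeUnit;

public class BusyLoopDemo {
    // Hypothetical flag used only to let the demo stop its spinner threads.
    private static volatile boolean running = true;

    // Starts `count` daemon threads that spin until stopSpinners() is called.
    static Thread[] startSpinners(int count) {
        Thread[] threads = new Thread[count];
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                long counter = 0;
                while (running) {
                    counter++; // pure CPU work, no I/O and no locks
                }
            }, "spinner-" + i);
            t.setDaemon(true);
            t.start();
            threads[i] = t;
        }
        return threads;
    }

    static void stopSpinners() {
        running = false;
    }

    public static void main(String[] args) throws InterruptedException {
        // One spinner per CPU to push overall usage close to 100%.
        startSpinners(Runtime.getRuntime().availableProcessors());
        // The main thread keeps reporting, so it is easy to see whether
        // the JVM still responds (try "kill -3 <pid>" while this runs).
        for (int i = 0; i < 10; i++) {
            System.out.println("main thread still alive, tick " + i);
            TimeUnit.SECONDS.sleep(1);
        }
        stopSpinners();
    }
}
```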
So I suppose that when the problem occurred, the JVM was in a "hung" state and was not handling any requests (including RMI calls).
Do you have any ideas or suggestions about that?
Does this happen to you regularly? What type of application causes it? It would help if you could provide those details.
I have come across this a few times because of bad multithreading in my applications.
Actually, it's not a regular issue. So far I have hit it twice in all. I found the root cause the first time: it was caused by an incompatibility between the JDK and a Solaris patch. For the second occurrence, what I can confirm is that the Solaris patch was not installed at all, so it should be a new issue.
My JVM makes some RMI calls, forks some external processes such as "df -kl", and runs some other business-related transactions. After analyzing the logs collected by the customer (I didn't have the chance to debug on site), I found that log4j was not working at that moment (no log output at all). The JVM took more than 90% CPU for 4 hours, after which the customer rebooted the system...
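The post doesn't say how the application forks "df -kl", so the following is only a sketch under assumptions: it uses ProcessBuilder, sends the child's output to a temporary file so the child cannot block on a full pipe, and gives up after a timeout. ExternalCommand and run are hypothetical names, and the 30-second timeout is an arbitrary choice.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class ExternalCommand {
    // Runs a command, captures stdout+stderr, and fails if it exceeds the timeout.
    static String run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        File out = File.createTempFile("cmd-", ".out");
        try {
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true) // merge stderr into stdout
                    .redirectOutput(out)       // write to a file instead of a pipe
                    .start();
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                p.destroyForcibly();           // do not leave a stuck child behind
                throw new IOException("timed out after " + timeoutSeconds
                        + "s: " + command);
            }
            return new String(Files.readAllBytes(out.toPath()));
        } finally {
            out.delete();
        }
    }

    public static void main(String[] args) throws Exception {
        // "df -kl" is the command mentioned in the post.
        System.out.print(run(Arrays.asList("df", "-kl"), 30));
    }
}
```

With a bounded wait like this, a child process that never returns shows up as an exception in the caller instead of a thread that blocks forever.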
Besides, we have a daemon process that runs "kill -3 <pid>" to dump the JVM threads when the system's idle CPU is very low, to help us troubleshoot. However, that did not work either... no JVM thread stacks were printed to the logs.
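As a complement to the external "kill -3" daemon, a thread dump can also be requested from inside the JVM through the standard java.lang.management API. The sketch below is only an illustration (InProcessThreadDump and dumpAllThreads are made-up names). It writes to System.err rather than through log4j, so it can help tell "log4j is stuck" apart from "the whole JVM is stuck". If the JVM cannot run any Java code at all, this will presumably not produce output either.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class InProcessThreadDump {
    // Builds a plain-text dump of every live thread and its stack trace.
    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false/false: skip monitor and synchronizer details, which not
        // every JVM supports.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append('"')
              .append(" state=").append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Written straight to stderr, bypassing any logging framework.
        System.err.print(dumpAllThreads());
    }
}
```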
I'm quite frustrated. Even for the first occurrence, which I "solved" after the Solaris patch was updated, I still don't know what conditions can make the JVM become completely "suspended"...
Any ideas or tips would be extremely appreciated...
Something looks suspicious to me. What is holding up log4j? Maybe the log output stream is causing the issue. Can you look at the last few lines of the log and try to work out what the last operation was?
Are you working on Windows, Linux, or something else?
I'm running the application on Solaris. I suspect the log4j output stream too. But as far as I know, log4j has its own exclusive stream and does not use System.in/out/err, so how would that exclusive stream be affected? Do you have any clues?
Yes, log4j executes on its own threads, but it still needs to communicate with your application through the JVM. If log4j somehow hangs on those streams, it could cause the VM to hang as well.
So my first suggestion is to isolate the application from log4j, if you can do that easily, and check whether anything unexpected is going on in your code. Make sure that your application is not the cause.