I know that we tend to blame every damn thing other than our code when things do not work as expected.
(those were the days, when I used to blame my RAM when my C programs used to crash. :D )
I know.. my code can have bugs, but at least as of now, it's been hard to trace them.
Surprisingly, I get strange results from the JVM, and that makes me believe the fault is all with the JVM.
1: The JVM crashes quite a number of times when my application is running at full throttle.
2: I use Eclipse to debug my code and I am way disappointed with it. Especially when multiple threads are involved: when I see that my application seems to be hung, I try to pause it.
No response from Eclipse... nothing happens.. I don't get control of the application at all.
3: I tried changing the default GC and other options as suggested, but none of it has helped.
Are there any good Java debuggers which do what we want?
4: I also print the application running and stopped times using -XX options.
Even this stops printing when the application hangs.
It might be because the application has been stopped for a long time (> 8 hrs), which would mean GC has been running all that while (which is quite absurd), OR
the application and GC have already quit but Eclipse fails to indicate that to me.
When I see this hang... If I try to stop the application, I am often ignored!
Even if I forcefully close Eclipse, a huge chunk of memory is still in use by Java and I can't even force-kill that Java process!
I have to reboot.
All this makes me think that the JVM/Eclipse has more bugs than my code. :)
What happens if you run the application without using Eclipse? As far as debuggers go, I would use jconsole to see what line each thread is at.
It is possible to lock up the JVM hard if you get a deadlock in the AWT thread, or during finalization, or various other mechanisms.
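For what it's worth, the per-thread view that jconsole shows can also be pulled from inside the app with the standard java.lang.management API, which helps when no GUI is attached. This is only a rough sketch: the class name ThreadDumper and its two methods are made up for illustration, and it assumes the JVM supports deadlock detection for ownable synchronizers (HotSpot does).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of the application discussed in this thread.
final class ThreadDumper {

    /** Builds a text dump of every live thread: name, state and stack frames. */
    static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip the (more expensive) locked monitor/synchronizer details
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" state=").append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    /** Number of threads the JVM currently reports as deadlocked (0 if none). */
    static int countDeadlockedThreads() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }
}
```

Calling ThreadDumper.dumpAllThreads() from a watchdog thread every minute or so and logging the result would show where each thread was sitting just before a hang. From outside the process, jstack with the process id (or kill -3 on Linux) gives a similar dump.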
Without knowing exactly what your app is doing I would still lay odds on it being your code, because many of us here have written hefty applications with no problem at all from the JVM.
Which JVM are you using?
If the JVM finds itself spending the bulk of its time in GC then it will end up throwing an OutOfMemoryError. This is to prevent the situation you are ascribing to it... so again, locking up is not a sign of a problem in GC.
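Rather than guessing how much time goes into GC, the application can ask the JVM directly. The sketch below sums what the standard GarbageCollectorMXBeans report. The class name GcStats and its methods are illustrative only, and the numbers are approximate accumulated times as reported by each collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for logging GC activity from inside the application.
final class GcStats {

    /** Approximate total milliseconds spent in all collectors since JVM start. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 means "not available"
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Total number of collections run by all collectors since JVM start. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 means "not available"
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** One-line summary suitable for periodic logging. */
    static String summary() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return "uptimeMs=" + uptime + " gcMs=" + totalGcMillis() + " gcCount=" + totalGcCount();
    }
}
```

Logging GcStats.summary() once per batch would show whether GC time really jumps around the 15-20 minute mark. If the collector is the problem, HotSpot would normally be expected to fail with "java.lang.OutOfMemoryError: GC overhead limit exceeded" rather than hang silently.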
As said, what happens if you don't run via Eclipse?
Also what does your batching code look like?
It could be your db causing the problem for all we know.
But I doubt these are the problems.
1: my app doesn't involve any UI.
2: My batching process involves processing thousands of records and storing a few of them.
I have an exact count of how many items I store in memory. Once I exceed the number of in-memory items, I get rid of them.
That's why I am able to run the application for around 15-20 minutes, processing around 50 million records.
Up to that point, the memory consumption is just around 1.5 GB as seen by the system monitor.
I didn't know about jconsole. Thanks.. I'll give it a try.
3: It's the same path of code which gets executed again and again. If it were a memory leak, it should have shown up much earlier.
It's like all of a sudden, after 15-20 minutes, GC seems to be busy releasing a lot of memory. The JVM should have thrown an OutOfMemoryError, but it has not done that either.
4: I am using the latest OpenJDK on Linux.
5: I did try running outside Eclipse, using the Ant scripts generated by Eclipse directly from the command prompt.
But the problem is, when the application hangs, I do not know how to debug from there.
6: I tried to get rid of the JVM by compiling to native code.
But native Java compilers seem to be at the beta stage. It crashed while compiling my code!!! I used gcj.
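To make point 2 above concrete: a store with a hard cap on in-memory items can be sketched with the JDK's LinkedHashMap, which can evict its oldest entry on insert. This is an assumption about what the batching code roughly does, not the actual code (which has not been posted), and BoundedStore is a made-up name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded store: never holds more than maxEntries items.
final class BoundedStore<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedStore(int maxEntries) {
        // accessOrder=false keeps insertion order, so the oldest entry goes first
        super(16, 0.75f, false);
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each put; returning true drops the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a structure like this the count cannot drift, because eviction happens inside put itself. If the real code tracks the count by hand in a separate variable, that bookkeeping is one place worth checking. Note that this sketch is not thread-safe.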
What happens when you use an Oracle JRE?
Have you profiled your app to see where the memory is going? Have you taken heap dumps at regular intervals, say every 5 minutes, which would give you 3 or 4 good ones?
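If attaching a tool is awkward, a heap dump can also be triggered from inside the process. The sketch below uses the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean, so it assumes a HotSpot-based JVM (Sun/Oracle JDK or OpenJDK). The class name HeapDumper, the label parameter and the file naming scheme are all made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.lang.management.ManagementFactory;

// Hypothetical helper; HotSpot-only because of the com.sun.management bean.
final class HeapDumper {
    private static final String HOTSPOT_BEAN = "com.sun.management:type=HotSpotDiagnostic";

    /**
     * Writes one heap dump into dir and returns the file.
     * The name includes a timestamp because dumpHeap refuses to overwrite an existing file.
     */
    static File dump(File dir, String label) {
        File target = new File(dir, label + "-" + System.currentTimeMillis() + ".hprof");
        try {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    HOTSPOT_BEAN,
                    HotSpotDiagnosticMXBean.class);
            // true = dump only live (reachable) objects
            bean.dumpHeap(target.getPath(), true);
        } catch (IOException e) {
            throw new IllegalStateException("heap dump to " + target + " failed", e);
        }
        return target;
    }
}
```

Calling HeapDumper.dump from a timer every 5 minutes would give the 3 or 4 dumps mentioned above, which can then be compared in Eclipse MAT or jhat. The same kind of file can be produced from outside with jmap -dump:format=b,file=heap.hprof followed by the process id.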
I still suspect it's how you're doing your work. The whole "storing some" just rings alarms for me.
Ok, so (we think at least) the lock up has something to do with garbage collection. Are you overriding "finalize" anywhere? If so, you may be introducing a deadlock condition there, which would then appear to be related to garbage collection. Deadlocks can be inadvertently coded in such a way that they happen probabilistically....although such things tend to manifest themselves after days of running to help ensure you can never track them down. ;P
Nopes, I have not touched finalize.
I will try to analyze the memory dumps. I was not aware of these features and I just got used to jconsole!
Every time I strongly believe that the problem is with my code, I see the JVM crashing. Then I am forced to believe that the JVM is the culprit :( I am stuck in this loop. :)
If you don't even know the basics of profiling and memory debugging in Java then how can you lay the blame at the garbage collector?
jmap, jhat and Eclipse MAT should all have been used to see exactly what's going on inside there.
Once you have some dumps then you might get an idea of what's being cleared up.
Of course, again, we haven't seen any of your code...so it's even more guesswork from our side.
Well, one other thing that can cause the JVM to crash (or possibly lock up) is to replace the class or jar files for your project while the JVM is running. This can happen inadvertently if you are compiling a running project where you run from the same location you compile. If you are doing nightly builds, and you are running for multiple days at a time, this could be the problem.
Instead of creating and dumping all of those objects over and over, why not make them cacheable and reassignable? In other words, create a pool of them, and on each batch, reset their values as needed. Your GC issues (if that's what they are) will be moot.
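A bare-bones version of that pooling idea might look like the sketch below. The Record type and its fields are placeholders, since the real record class has not been shown, and the pool is deliberately single-threaded: with multiple worker threads it would need one pool per thread or some synchronization.

```java
import java.util.ArrayDeque;

// Hypothetical object pool; not thread-safe by design.
final class RecordPool {

    /** Placeholder for the application's record type. */
    static final class Record {
        long id;
        String payload;

        /** Clears all fields so no stale data leaks into the next batch. */
        void reset() {
            id = 0L;
            payload = null;
        }
    }

    private final ArrayDeque<Record> idle = new ArrayDeque<Record>();
    private final int maxIdle;

    RecordPool(int maxIdle) {
        this.maxIdle = maxIdle;
    }

    /** Hands out a pooled record if one is idle, otherwise allocates a new one. */
    Record acquire() {
        Record r = idle.pollFirst();
        return r != null ? r : new Record();
    }

    /** Resets the record and keeps it for reuse unless the pool is already full. */
    void release(Record r) {
        r.reset();
        if (idle.size() < maxIdle) {
            idle.addFirst(r);
        }
    }

    int idleCount() {
        return idle.size();
    }
}
```

The maxIdle cap matters: an unbounded pool that only ever grows is itself a memory leak.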
That's the thing, though.
There's nothing to actually indicate this is anything to do with GC.
No point trying to fix something that may well not be broken as you'll likely end up making things worse.
1: The program is not compiled after it started running.
2: I did try to cache objects and reuse them as the first resort to isolate GC issue but that didn't help.
3: I am not saying GC is the culprit. I don't have any evidence for that.
All that I am saying is that Java is unable to handle my application and GC SEEMED to be the culprit.
If my application is buggy, then I don't expect it to run for days and crash after processing more than 200 million records.
As I said earlier, it's the same path of execution, and hence the chance of a new timing window creating trouble is very small, though still possible.
There is no database involved, there are no sockets involved... it's just a simple application which runs tirelessly.
But whatever the bug is, the last thing I expect is for the JVM to crash.
I see it crash repeatedly.
The latest crash is funny!
*** glibc detected *** /usr/lib/jvm/java-6-openjdk/bin/java: corrupted double-linked list: 0x00007f02044019b0 ***
======= Backtrace: =========
I am now repenting for not using C instead of Java.
I mean, somehow the rest of the enterprise world manages to do absolutely huge things with Java... I'd say whatever is wrong is something specific to your app/environment (as in OpenJDK?).
Quote:
I am now repenting for not using C instead of Java.
As I said earlier, and you ignored, what happens when you try a JVM that isn't OpenJDK?
As quad64bit says, other enterprise apps of hefty proportions survive. I know, I've worked on several.
As for "If my application is buggy, then I don't expect it to run for days and crash after processing more than 200 million records." I would point out that we had one recently that ran for 3 months before hitting an out of memory error caused by the tiniest of leaks in some db code. So no, time-to-crash is no indicator.
But that crash strikes me as a bug in OpenJDK. So, again, try the Sun/Oracle JVM.
The only times I've ever seen a JVM crash are:
- when I recompiled classes out from under the JVM
- when I used JNI and had a bug in it
- when using buggy add-on packages that use JNI
Tolls is spot on though. If you think it's a JVM bug, then why have you not tried other JVMs? It might even be a documented problem... have you searched the bugs for the version of OpenJDK you are using?
I tried using a different JVM, but that too fails after running for some time.
# A fatal error has been detected by the Java Runtime Environment:
# SIGSEGV (0xb) at pc=0x00007ff4de4f7ef9, pid=2199, tid=140689545758464
# JRE version: 6.0_24-b07
# Java VM: Java HotSpot(TM) 64-Bit Server VM (19.1-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x7aef9]
Is there anything else that I can try?
Are you using JNI, or are you using any 3rd party packages that use JNI?