After compiling my code and running it successfully several times on my machine, I deployed it to the production server. There it ran for around 5 minutes, then terminated with nothing but the word “Killed”: an ambiguous message that gives no explanation of why this happened or who killed the process.
Googling the word “Killed” didn’t really help at first. Digging further, though, I found out that the message is often associated with the OOM Killer, a Linux kernel mechanism (not a daemon) that, when the system runs critically low on memory, picks a process and terminates it with SIGKILL, which the shell reports as “Killed”. It seemed that the OOM Killer was picking my application every time. I went back to the development server, profiled the application thoroughly, fixed minor memory leaks, and even forced garbage collection on a timer. None of that really helped.
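One way to confirm that the OOM Killer really was responsible is to look for its entries in the kernel log. The snippet below is a minimal sketch, not something from my original debugging session: find_oom_kills is a hypothetical helper name, and the search patterns are assumptions based on the usual wording of these kernel messages, which varies between kernel versions.

```shell
# Filter kernel log text for lines that record an OOM kill.
# Reads from stdin, so it works with dmesg output, journalctl -k,
# or a saved log file. The patterns are assumptions about typical
# kernel wording and may need adjusting for your kernel version.
find_oom_kills() {
    grep -i -E 'out of memory|oom-killer|killed process'
}

# Typical usage (may require root if kernel.dmesg_restrict is set):
#   dmesg | find_oom_kills
```

If this prints a line naming your process shortly before the “Killed” message appeared, the OOM Killer is the likely culprit.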
I read about how the OOM Killer works, pages and pages discussing whether it’s good or bad. They all agreed on one thing, though: once the OOM Killer starts hunting down applications, soon enough your system will need to be restarted, because its hunting accuracy is far from perfect. You can check a given application’s OOM score by reading /proc/<PID>/oom_score (for example, cat /proc/<PID>/oom_score).
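The higher this score, the more likely the application is to be picked for termination when memory runs out. If you want an application to be exempt from the OOM Killer, you can set its oom_adj value to -17 (on newer kernels oom_adj is deprecated in favor of oom_score_adj, where -1000 has the same effect); however, this is strongly discouraged.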
echo -17 > /proc/<PID>/oom_adj
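To see which processes the OOM Killer currently favors, you can compare the scores of everything running on the box. This is only a sketch under a few assumptions: list_oom_scores is a hypothetical helper name, PROC_ROOT is a made-up override that exists purely so the function can be pointed at a test directory, and it relies on the standard /proc/<PID>/oom_score and /proc/<PID>/comm files.

```shell
# Print "score pid name" for every process, highest OOM score first.
# PROC_ROOT is a hypothetical override for testing; it defaults to /proc.
list_oom_scores() {
    proc_root="${PROC_ROOT:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Processes can exit while we iterate, so skip unreadable entries.
        [ -r "$dir/oom_score" ] || continue
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || name="?"
        printf '%s %s %s\n' "$score" "${dir##*/}" "$name"
    done | sort -rn
}

# Typical usage: show the ten most likely OOM victims.
#   list_oom_scores | head -n 10
```

The process at the top of the list is the one most likely to be killed the next time memory runs out.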
I didn’t need to do that, though. The machine hung and I had to do a hard restart. Once it was back up, I ran the exact same code and it ran perfectly. So far it’s been running for several hours without a glitch, and with the machine’s load gradually increasing everything is still working. So evidently that was the mandatory restart required for any Java application server running for extended periods of time (my original uptime was around 200 days).