On one specific host, a process sometimes crashes, so core dumping was enabled for it. A dump was written once, but since then there have been three more crashes and no dumps.

    $ cat /etc/security/limits.conf | grep core | grep -v '#'
    * - core unlimited
    $ cat /proc/sys/kernel/core_pattern
    /tmp/core.%e.%p.%h.%t
    $ ulimit -a
    core file size          (blocks, -c) unlimited
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 2063246
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 10240
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 2063246
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited
    $ cat /proc/$(pgrep myprocess)/limits | grep core
    Max core file size        unlimited            unlimited            bytes

There is enough free space (about 17 GB; a dump takes 2-4 GB). No one stopped the process manually. It really was a crash, judging by the logs:

monit:

    [NOVT Feb 12 05:07:21] error : 'myprocess' process is not running
    [NOVT Feb 12 05:07:21] info : 'myprocess' trying to restart
    [NOVT Feb 12 05:07:21] info : 'myprocess' start: /etc/init.d/myprocess
    [NOVT Feb 12 05:09:21] info : 'myprocess' process is running with pid 22233

nginx, which sends requests to this process (only the relevant lines are kept). We can see that by 05:06:20 it was already returning 502.

    1.2.3.4 myhost - [12/Feb/2016:05:05:49 +0600] "POST someurl" 200 2659 ...
    1.2.3.4 myhost - [12/Feb/2016:05:05:49 +0600] "POST someurl" 200 933 ...
    1.2.3.4 myhost - [12/Feb/2016:05:06:20 +0600] "POST someurl" 502 166 ...

I specifically tested and confirmed that core dumps are written under exactly the same configuration, including when one dump already exists (I used kill -SIGSEGV pid). UPD: tested directly on this host: the dump was written.
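A test of this kind can be sketched roughly as follows. This is only an illustration, not the exact commands that were run: `sleep` stands in for the real process, and the function name `segv_test` is made up. It prints the status the shell sees after the SIGSEGV (139 = 128 + 11). Whether a core file actually appears still depends on core_pattern and the limits shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start a throwaway process, kill it with SIGSEGV,
# and print the exit status reported by the shell.
segv_test() {
    (
        # Ask for an unlimited core size; ignore failure if the hard limit is lower.
        ulimit -c unlimited 2>/dev/null
        # sleep is only a stand-in for the real process.
        exec sleep 60 >/dev/null 2>&1
    ) &
    pid=$!
    sleep 1              # give the child time to exec
    kill -SEGV "$pid"
    wait "$pid"
    echo $?              # 128 + signal number when the child died from a signal
}
```

After running it, look for a new file matching the core_pattern (here /tmp/core.sleep.*) to confirm that a dump was written.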

The documentation lists possible reasons for a dump not being written, but none of those conditions appear to change between crashes, so dumps should be written either always or never.
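A few of the conditions listed in core(5) can be checked in one go for a running process. The sketch below is an assumption about which checks are useful here, not a complete list, and the function names `core_dir_from_pattern` and `check_core_prereqs` are invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: where does a given core_pattern put the dump?
# Prints "pipe" if the dump is handed to a program, "cwd" for a
# relative pattern (resolved against the process's working directory),
# otherwise the directory part of an absolute pattern.
core_dir_from_pattern() {
    case "$1" in
        \|*) echo "pipe" ;;
        /*)  dirname "$1" ;;
        *)   echo "cwd" ;;
    esac
}

# Hypothetical helper: print some facts that core(5) says affect dumping.
check_core_prereqs() {
    pid=$1
    pattern=$(cat /proc/sys/kernel/core_pattern)
    echo "core_pattern:  $pattern"
    echo "suid_dumpable: $(cat /proc/sys/fs/suid_dumpable)"
    grep 'Max core file size' "/proc/$pid/limits"
    dir=$(core_dir_from_pattern "$pattern")
    if [ "$dir" = "cwd" ]; then
        dir=$(readlink "/proc/$pid/cwd")
    fi
    if [ "$dir" != "pipe" ]; then
        # Note: this tests writability for the current user,
        # which may differ from the user the process runs as.
        if [ -w "$dir" ]; then
            echo "$dir is writable"
        else
            echo "$dir is NOT writable"
        fi
        df -h "$dir"
    fi
}
```

It could be called as check_core_prereqs "$(pgrep myprocess)", ideally from the same user account that the process runs under.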

Questions:

  • Can a Linux process terminate abnormally in a way that produces no dump, yet is not covered by the documented conditions under which a dump is not written?
  • What else could be the reason, where to dig, what to research?
  • By the way, I was looking for a suitable tag for configuring limits in Linux, but did not find one. If you know of one, please add it. - Nick Volynkin
  • And how is the process started? Can the limit be set directly in the init script? - o2gy
  • @o2gy cat /proc/$(pgrep myprocess)/limits | grep core confirms that the already running process has the correct limits. For now we are looking into the exit code with which the process ends. Some non-zero codes also do not cause a dump to be written. Once we get to the bottom of it, I will write a detailed answer. - Nick Volynkin
  • Exactly, maybe there is simply a return / exit somewhere in the process; then no core is produced, so there is no dump. - o2gy
  • In principle, it is better to get rid of every exit(0) entirely, or to make sure that in debug mode this event gets into the logs. - strangeqargo

1 answer

From @o2gy's comment:

Maybe there is a return / exit somewhere in the process? In that case the process ends by itself rather than crashing, so no core dump is generated.
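If that hypothesis is right, the way the process ended should be visible in its exit status. As a rough sketch (the function name `describe_status` is invented here): when a shell starts the process and waits for it, a status above 128 conventionally means death by signal number status - 128, while a lower value is an ordinary return / exit code. The convention is ambiguous, because a program can also call exit with a value above 128 itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: interpret an exit status as reported by the shell.
# Caveat: a program may call exit() with a value above 128 on its own,
# so "killed by signal" is a convention, not a guarantee.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited normally with code $status"
    fi
}

# Demo: an ordinary non-zero exit, which never produces a core dump.
sh -c 'exit 3'
describe_status $?

# Demo: a child shell that kills itself with SIGSEGV.
sh -c 'kill -SEGV $$' 2>/dev/null
describe_status $?
```

Logging this status in the init script or wrapper that starts the process would show whether the missing dumps line up with ordinary exits.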