Erlang/OTP / ERL-525

better diagnostics for memory allocation failures

    Details

    • Type: Improvement
    • Status: Help Wanted
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: R16B03-1, 17.5, 18.3.4.2
    • Fix Version/s: None
    • Component/s: erts
    • Labels:
      None

      Description

      When the VM terminates due to a memory allocation failure, it writes a short note at the start of the crash dump file, with a message similar to "failed to allocate N bytes of type T".

      Given the VM's layered and highly complex memory allocation framework, this is not enough to deduce the root cause of the allocation failure. In particular, if an OS-level call failed, the message should say which call failed, with which arguments and errno value, and where in the source. For example:

      failed to allocate N bytes of type heap, due to mmap(<actual mmap params>) failing with ENOMEM in file F.c line L.

      and similarly for any failed mremap, sbrk, brk, malloc, etc.
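
      To illustrate the kind of message requested above, here is a minimal sketch in C. It is not taken from the erts sources: the names os_alloc_failure, diag_mmap, DIAG_MMAP, errno_name and format_alloc_failure are hypothetical, and the sketch assumes a POSIX system where anonymous mappings are requested with MAP_ANONYMOUS. The wrapper records the failing call, its arguments, errno and the call site, and the formatter appends that record to the existing "failed to allocate" text.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical record of the most recent failed OS-level allocation call. */
struct os_alloc_failure {
    const char *call;   /* name of the OS call, e.g. "mmap"; NULL if none failed */
    char args[128];     /* the actual arguments, rendered as text */
    int err;            /* errno value captured right after the failure */
    const char *file;   /* source file of the call site */
    int line;           /* source line of the call site */
};

static struct os_alloc_failure last_failure;

/* Map a few errno values to their symbolic names (deliberately not exhaustive). */
static const char *errno_name(int err)
{
    switch (err) {
    case ENOMEM: return "ENOMEM";
    case EINVAL: return "EINVAL";
    case EAGAIN: return "EAGAIN";
    case EPERM:  return "EPERM";
    default:     return "unknown errno";
    }
}

/* Anonymous private mapping of len bytes; on failure, record the details
 * in last_failure and return NULL instead of MAP_FAILED. */
static void *diag_mmap(size_t len, const char *file, int line)
{
    int prot = PROT_READ | PROT_WRITE;
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p = mmap(NULL, len, prot, flags, -1, 0);

    if (p == MAP_FAILED) {
        int saved = errno; /* save before any other library call can change it */
        last_failure.call = "mmap";
        snprintf(last_failure.args, sizeof last_failure.args,
                 "NULL, %zu, 0x%x, 0x%x, -1, 0",
                 len, (unsigned)prot, (unsigned)flags);
        last_failure.err = saved;
        last_failure.file = file;
        last_failure.line = line;
        return NULL;
    }
    return p;
}

/* Capture the call site automatically. */
#define DIAG_MMAP(len) diag_mmap((len), __FILE__, __LINE__)

/* Build the crash dump note. Falls back to the short form when no
 * OS-level failure has been recorded. Returns snprintf's result. */
static int format_alloc_failure(char *buf, size_t size,
                                size_t nbytes, const char *type)
{
    assert(buf != NULL && type != NULL);
    if (last_failure.call == NULL) {
        return snprintf(buf, size, "failed to allocate %zu bytes of type %s",
                        nbytes, type);
    }
    return snprintf(buf, size,
                    "failed to allocate %zu bytes of type %s, "
                    "due to %s(%s) failing with %s (%s) in file %s line %d.",
                    nbytes, type, last_failure.call, last_failure.args,
                    errno_name(last_failure.err), strerror(last_failure.err),
                    last_failure.file, last_failure.line);
}
```

      A caller would use DIAG_MMAP(n) in place of a direct mmap call and, when it returns NULL, pass the requested size and allocation type to format_alloc_failure before writing the crash dump. Equivalent wrappers for mremap, sbrk, brk and malloc would fill in the same record.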

      We have been chasing infrequent and highly non-deterministic crashes due to allocation failures for quite some time. Their non-deterministic nature, coupled with the fact that the hosts had plenty of free RAM when the VM claimed "out of memory", made it difficult to pinpoint the root cause. A more detailed message in the crash dump, like the one above, would have steered us in the right direction much earlier.

      (The root cause turned out to be external to the VM, so the only issue with the VM is the lack of details in the allocation failure message.)

            People

            Assignee:
            Team VM (otp_team_vm)
            Reporter:
            Mikael Pettersson
            Votes:
            1
            Watchers:
            3