Erlang/OTP
ERL-959

Distribution inet send_cnt & recv_cnt report incorrect values

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not a Bug
    • Affects Version/s: 22.0.1
    • Fix Version/s: None
    • Component/s: inets
    • Labels:
      None

      Description

      Packet counts reported by the distribution link are either too low or too high.

      For example, the network interface reports ~3kpps TX in total, but send_cnt via erlang:port_info/2 reports 0.4pps (7500x fewer pps).

      On the same host, the network interface reports ~3kpps RX, but recv_cnt reports 16kpps (5x more).

      Furthermore, if send_cnt is 0.4pps, how can send_bytes be 100MB/s? A TCP packet cannot exceed 64K in Linux, and according to recv_cnt / recv_bytes received packets do not exceed 1.5KB (the Ethernet MTU size), but sent packets exceed 150MB...

      I've attached the relevant metrics screenshots. Will package the entire setup in Docker Compose and share so that you can reproduce locally.
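As a point of reference, the counters in question can be read per distribution channel. A minimal sketch (module and function names are mine, not from this ticket) using erlang:system_info(dist_ctrl); note that with TLS distribution the channel controller is a process rather than a port, so this only covers plain TCP distribution and returns [] otherwise:

```erlang
-module(dist_stats).
-export([all/0]).

%% List {Node, Counters} for each distribution channel whose
%% controlling entity is a port (plain TCP distribution).
%% Returns [] on a non-distributed node, and under TLS
%% distribution, where the controller is a process.
all() ->
    [{Node, inet:getstat(Ctrl, [recv_cnt, recv_oct, send_cnt, send_oct])}
     || {Node, Ctrl} <- erlang:system_info(dist_ctrl), is_port(Ctrl)].
```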

      1. erl-959.tar
        107 kB
        Gerhard Lazu
      2. erl-959.tar.sha256
        0.1 kB
        Gerhard Lazu
      1. avg-inet-packet-size-explain.png
        480 kB
      2. no-tls-dist-inet-avg-packet-size-received.png
        1.46 MB
      3. no-tls-dist-inet-avg-packet-size-sent.png
        1.48 MB
      4. no-tls-dist-inet-overview.png
        1.63 MB
      5. rabbitmq-overview-tls-dist-1mb-inet-buffer.png
        1009 kB
      6. rabbitmq-overview-tls-dist-default-inet-buffer.png
        958 kB
      7. tls-dist-1mb-inet-buffer.png
        1.48 MB
      8. tls-dist-default-inet-buffer.png
        1.44 MB
      9. tls-dist-inet-avg-packet-size-vs-avg-tcp-packet-values.png
        2.16 MB
      10. tls-dist-processes-1mb-inet-buffer.png
        1.56 MB
      11. tls-dist-processes-default-inet-buffer.png
        1.70 MB

        Activity

        gerhard Gerhard Lazu created issue -
        gerhard Gerhard Lazu added a comment - edited

        For extra context, this fits within prometheus.erl - Add a collector for distribution related metrics. Distribution uses TLS, Erlang 22.0.1 runs on Ubuntu 18.04 LTS within Docker with Kernel 4.9.125-linuxkit.

        gerhard Gerhard Lazu added a comment -

        Attaching all files required to reproduce the issue described here. See the README.md in the tarball for more info. File SHA256 is 20d795899dcdc129487c9c6411f29b67bd3ac34ef6a05623bcb75f81b13c939e

        gerhard Gerhard Lazu made changes -
        Field Original Value New Value
        Attachment erl-959.tar.sha256 [ 11339 ]
        Attachment erl-959.tar [ 11340 ]
        gerhard Gerhard Lazu made changes -
        Description edited
        rickard Rickard Green made changes -
        Assignee Team PS [ otp_team_ps ]
        lukas Lukas Larsson added a comment -

        Hello,

        I think there is some confusion about terminology. The packet referred to in the inet:getstat/2 documentation is not the number of TCP packets, but rather the number of inet packets, which for the distribution channel is the number of distribution messages sent.

        I don't know if this explains all of the discrepancies, but it should explain some of them.
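This distinction can be illustrated on a loopback socket (a sketch, not from the ticket; module name is mine): a single gen_tcp:send/2 of a 1 MB binary counts as a single inet-level send in send_cnt, even though the kernel slices it into many TCP segments (MSS around 1460 bytes) on the wire.

```erlang
-module(inet_pkt_demo).
-export([run/0]).

%% One large gen_tcp:send/2 is one inet-level "packet" for
%% send_cnt, regardless of how many TCP segments it becomes.
run() ->
    {ok, L} = gen_tcp:listen(0, [binary, {active, false}]),
    {ok, Port} = inet:port(L),
    {ok, C} = gen_tcp:connect({127,0,0,1}, Port, [binary, {active, false}]),
    {ok, S} = gen_tcp:accept(L),
    Self = self(),
    %% Drain the receiving side so the sender is never blocked
    %% by full socket buffers.
    Drainer = spawn(fun() ->
                        receive go -> ok end,
                        Self ! {drained, drain(S, 0)}
                    end),
    ok = gen_tcp:controlling_process(S, Drainer),
    Drainer ! go,
    ok = gen_tcp:send(C, binary:copy(<<0>>, 1024 * 1024)),
    {ok, [{send_cnt, N}]} = inet:getstat(C, [send_cnt]),
    %% shutdown/2 flushes queued data before sending FIN.
    ok = gen_tcp:shutdown(C, write),
    Result = receive {drained, Bytes} -> {Bytes, N} end,
    gen_tcp:close(C),
    Result.

drain(S, Acc) ->
    case gen_tcp:recv(S, 0) of
        {ok, Bin} -> drain(S, Acc + byte_size(Bin));
        {error, closed} -> Acc
    end.
```

Here run() returns the total bytes received (1048576) together with the sender's send_cnt, which stays small (typically 1) while the kernel-level TX packet counter would grow by hundreds of segments.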

        gerhard Gerhard Lazu added a comment - edited

        Hi Lukas,

        Your answer puts those packets into context; it definitely helps.

        How would you explain the fact that rabbit@rabbitmq-dist-tls1 sends 0.3 pps to rabbit@rabbitmq-dist-tls2, while rabbit@rabbitmq-dist-tls2 receives 27.7 kpps from rabbit@rabbitmq-dist-tls1? You can see this in the second screenshot, the one with the tooltips. This results in the receiving node reporting that all inet packets are 1.5KB (which matches the MTU size), while the sending node reports inet packets in excess of 150MB. If the sent messages are counted as inet packets before they get split into TCP packets, that is understandable. In this case I would expect the receiving node to report TCP packets and not Ethernet frames - which currently seems to be the case.

        lukas Lukas Larsson added a comment -

        I don't have any explanation for that. I tried to replicate the behaviour using the docker-compose files that you provided, but on my system the values seem to correspond on the sending and receiving sides.

        gerhard Gerhard Lazu added a comment -

        This doesn't feel like a bug, so I'm marking it as Not a Bug.

        Would like to double-check that our understanding is accurate:

        1. Sent inet packets are arguments passed to gen_tcp:send/2 or equivalent. inet packets are split into TCP packets by the system kernel. The number reported by send_cnt is the message payload (inet packet) that gen_tcp:send/2 is called with.

        2. Received inet packets are data returned by gen_tcp:recv/2. The maximum size of these packets is determined by the inet dist buffer, which defaults to 1460 bytes. This can be configured by setting, e.g.:

        -kernel inet_dist_connect_options [{buffer,1048576}] -kernel inet_dist_listen_options [{buffer,1048576}]
        

        Increasing this setting as per the above configuration yields the following result (attaching screenshot shortly).
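For completeness, one way these flags can be passed when starting a node by hand (the node name here is illustrative):

```shell
erl -name rabbit@node1 \
    -kernel inet_dist_connect_options '[{buffer,1048576}]' \
    -kernel inet_dist_listen_options '[{buffer,1048576}]'
```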

        gerhard Gerhard Lazu made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Assignee Team PS [ otp_team_ps ] Lukas Larsson [ lukas ]
        Resolution Not a Bug [ 10102 ]
        lukas Lukas Larsson added a comment -

        The names of your distributed nodes kind of implies that you are running the Erlang distribution over TLS, is that correct? If you do, then that changes what is considered a packet as it will batch things and optimize in order to minimize the overhead of TLS.

        1. Sent inet packets are arguments passed to gen_tcp:send/2 or equivalent. inet packets are split into TCP packets by the system kernel. The number reported by send_cnt is the message payload (inet packet) that gen_tcp:send/2 is called with.

        Yes. Though I would like to amend that "inet packets are split *or joined* into TCP packets by the system kernel".

        2. Received inet packets are data returned by gen_tcp:recv/2. The maximum size of these packets is determined by the inet dist buffer, which defaults to 1460 bytes.

        Yes, if the packet mode is set to 0. If it is not set to 0, then it is the number of inet packets sent by the other side.
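The packet-mode difference can be seen on a loopback socket (a sketch with an illustrative module name): with {packet,4} every message is length-prefixed, so each recv returns exactly one sender-side message and recv_cnt tracks messages rather than buffer-sized chunks.

```erlang
-module(pkt_mode_demo).
-export([run/0]).

%% With {packet,4} the receiver gets back exactly the messages
%% that were sent, regardless of how the kernel segmented them.
run() ->
    Opts = [binary, {packet, 4}, {active, false}],
    {ok, L} = gen_tcp:listen(0, Opts),
    {ok, Port} = inet:port(L),
    {ok, C} = gen_tcp:connect({127,0,0,1}, Port, Opts),
    {ok, S} = gen_tcp:accept(L),
    ok = gen_tcp:send(C, <<"hello">>),
    ok = gen_tcp:send(C, binary:copy(<<1>>, 10000)),
    %% Each recv yields one whole message, reassembled by the driver.
    {ok, <<"hello">>} = gen_tcp:recv(S, 0),
    {ok, Big} = gen_tcp:recv(S, 0),
    {ok, [{recv_cnt, N}]} = inet:getstat(S, [recv_cnt]),
    [gen_tcp:close(X) || X <- [C, S, L]],
    {byte_size(Big), N}.
```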

        gerhard Gerhard Lazu added a comment - edited

        The reason you didn't see any distribution traffic is that the nodes sometimes don't auto-cluster. It's a race condition that we were able to reproduce on a different host; the quick fix is to restart everything: docker-compose down && docker-compose up

        Yes, the distribution uses TLS. You can see the relevant TLS processes on the same dashboard, towards the end of the page. If you want to disable TLS, remove all lines in rabbitmq-env.conf and restart the rabbitmq-dist-tls nodes: docker-compose restart rabbitmq-dist-tls{1..3}

        So inet packets larger than the TCP window are split into TCP packets by the system kernel, and packets smaller than the TCP window are joined into TCP packets by the system kernel, correct?

        I can confirm that when TLS is disabled, the number of inet packets matches on both the send & receive sides. I can also see that the size of inet packets is in the TCP window size ballpark. See attached screenshots (tested on OTP 22.0.2)

        gerhard Gerhard Lazu made changes -
        Attachment Screenshot 2019-05-30 at 18.26.56.png [ 11336 ]
        gerhard Gerhard Lazu made changes -
        Attachment Screenshot 2019-05-30 at 18.26.45.png [ 11337 ]
        lukas Lukas Larsson added a comment -

        So inet packets larger than the TCP window are split into TCP packets by the system kernel, and packets smaller than the TCP window are joined into TCP packets by the system kernel, correct?

        smaller packets *may be* joined.

        I'm not really surprised that the packet size that you get from the inet driver varies quite a bit under TLS. The TLS dist uses all kinds of tricks that will vary the packet to bytes ratio in order to be secure and performant at the same time.

        gerhard Gerhard Lazu made changes -
        Attachment avg-inet-packet-size-explain.png [ 11351 ]
        gerhard Gerhard Lazu added a comment -

        Thank you, all captured.


          People

          • Assignee:
            lukas Lukas Larsson
            Reporter:
            gerhard Gerhard Lazu
          • Votes:
            0
            Watchers:
            2
