Resolution: Not a Bug
Affects Version/s: 22.0
Fix Version/s: None
With ERL-908 fixed, we still hit issues when testing RabbitMQ with Erlang 22 built from the Git master branch.
An Erlang client hangs when trying to connect to an Erlang server using TLS 1.3. I'm going to attach:
- the test script I used to start the server and the client; it configures dbg to track calls into the SSL application
- the self-signed TLS certificates/keys
- the output of dbg
- a network capture
The client hangs in ssl:connect() or ssl:handshake() (when using gen_tcp to open the underlying TCP connection) because the state machine doesn't complete the handshake and never replies to the caller.
After spending quite some time reading the code, I see two problems:
The first one is that after handling the Server Hello record from the server, the client concludes that the negotiated version is TLS 1.2. In tls_handshake:hello(), it only looks at the Version field of the Server Hello (#server_hello.server_version), which is always set to TLS 1.2 from TLS 1.3 onward, instead of the effective version stored in #server_hello_selected_version.
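To illustrate the version selection I would expect, here is a minimal Python sketch. It is not the ssl application's code. It assumes the Server Hello extensions have already been parsed into a dict keyed by extension type (a hypothetical representation), and it relies on RFC 8446, where the server's supported_versions extension (type 43) carries the single selected version as two bytes.

```python
import struct

TLS_1_2 = 0x0303
TLS_1_3 = 0x0304
EXT_SUPPORTED_VERSIONS = 43  # extension type defined in RFC 8446


def negotiated_version(legacy_version, extensions):
    """Return the effective protocol version announced by a Server Hello.

    `legacy_version` is the Version field of the Server Hello.
    `extensions` is a hypothetical dict: extension type -> raw extension bytes.
    """
    selected = extensions.get(EXT_SUPPORTED_VERSIONS)
    if selected is None:
        # No supported_versions extension: the Version field is authoritative.
        return legacy_version
    if len(selected) != 2:
        raise ValueError("malformed supported_versions in Server Hello")
    (version,) = struct.unpack("!H", selected)
    # The selected version takes precedence over the legacy Version field.
    return version


if __name__ == "__main__":
    # A TLS 1.3 Server Hello: Version field says 1.2, extension says 1.3.
    print(hex(negotiated_version(TLS_1_2, {EXT_SUPPORTED_VERSIONS: b"\x03\x04"})))
```

The point is only the precedence: when the extension is present, the Version field must not be used as the negotiated version.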
I tried the following patch to fix this:
The second issue is that after Server Hello, the server sends the remaining handshake records wrapped inside Application Data records and the state machine doesn't seem to handle that: it calls ssl_connection:read_application_data() which, if I understand correctly, sends the data to the process owning the socket. Therefore those records are never handled by the state machine. This includes the Server Hello Done record, which explains why the state machine never replies to the initial caller.
I tried to prepare a patch but couldn't: my knowledge of the SSL application code is far too limited. In particular, I don't know how to decrypt the fragment in the Application Data record so that it can be passed to tls_handshake:get_tls_handshake().
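For reference, here is a rough Python sketch of the step that comes after decryption. It is not taken from the ssl application. It assumes the record payload has already been decrypted with the handshake traffic keys (that part is omitted), and the function names are hypothetical. Per RFC 8446, the decrypted payload is the real content, followed by one byte holding the real content type, followed by optional zero padding.

```python
CONTENT_TYPE_ALERT = 21
CONTENT_TYPE_HANDSHAKE = 22
CONTENT_TYPE_APPLICATION_DATA = 23


def split_inner_plaintext(inner):
    """Split a decrypted TLS 1.3 record payload into (content_type, content).

    `inner` is the already-decrypted payload: content, then the real
    content type byte, then any number of zero padding bytes.
    """
    stripped = inner.rstrip(b"\x00")  # drop the zero padding
    if not stripped:
        raise ValueError("no content type found in decrypted record")
    return stripped[-1], stripped[:-1]


def route(inner):
    """Decide who should receive the record (hypothetical dispatcher)."""
    content_type, content = split_inner_plaintext(inner)
    if content_type == CONTENT_TYPE_HANDSHAKE:
        return "handshake", content  # feed to the handshake state machine
    if content_type == CONTENT_TYPE_APPLICATION_DATA:
        return "application_data", content  # deliver to the socket owner
    return "other", content


if __name__ == "__main__":
    # A 4-byte handshake fragment, real type 22, then two padding bytes.
    print(route(b"\x14\x00\x00\x00" + b"\x16" + b"\x00\x00"))
```

Only records whose inner type is application data should reach the process owning the socket; records with inner type 22 belong to the handshake state machine even though the outer record type says Application Data.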
To start the test script for the server side:
To start the test script for the client side: