I'm a contributor to the kafka_ex project (written in Elixir), and since OTP 22 we've had random timeouts when sending to or receiving from Kafka.
I've narrowed it down to a problem happening between an `:ssl.send` and an `:ssl.recv`, so I believe it's a regression in OTP's `:ssl` library.
To confirm this, I wrote a test that does the same thing many times: produce a message to Kafka, which involves one `:ssl.send` followed by one `:ssl.recv`.
After a number of iterations (somewhere between 50 and 300), a timeout occurs in the `:ssl.recv`.
Once that happens, every further attempt to use that socket times out. A newly opened socket, on the other hand, works.
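
For illustration, here is a minimal sketch of the kind of loop described above. It is not the actual kafka_ex code or the test from the repo: the module name `SslRoundTrip`, the `request` argument (an already encoded Kafka request), the socket options (`packet: 4`, `verify: :verify_none`) and the default timeout are all assumptions made for the example.

```elixir
defmodule SslRoundTrip do
  # Hypothetical helper, not part of kafka_ex.
  # `request` is assumed to be an already encoded Kafka request binary,
  # without the 4-byte length prefix (added by `packet: 4`).
  def run(host, port, request, iterations \\ 500, timeout \\ 5_000) do
    {:ok, _apps} = Application.ensure_all_started(:ssl)

    # Assumed options for the sketch: passive socket, 4-byte length framing,
    # no certificate verification. kafka_ex may configure this differently.
    opts = [:binary, active: false, packet: 4, verify: :verify_none]

    # Raises a MatchError if the connection cannot be established.
    {:ok, socket} = :ssl.connect(String.to_charlist(host), port, opts, timeout)

    # One send followed by one recv per iteration; stop at the first error.
    result =
      Enum.reduce_while(1..iterations, :ok, fn i, _acc ->
        with :ok <- :ssl.send(socket, request),
             {:ok, _response} <- :ssl.recv(socket, 0, timeout) do
          {:cont, :ok}
        else
          {:error, reason} -> {:halt, {:error, i, reason}}
        end
      end)

    :ssl.close(socket)
    result
  end
end
```

Called as, say, `SslRoundTrip.run("localhost", 9093, request)` (host and port are placeholders), it returns `:ok` if every round trip succeeds, or `{:error, i, reason}` with the iteration at which the first failure occurred; a `recv` timeout shows up as `reason == :timeout`.
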
- We've had this error repeatedly and consistently when using kafka_ex with *OTP 21.3* or OTP 22.
- We do not have this error with OTP 21.2 and lower.
- We do not have this error when we connect to Kafka without SSL (that is, using `:gen_tcp` rather than `:ssl`).
A repo with systematic tests against different OTP versions is available here: https://github.com/jbruggem/kafka_ex_ssl_bug#run-many-times