Howdy @andrew_p,
I'll be the first to state that I'm no expert on SSL ... however I'm studying the mbedtls_*** APIs as quickly as I can. In this discussion, let us assume that the ESP32 is going to be an HTTP client and that there is a remote web server out there acting as the HTTP server. From what I can see, SSL happens "outside" of the HTTP protocol ... and by that I mean that it is the underlying transport (TCP in this case) that is encrypted, not the HTTP text itself. So when our ESP32 wishes to send an HTTP request ... e.g.:
Then that is the text of the HTTP request our app will send ... and it will be passed to mbedtls ... which encrypts the data and then transmits it over a regular socket. At the receiver, the encrypted text will be received (again over regular sockets) and will then be passed through the SSL stack on the receiver side to decrypt.
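To make that layering concrete, here is a minimal sketch. It is written in Python rather than ESP32 C, with Python's standard ssl module standing in for mbedtls, so treat it as an illustration of the idea and not as ESP32 code. The helper names build_request and send_over_tls and the default port are my own assumptions. The point is that the HTTP text is built exactly as it would be for a plain connection ... only the socket it is written to changes.

```python
import socket
import ssl


def build_request(host, path="/"):
    """Build the plain-text HTTP request; nothing here knows about SSL."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_over_tls(host, port=443, path="/", timeout=10.0):
    """Send the same request text through an encrypting wrapper around a TCP socket."""
    context = ssl.create_default_context()
    # A regular TCP socket is opened first ...
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # ... and the SSL layer wraps it; everything written is encrypted on the way out.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(build_request(host, path))
            chunks = []
            while True:
                data = tls_sock.recv(4096)  # decrypted by the SSL layer on the way in
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling build_request("example.com") gives the bytes the application "thinks" it is sending; send_over_tls hands those same bytes to the SSL layer instead of writing them to the raw socket.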
If we are thinking straight so far ... then let's delve deeper.
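At a high level, I understand SSL uses large random numbers to encrypt the data. My basic knowledge says that there is a public and a private key (a pair of numbers), where anything encrypted with the public key can only be decrypted with the matching private key. In the simplest story, the ESP32 will have one PAIR (a public and private) and the partner will have another pair. When an SSL session starts, the ESP32 will ask the receiver for its public key ... when the ESP32 receives that, it will send ITS public key encrypted using the public key of the receiver. The receiver will now decrypt the message using its private key and now both ends of the session know each other's public keys. Now they can exchange data freely using each other's public keys for encryption and only the correct receiver should be able to decrypt as they are the only ones who know the secret (their respective private keys). As far as I can tell, real SSL/TLS differs a little in the details: the client normally doesn't need a key pair of its own, and the public key step is only used during the handshake to agree on a shared secret session key, which then encrypts the bulk of the data with a faster symmetric cipher. The overall idea is the same though.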
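To show what "a pair of numbers" means in practice, here is a toy public/private key example in Python, using textbook RSA with tiny primes. This is only an illustration I am adding ... the primes, the names and the whole scheme are assumptions chosen for readability. It is not secure, and it is not what mbedtls literally does on the wire, where the numbers are hundreds of digits long and padding is applied.

```python
# Toy RSA with tiny primes; for illustration only, never for real security.
P, Q = 61, 53
N = P * Q                  # 3233, the modulus shared by both keys
PHI = (P - 1) * (Q - 1)    # 3120
E = 17                     # public exponent, chosen coprime to PHI
D = pow(E, -1, PHI)        # private exponent: modular inverse of E (Python 3.8+)

PUBLIC_KEY = (E, N)        # can be handed to anyone
PRIVATE_KEY = (D, N)       # never leaves its owner


def encrypt(message, public_key):
    """Anyone holding the public key can encrypt a number smaller than N."""
    e, n = public_key
    return pow(message, e, n)


def decrypt(ciphertext, private_key):
    """Only the holder of the private key can recover the original number."""
    d, n = private_key
    return pow(ciphertext, d, n)
```

For example, decrypt(encrypt(65, PUBLIC_KEY), PRIVATE_KEY) gives back 65, while someone who only sees the encrypted number and the public key has no practical way to reverse it once the numbers are large enough.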
At the simplest level, we now have encryption at play ... and no certificates or other "bits and pieces" were used.
Any problems with this story? Not superficially ... it can and does work. However, there are "issues" associated with this story IF we need deeper (better?) security. The first issue is the question of "are we actually talking to who we think we are talking to?" ... if I connect to IP address 1.2.3.4 and start exchanging SSL encrypted data ... am I "really" talking to 1.2.3.4? That's where certificates can come into play. Those certificates can validate that the entity I am talking to is actually who it claims to be. There is also the concept of mutually exchanged certificates ... where the ESP32 also sends a certificate of its own, so that the receiver can know that the ESP32 is who it claims to be.
And this is where you have to ask yourself ... how far do I need to go?
For example ... when I use a browser on my desktop and connect to my bank over SSL, I want my browser to know that the bank is who it claims to be (I'll give the reason in a second). Once I connect my browser to the bank, I then enter my account number and password ... it is at THAT point that the bank knows who I am and trusts that I am who I claim to be (by virtue of the userid/password pair ... that should only be known by me). Because the bank sent me a certificate AND I validated that the certificate was correct ... THEN I trust that I now have a secure connection to the bank ... and not to someone impersonating that bank. If there was an impersonator, they would be able to get my userid/password. However, when the bank sent me their certificate at the start of the session ... it was MY responsibility (i.e. my browser or ESP32) to validate that the certificate was correct for whom I was trying to contact. If not correct, then it is up to ME to terminate the conversation ... not the bank ... it is happy to carry on ... because it doesn't need to trust my "physical browser" or "physical ESP32" ... as it doesn't use that for authentication ... but rather uses the supplied userid/password pair.
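Here is a rough sketch of that "it is up to ME to terminate" rule, again in Python with the standard ssl module as a stand-in for mbedtls. The function names and the optional client certificate file paths are hypothetical. On the ESP32 side I believe the corresponding knob is mbedtls_ssl_conf_authmode() with MBEDTLS_SSL_VERIFY_REQUIRED, but the code below does not use mbedtls at all.

```python
import socket
import ssl


def make_client_context(client_cert=None, client_key=None):
    """Build a client context that insists on a valid server certificate."""
    # The default context requires a certificate and checks it against the host name.
    ctx = ssl.create_default_context()
    if client_cert is not None:
        # Optional mutual authentication: present our own certificate too.
        # client_cert / client_key are hypothetical file paths.
        ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return ctx


def connect_or_abort(host, port=443, timeout=10.0):
    """Connect, and walk away if the server's certificate does not check out."""
    ctx = make_client_context()
    raw_sock = socket.create_connection((host, port), timeout=timeout)
    try:
        # The handshake, including certificate validation, happens here.
        return ctx.wrap_socket(raw_sock, server_hostname=host)
    except ssl.SSLCertVerificationError:
        # The server would happily carry on; it is the client that ends the conversation.
        raw_sock.close()
        raise
```

The design choice here is that nothing is sent (no userid, no password) until wrap_socket has returned, so a failed validation means the impersonator never sees any application data.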