Re: [mbed-tls] Some thoughts towards mbed TLS 3.0

21 Apr 2020

Hi Torsten,
...
this will be a long mail. Sorry for that.
On the contrary, thank you so much for this extensive and well thought-out
feedback, that's very helpful!
I'll try to complement Gilles's reply, and skip the points he already answered
unless I have something to add, as I'm generally in full agreement with what
he wrote.
...

For certification and evaluation purposes I need some test vectors
for each crypto function on target. While I know about the
comprehensive self-test program I'm now talking about built-in
functions like mbedtls_sha512_self_test(), etc to be enabled with
#define MBEDTLS_SELF_TEST.
These self-tests are very different in coverage. For SHA-384 and
SHA-512 they are fine, but for HMAC-SHA-384 and HMAC-SHA-512 I couldn't
find any, nor for HKDF-HMAC-SHA-256 (in RFC 5869) or
HKDF-HMAC-SHA-384/512 (official test vectors difficult to find).
AES-CTR and AES-XTS are only tested with key length 128 bit, not with
256 bit. AES-CCM is not tested with 256 bit and even for 128 bit,
the test vector from the standard NIST SP 800-38C with long
additional data is not used.
The builtin self-test for GCM is the best I've seen with mbedtls:
all three key lengths are tested as well as the IUF-interface and
the comfort function. Bravo!

Indeed so far we don't have clear guidelines on self-test functions, and we
should try to have more consistency here. Since these are incremental,
backwards-compatible improvements, they can be done at any time (unlike some API
improvements that can only be done when preparing major versions).
Contributions in this area are certainly welcome!
...

While at Elliptic Curve Cryptography: I assume that some of you
know that projective coordinates as outer interface to ECC are
dangerous, see David Naccache, Nigel P. Smart, Jacques Stern:
Projective Coordinates Leak, Eurocrypt 2004, pp. 257–267.

Yes, and as it happens this was revisited recently by researchers, and brought
back to our attention:
https://tls.mbed.org/tech-updates/security-advisories/mbedtls-security-advis...
...
you have Jacobian coordinates, i.e. projective coordinates, as outer
  interface. In the comment, it is noted that only the affine part is
  used, but can this be assured? In all circumstances?
In practice, I'm pretty confident this is the case, because both
ecp_normalize_jac() and ecp_normalize_mxz() set Z to 1, these functions are
always called before returning data, and if we forgot to call those functions,
we'd get incorrect X and Y coordinates and fail unit test and interop tests.
But as a matter of principle, I agree that exposing Jacobian coordinates in
the API was a poor decision. Fortunately, we already intended to start phasing
out the existing ECP interface in 3.0 (to what extent exactly is still to be
discussed), and in the future it should be fully removed in favour of the PSA
Crypto API, which doesn't have this problem.
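
To illustrate why normalization matters, here is a minimal sketch of the
Jacobian-to-affine conversion over a toy 7-bit prime field. It is not the
library's code: the real ecp_normalize_jac() works on multi-precision
integers, and every toy_* name and the prime 97 are assumptions made for
this example only. A Jacobian triple (X, Y, Z) stands for the affine point
(X/Z^2, Y/Z^3), so one affine point has many representations, and normalizing
forces the single representation with Z = 1.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_P 97u /* tiny prime for illustration; real curves use ~256-bit primes */

typedef struct {
    uint32_t X, Y, Z; /* represents the affine point (X/Z^2, Y/Z^3) */
} toy_jac_point;

static uint32_t toy_mul(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) % TOY_P);
}

/* Modular inverse via Fermat's little theorem: z^(p-2) mod p. */
static uint32_t toy_inv(uint32_t z)
{
    uint32_t result = 1, base = z % TOY_P, exp = TOY_P - 2;
    while (exp != 0) {
        if (exp & 1)
            result = toy_mul(result, base);
        base = toy_mul(base, base);
        exp >>= 1;
    }
    return result;
}

/* Same point, different representation: (X, Y, Z) -> (l^2 X, l^3 Y, l Z). */
toy_jac_point toy_rescale(toy_jac_point pt, uint32_t lambda)
{
    assert(lambda % TOY_P != 0);
    uint32_t l2 = toy_mul(lambda, lambda);
    pt.X = toy_mul(pt.X, l2);
    pt.Y = toy_mul(pt.Y, toy_mul(l2, lambda));
    pt.Z = toy_mul(pt.Z, lambda);
    return pt;
}

/* Force the unique representation with Z = 1.
 * Returns 0 on success, -1 for the point at infinity (Z == 0). */
int toy_normalize_jac(toy_jac_point *pt)
{
    if (pt->Z % TOY_P == 0)
        return -1;
    uint32_t zi = toy_inv(pt->Z);
    uint32_t zi2 = toy_mul(zi, zi);
    pt->X = toy_mul(pt->X, zi2);
    pt->Y = toy_mul(pt->Y, toy_mul(zi2, zi));
    pt->Z = 1;
    return 0;
}
```

For example, rescaling the affine point (3, 6) with lambda = 5 and lambda = 11
gives the triples (75, 71, 5) and (72, 32, 11); both normalize back to
(3, 6, 1), so a caller who only ever sees normalized output learns nothing
from Z.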
...

In my personal opinion the definition

[...]
mbedtls_ecp_keypair;
is dangerous. Why not differentiate between private and public key
   and domain parameters?
I agree, and I think it's a generally accepted opinion in the crypto
implementation community that private and public keys should be clearly
distinguished from one another, for example by using distinct types (in typed
languages).
This is by the way a long-standing problem that existed in the RSA module from
the start, and when ECC was added it followed a similar pattern.
Again, I think the transition to the PSA Crypto API is going to be the answer
here, as it will provide much cleaner key management.
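
As a rough illustration of what "distinct types" can look like in C, the
sketch below separates domain parameters, public keys and private keys into
three structs. All demo_* names are hypothetical and exist neither in Mbed TLS
nor in the PSA Crypto API; the 32-byte size assumes a 256-bit curve. The point
is only that an export function taking a public-key type cannot be handed a
private key without the compiler diagnosing an incompatible pointer type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_COORD_LEN 32 /* bytes per coordinate or scalar (256-bit curve assumed) */

/* Hypothetical types for illustration; not part of any real library. */
typedef struct {
    int curve_id;               /* which named curve the keys belong to */
} demo_domain_params;

typedef struct {
    uint8_t x[DEMO_COORD_LEN];  /* affine coordinates only, no Z */
    uint8_t y[DEMO_COORD_LEN];
} demo_public_key;

typedef struct {
    uint8_t d[DEMO_COORD_LEN];  /* secret scalar */
} demo_private_key;

/* Write the uncompressed point encoding 0x04 || x || y.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 * There is deliberately no export function for demo_private_key. */
size_t demo_public_key_export(const demo_public_key *pub,
                              uint8_t *out, size_t out_len)
{
    const size_t need = 1 + 2 * DEMO_COORD_LEN;
    assert(pub != NULL && out != NULL);
    if (out_len < need)
        return 0;
    out[0] = 0x04;
    memcpy(out + 1, pub->x, DEMO_COORD_LEN);
    memcpy(out + 1 + DEMO_COORD_LEN, pub->y, DEMO_COORD_LEN);
    return need;
}

/* Overwrite the secret scalar; the volatile pointer discourages the
 * compiler from optimizing the stores away. */
void demo_private_key_wipe(demo_private_key *key)
{
    volatile uint8_t *p = key->d;
    for (size_t i = 0; i < sizeof(key->d); i++)
        p[i] = 0;
}
```

With a single combined keypair struct, by contrast, the same function would
accept an object that also carries the private scalar, and nothing in the type
system would flag it.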
...

Regarding ECC examples: I found it very difficult that there isn't
a single example with known test vectors as in the relevant crypto
standards, i.e. FIPS 186-4 and ANSI X9.62-2005, with raw public
keys. What I mean are (defined) curves, public key value Q=(Qx,Qy)
and known signature values r and s. In the example ecdsa.c you
generate your own key pair and read/write the signature in
serialized form. In the example programs/pkey/pk_sign.c and
pk_verify.c you use a higher-level interface, pk.h, and keys in PEM format.
So, it took me a while to write a program to verify (all) known answer
tests in the standards (old standards such as ANSI X9.62 1998 have more
detailed known answer tests). One needs this interface with raw
public keys for example for CAVP tests, see The FIPS 186-4 Elliptic
Curve Digital Signature Algorithm Validation System (ECDSA2VS).

At the moment, there is not a single known answer test for ECDSA
(which could be activated with #define MBEDTLS_SELF_TEST). I
wouldn't say that you need an example for every curve and hash
combination, as it is done in ECDSA2VS CAVP, but one example for
one of the NIST curves and one for Curve25519 and - if I have a
wish free - one for Brainpool would be fine. And this would solve
#9 above.

I might be misunderstanding, but isn't the function
ecdsa_prim_test_vectors() in tests/suites/test_suite_ecdsa.function close to
what you're looking for? I mean, except for the fact that it's not in the
self-test function, and that it's using data from RFCs rather than from NIST
or ANSI?
But I agree this is a bit messy and could be made cleaner.
Also, if you ever feel like contributing your dream self-test function for
ECDSA as a PR, this will be a very welcome contribution!
...

Feature request: Since it was irrelevant for my task (only
verification, no generation) I didn't have a detailed look at your
ECC side-channel countermeasures. But obviously you use the same
protected code for scalar multiplication in verify and sign,
right? Wouldn't it be possible to use Shamir's trick in
verification with fast unprotected multi-scalar multiplication? At
the moment, mbedtls_ecdsa_verify is a factor 4-5 slower than
mbedtls_ecdsa_sign, while OpenSSL's verify is faster than sign.

Yes, this was done to save both on code size and developer time, but I agree
it's suboptimal for performance. As Gilles wrote, as a general rule we tend to
favour code size and security/maintainability over performance, but in that
instance if we provide an option for a verify-only ECDSA build (which I agree
we should), then we're likely to reach smaller code size by having a
standalone unprotected implementation of multi-scalar multiplication, so that
would probably be a great option: better performance _and_ smaller code.
Unfortunately we're already quite busy with other things, so this kind of
optimisation will probably have to wait a bit longer, but we're taking good
note of the suggestion and will try to implement it in the future.
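
For readers unfamiliar with Shamir's trick, here is a minimal sketch of the
idea. ECDSA verification needs the combination u1*G + u2*Q, where both scalars
are public, so no side-channel protection is required. To keep the example
short and self-contained, it uses the multiplicative group modulo a small
prime as a stand-in for the curve group (squaring plays the role of point
doubling, multiplication the role of point addition); the toy_* names and the
32-bit modulus limit are assumptions of this sketch, not Mbed TLS code.

```c
#include <assert.h>
#include <stdint.h>

/* Modular product; cannot overflow because a, b < p <= 2^32 - 1. */
static uint64_t toy_mulmod(uint64_t a, uint64_t b, uint64_t p)
{
    return (a * b) % p;
}

/* Plain square-and-multiply, used as the reference: g^e mod p. */
uint64_t toy_pow(uint64_t g, uint64_t e, uint64_t p)
{
    assert(p != 0 && p <= UINT32_MAX);
    uint64_t acc = 1 % p, base = g % p;
    while (e != 0) {
        if (e & 1)
            acc = toy_mulmod(acc, base, p);
        base = toy_mulmod(base, base, p);
        e >>= 1;
    }
    return acc;
}

/* Shamir's trick: g^a * h^b mod p with a single shared chain of squarings.
 * The branches depend on the scalar bits, so this is NOT constant-time;
 * that is acceptable only because the scalars are public. */
uint64_t toy_shamir(uint64_t g, uint64_t a, uint64_t h, uint64_t b, uint64_t p)
{
    assert(p != 0 && p <= UINT32_MAX);
    g %= p;
    h %= p;
    const uint64_t gh = toy_mulmod(g, h, p); /* precomputed once */
    uint64_t acc = 1 % p;

    for (int i = 63; i >= 0; i--) {
        acc = toy_mulmod(acc, acc, p);       /* one squaring per bit position */
        unsigned bit_a = (unsigned)((a >> i) & 1);
        unsigned bit_b = (unsigned)((b >> i) & 1);
        if (bit_a && bit_b)
            acc = toy_mulmod(acc, gh, p);
        else if (bit_a)
            acc = toy_mulmod(acc, g, p);
        else if (bit_b)
            acc = toy_mulmod(acc, h, p);
    }
    return acc;
}
```

Computing the two powers separately costs two chains of squarings; the joint
loop needs only one, plus at most one multiplication per bit, which is where
the speed-up over two independent scalar multiplications comes from.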
Again, thanks for this comprehensive and useful feedback.
Regards,
Manuel.