Hi Torsten,
this will be a long mail. Sorry for that.
On the contrary, thank you so much for this extensive and well thought-out feedback; that's very helpful!
I'll try to complement Gilles's reply, and skip the points he already answered unless I have something to add, as I'm generally in full agreement with what he wrote.
For certification and evaluation purposes I need some test vectors for each crypto function on target. While I know about the comprehensive self-test program, I'm now talking about built-in functions like mbedtls_sha512_self_test(), etc., to be enabled with #define MBEDTLS_SELF_TEST.
These self-tests vary widely in coverage. For SHA-384 and SHA-512 they are fine, but I couldn't find any for HMAC-SHA-384 and HMAC-SHA-512, nor for HKDF-HMAC-SHA-256 (test vectors are in RFC 5869) or HKDF-HMAC-SHA-384/512 (official test vectors are difficult to find). AES-CTR and AES-XTS are only tested with a 128-bit key, not with 256 bits. AES-CCM is not tested with 256 bits, and even for 128 bits the test vector from the standard NIST SP 800-38C with long additional data is not used. The built-in self-test for GCM is the best I've seen in mbedtls: all three key lengths are tested, as well as the IUF interface and the one-shot convenience function. Bravo!
Indeed so far we don't have clear guidelines on self-test functions, and we should try to have more consistency here. Since these are incremental, backwards-compatible improvements, they can be done at any time (unlike some API improvements that can only be done when preparing major versions).
Contributions in this area are certainly welcome!
- While we're at Elliptic Curve Cryptography: I assume that some of you know that projective coordinates as an outer interface to ECC are dangerous, see David Naccache, Nigel P. Smart, Jacques Stern: Projective Coordinates Leak, Eurocrypt 2004, pp. 257-267.
Yes, and as it happens this was revisited recently by researchers, and brought back to our attention: https://tls.mbed.org/tech-updates/security-advisories/mbedtls-security-advis...
you have Jacobian coordinates, i.e. projective coordinates, as the outer interface. In the comment it is noted that only the affine part is used, but can this be assured? In all circumstances?
In practice, I'm pretty confident this is the case, because both ecp_normalize_jac() and ecp_normalize_mxz() set Z to 1, these functions are always called before returning data, and if we forgot to call those functions, we'd get incorrect X and Y coordinates and fail unit tests and interop tests.
But as a matter of principle, I agree that exposing Jacobian coordinates in the API was a poor decision. Fortunately, we already intended to start phasing out the existing ECP interface in 3.0 (to what extent exactly is still to be discussed), and in the future it should be fully removed in favour of the PSA Crypto API, which doesn't have this problem.
- In my personal opinion the definition
[...] mbedtls_ecp_keypair;
is dangerous. Why not differentiate between private and public key and domain parameters?
I agree, and I think it's a generally accepted opinion in the crypto implementation community that private and public keys should be clearly distinguished from one another, for example by using distinct types (in typed languages).
This is by the way a long-standing problem that existed in the RSA module from the start, and when ECC was added it followed a similar pattern.
Again, I think the transition to the PSA Crypto API is going to be the answer here, as it will provide much cleaner key management.
Regarding ECC examples: I found it a real difficulty that there isn't a single example with known test vectors as in the relevant crypto standards, i.e. FIPS 186-4 and ANSI X9.62-2005, with raw public keys. What I mean are (defined) curves, a public key value Q=(Qx,Qy) and known signature values r and s. In the example ecdsa.c you generate your own key pair and read/write the signature in serialized form. In the example programs/pkey/pk_sign.c and pk_verify.c you use the higher-level interface pk.h and keys in PEM format.
So, it took me a while to write a program that verifies (all) known-answer tests in the standards (older standards such as ANSI X9.62:1998 have more detailed known-answer tests). One needs this interface with raw public keys, for example, for CAVP tests; see The FIPS 186-4 Elliptic Curve Digital Signature Algorithm Validation System (ECDSA2VS).
At the moment, there is not a single known-answer test for ECDSA (which could be activated with #define MBEDTLS_SELF_TEST). I wouldn't say that you need an example for every curve and hash combination, as is done in ECDSA2VS CAVP, but one example for one of the NIST curves and one for Curve25519 (and, if I may make a wish, one for Brainpool) would be fine. And this would solve #9 above.
I might be misunderstanding, but isn't the function ecdsa_prim_test_vectors() in tests/suites/test_suite_ecdsa.function close to what you're looking for? I mean, except for the fact that it's not in the self-test function, and that it's using data from RFCs rather than from NIST or ANSI?
But I agree this is a bit messy and could be made cleaner.
Also, if you ever feel like contributing your dream self-test function for ECDSA as PR, this will be a very welcome contribution!
- Feature request: since it was irrelevant for my task (only verification, no generation), I didn't have a detailed look at your ECC side-channel countermeasures. But obviously you use the same protected code for scalar multiplication in verify and sign, right? Wouldn't it be possible to use Shamir's trick in verification, with fast unprotected multi-scalar multiplication? At the moment, mbedtls_ecdsa_verify is a factor of 4-5 slower than mbedtls_ecdsa_sign, while OpenSSL's verify is faster than its sign.
Yes, this was done to save both on code size and developer time, but I agree it's suboptimal for performance. As Gilles wrote, as a general rule we tend to favour code size and security/maintainability over performance, but in that instance if we provide an option for a verify-only ECDSA build (which I agree we should), then we're likely to reach smaller code size by having a standalone unprotected implementation of multi-scalar multiplication, so that would probably be a great option: better performance _and_ smaller code.
Unfortunately we're already quite busy with other things, so this kind of optimisation will probably have to wait for a bit more, but we're taking good note of the suggestion and will try to implement it in the future.
Again, thanks for this comprehensive and useful feedback.
Regards, Manuel.