Hello, list,
Release notes for Mbed TLS 3.5.2 [0] have this:
> The SHA256 hashes for the archives are:
> 35890edf1a2c7a7e29eac3118d43302c3e1173e0df0ebaf5db56126dabe5bb05  mbedtls-3.5.2.tar.gz
> 55c1525e7d5de18b84a1d1e5540950b4a3bac70e02889cf309919b2877cba63b  mbedtls-3.5.2.zip
However, attempting to download mbedtls-3.5.2.tar.gz yields a file with hash:
eedecc468b3f8d052ef05a9d42bf63f04c8a1c50d1c5a94c251c681365a2c723
What's going on with those hashes?
[0]: https://github.com/Mbed-TLS/mbedtls/releases/tag/v3.5.2
     https://web.archive.org/web/20240126105153/https://github.com/Mbed-TLS/mbedt...
Hello,
GitHub offers two downloads: mbedtls-${version}.tar.gz and v${version}.tar.gz. (That's because GitHub releases have to have a tag called v${version}, and in addition there's the tag naming scheme we normally use which is mbedtls-${version}.) They're identical apart from the directory name. It looks like we mixed up the checksums in the release announcements: they're the checksums of the v files but we give the names of the mbedtls- files. Sorry for the confusion, I'll go and fix the release announcements.
Best regards,
On Thu, Feb 01, 2024 at 03:02:50PM +0100, Gilles Peskine wrote:
> Date: Thu, 1 Feb 2024 15:02:50 +0100
> From: Gilles Peskine <gilles.peskine@arm.com>
> To: Wojtek Porczyk <woju@invisiblethingslab.com>, mbed-tls@lists.trustedfirmware.org
> Subject: Re: [mbed-tls] SHA-256 mismatch for mbedtls-3.5.2.tar.gz
> User-Agent: Mozilla Thunderbird
>
> Hello,
>
> GitHub offers two downloads: mbedtls-${version}.tar.gz and v${version}.tar.gz. (That's because GitHub releases have to have a tag called v${version}, and in addition there's the tag naming scheme we normally use which is mbedtls-${version}.) They're identical apart from the directory name. It looks like we mixed up the checksums in the release announcements: they're the checksums of the v files but we give the names of the mbedtls- files. Sorry for the confusion, I'll go and fix the release announcements.
Thanks for the clarification. For the record, I was able to reproduce both of those archives directly from a repo checkout:
    % git co v3.5.2
    Previous HEAD position was [...]
    HEAD is now at daca7a3979c2 Update BRANCHES.md
    % git head
    commit daca7a3979c22da155ec9dce49ab1abf3b65d3a9 (HEAD, tag: v3.5.2, tag: mbedtls-3.5.2, origin/master)
    Author: Dave Rodgman <dave.rodgman@arm.com>
    Date:   Wed Jan 24 09:49:11 2024 +0000

        Update BRANCHES.md

        Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
    % git archive --format=tar.gz --prefix mbedtls-mbedtls-3.5.2/ HEAD | sha256sum -
    eedecc468b3f8d052ef05a9d42bf63f04c8a1c50d1c5a94c251c681365a2c723  -
    % git archive --format=tar.gz --prefix mbedtls-3.5.2/ HEAD | sha256sum -
    35890edf1a2c7a7e29eac3118d43302c3e1173e0df0ebaf5db56126dabe5bb05  -
(Note the --prefix.) So it's a false alarm.
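A minimal sketch of how one might confirm that two such archives differ only in the top-level directory name: unpack each with the leading path component stripped, then compare the trees. The helper name same_tree_ignoring_prefix is hypothetical, and the sketch assumes a tar that supports --strip-components (GNU tar and bsdtar do).

```shell
# Hypothetical helper, not part of any Mbed TLS tooling.
# Succeeds (exit 0) if both .tar.gz archives contain the same files with the
# same contents once their top-level directory is ignored.
# Note: diff -r compares file contents, not permissions or timestamps.
same_tree_ignoring_prefix() (
    work=$(mktemp -d) || exit 2
    mkdir "$work/a" "$work/b"
    status=0
    # --strip-components=1 drops the leading directory, e.g.
    # "mbedtls-3.5.2/" or "mbedtls-mbedtls-3.5.2/", on extraction.
    tar -xzf "$1" --strip-components=1 -C "$work/a" &&
        tar -xzf "$2" --strip-components=1 -C "$work/b" &&
        diff -r "$work/a" "$work/b" >/dev/null || status=$?
    rm -rf "$work"
    exit "$status"
)
```

Usage, assuming both downloads are in the current directory: `same_tree_ignoring_prefix mbedtls-3.5.2.tar.gz v3.5.2.tar.gz && echo "same contents"`.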
Checking our historical releases, it turns out we've made this mistake many times. In fact we've almost never posted the checksum for the correct name!
Are checksums important in this day and age? They were very useful back in the days when a release announcement was a PGP-signed email, and a release tarball was something you'd grab from some local FTP mirror. But nowadays the most secure way to get our release announcements is an HTTPS web page from GitHub, and the normal way to get the tarball is an HTTPS download from GitHub. There's not much of a difference between the two in terms of security, so what advantage does the checksum have?
In terms of integrity, only insiders (people with write permission on the GitHub repository) can edit a release announcement; such an edit is unlikely to be detected in real time but has a public log. Changing the release itself (as in, moving the tag to a different commit) is also something only insiders can do, with a more restricted access list; it is unlikely to be detected in real time (but trivial to check), and I'm not sure how apparent it would be who did the tampering and when. I don't see a clear advantage to the release announcement in terms of integrity, unless a third party decides to guarantee the integrity of the release announcement somehow. But then they could also guarantee the integrity of the release content if they want, and in fact I'd expect them to actually archive the release, since that guarantees availability of the content in addition to non-tampering.
If we stop providing checksums, is that a real loss?
On Thu, Feb 01, 2024 at 06:21:24PM +0100, Gilles Peskine wrote:
> Checking our historical releases, it turns out we've made this mistake many times. In fact we've almost never posted the checksum for the correct name!
>
> Are checksums important in this day and age? They were very useful back in the days when a release announcement was a PGP-signed email, and a release tarball was something you'd grab from some local FTP mirror. But nowadays the most secure way to get our release announcements is an HTTPS web page from GitHub, and the normal way to get the tarball is an HTTPS download from GitHub. There's not much of a difference between the two in terms of security, so what advantage does the checksum have?
>
> In terms of integrity, only insiders (people with write permission on the GitHub repository) can edit a release announcement; such an edit is unlikely to be detected in real time but has a public log. Changing the release itself (as in, moving the tag to a different commit) is also something only insiders can do, with a more restricted access list; it is unlikely to be detected in real time (but trivial to check), and I'm not sure how apparent it would be who did the tampering and when. I don't see a clear advantage to the release announcement in terms of integrity, unless a third party decides to guarantee the integrity of the release announcement somehow. But then they could also guarantee the integrity of the release content if they want, and in fact I'd expect them to actually archive the release, since that guarantees availability of the content in addition to non-tampering.
>
> If we stop providing checksums, is that a real loss?
First, let me state that I'm not a contributor to mbedtls, only a downstream user and repackager [0], so I'm not in position to propose any changes to mbedtls release process. Having said that:
1) Everything depends on the threat model. For example, "insiders [...] that can edit a release announcement" also include GitHub operators, state personnel who can reasonably coerce them, and/or anyone who can "cause issuance" in WebPKI, i.e. MITM an HTTPS connection. Does your model consider all those actors trusted? We don't have a full list of those people, in contrast to holders of release keys, who can usually be named. Disposing of PGP in this area is a significant regression from those elder days you mentioned.
2) It's demonstrably false that GitHub provides a reliable public log for the various features that accompany repositories. In another project we have a list of issues that suddenly disappeared without a trace or notification. Critically, GitHub refuses to comment on or reinstate those issues unless contacted by the original issue author, rather than by the project those issues were posted to. Because of two distinct incidents, separated by 3 years across the MS acquisition, I don't agree with the assumption that GH is a reliable append-only record of any information not in the git log.
Here's one of the threads for posterity: [1]. I admit it looks relatively innocent, content was removed and not edited, but nevertheless, they simply lost trust.
For those two reasons, stopping providing checksums as they are currently, just pasted in the release notes, does not seem like a meaningful change. Also probably no-one checks on them in any capacity, since IIUC I'm the first one to notice the problem. A charitable assessment would say that's because people are thoroughly educated in contemporary threat models, well versed in state-of-the-art cryptosystems that solve attacks in those models, and refuse to even read unsigned hashes. Instead they resort to TOFU and just pin to whatever hash they observed when downloading from a developer's workstation for the first time (I'm guilty of just that myself).
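For readers unfamiliar with the term, trust-on-first-use pinning can be sketched roughly as follows. The helper name check_pin and the one-hash-per-file pin layout are assumptions made for illustration, not anything used by Mbed TLS or its downstreams.

```shell
# Hypothetical TOFU helper: record the archive's SHA-256 the first time it is
# seen, and compare against that record on every later run.
# Usage: check_pin ARCHIVE PINFILE
check_pin() (
    file=$1
    pinfile=$2
    actual=$(sha256sum "$file" | cut -d' ' -f1)
    # An empty result means sha256sum failed (e.g. missing file).
    [ -n "$actual" ] || exit 2
    if [ ! -e "$pinfile" ]; then
        # First use: nothing is verified here, the observed hash is simply trusted.
        printf '%s\n' "$actual" > "$pinfile"
        echo "pinned $actual"
        exit 0
    fi
    pinned=$(cat "$pinfile")
    if [ "$actual" = "$pinned" ]; then
        echo "matches pin"
    else
        echo "MISMATCH: pinned $pinned, got $actual" >&2
        exit 1
    fi
)
```

The weakness is visible in the first branch: whatever was downloaded the first time becomes the reference, so the scheme only detects later changes.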
But maybe you could go the other way, and use this opportunity to provide signed releases? There are many options: signed tags, detached signatures over tarballs added to GitHub releases, or even just clearsigned output of sha256sum tool (```-----BEGIN PGP SIGNED MESSAGE----- 0123abcd mbedtls-12.34.5.tar.gz```). Yet another possibility would be to provide signed binaries: in the project I'm currently caring for, we sign just apt repo and don't provide signed sources, but AFAICT it's not an option here.
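As a rough sketch of the clearsigned-manifest option, assuming GnuPG is installed, a release signing key exists on the maintainer side, and its public key is already imported on the user side. The function names and the SHA256SUMS.asc file name are hypothetical; the git and gpg options shown are standard ones.

```shell
# Hypothetical helpers for a clearsigned sha256sum manifest.

# Print a sha256sum-format manifest for the given archives.
make_manifest() {
    sha256sum -- "$@"
}

# Maintainer side: clearsign the manifest so that the hashes and the
# signature travel together in one text file.
sign_manifest() {
    make_manifest "$@" | gpg --clearsign > SHA256SUMS.asc
}

# User side: recover the manifest only if the signature verifies, then check
# the archives in the current directory against it.
# Caveat: this accepts a good signature from ANY key in the keyring; a real
# process would also check that the signer is the expected release key.
verify_manifest() {
    sums=$(gpg --decrypt SHA256SUMS.asc) || return 1
    printf '%s\n' "$sums" | sha256sum -c -
}

# The other options mentioned, for comparison (placeholder names):
#   git tag -s v1.2.3 -m 'Release 1.2.3'        # signed tag
#   git verify-tag v1.2.3
#   gpg --armor --detach-sign release.tar.gz    # writes release.tar.gz.asc
#   gpg --verify release.tar.gz.asc release.tar.gz
```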
Thank you for your consideration.
[0] https://github.com/gramineproject/gramine/blob/1cf1f46646646a3b9c6b371e67c80...
[1] https://groups.google.com/d/msgid/qubes-devel/YJkKmslcFlMUiCNt%40mutt
On 02/02/2024 11:33, Wojtek Porczyk wrote:
> First, let me state that I'm not a contributor to mbedtls, only a downstream user and repackager [0], so I'm not in position to propose any changes to mbedtls release process. Having said that:
You're a primary consumer for releases, so we definitely welcome your point of view.
(…)
> For those two reasons, stopping providing checksums as they are currently, just pasted in the release notes, does not seem like a meaningful change.
More critically (see my other email), those checksums were never stable, for non-security reasons (e.g. compression changes). So it looks like we'll drop them before it even gets to security considerations.
> But maybe you could go the other way, and use this opportunity to provide signed releases? There are many options: signed tags, detached signatures over tarballs added to GitHub releases, or even just clearsigned output of sha256sum tool (```-----BEGIN PGP SIGNED MESSAGE----- 0123abcd mbedtls-12.34.5.tar.gz```).
We're going to look into this. How likely we are to actually do it will depend on demand, so I invite you to make your voice heard on GitHub.
> Yet another possibility would be to provide signed binaries: in the project I'm currently caring for, we sign just apt repo and don't provide signed sources, but AFAICT it's not an option here.
Binaries are a no-go because most users of Mbed TLS want to build it for their own embedded environment. But official source tarballs are definitely an option.
Best regards,
Further to the considerations below — which upon further analysis are even less in favor of checksums than I initially thought — checksums in release announcements have a bigger flaw that a colleague pointed out: the archives in release assets are cached but can be regenerated (confirmed in https://github.com/orgs/community/discussions/45830), so they are at the whims of changes in the underlying software, for example if the compression changes.
So even if we generate correct checksums at release time, and even if there's no confusion between archive names, historical checksums are likely to become wrong over time.
I think this forces us to drop checksums, since we define a release as a git tag and not as a tarball.
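To illustrate the point, here is a sketch of two checks that do not depend on the exact bytes of a regenerated .tar.gz. Both helper names are hypothetical. Hashing the decompressed stream survives a change of compressor or compression level, but it would still break if the tar layout itself changed, so it is not a guarantee either.

```shell
# Hypothetical helpers for checks that ignore the gzip layer.

# Hash the decompressed tar stream instead of the .tar.gz bytes.
tar_stream_sha256() {
    gzip -dc -- "$1" | sha256sum | cut -d' ' -f1
}

# Check that a tag in the current git repository resolves to the expected
# commit ID. Succeeds (exit 0) only on an exact match.
tag_is_commit() {
    [ "$(git rev-parse --verify --quiet "$1^{commit}")" = "$2" ]
}
```

With the commit ID quoted earlier in this thread, the second check would be `tag_is_commit v3.5.2 daca7a3979c22da155ec9dce49ab1abf3b65d3a9`, run inside a clone of the repository.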
We may switch to hosting official tarballs and PGP-signing announcements in the future, but if we do that we'll have to create a process from scratch. Please file an issue on GitHub if there's interest for that.
Best regards,
On Thu, Feb 1, 2024 at 12:22 PM Gilles Peskine via mbed-tls wrote:
> Checking our historical releases, it turns out we've made this mistake many times. In fact we've almost never posted the checksum for the correct name!
Just curious, but does that include the 2.x releases?
I wouldn't swear to it, but I _think_ I usually verified the checksum to make sure nothing was corrupted in the download (gently pushing back on the 'probably no-one checks on them' idea).
> Are checksums important in this day and age?
I'd say nice but not all that important. I don't see a checksum on the same webpage linking to the download as anything more than a way to verify that the download wasn't corrupted.
Lee