Julius,
On the subject of DebugFS's purpose: it was envisaged, and is today, as Sandrine describes, a debug-build-only capability. That said, there have been some early thoughts that it could evolve into a Secure Debug feature, where this type of capability (or something like it) is always on and requires debug certificates for authenticated access. This is very much a possible future evolution and is not in the patches available today. We would welcome any thoughts on such an evolution in this space.
Joanna
On 13/12/2019, 13:01, "TF-A on behalf of Sandrine Bailleux via TF-A" <tf-a-bounces@lists.trustedfirmware.org on behalf of tf-a@lists.trustedfirmware.org> wrote:
Hi Julius,
OK, in that case I can see that a solution based on TF-A's DebugFS interface might not be desirable. Indeed, our original intention was to make the whole DebugFS system a debug-only feature (hence its name!). As such, I agree that it is likely not to get the same level of scrutiny and testing as other features intended for production systems.
One of the main use cases we have in mind for DebugFS is being able to peek and poke into the firmware for testing purposes. Today, when doing functional testing from the normal world (for example, using TF-A Tests), we are limited to what's exposed through the SMC interface. And even then, we have limited visibility into what really happened in the firmware, as we can only deduce so much from the SMC return value(s). DebugFS could be used to bridge this gap by providing a side channel for getting internal firmware state information.
Going back to the SMC-based solution then, I am not quite convinced SYSTEM_RESET2 is the right interface for intentionally triggering a panic in TF-A. I think the semantics do not quite match. If anything, a firmware crash seems more like a shutdown operation to me rather than a reset (we don't recover from a firmware crash). I am not even sure we should look into the PSCI SMC range, as it's not a power-management operation.
Julius, you wrote:
> It's the same problem that the SMC/PSCI spec and the TF repository layout is only designed to deal with generic vs. SoC-vendor-specific differentiation. If the normal world OS needs a feature, we can only make it generic or duplicate it across all vendors running that OS.
So it sounds like it's not the first time that you have hit this issue, is it? Do you have any other examples of Normal World OS features you would have liked to expose through a generic SMC interface? I am wondering whether this could help in choosing the right SMC range, if we can identify some common criteria among a set of such features.
Regards,
Sandrine
--
TF-A mailing list
TF-A@lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-a
On Fri, Dec 13, 2019 at 6:20 AM Joanna Farley <Joanna.Farley@arm.com> wrote:
> On the subject of DebugFS's purpose: it was envisaged, and is today, as Sandrine describes, a debug-build-only capability. That said, there have been some early thoughts that it could evolve into a Secure Debug feature, where this type of capability (or something like it) is always on and requires debug certificates for authenticated access. This is very much a possible future evolution and is not in the patches available today. We would welcome any thoughts on such an evolution in this space.
I guess this gets into a bit of a philosophy discussion and becomes a matter of opinion, so there's probably no one right answer. Personally, adding authentication on top of this doesn't really resolve my concerns and adds yet more on top. I'm a strong proponent of the concept of a minimal Trusted Computing Base, i.e. keeping the amount of code executing at the highest privilege level as small and low-complexity as possible. Any code can have bugs, so the idea is that the more complicated the code you run in EL3 is (and the more complicated APIs it exposes), the more likely it becomes that you accidentally have an exploitable vulnerability in there. Like a p9 filesystem driver, a certificate-based authentication system (especially if it's based on x509/ASN.1 which are notoriously hard to implement safely) is a pretty complex piece of code with a pretty large attack surface that I'd rather not have in my EL3 firmware if I can avoid it. I understand that for certain use cases you may need something like this (if you really want a very extensive and extensible debugging API that must be restricted to a few authenticated actors), but in my use case I really just need the ability to trigger one small debugging feature and that feature itself doesn't need to be restricted, so a minimal SMC interface would work much better for that case.
On 13/12/2019, 13:01, "TF-A on behalf of Sandrine Bailleux via TF-A" <tf-a-bounces@lists.trustedfirmware.org on behalf of tf-a@lists.trustedfirmware.org> wrote:
> Going back to the SMC-based solution then, I am not quite convinced SYSTEM_RESET2 is the right interface for intentionally triggering a panic in TF-A. I think the semantics do not quite match. If anything, a firmware crash seems more like a shutdown operation to me rather than a reset (we don't recover from a firmware crash). I am not even sure we should look into the PSCI SMC range, as it's not a power-management operation.
Crash recovery behavior is platform dependent (via plat_panic_handler()). On all the platforms we use in Chrome OS we have that implemented as a system reboot. I think for most systems (whether it's a Chromebook, a server or some embedded device) that's probably what you want for random runtime crashes (at least in a production environment), but I agree that TF doesn't enforce any standard behavior, so it's hard to clearly match it to one or the other SMC.
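To make the "minimal SMC interface" idea concrete, here is a rough, host-runnable C sketch of a handler that does nothing except forward one function ID to the panic path. Everything prefixed with sketch_ or SKETCH_ is hypothetical: the function ID is a placeholder rather than an allocated value, and sketch_plat_panic_handler() is only a stand-in for the platform's plat_panic_handler() hook, which in real firmware does not return.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical placeholder function ID; NOT an allocated SMC ID. */
#define SKETCH_FID_TRIGGER_PANIC  UINT32_C(0xC3000001)

/* Hypothetical "unknown function" return value for this sketch. */
#define SKETCH_SMC_UNKNOWN  (-1)

/* Records that the panic path was taken, so the sketch can run on a host. */
bool sketch_panic_requested;

/*
 * Stand-in for the platform's plat_panic_handler() hook. On the Chrome OS
 * platforms mentioned above that hook reboots the system; here it only
 * sets a flag.
 */
void sketch_plat_panic_handler(void)
{
    sketch_panic_requested = true;
}

/* Minimal handler: one function ID, no arguments, nothing else exposed. */
int32_t sketch_debug_smc_handler(uint32_t fid)
{
    if (fid == SKETCH_FID_TRIGGER_PANIC) {
        sketch_plat_panic_handler();
        return 0; /* Only reachable with the host stand-in above. */
    }
    return SKETCH_SMC_UNKNOWN;
}
```

The point of the sketch is only the size of the exposed surface: one ID and no parameters to validate, in contrast to a filesystem-style interface.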
> So it sounds like it's not the first time that you have hit this issue, is it? Do you have any other examples of Normal World OS features you would have liked to expose through a generic SMC interface? I am wondering whether this could help in choosing the right SMC range, if we can identify some common criteria among a set of such features.
No, it's the first time I've really run into this. But I think we might quickly come up with more uses for a "non-secure OS" SMC range if we had one. We often see roughly the same SMC again on different platforms, because fundamentally they usually need to do the same kinds of things -- for example, most platforms have some kind of DDR frequency scaling which always needs part of it implemented in EL3, so they all need some kind of SMC to switch to a new DDR frequency. Many also need some kind of "write value to secure register" SMC that just allows the non-secure OS to write a few whitelisted registers that are only accessible in EL3 for some reason. If we could standardize these interfaces in a non-vendor-specific SMC range, we might be able to reduce some code duplication both on the TF and the Linux side.
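As an illustration of the "write value to secure register" pattern mentioned above, a minimal host-runnable C sketch could look like the following. It is not taken from any platform port: the register bank is a plain array standing in for EL3-only registers, and the allowed offsets, return codes and all sketch_ names are assumptions made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes for this sketch. */
#define SKETCH_OK      0
#define SKETCH_DENIED  (-1)

/* Stand-in for a bank of four 32-bit registers only accessible in EL3. */
#define SKETCH_BANK_WORDS  4U
static uint32_t sketch_bank[SKETCH_BANK_WORDS];

/* Hypothetical list of byte offsets the non-secure OS may write. */
static const uint32_t sketch_allowed_offsets[] = { 0x4U, 0xCU };

static bool sketch_offset_allowed(uint32_t offset)
{
    size_t n = sizeof(sketch_allowed_offsets) / sizeof(sketch_allowed_offsets[0]);

    for (size_t i = 0; i < n; i++) {
        if (sketch_allowed_offsets[i] == offset) {
            return true;
        }
    }
    return false;
}

/* Write 'value' only if 'offset' is on the list; otherwise refuse. */
int32_t sketch_secure_reg_write(uint32_t offset, uint32_t value)
{
    if (!sketch_offset_allowed(offset)) {
        return SKETCH_DENIED;
    }
    sketch_bank[offset / sizeof(uint32_t)] = value;
    return SKETCH_OK;
}

/* Read helper so that the effect of a write can be observed on a host. */
uint32_t sketch_reg_read(uint32_t offset)
{
    return sketch_bank[(offset / sizeof(uint32_t)) % SKETCH_BANK_WORDS];
}
```

Matching against an explicit list of exact offsets, rather than an address range, keeps the check trivial to audit, which is in the spirit of the minimal-TCB argument earlier in this thread.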
I guess none of these things are really Linux-specific, now that I think of it. So really, I guess the problem is that it would be great to have a range of "generic" SMC IDs that can be easily and unbureaucratically allocated to try out new features, without having to ask Arm to write a big specification document about it every time. It's sort of a development velocity issue.
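For reference on what a "range" means here: under the SMC Calling Convention, the owner of a call is encoded in the Owning Entity Number (OEN) field, bits 29:24 of the function ID. The small C sketch below decodes those fields. The sketch_ and SKETCH_ names are invented for this example and are not TF-A identifiers; only the bit layout and the OEN values listed come from the calling convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field layout of a 32-bit SMC function ID (SMC Calling Convention). */
#define SKETCH_FID_FAST_BIT   (UINT32_C(1) << 31)  /* 1 = fast call */
#define SKETCH_FID_SMC64_BIT  (UINT32_C(1) << 30)  /* 1 = SMC64 convention */
#define SKETCH_FID_OEN_SHIFT  24
#define SKETCH_FID_OEN_MASK   UINT32_C(0x3F)
#define SKETCH_FID_NUM_MASK   UINT32_C(0xFFFF)

/* Owning Entity Numbers relevant to this discussion. */
enum sketch_oen {
    SKETCH_OEN_ARCH = 0,  /* Arm architecture calls */
    SKETCH_OEN_CPU  = 1,  /* CPU service calls */
    SKETCH_OEN_SIP  = 2,  /* SiP (SoC vendor) service calls */
    SKETCH_OEN_OEM  = 3,  /* OEM service calls */
    SKETCH_OEN_STD  = 4   /* standard secure services, e.g. PSCI */
};

uint32_t sketch_fid_oen(uint32_t fid)
{
    return (fid >> SKETCH_FID_OEN_SHIFT) & SKETCH_FID_OEN_MASK;
}

bool sketch_fid_is_fast(uint32_t fid)
{
    return (fid & SKETCH_FID_FAST_BIT) != 0U;
}

bool sketch_fid_is_smc64(uint32_t fid)
{
    return (fid & SKETCH_FID_SMC64_BIT) != 0U;
}

uint32_t sketch_fid_number(uint32_t fid)
{
    return fid & SKETCH_FID_NUM_MASK;
}
```

For example, the SMC32 ID of PSCI SYSTEM_RESET2, 0x84000012, decodes as a fast call with OEN 4 (standard secure service) and function number 0x12, whereas a vendor call such as 0xC2000001 lands in the SiP range (OEN 2). None of the ranges listed above is the freely allocatable "generic" range this mail is asking for.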