Search results for ""SMC to intentionally trigger a panic in TF-A"" - TF-A

SMC to intentionally trigger a panic in TF-A

by Julius Werner

Hi Soby et al.,

I'd like to implement a small new feature and ask for some guidance on how to go about it: Chrome OS has the ability to automatically collect crash reports from runtime crashes in Trusted Firmware, and we would like to set up automated tests to ensure this feature stays working. In order to do this we need a way for the non-secure OS to intentionally trigger a panic in EL3. The obvious solution would be to implement a new SMC for that. (It's common for operating systems to have similar facilities, e.g. Linux can force a kernel panic by writing 'c' into /proc/sysrq-trigger.)

My main question is: where should I get an SMC function ID for this? This is not a silicon or OEM specific feature, so the SiP Service Calls and OEM Service Calls ID ranges seem inappropriate (or do you think it would make sense to treat Google or Chrome OS as the "OEM" here, even though that's not quite accurate?). There are ranges for Trusted Applications and the Trusted OS but unfortunately none for the normal world OS. Is this something that would make sense to allocate under Standard Service Calls? Could you just find an ID for me to use there or does everything in that range need a big specification document written by Arm?

Thanks,
Julius
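For readers following the range question above, here is a minimal sketch of how a fast-call SMC function ID maps onto the service ranges named in the message. The bit positions and Owning Entity Numbers follow the Arm SMC Calling Convention; the enum and function names are illustrative assumptions and are not taken from TF-A.

```c
#include <stdint.h>

/* SMC function ID fields, per the Arm SMC Calling Convention (SMCCC). */
#define SMC_FAST_CALL   (UINT32_C(1) << 31)   /* 1 = fast call, 0 = yielding call */
#define SMC_OEN_SHIFT   24
#define SMC_OEN_MASK    UINT32_C(0x3F)        /* Owning Entity Number, bits [29:24] */

/* Illustrative names (not TF-A's) for the ranges discussed in the thread. */
enum smc_owner {
    SMC_OWNER_NOT_FAST,     /* yielding call: the OEN table below does not apply */
    SMC_OWNER_ARM_ARCH,     /* OEN 0: Arm Architecture Calls */
    SMC_OWNER_CPU,          /* OEN 1: CPU Service Calls */
    SMC_OWNER_SIP,          /* OEN 2: SiP Service Calls */
    SMC_OWNER_OEM,          /* OEN 3: OEM Service Calls */
    SMC_OWNER_STANDARD,     /* OEN 4: Standard Service Calls (PSCI lives here) */
    SMC_OWNER_TRUSTED_APP,  /* OEN 48-49: Trusted Application Calls */
    SMC_OWNER_TRUSTED_OS,   /* OEN 50-63: Trusted OS Calls */
    SMC_OWNER_OTHER         /* hypervisor services and reserved values */
};

/* Classify a function ID by its owning entity. */
static enum smc_owner smc_classify(uint32_t fid)
{
    uint32_t oen = (fid >> SMC_OEN_SHIFT) & SMC_OEN_MASK;

    if ((fid & SMC_FAST_CALL) == 0)
        return SMC_OWNER_NOT_FAST;

    switch (oen) {
    case 0: return SMC_OWNER_ARM_ARCH;
    case 1: return SMC_OWNER_CPU;
    case 2: return SMC_OWNER_SIP;
    case 3: return SMC_OWNER_OEM;
    case 4: return SMC_OWNER_STANDARD;
    default: break;
    }
    if (oen >= 48 && oen <= 49)
        return SMC_OWNER_TRUSTED_APP;
    if (oen >= 50)          /* the mask already limits oen to 63 */
        return SMC_OWNER_TRUSTED_OS;
    return SMC_OWNER_OTHER;
}
```

Note that nothing in this table corresponds to "the normal world OS", which is the gap the message points out.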

5 years, 10 months

Re: [TF-A] SMC to intentionally trigger a panic in TF-A

by Sandrine Bailleux

Hi Julius,

OK, in that case I can see that a solution based on TF-A's DebugFS interface might not be desirable. Indeed, our original intention was to make the whole DebugFS system a debug-only feature (hence its name!). As such, I agree that it is likely not to get the same level of scrutiny and testing as other features intended for production systems.

One of the main use cases we have in mind for DebugFS is being able to peek and poke into the firmware for testing purposes. Today, when doing functional testing from the normal world (for example, using TF-A Tests), we are limited to what's exposed through the SMC interface. And even then, we have limited visibility into what really happened in the firmware, as we can only deduce so much from the SMC return value(s). DebugFS could be used to bridge this gap, by providing a side channel for getting internal firmware state information.

Going back to the SMC-based solution then, I am not quite convinced SYSTEM_RESET2 is the right interface for intentionally triggering a panic in TF-A. I think the semantics do not quite match. If anything, a firmware crash seems more like a shutdown operation to me rather than a reset (we don't recover from a firmware crash). I am not even sure we should look into the PSCI SMC range, as it's not a power-management operation.

Julius, you wrote:

> It's the same problem that the SMC/PSCI spec and the TF repository layout is only designed to deal with generic vs. SoC-vendor-specific differentiation. If the normal world OS needs a feature, we can only make it generic or duplicate it across all vendors running that OS.

So it sounds like it's not the first time that you hit this issue, is it? Do you have any other example of a Normal World OS feature you would have liked to expose through a generic SMC interface? I am wondering whether this could help choosing the right SMC range, if we can identify some common criteria among a set of such features.

Regards,
Sandrine

5 years, 7 months

Re: [TF-A] SMC to intentionally trigger a panic in TF-A

by Dan Handley

Hi Julius

> -----Original Message-----
> From: Julius Werner <jwerner(a)chromium.org>
> Sent: 11 September 2019 03:00
> To: Dan Handley <Dan.Handley(a)arm.com>
> Cc: tf-a(a)lists.trustedfirmware.org
> Subject: Re: [TF-A] SMC to intentionally trigger a panic in TF-A
>
> Hi Dan,
>
> Whoops, sorry, this fell through the cracks for me since I wasn't on the to: line. Thanks for your response!

You're welcome.

<snip>

> > However, I think there might already be support for what you need. PSCI is part of the standard service and the function SYSTEM_RESET2 allows for both architectural and vendor-specific resets. The latter allows for vendor-specific semantics, which could include crashing the firmware as you suggest.
> >
> > Chrome OS could specify what such a vendor-specific reset looks like and each Chromebook's platform PSCI hooks could be implemented accordingly.
>
> Right, but defining a separate vendor-specific reset type for each platform is roughly the same as defining a separate SiP SMC for each of them. It's the same problem that the SMC/PSCI spec and the TF repository layout is only designed to deal with generic vs. SoC-vendor-specific differentiation. If the normal world OS needs a feature, we can only make it generic or duplicate it across all vendors running that OS.

Not quite. The SiP SMC range is already populated with existing SiP stuff whereas the vendor-specific bits of the reset_type in PSCI SYSTEM_RESET2 are unlikely to contain much/any vendor-specific stuff. Therefore Chrome OS could define something "generic to Chrome OS" in this space that Chromebooks could implement. There could also be a Chrome OS specific folder for this kind of functionality that Chromebooks pull in.

> > Alternatively, this could potentially be defined as an additional architectural reset. This would enable a generic implementation but would require approval/definition by Arm's Architecture team. Like me they might have concerns about this being defined at a generic architectural level.
>
> Yes, I think that would be the best option. Could you kick off that process with the Architecture team? Or tell me who I should talk to about this?

OK, I'll fire off an email internally now and then either put you in contact or let you know how it goes.

Regards

Dan.

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

5 years, 9 months

Re: [TF-A] SMC to intentionally trigger a panic in TF-A

by Sandrine Bailleux

Hi Julius,

As you were mentioning that the Linux kernel uses /proc/sysrq-trigger for a similar purpose, I was wondering whether you'd be open to a solution based on a "DebugFS" entry.

As you may have seen on the mailing list, Olivier posted a proposal for introducing a firmware debug interface, which has many similarities to how /proc or /sys works in the kernel world: https://lists.trustedfirmware.org/pipermail/tf-a/2019-October/000120.html

TF-A patches for this feature are up for review right now and Olivier has also posted some TF-A Tests patches that demonstrate how this can be used from normal world. In addition, we are also working on a Linux driver for this.

As you can imagine, DebugFS uses an SMC interface under the hood (currently allocated in the SiP range). But being an abstraction over the SMC layer, which specific SMC function ID is used does not matter so much and it does not need to be standardized by any Arm specification. You'd need to mandate all Chrome OS devices to have this DebugFS entry in the firmware but the backend could vary from platform to platform.

Would that suit your use case?

Regards,
Sandrine

5 years, 7 months

Re: [TF-A] SMC to intentionally trigger a panic in TF-A

by Dan Handley

Hi Julius

> -----Original Message-----
> From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Julius Werner via TF-A
> Sent: 20 August 2019 02:15
>
> Hi Soby et al.,
>
> I'd like to implement a small new feature and ask some guidance for how to go about it: Chrome OS has the ability to automatically collect crash reports from runtime crashes in Trusted Firmware, and we would like to set up automated tests to ensure this feature stays working.
> In order to do this we need a way for the non-secure OS to intentionally trigger a panic in EL3. The obvious solution would be to implement a new SMC for that. (It's common for operating systems to have similar facilities, e.g. Linux can force a kernel panic by writing 'c' into /proc/sysrq-trigger.)

OK I can see the use of that, although I'd be a bit concerned about such a thing being available as a general service in case it gets used as an attack vector. For example, a test program could aggressively use this service to try to get the firmware to leak secure world information or something about its behaviour.

> My main question is: where should I get an SMC function ID for this?
> This is not a silicon or OEM specific feature, so the SiP Service Calls and OEM Service Calls ID ranges seem inappropriate (or do you think it would make sense to treat Google or Chrome OS as the "OEM" here, even though that's not quite accurate?).

I guess in theory you could mandate that all Chrome OS SiPs provide a specific function ID in their own specific SiP service, but I don't think that's the right solution here...

> There are ranges for Trusted Applications and the Trusted OS but unfortunately none for the normal world OS.

I don't think the TOS range is right either.

> Is this something that would make sense to allocate under Standard Service Calls? Could you just find an ID for me to use there or does everything in that range need a big specification document written by Arm?

For sure everything in the standard or architectural ranges requires specification by Arm, although this does not necessarily need to be big.

However, I think there might already be support for what you need. PSCI is part of the standard service and the function SYSTEM_RESET2 allows for both architectural and vendor-specific resets. The latter allows for vendor-specific semantics, which could include crashing the firmware as you suggest.

Chrome OS could specify what such a vendor-specific reset looks like and each Chromebook's platform PSCI hooks could be implemented accordingly.

Alternatively, this could potentially be defined as an additional architectural reset. This would enable a generic implementation but would require approval/definition by Arm's Architecture team. Like me they might have concerns about this being defined at a generic architectural level.

Regards

Dan.
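The SYSTEM_RESET2 suggestion above can be sketched as follows. The function IDs and the reset_type layout (bit 31 selects vendor-specific, bits [30:0] carry the type, architectural type 0 is SYSTEM_WARM_RESET) come from the PSCI specification. The `CROS_RESET2_FORCE_PANIC` value and the decode helper are purely hypothetical: no such reset type was defined in this thread.

```c
#include <stdint.h>
#include <stdbool.h>

/* PSCI SYSTEM_RESET2 function IDs (Standard Service Calls range). */
#define PSCI_SYSTEM_RESET2_SMC32  UINT32_C(0x84000012)
#define PSCI_SYSTEM_RESET2_SMC64  UINT32_C(0xC4000012)

/* reset_type layout: bit 31 = vendor-specific, bits [30:0] = reset type. */
#define RESET2_VENDOR_BIT         (UINT32_C(1) << 31)
#define RESET2_TYPE_MASK          UINT32_C(0x7FFFFFFF)
#define RESET2_ARCH_WARM_RESET    UINT32_C(0)   /* SYSTEM_WARM_RESET */

/* HYPOTHETICAL: a "generic to Chrome OS" vendor reset type that would
 * make the platform hook crash the firmware. The value 1 is an assumption. */
#define CROS_RESET2_FORCE_PANIC   (RESET2_VENDOR_BIT | UINT32_C(0x1))

enum reset2_action {
    RESET2_DO_WARM_RESET,
    RESET2_DO_PANIC,
    RESET2_INVALID          /* a real handler would return an error code */
};

static bool reset2_is_vendor(uint32_t reset_type)
{
    return (reset_type & RESET2_VENDOR_BIT) != 0;
}

/* Decide what a platform hook might do with a given reset_type. */
static enum reset2_action reset2_decode(uint32_t reset_type)
{
    if (!reset2_is_vendor(reset_type)) {
        /* Architectural resets: only the warm reset is handled here. */
        return ((reset_type & RESET2_TYPE_MASK) == RESET2_ARCH_WARM_RESET)
                   ? RESET2_DO_WARM_RESET
                   : RESET2_INVALID;
    }
    /* Vendor-specific resets: only the hypothetical panic type is known. */
    return (reset_type == CROS_RESET2_FORCE_PANIC) ? RESET2_DO_PANIC
                                                   : RESET2_INVALID;
}
```

Each platform would still have to wire the decoded action into its own PSCI hooks, which is the duplication concern raised later in the thread.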

5 years, 10 months

Re: [TF-A] SMC to intentionally trigger a panic in TF-A

by Joanna Farley

Julius,

On the subject of DebugFS's purpose, it was envisaged as, and is today, a debug-build-only capability, as Sandrine describes. That said, there have been some early thoughts that it could evolve into a Secure Debug feature, where this type of capability (or something like it) is always on but requires debug certificates for authenticated access. This is very much a possible future evolution and is not in the patches available today. We would welcome any thoughts on such an evolution in this space.

Joanna

On 13/12/2019, 13:01, "TF-A on behalf of Sandrine Bailleux via TF-A" <tf-a-bounces(a)lists.trustedfirmware.org on behalf of tf-a(a)lists.trustedfirmware.org> wrote:

    Hi Julius,

    OK, in that case I can see that a solution based on TF-A's DebugFS interface might not be desirable. Indeed, our original intention was to make the whole DebugFS system a debug-only feature (hence its name!). As such, I agree that it is likely not to get the same level of scrutiny and testing as other features intended for production systems.

    One of the main use cases we have in mind for DebugFS is being able to peek and poke into the firmware for testing purposes. Today, when doing functional testing from the normal world (for example, using TF-A Tests), we are limited to what's exposed through the SMC interface. And even then, we have limited visibility into what really happened in the firmware, as we can only deduce so much from the SMC return value(s). DebugFS could be used to bridge this gap, by providing a side channel for getting internal firmware state information.

    Going back to the SMC-based solution then, I am not quite convinced SYSTEM_RESET2 is the right interface for intentionally triggering a panic in TF-A. I think the semantics do not quite match. If anything, a firmware crash seems more like a shutdown operation to me rather than a reset (we don't recover from a firmware crash). I am not even sure we should look into the PSCI SMC range, as it's not a power-management operation.

    Julius, you wrote:

    > It's the same problem that the SMC/PSCI spec and the TF repository layout is only designed to deal with generic vs. SoC-vendor-specific differentiation. If the normal world OS needs a feature, we can only make it generic or duplicate it across all vendors running that OS.

    So it sounds like it's not the first time that you hit this issue, is it? Do you have any other example of a Normal World OS feature you would have liked to expose through a generic SMC interface? I am wondering whether this could help choosing the right SMC range, if we can identify some common criteria among a set of such features.

    Regards,
    Sandrine

5 years, 7 months

Re: [TF-A] SMC to intentionally trigger a panic in TF-A

by Julius Werner

Hi Sandrine,

Yes, I think using debugfs on the kernel side to control this feature (and other firmware control/debug stuff) is perfectly fine and as good a solution as any other. I am not quite sure about pulling the whole 9p interface into Trusted Firmware, though... it feels a bit "heavy" for a firmware use case (e.g. the 9p layer alone is a thousand lines of code). I'm not sure I see the benefit over using the same debugfs interface on the kernel side but backing it by a kernel driver that translates the file accesses into a simpler, record-based SMC interface (that could also avoid the shared memory requirement, at least for simple requests). Maybe this depends on what else you're planning to do with this interface.

It also seems that you're intending for this to only be used for developer builds and never make it into a production image (e.g. it's printing a warning when not building with DEBUG=1). In that case, I can see that code size and complexity may not be a big concern. However, for the use case I had in mind I'd need this to be enabled in production images (Chrome OS doesn't really distinguish between test and production images for automated testing), so I'm looking more for a lightweight API where each command can be enabled/disabled individually at compile time. I'd have to look at and test it in more detail when it's done, but given that it will likely increase code size by quite a bit and that I'm not sure how much I can trust it to be secure for production, I don't think this would work for my use case.
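As a rough illustration of the "lightweight API where each command can be enabled/disabled individually at compile time" described above, here is a minimal sketch. Every name in it (`DBG_CMD_ENABLE_PANIC`, `DBG_CMD_ENABLE_LOG_LEVEL`, `dbg_dispatch`, the command numbers) is hypothetical and does not exist in TF-A. The panic is simulated with a flag so the sketch can run outside firmware, where a real handler would call the firmware's panic path.

```c
#include <stdint.h>

/* HYPOTHETICAL build options: each command is compiled in only if its
 * option is set to 1 (e.g. passed as -DDBG_CMD_ENABLE_PANIC=1). */
#ifndef DBG_CMD_ENABLE_PANIC
#define DBG_CMD_ENABLE_PANIC      1
#endif
#ifndef DBG_CMD_ENABLE_LOG_LEVEL
#define DBG_CMD_ENABLE_LOG_LEVEL  0
#endif

/* HYPOTHETICAL command numbers carried in a register argument. */
enum dbg_cmd {
    DBG_CMD_PANIC         = 0,
    DBG_CMD_SET_LOG_LEVEL = 1
};

#define DBG_OK             0
#define DBG_NOT_SUPPORTED  (-1)   /* same value SMCCC uses for unknown calls */

/* Stand-ins for firmware state, so the sketch is self-contained. */
static int panic_requested;
static uint64_t log_level;

/* Dispatch one record-style request: a command number plus one argument. */
static int64_t dbg_dispatch(uint32_t cmd, uint64_t arg)
{
    (void)arg;          /* unused when only argument-free commands are built */
    (void)log_level;    /* unused when the log-level command is compiled out */

    switch (cmd) {
#if DBG_CMD_ENABLE_PANIC
    case DBG_CMD_PANIC:
        panic_requested = 1;      /* real firmware would panic here */
        return DBG_OK;
#endif
#if DBG_CMD_ENABLE_LOG_LEVEL
    case DBG_CMD_SET_LOG_LEVEL:
        log_level = arg;
        return DBG_OK;
#endif
    default:
        /* Disabled and unknown commands look identical to the caller. */
        return DBG_NOT_SUPPORTED;
    }
}
```

With this shape, a command that is compiled out contributes no code to the image, which is the code-size and attack-surface property the message asks for.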

5 years, 7 months

Re: [TF-A] SMC to intentionally trigger a panic in TF-A

by Julius Werner

Hi Dan,

Whoops, sorry, this fell through the cracks for me since I wasn't on the to: line. Thanks for your response!

> OK I can see the use of that, although I'd be a bit concerned about such a thing being available as a general service in case it gets used as an attack vector. For example, a test program could aggressively use this service to try to get the firmware to leak secure world information or something about its behaviour.

Yes, of course, we can gate this with a build option so it would only be available where desired.

> However, I think there might already be support for what you need. PSCI is part of the standard service and the function SYSTEM_RESET2 allows for both architectural and vendor-specific resets. The latter allows for vendor-specific semantics, which could include crashing the firmware as you suggest.
>
> Chrome OS could specify what such a vendor-specific reset looks like and each Chromebook's platform PSCI hooks could be implemented accordingly.

Right, but defining a separate vendor-specific reset type for each platform is roughly the same as defining a separate SiP SMC for each of them. It's the same problem that the SMC/PSCI spec and the TF repository layout is only designed to deal with generic vs. SoC-vendor-specific differentiation. If the normal world OS needs a feature, we can only make it generic or duplicate it across all vendors running that OS.

> Alternatively, this could potentially be defined as an additional architectural reset. This would enable a generic implementation but would require approval/definition by Arm's Architecture team. Like me they might have concerns about this being defined at a generic architectural level.

Yes, I think that would be the best option. Could you kick off that process with the Architecture team? Or tell me who I should talk to about this?

Thanks,
Julius

5 years, 10 months

Re: [TF-A] SMC to intentionally trigger a panic in TF-A

by Julius Werner

On Fri, Dec 13, 2019 at 6:20 AM Joanna Farley <Joanna.Farley(a)arm.com> wrote:
> On the subject of DebugFS's purpose it was envisaged and is today as Sandrine describes as a debug build only capability. Saying that though there has been some early thoughts that it could evolve into a Secure Debug feature where this type of capability or something like it is always on requiring debug certificates for authenticated access. This is something very much for a possible future evolution and is not in the patches available today. We would welcome any thoughts on such an evolution in this space.

I guess this gets into a bit of a philosophy discussion and becomes a matter of opinion, so there's probably no one right answer. Personally, adding authentication on top of this doesn't really resolve my concerns and adds yet more on top. I'm a strong proponent of the concept of a minimal Trusted Computing Base, i.e. keeping the amount of code executing at the highest privilege level as small and low-complexity as possible. Any code can have bugs, so the idea is that the more complicated the code you run in EL3 is (and the more complicated APIs it exposes), the more likely it becomes that you accidentally have an exploitable vulnerability in there. Like a 9p filesystem driver, a certificate-based authentication system (especially if it's based on x509/ASN.1 which are notoriously hard to implement safely) is a pretty complex piece of code with a pretty large attack surface that I'd rather not have in my EL3 firmware if I can avoid it. I understand that for certain use cases you may need something like this (if you really want a very extensive and extensible debugging API that must be restricted to a few authenticated actors), but in my use case I really just need the ability to trigger one small debugging feature and that feature itself doesn't need to be restricted, so a minimal SMC interface would work much better for that case.

> On 13/12/2019, 13:01, "TF-A on behalf of Sandrine Bailleux via TF-A" <tf-a-bounces(a)lists.trustedfirmware.org on behalf of tf-a(a)lists.trustedfirmware.org> wrote:
> Going back to the SMC-based solution then, I am not quite convinced SYSTEM_RESET2 is the right interface for intentionally triggering a panic in TF-A. I think the semantics do not quite match. If anything, a firmware crash seems more like a shutdown operation to me rather than a reset (we don't recover from a firmware crash). I am not even sure we should look into the PSCI SMC range, as it's not a power-management operation.

Crash recovery behavior is platform dependent (via plat_panic_handler()). On all the platforms we use in Chrome OS we have that implemented as a system reboot. I think for most systems (whether it's a Chromebook, a server or some embedded device) that's probably what you want for random runtime crashes (at least in a production environment), but I agree that TF doesn't enforce any standard behavior so it's hard to clearly match it to one or the other SMC.

> So it sounds like it's not the first time that you hit this issue, is it? Do you have any other example of Normal World OS feature you would have liked to expose through a generic SMC interface? I am wondering whether this could help choosing the right SMC range, if we can identify some common criteria among a set of such features.

No, it's the first time I've really run into this. But I think we might quickly come up with more uses for a "non-secure OS" SMC range if we had one. We often see roughly the same SMC again on different platforms, because fundamentally they usually need to do the same kinds of things -- for example, most platforms have some kind of DDR frequency scaling which always needs part of it implemented in EL3, so they all need some kind of SMC to switch to a new DDR frequency. Many also need some kind of "write value to secure register" SMC that just allows the non-secure OS to write a few whitelisted registers that are only accessible in EL3 for some reason. If we could standardize these interfaces in a non-vendor-specific SMC range, we might be able to reduce some code duplication both on the TF and the Linux side.

I guess none of these things are really Linux-specific, now that I think of it. So really, I guess the problem is that it would be great to have a range of "generic" SMC IDs that can be easily and unbureaucratically allocated to try out new features, without having to ask Arm to write a big specification document about it every time. It's sort of a development velocity issue.
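The "write value to secure register" pattern mentioned above can be sketched as an allowlist check performed before any write. This is a minimal sketch under stated assumptions: the table addresses and bit masks are made-up placeholders, the names are hypothetical, and the function only computes the value that would be written rather than touching hardware.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* One allowlisted register: its address and the bits the
 * non-secure caller is permitted to change. */
struct reg_allow_entry {
    uintptr_t addr;
    uint32_t  writable_mask;
};

/* HYPOTHETICAL table: placeholder addresses and masks, not real hardware. */
static const struct reg_allow_entry reg_allowlist[] = {
    { 0x10000000u, 0x000000FFu },
    { 0x10000004u, 0x00000001u },
};

/*
 * Check a write request against the allowlist. On success, store in *out
 * the value to write: permitted bits come from the request, all other
 * bits keep their current value. Returns false if the address is not listed.
 */
static bool secure_reg_merge(uintptr_t addr, uint32_t current,
                             uint32_t requested, uint32_t *out)
{
    size_t n = sizeof(reg_allowlist) / sizeof(reg_allowlist[0]);

    for (size_t i = 0; i < n; i++) {
        if (reg_allowlist[i].addr == addr) {
            uint32_t mask = reg_allowlist[i].writable_mask;
            *out = (current & ~mask) | (requested & mask);
            return true;
        }
    }
    return false;   /* not on the list: reject the request */
}
```

Keeping the check as a pure function makes it straightforward to test separately from the SMC handler that would call it and perform the actual register write.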

5 years, 7 months

Re: [TF-A] SMC to intentionally trigger a panic in TF-A

by Soby Mathew

On 13/12/2019 22:04, Julius Werner via TF-A wrote:
> On Fri, Dec 13, 2019 at 6:20 AM Joanna Farley <Joanna.Farley(a)arm.com> wrote:
>> On the subject of DebugFS's purpose it was envisaged and is today as Sandrine describes as a debug build only capability. Saying that though there has been some early thoughts that it could evolve into a Secure Debug feature where this type of capability or something like it is always on requiring debug certificates for authenticated access. This is something very much for a possible future evolution and is not in the patches available today. We would welcome any thoughts on such an evolution in this space.
>
> I guess this gets into a bit of a philosophy discussion and becomes a matter of opinion, so there's probably no one right answer. Personally, adding authentication on top of this doesn't really resolve my concerns and adds yet more on top. I'm a strong proponent of the concept of a minimal Trusted Computing Base, i.e. keeping the amount of code executing at the highest privilege level as small and low-complexity as possible. Any code can have bugs, so the idea is that the more complicated the code you run in EL3 is (and the more complicated APIs it exposes), the more likely it becomes that you accidentally have an exploitable vulnerability in there. Like a 9p filesystem driver, a certificate-based authentication system (especially if it's based on x509/ASN.1 which are notoriously hard to implement safely) is a pretty complex piece of code with a pretty large attack surface that I'd rather not have in my EL3 firmware if I can avoid it. I understand that for certain use cases you may need something like this (if you really want a very extensive and extensible debugging API that must be restricted to a few authenticated actors), but in my use case I really just need the ability to trigger one small debugging feature and that feature itself doesn't need to be restricted, so a minimal SMC interface would work much better for that case.

Hi Julius,

Just trying to understand: if TF-A were to expose a crash-inducing SMC, would this still be restricted to special builds for your test runs? It would not make it to production for Chromebooks, right?

I agree a 9p filesystem is not desirable in an EL3 runtime firmware. We could enhance it to use a tighter data structure, if there is a desire in that direction. If that is the case, leaving aside the 9p filesystem issues, can DebugFS serve this requirement (we can remove the limitation that it is restricted to only Debug builds)? The intention is that DebugFS can prove useful at least in the verification/testing space, and if there is more we can do to get there, it would be good to know.

>> On 13/12/2019, 13:01, "TF-A on behalf of Sandrine Bailleux via TF-A" <tf-a-bounces(a)lists.trustedfirmware.org on behalf of tf-a(a)lists.trustedfirmware.org> wrote:
>> Going back to the SMC-based solution then, I am not quite convinced SYSTEM_RESET2 is the right interface for intentionally triggering a panic in TF-A. I think the semantics do not quite match. If anything, a firmware crash seems more like a shutdown operation to me rather than a reset (we don't recover from a firmware crash). I am not even sure we should look into the PSCI SMC range, as it's not a power-management operation.
>
> Crash recovery behavior is platform dependent (via plat_panic_handler()). On all the platforms we use in Chrome OS we have that implemented as a system reboot. I think for most systems (whether it's a Chromebook, a server or some embedded device) that's probably what you want for random runtime crashes (at least in a production environment), but I agree that TF doesn't enforce any standard behavior so it's hard to clearly match it to one or the other SMC.
>
>> So it sounds like it's not the first time that you hit this issue, is it? Do you have any other example of Normal World OS feature you would have liked to expose through a generic SMC interface? I am wondering whether this could help choosing the right SMC range, if we can identify some common criteria among a set of such features.
>
> No, it's the first time I've really run into this. But I think we might quickly come up with more uses for a "non-secure OS" SMC range if we had one. We often see roughly the same SMC again on different platforms, because fundamentally they usually need to do the same kinds of things -- for example, most platforms have some kind of DDR frequency scaling which always needs part of it implemented in EL3, so they all need some kind of SMC to switch to a new DDR frequency. Many also need some kind of "write value to secure register" SMC that just allows the non-secure OS to write a few whitelisted registers that are only accessible in EL3 for some reason. If we could standardize these interfaces in a non-vendor-specific SMC range, we might be able to reduce some code duplication both on the TF and the Linux side.
>
> I guess none of these things are really Linux-specific, now that I think of it. So really, I guess the problem is that it would be great to have a range of "generic" SMC IDs that can be easily and unbureaucratically allocated to try out new features, without having to ask Arm to write a big specification document about it every time. It's sort of a development velocity issue.

We have utilized the ARM SiP range for some "generic" purposes in the past (see PMF and the execution state switch SMCs). This could be a direction for some of the use cases. But if the SMCs are meant to be truly generic and to be relied on by generic normal world software components, they would need to be properly specified, I would think.

For dynamically modifying some EL3 registers, it would be good to get these requirements out. Perhaps there is scope for architecting some of them as an ARM specification. If not, we could fall back to a TF-A standard if there is enough pull for them (perhaps utilizing the ARM SiP range).

Best Regards
Soby Mathew

5 years, 6 months
