Hi Feng,
This is standard practice for operating systems, hypervisors, and firmware running on Armv8-A systems. A key distinction between SPSel,#0 and SPSel,#1 is that you can tell which stack pointer you were using when you took an exception, as they correspond to different offsets in the vector table. Often, taking an exception from the same EL while using SPSel,#0 is non-terminal, whereas taking an exception from the same EL while already using SPSel,#1 is considered terminal.
Take for example the scenario where your operating system, hypervisor, or firmware is running some task/thread code at EL1/EL2/EL3 and runs out of stack space, triggering a translation fault when it attempts to stack some data (we'll probably be using unmapped guard pages at the stack boundaries). The first thing the exception handler will try to do is stack the GPRs. If the reason you took the exception is that the stack pointed to by SP_EL1/2/3 has itself overflowed, this attempt to stack the GPRs will itself cause a translation fault and you'll get stuck in a recursive exception.
In contrast, if the reason you took the exception is that the stack pointed to by SP_EL0 has overflowed, the exception handler will successfully stack the GPRs to the SP_EL1/2/3 stack and be able to diagnose and log what went wrong before rebooting gracefully.
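For context, here is a minimal illustrative sketch (not TF-A's actual exception-vector code, which is written in assembly) of how software selects between the two stack pointers at the current EL; the vector-table offsets in the comment are the architectural ones:

/*
 * Illustration only. In the Armv8-A vector table, exceptions taken from the
 * current EL use offsets 0x000-0x180 when SPSel == 0 (SP_EL0) and offsets
 * 0x200-0x380 when SPSel == 1 (SP_ELx), which is how a handler can tell
 * which stack was live when the exception was taken.
 */
static inline void use_sp_elx(void)
{
        __asm__ volatile("msr spsel, #1" ::: "memory"); /* SP is now SP_ELx */
}

static inline void use_sp_el0(void)
{
        __asm__ volatile("msr spsel, #0" ::: "memory"); /* SP is now SP_EL0 */
}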
Hope that helps,
Ash.
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Chen Feng via TF-A <tf-a(a)lists.trustedfirmware.org>
Reply to: Chen Feng <puck.chen(a)hisilicon.com>
Date: Friday, 17 April 2020 at 09:02
To: "tf-a(a)lists.trustedfirmware.org" <tf-a(a)lists.trustedfirmware.org>, Alexei Fedorov <Alexei.Fedorov(a)arm.com>, Yatharth Kochar <Yatharth.Kochar(a)arm.com>, Sandrine Bailleux <Sandrine.Bailleux(a)arm.com>
Cc: "puck.chen(a)hisilicon.com" <puck.chen(a)hisilicon.com>, "lizhong11(a)hisilicon.com" <lizhong11(a)hisilicon.com>
Subject: [TF-A] sp select in atf
Hello all,
I see that the ATF code uses different stack pointers, and in particular for SMC64 it uses SP_EL0.
So the unhandled-exception handler must switch to SP_EL0 to do the
backtrace dump, since SP_EL3 is used by default when an exception happens.
My question is: why use different stack pointers in the ATF code? Just using
SP_EL3 everywhere seems simpler.
Cheers,
- feng
As of now, ARM_LINUX_KERNEL_AS_BL33 is only supported when RESET_TO_BL31=1; along with it you need to pass PRELOADED_BL33_BASE as well as ARM_PRELOADED_DTB_BASE.
AFAIK this feature is not tested for platforms which use all the BL(1/2/31) stages from TF-A. The most likely reason for this is the loading and authentication of the Linux image:
BL2, which is responsible for loading the various images, does not have support for loading a Linux image.
On platforms with RESET_TO_BL31, TF-A relies on a prior loader which loads the kernel and device tree blob at their respective addresses.
In short, if your platform has RESET_TO_BL31=1, it will be quite easy; otherwise you need to understand the BL2 loading mechanism and see if you can extend it to load Linux and a DTB.
Kernel output format: zImage/Image should work.
Hope this helps!
________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of François Ozog via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 15 April 2020 13:31
To: tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] Linux as BL33
I want to use Linux as BL33 on a Marvell Macchiatobin.
Currently I have the successful boot flow:
TFA (mainline v2.2) -> U-Boot (Mainline 2020.04rc5) -> Kernel (5.6.3)
with U-root initrd (6.0.0, https://github.com/u-root/u-root ) ->
Ubuntu 19.10
The 5.6.3 "intermediary" kernel is 5.5MB uncompressed , u-root initrd
is 3.5MB compressed (some form of golang based busybox).
I was pointed to the ARM_LINUX_KERNEL_AS_BL33 option which is not
supported on the Macchiatobin.
It does not look too difficult to add, but I'd like to have some
feedback/guidance on how to do it:
- how to add the option to the TFA platform
- how to generate a usable kernel (compile options? non relocatable
kernel? output format, i.e. Image, zImage, uImage...)
Thanks for your help
-FF
+Harb
________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Vivek Prasad via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: Thursday, April 16, 2020 10:19 AM
To: Stuart Yoder <stuart.yoder(a)arm.com>; Alexei Fedorov <Alexei.Fedorov(a)arm.com>; tf-a <tf-a(a)lists.trustedfirmware.org>; Raghu Krishnamurthy <raghu.ncstate(a)icloud.com>
Cc: Loc Ho <loc.ho(a)amperecomputing.com>; Vivek Kumar <vivek(a)amperecomputing.com>; Benjamin Chaffin <bchaffin(a)amperecomputing.com>; Ard Biesheuvel <Ard.Biesheuvel(a)arm.com>; Mohamad Ammar <moe(a)amperecomputing.com>; Charles Garcia-Tobin <Charles.Garcia-Tobin(a)arm.com>
Subject: Re: [TF-A] Proposal for Measured Boot Implementation
Hello Stuart, Alexei,
Chiming in here on Ampere's behalf...
We analysed this proposal internally, and we see a number of issues with it, some of which were already raised by Raghu in the previous threads.
Here is a summary of the main issues that we see.
* Only mbedtls is supported, and this is a fixed configuration at compile time.
* We propose that there should be a variable for the algorithm to be used, which can be set up at initialization time.
* This solution relies on taking the hash directly from the digest as the measurement, instead of the computed hash. This is not safe, especially considering measured boot may use a different hash bank, so the digest hash may not be correct/valid.
* Only the BL2 image is measured; per the ARM SBSG we must be measuring and logging *all* images/boot phases:
* BL31
* BL32 (all secure partitions)
* BL33 (UEFI or any other non-secure boot loader)
* Once we ERET into BL33, the measured boot flow continues and is owned by that boot loader
* We only see support for PCR0; any/all unsigned config data must be logged to PCR1.
* Passing PCRs to non-secure software before logging is not compliant with TCG Static-Root-of-Trust Measurement (SRTM) requirements
* This was discussed before in separate conversations… especially in systems where we are talking about two different signing domains, where BL33 is in a different trust/signing domain.
* BL33 should only do hash-log-extend… there is no need for BL33 to be aware of the current PCR value (beyond what is provided in the boot event log). A rough sketch of this hash-log-extend flow follows after this list.
* Based on comments on the mail thread, there seem to be bad assumptions/expectations around TPM accessibility from non-secure world.
* Expecting SPI/I2C TPMs to be directly accessed from the non-secure world instead of abstracting hardware details via the TCG CRB interface (which has already been standardized as the de facto mechanism for ARM on past mobile, client, and server solutions).
* CRB will "just work" for Aptio/EDK2/Linux/Windows/Hyper-V/VMWare
* NOTE: This goes back to what is a “productizable” TPM solution. We want it to be a turn-key solution for customers without having to support/develop proprietary drivers.
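As a rough sketch of the hash-log-extend flow referred to above (every name below is hypothetical; this is not the proposed TF-A measured boot API nor a TCG-defined interface):

#include <stddef.h>
#include <stdint.h>

#define DIGEST_SIZE 32 /* e.g. SHA-256; the algorithm choice is illustrative */

/* Hypothetical helpers, declared only to keep the sketch self-contained. */
int hash_compute(const void *data, size_t len, uint8_t digest[DIGEST_SIZE]);
int event_log_append(uint32_t pcr_index, const uint8_t digest[DIGEST_SIZE],
                     const char *description);
int tpm_pcr_extend(uint32_t pcr_index, const uint8_t digest[DIGEST_SIZE]);

/* Hash, log, extend: note the caller never reads the current PCR value. */
static int measure_image(const void *image, size_t len, uint32_t pcr_index,
                         const char *description)
{
        uint8_t digest[DIGEST_SIZE];
        int rc;

        rc = hash_compute(image, len, digest);
        if (rc != 0)
                return rc;

        rc = event_log_append(pcr_index, digest, description);
        if (rc != 0)
                return rc;

        return tpm_pcr_extend(pcr_index, digest);
}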
-Vivek/Harb
Hi all,
Thanks to all who have commented on this proposal so far. I've edited
the original document to try and incorporate all feedback gathered so
far (through the TSC meeting, this email thread and the TF-A tech call).
Please have another look and flag anything I might have missed:
https://developer.trustedfirmware.org/w/collaboration/project-maintenance-p…
The major changes are:
== Removed concept of self-review ==
This is proving too controversial; several people do not want to allow
self-review.
Roles of maintainer and code owner are still cumulative but cannot be
both exercised for the same patch.
The exact method of dealing with review bottleneck is still to be
decided. In addition to the current proposal of increasing the
maintainers pool, the most popular alternatives mentioned so far are:
- Set a minimum wait time for feedback before a patch can be merged
without any further delay.
- Mandate distinct reviewers for a patch.
== Enhanced the section "Patch contribution Guidelines" ==
Mentioned that patches should be small, on-topic, with comprehensive
commit messages.
== Added a note about how to deal with disagreement ==
If reviewers cannot find a common ground, the proposal is to call out a
3rd-party maintainer.
== Removed "out-of-date" platform state ==
Squashed it into "limited support" to reduce the number of states.
== Removed "orphan" state from platform support life cycle ==
This concept is orthogonal to the level of functionality.
Added a note in the "Code Owner" section instead.
== Per-project guidelines as a complementary document ==
Added a list of things that it would typically cover.
== Added requirement on fully supported platforms to document the
features they support ==
== Added todo mentioning that the proposal might cover branching
strategies in the future ==
The full diff may be seen here:
https://developer.trustedfirmware.org/phriction/diff/73/?l=4&r=5
This proposal is still open for discussion at this stage and further
feedback is most welcome!
Regards,
Sandrine
Looping-in Thomas & Deepak, responsible for the RD-N1 landing team platforms releases. They might be able to help.
Thanks
Matteo
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of 吴斌(郅隆) via TF-A
Sent: 14 April 2020 06:47
To: TF-A <tf-a-bounces(a)lists.trustedfirmware.org>; Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Hi Raghu,
Really appreciate your help.
I downloaded this software stack from git.linaro.org. This software stack includes ATF, the kernel, edk2 and so on.
The user guide I used from Linaro is: https://git.linaro.org/landing-teams/working/arm/arm-reference-platforms…
1) What platform you are running on? Can this issue be reproduced
outside your testing environment, perhaps on FVP or QEMU?
A: I am running on the ARM N1-Edge FVP platform. It can be reproduced on this FVP platform.
2) What version of TF-A and StandaloneMM is being used? Preferably the
commit-id, so that we can be sure we are looking at the same code.
A: TF-A: https://git.linaro.org/landing-teams/working/arm/arm-tf.git tag:RD-INFRA-20191024-RC0
StandaloneMM seems to be built from edk2 & edk2-platforms, so I just put the edk2 and edk2-platforms version information. If I missed anything, please let me know.
edk2: https://git.linaro.org/landing-teams/working/arm/edk2.git tag:RD-INFRA-20191024-RC0
edk2-platforms: https://git.linaro.org/landing-teams/working/arm/edk2-platforms.git tag:RD-INFRA-20191024-RC0
3) What version of the kernel and sdei driver is being used?
A: kernel-release: https://git.linaro.org/landing-teams/working/arm/kernel-release.git tag:RD-INFRA-20191024-RC0
The SDEI driver is included in the kernel; do I need to provide the SDEI driver version separately? If needed, please let me know.
4) I can't tell from looking at the log but do you know if writing 0x123
to sde_ras_poison causes a DMC620 interrupt or an SError or external
abort through memory access ?
A: Sorry, Linaro only mentioned that it injects a DMC-620 single-bit RAS error, so I am also not sure which exception type it will trigger.
BRs,
Bin Wu
------------------ Original Message ------------------
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org<mailto:tf-a-bounces@lists.trustedfirmware.org>>
Sent: Tue Apr 14 01:25:47 2020
To: Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org<mailto:tf-a@lists.trustedfirmware.org>>
Subject: Re: [TF-A] [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Hello,
>>Does BL31 need to send 0xC4000061 event to BL32 again?
I don't think it will. It is really odd that
0xC4000061(SP_EVENT_COMPLETE_AARCH64) ever reaches the BL32/MM handler.
This is from looking at the upstream code quickly, but it definitely
depends on the platform you are running, what version of TF-A you are
using, and the build options used. Is it possible that the unhandled exception
is occurring after successful handling of the DMC620 error but there is
a following issue that occurs right after, causing the crash?
From the register dump it looks like there was an Instruction abort
exception at address 0 while running in EL3. Something seems to have
gone seriously wrong to have 0xC4000061 ever go back to BL32 and to get
an instruction abort at address 0.
>>Does current TF-A support to run RAS test? It seems BL31 will crash.
See above. The answer really depends on the factors mentioned above.
The following would be helpful to know:
1) What platform you are running on? Can this issue be reproduced
outside your testing environment, perhaps on FVP or QEMU?
2) What version of TF-A and StandaloneMM is being used? Preferably the
commit-id, so that we can be sure we are looking at the same code.
3) What version of the kernel and sdei driver is being used?
4) I can't tell from looking at the log but do you know if writing 0x123
to sde_ras_poison causes a DMC620 interrupt or an SError or external
abort through memory access ?
Thanks
Raghu
On 4/13/20 12:16 AM, 吴斌(郅隆) via TF-A wrote:
> Dear Friends,
>
> I am using TF-A to test RAS feature.
> When I triggered DMC620 RAS error in Linux(echo 0x123 >
> /sys/kernel/debug/sdei_ras_poison).
> BL32 will receive
> UnRecognized Event - 0xC4000061(SP_EVENT_COMPLETE_AARCH64) and finally
> BL31 crashed.
>
> In my understanding, this 0xC4000061 should be consumed by BL31, not sent
> to BL32 again.
>
> A piece of error log as below:
>
> *************************************
>
> CperWrite - CperAddress@0xFF610064
> CperWrite - 1 Section@FFBE91A8, Length 80, SectionType@FFBE9138
> CperWrite - Got Error Section: Platform Memory.
> MmEntryPoint Done
> Received delegated event
> X0 : 0xC4000061
> X1 : 0x0
> X2 : 0x0
> X3 : 0x0
> Received event - 0xC4000061 on cpu 0
> UnRecognized Event - 0xC4000061
> Failed delegated event 0xC4000061, Status 0x2
> Unhandled Exception in EL3.
> x30 = 0x0000000000000000
> x0 = 0x00000000ff007e00
> x1 = 0xfffffffffffffffe
> x2 = 0x00000000600003c0
> x3 = 0x0000000000000000
> x4 = 0x0000000000000000
> x5 = 0x0000000000000000
> x6 = 0x00000000ff015080
> x7 = 0x0000000000000000
> x8 = 0x00000000c4000061
> x9 = 0x0000000000000021
> x10 = 0x0000000000000040
> x11 = 0x00000000ff00f2b0
> x12 = 0x00000000ff0118c0
> x13 = 0x0000000000000002
> x14 = 0x00000000ff016b70
> x15 = 0x00000000ff003f20
> x16 = 0x0000000000000044
> x17 = 0x00000000ff010430
> x18 = 0x0000000000000e3c
> x19 = 0x0000000000000000
> More error log please refer to attachment.
>
> My question is,
> 1. Does BL31 need to send 0xC4000061 event to BL32 again?
> 2. Does current TF-A support to run RAS test? It seems BL31 will crash.
>
> Appreciate your help.
>
> BRs,
> Bin Wu
>
Hi Varun,
1. The value of '1' selects the ‘standard’ type of branch protection which, according to the GCC documentation,
"turns on all types of branch protection features. If a feature has additional tuning options, then ‘standard’ sets it to its standard level."
It is equivalent to "bti+pac-ret".
2. Yes. See above and use an option value of '1'.
Regards.
Alexei
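For completeness, a small sketch (assuming bare-metal C running at EL1 or higher; field positions taken from the Arm ARM, worth double-checking) of how firmware can probe whether the CPU actually implements PAuth and BTI, which relates to the note in the docs below about PAuth being enabled for the Non-secure world whenever the CPU supports it:

#include <stdint.h>

static inline uint64_t read_id_aa64isar1_el1(void)
{
        uint64_t val;
        __asm__ volatile("mrs %0, id_aa64isar1_el1" : "=r"(val));
        return val;
}

static inline uint64_t read_id_aa64pfr1_el1(void)
{
        uint64_t val;
        __asm__ volatile("mrs %0, id_aa64pfr1_el1" : "=r"(val));
        return val;
}

/* Address authentication (PAuth): ID_AA64ISAR1_EL1.APA [7:4] or .API [11:8]. */
static inline int cpu_has_pauth(void)
{
        uint64_t isar1 = read_id_aa64isar1_el1();

        return (((isar1 >> 4) & 0xfULL) != 0ULL) ||
               (((isar1 >> 8) & 0xfULL) != 0ULL);
}

/* Branch Target Identification: ID_AA64PFR1_EL1.BT [3:0]. */
static inline int cpu_has_bti(void)
{
        return (read_id_aa64pfr1_el1() & 0xfULL) != 0ULL;
}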
________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Varun Wadekar via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 10 April 2020 19:28
To: tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
Cc: Kalyani Chidambaram Vaidyanathan <kalyanic(a)nvidia.com>; Anthony Zhou <anzhou(a)nvidia.com>
Subject: Re: [TF-A] BRANCH_PROTECTION
Hello,
Can someone please help clarify?
-Varun
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Varun Wadekar via TF-A
Sent: Tuesday, April 7, 2020 9:58 PM
To: tf-a(a)lists.trustedfirmware.org
Cc: Kalyani Chidambaram Vaidyanathan <kalyanic(a)nvidia.com>; Anthony Zhou <anzhou(a)nvidia.com>
Subject: [TF-A] BRANCH_PROTECTION
External email: Use caution opening links or attachments
Hello,
Can someone please help me understand if
1. a ‘value’ of ‘1’ for BRANCH_PROTECTION covers the PAuth protection provided by a value of ‘2’ and/or ‘3’?
2. there is a way to enable BTI and “pac-ret” at the same time?
The docs provide this information.
<snip>
- ``BRANCH_PROTECTION``: Numeric value to enable ARMv8.3 Pointer Authentication
and ARMv8.5 Branch Target Identification support for TF-A BL images themselves.
If enabled, it is needed to use a compiler that supports the option
``-mbranch-protection``. Selects the branch protection features to use:
- 0: Default value turns off all types of branch protection
- 1: Enables all types of branch protection features
- 2: Return address signing to its standard level
- 3: Extend the signing to include leaf functions
The table below summarizes ``BRANCH_PROTECTION`` values, GCC compilation options
and resulting PAuth/BTI features.
+-------+--------------+-------+-----+
| Value | GCC option | PAuth | BTI |
+=======+==============+=======+=====+
| 0 | none | N | N |
+-------+--------------+-------+-----+
| 1 | standard | Y | Y |
+-------+--------------+-------+-----+
| 2 | pac-ret | Y | N |
+-------+--------------+-------+-----+
| 3 | pac-ret+leaf | Y | N |
+-------+--------------+-------+-----+
This option defaults to 0 and this is an experimental feature.
Note that Pointer Authentication is enabled for Non-secure world
irrespective of the value of this option if the CPU supports it.
<snip>
Thanks,
Varun
________________________________
Dear Friends,
I am using TF-A to test the RAS feature.
When I triggered a DMC620 RAS error in Linux (echo 0x123 > /sys/kernel/debug/sdei_ras_poison),
BL32 received UnRecognized Event - 0xC4000061 (SP_EVENT_COMPLETE_AARCH64) and finally BL31 crashed.
In my understanding, this 0xC4000061 should be consumed by BL31, not sent to BL32 again.
A piece of the error log is below:
*************************************
CperWrite - CperAddress@0xFF610064
CperWrite - 1 Section@FFBE91A8, Length 80, SectionType@FFBE9138
CperWrite - Got Error Section: Platform Memory.
MmEntryPoint Done
Received delegated event
X0 : 0xC4000061
X1 : 0x0
X2 : 0x0
X3 : 0x0
Received event - 0xC4000061 on cpu 0
UnRecognized Event - 0xC4000061
Failed delegated event 0xC4000061, Status 0x2
Unhandled Exception in EL3.
x30 = 0x0000000000000000
x0 = 0x00000000ff007e00
x1 = 0xfffffffffffffffe
x2 = 0x00000000600003c0
x3 = 0x0000000000000000
x4 = 0x0000000000000000
x5 = 0x0000000000000000
x6 = 0x00000000ff015080
x7 = 0x0000000000000000
x8 = 0x00000000c4000061
x9 = 0x0000000000000021
x10 = 0x0000000000000040
x11 = 0x00000000ff00f2b0
x12 = 0x00000000ff0118c0
x13 = 0x0000000000000002
x14 = 0x00000000ff016b70
x15 = 0x00000000ff003f20
x16 = 0x0000000000000044
x17 = 0x00000000ff010430
x18 = 0x0000000000000e3c
x19 = 0x0000000000000000
For more of the error log, please refer to the attachment.
My question is:
1. Does BL31 need to send the 0xC4000061 event to BL32 again?
2. Does the current TF-A support running a RAS test? It seems BL31 will crash.
Appreciate your help.
BRs,
Bin Wu
Hi,
Please find the latest report on new defect(s) introduced to ARM-software/arm-trusted-firmware found with Coverity Scan.
3 new defect(s) introduced to ARM-software/arm-trusted-firmware found with Coverity Scan.
New defect(s) Reported-by: Coverity Scan
Showing 3 of 3 defect(s)
** CID 355490: Uninitialized variables (UNINIT)
/plat/brcm/board/stingray/src/paxb.c: 516 in pcie_cores_init()
________________________________________________________________________________________________________
*** CID 355490: Uninitialized variables (UNINIT)
/plat/brcm/board/stingray/src/paxb.c: 516 in pcie_cores_init()
510
511 pcie_core_soft_reset(core_idx);
512
513 VERBOSE("PCIe core %u is powered up\n", core_idx);
514 }
515
>>> CID 355490: Uninitialized variables (UNINIT)
>>> Using uninitialized value "ret".
516 return ret;
517 }
518
519 void paxb_rc_cfg_write(unsigned int core_idx, unsigned int where,
520 uint32_t val)
521 {
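For reference, the usual fix pattern for this class of defect, sketched with a hypothetical function rather than the actual paxb.c code (whether 0 is the right default for pcie_cores_init() would need checking against the driver):

/* Hypothetical example, not the real pcie_cores_init(). */
int power_up_core(unsigned int core_idx); /* hypothetical helper */

static int cores_init_example(unsigned int num_cores)
{
        int ret = 0; /* give the status a well-defined value on every path */

        for (unsigned int i = 0U; i < num_cores; i++) {
                ret = power_up_core(i);
                if (ret != 0)
                        break;
        }

        return ret;
}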
** CID 355489: Control flow issues (NO_EFFECT)
/plat/brcm/board/stingray/src/bl2_setup.c: 685 in plat_bcm_bl2_plat_arch_setup()
________________________________________________________________________________________________________
*** CID 355489: Control flow issues (NO_EFFECT)
/plat/brcm/board/stingray/src/bl2_setup.c: 685 in plat_bcm_bl2_plat_arch_setup()
679 EMMC_ERASE_PARTITION);
680 #endif
681
682 bcm_board_detect();
683 #ifdef DRIVER_EMMC_ENABLE
684 /* Initialize the card, if it is not */
>>> CID 355489: Control flow issues (NO_EFFECT)
>>> This less-than-zero comparison of an unsigned value is never true. "bcm_emmc_init(true) < 0U".
685 if (bcm_emmc_init(true) < 0)
686 WARN("eMMC Card Initialization Failed!!!\n");
687 #endif
688
689 #if BL2_TEST_I2C
690 i2c_test();
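For the unsigned-comparison defect, a self-contained sketch of the issue and one possible fix (the assumption that non-zero means failure is mine, not taken from the Broadcom driver):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for an init routine with an unsigned return type. */
static uint32_t emmc_init_example(int init_card)
{
        return init_card ? 0U : 1U; /* 0 = success, non-zero = failure (assumed) */
}

int main(void)
{
        /* "< 0" on an unsigned value is always false; test against 0 instead,
         * or change the routine to return a signed error code. */
        if (emmc_init_example(1) != 0U)
                printf("eMMC Card Initialization Failed!!!\n");

        return 0;
}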
** CID 355488: (VARARGS)
/plat/brcm/board/common/bcm_elog.c: 251 in bcm_elog()
/plat/brcm/board/common/bcm_elog.c: 253 in bcm_elog()
/plat/brcm/board/common/bcm_elog.c: 241 in bcm_elog()
/plat/brcm/board/common/bcm_elog.c: 234 in bcm_elog()
/plat/brcm/board/common/bcm_elog.c: 223 in bcm_elog()
/plat/brcm/board/common/bcm_elog.c: 239 in bcm_elog()
________________________________________________________________________________________________________
*** CID 355488: (VARARGS)
/plat/brcm/board/common/bcm_elog.c: 251 in bcm_elog()
245 case 'l':
246 bit64 = 1;
247 fmt++;
248 goto loop;
249 case 'u':
250 if (bit64)
>>> CID 355488: (VARARGS)
>>> Calling va_arg on va_list "args", which has not been prepared with va_start().
251 unum = va_arg(args, uint64_t);
252 else
253 unum = va_arg(args, uint32_t);
254
255 elog_unsigned_num(elog, unum, 10);
256 break;
/plat/brcm/board/common/bcm_elog.c: 253 in bcm_elog()
247 fmt++;
248 goto loop;
249 case 'u':
250 if (bit64)
251 unum = va_arg(args, uint64_t);
252 else
>>> CID 355488: (VARARGS)
>>> Calling va_arg on va_list "args", which has not been prepared with va_start().
253 unum = va_arg(args, uint32_t);
254
255 elog_unsigned_num(elog, unum, 10);
256 break;
257 default:
258 /* Exit on any other format specifier */
/plat/brcm/board/common/bcm_elog.c: 241 in bcm_elog()
235 elog_string(elog, str);
236 break;
237 case 'x':
238 if (bit64)
239 unum = va_arg(args, uint64_t);
240 else
>>> CID 355488: (VARARGS)
>>> Calling va_arg on va_list "args", which has not been prepared with va_start().
241 unum = va_arg(args, uint32_t);
242
243 elog_unsigned_num(elog, unum, 16);
244 break;
245 case 'l':
246 bit64 = 1;
/plat/brcm/board/common/bcm_elog.c: 234 in bcm_elog()
228 } else
229 unum = (unsigned long)num;
230
231 elog_unsigned_num(elog, unum, 10);
232 break;
233 case 's':
>>> CID 355488: (VARARGS)
>>> Calling va_arg on va_list "args", which has not been prepared with va_start().
234 str = va_arg(args, char *);
235 elog_string(elog, str);
236 break;
237 case 'x':
238 if (bit64)
239 unum = va_arg(args, uint64_t);
/plat/brcm/board/common/bcm_elog.c: 223 in bcm_elog()
217 switch (*fmt) {
218 case 'i': /* Fall through to next one */
219 case 'd':
220 if (bit64)
221 num = va_arg(args, int64_t);
222 else
>>> CID 355488: (VARARGS)
>>> Calling va_arg on va_list "args", which has not been prepared with va_start().
223 num = va_arg(args, int32_t);
224
225 if (num < 0) {
226 elog_putchar(elog, '-');
227 unum = (unsigned long)-num;
228 } else
/plat/brcm/board/common/bcm_elog.c: 239 in bcm_elog()
233 case 's':
234 str = va_arg(args, char *);
235 elog_string(elog, str);
236 break;
237 case 'x':
238 if (bit64)
>>> CID 355488: (VARARGS)
>>> Calling va_arg on va_list "args", which has not been prepared with va_start().
239 unum = va_arg(args, uint64_t);
240 else
241 unum = va_arg(args, uint32_t);
242
243 elog_unsigned_num(elog, unum, 16);
244 break;
________________________________________________________________________________________________________
To view the defects in Coverity Scan visit, https://u2389337.ct.sendgrid.net/ls/click?upn=nJaKvJSIH-2FPAfmty-2BK5tYpPkl…
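For the VARARGS defects, the underlying requirement is that every va_arg() call sits between va_start() and a matching va_end(); a minimal, self-contained sketch of that discipline (not the actual bcm_elog() code, whose formatting logic is richer):

#include <stdarg.h>
#include <stdio.h>

/* Minimal printf-style skeleton; only %d is handled, to keep it short. */
static void elog_example(const char *fmt, ...)
{
        va_list args;

        va_start(args, fmt); /* must be called before any va_arg() */
        for (; *fmt != '\0'; fmt++) {
                if (fmt[0] == '%' && fmt[1] == 'd') {
                        printf("%d", va_arg(args, int));
                        fmt++; /* skip the 'd' */
                } else {
                        putchar(*fmt);
                }
        }
        va_end(args); /* must be called after the last va_arg() */
}

/* Usage: elog_example("core %d is powered up\n", 3); */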
I realise non-TF-A people may not have access to the session conference details. These can be found here:
https://lists.trustedfirmware.org/pipermail/tf-a/2020-March/000330.html
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Joanna Farley via TF-A <tf-a(a)lists.trustedfirmware.org>
Reply to: Joanna Farley <Joanna.Farley(a)arm.com>
Date: Tuesday, 7 April 2020 at 18:10
To: tf-a <tf-a(a)lists.trustedfirmware.org>
Cc: "tsc(a)lists.trustedfirmware.org" <tsc(a)lists.trustedfirmware.org>, "tf-m(a)lists.trustedfirmware.org" <tf-m(a)lists.trustedfirmware.org>, "op-tee(a)linaro.org" <op-tee(a)linaro.org>
Subject: [TF-A] TF-A Tech Forum @ Thu 9 Apr 2020 17:00 - 18:00 (GMT) - Special session on Project Maintenance Proposal for tf.org
Hi All,
The third TF-A Tech Forum is scheduled for Thu 9th Apr 2020 17:00 - 18:00 (GMT). A recurring meeting invite has been sent out to the subscribers of this TF-A mailing list. If you don’t have it, please let me know.
For this special session I have also copied the TF-M, TSC and OPTEE mailing lists as the subject may interest the people subscribed to those lists as there is a cross mailing list discussion currently ongoing.
Agenda:
* Overview of the Project Maintenance Proposal for tf.org Projects by Sandrine Bailleux
* Optional TF-A Mailing List Topic Discussions
Thanks
Joanna
Hi Joakim,
On 4/2/20 10:18 AM, Joakim Bech via TF-A wrote:
> Hi Sandrine,
>
> On Wed, Apr 01, 2020 at 11:46:20AM +0200, Sandrine Bailleux wrote:
>> Hi Joakim,
>>
>> On 4/1/20 10:08 AM, Joakim Bech via TSC wrote:
>>> How that works in practice is that all OP-TEE maintainers are adding
>>> their "Tested-by" (see example [2]) tag for the platform they maintain
>>> when we're doing a release. If there are platforms with no "Tested-by"
>>> tag, then they simply end up with the "last known version".
>>
>> I think that's a very good idea!
>>
> The "Tested-by" part for OP-TEE releases has been working pretty good,
> not sure how scalable it is the long run though. To give some more
> info regarding the "last known version", we even at one point had some
> stop-light for that. I.e. if a maintainer missed testing a release once,
> then it became "orange". If missed twice, then it became "red" and we
> showed last know supported version. But we dropped that idea a short
> while after introducing it.
May I know why you dropped the idea? Was it too much maintenance? If
that's the reason I guess again this could be addressed with some
automation work (generating the stop-light status from the commit
message info).
>>> However, to keep that up-to-date, it requires some discipline from the
>>> people maintaining such a table ... something that we in the OP-TEE
>>> project haven't been very good at :)
>>
>> Can't this be automated, such that it doesn't need to be manually kept
>> up-to-date? I imagine we could have some tools generating the platform
>> support table out of such a commit message.
>>
> Indeed it could, it's just a matter of doing some scripting if one
> doesn't want to do it manually. I already have Python scripts pulling
> all tags from GitHub pull requests. But there are of course several
> other ways how one could pull that kind of information.
Regards,
Sandrine
Hi Erik,
On 4/1/20 9:24 PM, Shreve, Erik via TF-A wrote:
> Sandrine,
>
> To clarify on functionality vs. support. I listed out a support life cycle consisting of the following states:
> Fully Supported
> Orphan
> Out of Date
> Deprecated
> These states are intended to have nothing to do with functionality, but only the support offered for the functionality that currently exists for a platform.
>
> I think I may have confused things when I listed "Functional Support" as the heading to represent "Functionality."
> I'm proposing that the supported "Functionality" should be documented in a standard way (within a project) for every platform.
>
> I do agree this could be burdensome to keep up with. But that is why I suggested that the project's feature list be versioned. The platform's supported feature list document would reference the version of the project feature list used. Platform maintainers then don't have to continuously update the document. But it will be clear how long it has been since they did update and thus what information may be missing. Versioning the feature list document is also why I mentioned that the project version number may want to adopt a version number scheme where feature changes are represented by a certain part of the version number. For example Semantic Versioning 2.0.0: https://semver.org/. Hope that clarifies the intent? For implementation of this I'm imagining each project could create a supported_feature_list.rst file and each platform would copy that file into their platform doc folder and fill it in. I'm not saying that approach would be required at tf.org level, just sharing to further illustrate.
>
> That said, perhaps the implementation details for a project would not warrant such a document per platform? My primary concern around this is misuse/misconfiguration. If a platform doesn't support a feature or configuration it may not be obvious to a user unless an error is generated at build or run-time.
OK, I think I get the idea now, thanks for the explanations. This looks
reasonable to me. The idea of keeping a project's feature list being
mirrored and filled in per platform sounds like something we would want
to enforce at the tf.org level IMO.
At the same time, this could also be handled at the build system level
as you pointed out, or more precisely by the configuration manager. I am
thinking about the Linux kernel, where support for a particular feature
is handled (and documented) through the KConfig system. This might be a
more scalable approach. And it doesn't prevent us from also
auto-generating some feature list out of the Kconfig files for making
this information more accessible to users.
> My secondary concern is being able to consistently track tickets/bugs with features. Thus, I'm recommending that the features on that list be used with the ticket/issue system by feature name. This would allow users to find all bugs for Feature X on Platform Y in Project Z.
> Related to that, when I mentioned "tags." I wasn't thinking of Git tags, but "labels" in the ticket/issue tracking system(s). Different systems work differently for labeling/categorizing issues, but the goal is to provide a consistent way (per project) to find issues related to a feature on a platform.
>
> If requiring a feature list it is too much at the tf.org level then I'll be satisfied to push for that kind of documentation in the projects or platforms I'm involved in if/as appropriate.
Yes, good point, I think it would be desirable to be able to tie tickets
to some specific platform/feature/version. And I think it makes sense to
unify this across tf.org projects.
> Regarding the other conversational tidbits:
>
> Thanks for pointing out that the original proposal does say "builds all configurations supported by this platform" under "Fully Supported." I can see the intention here now. Substituting "features" for "configurations" would broaden the meaning a bit.
OK, I will change the wording, thanks for the suggestion.
> You said: "I am starting to think that we need a list of items to be defined per project."
> Yes, this sounds like a great idea.
>
> My original mention of wanting a "stronger standard put forth for platform documentation" was a response to seeing "Limited Support" in the original proposal allowing documentation to fall out of date.
>
>
> Hope that clarifies some of my thoughts. If not, I'm happy to continue the discussion. Thanks again for taking feedback!
>
> Erik Shreve, PSEM
> Software Security Engineer & Architect (CMCU Platform Development)
>
> -----Original Message-----
> From: Sandrine Bailleux [mailto:sandrine.bailleux@arm.com]
> Sent: Wednesday, April 01, 2020 4:18 AM
> To: Shreve, Erik; tf-a; tf-m(a)lists.trustedfirmware.org; tsc(a)lists.trustedfirmware.org; op-tee(a)linaro.org
> Cc: nd(a)arm.com
> Subject: Re: [EXTERNAL] [TF-M] Project Maintenance Proposal for tf.org Projects
>
> Hello Erik,
>
> Thanks for the feedback.
>
> On 3/26/20 3:37 PM, Shreve, Erik wrote:
>> Sandrine,
>>
>> Really glad to see this being pulled together. A couple of areas of feedback around the Platform Support Life Cycle.
>>
>> As previously mentioned there are two orthogonal concerns captured in the current life cycle: Support and Functionality.
>> I'd like to see these split out.
>
> Yes, you are the second person to mention that and I agree with you
> both. Unless someone disagrees, I intend to update the proposal and
> separate these 2 concepts in the next version of the document.
>
>> For functionality, chip vendors may not have a business case for supporting all features on a given platform but they may provide full support for the features they have chosen to include.
>> A simple example would be supporting PSA FF Isolation Level 1 only due to lack of HW isolation support needed to achieve Isolation Level 2 or greater.
>
> I completely agree. It would not make sense to support all features on
> all platforms just for the sake of completeness. Each platform ought to
> implement what is relevant in its case.
>
> That's what the current proposal tried to convey: a fully supported
> platform must "build all configurations *supported by this platform*"
> and "All *supported* configurations are tested in the CI". The key word
> here is supported and that would be defined by the platform itself. But
> I can see that maybe this wasn't clear enough. Your proposal below makes
> that a lot clearer.
>
>> Also, I'd like to see a stronger standard put forth for platform documentation. If a platform is "supported," I believe the documentation should be complete and accurate. A lack of complete and clear documentation leaves open a wide door for misuse/misconfiguration which could result in a vulnerable system.
>
> Fair point.
>
> But is it something we should include in this proposal or should we push
> it to a separate document setting expectations for the project's
> documentation, which the current general proposal could refer to (as in,
> "the platform should provide quality documentation up to the project's
> criteria defined in document XXX')?
>
> This is definitely an important topic but I am wary of keeping the
> tf.org proposal concise and focused at this point. I am worried that if
> we put too much stuff in it discussions will diverge too much and we
> might never reach an agreement.
>
> The same applies to testing standards for example, we could detail that
> in the proposal or simply leave it to projects to define it separately.
>
>> Here is a more concrete proposal:
>>
>> Functional Support:
>> Each project shall provide a standard feature or functionality list.
>> Each platform shall include in its documentation a copy of this list with the supported functionality marked as supported.
>> The platform documentation may reference a ticket if support is planned but not yet present.
>> The platform documentation shall explicitly state if a feature or function has no plans for support.
>
> Regarding the last item, this would require all platform maintainers to
> update their documentation every time a new feature is added to the
> project's global list of features. This seems too much of a constraint
> and unnecessary maintenance burden to me.
>
> I think a better, more lightweight alternative might be to let platform
> maintainers list what's supported and if some feature is not listed, it
> implies that it is not supported. This does not prevent platform
> maintainers from indicating their future plans of supporting a feature
> if they want to.
>
>> The feature/functionality list shall be versioned, with the version tied to the release version(s) of the project.
>> In this way, it will be clear if a platform was last officially updated for version X but the project is currently at version Y > X.
>
> I can see that Joakim Bech proposed something similar, with more details
> about how this was implemented for OP-TEE.
>
>> Note: projects will need to adopt (if they have not already) a version scheme that distinguishes between feature updates and bug fixes.
>
> Sorry I didn't get this, could you please elaborate?
>
>> Each project and platform shall use tags or similar functionality on tickets to associate tickets to features/functionality and platforms.
>> If the names of tags can't match the name of the feature or platform exactly then a mapping shall be provided in the appropriate document(s).
>
> If there's no appropriate tag in some cases, I guess we could always use
> a git SHA1 of a specific commit.
>
>> Life Cycle State
>>
>> Fully Supported
>> There is (at least) one active code owner for this platform.
>> All supported features build and either all tests pass or failures are associated with tracked known issues.
>> Other (not associated to a test) Known Issues are tracked
>> Documentation is up to date
>>
>> Note: Projects should document standards on how "active" code ownership is measured and
>> further document standards on how code owners are warned about impending life cycle state changes.
>
> Yes, good point, that is currently undefined in the proposal but I agree
> that it needs defining per project. I will add an item in the last
> section of the document.
>
> I am starting to think that we need a list of items to be defined per
> project. This list would complement the general tf.org proposal. Things
> like code owners/maintainers activity, code review timelines, and so on.
>
>>
>> Orphan
>> There is no active code owner
>> All supported features build and either all tests pass or failures are associated with tracked known issues.
>> Other (not associated to a test) Known Issues may not have been maintained (as there is no active code owner)
>> Documentation status is unclear since there is no active code owner.
>> There has been no change to the feature/functionality list in the project since the platform was last "Fully Supported"
>
> I am confused, you said earlier that you would like to see the concepts
> of support and functionality split out, but here you're listing 'orphan'
> as one of the possible states... Did I miss your point?
>
>> Out of date
>> Same as orphan, but either:
>> there have been changes to the feature/functionality list, or
>> there are failing tests without tracked tickets, or
>> there are known documentation issues.
>>
>> Deprecated
>> Same as Out of Date, but the build is broken. Platform may be removed from the project codebase in the future.
>>
>> Erik Shreve, PSEM
>> Software Security Engineer & Architect (CMCU Platform Development)
Hi Raghu,
I do agree with you: case 2 and 3 are similar (wrongly formed DTB) and should lead to the same behavior.
A mandatory property miss, or a hit with a structurally incorrect node, means that the DTB doesn't follow the provided binding document. Such a DTB shouldn't be considered valid and should trigger a build failure and/or a code panic.
With the current implementation, cases 2 and 3 are similar. The property_getter() functions expect a specific format for the node. If a node is not found or is structurally incorrect, the function will return an error code, which will lead to a panic().
regards,
Louis
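To illustrate the distinction in code, here is a small sketch of what such property getters could look like on top of libfdt (the names and the exact panic() policy are illustrative, not the actual fconf implementation; panic() is assumed to come from TF-A's <common/debug.h>):

#include <common/debug.h>
#include <libfdt.h>

/* Mandatory property: a miss, or a structurally bad value, means the DTB
 * does not follow the binding, so refuse to continue. */
static uint32_t get_mandatory_u32(const void *dtb, int node, const char *prop)
{
        int len;
        const fdt32_t *val = fdt_getprop(dtb, node, prop, &len);

        if ((val == NULL) || (len != (int)sizeof(*val)))
                panic();

        return fdt32_to_cpu(*val);
}

/* Optional property: fall back to a default and carry on. */
static uint32_t get_optional_u32(const void *dtb, int node, const char *prop,
                                 uint32_t dflt)
{
        int len;
        const fdt32_t *val = fdt_getprop(dtb, node, prop, &len);

        if ((val == NULL) || (len != (int)sizeof(*val)))
                return dflt;

        return fdt32_to_cpu(*val);
}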
________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 06 April 2020 19:51
To: tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
Subject: Re: [TF-A] fconf: Validating config data
Thanks Louis. This new terminology helps. Let me spell this out again
just to make sure we're on the same page. The following are the cases:
1) Mandatory Property hit, nodes are structurally correct - Works normally.
2) Mandatory Property hit, nodes are structurally *incorrect* - Asserts
should catch structural issues during development because the system
integrator is expected not to make mistakes with the number of nodes, etc.
3) Mandatory Property miss - Panic(). Why is this case causing a panic,
but not case 2? You are allowing for someone to make the mistake of not
having a mandatory property, but assume that if a mandatory property is
present, it is always structurally sound. This is the part I have a
problem with. In my view, case 2 and case 3 should panic: the code should
not just assert on mandatory properties and must ALWAYS check for the
structural soundness of an FDT property.
Similarly there are 3 cases for optional properties.
My question: Why does case 3 panic, but not case 2 ? It sounds like your
assumption is that case 2 cannot happen. If case 2 cannot happen, i
claim that case 3 cannot happen either.
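To put the two styles side by side, a minimal sketch (made-up property name and length, not actual TF-A code):

    #include <assert.h>
    #include <libfdt.h>
    #include <common/debug.h>

    static void read_prop_two_ways(const void *dtb, int node, int expected_len)
    {
            int len;
            const void *prop;

            /* Current style: the miss is checked at runtime, the structure
             * only by assert(), which disappears in release builds. */
            prop = fdt_getprop(dtb, node, "cpus", &len);
            if (prop == NULL) {
                    panic();                     /* case 3: mandatory miss */
            }
            assert(len == expected_len);         /* case 2: gone in release */

            /* Style argued for here: both conditions checked unconditionally. */
            prop = fdt_getprop(dtb, node, "cpus", &len);
            if ((prop == NULL) || (len != expected_len)) {
                    panic();                     /* cases 2 and 3 both panic */
            }
    }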
Let me know what you think!
Thanks
Raghu
On 4/6/20 3:57 AM, Louis Mayencourt via TF-A wrote:
> Hi Raghu,
>
> Let me try to clarify/reword my idea:
>
> We complete the fconf documentation with a binding document, which
> defines the nodes that should be present in the config DTB to consider
> it as valid/well-formed. The document contains two kinds of nodes:
>
> * mandatory (critical): the firmware can't proceed without this
> information (example: load address, image UUID, ...). If this node
> is not present in the DTS, the build fails and/or the code panics.
>
> * optional (non-critical): the firmware can assume or assign a default
> value and proceed (example: uart config, enable authentication flag,
> ...). Such a property is used to influence the default behavior of
> the firmware.
>
> >> This is non-deterministic failure.
> The miss of a mandatory (critical) node will always lead to a build
> failure / code panic.
> The miss of an optional (non-critical) node should not influence the
> default behavior of the firmware.
>
> >> Is it not confusing to make the assumption that a DTB is "well
> >> formed", i.e. expect the build process/integrator to not mess up the
> >> structure or number of nodes but allow the same integrator to miss a critical property in the DTB?
> A DTB with a missing mandatory (critical) property should not be
> considered well formed.
>
> >> If there is a missing critical property, is that not a badly formed
> >> DTB?
> With the above definition, it is.
>
> >> And if so, why not check for badly formed DTB's uniformly in code
> >> and why only check for missing "critical" properties?
> With the rewording of "critical" / "non-critical" to "mandatory" /
> "optional", I think the situation is clearer: A DTB with a missing
> "optional" node/property is still considered well-formed.
>
> I hope I answered your questions and clarified the idea behind the design.
>
> regards,
> Louis
>
> ------------------------------------------------------------------------
> *From:* Raghu Krishnamurthy <raghu.ncstate(a)icloud.com>
> *Sent:* 05 April 2020 00:06
> *To:* Louis Mayencourt <Louis.Mayencourt(a)arm.com>
> *Subject:* Re: [TF-A] fconf: Validating config data
> Thanks Louis.
> >>you can imagine a well-formed DTB which contain a
> >>critical set of properties and can contain some optional properties.
>
> This makes things even more confusing. The assumption we are asking code
> to make is that the DTB is always "well formed", i.e. don't check for
> structural issues such as extra nodes etc, but we are still making the
> distinctions between critical and non-critical properties, that may or
> may not exist in the DTB, which may or may not cause a panic. This is
> non-deterministic failure.
> Is it not confusing to make the assumption that a DTB is "well
> formed",i.e expect the build process/integrator to not mess up the
> structure or number of nodes but allow the same integrator to miss a
> critical property in the DTB? If there is a missing critical property,
> is that not a badly formed DTB? And if so, why not check for badly
> formed DTB's uniformly in code and why only check for missing "critical"
> properties?
>
> -Raghu
>
> On 4/3/20 3:16 AM, Louis Mayencourt wrote:
>> Hi Raghu,
>>
>> I do agree that we need something similar to a binding document for
>> fconf properties. (similar to
>> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3694/2/docs/…).
>
>> At least for the common properties.
>>
>> The main idea behind the return code of the populator function was to
>> allow the code to handle non-critical property misses or to handle
>> critical failure by calling a platform hook.
>> With this in mind, you can imagine a well-formed DTB which contain a
>> critical set of properties and can contain some optional properties. The
>> return code and the populator "name" / "config" can be used to handle
>> these two cases.
>>
>> I tried to keep the design of fconf really simple, to leave room for
>> improvement according to feedback. Thanks for helping improve it!
>>
>> Regards,
>> Louis
>> ------------------------------------------------------------------------
>> *From:* TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Raghu
>> Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
>> *Sent:* 03 April 2020 10:15
>> *To:* tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
>> *Subject:* Re: [TF-A] fconf: Validating config data
>> A further point is that the fconf populators return an error code and
>> panic on error today. But if we are making the assumption that the
>> DTB's are well formed, do we really need to fail or even return an error
>> code?
>>
>> -Raghu
>>
>> On 4/3/20 1:51 AM, Raghu Krishnamurthy via TF-A wrote:
>>> Hi All, (Sorry for the long email)
>>>
>>> The review
>>> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3845
>>> attempts to fix bounds check in the fconf populator code for the
>>> topology and SP's. During review, Sandrine thoughtfully pointed out that
>>> there were discussions around bounds check along the same lines in the
>>> review
>>> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3492 and
>>> it was deemed sufficient to have assertions in code and it was safe to
>>> make the assumption that the DTB is always well formed and contains
>>> valid values. I think this email mostly echoes Sandrine's concern from
>>> review 3492.
>>>
>>> While I agree with the assumptions, I am generally of the opinion that
>>> we should validate/range-check any data, even if it is signed. Being
>>> signed does not necessarily mean the data is well formed/valid. If there
>>> is a mistake in the build process and it is validly signed, it is
>>> possible that we silently corrupt state/data that could later be used to
>>> exploit firmware and/or make debugging hard. This is probably far
>>> fetched, but the cost of adding the check is trivial to avoid this
>>> possibility.
>>>
>>> I imagine the case where you have secure partitions signed by a different
>>> entity than the silicon provider (dual root-of-trust). A silicon
>>> provider provides a dev system for the SP provider to test and validate
>>> the SP's on silicon. The silicon has production firmware (and hence no
>>> assertions), but loads signed data from the SP provider which has some
>>> invalid values. There could be silent corruption without any indication
>>> whatsoever about what went wrong and it may be hard to debug if/when
>>> there are issues.
>>> Also, testing does not necessarily catch all invalid values since you
>>> will likely not get 100% coverage, given the number of config options
>>> available. Moreover, the code today is not consistent in asserting on
>>> every property for valid values and the failure mode is not
>>> consistent/deterministic. It seems like every config option should have
>>> a list of valid values or a range of acceptable values that must be at a
>>> minimum asserted on.
>>> I also wouldn't discount platforms such as RPI, where TRUSTED_BOARD_BOOT
>>> is likely to be turned off since it really does not provide any
>>> security, so assuming we always have signed data might not be valid.
>>>
>>> Anyway, is this decision worth revisiting? Too paranoid perhaps? :P
>>>
>>>
>>> Thanks
>>> Raghu
>> --
>> TF-A mailing list
>> TF-A(a)lists.trustedfirmware.org
>> https://lists.trustedfirmware.org/mailman/listinfo/tf-a
>
--
TF-A mailing list
TF-A(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-a
Hi Raghu,
Let me try to clarify/reword my idea:
We complete the fconf documentation with a binding document, which defines the nodes that should be present in the config DTB to consider it as valid/well-formed. The document contains two kinds of nodes:
* mandatory (critical): the firmware can't proceed without this information (example: load address, image UUID, ...). If this node is not present in the DTS, the build fails and/or the code panics.
* optional (non-critical): the firmware can assume or assign a default value and proceed (example: uart config, enable authentication flag, ...). Such a property is used to influence the default behavior of the firmware.
>> This is non-deterministic failure.
The miss of a mandatory (critical) node will always lead to a build failure / code panic.
The miss of an optional (non-critical) node should not influence the default behavior of the firmware.
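As a sketch of what this behaviour looks like in code (hypothetical property names and default value, not the actual implementation):

    #include <stdint.h>
    #include <libfdt.h>
    #include <common/debug.h>

    #define DEFAULT_UART_CLOCK_HZ   24000000U    /* illustrative default */

    static void populate_sketch(const void *dtb, int node)
    {
            int len;
            const fdt32_t *prop;
            uint32_t load_addr, uart_clock;

            /* Mandatory (critical): a miss is terminal. */
            prop = fdt_getprop(dtb, node, "load-address", &len);
            if ((prop == NULL) || (len != (int)sizeof(uint32_t))) {
                    panic();
            }
            load_addr = fdt32_to_cpu(*prop);

            /* Optional (non-critical): a miss falls back to a default. */
            prop = fdt_getprop(dtb, node, "uart-clock", &len);
            if ((prop == NULL) || (len != (int)sizeof(uint32_t))) {
                    uart_clock = DEFAULT_UART_CLOCK_HZ;
            } else {
                    uart_clock = fdt32_to_cpu(*prop);
            }

            (void)load_addr;        /* values would feed the fconf getters */
            (void)uart_clock;
    }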
>> Is it not confusing to make the assumption that a DTB is "well formed", i.e. expect the build process/integrator to not mess up the
>> structure or number of nodes but allow the same integrator to miss a critical property in the DTB?
A DTB with a missing mandatory (critical) property should not be considered well formed.
>> If there is a missing critical property, is that not a badly formed DTB?
With the above definition, it is.
>> And if so, why not check for badly formed DTB's uniformly in code and why only check for missing "critical" properties?
With the rewording of "critical" / "non-critical" to "mandatory" / "optional", I think the situation is clearer: A DTB with a missing "optional" node/property is still considered well-formed.
I hope I answered your questions and clarified the idea behind the design.
regards,
Louis
________________________________
From: Raghu Krishnamurthy <raghu.ncstate(a)icloud.com>
Sent: 05 April 2020 00:06
To: Louis Mayencourt <Louis.Mayencourt(a)arm.com>
Subject: Re: [TF-A] fconf: Validating config data
Thanks Louis.
>>you can imagine a well-formed DTB which contain a
>>critical set of properties and can contain some optional properties.
This makes things even more confusing. The assumption we are asking code
to make is that the DTB is always "well formed", i.e. don't check for
structural issues such as extra nodes etc, but we are still making the
distinctions between critical and non-critical properties, that may or
may not exist in the DTB, which may or may not cause a panic. This is
non-deterministic failure.
Is it not confusing to make the assumption that a DTB is "well
formed",i.e expect the build process/integrator to not mess up the
structure or number of nodes but allow the same integrator to miss a
critical property in the DTB? If there is a missing critical property,
is that not a badly formed DTB? And if so, why not check for badly
formed DTB's uniformly in code and why only check for missing "critical"
properties?
-Raghu
On 4/3/20 3:16 AM, Louis Mayencourt wrote:
> Hi Raghu,
>
> I do agree that we need something similar to a binding document for
> fconf properties. (similar to
> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3694/2/docs/…).
> At least for the common properties.
>
> The main idea behind the return code of the populator function was to
> allow the code to handle non-critical property misses or to handle
> critical failure by calling a platform hook.
> With this in mind, you can imagine a well-formed DTB which contain a
> critical set of properties and can contain some optional properties. The
> return code and the populator "name" / "config" can be used to handle
> these two cases.
>
> I tried to keep the design of fconf really simple, to leave room for
> improvement according to feedback. Thanks for helping improve it!
>
> Regards,
> Louis
> ------------------------------------------------------------------------
> *From:* TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Raghu
> Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
> *Sent:* 03 April 2020 10:15
> *To:* tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
> *Subject:* Re: [TF-A] fconf: Validating config data
> A further point is that the fconf populators return an error code and
> panic on error today. But if we are making the assumption that the
> DTB's are well formed, do we really need to fail or even return an error
> code?
>
> -Raghu
>
> On 4/3/20 1:51 AM, Raghu Krishnamurthy via TF-A wrote:
>> Hi All, (Sorry for the long email)
>>
>> The review
>> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3845
>> attempts to fix bounds check in the fconf populator code for the
>> topology and SP's. During review, Sandrine thoughtfully pointed out that
>> there were discussions around bounds check along the same lines in the
>> review
>> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3492 and
>> it was deemed sufficient to have assertions in code and it was safe to
>> make the assumption that the DTB is always well formed and contains
>> valid values. I think this email mostly echoes Sandrine's concern from
>> review 3492.
>>
>> While I agree with the assumptions, I am generally of the opinion that
>> we should validate/range-check any data, even if it is signed. Being
>> signed does not necessarily mean the data is well formed/valid. If there
>> is a mistake in the build process and it is validly signed, it is
>> possible that we silently corrupt state/data that could later be used to
>> exploit firmware and/or make debugging hard. This is probably far
>> fetched, but the cost of adding the check is trivial to avoid this
>> possibility.
>>
>> I imagine the case where you have secure partitions signed by a different
>> entity than the silicon provider (dual root-of-trust). A silicon
>> provider provides a dev system for the SP provider to test and validate
>> the SP's on silicon. The silicon has production firmware (and hence no
>> assertions), but loads signed data from the SP provider which has some
>> invalid values. There could be silent corruption without any indication
>> whatsoever about what went wrong and it may be hard to debug if/when
>> there are issues.
>> Also, testing does not necessarily catch all invalid values since you
>> will likely not get 100% coverage, given the number of config options
>> available. Moreover, the code today is not consistent in asserting on
>> every property for valid values and the failure mode is not
>> consistent/deterministic. It seems like every config option should have
>> a list of valid values or a range of acceptable values that must be at a
>> minimum asserted on.
>> I also wouldn't discount platforms such as RPI, where TRUSTED_BOARD_BOOT
>> is likely to be turned off since it really does not provide any
>> security, so assuming we always have signed data might not be valid.
>>
>> Anyway, is this decision worth revisiting? Too paranoid perhaps? :P
>>
>>
>> Thanks
>> Raghu
> --
> TF-A mailing list
> TF-A(a)lists.trustedfirmware.org
> https://lists.trustedfirmware.org/mailman/listinfo/tf-a
+tf-a list
On 4/4/20 4:06 PM, Raghu Krishnamurthy wrote:
> Thanks Louis.
> >>you can imagine a well-formed DTB which contain a
> >>critical set of properties and can contain some optional properties.
>
> This makes things even more confusing. The assumption we are asking code
> to make is that the DTB is always "well formed", i.e. don't check for
> structural issues such as extra nodes etc, but we are still making the
> distinctions between critical and non-critical properties, that may or
> may not exist in the DTB, which may or may not cause a panic. This is
> non-deterministic failure.
> Is it not confusing to make the assumption that a DTB is "well
> formed",i.e expect the build process/integrator to not mess up the
> structure or number of nodes but allow the same integrator to miss a
> critical property in the DTB? If there is a missing critical property,
> is that not a badly formed DTB? And if so, why not check for badly
> formed DTB's uniformly in code and why only check for missing "critical"
> properties?
>
> -Raghu
>
> On 4/3/20 3:16 AM, Louis Mayencourt wrote:
>> Hi Raghu,
>>
>> I do agree that we need something similar to a binding document for
>> fconf properties. (similar to
>> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3694/2/docs/…).
>> At least for the common properties.
>>
>> The main idea behind the return code of the populator function was to
>> allow the code to handle non-critical property misses or to handle
>> critical failure by calling a platform hook.
>> With this in mind, you can imagine a well-formed DTB which contain a
>> critical set of properties and can contain some optional properties.
>> The return code and the populator "name" / "config" can be used to
>> handle these two cases.
>>
>> I tried to keep the design of fconf really simple, to leave room for
>> improvement according to feedback. Thanks for helping improve it!
>>
>> Regards,
>> Louis
>> ------------------------------------------------------------------------
>> *From:* TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of
>> Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
>> *Sent:* 03 April 2020 10:15
>> *To:* tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
>> *Subject:* Re: [TF-A] fconf: Validating config data
>> A further point is that the fconf populators return an error code and
>> panic on error today. But if we are making the assumption that the
>> DTB's are well formed, do we really need to fail or even return an error
>> code?
>>
>> -Raghu
>>
>> On 4/3/20 1:51 AM, Raghu Krishnamurthy via TF-A wrote:
>>> Hi All, (Sorry for the long email)
>>>
>>> The review
>>> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3845
>>> attempts to fix bounds check in the fconf populator code for the
>>> topology and SP's. During review, Sandrine thoughtfully pointed out
>>> that there were discussions around bounds check along the same lines
>>> in the review
>>> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3492 and
>>> it was deemed sufficient to have assertions in code and it was safe
>>> to make the assumption that the DTB is always well formed and
>>> contains valid values. I think this email mostly echoes Sandrine's
>>> concern from review 3492.
>>>
>>> While I agree with the assumptions, I am generally of the opinion
>>> that we should validate/range-check any data, even if it is signed.
>>> Being signed does not necessarily mean the data is well formed/valid.
>>> If there is a mistake in the build process and it is validly signed,
>>> it is possible that we silently corrupt state/data that could later
>>> be used to exploit firmware and/or make debugging hard. This is
>>> probably far fetched, but the cost of adding the check is trivial to
>>> avoid this possibility.
>>>
>>> I imagine the case where you have secure partitions signed by
>>> a different entity than the silicon provider (dual root-of-trust).
>>> A silicon provider provides a dev system for the SP provider to test
>>> and validate the SP's on silicon. The silicon has production
>>> firmware (and hence no assertions), but loads signed data from the SP
>>> provider which has some invalid values. There could be silent
>>> corruption without any indication whatsoever about what went wrong
>>> and it may be hard to debug if/when there are issues.
>>> Also, testing does not necessarily catch all invalid values since you
>>> will likely not get 100% coverage, given the number of config options
>>> available. Moreover, the code today is not consistent in asserting
>>> on every property for valid values and the failure mode is not
>>> consistent/deterministic. It seems like every config option should
>>> have a list of valid values or a range of acceptable values that must
>>> be at a minimum asserted on.
>>> I also wouldn't discount platforms such as RPI, where
>>> TRUSTED_BOARD_BOOT is likely to be turned off since it really does
>>> not provide any security, so assuming we always have signed data
>>> might not be valid.
>>>
>>> Anyway, is this decision worth revisiting? Too paranoid perhaps? :P
>>>
>>>
>>> Thanks
>>> Raghu
>> --
>> TF-A mailing list
>> TF-A(a)lists.trustedfirmware.org
>> https://lists.trustedfirmware.org/mailman/listinfo/tf-a
>
Hi Raghu,
On 4/3/20 1:38 AM, Raghu Krishnamurthy via TF-A wrote:
> Thanks Sandrine. Patches look good.
Thanks for the review!
> I realized after looking at things a little closer that I had
> misunderstood how fconf works for io policies. I thought the image id's
> themselves came from the config files and not just the UUID's, which is
> why I was worried about bounds checks, since the id was coming from an
> external source (trusted or untrusted, depending on whether it is signed data
> or not).
> This also made me realize that we are using another table built into
> code to convert from image id to UUID for io policies. Is there a
> reason image id's also can't be discovered from the config file?
I remember some internal discussions around this topic a few weeks ago.
If I recall correctly, the current thinking is that down the line, we
would like to move image IDs to DTBs but this looks complicated to
achieve today because image IDs are used by several components in TF-A
to tie things together. More work would be needed to abstract this
properly everywhere.
I think other folks in the team (Olivier? Manish? Louis?) might be able
to comment further on this.
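For readers less familiar with this part of the code, the table being referred to is roughly of the following shape (a simplified, hypothetical sketch, not the real structures; in the tree the actual mapping lives in the platform I/O policy code and the fconf io-policy populator):

    #include <stdint.h>
    #include <stddef.h>

    /* Image IDs are compile-time constants used as table indices; only the
     * UUID values are filled in from the configuration file. */
    #define BL31_IMAGE_ID           3U
    #define BL32_IMAGE_ID           4U
    #define MAX_NUMBER_OF_IMAGES    32U

    struct image_uuid {
            uint8_t bytes[16];
    };

    /* Filled by the io-policy populator from the config DTB. */
    static struct image_uuid image_uuids[MAX_NUMBER_OF_IMAGES];

    static const struct image_uuid *uuid_for_image(unsigned int image_id)
    {
            if (image_id >= MAX_NUMBER_OF_IMAGES) {
                    return NULL;
            }
            return &image_uuids[image_id];
    }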
Regards,
Sandrine
A further point is that the fconf populators return an error code and
panic on error today. But if we are making the assumption that the
DTB's are well formed, do we really need to fail or even return an error
code?
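For reference, the pattern in question is roughly the following (a simplified sketch; the real code registers populators through a macro and walks a list, and the exact signatures may differ):

    #include <stdint.h>
    #include <common/debug.h>

    typedef int (*fconf_populate_fn_t)(uintptr_t config);

    /* Parses its part of the config DTB; returns non-zero on missing or
     * invalid data. */
    static int topology_populate(uintptr_t config)
    {
            (void)config;
            return 0;
    }

    static const fconf_populate_fn_t populators[] = {
            topology_populate,
    };

    static void populate_all(uintptr_t config)
    {
            for (unsigned int i = 0U;
                 i < (sizeof(populators) / sizeof(populators[0])); i++) {
                    if (populators[i](config) != 0) {
                            ERROR("fconf: populator %u failed\n", i);
                            panic();
                    }
            }
    }

If the well-formedness assumption really held, the error path above would be dead code, which is the point being made.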
-Raghu
On 4/3/20 1:51 AM, Raghu Krishnamurthy via TF-A wrote:
> Hi All, (Sorry for the long email)
>
> The review
> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3845
> attempts to fix bounds check in the fconf populator code for the
> topology and SP's. During review, Sandrine thoughtfully pointed out that
> there were discussions around bounds check along the same lines in the
> review
> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3492 and
> it was deemed sufficient to have assertions in code and it was safe to
> make the assumption that the DTB is always well formed and contains
> valid values. I think this email mostly echoes Sandrine's concern from
> review 3492.
>
> While I agree with the assumptions, I am generally of the opinion that
> we should validate/range-check any data, even if it is signed. Being
> signed does not necessarily mean the data is well formed/valid. If there
> is a mistake in the build process and it is validly signed, it is
> possible that we silently corrupt state/data that could later be used to
> exploit firmware and/or make debugging hard. This is probably far
> fetched, but the cost of adding the check is trivial to avoid this
> possibility.
>
> I imagine the case where you have secure partitions signed by a different
> entity than the silicon provider (dual root-of-trust). A silicon
> provider provides a dev system for the SP provider to test and validate
> the SP's on silicon. The silicon has production firmware (and hence no
> assertions), but loads signed data from the SP provider which has some
> invalid values. There could be silent corruption without any indication
> whatsoever about what went wrong and it may be hard to debug if/when
> there are issues.
> Also, testing does not necessarily catch all invalid values since you
> will likely not get 100% coverage, given the number of config options
> available. Moreover, the code today is not consistent in asserting on
> every property for valid values and the failure mode is not
> consistent/deterministic. It seems like every config option should have
> a list of valid values or a range of acceptable values that must be at a
> minimum asserted on.
> I also wouldn't discount platforms such as RPI, where TRUSTED_BOARD_BOOT
> is likely to be turned off since it really does not provide any
> security, so assuming we always have signed data might not be valid.
>
> Anyway, is this decision worth revisiting? Too paranoid perhaps? :P
>
>
> Thanks
> Raghu
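To make the kind of range check being asked for concrete, a minimal sketch (hypothetical property and limit names; not a patch against the actual populator):

    #include <stdint.h>
    #include <libfdt.h>
    #include <common/debug.h>

    #define MAX_CLUSTER_COUNT       8U      /* illustrative platform limit */

    static void check_topology_sketch(const void *dtb, int node)
    {
            int len;
            const fdt32_t *prop;
            uint32_t cluster_count;

            prop = fdt_getprop(dtb, node, "cluster-count", &len);
            if ((prop == NULL) || (len != (int)sizeof(uint32_t))) {
                    panic();        /* mandatory property missing or malformed */
            }

            cluster_count = fdt32_to_cpu(*prop);
            if ((cluster_count == 0U) || (cluster_count > MAX_CLUSTER_COUNT)) {
                    panic();        /* validly signed, but out of range */
            }
    }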