Hi All,
The next TF-A Tech Forum is scheduled for Thu 19th November 2020 16:00 – 17:00 (GMT).
Please note UK entered Daylight Saving on 25th October when clocks went back one hour to go to GMT from BST.
A reoccurring meeting invite has been sent out to the subscribers of this TF-A mailing list. If you don’t have this please let me know.
Agenda:
* Trace-based Code Coverage Tooling for Firmware Projects
* Presented by Basil Eljuse & Saul Romero
* Summary
* TF-A has adopted a code coverage system to measure the effectiveness of the various runtime testing performed and this is achieved without doing code instrumentation. This presentation is an overview of that approach which has applicability beyond the TF-A project.
* Optional TF-A Mailing List Topic Discussions
If TF-A contributors have anything they wish to present at any future TF-A tech forum please contact me to have that scheduled.
Previous sessions, both recording and presentation material can be found on the trustedfirmware.org TF-A Technical meeting webpage: https://www.trustedfirmware.org/meetings/tf-a-technical-forum/
A scheduling tracking page is also available to help track sessions suggested and being prepared: https://developer.trustedfirmware.org/w/tf_a/tf-a-tech-forum-scheduling/ Final decisions on what will be presented will be shared a few days before the next meeting and shared on the TF-A mailing list.
Thanks
Joanna
Hi All,
i've been trying for a couple of days to load current optee together with atf from the github, with no luck. For that i have been using manuals (https://trustedfirmware-a.readthedocs.io/en/latest/plat/rockchip.html , http://opensource.rock-chips.com/wiki_ATF ) for rockchip and what i have succeded is to run very old atf tag 1.3 with optee blob from rockchip which i suspect is also very old. And if i try to use any newer atf >= 1.4 than i suspect that it does not load, because i do not see any LOGS from it, no matter that LOG_LEVEL was set to 50...
Can anybody help me solving this issue or point me to some solution? Thanks in advance.
BR
Piotr Łobacz
[https://softgent.com/wp-content/uploads/2020/01/Zasob-14.png]<https://www.softgent.com>
Softgent Sp. z o.o., Budowlanych 31d, 80-298 Gdansk, POLAND
KRS: 0000674406, NIP: 9581679801, REGON: 367090912
www.softgent.com
Sąd Rejonowy Gdańsk-Północ w Gdańsku, VII Wydział Gospodarczy Krajowego Rejestru Sądowego
KRS 0000674406, Kapitał zakładowy: 25 000,00 zł wpłacony w całości.
Hi Julius,
The TF-A build sets the `-mstrict-align` compiler option, that means that the compiler will generate only aligned accesses as default. So disabling the check at runtime, seems `unaligned` with the compiler option IMO. The programmer can induce unaligned access though, although I am struggling to understand why the programmer would want to do that. The case you have pointed out seems like an error which needs to be corrected in platform code IMO.
> Well, my assumption is that performing an unaligned access cannot be a
> vulnerability.
TF-A has linker symbol references and non-trivial linker section manipulations at runtime and these have previously triggered alignment errors when binary layout changed due to code issues. Disabling the alignment check at runtime means these errors would have remained undetected and would have resulted in more obscure crash at runtime or even a vulnerablility. In some cases, only the RELEASE builds (DEBUG=0) triggered the alignment fault.
> There are scenarios where unaligned accesses can be intentional, but I am not too worried about those -- there are ways to work around that, and if we make it policy that those accesses always need to be written that way I'm okay with that. What I am worried about is mistakes and oversights that slip through testing and then cause random or even attacker-controllable crashes in production. If the main reason you're enabling this flag is to help with early detection of coding mistakes, I wonder if the best approach would be to enable it for DEBUG=1 and disable it for DEBUG=0 (of course, if there are really security issues associated with it like you mentioned above, that wouldn't make sense).
Agree. But I am failing to understand why the unaligned accesses came to be there in the first place. My worry is, if this was not intended by the programmer, this might as well be a bug that will remain undetected if the alignment check is disabled.
Best Regards
Soby Mathew
> -----Original Message-----
> From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Julius
> Werner via TF-A
> Sent: 10 November 2020 02:31
> To: raghu.ncstate(a)icloud.com
> Cc: tf-a <tf-a(a)lists.trustedfirmware.org>
> Subject: Re: [TF-A] Alignment fault checking in EL3
>
> > >>. What I am worried about is mistakes and oversights that slip
> > >>through testing and then cause random or even attacker-controllable
> > >>crashes in production
> >
> > Why would this not be the case even with SCTLR.A bit set? I'm not seeing
> the relation between the crash and SCTLR.A bit. If there are oversights that
> slip through testing and an attacker can cause a crash, there is
> vulnerability/bug.
> > If there is something that slipped through, that causes a data
> abort(perhaps address not mapped in the translation tables), would we turn
> off the MMU ?
>
> Well, my assumption is that performing an unaligned access cannot be a
> vulnerability. If you have examples to the contrary please let me know, but
> as far as I am aware letting an unaligned access through should always work
> and will always result in the behavior the programmer intended (other than
> maybe contrived cases where device address space is mapped with a non-
> Device memory type, which is already very wrong for plenty of other reasons
> and should never happen). The only negative consequence is that the access
> may take slightly longer than if it were aligned. I do understand the desire to
> be able to shake out unaligned accesses during development and testing
>
> I don't think the comparison to a data abort works because obviously with a
> data abort the program usually can't just continue and still assume its internal
> state is valid.
Cross-posting on the TF-A mailing list too for interested people to attend the kickoff meeting listed below hosted by Linaro.
Thanks
Matteo
-----Original Message-----
From: Linaro-open-discussions <linaro-open-discussions-bounces(a)op-lists.linaro.org> On Behalf Of Ulf Hansson via Linaro-open-discussions
Sent: 10 November 2020 10:21
To: linaro-open-discussions(a)op-lists.linaro.org
Subject: [Linaro-open-discussions] Kickoff: Extend PSCI with OS-initiated mode in TF-A
Hi all,
As previously announced we are hosting a kickoff meeting for the above topic/project.
The meeting has been added to the linaro-open-discussions calendar, according to the below. Feel free to join us on Thursday this week!
Kind regards
Ulf Hansson, Linaro KWG
When:
Thursday Nov 12, 16:00-17:00 CET.
Agenda:
1. Introduction.
2. Update of the current support in Linux.
3. Collaborations.
4. Technical things.
5. Other matters.
To join the Zoom Meeting:
https://linaro-org.zoom.us/j/97261221687
Meeting ID: 972 6122 1687
One tap mobile
+13462487799,,97261221687# US (Houston)
+16465588656,,97261221687# US (New York)
Dial by your location
+1 346 248 7799 US (Houston)
+1 646 558 8656 US (New York)
+1 669 900 9128 US (San Jose)
+1 253 215 8782 US (Tacoma)
+1 301 715 8592 US (Germantown)
+1 312 626 6799 US (Chicago)
Meeting ID: 972 6122 1687
Find your local number: https://linaro-org.zoom.us/u/acg3UrpmdE
--
Linaro-open-discussions mailing list
https://collaborate.linaro.org/display/LOD/Linaro+Open+Discussions+Homehttps://op-lists.linaro.org/mailman/listinfo/linaro-open-discussions
> >>. What I am worried about is mistakes and
> >>oversights that slip through testing and then cause random or even
> >>attacker-controllable crashes in production
>
> Why would this not be the case even with SCTLR.A bit set? I'm not seeing the relation between the crash and SCTLR.A bit. If there are oversights that slip through testing and an attacker can cause a crash, there is vulnerability/bug.
> If there is something that slipped through, that causes a data abort(perhaps address not mapped in the translation tables), would we turn off the MMU ?
Well, my assumption is that performing an unaligned access cannot be a
vulnerability. If you have examples to the contrary please let me
know, but as far as I am aware letting an unaligned access through
should always work and will always result in the behavior the
programmer intended (other than maybe contrived cases where device
address space is mapped with a non-Device memory type, which is
already very wrong for plenty of other reasons and should never
happen). The only negative consequence is that the access may take
slightly longer than if it were aligned. I do understand the desire to
be able to shake out unaligned accesses during development and
testing, but at least in a production build I think taking that
performance hit would always be preferable to a crash.
I don't think the comparison to a data abort works because obviously
with a data abort the program usually can't just continue and still
assume its internal state is valid.
Hi Julius,
A lot of the rationale for the default initialization is not listed in the documentation and I do think perhaps we can do better as at the end of the day it’s decisions around platform ports and how they are going to be deployed in products that may influence decisions here for changing the defaults.
Your thoughts around attacker control able crashes are a valid one. I don’t think any one setting is all pros with no cons and platform providers need to make decisions but I think the default code with the settings is more likely to flush out alignment bugs and although I don’t have specific examples any bug could play a part in future vulnerability so best to flush them out rather than lie hidden.
It could be said that in the reference implementation a known crash may be better than a more nebulous unknown behaviour that could come with an undetected bug that could later be taken advantage of.
The use a DEBUG flag is an idea but we have all seen debug builds not operating like production builds so the value may be lost here.
It’s why on reviewing our documentation on seeing your post I think we could do better to explain the rationale for the settings in TF-A so platform owners can be better informed and override default settings if they really feel they have the need.
Cheers
Joanna
________________________________
From: Julius Werner
Sent: Monday, November 9, 2020 10:42 PM
To: Joanna Farley
Cc: raghu.ncstate(a)icloud.com; tf-a(a)lists.trustedfirmware.org
Subject: Re: [TF-A] Alignment fault checking in EL3
Hi Joanna,
Thank you for your detailed response. I was mostly wondering if this
was a deliberate choice or just something someone wrote this way once
and nobody ever thought about again. Since it sounds like it is
intentional, I'll try to understand your rationale better.
> Maintaining these settings in production is also advised as best practice as it is known any such defects can possibly play a part in allowing an actor to take advantage of the defect as part of a vulnerability and memory related defects in particular can be taken advantage of. So it seems prudent to guard against them.
I'm curious, can you elaborate on that? I can't really think of a
scenario where lack of alignment checking can really make code behave
in an unintentional way. (I don't think Raghu's scenario of accessing
MMIO registers works because those registers should be mapped with a
Device memory type anyway, and that memory type enforces aligned
accesses regardless of the SCTLR.A flag.) If there truly are security
concerns with this I can understand your approach much better.
> Saying all this for those few cases where unaligned accesses are correct and intentional we could look to define better ways to handle these and provide that in reference code as an option to try and get the best for robustness and debuggability which seemed a big part of the concern in your post. Team members in Arm have ideas on how that could be provided but it needs broader discussion on the implications for security and performance before taking forward.
There are scenarios where unaligned accesses can be intentional, but I
am not too worried about those -- there are ways to work around that,
and if we make it policy that those accesses always need to be written
that way I'm okay with that. What I am worried about is mistakes and
oversights that slip through testing and then cause random or even
attacker-controllable crashes in production. If the main reason you're
enabling this flag is to help with early detection of coding mistakes,
I wonder if the best approach would be to enable it for DEBUG=1 and
disable it for DEBUG=0 (of course, if there are really security issues
associated with it like you mentioned above, that wouldn't make
sense).
Hi Julius,
Talking to the team here its has always been felt it is best practise to forbid unaligned data accesses in Arm's embedded projects. That's the reason TF-A also enforces the build options: -mno-unaligned-access (AArch32) and -mstrict-align (AArch64). In the TF-A documentation [1] SCTLR_EL3.A and other settings are listed in the Architectural Initialisation section.
There is thought to be only a few cases where unaligned access are correct and intentional and most are defects that should be caught early and not in production and as such it is better to detect unaligned data access conditions early in a platform porting, rather than in the field especially if there is no firmware update capability. Alignment faults should be treated as fatal as they should never happen in production when components have been designed as such from the beginning. Maintaining these settings in production is also advised as best practice as it is known any such defects can possibly play a part in allowing an actor to take advantage of the defect as part of a vulnerability and memory related defects in particular can be taken advantage of. So it seems prudent to guard against them.
We can of course provide better rationalisation based on the above in our documentation to provide platforms porting efforts better guidance. It is after all up to partners in their platform ports to make appropriate decisions for their ports where settings could be changed however the upstreamed reference code should follow the settings we have which are felt to be best practices for TF-A.
It is known other projects follow other practices and that is fine if that works for them. Once a project enables unaligned accesses, it's very hard to go back again since it's likely that code that does unaligned accesses will slowly get added to the project. However for TF-A maintaining the current settings is felt to be the best approach when weighing up the costs and benefits.
Saying all this for those few cases where unaligned accesses are correct and intentional we could look to define better ways to handle these and provide that in reference code as an option to try and get the best for robustness and debuggability which seemed a big part of the concern in your post. Team members in Arm have ideas on how that could be provided but it needs broader discussion on the implications for security and performance before taking forward.
Thanks
Joanna
[1] https://trustedfirmware-a.readthedocs.io/en/latest/design/firmware-design.h…
On 09/11/2020, 17:47, "TF-A on behalf of Raghu Krishnamurthy via TF-A" <tf-a-bounces(a)lists.trustedfirmware.org on behalf of tf-a(a)lists.trustedfirmware.org> wrote:
Hi Julius,
I tend to agree with your argument about not using SCTLR.A bit but I think the unexpected crashes or instability is due buggy code and insufficient validation of invariants such as aligned pointers, irrespective of whether SCTRLR.A is set or unset.
Even if we allow unaligned accesses, we could have buggy code that access registers that typically have to be size aligned and we wouldn’t catch those bugs with SCTLR.A. Worse, some hardware implementations have undefined/impdef behavior when there are unaligned access to MMIO registers for ex, in which case, I would rather take an alignment fault at the core than allow triggering of undefined behavior.
So I don’t think stability/reliability and the use of SCTLR.A bit are related. If TF-A's position is that we want to allow only aligned accesses in EL3(for whatever reason, I can only think of efficiency), it is the code's responsibility to enforce this invariants using asserts or explicit checks.
>> I am still wondering why we choose to set the SCTLR_EL3.A
I think this is the relevant question. If there are good security reasons(which I don’t know about), I would say we should keep it. If it is for efficiency, given the way recent ARM64 cores are performing, I wouldn't have a problem with SCTLR.A=0.
Thanks
Raghu
-----Original Message-----
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Julius Werner via TF-A
Sent: Friday, November 6, 2020 6:38 PM
To: tf-a <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] Alignment fault checking in EL3
Hi,
I just debugged a TF-A boot crash that turned out to be caused by an alignment fault in platform code. Someone had defined some static storage space as a uint8_t array, and then accessed it by dereferencing uint16_t pointers.
Of course this is ultimately a bug in the platform code that should be fixed, but I am still wondering why we choose to set the SCTLR_EL3.A (Alignment fault checking) flag in TF-A? In an ideal world, maybe we could say that code which can generate alignment faults should not exist -- but, unfortunately, people make mistakes, and this kind of mistake may linger unnoticed for a long time in the codebase before randomly getting triggered due to subtle shifts in the binary's memory layout. (Worse, in some situations this could get affected by SMC parameters passed in from lower exception levels, so it would only be noticeable and could possibly be intentionally triggered if the lower exception level passes in just the right values.)
For that reason, most other environments I know (e.g. Linux) always keep that flag cleared. There's no harm in that -- as far as I'm aware all aarch64 cores are required to support unaligned accesses to cached memory types, and the worst that would happen is a slight performance penalty for the access. I think that flag is mostly meant as a debugging feature to be able to shake out accidental unaligned accesses from your code? If our goal is to be stable and reliable firmware, shouldn't we disable it to reduce the chance of unexpected crashes?
--
TF-A mailing list
TF-A(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-a
Hi Julius,
I tend to agree with your argument about not using SCTLR.A bit but I think the unexpected crashes or instability is due buggy code and insufficient validation of invariants such as aligned pointers, irrespective of whether SCTRLR.A is set or unset.
Even if we allow unaligned accesses, we could have buggy code that access registers that typically have to be size aligned and we wouldn’t catch those bugs with SCTLR.A. Worse, some hardware implementations have undefined/impdef behavior when there are unaligned access to MMIO registers for ex, in which case, I would rather take an alignment fault at the core than allow triggering of undefined behavior.
So I don’t think stability/reliability and the use of SCTLR.A bit are related. If TF-A's position is that we want to allow only aligned accesses in EL3(for whatever reason, I can only think of efficiency), it is the code's responsibility to enforce this invariants using asserts or explicit checks.
>> I am still wondering why we choose to set the SCTLR_EL3.A
I think this is the relevant question. If there are good security reasons(which I don’t know about), I would say we should keep it. If it is for efficiency, given the way recent ARM64 cores are performing, I wouldn't have a problem with SCTLR.A=0.
Thanks
Raghu
-----Original Message-----
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Julius Werner via TF-A
Sent: Friday, November 6, 2020 6:38 PM
To: tf-a <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] Alignment fault checking in EL3
Hi,
I just debugged a TF-A boot crash that turned out to be caused by an alignment fault in platform code. Someone had defined some static storage space as a uint8_t array, and then accessed it by dereferencing uint16_t pointers.
Of course this is ultimately a bug in the platform code that should be fixed, but I am still wondering why we choose to set the SCTLR_EL3.A (Alignment fault checking) flag in TF-A? In an ideal world, maybe we could say that code which can generate alignment faults should not exist -- but, unfortunately, people make mistakes, and this kind of mistake may linger unnoticed for a long time in the codebase before randomly getting triggered due to subtle shifts in the binary's memory layout. (Worse, in some situations this could get affected by SMC parameters passed in from lower exception levels, so it would only be noticeable and could possibly be intentionally triggered if the lower exception level passes in just the right values.)
For that reason, most other environments I know (e.g. Linux) always keep that flag cleared. There's no harm in that -- as far as I'm aware all aarch64 cores are required to support unaligned accesses to cached memory types, and the worst that would happen is a slight performance penalty for the access. I think that flag is mostly meant as a debugging feature to be able to shake out accidental unaligned accesses from your code? If our goal is to be stable and reliable firmware, shouldn't we disable it to reduce the chance of unexpected crashes?