New subject: Alignment fault checking in EL3

9 Nov 2020


      Hi Julius,
Talking to the team here, it has always been felt that it is best practice to forbid unaligned data accesses in Arm's embedded projects. That is the reason TF-A also enforces the build options -mno-unaligned-access (AArch32) and -mstrict-align (AArch64). In the TF-A documentation [1], SCTLR_EL3.A and other settings are listed in the Architectural Initialisation section.
It is thought that there are only a few cases where unaligned accesses are correct and intentional; most are defects. It is better to detect such defects early in a platform port rather than in the field, especially if there is no firmware update capability. Alignment faults should be treated as fatal, since they should never happen in production when components have been designed with this in mind from the beginning. Maintaining these settings in production is also advised as best practice: such defects, and memory-related defects in particular, can help an attacker exploit a vulnerability. So it seems prudent to guard against them.
We can of course provide a better rationale, based on the above, in our documentation to give platform porting efforts better guidance. It is, after all, up to partners to make appropriate decisions for their platform ports, where these settings could be changed. However, the upstream reference code should follow the settings we have, which are felt to be best practice for TF-A.
It is known that other projects follow other practices, and that is fine if it works for them. Once a project enables unaligned accesses, it is very hard to go back, since code that performs unaligned accesses is likely to be added to the project over time. For TF-A, however, maintaining the current settings is felt to be the best approach when weighing up the costs and benefits.
Having said all this, for those few cases where unaligned accesses are correct and intentional, we could look to define better ways to handle them and provide that as an option in the reference code, to try to get the best for robustness and debuggability, which seemed a big part of the concern in your post. Team members in Arm have ideas on how that could be provided, but it needs broader discussion of the implications for security and performance before being taken forward.
Thanks
Joanna
[1] https://trustedfirmware-a.readthedocs.io/en/latest/design/firmware-design.ht...
On 09/11/2020, 17:47, "TF-A on behalf of Raghu Krishnamurthy via TF-A" <tf-a-bounces@lists.trustedfirmware.org on behalf of tf-a@lists.trustedfirmware.org> wrote:
Hi Julius,
I tend to agree with your argument about not using the SCTLR.A bit, but I think the unexpected crashes or instability are due to buggy code and insufficient validation of invariants such as aligned pointers, irrespective of whether SCTLR.A is set or unset.
Even if we allow unaligned accesses, we could have buggy code that accesses registers that typically have to be size-aligned, and we would not catch those bugs with SCTLR.A clear. Worse, some hardware implementations have undefined or implementation-defined behavior on unaligned accesses to MMIO registers, for example, in which case I would rather take an alignment fault at the core than allow undefined behavior to be triggered.
So I don't think stability/reliability and the use of the SCTLR.A bit are related. If TF-A's position is that we want to allow only aligned accesses in EL3 (for whatever reason; I can only think of efficiency), it is the code's responsibility to enforce these invariants using asserts or explicit checks.
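As a rough illustration of the "asserts or explicit checks" mentioned above, a minimal sketch in C might look like the following. The helper names is_aligned and load_u32_checked are hypothetical and are not taken from the TF-A code base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if ptr is aligned to align (a power of two). */
static inline bool is_aligned(const void *ptr, size_t align)
{
    return ((uintptr_t)ptr & (align - 1U)) == 0U;
}

/* Hypothetical accessor: checks the invariant before doing a 32-bit load. */
static inline uint32_t load_u32_checked(const void *ptr)
{
    assert(is_aligned(ptr, sizeof(uint32_t)));
    return *(const volatile uint32_t *)ptr;
}
```

In builds where assertions are compiled out, the assert would need to be replaced by an explicit check that returns an error, in particular for pointers or offsets derived from SMC parameters.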
>> I am still wondering why we choose to set the SCTLR_EL3.A
I think this is the relevant question. If there are good security reasons (which I don't know about), I would say we should keep it. If it is for efficiency, given the way recent ARM64 cores are performing, I wouldn't have a problem with SCTLR.A=0.
Thanks
    Raghu
-----Original Message-----
    From: TF-A tf-a-bounces@lists.trustedfirmware.org On Behalf Of Julius Werner via TF-A
    Sent: Friday, November 6, 2020 6:38 PM
    To: tf-a tf-a@lists.trustedfirmware.org
    Subject: [TF-A] Alignment fault checking in EL3
Hi,
I just debugged a TF-A boot crash that turned out to be caused by an alignment fault in platform code. Someone had defined some static storage space as a uint8_t array, and then accessed it by dereferencing uint16_t pointers.
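The pattern described above, and one common way to avoid it, can be sketched roughly as follows. This is an illustration only, not the actual platform code; read_u16 and write_u16 are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Problematic pattern (shown as a comment only): a uint8_t array guarantees
 * just 1-byte alignment, so the cast may yield a misaligned uint16_t pointer
 * and can raise an alignment fault when SCTLR_EL3.A is set.
 *
 *     uint16_t v = *(uint16_t *)&storage[offset];
 */

/* Alignment-safe read: copy the bytes out instead of casting the pointer. */
static inline uint16_t read_u16(const uint8_t *buf, size_t len, size_t offset)
{
    uint16_t value;

    assert(len >= sizeof(value) && offset <= len - sizeof(value));
    memcpy(&value, buf + offset, sizeof(value));
    return value;
}

/* Alignment-safe write, mirroring read_u16. */
static inline void write_u16(uint8_t *buf, size_t len, size_t offset,
                             uint16_t value)
{
    assert(len >= sizeof(value) && offset <= len - sizeof(value));
    memcpy(buf + offset, &value, sizeof(value));
}
```

The memcpy form makes no alignment assumption about the buffer, so it works at odd offsets as well. Alternatively, the storage could be declared with the wider type (or an explicit alignment) so that direct uint16_t accesses are valid.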
Of course this is ultimately a bug in the platform code that should be fixed, but I am still wondering why we choose to set the SCTLR_EL3.A (Alignment fault checking) flag in TF-A? In an ideal world, maybe we could say that code which can generate alignment faults should not exist -- but, unfortunately, people make mistakes, and this kind of mistake may linger unnoticed for a long time in the codebase before randomly getting triggered due to subtle shifts in the binary's memory layout. (Worse, in some situations this could get affected by SMC parameters passed in from lower exception levels, so it would only be noticeable and could possibly be intentionally triggered if the lower exception level passes in just the right values.)
For that reason, most other environments I know (e.g. Linux) always keep that flag cleared. There's no harm in that -- as far as I'm aware all aarch64 cores are required to support unaligned accesses to cached memory types, and the worst that would happen is a slight performance penalty for the access. I think that flag is mostly meant as a debugging feature to be able to shake out accidental unaligned accesses from your code? If our goal is to be stable and reliable firmware, shouldn't we disable it to reduce the chance of unexpected crashes?
-- 
    TF-A mailing list
    TF-A@lists.trustedfirmware.org
    https://lists.trustedfirmware.org/mailman/listinfo/tf-a
