> >>. What I am worried about is mistakes and
> >>oversights that slip through testing and then cause random or even
> >>attacker-controllable crashes in production
>
> Why would this not be the case even with SCTLR.A bit set? I'm not seeing the relation between the crash and SCTLR.A bit. If there are oversights that slip through testing and an attacker can cause a crash, there is vulnerability/bug.
> If there is something that slipped through, that causes a data abort(perhaps address not mapped in the translation tables), would we turn off the MMU ?
Well, my assumption is that performing an unaligned access cannot be a
vulnerability. If you have examples to the contrary please let me
know, but as far as I am aware letting an unaligned access through
should always work and will always result in the behavior the
programmer intended (other than maybe contrived cases where device
address space is mapped with a non-Device memory type, which is
already very wrong for plenty of other reasons and should never
happen). The only negative consequence is that the access may take
slightly longer than if it were aligned. I do understand the desire to
be able to shake out unaligned accesses during development and
testing, but at least in a production build I think taking that
performance hit would always be preferable to a crash.
I don't think the comparison to a data abort works because obviously
with a data abort the program usually can't just continue and still
assume its internal state is valid.
Hi Julius,
A lot of the rationale for the default initialization is not listed in the documentation and I do think perhaps we can do better as at the end of the day it’s decisions around platform ports and how they are going to be deployed in products that may influence decisions here for changing the defaults.
Your thoughts around attacker control able crashes are a valid one. I don’t think any one setting is all pros with no cons and platform providers need to make decisions but I think the default code with the settings is more likely to flush out alignment bugs and although I don’t have specific examples any bug could play a part in future vulnerability so best to flush them out rather than lie hidden.
It could be said that in the reference implementation a known crash may be better than a more nebulous unknown behaviour that could come with an undetected bug that could later be taken advantage of.
The use a DEBUG flag is an idea but we have all seen debug builds not operating like production builds so the value may be lost here.
It’s why on reviewing our documentation on seeing your post I think we could do better to explain the rationale for the settings in TF-A so platform owners can be better informed and override default settings if they really feel they have the need.
Cheers
Joanna
________________________________
From: Julius Werner
Sent: Monday, November 9, 2020 10:42 PM
To: Joanna Farley
Cc: raghu.ncstate(a)icloud.com; tf-a(a)lists.trustedfirmware.org
Subject: Re: [TF-A] Alignment fault checking in EL3
Hi Joanna,
Thank you for your detailed response. I was mostly wondering if this
was a deliberate choice or just something someone wrote this way once
and nobody ever thought about again. Since it sounds like it is
intentional, I'll try to understand your rationale better.
> Maintaining these settings in production is also advised as best practice as it is known any such defects can possibly play a part in allowing an actor to take advantage of the defect as part of a vulnerability and memory related defects in particular can be taken advantage of. So it seems prudent to guard against them.
I'm curious, can you elaborate on that? I can't really think of a
scenario where lack of alignment checking can really make code behave
in an unintentional way. (I don't think Raghu's scenario of accessing
MMIO registers works because those registers should be mapped with a
Device memory type anyway, and that memory type enforces aligned
accesses regardless of the SCTLR.A flag.) If there truly are security
concerns with this I can understand your approach much better.
> Saying all this for those few cases where unaligned accesses are correct and intentional we could look to define better ways to handle these and provide that in reference code as an option to try and get the best for robustness and debuggability which seemed a big part of the concern in your post. Team members in Arm have ideas on how that could be provided but it needs broader discussion on the implications for security and performance before taking forward.
There are scenarios where unaligned accesses can be intentional, but I
am not too worried about those -- there are ways to work around that,
and if we make it policy that those accesses always need to be written
that way I'm okay with that. What I am worried about is mistakes and
oversights that slip through testing and then cause random or even
attacker-controllable crashes in production. If the main reason you're
enabling this flag is to help with early detection of coding mistakes,
I wonder if the best approach would be to enable it for DEBUG=1 and
disable it for DEBUG=0 (of course, if there are really security issues
associated with it like you mentioned above, that wouldn't make
sense).
Hi Julius,
Talking to the team here its has always been felt it is best practise to forbid unaligned data accesses in Arm's embedded projects. That's the reason TF-A also enforces the build options: -mno-unaligned-access (AArch32) and -mstrict-align (AArch64). In the TF-A documentation [1] SCTLR_EL3.A and other settings are listed in the Architectural Initialisation section.
There is thought to be only a few cases where unaligned access are correct and intentional and most are defects that should be caught early and not in production and as such it is better to detect unaligned data access conditions early in a platform porting, rather than in the field especially if there is no firmware update capability. Alignment faults should be treated as fatal as they should never happen in production when components have been designed as such from the beginning. Maintaining these settings in production is also advised as best practice as it is known any such defects can possibly play a part in allowing an actor to take advantage of the defect as part of a vulnerability and memory related defects in particular can be taken advantage of. So it seems prudent to guard against them.
We can of course provide better rationalisation based on the above in our documentation to provide platforms porting efforts better guidance. It is after all up to partners in their platform ports to make appropriate decisions for their ports where settings could be changed however the upstreamed reference code should follow the settings we have which are felt to be best practices for TF-A.
It is known other projects follow other practices and that is fine if that works for them. Once a project enables unaligned accesses, it's very hard to go back again since it's likely that code that does unaligned accesses will slowly get added to the project. However for TF-A maintaining the current settings is felt to be the best approach when weighing up the costs and benefits.
Saying all this for those few cases where unaligned accesses are correct and intentional we could look to define better ways to handle these and provide that in reference code as an option to try and get the best for robustness and debuggability which seemed a big part of the concern in your post. Team members in Arm have ideas on how that could be provided but it needs broader discussion on the implications for security and performance before taking forward.
Thanks
Joanna
[1] https://trustedfirmware-a.readthedocs.io/en/latest/design/firmware-design.h…
On 09/11/2020, 17:47, "TF-A on behalf of Raghu Krishnamurthy via TF-A" <tf-a-bounces(a)lists.trustedfirmware.org on behalf of tf-a(a)lists.trustedfirmware.org> wrote:
Hi Julius,
I tend to agree with your argument about not using SCTLR.A bit but I think the unexpected crashes or instability is due buggy code and insufficient validation of invariants such as aligned pointers, irrespective of whether SCTRLR.A is set or unset.
Even if we allow unaligned accesses, we could have buggy code that access registers that typically have to be size aligned and we wouldn’t catch those bugs with SCTLR.A. Worse, some hardware implementations have undefined/impdef behavior when there are unaligned access to MMIO registers for ex, in which case, I would rather take an alignment fault at the core than allow triggering of undefined behavior.
So I don’t think stability/reliability and the use of SCTLR.A bit are related. If TF-A's position is that we want to allow only aligned accesses in EL3(for whatever reason, I can only think of efficiency), it is the code's responsibility to enforce this invariants using asserts or explicit checks.
>> I am still wondering why we choose to set the SCTLR_EL3.A
I think this is the relevant question. If there are good security reasons(which I don’t know about), I would say we should keep it. If it is for efficiency, given the way recent ARM64 cores are performing, I wouldn't have a problem with SCTLR.A=0.
Thanks
Raghu
-----Original Message-----
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Julius Werner via TF-A
Sent: Friday, November 6, 2020 6:38 PM
To: tf-a <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] Alignment fault checking in EL3
Hi,
I just debugged a TF-A boot crash that turned out to be caused by an alignment fault in platform code. Someone had defined some static storage space as a uint8_t array, and then accessed it by dereferencing uint16_t pointers.
Of course this is ultimately a bug in the platform code that should be fixed, but I am still wondering why we choose to set the SCTLR_EL3.A (Alignment fault checking) flag in TF-A? In an ideal world, maybe we could say that code which can generate alignment faults should not exist -- but, unfortunately, people make mistakes, and this kind of mistake may linger unnoticed for a long time in the codebase before randomly getting triggered due to subtle shifts in the binary's memory layout. (Worse, in some situations this could get affected by SMC parameters passed in from lower exception levels, so it would only be noticeable and could possibly be intentionally triggered if the lower exception level passes in just the right values.)
For that reason, most other environments I know (e.g. Linux) always keep that flag cleared. There's no harm in that -- as far as I'm aware all aarch64 cores are required to support unaligned accesses to cached memory types, and the worst that would happen is a slight performance penalty for the access. I think that flag is mostly meant as a debugging feature to be able to shake out accidental unaligned accesses from your code? If our goal is to be stable and reliable firmware, shouldn't we disable it to reduce the chance of unexpected crashes?
--
TF-A mailing list
TF-A(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-a
Hi Julius,
I tend to agree with your argument about not using SCTLR.A bit but I think the unexpected crashes or instability is due buggy code and insufficient validation of invariants such as aligned pointers, irrespective of whether SCTRLR.A is set or unset.
Even if we allow unaligned accesses, we could have buggy code that access registers that typically have to be size aligned and we wouldn’t catch those bugs with SCTLR.A. Worse, some hardware implementations have undefined/impdef behavior when there are unaligned access to MMIO registers for ex, in which case, I would rather take an alignment fault at the core than allow triggering of undefined behavior.
So I don’t think stability/reliability and the use of SCTLR.A bit are related. If TF-A's position is that we want to allow only aligned accesses in EL3(for whatever reason, I can only think of efficiency), it is the code's responsibility to enforce this invariants using asserts or explicit checks.
>> I am still wondering why we choose to set the SCTLR_EL3.A
I think this is the relevant question. If there are good security reasons(which I don’t know about), I would say we should keep it. If it is for efficiency, given the way recent ARM64 cores are performing, I wouldn't have a problem with SCTLR.A=0.
Thanks
Raghu
-----Original Message-----
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Julius Werner via TF-A
Sent: Friday, November 6, 2020 6:38 PM
To: tf-a <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] Alignment fault checking in EL3
Hi,
I just debugged a TF-A boot crash that turned out to be caused by an alignment fault in platform code. Someone had defined some static storage space as a uint8_t array, and then accessed it by dereferencing uint16_t pointers.
Of course this is ultimately a bug in the platform code that should be fixed, but I am still wondering why we choose to set the SCTLR_EL3.A (Alignment fault checking) flag in TF-A? In an ideal world, maybe we could say that code which can generate alignment faults should not exist -- but, unfortunately, people make mistakes, and this kind of mistake may linger unnoticed for a long time in the codebase before randomly getting triggered due to subtle shifts in the binary's memory layout. (Worse, in some situations this could get affected by SMC parameters passed in from lower exception levels, so it would only be noticeable and could possibly be intentionally triggered if the lower exception level passes in just the right values.)
For that reason, most other environments I know (e.g. Linux) always keep that flag cleared. There's no harm in that -- as far as I'm aware all aarch64 cores are required to support unaligned accesses to cached memory types, and the worst that would happen is a slight performance penalty for the access. I think that flag is mostly meant as a debugging feature to be able to shake out accidental unaligned accesses from your code? If our goal is to be stable and reliable firmware, shouldn't we disable it to reduce the chance of unexpected crashes?
Hi,
I just debugged a TF-A boot crash that turned out to be caused by an
alignment fault in platform code. Someone had defined some static
storage space as a uint8_t array, and then accessed it by
dereferencing uint16_t pointers.
Of course this is ultimately a bug in the platform code that should be
fixed, but I am still wondering why we choose to set the SCTLR_EL3.A
(Alignment fault checking) flag in TF-A? In an ideal world, maybe we
could say that code which can generate alignment faults should not
exist -- but, unfortunately, people make mistakes, and this kind of
mistake may linger unnoticed for a long time in the codebase before
randomly getting triggered due to subtle shifts in the binary's memory
layout. (Worse, in some situations this could get affected by SMC
parameters passed in from lower exception levels, so it would only be
noticeable and could possibly be intentionally triggered if the lower
exception level passes in just the right values.)
For that reason, most other environments I know (e.g. Linux) always
keep that flag cleared. There's no harm in that -- as far as I'm aware
all aarch64 cores are required to support unaligned accesses to cached
memory types, and the worst that would happen is a slight performance
penalty for the access. I think that flag is mostly meant as a
debugging feature to be able to shake out accidental unaligned
accesses from your code? If our goal is to be stable and reliable
firmware, shouldn't we disable it to reduce the chance of unexpected
crashes?
If we're emulating EL3 then the EL3 guest firmware is responsible for
providing the PSCI ABI, including reboot, core power down, etc.
sbsa-ref machine has an embedded controller to do reboot, poweroff. Machine
virt,secure=on can reuse this code to do reboot inside ATF.
Signed-off-by: Maxim Uvarov <maxim.uvarov(a)linaro.org>
---
Hello,
This patch implements reboot for the secure machine inside ATF firmware. I.e. current qemu
patch should be used with [1] ATF patch. It looks like that Embedded Controller qemu
driver (sbsa-ec) can be common and widely used for other emulated machines. While if
there are plans to extend sbsa-ec then we might find some other solution.
So for the long term it looks like machine virt was used as an initial playground for secure
firmware. While the original intent was a runner for kvm guests. Relation between kvm guest
and firmware is not very clear now. If everyone agree it might be good solution to move secure
firmware things from virt machine to bsa-ref and make this machine reference for secure boot,
firmware updates etc.
[1] https://github.com/muvarov/arm-trusted-firmware/commit/6d3339a0081f6f2b45d9…
Best regards,
Maxim.
hw/arm/virt.c | 9 +++++++++
include/hw/arm/virt.h | 2 ++
2 files changed, 11 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e465a988d6..6b77912f02 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -152,6 +152,7 @@ static const MemMapEntry base_memmap[] = {
[VIRT_ACPI_GED] = { 0x09080000, ACPI_GED_EVT_SEL_LEN },
[VIRT_NVDIMM_ACPI] = { 0x09090000, NVDIMM_ACPI_IO_LEN},
[VIRT_PVTIME] = { 0x090a0000, 0x00010000 },
+ [VIRT_EC] = { 0x090c0000, 0x00001000 },
[VIRT_MMIO] = { 0x0a000000, 0x00000200 },
/* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
[VIRT_PLATFORM_BUS] = { 0x0c000000, 0x02000000 },
@@ -1729,6 +1730,13 @@ static void virt_cpu_post_init(VirtMachineState *vms, int max_cpus,
}
}
+static void init_ec_controller(VirtMachineState *vms)
+{
+ vms->ec = qdev_new("sbsa-ec");
+
+ sysbus_mmio_map(SYS_BUS_DEVICE(vms->ec), 0, vms->memmap[VIRT_EC].base);
+}
+
static void machvirt_init(MachineState *machine)
{
VirtMachineState *vms = VIRT_MACHINE(machine);
@@ -1797,6 +1805,7 @@ static void machvirt_init(MachineState *machine)
*/
if (vms->secure && firmware_loaded) {
vms->psci_conduit = QEMU_PSCI_CONDUIT_DISABLED;
+ init_ec_controller(vms);
} else if (vms->virt) {
vms->psci_conduit = QEMU_PSCI_CONDUIT_SMC;
} else {
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index aad6d69841..6f2ce4e4ff 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -85,6 +85,7 @@ enum {
VIRT_ACPI_GED,
VIRT_NVDIMM_ACPI,
VIRT_PVTIME,
+ VIRT_EC,
VIRT_LOWMEMMAP_LAST,
};
@@ -163,6 +164,7 @@ struct VirtMachineState {
DeviceState *gic;
DeviceState *acpi_dev;
Notifier powerdown_notifier;
+ DeviceState *ec;
};
#define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
--
2.17.1
Hi All,
The next TF-A Tech Forum is scheduled for Thu 5th November 2020 16:00 – 17:00 (GMT).
Please note UK entered Daylight Saving on 25th October when clocks went back one hour to go to GMT from BST.
A reoccurring meeting invite has been sent out to the subscribers of this TF-A mailing list. If you don’t have this please let me know.
Agenda:
* TF-A Tests Framework Overview
* Presented by Varun Wadekar
* Summary
* Trusted Firmware-A Tests (TF-A Tests) is a suite of baremetal tests to exercise the Trusted Firmware-A (TF-A) features from the Normal World.
* Optional TF-A Mailing List Topic Discussions
If TF-A contributors have anything they wish to present at any future TF-A tech forum please contact me to have that scheduled.
Previous sessions, both recording and presentation material can be found on the trustedfirmware.org TF-A Technical meeting webpage: https://www.trustedfirmware.org/meetings/tf-a-technical-forum/
A scheduling tracking page is also available to help track sessions suggested and being prepared: https://developer.trustedfirmware.org/w/tf_a/tf-a-tech-forum-scheduling/ Final decisions on what will be presented will be shared a few days before the next meeting and shared on the TF-A mailing list.
Thanks
Joanna
Hi All,
Gentle reminder about the Mbed TLS workshop tomorrow (Tuesday, November 3rd) from 2 to 6pm GMT.
See agenda and zoom link here - https://www.trustedfirmware.org/meetings/mbed-tls-workshop/
Thanks,
Shebu
-----Original Appointment-----
From: Trusted Firmware Public Meetings <linaro.org_havjv2figrh5egaiurb229pd8c(a)group.calendar.google.com>
Sent: Friday, October 23, 2020 12:32 AM
To: Trusted Firmware Public Meetings; Shebu Varghese Kuriakose; mbed-tls(a)lists.trustedfirmware.org; Don Harbin; psa-crypto(a)lists.trustedfirmware.org; Dave Rodgman
Subject: Mbed TLS Virtual Workshop
When: Tuesday, November 3, 2020 2:00 PM-6:00 PM (UTC+00:00) Dublin, Edinburgh, Lisbon, London.
Where: Zoom: https://linaro-org.zoom.us/j/95315200315?pwd=ZDJGc1BZMHZLV29DTmpGUllmMjB1UT…
You have been invited to the following event.
Mbed TLS Virtual Workshop
When
Tue Nov 3, 2020 7am – 11am Mountain Standard Time - Phoenix
Where
Zoom: https://linaro-org.zoom.us/j/95315200315?pwd=ZDJGc1BZMHZLV29DTmpGUllmMjB1UT… (map<https://www.google.com/maps/search/Zoom:+https:%2F%2Flinaro-org.zoom.us%2Fj…>)
Calendar
shebu.varghesekuriakose(a)arm.com<mailto:shebu.varghesekuriakose@arm.com>
Who
•
Don Harbin - creator
•
shebu.varghesekuriakose(a)arm.com<mailto:shebu.varghesekuriakose@arm.com>
•
mbed-tls(a)lists.trustedfirmware.org<mailto:mbed-tls@lists.trustedfirmware.org>
•
psa-crypto(a)lists.trustedfirmware.org<mailto:psa-crypto@lists.trustedfirmware.org>
•
dave.rodgman(a)arm.com<mailto:dave.rodgman@arm.com>
more details »<https://www.google.com/calendar/event?action=VIEW&eid=NHVvY2FxY2o4Njk3MWZkd…>
Hi,
Trustedfirmware.org community project would like to invite you to the Mbed TLS Virtual Workshop.
The purpose of the workshop is to bring together the Mbed TLS community including maintainers, contributors and users to discuss
* The future direction of the project and
* Ways to improve community collaboration
Here is the agenda for the workshop.
Topic Time (in GMT)
Welcome 2.00 - 2.10pm
Constant-time code 2.10 – 2.30pm
Processes - how does work get scheduled? 2.30 – 2.50pm
PSA Crypto APIs 2.50 – 3.20pm
PSA Crypto for Silicon Labs Wireless
MCUs - Why, What, Where and When 3.20 – 3.50pm
Break
Roadmap, TLS1.3 Update 4.10 – 4.30pm
Mbed TLS 3.0 Plans, Scope 4.30 – 5.00pm
How do I contribute my first review
and be an effective Mbed TLS reviewer 5.00 – 5.30pm
Regards,
Don Harbin
Trusted Firmware Community Manager
==============Zoom details below:====================
Trusted Firmware is inviting you to a scheduled Zoom meeting.
Topic: Mbed TLS Virtual Workshop
Time: Nov 3, 2020 02:00 PM Greenwich Mean Time
Join Zoom Meeting
https://linaro-org.zoom.us/j/95315200315?pwd=ZDJGc1BZMHZLV29DTmpGUllmMjB1UT…<https://www.google.com/url?q=https%3A%2F%2Flinaro-org.zoom.us%2Fj%2F9531520…>
Meeting ID: 953 1520 0315
Passcode: 143755
One tap mobile
+16699009128,,95315200315# US (San Jose)
+12532158782,,95315200315# US (Tacoma)
Dial by your location
+1 669 900 9128 US (San Jose)
+1 253 215 8782 US (Tacoma)
+1 346 248 7799 US (Houston)
+1 646 558 8656 US (New York)
+1 301 715 8592 US (Germantown)
+1 312 626 6799 US (Chicago)
888 788 0099 US Toll-free
877 853 5247 US Toll-free
Meeting ID: 953 1520 0315
Find your local number: https://linaro-org.zoom.us/u/apL3hgti4<https://www.google.com/url?q=https%3A%2F%2Flinaro-org.zoom.us%2Fu%2FapL3hgt…>
Going (shebu.varghesekuriakose(a)arm.com<mailto:shebu.varghesekuriakose@arm.com>)? Yes<https://www.google.com/calendar/event?action=RESPOND&eid=NHVvY2FxY2o4Njk3MW…> - Maybe<https://www.google.com/calendar/event?action=RESPOND&eid=NHVvY2FxY2o4Njk3MW…> - No<https://www.google.com/calendar/event?action=RESPOND&eid=NHVvY2FxY2o4Njk3MW…> more options »<https://www.google.com/calendar/event?action=VIEW&eid=NHVvY2FxY2o4Njk3MWZkd…>
Invitation from Google Calendar<https://www.google.com/calendar/>
You are receiving this courtesy email at the account shebu.varghesekuriakose(a)arm.com<mailto:shebu.varghesekuriakose@arm.com> because you are an attendee of this event.
To stop receiving future updates for this event, decline this event. Alternatively you can sign up for a Google account at https://www.google.com/calendar/ and control your notification settings for your entire calendar.
Forwarding this invitation could allow any recipient to send a response to the organizer and be added to the guest list, or invite others regardless of their own invitation status, or to modify your RSVP. Learn More<https://support.google.com/calendar/answer/37135#forwarding>.