Hi,
Arm has drafted a firmware handoff specification [1] and has been evolving it based on community feedback.
This activity followed the request of some members of the Arm ecosystem [2].
The spec (still at ALP – feedback/comments welcome!) standardizes how information is propagated between different firmware components during boot.
The spec aims to remove the reliance on bespoke/platform-specific information handoff mechanisms, thus reducing the code maintenance burden.
The spec defines entry types: data structure layouts that each carry a specific type of data.
New types are meant to be added, following the needs and use-cases of the different communities.
Thus, these communities should be empowered to request new types!
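To make the idea of entry types concrete, here is a minimal sketch of a tagged-entry list. It is not the layout defined in the specification [1]: the header fields, the 8-byte alignment, and the names handoff_entry_hdr, handoff_add and handoff_find are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry header: NOT the layout defined by the specification. */
struct handoff_entry_hdr {
    uint32_t tag;        /* identifies the entry type */
    uint32_t data_size;  /* number of payload bytes that follow the header */
};

#define HANDOFF_ALIGN 8u  /* assumed alignment of each entry */

/* Round n up to the next multiple of HANDOFF_ALIGN. */
static size_t handoff_align_up(size_t n)
{
    return (n + HANDOFF_ALIGN - 1u) & ~((size_t)HANDOFF_ALIGN - 1u);
}

/* Append one entry. Returns the new used size, or 0 if it does not fit. */
size_t handoff_add(uint8_t *buf, size_t used, size_t cap,
                   uint32_t tag, const void *data, uint32_t size)
{
    struct handoff_entry_hdr hdr = { tag, size };
    size_t start = handoff_align_up(used);

    assert(buf != NULL);
    if (start > cap || sizeof(hdr) > cap - start ||
        size > cap - start - sizeof(hdr)) {
        return 0;
    }
    memcpy(buf + start, &hdr, sizeof(hdr));
    if (size != 0u) {
        memcpy(buf + start + sizeof(hdr), data, size);
    }
    return start + sizeof(hdr) + size;
}

/* Return the payload of the first entry with the given tag, or NULL. */
const void *handoff_find(const uint8_t *buf, size_t used,
                         uint32_t tag, uint32_t *size_out)
{
    size_t off = 0;

    assert(buf != NULL);
    while (off + sizeof(struct handoff_entry_hdr) <= used) {
        struct handoff_entry_hdr hdr;

        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.data_size > used - off - sizeof(hdr)) {
            return NULL;  /* malformed entry: payload runs past the list */
        }
        if (hdr.tag == tag) {
            if (size_out != NULL) {
                *size_out = hdr.data_size;
            }
            return buf + off + sizeof(hdr);
        }
        off = handoff_align_up(off + sizeof(hdr) + hdr.data_size);
    }
    return NULL;
}
```

In a scheme like this, a consumer that does not recognise a tag can still skip the entry using data_size, which is what would let new types be added without breaking existing firmware components.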
To enable community contributions, the specification must be hosted in a location that is friendly to change requests.
We propose to host the spec in trustedfirmware.org (tf.org).
Tf.org hosts several open-source projects and already has an open governance model.
TF-A, and the associated community, rely on tf.org, and thus are already well equipped to maintain this specification and keep it up to date.
Tf.org is agnostic of any downstream projects that would adopt this specification (e.g. U-Boot, EDK2, etc.).
We welcome the views of the communities and want to understand if there are any strong objections to what’s being proposed!
If anyone has objections, we are happy to consider alternatives and associated trade-offs.
Regards
[1] https://developer.arm.com/documentation/den0135/latest
[2] Re: [TF-A] Proposal: TF-A to adopt hand-off blocks (HOBs) for information passing between boot stages - TF-A - lists.trustedfirmware.org<https://lists.trustedfirmware.org/archives/list/tf-a@lists.trustedfirmware.…>
Hi James & TF-A guys,
When the HEST ACPI table configures the Hardware Error Notification type as
Software Delegated Exception (0x0B) for RAS events, the kernel RAS code interacts
with TF-A through the SDEI mechanism. On a firmware-first system, the kernel is
notified by a TF-A SDEI call.
The call flow when a fatal RAS error happens is as follows:
TF-A notify kernel flow:
  sdei_dispatch_event()
    ehf_activate_priority()
    call sdei callback // callback registered by kernel
    ehf_deactivate_priority()
Kernel sdei callback:
  sdei_asm_handler()
    __sdei_handler()
      _sdei_handler()
        sdei_event_handler()
          ghes_sdei_critical_callback()
            ghes_in_nmi_queue_one_entry()
              /* if RAS error is fatal */
              __ghes_panic()
                panic()
If a fatal RAS error occurs, panic() is called from within sdei_asm_handler()
without ehf_deactivate_priority() having been executed, which leaves interrupts
masked. With interrupts masked, the system halts in the kdump flow like this:
arm-smmu-v3 arm-smmu-v3.3.auto: allocated 65536 entries for cmdq
arm-smmu-v3 arm-smmu-v3.3.auto: allocated 32768 entries for evtq
arm-smmu-v3 arm-smmu-v3.3.auto: allocated 65536 entries for priq
arm-smmu-v3 arm-smmu-v3.3.auto: SMMU currently enabled! Resetting...
So interrupts should be restored before panic(); otherwise kdump will hang.
In the SDEI flow, SDEI_EVENT_COMPLETE (or SDEI_EVENT_COMPLETE_AND_RESUME)
should be called before panic() so that ehf_deactivate_priority() runs to completion.
The ehf_deactivate_priority() function restores pmr_el1 to its original value (>0x80).
The SDEI dispatch flow is broken if SDEI_EVENT_COMPLETE is not called.
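The masking effect described above can be illustrated with a small toy model. This is not the TF-A EHF code: the structure, the function names and the priority values 0xF0, 0x70 and 0xA0 used in the example are assumptions chosen only to show why skipping the completion step leaves lower-priority interrupts masked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: NOT the TF-A EHF implementation. All names are hypothetical. */
struct cpu_model {
    uint8_t pmr;        /* current priority mask register value */
    uint8_t saved_pmr;  /* value to restore when the event completes */
    bool    in_event;   /* true between activation and completion */
};

/*
 * GIC rule: an interrupt is signalled only if its priority value is
 * numerically lower (that is, higher priority) than the priority mask.
 */
bool irq_deliverable(const struct cpu_model *cpu, uint8_t prio)
{
    return prio < cpu->pmr;
}

/* Models priority activation: save the mask, then raise it to the event priority. */
void model_activate(struct cpu_model *cpu, uint8_t event_prio)
{
    assert(!cpu->in_event);
    cpu->saved_pmr = cpu->pmr;
    cpu->pmr = event_prio;
    cpu->in_event = true;
}

/* Models the completion path: restore the mask saved at activation. */
void model_complete(struct cpu_model *cpu)
{
    assert(cpu->in_event);
    cpu->pmr = cpu->saved_pmr;
    cpu->in_event = false;
}
```

In this model, with an initial mask of 0xF0, an interrupt at priority 0xA0 is deliverable. After model_activate(&cpu, 0x70) it is masked, and it stays masked until model_complete() runs. A handler that panics before completion is the case where the second call never happens.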
This causes two issues:
1. Kdump hangs when firmware reports a fatal RAS event through SDEI
   (as explained above).
2. In the NMI scenario, TF-A enables a secure timer, and PPI 29 triggers periodically.
   The kernel registers a callback for hard lockup detection. The code below is not
   reached when the kernel panics inside the callback:
TF-A, services/std_svc/sdei/sdei_intr_mgmt.c sdei_intr_handler():
    /*
     * We reach here when client completes the event.
     *
     * If the cause of dispatch originally interrupted the Secure world,
     * resume Secure.
     *
     * No need to save the Non-secure context ahead of a world switch: the
     * Non-secure context was fully saved before dispatch, and has been
     * returned to its pre-dispatch state.
     */
    if (sec_state == SECURE)
        restore_and_resume_secure_context();
    /*
     * The event was dispatched after receiving SDEI interrupt. With
     * the event handling completed, EOI the corresponding
     * interrupt.
     */
    if ((map->ev_num != SDEI_EVENT_0) && !is_map_bound(map)) {
        ERROR("Invalid SDEI mapping: ev=%u\n", map->ev_num);
        panic();
    }
    plat_ic_end_of_interrupt(intr_raw);
How should the above issues be fixed?
I think the root cause is that the kernel breaks the SDEI dispatch flow, so the
kernel should be modified to fix these issues.
Thanks,
Ming
Hello,
I'm working on a project for ChromeOS where we would like to be able to
load the BL32 payload (OP-TEE) for S-EL1 after the Linux kernel has booted
rather than during the usual BL32 stage. We would do this via a new SMC
that takes the OP-TEE image from Linux and then has EL3
load it and perform the init for S-EL1 at that time.
The reasoning behind this is that it's much easier to update the rootfs
than the FW on our devices, and we can still ensure the integrity of the
OP-TEE image if we load it early enough after the kernel boots.
My main question is whether people are aware of any issues with loading it
after Linux boots rather than during the usual BL32 stage.
I would definitely want to upstream this work if it's something we can
do.
Thanks,
Jeffrey Kardatzke
Google, Inc.
Hi all,
As documented in the Release Cadence section of the TF-A documentation (https://trustedfirmware-a.readthedocs.io/en/latest/about/release-informatio…) the v2.8 release has an expected code freeze date of 3rd week of November 2022.
That equates to the start of that week, Monday 14th November, when the rc0 tag will be applied; that date is one calendar month from tomorrow. Over the last few releases, closing out the release has normally taken around 6-10 working days.
We want to ensure that planned feature patches for the release are submitted in good time for the review process to conclude.
Preparations for the v2.8 release are already underway.
Thanks
Joanna
Hi,
When a core is in debug recovery mode its caches are not invalidated
upon reset, so the L1 and L2 cache contents from before reset are
observable after reset. Similarly, debug recovery mode of a DynamIQ
cluster ensures that the contents of the shared L3 cache are also not
invalidated upon transition to On mode.
A common use case of booting cores in debug recovery mode is to boot
with caches disabled and preserve the caches until a point where
software can dump the caches and retrieve their contents. TF-A however
unconditionally cleans and invalidates caches at multiple points
during boot, e.g. in bl31_entrypoint when cleaning bss and .data
sections. This will not only lose the cache content needed for
debugging but will potentially corrupt memory as well, leading to bugs
when booting in recovery mode.
Can we make CMOs in lib/aarch64/cache_helpers.S conditional upon some
platform hook to address the above scenario? Happy to work on a patch if
the idea of conditional CMOs makes sense.
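As a rough sketch of what I mean, written as a C model rather than against the actual assembly helpers: the hook name plat_preserve_caches() and the wrapper cond_clean_inv_range() are hypothetical, and the cache operation itself is replaced by a counting stub so the decision logic can be exercised on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool preserve_caches;     /* stand-in for platform state */
static unsigned int cmo_issued;  /* counts stubbed maintenance operations */

/* Test helper for the model: select whether caches must be preserved. */
void model_set_preserve_caches(bool on)
{
    preserve_caches = on;
}

/* Hypothetical platform hook: true when booting in debug recovery mode. */
bool plat_preserve_caches(void)
{
    return preserve_caches;
}

/* Stub standing in for a real clean+invalidate by address range. */
static void stub_clean_inv_range(uintptr_t addr, size_t size)
{
    (void)addr;
    (void)size;
    cmo_issued++;
}

/*
 * Conditional wrapper: skip the maintenance operation when the platform
 * asks for cache contents to be preserved. Returns true if it was issued.
 */
bool cond_clean_inv_range(uintptr_t addr, size_t size)
{
    assert(addr + size >= addr);  /* the range must not wrap around */
    if (plat_preserve_caches()) {
        return false;
    }
    stub_clean_inv_range(addr, size);
    return true;
}

/* Number of maintenance operations the stub has recorded so far. */
unsigned int model_cmo_issued(void)
{
    return cmo_issued;
}
```

Whether such a check belongs inside the helpers themselves or at each call site is an open design question; the sketch only shows the shape of the hook.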
Thanks,
Okash
Hi Everyone,
We have a TF-A Tech forum scheduled for this Thursday at 4pm GMT. Note the UK clocks moved to GMT from BST last weekend so the meeting may appear an hour different than in previous weeks in your calendar.
At this time I do not have any topics to present. Please do reach out to me if you have any topics you would like to present to the TF-A community.
If I do not find a topic or hear from the community that they have one, I will cancel the meeting at the end of the day this coming Wednesday, 2nd November.
Thanks
Joanna
Hello,
Just a quick follow-up on this question of using an HSM (or in general, some form of Key Management Infrastructure) to sign TF-A images.
U-Boot has support for this with its mkimage utility (see https://github.com/u-boot/u-boot/blob/master/doc/uImage.FIT/signature.txt#L…). This appears to use a custom engine in OpenSSL (in this case, the pkcs11 engine). My questions are:
1. Does TF-A’s cert_create tool support using custom OpenSSL engines?
2. If so, is there a procedure for using this?
3. If not, is there a plan to add support for this in the roadmap somewhere?
* Or, in general, is there a plan to add HSM support for TF-A image signing?
Thanks,
Brian
Hello,
After studying the current implementation of plat_get_stack_protector_canary in TF-A, I am curious why we do not make the first byte of the canary a NULL byte for better security.
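To illustrate the idea, here is a minimal sketch. The helper name canary_with_nul_byte() is hypothetical and is not part of TF-A. It assumes a little-endian target, where the least significant byte of the value is the first byte in memory.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(CHAR_BIT == 8, "sketch assumes 8-bit bytes");

/*
 * Hypothetical helper: clear the least significant byte of a raw canary.
 * On a little-endian target that byte is stored at the lowest address,
 * so the canary in memory starts with a NUL byte.
 */
uint64_t canary_with_nul_byte(uint64_t raw_canary)
{
    return raw_canary & ~UINT64_C(0xFF);
}
```

The intent is that string functions which stop at a NUL byte can neither read past the start of the canary nor write an exact copy of it. The trade-off is that 8 bits of the canary's randomness are given up.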
Hello,
I am trying to debug the RME feature with ARM DS on the
FVP_Base_RevC-2xAEMvA_11.19_14 platform. I am using Trusted Firmware
with the RME extension, based on this description:
https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-managem…
My observation is that when I simply run the model, everything looks OK: the FVP
can run the SW without any problem (I can see the console windows with the
normal boot procedure).
When I try to attach the ARM debugger in ARM DS, the simulation
immediately stops after starting, with a popup window: "Unable to connect to
device ARMAEM-a_MP_0 Error opening connection to device 16 Socket is
closed(E_io_error) Socket is closed"
My debug settings appear to be correct: with the same settings I can
run/debug the complete ARM Reference Solution (Linux, U-Boot,
Trusted Firmware) based on this description:
https://gitlab.arm.com/arm-reference-solutions/arm-reference-solutions-docs…
but if I add additional flags for the FVP (-C cluster0.rme_support_level=2 -C
cluster1.rme_support_level=2), I can still run the model but cannot debug it.
So when I activate the RME feature, the ARM debugger shows the previously
mentioned behavior (it stops right after start).
How can I work around this? What is the most efficient way to debug the FVP with
RME support? Any help is welcome.
Bye,
Adam