Hi James & TF-A guys,
When hest acpi table configure Hardware Error Notification type as
Software Delegated Exception(0x0B) for RAS event, kernel RAS interacts with
TF-A by SDEI mechanism. On the firmware first system, kernel was notified by
TF-A sdei call.
The calling flow like as below when fatal RAS error happens:
TF-A notify kernel flow:
sdei_dispatch_event()
ehf_activate_priority()
call sdei callback // callback registered by kerenl
ehf_deactivate_priority()
Kernel sdei callback:
sdei_asm_handler()
__sdei_handler()
_sdei_handler()
sdei_event_handler()
ghes_sdei_critical_callback()
ghes_in_nmi_queue_one_entry()
/* if RAS error is fatal */
__ghes_panic()
panic()
If fatal RAS error occured, panic was called in sdei_asm_handle()
without ehf_deactivate_priority executed, which lead interrupt masked.
If interrupt masked, system would be halted in kdump flow like this:
arm-smmu-v3 arm-smmu-v3.3.auto: allocated 65536 entries for cmdq
arm-smmu-v3 arm-smmu-v3.3.auto: allocated 32768 entries for evtq
arm-smmu-v3 arm-smmu-v3.3.auto: allocated 65536 entries for priq
arm-smmu-v3 arm-smmu-v3.3.auto: SMMU currently enabled! Resetting...
So interrupt should be restored before panic otherwise kdump will hang.
In the process of sdei, a SDEI_EVENT_COMPLETE(or SDEI_EVENT_COMPLETE_AND_RESUME)
call should be called before panic for a completed run of ehf_deactivate_priority().
The ehf_deactivate_priority() function restore pmr_el1 to original value(>0x80).
The SDEI dispatch flow was broken if SDEI_EVENT_COMPLETE was not be called.
This will bring about two issue:
1 Kdump will hang for firmware reporting fatal RAS event by SDEI;
(as explain above)
2 For NMI scene,TF-A enable a secure timer, the PPI 29 will trigger periodically.
Kernel register a callback for hard lockup. The below code will not be
called when panic in kernel callback:
TF-A, services/std_svc/sdei/sdei_intr_mgmt.c sdei_intr_handler():
/*
* We reach here when client completes the event.
*
* If the cause of dispatch originally interrupted the Secure world,
* resume Secure.
*
* No need to save the Non-secure context ahead of a world switch: the
* Non-secure context was fully saved before dispatch, and has been
* returned to its pre-dispatch state.
*/
if (sec_state == SECURE)
restore_and_resume_secure_context();
/*
* The event was dispatched after receiving SDEI interrupt. With
* the event handling completed, EOI the corresponding
* interrupt.
*/
if ((map->ev_num != SDEI_EVENT_0) && !is_map_bound(map)) {
ERROR("Invalid SDEI mapping: ev=%u\n", map->ev_num);
panic();
}
plat_ic_end_of_interrupt(intr_raw);
How to fix above issues?
I think the root cause is that kernel broken the SDEI dispatch flow, so kernel
should modify to fix these issues.
Thanks,
Ming
Hi all,
I am running FVP with 2CPUs, Cactus SP (SEL1), Hafnium (SEL2) and KVM VHE.
Sometimes I send the "FFA_MSG_SEND_DIRECT_REQ" smc call from KVM (I fill
0x8400006f in x0, then VMID and SP ID in x1, let x2 as 0). It says
assert failed, like this:
ASSERT: lib/el3_runtime/aarch64/context_mgmt.c:651
BACKTRACE: START: assert
0: EL3: 0x4005cac
1: EL3: 0x400323c
2: EL3: 0x400620c
3: EL3: 0x400e180
4: EL3: 0x4005a94
BACKTRACE: END: assert
After I check the bl31.dump, I notice that:
when services/std_svc/spmd/spmd_main.c sends the FFA
call (from NS to S) via "spmd_smc_forward(smc_fid, secure_origin,x1,
x2, x3, x4, handle)", it will go to
cm_el1_sysregs_context_restore(secure_state_out) and
cm_el2_sysregs_context_restore(secure_state_out), then it will assert
the cm_get_context(). it gets the NULL context, so assert failed.
Before the problem appeared, I have modified many codes on a dirty
TF-A v2.4 (commit hash is 0aa70f4c4c023ca58dea2d093d3c08c69b652113),
Hafnium and TF-A-TESTS. I also mail with Hafnium MailList, they
consider it can be a problem in EL3.
Such assert is NOT ALWAYS failed. I mean, maybe when I run FVP and
send "smc" now, it is failed. But when I shut down, run FVP, and send
the same instruction with the same parameter again, it is OK.
I want to know, what is the possible reasons for suddenly losing the
secure context. Can you give me some advice on debugging? e.g., where
should I check? Need I provide more info?
Sincerely,
Wang
Hi all,
I want to add some big data structures in BL31 (e.g., create a large
uint32_t array). Also, I reserve a secure memory space (assume it is
0xa000_0000 - 0xb000_0000) by configuring TZASC.
Now, the BL31 says
build/fvp/debug/bl31/bl31.elf section `.bss' will not fit in region `RAM'
aarch64-none-elf-ld: BL31 image has exceeded its limit.
aarch64-none-elf-ld: region `RAM' overflowed by 458752 bytes
It looks like that the previous RAM cannot hold my big data
structures. If I want to add my reserved region into RAM (so I may
allocate these data into the reserved region), what should I do?
Sincerely,
Wang
Hi,
Please allow me to add some more details. Also posting to the TF-A list, as all tf.org repositories are affected.
The root cause of this issue is OpenSSH dropping support for SHA-1 RSA signatures with the 8.8 release, and thus any OS (or git client) coming with a recent version is affected. I.e. the newest git for windows is affected too, and so are “top notch” Linux distributions like Arch.
For details see the “Potentially-incompatible changes” chapter here: https://www.openssh.com/releasenotes.html
As the above page states “Incompatibility is more likely when connecting to older SSH implementations…”, and thus a server-side update would eliminate the problem. Till that happens the page above list multiple client-side workarounds. (It is possible to amend the ssh config in a way that fixes all repositories.)
/George
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> On Behalf Of Kevin Townsend via TF-M
Sent: December 15, 2021 13:06
To: Thomas Törnblom via TF-M <tf-m(a)lists.trustedfirmware.org>
Subject: [TF-M] Tip on cloning TF-M on OS X Monterey
I recently switched to a new MBP that ships with OS X Monterey, and on both 12.0 and 12.1 (released this week) git clone seems to be broken when you're using HTTP rather than SSH:
digital envelope routines:CRYPTO_internal:bad key length
In order to clone TF-M, I had to make the following changes.
1. Add these details to $HOME/.ssh/config (microbuilder being my github username, associated with my TF-M account):
Host trustedfirmware.org<http://trustedfirmware.org>
User microbuilder
Hostname review.trustedfirmware.org<http://review.trustedfirmware.org>
Port 29418
IdentityFile ~/.ssh/id_rsa
IdentitiesOnly yes
2. Then try to clone with:
$ git clone trustedfirmware.org:/TF-M/trusted-firmware-m.git
This fails, however, since it tries to clone tf-m-tests.git, so:
3. Edit lib/ext/tf-m-tests/fetch_repo.cmake, changing:
FetchContent_Declare(tfm_test_repo
GIT_REPOSITORY trustedfirmware.org:TF-M/tf-m-tests.git
# GIT_REPOSITORY https://git.trustedfirmware.org/TF-M/tf-m-tests.git
GIT_TAG ${TFM_TEST_REPO_VERSION}
GIT_PROGRESS TRUE
)
This let me at least clone TF-M until the issues with HTTP-based cloning are fixed.
Hope this is useful to someone else working on OS X natively.
Kevin
Hi,
On STM32MP1, we'd like BL2 to be agnostic of what BL32 is in the FIP.
It can be either OP-TEE or TF-A SP_min.
But on STM32MP1, SP_min needs a device tree file (TOS_FW_CONFIG_ID),
whereas OP-TEE doesn't use this separate DT image.
As TOS_FW_CONFIG_ID is in list of images to be loaded by BL2, we then
have a warning message in case OP-TEE is used:
WARNING: FCONF: Invalid config id 26
I'd like to silence this warning with this kind of patch:
diff --git a/lib/fconf/fconf_dyn_cfg_getter.c
b/lib/fconf/fconf_dyn_cfg_getter.c
index 25dd7f9eda..f7e9834c3b 100644
--- a/lib/fconf/fconf_dyn_cfg_getter.c
+++ b/lib/fconf/fconf_dyn_cfg_getter.c
@@ -51,7 +51,11 @@ struct dyn_cfg_dtb_info_t
*dyn_cfg_dtb_info_getter(unsigned int config_id)
}
}
- WARN("FCONF: Invalid config id %u\n", config_id);
+ if (config_id == TOS_FW_CONFIG_ID) {
+ VERBOSE("FCONF: No TOS_FW_CONFIG image\n");
+ } else {
+ WARN("FCONF: Invalid config id %u\n", config_id);
+ }
return NULL;
}
I can change the VERBOSE message to INFO.
Do you think it is OK if I push the patch?
Thanks,
Yann
I’m not sure if the cancellations have been sent from the trustedfirmware.org calendar system so confirming that they are cancelled to the list.
Next scheduled Tech Forum is 13th January 2022.
Joanna
Hi all,
I want to load a specific image in BL31. But when I call
load_auth_image(). It says
"in function `load_image':
trusted-firmware-a/common/bl_common.c:87: undefined reference to
`plat_get_image_source'"
Also, the io_read, io_size and etc. are undefined reference.
I find other BL files (bl1, bl2) will call the load_auth_image() in
their main functions or sub-functions. If I want to implement it on
BL31, what should I do? Should I modify the Makefile?
Sincerely,
Wang