Hi Ken,
I tested with 2.0.0 release. There are my conclusions.
tfm_arch_set_context_ret_code set result (for psa_wait, …) by using SP stored in sp member of context_ctrl_t structure. It means that before calling tfm_arch_set_context_ret_code TF-M should update SP for thread (partition).
thrd_next update result by calling tfm_arch_set_context_ret_code, if query_state_cb (pointing to query_state implemented by backend_ipc.c) returns THRD_STATE_RET_VAL_AVAIL. For all partition except active (interrupted by PendSV) context (values in context_ctrl_t) is correct. But context is invalid for active partition, because it is updated by PendSV after exit from ipc_schedule and thrd_next.
As result call for psa_wait would fail if some secure IRQ will update signals_asserted after tfm_spm_partition_psa_wait and before PendSV will call thrd_next. SLIH interrupt can assert such signal (call to backend_assert_signal) in spm_handle_interrupt.
I made a simple fix:
diff --git a/secure_fw/spm/core/backend_ipc.c b/secure_fw/spm/core/backend_ipc.c
index 3a968a3..c8b0742 100644
--- a/secure_fw/spm/core/backend_ipc.c
+++ b/secure_fw/spm/core/backend_ipc.c
@@ -454,9 +454,14 @@
/* Protect concurrent access to current thread/component and thread status */
CRITICAL_SECTION_ENTER(cs);
- pth_next = thrd_next();
p_curr_ctx = (struct context_ctrl_t *)(CURRENT_THREAD->p_context_ctrl);
+ /* Update SP for current thread, so tfm_arch_set_context_ret_code can use up to date
+ * value of stack context to return value via R0.*/
+ p_curr_ctx->sp = __get_PSP() - sizeof(struct tfm_additional_context_t);
+
+ pth_next = thrd_next();
+
AAPCS_DUAL_U32_SET(ctx_ctrls, (uint32_t)p_curr_ctx, (uint32_t)p_curr_ctx);
p_part_curr = GET_CURRENT_COMPONENT();
Regards,
Roman.
From: Bohdan.Hunko--- via TF-M <tf-m@lists.trustedfirmware.org>
Sent: Wednesday, November 8, 2023 22:22
To: Ken.Liu@arm.com;
tf-m@lists.trustedfirmware.org
Cc: nd@arm.com
Subject: [TF-M] Re: Scheduling bug
Caution: This e-mail originated outside Infineon Technologies. Do not click on links or open attachments unless you validate it is safe. |
Hi Ken,
I was using 1.8.0 Release
Looks like this issue was fixed in new design although we would like to test it and confirm this when we migrate to newest release.
We will get back on this to you.
Regards,
Bohdan Hunko
Cypress Semiconductor Ukraine
Engineer
CSUKR CSS ICW SW FW
Mobile: +38099 50 19 714
Bohdan.Hunko@infineon.com
From: Ken Liu via TF-M <tf-m@lists.trustedfirmware.org>
Sent: Tuesday, November 7, 2023 03:58
To: tf-m@lists.trustedfirmware.org
Cc: nd <nd@arm.com>
Subject: [TF-M] Re: Scheduling bug
Caution: This e-mail originated outside Infineon Technologies. Do not click on links or open attachments unless you validate it is safe. |
Hi Bohdan,
Can you share us the version you are working on?
Looks like the patch is based on a previous version of TFM, and this issue is resolved in the latest code base.
The reason to do such abstraction is just because of such similar issues – the logic was coupled tightly because of synching the partition status, thread status and context update which caused maintenance confusion and effort. And we refactored this part to decouple the logic:
SPM: Add STATUS_NEED_SCHEDULE to manage scheduler (21054) · Gerrit Code Review (trustedfirmware.org)
Hope this helps, thanks.
/Ken
From: Bohdan.Hunko--- via TF-M <tf-m@lists.trustedfirmware.org>
Sent: Monday, November 6, 2023 10:01 PM
To: tf-m@lists.trustedfirmware.org
Subject: [TF-M] Scheduling bug
Hi
There seem to be scheduling bug we have found in SPM.
This bug is related to handling of interrupts that arrives during SVC call and assert signed for partition.
Steps to reproduce:
i. Mailbox IRQ and pendSV are both pending
i. Signal is asserted thus spm_handle_interrupt in thrd_next calls query_state_cb which returns THRD_STATE_RET_VAL_AVAIL and thus tfm_arch_set_context_ret_code is called
ii. tfm_arch_set_context_ret_code sets return code using OLD value of partition PSP (as it was never updated, as it is updated in PendSV)
i. Mailbox partition has 0 as signal because return value was written to wrong location is stack
Patch I have attached to the mail solves this problem for us BUT it seems more like a workaround than a proper fix(
Anyways it would be nice to have this problem review by SPM experts and have proper fix (maybe we have other places with same problem…)
Regards,
Bohdan Hunko
Cypress Semiconductor Ukraine
Engineer
CSUKR CSS ICW SW FW
Mobile: +38099 50 19 714
Bohdan.Hunko@infineon.com