Hi Nicola,
* "Can you please share a bit more about your interrupt configurations, priorities etc?" We don't do anything special: the IRQ priority is Normal, nothing unusual.
* "Am I missing something else?" Looking into the code, one thing that comes to mind is that tfm_arch_thread_fn_call can be called from an unprivileged partition, in which case interrupt masking will not take effect. I believe this explains the behavior described in my previous mail. If so, then not only is this code affected, but other multithreading issues may occur in other places in tfm_arch_thread_fn_call.
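To make the point about privilege concrete, here is a tiny host-side model in C. It is only a sketch: struct cpu_model and the model_* functions are hypothetical names, and the only behaviour modelled is that a PRIMASK write made from unprivileged Thread mode is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal model of the two bits of CPU state that matter here. */
struct cpu_model {
    bool unprivileged; /* models CONTROL.nPRIV == 1 */
    bool primask;      /* true means interrupts are masked */
};

/* Models "cpsid i": the write to PRIMASK only lands in privileged mode. */
static void model_cpsid_i(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    if (!cpu->unprivileged) {
        cpu->primask = true;
    }
}

/* Returns true if an IRQ could still preempt after the caller tried to mask. */
static bool model_irq_can_preempt_after_mask(struct cpu_model *cpu)
{
    model_cpsid_i(cpu);
    return !cpu->primask;
}
```

With unprivileged set to true, model_irq_can_preempt_after_mask returns true, which is the situation in which the lock update could still be interrupted.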
Bohdan Hunko
Cypress Semiconductor Ukraine LLC
Senior Engineer
CSS ICW SW INT BFS SFW
Mobile: +380995019714
Bohdan.Hunko@infineon.com
From: Nicola Mazzucato <Nicola.Mazzucato@arm.com>
Sent: Friday, 19 December 2025 11:59
To: Hunko Bohdan (CSS ICW SW INT BFS SFW) <Bohdan.Hunko@infineon.com>
Cc: Kozemchuk Ivan (CSS ICW SW INT BFS SFW) <Ivan.Kozemchuk@infineon.com>; Kytsun Hennadiy (CSS ICW SW INT BFS SFW) <Hennadiy.Kytsun@infineon.com>; Anton Komlev via TF-M <tf-m@lists.trustedfirmware.org>
Subject: Re: Race condition in SPM scheduler lock logic
Hi Bohdan,
The sequence you provided seems reasonable, however "backend_abi_leaving_spm" and the subsequent "arch_release_sched_lock" execute with all interrupts disabled, so there are no interrupts that should change the scheduler_lock in between [1]. A pending interrupt would execute as soon as L:91, and then would correctly set the PendSV.
Can you please share a bit more about your interrupt configurations, priorities etc? Am I missing something else?
Thanks Best regards, Nick
[1] https://git.trustedfirmware.org/plugins/gitiles/TF-M/trusted-firmware-m.git/...
________________________________
From: Nicola Mazzucato via TF-M <tf-m@lists.trustedfirmware.org>
Sent: 17 December 2025 08:37
To: tf-m@lists.trustedfirmware.org; Bohdan.Hunko@infineon.com
Cc: Ivan.Kozemchuk@infineon.com; Hennadiy.Kytsun@infineon.com
Subject: [TF-M] Re: Race condition in SPM scheduler lock logic
Thanks Bohdan for reporting this.
Let me have a look and try to reproduce it.
Best regards, Nick
________________________________
From: Bohdan.Hunko--- via TF-M <tf-m@lists.trustedfirmware.org>
Sent: 16 December 2025 20:54
To: tf-m@lists.trustedfirmware.org
Cc: Ivan.Kozemchuk@infineon.com; Hennadiy.Kytsun@infineon.com
Subject: [TF-M] Race condition in SPM scheduler lock logic
Hi all,
I have found a bug in SPM scheduler lock logic – this bug is extremely hard to reproduce as it requires precise conditions and timings, but here is the description of the bug scenario:
1. Partition A calls psa_wait to wait for a signal (this signal is going to be asserted by the FLIH IRQ later).
2. The signal is not currently asserted and no other partition is runnable, so SPM marks this signal as being awaited and then schedules idle_thread.
3. idle_thread calls psa_wait to poll SPM:
   * psa_wait calls tfm_arch_thread_fn_call
   * tfm_arch_thread_fn_call calls backend_abi_entering_spm
   * backend_abi_entering_spm calls arch_acquire_sched_lock
   * arch_acquire_sched_lock sets scheduler_lock = SCHEDULER_LOCKED
   * psa_wait (called by idle_partition) is processed up to the point of backend_abi_leaving_spm
   * backend_abi_leaving_spm calls arch_release_sched_lock
   * this is where the very sneaky bug happens
   * arch_release_sched_lock executes the following assembly instructions:
      i. "ldr r1, =scheduler_lock \n"
         "ldr r0, [r1, #0] \n"
      ii. At this point r0 holds the value of scheduler_lock, which is SCHEDULER_LOCKED.
      iii. After these instructions are executed, the FLIH interrupt arrives:
         * FLIH handler asserts the signal (which should unblock execution of Partition A)
         * spm_handle_interrupt calls backend_assert_signal
         * backend_assert_signal does if (p_pt->signals_asserted & p_pt->signals_waiting) and returns STATUS_NEED_SCHEDULE
         * spm_handle_interrupt calls arch_attempt_schedule
         * arch_attempt_schedule checks the value of scheduler_lock (which is SCHEDULER_LOCKED) and sets scheduler_lock = SCHEDULER_ATTEMPTED
         * Interrupt returns
      iv. Execution continues, and scheduler_lock is now SCHEDULER_ATTEMPTED. But the next line of code in arch_release_sched_lock is
         "movs r2, #"M2S(SCHEDULER_UNLOCKED)" \n"/* Unlock scheduler */

This effectively overwrites scheduler_lock from SCHEDULER_ATTEMPTED to SCHEDULER_UNLOCKED. This means that the subsequent SPM scheduling logic will not trigger PendSV and will just return to idle_partition, effectively resulting in a system hang.
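For illustration, the interleaving above can be replayed deterministically on a host in plain C. This is only a sketch under assumptions: the numeric state values and the model_* names are hypothetical, and the model assumes that the release path decides whether to trigger PendSV from the value it loaded into r0 before the store.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the lock states; the real values may differ. */
enum {
    SCHEDULER_UNLOCKED  = 0,
    SCHEDULER_LOCKED    = 1,
    SCHEDULER_ATTEMPTED = 2
};

static volatile uint32_t scheduler_lock = SCHEDULER_UNLOCKED;

/* Models arch_attempt_schedule() as described in the report: while the
 * scheduler is locked, only record the attempt.
 * Returns 1 if PendSV could be pended right away, 0 if it was deferred. */
static int model_attempt_schedule(void)
{
    if (scheduler_lock == SCHEDULER_LOCKED) {
        scheduler_lock = SCHEDULER_ATTEMPTED;
        return 0;
    }
    return 1;
}

/* Models arch_release_sched_lock() as a separate load and store, with an
 * optional interrupt injected between the two.
 * Returns 1 if the release path would trigger PendSV (assumption: the
 * decision is based on the value loaded before the store). */
static int model_release_sched_lock(int irq_between_load_and_store)
{
    uint32_t r0 = scheduler_lock;        /* ldr r0, [r1, #0] */
    assert(r0 != SCHEDULER_UNLOCKED);    /* release only makes sense when held */

    if (irq_between_load_and_store) {
        (void)model_attempt_schedule();  /* FLIH handler runs here */
    }

    scheduler_lock = SCHEDULER_UNLOCKED; /* the unconditional unlock store */

    return r0 == SCHEDULER_ATTEMPTED;    /* stale r0: the attempt is lost */
}
```

Setting scheduler_lock to SCHEDULER_LOCKED and calling model_release_sched_lock(1) returns 0 and leaves the lock at SCHEDULER_UNLOCKED, so the scheduling request recorded by the interrupt is dropped. If the interrupt fires before the load instead, the same function returns 1.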
It looks like the solution is to wrap the lock logic in a critical section, but maybe there are other things that can be done to better fix this issue.
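As one possible alternative to a critical section, the read and the unlock could be made a single indivisible step. The host-side sketch below uses C11 atomics to show the idea. It is not the TF-M implementation or a proposed patch: the sketch_* names and the state values are hypothetical. On cores with exclusive load/store instructions a compiler would typically lower this to an LDREX/STREX loop. Whether that is available and acceptable on every TF-M target is not something this sketch establishes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the lock states; the real values may differ. */
enum {
    SCHEDULER_UNLOCKED  = 0,
    SCHEDULER_LOCKED    = 1,
    SCHEDULER_ATTEMPTED = 2
};

static _Atomic uint32_t scheduler_lock = SCHEDULER_UNLOCKED;

/* Interrupt side: move LOCKED -> ATTEMPTED atomically.
 * Returns 1 if the scheduler is unlocked and PendSV can be pended now,
 * 0 if the request was (or already had been) deferred. */
static int sketch_attempt_schedule(void)
{
    uint32_t expected = SCHEDULER_LOCKED;

    if (atomic_compare_exchange_strong(&scheduler_lock, &expected,
                                       SCHEDULER_ATTEMPTED)) {
        return 0;
    }
    /* On failure, expected holds the value that was actually observed. */
    return expected == SCHEDULER_UNLOCKED;
}

/* Thread side: read the old state and unlock in one atomic exchange, so
 * there is no window between the load and the store.
 * Returns 1 if a deferred scheduling attempt must now trigger PendSV. */
static int sketch_release_sched_lock(void)
{
    uint32_t old = atomic_exchange(&scheduler_lock, SCHEDULER_UNLOCKED);

    assert(old != SCHEDULER_UNLOCKED); /* release only makes sense when held */
    return old == SCHEDULER_ATTEMPTED;
}
```

With this shape an interrupt lands either before the exchange, in which case the release sees SCHEDULER_ATTEMPTED and reports it, or after it, in which case the attempt sees SCHEDULER_UNLOCKED and can pend PendSV directly. Either way the request is not lost.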
Let me know if there are other details that may be helpful to fix this bug.
Bohdan Hunko
Cypress Semiconductor Ukraine LLC
Senior Engineer
CSS ICW SW INT BFS SFW
Mobile: +380995019714
Bohdan.Hunko@infineon.com