[AMD Official Use Only - AMD Internal Distribution Only]
Hello Maintainers,
We are observing a reproducible runtime regression on the ZynqMP (Cortex-A53) platform after enabling LTO (ENABLED_LTO=1) and merging the changes from the topic NUMA_AWARE_PER_CPU into our integration branch (https://github.com/ARM-software/arm-trusted-firmware/commit/7303319b3823e9e3...).
Summary of the issue
1. Baseline behavior
* Platform: ZynqMP (Cortex-A53)
* Configuration: ENABLED_LTO=1
* Without NUMA_AWARE_PER_CPU: Linux boots and runs stably
* After merging NUMA_AWARE_PER_CPU, Linux boots but hangs during runtime
* During the hang, CPUs are observed to unexpectedly re-enter EL3
* Re-entry into EL3 should not occur during normal Linux runtime execution and strongly suggests corruption or mismanagement of PSCI and/or per-CPU state (see lib/per_cpu/aarch64/per_cpu_asm.S, line 28: https://github.com/ARM-software/arm-trusted-firmware/blob/master/lib/per_cpu/aarch64/per_cpu_asm.S#L28)
* Reverting the NUMA_AWARE_PER_CPU changes restores stable Linux execution
* The issue is reproducible only when NUMA_AWARE_PER_CPU is present
* This clearly identifies NUMA_AWARE_PER_CPU as the regression source
2. Suspected interaction with LTO
* With NUMA_AWARE_PER_CPU enabled, LTO breaks the per-CPU base calculation
* BL31 contains hand-written assembly that relies on linker-script symbols (e.g., per-CPU section boundaries)
* Under LTO, symbol placement and retention are no longer guaranteed in the same way, leading to incorrect per-CPU base computation
* This results in corrupted per-CPU data and subsequent erroneous PSCI suspend behavior (EL3 re-entry)
3. CPU idle dependency
* The following kernel configuration options are enabled:
  * CONFIG_CPU_IDLE=y
  * CONFIG_CPU_IDLE_MULTIPLE_DRIVERS=y
  * CONFIG_CPU_IDLE_GOV_MENU=y
  * CONFIG_DT_IDLE_STATES=y
* This further suggests the issue is triggered during CPU idle / suspend-resume paths, where correct per-CPU state handling is critical
Based on the above:
* This is specific to NUMA_AWARE_PER_CPU combined with LTO
* The failure mode points to per-CPU base calculation and PSCI state corruption
* Reverting NUMA_AWARE_PER_CPU fully restores stability on ZynqMP
We wanted to report this issue upstream and seek guidance on:
* Whether NUMA_AWARE_PER_CPU is expected to be LTO-safe on platforms relying on linker-defined per-CPU sections
* Or if additional constraints / fixes are required for platforms like ZynqMP
We are happy to provide further logs, configuration details, or help with fixes.
Regards, Prasad Kummari
Hi Prasad,
Thanks for your mail. I’m trying to reproduce this at our end. Could I check if you see the boot progressing to shell without the CPU idle dependencies?
Regards, Rohit
From: Kummari, Prasad <Prasad.Kummari@amd.com>
Date: Monday, 5 January 2026 at 18:04
To: Sammit Joshi <Sammit.Joshi@arm.com>, Rohit Mathew <Rohit.Mathew@arm.com>, scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org>
Cc: Belsare, Akshay <akshay.belsare@amd.com>, Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>, Simek, Michal <michal.simek@amd.com>
Subject: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Mathew,
Yes, if the CPU_IDLE-related configurations listed below are removed while compiling the kernel image, the hang is no longer observed.
Configs:
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_MULTIPLE_DRIVERS=y
CONFIG_CPU_IDLE_GOV_MENU=y
CONFIG_DT_IDLE_STATES=y
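To confirm which of these options a given kernel build actually carries, the text of its .config can be scanned for the four symbols listed above. The following is a minimal sketch only: the helper name `cpu_idle_options` and the sample fragment are illustrative assumptions, not taken from the report.

```python
# The four CPU idle options named in the report.
CPU_IDLE_SYMBOLS = (
    "CONFIG_CPU_IDLE",
    "CONFIG_CPU_IDLE_MULTIPLE_DRIVERS",
    "CONFIG_CPU_IDLE_GOV_MENU",
    "CONFIG_DT_IDLE_STATES",
)


def cpu_idle_options(config_text):
    """Map each CPU idle symbol to True if it is set to 'y' in config_text."""
    enabled = {name: False for name in CPU_IDLE_SYMBOLS}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as '# CONFIG_FOO is not set' comments.
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if name in enabled:
            enabled[name] = (value == "y")
    return enabled


# Hypothetical fragment: one option on, one explicitly disabled.
sample = """
CONFIG_CPU_IDLE=y
# CONFIG_CPU_IDLE_GOV_MENU is not set
CONFIG_ARM64=y
"""
state = cpu_idle_options(sample)
for name, is_on in state.items():
    print(f"{name}: {'y' if is_on else 'not set'}")
```

Running the same check against the failing and the non-failing kernel builds makes it easy to see that they differ only in these options.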
Observed Behavior
* CPUs enter psci_cpu_suspend(). Debug prints confirm execution reaches the suspend path.
* The system later triggers an Unhandled Exception in EL3.
* The affected CPUs enter Reset Catch at EL3 and hang at the bl31_warm_entrypoint() -> el3_entrypoint_common PC.
Regards, Prasad.
From: Rohit Mathew <Rohit.Mathew@arm.com>
Sent: Tuesday, January 6, 2026 4:37 AM
To: Kummari, Prasad <Prasad.Kummari@amd.com>; Sammit Joshi <Sammit.Joshi@arm.com>; scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org>
Cc: Belsare, Akshay <akshay.belsare@amd.com>; Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>; Simek, Michal <michal.simek@amd.com>
Subject: Re: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Prasad,
Thanks for the info. We tried booting on a few different platforms (ACPI-Linux/DT-Linux/TFTF) with suspend and LTO enabled, and the boot goes through without any problems, so we still can't reproduce the issue on our end. Could I request the following info to see if we can spot something?
* Do you see this happening only on the said platform (in case you have other platforms)?
* Since you had debug logs added, could we check if this happens on the first/specific CPU suspend or is this quite random? Were you able to trace it to a specific function/line within the suspend path where the crash occurs?
* It was stated that CPUs are unexpectedly entering EL3. Can I check if the entry itself was not expected, or if there was a problem further down the line, i.e., in the suspend path following the entry?
* Could you share the exact commit you are on for TF-A?
* Could you share the build configuration for the platform (all build flags)?
* Would it be possible to share the BL31 logs (so we can see the crash register logs) and Linux logs (just to correlate), as well as the bl31.dump/bl31.map files for the failure case? (If you can share them for the working case as well, that would be great.)
Regards, Rohit
From: Kummari, Prasad <Prasad.Kummari@amd.com>
Date: Tuesday, 6 January 2026 at 06:34
To: Rohit Mathew <Rohit.Mathew@arm.com>, Sammit Joshi <Sammit.Joshi@arm.com>, scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org>
Cc: Belsare, Akshay <akshay.belsare@amd.com>, Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>, Simek, Michal <michal.simek@amd.com>
Subject: RE: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Mathew,
Thank you for the quick response. I have shared the requested information inline below.
Regards, Prasad.
From: Rohit Mathew <Rohit.Mathew@arm.com>
Sent: Tuesday, January 6, 2026 9:21 PM
To: Kummari, Prasad <Prasad.Kummari@amd.com>; Sammit Joshi <Sammit.Joshi@arm.com>; scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org>
Cc: Belsare, Akshay <akshay.belsare@amd.com>; Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>; Simek, Michal <michal.simek@amd.com>
Subject: Re: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Prasad,
Thanks for the info. We tried booting on a few different platforms (ACPI-Linux/DT-Linux/TFTF) with suspend and LTO enabled, and the boot goes through without any problems, so we still can't reproduce the issue on our end. Could I request the following info to see if we can spot something?
* Do you see this happening only for the said platform (In case you have other platforms)
The issue is currently observed only on our ZynqMP (Cortex-A53) based platform. We do not see it on the other internal platforms we have access to. The failure is reproducible on the ZynqMP platform with LTO enabled.
* Since you had debug logs added, could we check if this happens on the first/specific CPU suspend or is this quite random? Were you able to trace it to a specific function/line within the suspend path where the crash occurs?
It appears to be quite random. I added debug logs in psci_cpu_suspend(). Although the BL31 NOTICE logs and Linux printk messages are interleaved and somewhat noisy, they are still understandable. I've attached the full logs for reference with DEBUG=1 and DEBUG=0.
Logs (DEBUG=1):
[ 4.966988] zynqmp-dpsub fd4a0000.display: [drm] Cannot find any crtc or sizes
NOTICE: CPU2: psci_cpu_suspend() base=0x12280 next=0x125c0 delta=0x340
NOTICE: CPU3: psci_cpu_suspend() base=0x125c0 next=0x11c00 delta=0xfffffffffffff640
NOTICE: CPU3: psci_cpu_suspend() base=0x125c0 next=0x11c00 delta=0xfffffffffffff640
NOTICE: CPU0: psci_cpu_suspend() base=0x11c00 next=0x11f40 delta=0x340
nUOTnhIaCnEd:l e dCP EUx3c: epptsciio_nc ipun_ sEuLs3p.e ---> unexpectedly entering EL3
=0(3)0 b a se = 0 x 1 2 5 c0= n0exx0t00=000x01010c00000 0de0l00t2a 0xf0f f ff f f f f f f ff 6 4=0 x0000000000000000 x1 = 0x00000000000111f8
NOTICE: CPU1: psci_cpu_suspend() base=0x11f40 next=0x12280 delta=0x340
NOTICE: CPU0: psci_cpu_suspend() base=0x11c00 next=0x11f40 delta=0x340
NOTICE: CPU3: psci_cpu_suspend() base=0x125c0 next=0x11c00 delta=0xfffffffffffff640
NOTICE: CPU1: psci_cpu_suspend() base=0x11f40 next=0x12280 delta=0x340
--> the prints continue and the system hangs.
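The base/next/delta values in the NOTICE lines can be sanity-checked with a little arithmetic. The sketch below assumes that "delta" is simply next minus base printed as an unsigned 64-bit value, and that "next" is the per-CPU base of the following CPU, wrapping from CPU3 back to CPU0; the report states neither explicitly. Under those assumptions, the large value on CPU3 is the two's-complement form of -0x9c0, which is three strides of 0x340.

```python
MASK64 = (1 << 64) - 1

# (cpu, base, next) triples copied from the NOTICE lines in the log.
entries = [
    (0, 0x11c00, 0x11f40),
    (1, 0x11f40, 0x12280),
    (2, 0x12280, 0x125c0),
    (3, 0x125c0, 0x11c00),
]


def delta_u64(base, nxt):
    """Difference next - base, truncated to an unsigned 64-bit value."""
    return (nxt - base) & MASK64


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed integer."""
    return value - (1 << 64) if value >> 63 else value


deltas = {cpu: delta_u64(base, nxt) for cpu, base, nxt in entries}
stride = deltas[0]

for cpu, d in deltas.items():
    print(f"CPU{cpu}: delta={d:#x} signed={as_signed64(d):#x}")
```

This only shows that the printed numbers are internally consistent with four per-CPU regions spaced 0x340 apart. It does not say whether those bases are the ones the linker script intended.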
* It was stated that CPUs are unexpectedly entering EL3. Can I check if the entry itself was not expected or if there was a problem further down the line ie in the suspend path following the entry?
Exception Register Dump:
x30 = 0x0000000000000002
x0 = 0x0000000000000000
x1 = 0x0000000000012538
x2 = 0x00000000000134c0
x3 = 0x000000000000003f
x4 = 0x00000000000002c0
x5 = 0x0000000055540000
x6 = 0x000000000000b554
x7 = 0x00000000f9010000
x8 = 0x0000000000000008
x9 = 0x000000000000b1b0
x10 = 0x000000002000ff00
x11 = 0x000000000000b554
x12 = 0x00000000f9010400
x13 = 0x0000000000001ff0
x14 = 0x0000000000000001
x15 = 0x0000000000001100
x16 = 0xaa69440080c94020
x17 = 0x51453496800300f0
x18 = 0xa1c6bb25d660316b
x19 = 0x00012801026a8386
x20 = 0x246f2823c0616a0c
x21 = 0x2a0495a48c060008
x22 = 0x0020241c01920c03
x23 = 0x271802d906c4c238
x24 = 0xd56233b4061dd10c
x25 = 0xa4b115480a80090f
x26 = 0x8d098c0181808610
x[ 3.875646] mmc0: new high speed SDHC card at address aaaa
27 = 0x2241009a000000c0
x28 = 0x1c180f2194984856
x29 = 0x301c2a610c800908
scr_el3 = 0x0000000000000238
sctlr_el3 = 0x0000000030cd183f
cptr_el3 = 0x0000000000000000
tcr_el3 = 0x0000000080803520
daif = 0x00000000000003c0
mair_el3 = 0x00000000004400ff
spsr_el3 = 0x00000000200002cc
elr_el3 = 0x0000000000000002
ttbr0_el3 = 0x0000000000012ac0
esr_el3 = 0x000000008a000000
far_el3 = 0x0000000000000002
mpidr_el1 = 0x0000000080000003
sp_el0 = 0x0000000000012540
isr_el1 = 0x0000000000000000
dacr32_el2 = 0x0000000000000000
ifsr32_el2 = 0x0000000000000000
cpuectlr_el1 = 0x0000000000000040
cpumerrsr_el1 = 0x000000001a000040
l2merrsr_el1 = 0x0000000010100008
cpuactlr_el1 = 0x00001000090ca000
gicc_hppir = 0x00000000000003fe
gicc_ahppir = 0x0000000000000801
gicc_ctlr = 0x00000000000001e9
gicd_ispendr regs (Offsets 0x200-0x278)
Offset Value
0x200: 0x0000000000000012
0x208: 0x0000000000000000
0x210: 0x0000000000000000
0x218: 0x0000000000000000
0x220: 0x0000000000000000
0x228: 0x0000000000000000
0x230: 0x0000000000000000
0x238: 0x0000000000000000
0x240: 0x0000000000000000
0x248: 0x0000000000000000
0x250: 0x0000000000000000
0x258: 0x0000000000000000
0x260: 0x0000000000000000
0x268: 0x0000000000000000
0x270: 0x0000000000000000
0x278: 0x0000000000000000
cci_snoop_ctrl_cluster0x100000000c0000003
cci_snoop_ctrl_cluster1x100000000c0000000
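The esr_el3 and spsr_el3 values in the dump can be decoded with the ESR_ELx and SPSR_ELx field layouts from the Arm architecture (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]; for AArch64, SPSR M[3:2] gives the exception level and M[0] the stack pointer selection). The sketch below is a reading aid under those layouts, not a diagnosis: the exception class table is a small subset, and the helper names are illustrative.

```python
# Values copied from the register dump.
ESR_EL3 = 0x8a000000
SPSR_EL3 = 0x200002cc
ELR_EL3 = 0x2

# Subset of ESR_ELx exception classes; not exhaustive.
EXCEPTION_CLASSES = {
    0x00: "Unknown reason",
    0x17: "SMC instruction execution in AArch64 state",
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x22: "PC alignment fault",
    0x25: "Data Abort taken without a change in Exception level",
    0x26: "SP alignment fault",
    0x2f: "SError interrupt",
}


def decode_esr(esr):
    """Split an ESR_ELx value into its EC, IL and ISS fields."""
    ec = (esr >> 26) & 0x3f
    return {
        "ec": ec,
        "ec_name": EXCEPTION_CLASSES.get(ec, "not in this subset"),
        "il": (esr >> 25) & 0x1,
        "iss": esr & 0x1ffffff,
    }


def decode_spsr_mode(spsr):
    """Return (is_aarch64, exception_level, uses_sp_elx) from SPSR M[4:0]."""
    is_aarch64 = ((spsr >> 4) & 0x1) == 0
    return is_aarch64, (spsr >> 2) & 0x3, bool(spsr & 0x1)


esr = decode_esr(ESR_EL3)
is_aarch64, source_el, uses_sp_elx = decode_spsr_mode(SPSR_EL3)
elr_misaligned = (ELR_EL3 & 0x3) != 0

print(f"EC={esr['ec']:#x} ({esr['ec_name']}), IL={esr['il']}, ISS={esr['iss']:#x}")
print(f"taken from EL{source_el}, AArch64={is_aarch64}, SP_ELx={uses_sp_elx}")
print(f"ELR_EL3 misaligned: {elr_misaligned}")
```

With these values the exception class is a PC alignment fault taken from EL3 itself. The dump also shows x30, elr_el3 and far_el3 all equal to 0x2, which would be consistent with a branch or return to a corrupted address, though the dump alone does not confirm where that value came from.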
CPU state with debugger:
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (External debug access is disabled)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (Running)
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (APB AP transaction error, DAP status 0x30000021)
11 Cortex-A53 #1 (Running)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (No Power)
xsdb%
* Could you share the exact commit you are on for TF-A
Merge "docs(changelog): changelog for v2.14 release" into integration (ARM-software/arm-trusted-firmware@1d5aa93): https://github.com/ARM-software/arm-trusted-firmware/commit/1d5aa939bc8d3d892e2ed9945fa50e36a1a924cc
* Could you share the build configuration for the platform (all build-flags)
make -j20 RESET_TO_BL31=1 PLAT=zynqmp bl31 IPI_CRC_CHECK=1 DEBUG=1
With DEBUG=1, BL31 on ZynqMP runs from DDR at start address 0x1000; without DEBUG, it runs from OCM at start address 0x00000000fffea000.
* Would it be possible to share the BL31 logs (so we can see the crash register logs) and Linux logs (just to correlate) as well as bl31.dump/bl31.map file for the failure case (If you can share for the working case as well, that would be great)
Attached are the working and non-working dumps and logs for NUMA_AWARE_PER_CPU with DDR and OCM. In the working case, LTO is disabled.
CPU state with debugger with OCM BL31.elf:
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (Running)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Power On Reset)
13 Cortex-A53 #3 (Running)
xsdb% Info: Cortex-A53 #2 (target 12) Stopped at 0x0 (Cannot resume. APB AP transaction error, DAP status 0x30000021)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Stopped at 0x0 (Cannot resume. APB AP transaction error, DAP status 0x30000021)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #0 (target 10) Stopped at 0xfffea194 (Reset Catch)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (Reset Catch, EL3(S)/A64)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (Running)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
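The "Stopped at" addresses in the debugger output can be turned into offsets from the OCM start address given in the build notes (0xfffea000), which makes them easier to look up in bl31.dump or bl31.map. The sketch below assumes the non-DEBUG image is linked at that base and that addresses at or above it fall inside BL31. The regular expression and helper name are illustrative and only cover the message shape seen in this log.

```python
import re

# OCM start address for non-DEBUG builds, as stated in the build notes.
BL31_OCM_BASE = 0xfffea000

STOPPED_RE = re.compile(
    r"Cortex-A53 #(\d+) \(target \d+\) Stopped at (0x[0-9a-fA-F]+) \(([^)]*)\)"
)


def stopped_events(log_text):
    """Return (core, address, reason) for each 'Stopped at' message."""
    return [
        (int(core), int(addr, 16), reason)
        for core, addr, reason in STOPPED_RE.findall(log_text)
    ]


# Two lines copied from the debugger output.
log = (
    "Info: Cortex-A53 #2 (target 12) Stopped at 0x0 "
    "(Cannot resume. APB AP transaction error, DAP status 0x30000021)\n"
    "Info: Cortex-A53 #0 (target 10) Stopped at 0xfffea194 (Reset Catch)\n"
)
events = stopped_events(log)

for core, addr, reason in events:
    if addr >= BL31_OCM_BASE:
        print(f"core {core}: BL31 offset {addr - BL31_OCM_BASE:#x} ({reason})")
    else:
        print(f"core {core}: stopped at {addr:#x} ({reason})")
```

For the Reset Catch on core 0 this gives offset 0x194 from the OCM base. Which symbol that corresponds to depends on the attached bl31.dump/bl31.map, which are not reproduced in the thread.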
Regards, Rohit
++ @Kummari, Prasad <Prasad.Kummari@amd.com>
From: Kummari, Prasad
Sent: Tuesday, January 6, 2026 11:31 PM
To: 'Rohit Mathew' <Rohit.Mathew@arm.com>; Sammit Joshi <Sammit.Joshi@arm.com>; scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org>
Cc: Belsare, Akshay <akshay.belsare@amd.com>; Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>; Simek, Michal <michal.simek@amd.com>
Subject: RE: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Mathew,
Thank you for quick response and shared required information.
Regards, Prasad.
From: Rohit Mathew <Rohit.Mathew@arm.commailto:Rohit.Mathew@arm.com> Sent: Tuesday, January 6, 2026 9:21 PM To: Kummari, Prasad <Prasad.Kummari@amd.commailto:Prasad.Kummari@amd.com>; Sammit Joshi <Sammit.Joshi@arm.commailto:Sammit.Joshi@arm.com>; scan-admin--- via TF-A <tf-a@lists.trustedfirmware.orgmailto:tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.commailto:akshay.belsare@amd.com>; Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.commailto:MaheedharSai.Bollapalli@amd.com>; Simek, Michal <michal.simek@amd.commailto:michal.simek@amd.com> Subject: Re: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
Hi Prasad,
Thanks for the info. We tried booting on a few different platforms (ACPI-Linux/DT-Linux/TFTF) at our end with suspend and LTO enabled, and boot seems to go through without any problems, so we still can't reproduce the issue at our end. Could I request the following info from you to see if we can spot something?
* Do you see this happening only for the said platform (in case you have other platforms)?
The issue is currently observed only on our ZynqMP (Cortex-A53) based platform. We do not see the issue on other internal platforms we have access to. The failure is reproducible on the ZynqMP platform with LTO enabled.
* Since you had debug logs added, could we check if this happens on the first/specific CPU suspend or is this quite random? Were you able to trace it to a specific function/line within the suspend path where the crash occurs?
It appears to be quite random. We added debug logs in psci_cpu_suspend(). Although the BL31 NOTICE logs and Linux printk messages are interleaved and somewhat noisy, they are still understandable. I've attached the full logs for reference with DEBUG=1 and DEBUG=0.
Logs with DEBUG=1:
[ 4.966988] zynqmp-dpsub fd4a0000.display: [drm] Cannot find any crtc or sizes
NOTICE: CPU2: psci_cpu_suspend() base=0x12280 next=0x125c0 delta=0x340
NOTICE: CPU3: psci_cpu_suspend() base=0x125c0 next=0x11c00 delta=0xfffffffffffff640
NOTICE: CPU3: psci_cpu_suspend() base=0x125c0 next=0x11c00 delta=0xfffffffffffff640
NOTICE: CPU0: psci_cpu_suspend() base=0x11c00 next=0x11f40 delta=0x340
nUOTnhIaCnEd:l e dCP EUx3c: epptsciio_nc ipun_ sEuLs3p.e ---> unexpectedly entering EL3 =0(3)0 b a se = 0 x 1 2 5 c0= n0exx0t00=000x01010c00000 0de0l00t2a 0xf0f f ff f f f f f f ff 6 4=0 x0000000000000000 x1 = 0x00000000000111f8
NOTICE: CPU1: psci_cpu_suspend() base=0x11f40 next=0x12280 delta=0x340
NOTICE: CPU0: psci_cpu_suspend() base=0x11c00 next=0x11f40 delta=0x340
NOTICE: CPU3: psci_cpu_suspend() base=0x125c0 next=0x11c00 delta=0xfffffffffffff640
NOTICE: CPU1: psci_cpu_suspend() base=0x11f40 next=0x12280 delta=0x340
--> the prints continue and then the system hangs.
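The delta values in the log above can be cross-checked with a little arithmetic. The Python sketch below assumes two things that the log itself does not state: that `next` is the per-CPU base of the following core (wrapping from CPU3 back to CPU0), and that `delta` is printed as an unsigned 64-bit difference. The helper names are illustrative only. Under those assumptions the bases are evenly spaced by 0x340, and the large CPU3 value is just the wrapped form of -0x9c0 (three strides back), so these prints alone may not indicate a wrong base.

```python
# Per-CPU base addresses copied from the psci_cpu_suspend() NOTICE lines.
BASES = {0: 0x11c00, 1: 0x11f40, 2: 0x12280, 3: 0x125c0}
MASK64 = (1 << 64) - 1


def delta_u64(base, nxt):
    """Difference as an unsigned 64-bit subtraction would report it."""
    return (nxt - base) & MASK64


def to_signed64(value):
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - (1 << 64) if value >> 63 else value


def check_layout(bases):
    """Return (cpu, unsigned delta, signed delta) for each core.

    Assumption: the 'next' base of the last core wraps to the first core.
    """
    cpus = sorted(bases)
    rows = []
    for i, cpu in enumerate(cpus):
        nxt = bases[cpus[(i + 1) % len(cpus)]]
        raw = delta_u64(bases[cpu], nxt)
        rows.append((cpu, raw, to_signed64(raw)))
    return rows


for cpu, raw, signed in check_layout(BASES):
    print(f"CPU{cpu}: delta={raw:#x} signed={signed:#x}")
```

If a future log shows a delta that is neither the stride nor the wrapped multiple of it, that would be a more direct hint of a bad base calculation.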
* It was stated that CPUs are unexpectedly entering EL3. Can I check if the entry itself was not expected or if there was a problem further down the line, i.e. in the suspend path following the entry?
Exception Register Dump:
x30 = 0x0000000000000002
x0 = 0x0000000000000000
x1 = 0x0000000000012538
x2 = 0x00000000000134c0
x3 = 0x000000000000003f
x4 = 0x00000000000002c0
x5 = 0x0000000055540000
x6 = 0x000000000000b554
x7 = 0x00000000f9010000
x8 = 0x0000000000000008
x9 = 0x000000000000b1b0
x10 = 0x000000002000ff00
x11 = 0x000000000000b554
x12 = 0x00000000f9010400
x13 = 0x0000000000001ff0
x14 = 0x0000000000000001
x15 = 0x0000000000001100
x16 = 0xaa69440080c94020
x17 = 0x51453496800300f0
x18 = 0xa1c6bb25d660316b
x19 = 0x00012801026a8386
x20 = 0x246f2823c0616a0c
x21 = 0x2a0495a48c060008
x22 = 0x0020241c01920c03
x23 = 0x271802d906c4c238
x24 = 0xd56233b4061dd10c
x25 = 0xa4b115480a80090f
x26 = 0x8d098c0181808610
x[ 3.875646] mmc0: new high speed SDHC card at address aaaa 27 = 0x2241009a000000c0
x28 = 0x1c180f2194984856
x29 = 0x301c2a610c800908
scr_el3 = 0x0000000000000238
sctlr_el3 = 0x0000000030cd183f
cptr_el3 = 0x0000000000000000
tcr_el3 = 0x0000000080803520
daif = 0x00000000000003c0
mair_el3 = 0x00000000004400ff
spsr_el3 = 0x00000000200002cc
elr_el3 = 0x0000000000000002
ttbr0_el3 = 0x0000000000012ac0
esr_el3 = 0x000000008a000000
far_el3 = 0x0000000000000002
mpidr_el1 = 0x0000000080000003
sp_el0 = 0x0000000000012540
isr_el1 = 0x0000000000000000
dacr32_el2 = 0x0000000000000000
ifsr32_el2 = 0x0000000000000000
cpuectlr_el1 = 0x0000000000000040
cpumerrsr_el1 = 0x000000001a000040
l2merrsr_el1 = 0x0000000010100008
cpuactlr_el1 = 0x00001000090ca000
gicc_hppir = 0x00000000000003fe
gicc_ahppir = 0x0000000000000801
gicc_ctlr = 0x00000000000001e9
gicd_ispendr regs (Offsets 0x200-0x278)
Offset Value
0x200: 0x0000000000000012
0x208: 0x0000000000000000
0x210: 0x0000000000000000
0x218: 0x0000000000000000
0x220: 0x0000000000000000
0x228: 0x0000000000000000
0x230: 0x0000000000000000
0x238: 0x0000000000000000
0x240: 0x0000000000000000
0x248: 0x0000000000000000
0x250: 0x0000000000000000
0x258: 0x0000000000000000
0x260: 0x0000000000000000
0x268: 0x0000000000000000
0x270: 0x0000000000000000
0x278: 0x0000000000000000
cci_snoop_ctrl_cluster0x100000000c0000003
cci_snoop_ctrl_cluster1x100000000c0000000
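To make the dump above easier to read, the syndrome and saved program status can be decoded by hand. The Python sketch below uses the ESR_ELx and SPSR_ELx field layouts from the Armv8-A architecture (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]; M[3:0] and DAIF in SPSR). The lookup tables are deliberately partial, and the function names are illustrative rather than anything from TF-A. With the logged values it reports EC 0x22 (PC alignment fault) taken from EL3t, which would be consistent with elr_el3, far_el3 and x30 all holding 0x2; what put that value in the link register is not something this decode can show.

```python
# Partial table of ESR_ELx exception classes (Armv8-A encoding).
EC_NAMES = {
    0x17: "SMC instruction execution in AArch64 state",
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x22: "PC alignment fault",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
    0x26: "SP alignment fault",
    0x2F: "SError interrupt",
}

# SPSR_ELx.M[3:0] values for AArch64 state.
MODE_NAMES = {
    0x0: "EL0t", 0x4: "EL1t", 0x5: "EL1h",
    0x8: "EL2t", 0x9: "EL2h", 0xC: "EL3t", 0xD: "EL3h",
}


def decode_esr(esr):
    """Split an ESR_ELx value into EC, IL and ISS fields."""
    ec = (esr >> 26) & 0x3F
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in this partial table"),
        "il": (esr >> 25) & 1,
        "iss": esr & 0x1FFFFFF,
    }


def decode_spsr(spsr):
    """Extract the source mode and the DAIF mask bits from SPSR_ELx."""
    return {
        "mode": MODE_NAMES.get(spsr & 0xF, "unknown"),
        "aarch32": bool((spsr >> 4) & 1),
        "daif": (spsr >> 6) & 0xF,  # bit order: D, A, I, F
    }


# Values copied from the register dump above.
esr = decode_esr(0x8A000000)
spsr = decode_spsr(0x200002CC)
elr = 0x2

print(f"EC={esr['ec']:#x} ({esr['ec_name']}), IL={esr['il']}, ISS={esr['iss']:#x}")
print(f"taken from {spsr['mode']}, DAIF={spsr['daif']:#06b}")
print(f"ELR_EL3 4-byte aligned: {elr % 4 == 0}")
```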
CPU state with debugger:
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (External debug access is disabled)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (Running)
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (APB AP transaction error, DAP status 0x30000021)
11 Cortex-A53 #1 (Running)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (No Power)
xsdb%
* Could you share the exact commit you are on for TF-A?
Merge "docs(changelog): changelog for v2.14 release" into integration, ARM-software/arm-trusted-firmware@1d5aa93: https://github.com/ARM-software/arm-trusted-firmware/commit/1d5aa939bc8d3d892e2ed9945fa50e36a1a924cc
* Could you share the build configuration for the platform (all build flags)?
make -j20 RESET_TO_BL31=1 PLAT=zynqmp bl31 IPI_CRC_CHECK=1 DEBUG=1
With DEBUG=1, BL31 on ZynqMP runs from DDR at start address 0x1000; without DEBUG it runs from OCM at start address 0x00000000fffea000.
* Would it be possible to share the BL31 logs (so we can see the crash register logs) and Linux logs (just to correlate) as well as the bl31.dump/bl31.map files for the failure case? (If you can share them for the working case as well, that would be great.)
Attached are the working and non-working dumps and logs for NUMA_AWARE_PER_CPU with DDR and OCM. In the working case, LTO is disabled.
CPU state with debugger with OCM BL31.elf:
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (Running)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Power On Reset)
13 Cortex-A53 #3 (Running)
xsdb% Info: Cortex-A53 #2 (target 12) Stopped at 0x0 (Cannot resume. APB AP transaction error, DAP status 0x30000021)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Stopped at 0x0 (Cannot resume. APB AP transaction error, DAP status 0x30000021)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #0 (target 10) Stopped at 0xfffea194 (Reset Catch)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (Reset Catch, EL3(S)/A64)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (Running)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
Regards, Rohit
From: Kummari, Prasad <Prasad.Kummari@amd.com> Date: Tuesday, 6 January 2026 at 06:34 To: Rohit Mathew <Rohit.Mathew@arm.com>, Sammit Joshi <Sammit.Joshi@arm.com>, scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.com>, Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>, Simek, Michal <michal.simek@amd.com> Subject: RE: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Mathew,
Yes, if the CPU_IDLE-related configurations listed below are removed while compiling the kernel image, the hang is no longer observed.
Configs:
* CONFIG_CPU_IDLE=y
* CONFIG_CPU_IDLE_MULTIPLE_DRIVERS=y
* CONFIG_CPU_IDLE_GOV_MENU=y
* CONFIG_DT_IDLE_STATES=y
Observed Behavior
* CPUs enter psci_cpu_suspend(). Debug prints confirm that execution reaches the suspend path.
* The system later triggers an Unhandled Exception in EL3.
* The affected CPUs enter Reset Catch at EL3 and hang with the PC at bl31_warm_entrypoint()->el3_entrypoint_common.
Regards, Prasad.
From: Rohit Mathew <Rohit.Mathew@arm.com> Sent: Tuesday, January 6, 2026 4:37 AM To: Kummari, Prasad <Prasad.Kummari@amd.com>; Sammit Joshi <Sammit.Joshi@arm.com>; scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.com>; Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>; Simek, Michal <michal.simek@amd.com> Subject: Re: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Prasad,
Thanks for your mail. I'm trying to reproduce this at our end. Could I check if you see the boot progressing to shell without the CPU idle dependencies?
Regards, Rohit
From: Kummari, Prasad <Prasad.Kummari@amd.com> Date: Monday, 5 January 2026 at 18:04 To: Sammit Joshi <Sammit.Joshi@arm.com>, Rohit Mathew <Rohit.Mathew@arm.com>, scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.com>, Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>, Simek, Michal <michal.simek@amd.com> Subject: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hello Maintainers,
We are observing a reproducible runtime regression on the ZynqMP (Cortex-A53) platform after enabling LTO (ENABLE_LTO=1) and merging the changes from the topic NUMA_AWARE_PER_CPU into our integration branch (https://github.com/ARM-software/arm-trusted-firmware/commit/7303319b3823e9e3...).
Summary of the issue
1. Baseline behavior
* Platform: ZynqMP (Cortex-A53)
* Configuration: ENABLE_LTO=1
* Without NUMA_AWARE_PER_CPU: Linux boots and runs stably
* After merging NUMA_AWARE_PER_CPU, Linux boots but hangs during runtime
* During the hang, CPUs are observed to unexpectedly re-enter EL3
* Re-entry into EL3 should not occur during normal Linux runtime execution and strongly suggests corruption or mismanagement of PSCI and/or per-CPU state (see lib/per_cpu/aarch64/per_cpu_asm.S: https://github.com/ARM-software/arm-trusted-firmware/blob/master/lib/per_cpu/aarch64/per_cpu_asm.S#L28)
* Reverting the NUMA_AWARE_PER_CPU changes restores stable Linux execution
* The issue is reproducible only when NUMA_AWARE_PER_CPU is present
* This clearly identifies NUMA_AWARE_PER_CPU as the regression source
2. Suspect with LTO
* With NUMA_AWARE_PER_CPU enabled, LTO breaks the per-CPU base calculation
* BL31 contains hand-written assembly that relies on linker-script symbols (e.g., per-CPU section boundaries)
* Under LTO, symbol placement and retention are no longer guaranteed in the same way, leading to incorrect per-CPU base computation
* This results in corrupted per-CPU data and subsequent erroneous PSCI suspend behavior (EL3 re-entry)
3. CPU idle dependency
* The following kernel configuration options are enabled:
* CONFIG_CPU_IDLE=y
* CONFIG_CPU_IDLE_MULTIPLE_DRIVERS=y
* CONFIG_CPU_IDLE_GOV_MENU=y
* CONFIG_DT_IDLE_STATES=y
* This further suggests the issue is triggered during CPU idle / suspend-resume paths, where correct per-CPU state handling is critical
Based on the above:
* This is specific to NUMA_AWARE_PER_CPU combined with LTO
* The failure mode points to per-CPU base calculation and PSCI state corruption
* Reverting NUMA_AWARE_PER_CPU fully restores stability on ZynqMP
We wanted to report this issue upstream and seek guidance on:
* Whether NUMA_AWARE_PER_CPU is expected to be LTO-safe on platforms relying on linker-defined per-CPU sections
* Or if additional constraints / fixes are required for platforms like ZynqMP
We are happy to provide further logs, configuration details, or help with fixes.
Regards, Prasad Kummari
Hi Prasad,
Thanks for sharing all the build artefacts and requested information. The register states (X2 and SP_EL0) together with the map/dump files for the DDR boot show that CPU2 took an exception, but the exact point of the crash is unfortunately not recoverable from the shared artefacts: some registers are lost due to scrambled logs, and the others don’t show any offsets/addresses that can be traced back through the dump and map files. The OCM boot logs don’t show an exception, so there isn’t much to decode from those logs.
The map and dump files for the failing LTO builds don’t look different from the passing ones we have on our end (we have 3 passing builds on 3 different platforms). The order in which objects are placed in the section changes between an LTO and a non-LTO build, but the per-cpu accessors are in general object-order-agnostic within a CPU’s space.
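One way to make that comparison repeatable is to diff symbol offsets relative to the start of the per-CPU section in the two builds. The Python sketch below parses text in the three-column form printed by `nm -n` (address, type, name). The symbol names and both listings are invented placeholders for illustration, not the real TF-A linker symbols or the contents of the attached map files.

```python
def parse_nm(text):
    """Parse '<hex address> <type> <name>' lines into {name: address}."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            address, _kind, name = parts
            symbols[name] = int(address, 16)
    return symbols


def offsets_from(symbols, start_name):
    """Express every symbol as an offset from the given start symbol."""
    start = symbols[start_name]
    return {name: address - start for name, address in symbols.items()}


def changed_offsets(first, second):
    """Return {name: (offset in first, offset in second)} where they differ."""
    shared = sorted(first.keys() & second.keys())
    return {n: (first[n], second[n]) for n in shared if first[n] != second[n]}


# Hypothetical listings: same objects, different order inside the section.
NON_LTO = """\
0000000000011c00 B example_percpu_start
0000000000011c00 B example_obj_a
0000000000011d00 B example_obj_b
0000000000011f40 B example_percpu_unit_end
"""
LTO = """\
0000000000011c00 B example_percpu_start
0000000000011c00 B example_obj_b
0000000000011e40 B example_obj_a
0000000000011f40 B example_percpu_unit_end
"""

before = offsets_from(parse_nm(NON_LTO), "example_percpu_start")
after = offsets_from(parse_nm(LTO), "example_percpu_start")
for name, (old, new) in changed_offsets(before, after).items():
    print(f"{name}: {old:#x} -> {new:#x}")
```

In this made-up case the two objects swap places while the unit end offset stays the same, which is the harmless kind of difference described above; a change in the section bounds or unit size would be the more interesting thing to look for.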
You did mention that the cores that crashed then woke up in bl31_warm_entrypoint. Could this mean that the CPU suspend sequence from the ZynqMP PMU was issued before the core took the exception, so that the core never reached the WFI and went into the exception instead? Is this something that could be checked relatively easily? Also, do you know the exact instruction at which such cores were hung in the warm boot entrypoint? If we don’t get much more information, we might have to think about adding a debug patch to retrieve more details about the crash. Could we also check:
* What version of the compiler is being used?
* DEBUG=1 is passed in the config (i.e. DEBUG=1 ENABLE_LTO=1). Does this also happen without DEBUG=1 (i.e. ENABLE_LTO=1 DEBUG=0)?
It seems that not everyone on the Arm side is getting the mails from the mailing list. Would it be okay to move the rest of the discussion/debug to the TF-A Discord so that everyone can participate? Let us know.
Regards, Rohit
From: Kummari, Prasad Prasad.Kummari@amd.com Date: Tuesday, 6 January 2026 at 18:11 To: Rohit Mathew Rohit.Mathew@arm.com, Sammit Joshi Sammit.Joshi@arm.com, scan-admin--- via TF-A tf-a@lists.trustedfirmware.org Cc: Belsare, Akshay akshay.belsare@amd.com, Bollapalli, Maheedhar Sai MaheedharSai.Bollapalli@amd.com, Simek, Michal michal.simek@amd.com, Kummari, Prasad Prasad.Kummari@amd.com Subject: RE: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
++ @Kummari, Prasad <Prasad.Kummari@amd.com>
Hi Mathew,
Thank you for your response.
From: Rohit Mathew Rohit.Mathew@arm.com Sent: Thursday, January 8, 2026 8:30 PM To: Kummari, Prasad Prasad.Kummari@amd.com; Sammit Joshi Sammit.Joshi@arm.com; scan-admin--- via TF-A tf-a@lists.trustedfirmware.org Cc: Belsare, Akshay akshay.belsare@amd.com; Bollapalli, Maheedhar Sai MaheedharSai.Bollapalli@amd.com; Simek, Michal michal.simek@amd.com; Kummari, Prasad Prasad.Kummari@amd.com; Manish Pandey2 Manish.Pandey2@arm.com; Boyan Karatotev Boyan.Karatotev@arm.com; Chris Kay Chris.Kay@arm.com Subject: Re: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Prasad,
Thanks for sharing all the build artefacts and requested information. The register states (X2 and SP_EL0) together with the map/dump files for the DDR boot show that CPU2 took an exception, but the exact point of the crash is unfortunately not recoverable from the shared artefacts: some registers are lost due to scrambled logs, and the others don't show any offsets/addresses that can be traced back through the dump and map files. The OCM boot logs don't show an exception, so there isn't much to decode from those logs.
The map and dump files for the failing LTO builds don't look different from the passing ones we have on our end (we have 3 passing builds on 3 different platforms). The order in which objects are placed in the section changes between an LTO and a non-LTO build, but the per-cpu accessors are in general object-order-agnostic within a CPU's space.
You did mention that the cores that crashed then woke up in bl31_warm_entrypoint. Could this mean that the CPU suspend sequence from the ZynqMP PMU was issued before the core took the exception, so that the core never reached the WFI and went into the exception instead? Is this something that could be checked relatively easily? Also, do you know the exact instruction at which such cores were hung in the warm boot entrypoint? If we don't get much more information, we might have to think about adding a debug patch to retrieve more details about the crash. Could we also check:
Observations:
The cores did not actually wake up. Instead, they were found in the No Power, Running and Reset Catch states, with the CPU program counter pointing to bl31_warmboot_entrypoint. When an interrupt is manually injected using the XSDB debugger, the core transitions to the Running state. It appears that the CPU suspend sequence was issued to the ZynqMP PMU, after which an exception was triggered, as per the logs. The exact instruction at which the core gets stuck could not be determined, as the behavior is quite random. In some cases, certain CPUs enter Reset Catch with the PC pointing to bl31_warmboot_entrypoint.
We performed three iterations in DDR. The system booted correctly once, while the issue occurred in the remaining two runs.
Refer to the logs with BL31 runtime logs disabled for a proper dump:
DMSG:
[ 6.617446] sd 2:0:0:0: [sdb] Attached SCSI removable disk
INIT: Entering runlevel: 5
Configuring network interfaces...
[ 7.050920] macb ff0e0000.ethernet eth0: PHY [ff0e0000.ethernet-ffffffff:0c] driver [TI DP83867] (irq=POLL)
[ 7.060769] macb ff0e0000.ethernet eth0: configuring for phy/rgmii-id link mode
[ 7.068763] macb ff0e0000.ethernet: gem-ptp-timer ptp clock registered.
udhcpc: started, v1.36.1
udhcpc: broadcasting discover
Unhandled Exception in EL3.
x30 = 0x0000000000000002
x0 = 0x0000000000000000
x1 = 0x0000000000011638
x2 = 0x00000000000125c0
x3 = 0x000000000000003f
x4 = 0x00000000000002c0
x5 = 0x0000000055540000
x6 = 0x000000000000b554
x7 = 0x00000000f9010000
x8 = 0x0000000000000008
x9 = 0x000000000000b1b0
x10 = 0x000000002000ff00
x11 = 0x000000000000b554
x12 = 0x00000000f9010400
x13 = 0x0000000000001ff0
x14 = 0x0000000000000001
x15 = 0x0000000000001100
x16 = 0x6710328481440018
x17 = 0x5000735413ee4101
x18 = 0x201f0181a5141414
x19 = 0x0040420cc905600d
x20 = 0x9320242204661cc5
x21 = 0x00da002043400012
x22 = 0x000042401021fc8f
x23 = 0x8402230493c01e4c
x24 = 0x102238309108d512
x25 = 0x908ce0c38144b4ab
x26 = 0xe22408240944ae0f
x27 = 0x0040008048920017
x28 = 0x194c481653804086
x29 = 0x0af2ec062181d02d
scr_el3 = 0x0000000000000238
sctlr_el3 = 0x0000000030cd183f
cptr_el3 = 0x0000000000000000
tcr_el3 = 0x0000000080803520
daif = 0x00000000000003c0
mair_el3 = 0x00000000004400ff
spsr_el3 = 0x00000000200002cc
elr_el3 = 0x0000000000000002
ttbr0_el3 = 0x0000000000011bc0
esr_el3 = 0x000000008a000000
far_el3 = 0x0000000000000002
mpidr_el1 = 0x0000000080000003
sp_el0 = 0x0000000000011640
isr_el1 = 0x0000000000000000
dacr32_el2 = 0x0000000000000000
ifsr32_el2 = 0x0000000000000000
cpuectlr_el1 = 0x0000000000000040
cpumerrsr_el1 = 0x000000000804020a
l2merrsr_el1 = 0x0000000010008120
cpuactlr_el1 = 0x00001000090ca000
gicc_hppir = 0x00000000000003fe
gicc_ahppir = 0x0000000000000801
gicc_ctlr = 0x00000000000001e9
gicd_ispendr regs (Offsets 0x200-0x278)
Offset Value
0x200: 0x0000000000000012
0x208: 0x0000000000000000
0x210: 0x0000000000000000
0x218: 0x0000000000000000
0x220: 0x0000000000000000
0x228: 0x0000000000000000
0x230: 0x0000000000000000
0x238: 0x0000000000000000
0x240: 0x0000000000000000
0x248: 0x0000000000000000
0x250: 0x0000000000000000
0x258: 0x0000000000000000
0x260: 0x0000000000000000
0x268: 0x0000000000000000
0x270: 0x0000000000000000
0x278: 0x0000000000000000
cci_snoop_ctrl_cluster0x100000000c0000003
cci_snoop_ctrl_cluster1x100000000c0000000
cci_snoop_ctrl_cluster1x100000000c0000000
[ 10.151108] macb ff0e0000.ethernet eth0: Link is Up - 1Gbps/Full - flow control tx
[ 14.824964] platform ina226-u76: deferred probe pending: iio_hwmon: Failed to get channels
[ 14.833259] platform ina226-u77: deferred probe pending: iio_hwmon: Failed to get channels
[ 14.841532] platform ina226-u78: deferred probe pending: iio_hwmon: Failed to get channels
[ 14.849803] platform ina226-u87: deferred probe pending: iio_hwmon: Failed to get channels
[ 14.858074] platform ina226-u85: deferred probe pending: iio_hwmon: Failed to get channels
[ 14.866344] platform ina226-u86: deferred probe pending: iio_hwmon: Failed to get channels
.....hang
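For anyone decoding the dump above by hand, here is a minimal sketch that splits ESR_EL3 into its EC/IL/ISS fields (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]) and pulls Aff0 out of MPIDR_EL1 and the mode out of SPSR_EL3. The helper names are made up for illustration, and the EC table lists only a handful of exception classes. With the logged values, EC decodes to 0x22 (PC alignment fault), which would be consistent with elr_el3, far_el3 and x30 all reading 0x2, though this alone does not say where the bad return address came from.

```python
# Values copied from the crash dump in this mail.
ESR_EL3 = 0x8A000000
MPIDR_EL1 = 0x80000003
SPSR_EL3 = 0x200002CC

# Partial table of AArch64 exception classes (illustrative subset only).
EC_NAMES = {
    0x21: "Instruction Abort, same Exception level",
    0x22: "PC alignment fault",
    0x25: "Data Abort, same Exception level",
    0x26: "SP alignment fault",
}


def decode_esr(esr):
    """Return (EC, IL, ISS) from an ESR_ELx value."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    return ec, il, iss


def mpidr_aff0(mpidr):
    """Return affinity level 0 (bits [7:0]) of MPIDR_EL1."""
    return mpidr & 0xFF


def spsr_mode(spsr):
    """Return the AArch64 mode name (for example 'EL3t') from SPSR M[3:0]."""
    m = spsr & 0xF
    el = m >> 2
    sp = "h" if m & 1 else "t"  # h: SP_ELx in use, t: SP_EL0 in use
    return f"EL{el}{sp}"


ec, il, iss = decode_esr(ESR_EL3)
print(f"EC=0x{ec:02x} ({EC_NAMES.get(ec, 'not in table')}), IL={il}, ISS=0x{iss:x}")
print(f"Aff0={mpidr_aff0(MPIDR_EL1)}, taken from {spsr_mode(SPSR_EL3)}")
```

The mode field reads EL3t, so the exception appears to have been taken while EL3 was already executing on SP_EL0.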
debugger logs:
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (No Power)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (Running)
xsdb% Info: Cortex-A53 #0 (target 10) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (APB AP transaction error, DAP status 0x30000021)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #0 (target 10) Stopped at 0x1194 (Reset Catch)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% ta 10
9 APU
10 Cortex-A53 #0 (Reset Catch, EL3(S)/A64)
11 Cortex-A53 #1 (Running)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (Running)
xsdb% rrd pc
pc: 0000000000001194
bl31.dump:
0000000000001194 <bl31_warm_entrypoint>:
1194: d2810600 mov x0, #0x830 // #2096
1198: f2a618a0 movk x0, #0x30c5, lsl #16
119c: d51e1000 msr sctlr_el3, x0
11a0: d5033fdf isb
11a4: 1004b2e0 adr x0, a800 <sync_exception_sp_el0>
11a8: d51ec000 msr vbar_el3, x0
11ac: d5033fdf isb
* What version of compiler is being used?
aarch64-linux/bin/aarch64-linux-gnu-gcc --version
ls (GNU coreutils) 8.32
Copyright (C) 2020 Free Software Foundation, Inc.
* DEBUG=1 is passed in the config (ie DEBUG=1 ENABLE_LTO=1), does this happen without DEBUG=1 (ie ENABLE_LTO=1 DEBUG=0)?
Yes, it happens with both DEBUG=1 and DEBUG=0. ENABLE_LTO=1 is enabled by default in platform.mk.
It seems that not everyone on the Arm side is getting the mails from the mailing list. Would it be okay to move the rest of the discussion/debug to the TF-A Discord so that everyone can participate? Let us know.
Yes, please. Could you also share the debug patch? We will apply it and share the required logs with you.
Regards, Rohit
From: Kummari, Prasad <Prasad.Kummari@amd.com> Date: Tuesday, 6 January 2026 at 18:11 To: Rohit Mathew <Rohit.Mathew@arm.com>, Sammit Joshi <Sammit.Joshi@arm.com>, scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.com>, Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>, Simek, Michal <michal.simek@amd.com>, Kummari, Prasad <Prasad.Kummari@amd.com> Subject: RE: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
++ @Kummari, Prasad <Prasad.Kummari@amd.com>
From: Kummari, Prasad Sent: Tuesday, January 6, 2026 11:31 PM To: 'Rohit Mathew' <Rohit.Mathew@arm.com>; Sammit Joshi <Sammit.Joshi@arm.com>; scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.com>; Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>; Simek, Michal <michal.simek@amd.com> Subject: RE: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Mathew,
Thank you for the quick response. The requested information is shared inline below.
Regards, Prasad.
From: Rohit Mathew <Rohit.Mathew@arm.com> Sent: Tuesday, January 6, 2026 9:21 PM To: Kummari, Prasad <Prasad.Kummari@amd.com>; Sammit Joshi <Sammit.Joshi@arm.com>; scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.com>; Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>; Simek, Michal <michal.simek@amd.com> Subject: Re: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Prasad,
Thanks for the info. We tried booting on a few different platforms (ACPI-Linux/DT-Linux/TFTF) on our end with Suspend and LTO enabled, and the boot goes through without any problems, so we still can't reproduce the issue. Could I request the following info from you to see if we can spot something?
* Do you see this happening only for the said platform (In case you have other platforms)
The issue is currently observed only on our ZynqMP (Cortex-A53) based platform. We do not see the issue on other internal platforms we have access to. The failure is reproducible on the ZynqMP platform with LTO enabled.
* Since you had debug logs added, could we check if this happens on the first/specific CPU suspend or is this quite random? Were you able to trace it to a specific function/line within the suspend path where the crash occurs?
It appears to be quite random. Added debug logs in psci_cpu_suspend(). Although BL31 NOTICE logs and Linux printk messages are interleaved and somewhat noisy, they are still understandable. I've attached the full logs for reference with DEBUG=1 and DEBUG=0.
LOGs DEBUG=1:
[ 4.966988] zynqmp-dpsub fd4a0000.display: [drm] Cannot find any crtc or sizes
NOTICE: CPU2: psci_cpu_suspend() base=0x12280 next=0x125c0 delta=0x340
NOTICE: CPU3: psci_cpu_suspend() base=0x125c0 next=0x11c00 delta=0xfffffffffffff640
NOTICE: CPU3: psci_cpu_suspend() base=0x125c0 next=0x11c00 delta=0xfffffffffffff640
NOTICE: CPU0: psci_cpu_suspend() base=0x11c00 next=0x11f40 delta=0x340
nUOTnhIaCnEd:l e dCP EUx3c: epptsciio_nc ipun_ sEuLs3p.e ---> unexpectedly entering EL3 =0(3)0 b a se = 0 x 1 2 5 c0= n0exx0t00=000x01010c00000 0de0l00t2a 0xf0f f ff f f f f f f ff 6 4=0 x0000000000000000 x1 = 0x00000000000111f8
NOTICE: CPU1: psci_cpu_suspend() base=0x11f40 next=0x12280 delta=0x340
NOTICE: CPU0: psci_cpu_suspend() base=0x11c00 next=0x11f40 delta=0x340
NOTICE: CPU3: psci_cpu_suspend() base=0x125c0 next=0x11c00 delta=0xfffffffffffff640
NOTICE: CPU1: psci_cpu_suspend() base=0x11f40 next=0x12280 delta=0x340
--> continues prints and hang.
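As a quick sanity check on the NOTICE lines above, the following sketch recomputes each printed delta as an unsigned 64-bit subtraction. It assumes (the debug patch itself is not shown in this thread) that delta is simply next minus base, and that CPU3's next wraps around to CPU0's base. Under those assumptions the large value printed for CPU3 is just -0x9c0, i.e. three strides of 0x340 backwards, so the four bases look evenly spaced rather than corrupted.

```python
MASK64 = (1 << 64) - 1

# (base, next) pairs copied from the psci_cpu_suspend() debug prints.
PRINTED = {
    0: (0x11C00, 0x11F40),
    1: (0x11F40, 0x12280),
    2: (0x12280, 0x125C0),
    3: (0x125C0, 0x11C00),
}


def delta_u64(base, nxt):
    """Difference as it would print from an unsigned 64-bit subtraction."""
    return (nxt - base) & MASK64


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - (1 << 64) if value >> 63 else value


for cpu, (base, nxt) in PRINTED.items():
    d = delta_u64(base, nxt)
    print(f"CPU{cpu}: delta=0x{d:x} signed={as_signed64(d):#x}")
```

This only shows that the printed numbers are self-consistent; it does not rule out a problem elsewhere in the per-CPU base computation.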
* It was stated that CPUs are unexpectedly entering EL3. Can I check if the entry itself was not expected or if there was a problem further down the line ie in the suspend path following the entry?
Exception Register Dump:
x30 = 0x0000000000000002
x0 = 0x0000000000000000
x1 = 0x0000000000012538
x2 = 0x00000000000134c0
x3 = 0x000000000000003f
x4 = 0x00000000000002c0
x5 = 0x0000000055540000
x6 = 0x000000000000b554
x7 = 0x00000000f9010000
x8 = 0x0000000000000008
x9 = 0x000000000000b1b0
x10 = 0x000000002000ff00
x11 = 0x000000000000b554
x12 = 0x00000000f9010400
x13 = 0x0000000000001ff0
x14 = 0x0000000000000001
x15 = 0x0000000000001100
x16 = 0xaa69440080c94020
x17 = 0x51453496800300f0
x18 = 0xa1c6bb25d660316b
x19 = 0x00012801026a8386
x20 = 0x246f2823c0616a0c
x21 = 0x2a0495a48c060008
x22 = 0x0020241c01920c03
x23 = 0x271802d906c4c238
x24 = 0xd56233b4061dd10c
x25 = 0xa4b115480a80090f
x26 = 0x8d098c0181808610
x[ 3.875646] mmc0: new high speed SDHC card at address aaaa 27 = 0x2241009a000000c0
x28 = 0x1c180f2194984856
x29 = 0x301c2a610c800908
scr_el3 = 0x0000000000000238
sctlr_el3 = 0x0000000030cd183f
cptr_el3 = 0x0000000000000000
tcr_el3 = 0x0000000080803520
daif = 0x00000000000003c0
mair_el3 = 0x00000000004400ff
spsr_el3 = 0x00000000200002cc
elr_el3 = 0x0000000000000002
ttbr0_el3 = 0x0000000000012ac0
esr_el3 = 0x000000008a000000
far_el3 = 0x0000000000000002
mpidr_el1 = 0x0000000080000003
sp_el0 = 0x0000000000012540
isr_el1 = 0x0000000000000000
dacr32_el2 = 0x0000000000000000
ifsr32_el2 = 0x0000000000000000
cpuectlr_el1 = 0x0000000000000040
cpumerrsr_el1 = 0x000000001a000040
l2merrsr_el1 = 0x0000000010100008
cpuactlr_el1 = 0x00001000090ca000
gicc_hppir = 0x00000000000003fe
gicc_ahppir = 0x0000000000000801
gicc_ctlr = 0x00000000000001e9
gicd_ispendr regs (Offsets 0x200-0x278)
Offset Value
0x200: 0x0000000000000012
0x208: 0x0000000000000000
0x210: 0x0000000000000000
0x218: 0x0000000000000000
0x220: 0x0000000000000000
0x228: 0x0000000000000000
0x230: 0x0000000000000000
0x238: 0x0000000000000000
0x240: 0x0000000000000000
0x248: 0x0000000000000000
0x250: 0x0000000000000000
0x258: 0x0000000000000000
0x260: 0x0000000000000000
0x268: 0x0000000000000000
0x270: 0x0000000000000000
0x278: 0x0000000000000000
cci_snoop_ctrl_cluster0x100000000c0000003
cci_snoop_ctrl_cluster1x100000000c0000000
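One way to compare this dump with the one posted in the 8 January mail is to subtract the address-like registers pairwise. The sketch below does that for x1, x2, sp_el0 and ttbr0_el3; the dictionary names are illustrative, and which build produced which dump is not stated in the thread. With the logged values every difference comes out as the same constant, which would suggest the two crashes hit the same point with the image data merely shifted between builds.

```python
# Address-like registers from the dump in the 8 January mail (dump_a)
# and from the dump in this mail (dump_b).
dump_a = {"x1": 0x11638, "x2": 0x125C0, "sp_el0": 0x11640, "ttbr0_el3": 0x11BC0}
dump_b = {"x1": 0x12538, "x2": 0x134C0, "sp_el0": 0x12540, "ttbr0_el3": 0x12AC0}


def register_offsets(a, b):
    """Return b[reg] - a[reg] for every register present in both dumps."""
    return {reg: b[reg] - a[reg] for reg in a if reg in b}


offsets = register_offsets(dump_a, dump_b)
for reg, off in offsets.items():
    print(f"{reg}: {off:#x}")
print("constant shift" if len(set(offsets.values())) == 1 else "shifts differ")
```

The remaining fault-related registers (esr_el3, elr_el3, far_el3, spsr_el3, x30) are identical in both dumps.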
CPU state with debugger:
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (External debug access is disabled)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (Running)
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (APB AP transaction error, DAP status 0x30000021)
11 Cortex-A53 #1 (Running)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (No Power)
xsdb%
* Could you share the exact commit you are on for TF-A
Merge "docs(changelog): changelog for v2.14 release" into integration: ARM-software/arm-trusted-firmware@1d5aa93 (https://github.com/ARM-software/arm-trusted-firmware/commit/1d5aa939bc8d3d892e2ed9945fa50e36a1a924cc)
* Could you share the build configuration for the platform (all build-flags)
make -j20 RESET_TO_BL31=1 PLAT=zynqmp bl31 IPI_CRC_CHECK=1 DEBUG=1
With DEBUG=1, BL31 on ZynqMP runs from DDR (start address 0x1000); without DEBUG, it runs from OCM (start address 0x00000000fffea000).
* Would it be possible to share the BL31 logs (so we can see the crash register logs) and Linux logs (just to correlate) as well as bl31.dump/bl31.map file for the failure case (If you can share for the working case as well, that would be great)
Attached are the working and non-working dumps and logs for NUMA_AWARE_PER_CPU with DDR and OCM. In the working case, LTO is disabled.
CPU state with debugger with OCM BL31.elf:
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (Running)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Power On Reset)
13 Cortex-A53 #3 (Running)
xsdb% Info: Cortex-A53 #2 (target 12) Stopped at 0x0 (Cannot resume. APB AP transaction error, DAP status 0x30000021)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Stopped at 0x0 (Cannot resume. APB AP transaction error, DAP status 0x30000021)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #0 (target 10) Stopped at 0xfffea194 (Reset Catch)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
xsdb% Info: Cortex-A53 #1 (target 11) Running (No Power)
xsdb% ta
1 PS TAP
2 PMU
3 MicroBlaze PMU (Sleeping. No clock)
4 PL
5 PSU
6 RPU
7 Cortex-R5 #0 (Halted)
8 Cortex-R5 #1 (Lock Step Mode)
9 APU
10* Cortex-A53 #0 (Reset Catch, EL3(S)/A64)
11 Cortex-A53 #1 (No Power)
12 Cortex-A53 #2 (Running)
13 Cortex-A53 #3 (Running)
xsdb% Info: Cortex-A53 #2 (target 12) Running (No Power)
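To relate the Reset Catch addresses from the DDR and OCM runs, the stopped PC can be expressed as an offset from the BL31 start address of each configuration. The sketch below uses the start addresses quoted in this thread (0x1000 for DDR, 0xfffea000 for OCM); the constant and function names are illustrative. Both PCs reduce to the same offset, which matches the location of bl31_warm_entrypoint in the DDR bl31.dump, assuming the OCM image has the same layout.

```python
# BL31 start addresses as quoted in the thread.
DDR_BASE = 0x1000
OCM_BASE = 0xFFFEA000

# PCs reported by xsdb at Reset Catch.
DDR_STOP_PC = 0x1194
OCM_STOP_PC = 0xFFFEA194


def image_offset(pc, base):
    """Offset of pc from the start of the BL31 image."""
    return pc - base


ddr_off = image_offset(DDR_STOP_PC, DDR_BASE)
ocm_off = image_offset(OCM_STOP_PC, OCM_BASE)
print(f"DDR offset: {ddr_off:#x}, OCM offset: {ocm_off:#x}")
```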
Regards, Rohit
From: Kummari, Prasad <Prasad.Kummari@amd.com> Date: Tuesday, 6 January 2026 at 06:34 To: Rohit Mathew <Rohit.Mathew@arm.com>, Sammit Joshi <Sammit.Joshi@arm.com>, scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.com>, Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>, Simek, Michal <michal.simek@amd.com> Subject: RE: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Mathew,
Yes, if the CPU_IDLE-related configurations listed below are removed while compiling the kernel image, the hang is no longer observed.
Configs:
CONFIG_CPU_IDLE=y
CONFIG_CPU_IDLE_MULTIPLE_DRIVERS=y
CONFIG_CPU_IDLE_GOV_MENU=y
CONFIG_DT_IDLE_STATES=y
Observed Behavior
* System later triggers an Unhandled Exception in EL3
* CPUs enter psci_cpu_suspend(). Debug prints confirm execution reaches the suspend path.
* The affected CPUs enter Reset Catch at EL3 and hang at the bl31_warm_entrypoint()->el3_entrypoint_common pc.
Regards, Prasad.
From: Rohit Mathew <Rohit.Mathew@arm.com> Sent: Tuesday, January 6, 2026 4:37 AM To: Kummari, Prasad <Prasad.Kummari@amd.com>; Sammit Joshi <Sammit.Joshi@arm.com>; scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.com>; Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>; Simek, Michal <michal.simek@amd.com> Subject: Re: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hi Prasad,
Thanks for your mail. I'm trying to reproduce this at our end. Could I check if you see the boot progressing to shell without the CPU idle dependencies?
Regards, Rohit
From: Kummari, Prasad <Prasad.Kummari@amd.com> Date: Monday, 5 January 2026 at 18:04 To: Sammit Joshi <Sammit.Joshi@arm.com>, Rohit Mathew <Rohit.Mathew@arm.com>, scan-admin--- via TF-A <tf-a@lists.trustedfirmware.org> Cc: Belsare, Akshay <akshay.belsare@amd.com>, Bollapalli, Maheedhar Sai <MaheedharSai.Bollapalli@amd.com>, Simek, Michal <michal.simek@amd.com> Subject: ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry
Hello Maintainers,
We are observing a reproducible runtime regression on the ZynqMP (Cortex-A53) platform after enabling LTO (ENABLE_LTO=1) and merging the changes from the topic NUMA_AWARE_PER_CPU into our integration branch (https://github.com/ARM-software/arm-trusted-firmware/commit/7303319b3823e9e3...).
Summary of the issue
1. Baseline behavior
* Platform: ZynqMP (Cortex-A53)
* Configuration: ENABLE_LTO=1
* Without NUMA_AWARE_PER_CPU: Linux boots and runs stably
* After merging NUMA_AWARE_PER_CPU, Linux boots but hangs during runtime
* During the hang, CPUs are observed to unexpectedly re-enter EL3
* Re-entry into EL3 should not occur during normal Linux runtime execution and strongly suggests corruption or mismanagement of PSCI and/or per-CPU state (see lib/per_cpu/aarch64/per_cpu_asm.S: https://github.com/ARM-software/arm-trusted-firmware/blob/master/lib/per_cpu/aarch64/per_cpu_asm.S#L28)
* Reverting the NUMA_AWARE_PER_CPU changes restores stable Linux execution
* The issue is reproducible only when NUMA_AWARE_PER_CPU is present
* This clearly identifies NUMA_AWARE_PER_CPU as the regression source
2. Suspect with LTO
* With NUMA_AWARE_PER_CPU enabled, LTO breaks the per-CPU base calculation
* BL31 contains hand-written assembly that relies on linker-script symbols (e.g., per-CPU section boundaries)
* Under LTO, symbol placement and retention are no longer guaranteed in the same way, leading to incorrect per-CPU base computation
* This results in corrupted per-CPU data and subsequent erroneous PSCI suspend behavior (EL3 re-entry)
3. CPU idle dependency
* The following kernel configuration options are enabled:
* CONFIG_CPU_IDLE=y
* CONFIG_CPU_IDLE_MULTIPLE_DRIVERS=y
* CONFIG_CPU_IDLE_GOV_MENU=y
* CONFIG_DT_IDLE_STATES=y
* This further suggests the issue is triggered during CPU idle / suspend-resume paths, where correct per-CPU state handling is critical
Based on the above:
* This is specific to NUMA_AWARE_PER_CPU combined with LTO
* The failure mode points to per-CPU base calculation and PSCI state corruption
* Reverting NUMA_AWARE_PER_CPU fully restores stability on ZynqMP
We wanted to report this issue upstream and seek guidance on:
* Whether NUMA_AWARE_PER_CPU is expected to be LTO-safe on platforms relying on linker-defined per-CPU sections
* Or if additional constraints / fixes are required for platforms like ZynqMP
We are happy to provide further logs or configuration details, or to help with fixes.
Regards, Prasad Kummari