[TF-A] ZynqMP regression with NUMA_AWARE_PER_CPU changes with ENABLED_LTO : Linux runtime hang due to EL3 re-entry

5 Jan 2026


      [AMD Official Use Only - AMD Internal Distribution Only]
Hello Maintainers,
We are observing a reproducible runtime regression on the ZynqMP (Cortex-A53) platform after enabling LTO (ENABLED_LTO=1) and merging the changes from the topic NUMA_AWARE_PER_CPU into our integration branch (https://github.com/ARM-software/arm-trusted-firmware/commit/7303319b3823e9e3...).
Summary of the issue
1.  Baseline behavior
     *   Platform: ZynqMP (Cortex-A53)
     *   Configuration: ENABLED_LTO=1
     *   Without NUMA_AWARE_PER_CPU: Linux boots and runs stably
     *   With NUMA_AWARE_PER_CPU merged: Linux boots but hangs during runtime
     *   During the hang, CPUs are observed to unexpectedly re-enter EL3
     *   Re-entry into EL3 should not occur during normal Linux runtime execution and strongly suggests corruption or mismanagement of PSCI and/or per-CPU state (see lib/per_cpu/aarch64/per_cpu_asm.S: https://github.com/ARM-software/arm-trusted-firmware/blob/master/lib/per_cpu/aarch64/per_cpu_asm.S#L28)
     *   Reverting the NUMA_AWARE_PER_CPU changes restores stable Linux execution
     *   The issue reproduces only when NUMA_AWARE_PER_CPU is present, which identifies these changes as the source of the regression
2.  Suspected interaction with LTO
     *   With NUMA_AWARE_PER_CPU enabled, LTO appears to break the per-CPU base calculation
     *   BL31 contains hand-written assembly that relies on linker-script symbols (e.g., per-CPU section boundaries)
     *   Under LTO, symbol placement and retention are no longer guaranteed in the same way, which we suspect leads to an incorrect per-CPU base computation
     *   This would result in corrupted per-CPU data and subsequent erroneous PSCI suspend behavior (EL3 re-entry)
3.  CPU idle dependency
     *   The following kernel configuration options are enabled:
         *   CONFIG_CPU_IDLE=y
         *   CONFIG_CPU_IDLE_MULTIPLE_DRIVERS=y
         *   CONFIG_CPU_IDLE_GOV_MENU=y
         *   CONFIG_DT_IDLE_STATES=y
     *   This further suggests the issue is triggered during CPU idle / suspend-resume paths, where correct per-CPU state handling is critical
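To make the suspected failure mode in point 2 concrete, here is a minimal host-side sketch in C. It is not the TF-A implementation: the names per_cpu_region, per_cpu_base, bases_are_distinct, NUM_CPUS and UNIT_SIZE are hypothetical, and a static array stands in for the linker-defined per-CPU section. It only illustrates that if the two boundary symbols used to derive the unit size end up at the same address, every CPU resolves to the same per-CPU base and the CPUs overwrite each other's state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CPUS  4U   /* hypothetical core count for the sketch */
#define UNIT_SIZE 64U  /* hypothetical size of one per-CPU unit  */

/*
 * Stand-in for the per-CPU section. In firmware, the start and
 * unit-end addresses would come from linker-script symbols rather
 * than from a C array (assumption made for this illustration).
 */
static uint8_t per_cpu_region[NUM_CPUS * UNIT_SIZE];

/* Base of one CPU's unit: start + cpu * (unit_end - start). */
static uintptr_t per_cpu_base(uintptr_t start, uintptr_t unit_end,
                              unsigned int cpu)
{
    assert(cpu < NUM_CPUS);
    size_t unit_size = (size_t)(unit_end - start);
    return start + (uintptr_t)cpu * unit_size;
}

/* Returns 1 if every CPU gets its own base address, 0 if any two alias. */
static int bases_are_distinct(uintptr_t start, uintptr_t unit_end)
{
    for (unsigned int i = 0; i < NUM_CPUS; i++) {
        for (unsigned int j = i + 1U; j < NUM_CPUS; j++) {
            if (per_cpu_base(start, unit_end, i) ==
                per_cpu_base(start, unit_end, j)) {
                return 0;
            }
        }
    }
    return 1;
}
```

With start = (uintptr_t)per_cpu_region, calling bases_are_distinct(start, start + UNIT_SIZE) models the expected layout, while bases_are_distinct(start, start) models boundary symbols that have collapsed to a single address. Whether LTO actually produces such a layout on ZynqMP is exactly what we have not yet confirmed.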
Based on the above:
*   The issue is specific to NUMA_AWARE_PER_CPU combined with LTO
*   The failure mode points to the per-CPU base calculation and PSCI state corruption
*   Reverting NUMA_AWARE_PER_CPU fully restores stability on ZynqMP
We wanted to report this issue upstream and seek guidance on:
*   Whether NUMA_AWARE_PER_CPU is expected to be LTO-safe on platforms relying on linker-defined per-CPU sections
*   Whether additional constraints or fixes are required for platforms like ZynqMP
We are happy to provide further logs and configuration details, or to help with fixes.
Regards,
Prasad Kummari
