

Hello Maintainers,

 

We are observing a reproducible runtime regression on the ZynqMP (Cortex-A53) platform when LTO is enabled (ENABLE_LTO=1) and the changes from the NUMA_AWARE_PER_CPU topic are merged into our integration branch (https://github.com/ARM-software/arm-trusted-firmware/commit/7303319b3823e9e33748d963e9173f3678aba4da).

 

Summary of the issue

  1. Baseline and regression behavior
    • Platform: ZynqMP (Cortex-A53)
    • Configuration: ENABLE_LTO=1
    • Without NUMA_AWARE_PER_CPU: Linux boots and runs stably
    • With NUMA_AWARE_PER_CPU merged: Linux boots but hangs at runtime
    • During the hang, CPUs are observed to re-enter EL3 unexpectedly
    • Such re-entry should not occur during normal Linux runtime execution; it strongly suggests corruption or mismanagement of PSCI and/or per-CPU state (see lib/per_cpu/aarch64/per_cpu_asm.S on master in ARM-software/arm-trusted-firmware)
    • Reverting the NUMA_AWARE_PER_CPU changes restores stable Linux execution
    • The issue reproduces only when NUMA_AWARE_PER_CPU is present, which identifies that change as the regression source
  2. Suspected interaction with LTO
    • With NUMA_AWARE_PER_CPU enabled, LTO appears to break the per-CPU base calculation
    • BL31 contains hand-written assembly that relies on linker-script symbols (e.g., per-CPU section boundaries)
    • Under LTO, symbol placement and retention may no longer be guaranteed in the same way, which would lead to an incorrect per-CPU base computation
    • This would result in corrupted per-CPU data and subsequent erroneous PSCI suspend behavior (EL3 re-entry)
  3. CPU idle dependency
    • The following kernel configuration options are enabled:
      • CONFIG_CPU_IDLE=y
      • CONFIG_CPU_IDLE_MULTIPLE_DRIVERS=y
      • CONFIG_CPU_IDLE_GOV_MENU=y
      • CONFIG_DT_IDLE_STATES=y
    • This further suggests the issue is triggered during CPU idle / suspend-resume paths, where correct per-CPU state handling is critical
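
To make the suspicion in point 2 concrete, here is a minimal C sketch of the kind of per-CPU base computation we have in mind. It is not the TF-A implementation: the names demo_region, demo_percpu_base, DEMO_CPU_COUNT and DEMO_UNIT_SIZE are hypothetical, and a plain array stands in for the region that linker-script symbols would normally delimit. It only illustrates that each CPU's base is derived from the region bounds, so a mis-resolved boundary would silently shift every CPU's data unless it is checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes; a real image would take these from its build config. */
#define DEMO_CPU_COUNT 4U
#define DEMO_UNIT_SIZE 64U

/* Stand-in for the per-CPU region normally bounded by linker-script symbols. */
static uint8_t demo_region[DEMO_CPU_COUNT * DEMO_UNIT_SIZE];

/*
 * Return the base of the slice owned by `cpu`, or NULL if the bounds are
 * inconsistent with the unit size or the CPU index is out of range.
 */
static uint8_t *demo_percpu_base(uint8_t *start, uint8_t *end,
                                 size_t unit_size, unsigned int cpu)
{
    size_t span;

    if (start == NULL || end == NULL || end < start || unit_size == 0U) {
        return NULL;
    }

    span = (size_t)(end - start);
    if ((span % unit_size) != 0U || cpu >= (span / unit_size)) {
        return NULL;
    }

    return start + ((size_t)cpu * unit_size);
}

/* Small self-check showing the intended behavior. */
static void demo_selfcheck(void)
{
    uint8_t *start = demo_region;
    uint8_t *end = demo_region + sizeof(demo_region);

    /* Correct bounds: CPU 2 owns the slice at offset 2 * 64. */
    assert(demo_percpu_base(start, end, DEMO_UNIT_SIZE, 2U) == start + 128);

    /* A CPU index past the region is rejected. */
    assert(demo_percpu_base(start, end, DEMO_UNIT_SIZE, DEMO_CPU_COUNT) == NULL);

    /* A skewed end boundary is rejected rather than silently accepted. */
    assert(demo_percpu_base(start, end - 8, DEMO_UNIT_SIZE, 1U) == NULL);
}
```

If the boundary symbols were resolved differently under LTO, a check of this kind at boot could be one way to detect the problem early instead of observing it later as corrupted PSCI state. We have not confirmed that this is the actual mechanism.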

 

Based on the above, we would like to report this issue upstream and seek your guidance on how to proceed.

 

We are happy to provide further logs and configuration details, or to help with fixes.

 

Regards,
Prasad Kummari