Hi James & TF-A guys,
When hest acpi table configure Hardware Error Notification type as
Software Delegated Exception(0x0B) for RAS event, kernel RAS interacts with
TF-A by SDEI mechanism. On the firmware first system, kernel was notified by
TF-A sdei call.
The calling flow like as below when fatal RAS error happens:
TF-A notify kernel flow:
sdei_dispatch_event()
ehf_activate_priority()
call sdei callback // callback registered by kerenl
ehf_deactivate_priority()
Kernel sdei callback:
sdei_asm_handler()
__sdei_handler()
_sdei_handler()
sdei_event_handler()
ghes_sdei_critical_callback()
ghes_in_nmi_queue_one_entry()
/* if RAS error is fatal */
__ghes_panic()
panic()
If fatal RAS error occured, panic was called in sdei_asm_handle()
without ehf_deactivate_priority executed, which lead interrupt masked.
If interrupt masked, system would be halted in kdump flow like this:
arm-smmu-v3 arm-smmu-v3.3.auto: allocated 65536 entries for cmdq
arm-smmu-v3 arm-smmu-v3.3.auto: allocated 32768 entries for evtq
arm-smmu-v3 arm-smmu-v3.3.auto: allocated 65536 entries for priq
arm-smmu-v3 arm-smmu-v3.3.auto: SMMU currently enabled! Resetting...
So interrupt should be restored before panic otherwise kdump will hang.
In the process of sdei, a SDEI_EVENT_COMPLETE(or SDEI_EVENT_COMPLETE_AND_RESUME)
call should be called before panic for a completed run of ehf_deactivate_priority().
The ehf_deactivate_priority() function restore pmr_el1 to original value(>0x80).
The SDEI dispatch flow was broken if SDEI_EVENT_COMPLETE was not be called.
This will bring about two issue:
1 Kdump will hang for firmware reporting fatal RAS event by SDEI;
(as explain above)
2 For NMI scene,TF-A enable a secure timer, the PPI 29 will trigger periodically.
Kernel register a callback for hard lockup. The below code will not be
called when panic in kernel callback:
TF-A, services/std_svc/sdei/sdei_intr_mgmt.c sdei_intr_handler():
/*
* We reach here when client completes the event.
*
* If the cause of dispatch originally interrupted the Secure world,
* resume Secure.
*
* No need to save the Non-secure context ahead of a world switch: the
* Non-secure context was fully saved before dispatch, and has been
* returned to its pre-dispatch state.
*/
if (sec_state == SECURE)
restore_and_resume_secure_context();
/*
* The event was dispatched after receiving SDEI interrupt. With
* the event handling completed, EOI the corresponding
* interrupt.
*/
if ((map->ev_num != SDEI_EVENT_0) && !is_map_bound(map)) {
ERROR("Invalid SDEI mapping: ev=%u\n", map->ev_num);
panic();
}
plat_ic_end_of_interrupt(intr_raw);
How to fix above issues?
I think the root cause is that kernel broken the SDEI dispatch flow, so kernel
should modify to fix these issues.
Thanks,
Ming
This event has been canceled with this note:
"No topic to be presented this week.
Cancelling meeting."
Title: TF-A Tech Forum
We run an open technical forum call for anyone to participate and it is not
restricted to Trusted Firmware project members. It will operate under the
guidance of the TF TSC. Feel free to forward this invite to
colleagues. Invites are via the TF-A mailing list and also published on the
Trusted Firmware website. Details are
here: https://www.trustedfirmware.org/meetings/tf-a-technical-forum/Tr…
Firmware is inviting you to a scheduled Zoom meeting.Join Zoom
Meetinghttps://zoom.us/j/9159704974Meeting ID: 915 970 4974One tap
mobile+16465588656,,9159704974# US (New York)+16699009128,,9159704974# US
(San Jose)Dial by your location +1 646 558
8656 US (New York) +1 669 900
9128 US (San Jose) 877 853 5247 US
Toll-free 888 788 0099 US Toll-freeMeeting ID:
915 970 4974Find your local
number: https://zoom.us/u/ad27hc6t7h
When: Thu Feb 24, 2022 4pm – 5pm United Kingdom Time
Calendar: tf-a(a)lists.trustedfirmware.org
Who:
* Bill Fletcher - creator
* marek.bykowski(a)gmail.com
* okash.khawaja(a)gmail.com
* tf-a(a)lists.trustedfirmware.org
Invitation from Google Calendar: https://calendar.google.com/calendar/
You are receiving this courtesy email at the account
tf-a(a)lists.trustedfirmware.org because you are an attendee of this event.
To stop receiving future updates for this event, decline this event.
Alternatively you can sign up for a Google account at
https://calendar.google.com/calendar/ and control your notification
settings for your entire calendar.
Forwarding this invitation could allow any recipient to send a response to
the organizer and be added to the guest list, or invite others regardless
of their own invitation status, or to modify your RSVP. Learn more at
https://support.google.com/calendar/answer/37135#forwarding
Hello.
This threads originates in a trivial fix for the 'clean' Makefile target.
All contributions, even cosmetic and/or from the outside world, must
follow the same formal process. Apparently, the process fails for
external contributors.
The discussion is visible there:
https://lists.trustedfirmware.org/archives/list/tf-a@lists.trustedfirmware.…
At this moment, there were more people involved in the discussion than
letters affected by the patch, so I was invited to switch to a private
mail exchange. I was asked to describe the error messages, did so, and
was forgotten ever since.
Almost a year has passed, the patch is neither refused nor applied.
The connection error still prevents some/all external contributions.
The only effect of my request so far is that I have been obliged to
create accounts on github and your gerrit instance.
Is there hope for a more satisfying conclusion?
This event has been canceled with this note:
"No topics prepared for this week."
Title: TF-A Tech Forum
We run an open technical forum call for anyone to participate and it is not
restricted to Trusted Firmware project members. It will operate under the
guidance of the TF TSC. Feel free to forward this invite to
colleagues. Invites are via the TF-A mailing list and also published on the
Trusted Firmware website. Details are
here: https://www.trustedfirmware.org/meetings/tf-a-technical-forum/Tr…
Firmware is inviting you to a scheduled Zoom meeting.Join Zoom
Meetinghttps://zoom.us/j/9159704974Meeting ID: 915 970 4974One tap
mobile+16465588656,,9159704974# US (New York)+16699009128,,9159704974# US
(San Jose)Dial by your location +1 646 558
8656 US (New York) +1 669 900
9128 US (San Jose) 877 853 5247 US
Toll-free 888 788 0099 US Toll-freeMeeting ID:
915 970 4974Find your local
number: https://zoom.us/u/ad27hc6t7h
When: Thu Feb 10, 2022 4pm – 5pm United Kingdom Time
Calendar: tf-a(a)lists.trustedfirmware.org
Who:
* Bill Fletcher - creator
* marek.bykowski(a)gmail.com
* okash.khawaja(a)gmail.com
* tf-a(a)lists.trustedfirmware.org
Invitation from Google Calendar: https://calendar.google.com/calendar/
You are receiving this courtesy email at the account
tf-a(a)lists.trustedfirmware.org because you are an attendee of this event.
To stop receiving future updates for this event, decline this event.
Alternatively you can sign up for a Google account at
https://calendar.google.com/calendar/ and control your notification
settings for your entire calendar.
Forwarding this invitation could allow any recipient to send a response to
the organizer and be added to the guest list, or invite others regardless
of their own invitation status, or to modify your RSVP. Learn more at
https://support.google.com/calendar/answer/37135#forwarding
Hi,
With RME enabled FVP_Base_RevC_2xAEMvA, we are trying to have TF-A's BL31
successfully exit EL3 and jump to a normal world boot firmware (edk2 UEFI
boot loader) instead of tftf.bin. Yet, it fails with the following log
messages (showing the last 3):
INFO: BL31: Preparing for EL3 exit to normal world
INFO: Entry point address = 0x88000000
INFO: SPSR = 0x3c9
We could boot and run RMM and tftf.bin successfully following the
instructions on the TF-A documentation page (
https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-managem…).
While keeping build and run commands same, we tried to just replace the
tftf.bin with FVP_AARCH64_EFI.fd, the build artifact of the edk2-platform
for ARM (
https://github.com/tianocore/edk2-platforms/tree/master/Platform/ARM)
We checked both tftf.bin and FVP_AARCH64_EFI.fd cases result in the same
entry point address 0x88000000. Yet, the latter stops after exiting EL3 as
aforementioned, unlike tftf.bin which successfully proceeds to run some
tests afterwards. Also, we found that FVP_AARCH64_EFI.fd can boot
successfully with the same fast model but without RME enabled.
What is the possible reason for this symptom and the necessary tweaks we
should do to address this issue? What should we look for to get some clue?
Cheers,
Hi Yusuf,
I hope you have gone through https://trustedfirmware-a.readthedocs.io/en/latest/getting_started/porting-…
1. Distinguishing between a cold boot and a warm boot.
2. In the case of a cold boot and the CPU being a secondary CPU, ensuring that the CPU is placed in a platform-specific state until the primary CPU performs the necessary steps to remove it from this state.
3. In the case of a warm boot, ensuring that the CPU jumps to a platform- specific address in the BL31 image in the same processor mode as it was when released from reset.
Secondary cores are kept in TF-A holding pen until primary core makes a request to start secondary core(from OS through PSCI CPU_ON). On receiving this call primary breaks the condition which held secondary. For example, investigate a5ds platform's plat_secondary_cold_boot_setup() & a5ds_pwr_domain_on(). Platform also provides warm_boot_entrypoint (most platform uses bl31_warm_entrypoint) from where secondary starts execution.
Primary core is responsible for platform initialization using platform helper functions mentioned https://trustedfirmware-a.readthedocs.io/en/latest/getting_started/porting-… (make sure your platform has implemented all the mandatory hooks).
Hope this helps
thanks
Manish
________________________________
From: Mohd Yusuf Abdul Hamid via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 07 February 2022 02:32
To: tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] How secondary core(s) move from TF-A into Kernel space using PSCI - 4 x A55 ?
Hi,
I have been stuck at this problem for more than a week. Hopefully good folks here can help clarify a few things.
Platform 4x Cortex A55 single cluster.
What I got working:
1. I can boot single core kernel to shell using TFA bl31
Baremetal (bare minimum startup + platform specific SOC enablement, EL3) -> TFA bl31 -> Kernel
2. I added PSCI in DT and can see the hook trigger service and hotplug secondary core in.
Secondary core woke up:
1. Bare minimum startup (skip SOC specific enablement) -> TFA bl31 -> go thru 'plat_secondary_cold_boot_setup' path, using 'RESET_TO_BL31:=1'
Now, I am not sure how from there, the secondary core would jump to:
a. If jump to kernel's 'secondary_holding_pen' it looks like it would drop from EL3 -> EL1 and wait (however at this point Core0 is already in cpu_idle) and won't continue
a.1 For this case, I am also not sure why I hit "instruction abort" in core1 - from what I read MMU hasnt been set up, which is true. I also wonder at what point MMU is set up for this path in the secondary core?
b. If jump to 'secondary_entry' I believe the core is still in EL3 at this point and I will get an exception at 'set_cpu_boot_mode_flag'
c. If someone can summarize what are the minimum requirements for the secondary core to get set up before jumping to 'secondary_holding_pen'/'secondary_entry' whichever is applicable.
Any pointers would be much appreciated.
ps: I have access to Trace32.
Mohd Yusuf Abdul Hamid
Hello, Everyone,
If I want to add a new platform support in TF-A for RK3566 as an example,
what Documentation do I need to read.
Using RK3399 as a contrast ( because most of RK3399 doc is opened in
internet ), we already know this SoC is supported in OPTEE and TF-A. And I
can get RK3399 Docs:* TRM V1.3 Part 1*, T*RM V1.3 Part2*, *TRM V1.4 Part 1*,
*Datasheet V2.1*. I can see in *TRM chapter 16 System Security, *there are
some descriptions about system security, and references to other system
registers, like *SGRF, *etc, but it still seems to me insufficiently ( No
SGRF description ) to finish a full support platform implementation in
TF-A. Some people said I need to sign an NDA with Rockchip to get Security
related part docs. But when I reach to Rockchip, they said all docs are
opened already, No NDA options. When I talked to one partner/distributor
of Rockchip, only security related doc is also some doc I can find on
internet.
So I am curious and confused, can I, as a third party developer, develop a
new platform implementation for TF-A / OPTEE ( specially for Rockchip
Platform )?
Thanks
Hi,
I have been stuck at this problem for more than a week. Hopefully good
folks here can help clarify a few things.
Platform 4x Cortex A55 single cluster.
What I got working:
1. I can boot single core kernel to shell using TFA bl31
Baremetal (bare minimum startup + platform specific SOC enablement,
EL3) -> TFA bl31 -> Kernel
2. I added PSCI in DT and can see the hook trigger service and hotplug
secondary core in.
Secondary core woke up:
1. Bare minimum startup (skip SOC specific enablement) -> TFA bl31 -> go
thru 'plat_secondary_cold_boot_setup' path, using 'RESET_TO_BL31:=1'
Now, I am not sure how from there, the secondary core would jump to:
a. If jump to kernel's 'secondary_holding_pen' it looks like it would drop
from EL3 -> EL1 and wait (however at this point Core0 is already in
cpu_idle) and won't continue
a.1 For this case, I am also not sure why I hit "instruction abort" in
core1 - from what I read MMU hasnt been set up, which is true. I also
wonder at what point MMU is set up for this path in the secondary core?
b. If jump to 'secondary_entry' I believe the core is still in EL3 at this
point and I will get an exception at 'set_cpu_boot_mode_flag'
c. If someone can summarize what are the minimum requirements for the
secondary core to get set up before jumping to
'secondary_holding_pen'/'secondary_entry' whichever is applicable.
Any pointers would be much appreciated.
ps: I have access to Trace32.
Mohd Yusuf Abdul Hamid