Hi Manish,
For RAS errors that originate while EL3 is running, would it make sense to create a BERT record and then restart the system on a best-effort basis, so that the user gets a chance to work out why the system restarted? Would a platform-level hook in the current-EL exception handlers be an appropriate place to create such an error record?
Thanks, Linu Cherian.
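As a rough illustration of the record-then-restart idea above, here is a minimal sketch in C. Everything in it is an assumption made for illustration: the record layout is a simplified stand-in and not the ACPI BERT format, and the names `el3_boot_err_record`, `plat_err_ctx` and `plat_el3_ras_record_and_reset` are hypothetical rather than existing TF-A interfaces. It also assumes the platform has a memory region that survives a warm reset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EL3_ERR_MAGIC 0x45334552u /* arbitrary marker value chosen for this sketch */
#define EL3_SEV_FATAL 1u          /* hypothetical severity encoding */

/* Hypothetical, simplified boot error record. NOT the ACPI BERT layout. */
struct el3_boot_err_record {
    uint32_t magic;       /* tells the next boot stage that the record is valid */
    uint32_t severity;    /* always fatal in this sketch */
    uint64_t syndrome;    /* exception syndrome captured at the time of the error */
    uint64_t fault_addr;  /* faulting address, if one is available */
    uint64_t return_addr; /* address that was executing when the error was taken */
};

/* Hypothetical platform state used by the hook. */
struct plat_err_ctx {
    struct el3_boot_err_record *region; /* memory assumed to survive a warm reset */
    int reset_requested;                /* stand-in for the real reset request */
};

/*
 * Best-effort hook: fill in a record, copy it to the reserved region,
 * then ask for a system restart. A real platform would issue its reset
 * sequence instead of setting a flag.
 */
void plat_el3_ras_record_and_reset(struct plat_err_ctx *ctx, uint64_t syndrome,
                                   uint64_t fault_addr, uint64_t return_addr)
{
    struct el3_boot_err_record rec;

    assert(ctx != NULL && ctx->region != NULL);

    memset(&rec, 0, sizeof(rec));
    rec.magic = EL3_ERR_MAGIC;
    rec.severity = EL3_SEV_FATAL;
    rec.syndrome = syndrome;
    rec.fault_addr = fault_addr;
    rec.return_addr = return_addr;

    memcpy(ctx->region, &rec, sizeof(rec));
    ctx->reset_requested = 1;
}
```

Since the EL3 state may already be corrupted at this point, a hook like this can only ever be best effort, which is why the sketch does nothing beyond a copy and a reset request.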
From: Manish Pandey2 <Manish.Pandey2@arm.com>
Sent: Monday, August 4, 2025 5:52 PM
To: Jaiprakash Singh <jaiprakashs@marvell.com>; tfa-lts@lists.trustedfirmware.org
Cc: Linu Cherian <lcherian@marvell.com>; George Cherian <gcherian@marvell.com>
Subject: [EXTERNAL] Re: RAS Handling In Current EL
The reason for not having RAS error handlers for errors originating from EL3 is that we do not expect any errors while executing in EL3.

1. EL3 is Expected to be Error-Free
Execution in EL3 is considered highly trusted and minimal. The code running in EL3 is tightly controlled, and it is assumed to be free of faults under normal operating conditions. Consequently, the likelihood of encountering RAS errors within EL3 itself is negligible.

2. Handling Lower EL Errors During EL3 Execution
To manage RAS errors originating from lower Exception Levels (EL2/EL1) that might surface while EL3 is executing, patches in the EL3 firmware ensure that these errors are properly isolated. This is typically done by synchronizing the error reporting to an exception boundary, thereby avoiding contamination of EL3 state with lower EL errors.

3. Critical Failure Implication of Genuine EL3 Errors
If an error genuinely originates and manifests within EL3 itself, without involvement from lower ELs, it indicates a catastrophic failure. Such a situation implies severe hardware or firmware corruption, leaving the platform in an unrecoverable state. In these cases, the only viable recovery path is to initiate a full system restart.

If we do decide to go ahead with a handler for EL3 errors, I am not sure that LTS branches are a good place for it (I will let the LTS maintainers reply).
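Point 2 above, synchronizing error reporting to an exception boundary, can be pictured with a toy model in C. This is only a sketch under stated assumptions: `cpu_model`, `el3_entry_sync` and `el3_classify_late_error` are hypothetical names, and a boolean flag stands in for whatever architectural mechanism the firmware actually uses. The idea it shows is that an error already pending at EL3 entry is attributed to the lower EL, while an error that appears only after that point is attributed to EL3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum err_origin { ERR_NONE, ERR_FROM_LOWER_EL, ERR_FROM_EL3 };

/* Toy CPU model, hypothetical and for illustration only. */
struct cpu_model {
    bool serror_pending; /* an asynchronous error was signalled but not yet taken */
    bool in_el3;         /* set once the EL3 entry boundary has been crossed */
};

/*
 * Models the synchronization step at EL3 entry: anything pending at the
 * boundary was caused before EL3 started running, so it is consumed here
 * and attributed to the lower EL.
 */
enum err_origin el3_entry_sync(struct cpu_model *cpu)
{
    assert(cpu != NULL);

    cpu->in_el3 = true;
    if (cpu->serror_pending) {
        cpu->serror_pending = false;
        return ERR_FROM_LOWER_EL;
    }
    return ERR_NONE;
}

/*
 * Models an error observed later during EL3 execution: the boundary has
 * already been synchronized, so a pending error is attributed to EL3.
 */
enum err_origin el3_classify_late_error(struct cpu_model *cpu)
{
    assert(cpu != NULL && cpu->in_el3);

    if (!cpu->serror_pending) {
        return ERR_NONE;
    }
    cpu->serror_pending = false;
    return ERR_FROM_EL3;
}
```

In this model, only the `ERR_FROM_EL3` case corresponds to the catastrophic situation described in point 3.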
________________________________
From: Jaiprakash Singh <jaiprakashs@marvell.com>
Sent: 04 August 2025 11:39
To: tfa-lts@lists.trustedfirmware.org
Cc: Linu Cherian <lcherian@marvell.com>; George Cherian <gcherian@marvell.com>
Subject: RAS Handling In Current EL
Hi,
When current_el = EL3,
1. TF-A does not implement RAS error handling for the current EL.
2. For the current EL, all exceptions enter either report_unhandled_exception or report_unhandled_interrupt. Is there any reason for this?
There are platform hooks to handle RAS errors from lower EL (plat_ea_handler).
Can plat_ea_handler be extended to handle RAS errors in the current EL?
Thanks
Regards,
Jaiprakash
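To make the question concrete, the extension being asked about might look roughly like the C sketch below. It is a standalone model, not TF-A code: `cur_el_ea_hook_t`, `current_el_serror_entry` and `counting_hook` are hypothetical names, and the hook signature is an assumption that does not mirror the real `plat_ea_handler` prototype. It shows a current-EL entry point that offers the error to an optional platform hook first and falls back to the existing unhandled-exception path otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: returns true if the platform dealt with the error. */
typedef bool (*cur_el_ea_hook_t)(uint64_t syndrome, void *cookie);

enum cur_el_outcome { CUR_EL_HANDLED, CUR_EL_UNHANDLED };

/*
 * Hypothetical current-EL entry point. If the platform registered a hook
 * and the hook accepts the error, stop there. Otherwise report the
 * outcome that corresponds to today's report_unhandled_exception path.
 */
enum cur_el_outcome current_el_serror_entry(cur_el_ea_hook_t hook,
                                            uint64_t syndrome, void *cookie)
{
    if (hook != NULL && hook(syndrome, cookie)) {
        return CUR_EL_HANDLED;
    }
    return CUR_EL_UNHANDLED;
}

/* Example hook for illustration: counts errors in the caller's counter. */
bool counting_hook(uint64_t syndrome, void *cookie)
{
    unsigned int *count = cookie;

    assert(count != NULL);
    (void)syndrome; /* a real hook would decode the syndrome */
    (*count)++;
    return true;
}
```

Keeping the hook optional means platforms that do not register one see no change in behaviour, which may matter if such a change were ever considered for an LTS branch.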