[resending as the previous email was sent to the wrong address]
Hi,
Looking at the mail chain below, this is probably being tested on the RD-N1-Edge platform. A regression was noticed in the DMC-620 RAS error handling in the code pushed to Linaro for the RD-N1-Edge platform. This will be fixed later today and the patches will be merged into the Linaro repos. The fix should then be accessible using the usual repo init/sync commands.
Thanks,
Thomas.
> -----Original Message-----
> From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Olivier
> Deprez via TF-A
> Sent: Tuesday, April 21, 2020 4:45 PM
> To: TF-A <tf-a-bounces(a)lists.trustedfirmware.org>; Raghu K via TF-A
> <tf-a(a)lists.trustedfirmware.org>; 吴斌(郅隆) <zhilong.wb(a)alibaba-inc.com>
> Subject: Re: [TF-A] Re: Re: Re: Re: [RAS] BL32 UnRecognized Event -
> 0xC4000061 and BL31 Crashed
>
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi Raghu,
Yes, you're right; we probably need a few return code checks here and here. I may submit a patch and verify it doesn't break anything else.
Hi Bin Wu,
I noticed the following sequence, originating from the Linux SDEI driver init and going down to TF-A:
INFO: SDEI: Private events initialized on 81000100
INFO: SDEI: Private events initialized on 81000200
INFO: SDEI: Private events initialized on 81000300
INFO: SDEI: Private events initialized on 81010000
INFO: SDEI: Private events initialized on 81010100
INFO: SDEI: Private events initialized on 81010200
INFO: SDEI: Private events initialized on 81010300
INFO: SDEI: > VER
INFO: SDEI: < VER:1000000000000
INFO: SDEI: > P_RESET():81000000
INFO: SDEI: < P_RESET:0
INFO: SDEI: > P_RESET():81000200
INFO: SDEI: < P_RESET:0
INFO: SDEI: > P_RESET():81000300
INFO: SDEI: < P_RESET:0
INFO: SDEI: > P_RESET():81010000
INFO: SDEI: < P_RESET:0
INFO: SDEI: > P_RESET():81010100
INFO: SDEI: < P_RESET:0
INFO: SDEI: > P_RESET():81010200
INFO: SDEI: < P_RESET:0
INFO: SDEI: > P_RESET():81010300
INFO: SDEI: < P_RESET:0
INFO: SDEI: > P_RESET():81000100
INFO: SDEI: < P_RESET:0
INFO: SDEI: > S_RESET():81000100
INFO: SDEI: < S_RESET:0
INFO: SDEI: > UNMASK:81000000
INFO: SDEI: < UNMASK:0
INFO: SDEI: > UNMASK:81000100
INFO: SDEI: < UNMASK:0
INFO: SDEI: > UNMASK:81000200
INFO: SDEI: < UNMASK:0
INFO: SDEI: > UNMASK:81000300
INFO: SDEI: < UNMASK:0
INFO: SDEI: > UNMASK:81010000
INFO: SDEI: < UNMASK:0
INFO: SDEI: > UNMASK:81010100
INFO: SDEI: < UNMASK:0
INFO: SDEI: > UNMASK:81010200
INFO: SDEI: < UNMASK:0
INFO: SDEI: > UNMASK:81010300
INFO: SDEI: < UNMASK:0
INFO: SDEI: > INFO(n:804, 0)
INFO: SDEI: < INFO:0
INFO: SDEI: > INFO(n:805, 0)
INFO: SDEI: < INFO:0
There is an SDEI info request about events 804 and 805.
However, I don't see any event register or enable service call, so I wonder whether this demo code is missing something or expects the platform to implement such event definitions natively.
This does not look like the flows described in https://trustedfirmware-a.readthedocs.io/en/latest/components/sdei.html
for regular SDEI usage or explicit dispatch of events.
Maybe we should involve the Linaro people regarding the expected init sequence and the dependency on TF-A (platform files).
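As a rough illustration of why the missing register and enable calls matter, here is a minimal, self-contained C model of an event that can only be dispatched once it has been registered and enabled. This is a sketch, not TF-A code: the names model_event, model_register, model_enable and model_dispatch and the flag values are hypothetical, and the real SDEI state machine has more states and checks than shown here.

```c
#include <stdint.h>

/* Hypothetical state flags for this model only (not TF-A's definitions). */
#define EV_REGISTERED (1u << 0)
#define EV_ENABLED    (1u << 1)

struct model_event {
    int32_t  ev_num;  /* event number, e.g. 804 */
    uint32_t state;   /* 0 means the event was never registered */
};

/* Client side: record that a handler has been registered for the event. */
static int model_register(struct model_event *ev)
{
    if (ev->state & EV_REGISTERED)
        return -1;                /* already registered */
    ev->state |= EV_REGISTERED;
    return 0;
}

/* Client side: enable delivery; only valid for a registered event. */
static int model_enable(struct model_event *ev)
{
    if (!(ev->state & EV_REGISTERED))
        return -1;
    ev->state |= EV_ENABLED;
    return 0;
}

/* Firmware side: refuse to dispatch unless registered and enabled. */
static int model_dispatch(const struct model_event *ev)
{
    const uint32_t need = EV_REGISTERED | EV_ENABLED;

    if ((ev->state & need) != need)
        return -1;
    return 0;
}
```

In this model an event that has only been queried stays in state 0, so model_dispatch fails until both model_register and model_enable have succeeded.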
Regards,
Olivier.
________________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of 吴斌(郅隆) via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 21 April 2020 08:45
To: TF-A; Raghu K via TF-A
Subject: [TF-A] Re: Re: Re: Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Hi Olivier and All,
Thank you so much for your help. It helps me understand the internals.
As a next step, I need to check the registration flow for this event_num (804) on the kernel side, am I right?
BRs,
Bin Wu
------------------ Original Message ------------------
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org>
Sent: Tue Apr 21 09:51:49 2020
To: Raghu K via TF-A <tf-a(a)lists.trustedfirmware.org>
Subject: Re: [TF-A] Re: Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Nice debug! Apart from the issue you pointed out, there is also the
issue with not checking the return code. The ras handler should really
be checking or panic'ing if there is an unexpected error code from
spm_sp_call and sdei_dispatch_event.
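A minimal sketch of the kind of check meant here, assuming nothing about the real TF-A signatures: the step_fn callbacks are hypothetical stand-ins for spm_sp_call and sdei_dispatch_event, and handler_fatal stands in for whatever panic path the platform would actually use (here it only logs and counts, so the sketch can be exercised on a host).

```c
#include <stdio.h>

/* Hypothetical stand-in type; the real functions take arguments not modelled here. */
typedef int (*step_fn)(void);

/* Counts fatal errors instead of halting, so the sketch stays testable. */
static int fatal_count;

static void handler_fatal(const char *what, int rc)
{
    fprintf(stderr, "RAS handler: %s failed with %d\n", what, rc);
    fatal_count++;
}

/*
 * Run the two steps in order and stop at the first failure, rather than
 * returning to the interrupted context as if everything had succeeded.
 */
static int ras_handler_checked(step_fn sp_call, step_fn dispatch_event)
{
    int rc = sp_call();

    if (rc != 0) {
        handler_fatal("SP call", rc);
        return rc;
    }

    rc = dispatch_event();
    if (rc != 0) {
        handler_fatal("event dispatch", rc);
        return rc;
    }
    return 0;
}

/* Trivial steps used to exercise both outcomes. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return -1; }
```

Calling ras_handler_checked(step_ok, step_fail) reports the dispatch failure once and returns -1 instead of silently continuing.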
-Raghu
On 4/20/20 2:37 PM, Olivier Deprez via TF-A wrote:
> Hi Bin Wu,
>
> Here's an early observation. On receiving the RAS FIQ interrupt, the following occurs:
>
> ehf_el3_interrupt_handler => sgi_ras_intr_handler => spm_sp_call (enters/exit the SP to handle the injected RAS error) => sdei_dispatch_event
>
> se = get_event_entry(map);
> if (!can_sdei_state_trans(se, DO_DISPATCH))
>         return -1;
>
> p *map
> $6 = {ev_num = 804, intr = 0, map_flags = 112, reg_count = 0, lock = {lock = 0}}
> p *se
> $4 = {ep = 0, arg = 0, affinity = 0, reg_flags = 0, state = 0 '\0'}
>
> sdei_dispatch_event exits with an error at this stage, which does not seem to be correct behavior.
> The SDEI handler is not called in the NS world and the context remains unchanged.
> The interrupt handler blindly returns to the S-EL1 SP context at the same location where it last exited.
> sgi_ras_intr_handler => ehf_el3_interrupt_handler => vector_entry fiq_aarch64 => el3_exit => re-enters the SP with X0=0xC4000061
> The SP then exits, but the EL3 context has not been set up for SP entry, leading to the crash.
>
> IMO there is an issue with the mapping of the SDEI event number to the RAS interrupt number, which leads to sdei_dispatch_event exiting early.
>
> Regards,
> Olivier.
>
>
> ________________________________________
> From: TF-A on behalf of Matteo Carlini via TF-A
> Sent: 14 April 2020 10:41
> To: 吴斌(郅隆); tf-a(a)lists.trustedfirmware.org; Thomas Abraham; Deepak Pandey
> Cc: nd
> Subject: Re: [TF-A] Re: Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
>
> Looping in Thomas & Deepak, who are responsible for the RD-N1 landing team platform releases. They might be able to help.
>
> Thanks
> Matteo
>
> From: TF-A On Behalf Of 吴斌(郅隆) via TF-A
> Sent: 14 April 2020 06:47
> To: TF-A; Raghu Krishnamurthy via TF-A
> Subject: [TF-A] Re: Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
>
> Hi Raghu,
>
> Really appreciate your help.
>
> I downloaded this software stack from git.linaro.org. The stack includes ATF, the kernel, edk2 and so on.
> The user guide I used from Linaro is: https://git.linaro.org/landing-teams/working/arm/arm-reference-platforms…
>
> 1) What platform are you running on? Can this issue be reproduced
> outside your testing environment, perhaps on FVP or QEMU?
> A: I am running on the ARM N1-Edge FVP platform. The issue can be reproduced on this FVP platform.
>
> 2) What version of TF-A and StandaloneMM is being used? Preferably the
> commit-id, so that we can be sure we are looking at the same code.
> A: TF-A: https://git.linaro.org/landing-teams/working/arm/arm-tf.git tag:RD-INFRA-20191024-RC0
> StandaloneMM seems to be built from edk2 and edk2-platform, so I am just giving the edk2 and edk2-platform version information. If I missed anything, please let me know.
> edk2: https://git.linaro.org/landing-teams/working/arm/edk2.git tag:RD-INFRA-20191024-RC0
> edk2-platform: https://git.linaro.org/landing-teams/working/arm/edk2-platforms.git tag:RD-INFRA-20191024-RC0
>
> 3) What version of the kernel and sdei driver is being used?
> A: kernel-release: https://git.linaro.org/landing-teams/working/arm/kernel-release.git tag:RD-INFRA-20191024-RC0
> The SDEI driver is included in the kernel. Do I need to provide the SDEI driver version? If so, please let me know.
> 4) I can't tell from looking at the log, but do you know if writing 0x123
> to sdei_ras_poison causes a DMC620 interrupt, an SError, or an external
> abort through a memory access?
> A: Sorry, Linaro only mentions that this will inject a DMC-620 single-bit RAS error, so I am also not sure which exception type it will trigger.
>
> BRs,
> Bin Wu
>
> ------------------ Original Message ------------------
> From: TF-A
> Sent: Tue Apr 14 01:25:47 2020
> To: Raghu Krishnamurthy via TF-A
> Subject: Re: [TF-A] [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
> Hello,
>
> >>Does BL31 need to send 0xC4000061 event to BL32 again?
>
> I don't think it will. It is really odd that
> 0xC4000061(SP_EVENT_COMPLETE_AARCH64) ever reaches the BL32/MM handler.
> This is from looking at the upstream code quickly but it definitely
> depends on the platform you are running, what version of TF-A you are
> using, build options used. Is it possible that the unhandled exception
> is occurring after successful handling of the DMC620 error but there is
> a following issue that occurs right after, causing the crash?
> From the register dump it looks like there was an Instruction abort
> exception at address 0 while running in EL3. Something seems to have
> gone seriously wrong to have 0xC4000061 ever go back to BL32 and to get
> an instruction abort at address 0.
>
> >>Does the current TF-A support running the RAS test? It seems BL31 will crash.
> See above. The answer really depends on the factors mentioned above.
>
> The following would be helpful to know:
> 1) What platform are you running on? Can this issue be reproduced
> outside your testing environment, perhaps on FVP or QEMU?
> 2) What version of TF-A and StandaloneMM is being used? Preferably the
> commit-id, so that we can be sure we are looking at the same code.
> 3) What version of the kernel and SDEI driver is being used?
> 4) I can't tell from looking at the log, but do you know if writing 0x123
> to sdei_ras_poison causes a DMC620 interrupt, an SError, or an external
> abort through a memory access?
>
> Thanks
> Raghu
>
>
> On 4/13/20 12:16 AM, 吴斌(郅隆) via TF-A wrote:
>> Dear Friends,
>>
>> I am using TF-A to test the RAS feature.
>> When I trigger a DMC620 RAS error in Linux (echo 0x123 >
>> /sys/kernel/debug/sdei_ras_poison),
>> BL32 receives
>> UnRecognized Event - 0xC4000061(SP_EVENT_COMPLETE_AARCH64) and finally
>> BL31 crashes.
>>
>> In my understanding, this 0xC4000061 should be consumed by BL31, not sent
>> to BL32 again.
>>
>> A piece of the error log is below:
>>
>> *************************************
>>
>> CperWrite - CperAddress@0xFF610064
>> CperWrite - 1 Section@FFBE91A8, Length 80, SectionType@FFBE9138
>> CperWrite - Got Error Section: Platform Memory.
>> MmEntryPoint Done
>> Received delegated event
>> X0 : 0xC4000061
>> X1 : 0x0
>> X2 : 0x0
>> X3 : 0x0
>> Received event - 0xC4000061 on cpu 0
>> UnRecognized Event - 0xC4000061
>> Failed delegated event 0xC4000061, Status 0x2
>> Unhandled Exception in EL3.
>> x30 = 0x0000000000000000
>> x0 = 0x00000000ff007e00
>> x1 = 0xfffffffffffffffe
>> x2 = 0x00000000600003c0
>> x3 = 0x0000000000000000
>> x4 = 0x0000000000000000
>> x5 = 0x0000000000000000
>> x6 = 0x00000000ff015080
>> x7 = 0x0000000000000000
>> x8 = 0x00000000c4000061
>> x9 = 0x0000000000000021
>> x10 = 0x0000000000000040
>> x11 = 0x00000000ff00f2b0
>> x12 = 0x00000000ff0118c0
>> x13 = 0x0000000000000002
>> x14 = 0x00000000ff016b70
>> x15 = 0x00000000ff003f20
>> x16 = 0x0000000000000044
>> x17 = 0x00000000ff010430
>> x18 = 0x0000000000000e3c
>> x19 = 0x0000000000000000
>> For more of the error log, please refer to the attachment.
>>
>> My questions are:
>> 1. Does BL31 need to send 0xC4000061 event to BL32 again?
>> 2. Does the current TF-A support running the RAS test? It seems BL31 will crash.
>>
>> Appreciate your help.
>>
>> BRs,
>> Bin Wu
>>
--
TF-A mailing list
TF-A(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-a
Hi All,
The next TF-A Tech Forum is scheduled for Thu 23rd Apr 2020 17:00 - 18:00 (BST). A recurring meeting invite has been sent out to the subscribers of this TF-A mailing list. If you don’t have this, please let me know.
Agenda:
* Overview of the TF-A v2.3 Release by Bipin Ravi and Mark Dykes
* Project Maintenance Proposal for tf.org Projects discussion
* Optional TF-A Mailing List Topic Discussions
Thanks
Joanna
> -----Original Message-----
> From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of François
> Ozog via TF-A
> Sent: 20 April 2020 16:25
> To: Achin Gupta <Achin.Gupta(a)arm.com>
> Cc: tf-a(a)lists.trustedfirmware.org
> Subject: Re: [TF-A] [RFC] isolation between runtime BL33 services and OS
>
> On Mon, 20 Apr 2020 at 15:50, Achin Gupta <achin.gupta(a)arm.com> wrote:
> >
> > On Mon, Apr 20, 2020 at 03:37:23PM +0200, François Ozog wrote:
> > > On Mon, 20 Apr 2020 at 15:27, Achin Gupta <achin.gupta(a)arm.com>
> > > wrote:
> > >
> > > Hi Francois,
> > > On Mon, Apr 20, 2020 at 11:45:02AM +0000, François Ozog via TF-A
> > > wrote:
> > > > Hi,
> > > >
> > > > I am trying to identify a mechanism to enforce a form of two-way
> > > > isolation between BL33 runtime services in OS, for instance:
> > > > - a pair of 2MB areas that could be RO by one entity and RW by the
> > > other
> > > > - an execute only BL33 2MB area?
> > > Stupid Q! Are you referring to isolation between EFI runtime
> > > services and the
> > > OS?
> > > It is not clear what you mean by BL33 runtime services?
> > >
> > > Not a stupid Q. I concentrate effectively on EFI runtime but more
> > > generally this is the non-trusted firmware component that delivers
> > > runtime services to OS.
> > > (My flow is somewhat convoluted: TFA loads minimal Linux as BL33, Linux
> > > kexecs a UEFI reduced U-Boot (without drivers) which bootefi the
> > > distro).
> >
> > Thanks! I see and IIUC, this is about two separately provisioned SW
> > components that share an EL (EL1 in this case) at the same time in the
> > same image. We want component A to have permission X on a memory
> > region and component B to have permission Y on the same memory region.
> > If so, then this would require a cooperation between the two components?
> >
> Yes. Well cooperation is what happens today: Component A (UEFI compliant
> FW) tells component B not to use memory it occupies.
> I wish an EL(+n) component to make that a guarantee. Yet I don't want to
> have "virtualization".
>
> > I might be still missing the obvious but I am wondering how a SW
> > entity at a higher EL (Hypervisor in EL2 or TF-A in EL3) could create
> > and enforce the separation between the two components. It would not
> > have visibility of what is happening inside the EL at the very least.
>
> I hoped that by installing a page mapping "power play", we could enforce
> some policy.
> Performance here is not important because those data and context changes
> seldom happen.
> I assume components A and B have a different mapping for the same "physical
> page":
> - EL1_A(VA)-> IA1; EL1_B(VA)->IA2
> - EL2(IA1) -> PA (RW), EL2(IA2)->PA(RO) or "not present"
> A collaboration between UEFI FW and EL2/3 would allow that to happen.
> A call to UEFI runtime service from SystemTable would result in a swap of
> TTBR1 (from EL1_B to EL1_A) so that execution can continue in UEFI.
> (I have no solution, just trying to check if we can find one).
>
Hi Francois,
What you suggest is possible AFAICS, if you create 2 IPAs with corresponding VAs. Communication between the two would involve some shared memory and invoking EL2 to trigger the switch between the VAs. I think this is better suited to an EL2 design than to EL3.
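To make the two-IPA idea concrete, here is a small host-side C model of a second-stage lookup in which two intermediate addresses alias the same physical page with different permissions. It is only a sketch under stated assumptions: the addresses, the s2_entry layout, the permission bits and s2_translate are all hypothetical, and real stage-2 tables are multi-level structures walked by the MMU rather than a flat array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits for this model. */
#define PERM_R (1u << 0)
#define PERM_W (1u << 1)

/* Arbitrary example addresses, chosen only for illustration. */
#define IPA_A     0x80000000ull   /* view used by component A */
#define IPA_B     0x90000000ull   /* view used by component B */
#define SHARED_PA 0x40000000ull   /* the single physical page behind both */

struct s2_entry {
    uint64_t ipa;   /* intermediate physical address */
    uint64_t pa;    /* physical address it maps to */
    uint32_t perm;  /* PERM_R and/or PERM_W */
};

/* Same physical page: read-write through IPA_A, read-only through IPA_B. */
static const struct s2_entry s2_table[] = {
    { IPA_A, SHARED_PA, PERM_R | PERM_W },
    { IPA_B, SHARED_PA, PERM_R },
};

/* Look up an IPA; store the PA and return true only if the access is allowed. */
static bool s2_translate(uint64_t ipa, bool is_write, uint64_t *pa_out)
{
    for (size_t i = 0; i < sizeof(s2_table) / sizeof(s2_table[0]); i++) {
        uint32_t need = is_write ? PERM_W : PERM_R;

        if (s2_table[i].ipa != ipa)
            continue;
        if ((s2_table[i].perm & need) == 0)
            return false;         /* permission fault, handled at EL2 */
        *pa_out = s2_table[i].pa;
        return true;
    }
    return false;                 /* no entry: not present */
}
```

In this model a write through IPA_A succeeds while a write through IPA_B is refused, even though both resolve to the same physical page for reads.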
Best Regards
Soby Mathew
> >
> > cheers,
> > Achin
> >
> > >
> > > cheers,
> > > Achin
> > > >
> > > > This is similar to hypervisor except it only deals with memory, no
> > > > vCPU, no GIC virtualization...
> > > >
> > > > Could EL3 or EL2 install protective mappings ? BL33 could ask
> > > either
> > > > EL2 hypervisor or SecureMonitor to actually install them.
> > > >
> > > > Cordially,
> > > >
> > > > FF
> > >
> > > --
> > > François-Frédéric Ozog | Director Linaro Edge & Fog Computing Group
> > > T: +33.67221.6485
> > > francois.ozog(a)linaro.org | Skype: ffozog
>
>
>
> --
> François-Frédéric Ozog | Director Linaro Edge & Fog Computing Group
> T: +33.67221.6485
> francois.ozog(a)linaro.org | Skype: ffozog
Hi Olivier and All,
Thank you so much for your help. It helps me understand the internals.
As a next step, I need to check the registration flow for this event_num (804) on the kernel side, am I right?
BRs,
Bin Wu
------------------原始邮件 ------------------
发件人:TF-A <tf-a-bounces(a)lists.trustedfirmware.org>
发送时间:Tue Apr 21 09:51:49 2020
收件人:Raghu K via TF-A <tf-a(a)lists.trustedfirmware.org>
主题:Re: [TF-A] 回复:Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Nice debug! Apart from the issue you pointed out, there is also the
issue with not checking the return code. The ras handler should really
be checking or panic'ing if there is an unexpected error code from
spm_sp_call and sdei_dispatch_event.
-Raghu
On 4/20/20 2:37 PM, Olivier Deprez via TF-A wrote:
> Hi Bin Wu,
>
> Here's an early observation. On receiving the RAS fiq interrupt the following occurs:
>
> ehf_el3_interrupt_handler => sgi_ras_intr_handler => spm_sp_call (enters/exit the SP to handle the injected RAS error) => sdei_dispatch_event
>
> se = get_event_entry(map);
> if (!can_sdei_state_trans(se, DO_DISPATCH))
> return -1;
>
> p *map
> $6 = {ev_num = 804, intr = 0, map_flags = 112, reg_count = 0, lock = {lock = 0}}
> p *se
> $4 = {ep = 0, arg = 0, affinity = 0, reg_flags = 0, state = 0 '\0'}
>
> sdei_dispatch_event exits in error at this stage, this does not seem a correct behavior.
> The SDEI handler is not called in NS world and context remains unchanged.
> The interrupt handler blindly returns to S-EL1 SP context at same location where it last exited.
> sgi_ras_intr_handler => ehf_el3_interrupt_handler => vector_entry fiq_aarch64 => el3_exit => re-enters the SP with X0=0xC4000061
> SP then exits but the EL3 context has not been setup for SP entry leading to crash.
>
> IMO there is an issue around mapping SDEI event number to RAS interrupt number leading to sdei_dispatch_event exiting early.
>
> Regards,
> Olivier.
>
>
> ________________________________________
> From: TF-A on behalf of Matteo Carlini via TF-A
> Sent: 14 April 2020 10:41
> To: 吴斌(郅隆); tf-a(a)lists.trustedfirmware.org; Thomas Abraham; Deepak Pandey
> Cc: nd
> Subject: Re: [TF-A] Re: Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
>
> Looping-in Thomas & Deepak, responsible for the RD-N1 landing team platforms releases. They might be able to help.
>
> Thanks
> Matteo
>
> From: TF-A On Behalf Of 吴斌(郅隆) via TF-A
> Sent: 14 April 2020 06:47
> To: TF-A; Raghu Krishnamurthy via TF-A
> Subject: [TF-A] Re: Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
>
> Hi Raghu,
>
> I really appreciate your help.
>
> I downloaded this software stack from git.linaro.org. It includes ATF, the kernel, edk2 and so on.
> The user guide I used from Linaro is: https://git.linaro.org/landing-teams/working/arm/arm-reference-platforms…
>
> 1) What platform you are running on? Can this issue be reproduced
> outside your testing environment, perhaps on FVP or QEMU?
> A: I am running on the ARM N1-Edge FVP platform. The issue can be reproduced on this FVP platform.
>
> 2) What version of TF-A and StandaloneMM is being used? Preferably the
> commit-id, so that we can be sure we are looking at the same code.
> A: TF-A: https://git.linaro.org/landing-teams/working/arm/arm-tf.git tag:RD-INFRA-20191024-RC0
> StandaloneMM seems to be built from edk2 & edk2-platforms, so I am just providing the edk2 and edk2-platforms version information. If I missed anything, please let me know.
> edk2: https://git.linaro.org/landing-teams/working/arm/edk2.git tag:RD-INFRA-20191024-RC0
> edk2-platform: https://git.linaro.org/landing-teams/working/arm/edk2-platforms.git tag:RD-INFRA-20191024-RC0
>
> 3) What version of the kernel and sdei driver is being used?
> A: kernel-release: https://git.linaro.org/landing-teams/working/arm/kernel-release.git tag:RD-INFRA-20191024-RC0
> The SDEI driver is included in the kernel. Do I need to provide the SDEI driver version? If so, please let me know.
> 4) I can't tell from looking at the log but do you know if writing 0x123
> to sde_ras_poison causes a DMC620 interrupt or an SError or external
> abort through memory access ?
> A: Sorry, the Linaro guide only says that it injects a DMC-620 single-bit RAS error, so I am also not sure which exception type it triggers.
>
> BRs,
> Bin Wu
>
> ------------------ Original Message ------------------
> From: TF-A
> Sent: Tue Apr 14 01:25:47 2020
> To: Raghu Krishnamurthy via TF-A
> Subject: Re: [TF-A] [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
> Hello,
>
> >>Does BL31 need to send 0xC4000061 event to BL32 again?
>
> I don't think it will. It is really odd that
> 0xC4000061(SP_EVENT_COMPLETE_AARCH64) ever reaches the BL32/MM handler.
> This is from looking at the upstream code quickly but it definitely
> depends on the platform you are running, what version of TF-A you are
> using, build options used. Is it possible that the unhandled exception
> is occurring after successful handling of the DMC620 error but there is
> a following issue that occurs right after, causing the crash?
> From the register dump it looks like there was an Instruction abort
> exception at address 0 while running in EL3. Something seems to have
> gone seriously wrong to have 0xC4000061 ever go back to BL32 and to get
> an instruction abort at address 0.
>
> >>Does current TF-A support to run RAS test? It seems BL31 will crash.
> See above. The answer really depends on the factors mentioned above.
>
> The following would be helpful to know:
> 1) What platform you are running on? Can this issue be reproduced
> outside your testing environment, perhaps on FVP or QEMU?
> 2) What version of TF-A and StandaloneMM is being used? Preferably the
> commit-id, so that we can be sure we are looking at the same code.
> 3) What version of the kernel and sdei driver is being used?
> 4) I can't tell from looking at the log but do you know if writing 0x123
> to sde_ras_poison causes a DMC620 interrupt or an SError or external
> abort through memory access ?
>
> Thanks
> Raghu
>
>
> On 4/13/20 12:16 AM, 吴斌(郅隆) via TF-A wrote:
>> Dear Friends,
>>
>> I am using TF-A to test the RAS feature.
>> When I trigger a DMC620 RAS error in Linux (echo 0x123 >
>> /sys/kernel/debug/sdei_ras_poison),
>> BL32 receives
>> UnRecognized Event - 0xC4000061(SP_EVENT_COMPLETE_AARCH64) and finally
>> BL31 crashes.
>>
>> In my understanding, this 0xC4000061 should be consumed by BL31, not sent
>> to BL32 again.
>>
>> A piece of the error log is below:
>>
>> *************************************
>>
>> CperWrite - CperAddress@0xFF610064
>> CperWrite - 1 Section@FFBE91A8, Length 80, SectionType@FFBE9138
>> CperWrite - Got Error Section: Platform Memory.
>> MmEntryPoint Done
>> Received delegated event
>> X0 : 0xC4000061
>> X1 : 0x0
>> X2 : 0x0
>> X3 : 0x0
>> Received event - 0xC4000061 on cpu 0
>> UnRecognized Event - 0xC4000061
>> Failed delegated event 0xC4000061, Status 0x2
>> Unhandled Exception in EL3.
>> x30 = 0x0000000000000000
>> x0 = 0x00000000ff007e00
>> x1 = 0xfffffffffffffffe
>> x2 = 0x00000000600003c0
>> x3 = 0x0000000000000000
>> x4 = 0x0000000000000000
>> x5 = 0x0000000000000000
>> x6 = 0x00000000ff015080
>> x7 = 0x0000000000000000
>> x8 = 0x00000000c4000061
>> x9 = 0x0000000000000021
>> x10 = 0x0000000000000040
>> x11 = 0x00000000ff00f2b0
>> x12 = 0x00000000ff0118c0
>> x13 = 0x0000000000000002
>> x14 = 0x00000000ff016b70
>> x15 = 0x00000000ff003f20
>> x16 = 0x0000000000000044
>> x17 = 0x00000000ff010430
>> x18 = 0x0000000000000e3c
>> x19 = 0x0000000000000000
>> For more of the error log, please refer to the attachment.
>>
>> My question is,
>> 1. Does BL31 need to send 0xC4000061 event to BL32 again?
>> 2. Does current TF-A support to run RAS test? It seems BL31 will crash.
>>
>> Appreciate your help.
>>
>> BRs,
>> Bin Wu
>>
> --
> TF-A mailing list
> TF-A(a)lists.trustedfirmware.org
> https://lists.trustedfirmware.org/mailman/listinfo/tf-a
My view is that smaller patches are easier to review and we should try to break up the patches into logical chunks where possible. I haven't taken a look at the patches myself but I am sure there will be ways to break it up for ease of review.
Best Regards
Soby Mathew
> -----Original Message-----
> From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Raghu
> Krishnamurthy via TF-A
> Sent: 20 April 2020 18:09
> To: Alexei Fedorov <Alexei.Fedorov(a)arm.com>; tf-a(a)lists.trustedfirmware.org
> Subject: Re: [TF-A] Event Log for Measured Boot
>
> Fair enough. I have no doubt it was tested. It is just extremely difficult to review
> such patches and I disagree with your statement.
> There is almost always a way to split patches up by using feature flags for
> example, that will help with not breaking the build. You can test them all
> together once you have all the patches. I also think it is perfectly reasonable to
> say measured boot cannot be turned on until a certain commit id is present.
> However, if you think this is the right approach, I have no issues.
>
> Thanks
> Raghu
>
> On 4/20/20 8:44 AM, Alexei Fedorov wrote:
> > Hi Raghu and Varun,
> >
> > This patch is a complete, tested and verified reference implementation
> > for FVP platform.
> > Splitting it will create a set of separate non-buildable patches
> > causing more complexity in following and understanding the code
> > changes and dependencies.
> > The whole patch with all the code present in it should be reviewed
> > anyway, and the time spent will be less than the time used for
> > reviewing separate patches (mass defect).
> >
> > Alexei
> >
> > ----------------------------------------------------------------------
> > --
> > *From:* TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of
> > Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
> > *Sent:* 02 April 2020 05:11
> > *To:* tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
> > *Subject:* Re: [TF-A] Event Log for Measured Boot
> >
> > Hi Alexei,
> >
> > I second Varun on this. The patch is huge. I recommend breaking it up
> > into multiple commits. I've reviewed it but since it is a large patch,
> > it might require a few more sittings to grasp all the changes(which
> > also means there may be some stupid review comments :)).
> >
> > -Raghu
> >
> > On 3/31/20 10:28 AM, Varun Wadekar via TF-A wrote:
> >> Hello Alexei,
> >>
> >> Just curious, the patch is huge and will take some time to review. Do
> >> you expect this change to be merged before the v2.3 release?
> >>
> >> -Varun
> >>
> >> *From:* TF-A <tf-a-bounces(a)lists.trustedfirmware.org> *On Behalf Of
> >> *Alexei Fedorov via TF-A
> >> *Sent:* Tuesday, March 31, 2020 7:19 AM
> >> *To:* tf-a(a)lists.trustedfirmware.org
> >> *Subject:* [TF-A] Event Log for Measured Boot
> >>
> >> *External email: Use caution opening links or attachments*
> >>
> >> Hi,
> >>
> >> Please review and provide your comments for the patch which adds
> >>
> >> Event Log generation for the Measured Boot.
> >>
> >> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3806
> >>
> >> Thanks.
> >>
> >> Alexei
> >>
> >> IMPORTANT NOTICE: The contents of this email and any attachments are
> >> confidential and may also be privileged. If you are not the intended
> >> recipient, please notify the sender immediately and do not disclose
> >> the contents to any other person, use it for any purpose, or store or
> >> copy the information in any medium. Thank you.
> >>
> >> ---------------------------------------------------------------------
> >> --- This email message is for the sole use of the intended
> >> recipient(s) and may contain confidential information. Any
> >> unauthorized review, use, disclosure or distribution is prohibited.
> >> If you are not the intended recipient, please contact the sender by
> >> reply email and destroy all copies of the original message.
> >> ---------------------------------------------------------------------
> >> ---
> >>
Hi Bin Wu,
Here's an early observation. On receiving the RAS FIQ interrupt, the following occurs:
ehf_el3_interrupt_handler => sgi_ras_intr_handler => spm_sp_call (enters/exits the SP to handle the injected RAS error) => sdei_dispatch_event
    se = get_event_entry(map);
    if (!can_sdei_state_trans(se, DO_DISPATCH))
        return -1;
    p *map
    $6 = {ev_num = 804, intr = 0, map_flags = 112, reg_count = 0, lock = {lock = 0}}
    p *se
    $4 = {ep = 0, arg = 0, affinity = 0, reg_flags = 0, state = 0 '\0'}
sdei_dispatch_event exits in error at this stage; this does not seem to be correct behavior.
The SDEI handler is not called in the NS world and the context remains unchanged.
The interrupt handler blindly returns to the S-EL1 SP context at the same location where it last exited.
sgi_ras_intr_handler => ehf_el3_interrupt_handler => vector_entry fiq_aarch64 => el3_exit => re-enters the SP with X0=0xC4000061
The SP then exits, but the EL3 context has not been set up for SP entry, leading to the crash.
IMO there is an issue around mapping the SDEI event number to the RAS interrupt number, leading to sdei_dispatch_event exiting early.
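To make the failure mode concrete, here is a simplified model of the state check. It is not the TF-A implementation: the flag names, the struct, and the rule that dispatch needs an event that is registered, enabled and not already running are assumptions for illustration only. With state = 0, as in the gdb dump of *se, the dispatch is refused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative state flags; the names and values are assumptions. */
#define EV_REGISTERED (1u << 0)
#define EV_ENABLED    (1u << 1)
#define EV_RUNNING    (1u << 2)

/* Minimal stand-in for an event entry; only the state is modelled. */
struct event_entry_model {
    uint8_t state;
};

/* Assumed rule: dispatch is only legal for an event that is registered
 * and enabled, and whose handler is not already running. */
static bool can_dispatch_model(const struct event_entry_model *se)
{
    return se->state == (EV_REGISTERED | EV_ENABLED);
}

/* Returns 0 and marks the event as running, or -1 if the transition is
 * not allowed (mirroring the early "return -1" shown above). */
static int dispatch_event_model(struct event_entry_model *se)
{
    if (!can_dispatch_model(se))
        return -1;

    se->state = (uint8_t)(se->state | EV_RUNNING);
    return 0;
}
```

Under this model an event that the normal world never registered and enabled can never be dispatched, which would be consistent with reg_count = 0 and state = 0 in the dump.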
Regards,
Olivier.
________________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Matteo Carlini via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 14 April 2020 10:41
To: 吴斌(郅隆); tf-a(a)lists.trustedfirmware.org; Thomas Abraham; Deepak Pandey
Cc: nd
Subject: Re: [TF-A] Re: Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Looping-in Thomas & Deepak, responsible for the RD-N1 landing team platforms releases. They might be able to help.
Thanks
Matteo
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of 吴斌(郅隆) via TF-A
Sent: 14 April 2020 06:47
To: TF-A <tf-a-bounces(a)lists.trustedfirmware.org>; Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] Re: Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Hi Raghu,
I really appreciate your help.
I downloaded this software stack from git.linaro.org. It includes ATF, the kernel, edk2 and so on.
The user guide I used from Linaro is: https://git.linaro.org/landing-teams/working/arm/arm-reference-platforms…
1) What platform you are running on? Can this issue be reproduced
outside your testing environment, perhaps on FVP or QEMU?
A: I am running on the ARM N1-Edge FVP platform. The issue can be reproduced on this FVP platform.
2) What version of TF-A and StandaloneMM is being used? Preferably the
commit-id, so that we can be sure we are looking at the same code.
A: TF-A: https://git.linaro.org/landing-teams/working/arm/arm-tf.git tag:RD-INFRA-20191024-RC0
StandaloneMM seems to be built from edk2 & edk2-platforms, so I am just providing the edk2 and edk2-platforms version information. If I missed anything, please let me know.
edk2: https://git.linaro.org/landing-teams/working/arm/edk2.git tag:RD-INFRA-20191024-RC0
edk2-platform: https://git.linaro.org/landing-teams/working/arm/edk2-platforms.git tag:RD-INFRA-20191024-RC0
3) What version of the kernel and sdei driver is being used?
A: kernel-release: https://git.linaro.org/landing-teams/working/arm/kernel-release.git tag:RD-INFRA-20191024-RC0
The SDEI driver is included in the kernel. Do I need to provide the SDEI driver version? If so, please let me know.
4) I can't tell from looking at the log but do you know if writing 0x123
to sde_ras_poison causes a DMC620 interrupt or an SError or external
abort through memory access ?
A: Sorry, the Linaro guide only says that it injects a DMC-620 single-bit RAS error, so I am also not sure which exception type it triggers.
BRs,
Bin Wu
------------------ Original Message ------------------
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org>
Sent: Tue Apr 14 01:25:47 2020
To: Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
Subject: Re: [TF-A] [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Hello,
>>Does BL31 need to send 0xC4000061 event to BL32 again?
I don't think it will. It is really odd that
0xC4000061(SP_EVENT_COMPLETE_AARCH64) ever reaches the BL32/MM handler.
This is from looking at the upstream code quickly but it definitely
depends on the platform you are running, what version of TF-A you are
using, build options used. Is it possible that the unhandled exception
is occurring after successful handling of the DMC620 error but there is
a following issue that occurs right after, causing the crash?
From the register dump it looks like there was an Instruction abort
exception at address 0 while running in EL3. Something seems to have
gone seriously wrong to have 0xC4000061 ever go back to BL32 and to get
an instruction abort at address 0.
>>Does current TF-A support to run RAS test? It seems BL31 will crash.
See above. The answer really depends on the factors mentioned above.
The following would be helpful to know:
1) What platform you are running on? Can this issue be reproduced
outside your testing environment, perhaps on FVP or QEMU?
2) What version of TF-A and StandaloneMM is being used? Preferably the
commit-id, so that we can be sure we are looking at the same code.
3) What version of the kernel and sdei driver is being used?
4) I can't tell from looking at the log but do you know if writing 0x123
to sde_ras_poison causes a DMC620 interrupt or an SError or external
abort through memory access ?
Thanks
Raghu
On 4/13/20 12:16 AM, 吴斌(郅隆) via TF-A wrote:
> Dear Friends,
>
> I am using TF-A to test the RAS feature.
> When I trigger a DMC620 RAS error in Linux (echo 0x123 >
> /sys/kernel/debug/sdei_ras_poison),
> BL32 receives
> UnRecognized Event - 0xC4000061(SP_EVENT_COMPLETE_AARCH64) and finally
> BL31 crashes.
>
> In my understanding, this 0xC4000061 should be consumed by BL31, not sent
> to BL32 again.
>
> A piece of the error log is below:
>
> *************************************
>
> CperWrite - CperAddress@0xFF610064
> CperWrite - 1 Section@FFBE91A8, Length 80, SectionType@FFBE9138
> CperWrite - Got Error Section: Platform Memory.
> MmEntryPoint Done
> Received delegated event
> X0 : 0xC4000061
> X1 : 0x0
> X2 : 0x0
> X3 : 0x0
> Received event - 0xC4000061 on cpu 0
> UnRecognized Event - 0xC4000061
> Failed delegated event 0xC4000061, Status 0x2
> Unhandled Exception in EL3.
> x30 = 0x0000000000000000
> x0 = 0x00000000ff007e00
> x1 = 0xfffffffffffffffe
> x2 = 0x00000000600003c0
> x3 = 0x0000000000000000
> x4 = 0x0000000000000000
> x5 = 0x0000000000000000
> x6 = 0x00000000ff015080
> x7 = 0x0000000000000000
> x8 = 0x00000000c4000061
> x9 = 0x0000000000000021
> x10 = 0x0000000000000040
> x11 = 0x00000000ff00f2b0
> x12 = 0x00000000ff0118c0
> x13 = 0x0000000000000002
> x14 = 0x00000000ff016b70
> x15 = 0x00000000ff003f20
> x16 = 0x0000000000000044
> x17 = 0x00000000ff010430
> x18 = 0x0000000000000e3c
> x19 = 0x0000000000000000
> For more of the error log, please refer to the attachment.
>
> My question is,
> 1. Does BL31 need to send 0xC4000061 event to BL32 again?
> 2. Does current TF-A support to run RAS test? It seems BL31 will crash.
>
> Appreciate your help.
>
> BRs,
> Bin Wu
>
Hi all,
Trusted Firmware-A and TF-A tests version 2.3 are now available and can be found here:
TF-A:
https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tag/?h=v2.3
TF-A-tests:
https://git.trustedfirmware.org/TF-A/tf-a-tests.git/tag/?h=v2.3
Please refer to the readme and change log for further information.
Thanks & best regards,
Bipin Ravi | Principal Design Engineer
Bipin.Ravi(a)arm.com | Mobile: +1-214-212-0794
5707 Southwest Parkway, Suite 100, Austin, TX 78735
Hi Raghu and Varun,
This patch is a complete, tested and verified reference implementation for FVP platform.
Splitting it will create a set of separate non-buildable patches causing more complexity in following
and understanding the code changes and dependencies.
The whole patch with all the code present in it should be reviewed anyway, and the time spent will be less
than the time used for reviewing separate patches (mass defect).
Alexei
________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 02 April 2020 05:11
To: tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
Subject: Re: [TF-A] Event Log for Measured Boot
Hi Alexei,
I second Varun on this. The patch is huge. I recommend breaking it up
into multiple commits. I've reviewed it but since it is a large patch,
it might require a few more sittings to grasp all the changes(which also
means there may be some stupid review comments :)).
-Raghu
On 3/31/20 10:28 AM, Varun Wadekar via TF-A wrote:
> Hello Alexei,
>
> Just curious, the patch is huge and will take some time to review. Do
> you expect this change to be merged before the v2.3 release?
>
> -Varun
>
> *From:* TF-A <tf-a-bounces(a)lists.trustedfirmware.org> *On Behalf Of
> *Alexei Fedorov via TF-A
> *Sent:* Tuesday, March 31, 2020 7:19 AM
> *To:* tf-a(a)lists.trustedfirmware.org
> *Subject:* [TF-A] Event Log for Measured Boot
>
> *External email: Use caution opening links or attachments*
>
> Hi,
>
> Please review and provide your comments for the patch which adds
>
> Event Log generation for the Measured Boot.
>
> https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/3806
>
> Thanks.
>
> Alexei
>
>
Hi,
If buffers are to be consumed by NWd only, it's probably better to contain the isolation at NS-EL2 (through Stage-2 MMU mapping).
Though from your statements it's not clear to me whether you wish to use a hypervisor or not.
If yes, which implementation or kind do you want to use?
If not, I wonder if you could re-use OS facilities like shared memory with different permissions for two child processes (might be tricky)?
I don't think execute-only regions exist in Cortex-A (unless implementing additional hardware). If a region is executable, it's probably also readable. Maybe you meant execute-never?
Regards,
Olivier.
________________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of François Ozog via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 20 April 2020 13:45
To: tf-a(a)lists.trustedfirmware.org
Subject: [TF-A] [RFC] isolation between runtime BL33 services and OS
Hi,
I am trying to identify a mechanism to enforce a form of two-way
isolation between BL33 runtime services and the OS, for instance:
- a pair of 2MB areas that could be RO for one entity and RW for the other
- an execute-only BL33 2MB area?
This is similar to hypervisor except it only deals with memory, no
vCPU, no GIC virtualization...
Could EL3 or EL2 install protective mappings ? BL33 could ask either
EL2 hypervisor or SecureMonitor to actually install them.
Cordially,
FF
Hi,
As part of integrating Hafnium within Trusted Firmware projects, a new mailing list has been created:
https://lists.trustedfirmware.org/mailman/listinfo/hafnium
You can subscribe to this list to take part in Hafnium discussions in general, and in the upcoming S-EL2 firmware design discussions.
Regards,
Olivier.
Hi François,
<quote>
Now, I changed U-Boot to Image, added code to ensure arg0 is 0 (DTB) but Linux does not start (zero printk visible).
I can't find kernel text base anymore; maybe it disappeared. Any suggestion for getting the kernel to start when loaded at a 64KB offset?
The kernel image with embedded initrd is 10MB and SRAM is at 64MB, so there should be enough space to decompress the initrd...
</quote>
I recommend reading through the arm64 Linux booting requirements if you haven’t already; you’ll need to ensure all of those conditions are met before entering the kernel [1].
For example, have you correctly set text_offset in the image header?
<quote>
u32 code0; /* Executable code */
u32 code1; /* Executable code */
u64 text_offset; /* Image load offset, little endian */
u64 image_size; /* Effective Image size, little endian */
u64 flags; /* kernel flags, little endian */
u64 res2 = 0; /* reserved */
u64 res3 = 0; /* reserved */
u64 res4 = 0; /* reserved */
u32 magic = 0x644d5241; /* Magic number, little endian, "ARM\x64" */
u32 res5; /* reserved (used for PE COFF offset) */
</quote>
<quote>
The Image must be placed text_offset bytes from a 2MB aligned base address anywhere in usable system RAM and called there.
</quote>
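To make the placement rule concrete, here is a minimal Python sketch that unpacks the 64-byte header quoted above and checks whether a candidate load address sits text_offset bytes past a 2MB-aligned base. The function names and the example values are illustrative assumptions, not taken from the kernel or from any particular Image; read the real text_offset from your own Image.

```python
import struct

ARM64_MAGIC = 0x644D5241      # "ARM\x64", little endian
HEADER_FMT = "<IIQQQQQQII"    # code0, code1, text_offset, image_size, flags,
                              # res2, res3, res4, magic, res5 (64 bytes total)
TWO_MB = 2 * 1024 * 1024


def parse_header(blob):
    """Return (text_offset, image_size, flags) from the start of an Image."""
    fields = struct.unpack_from(HEADER_FMT, blob, 0)
    text_offset, image_size, flags = fields[2], fields[3], fields[4]
    magic = fields[8]
    if magic != ARM64_MAGIC:
        raise ValueError("not an arm64 Image: bad magic 0x%08x" % magic)
    return text_offset, image_size, flags


def placement_ok(load_addr, text_offset):
    """True if load_addr is text_offset bytes past a 2MB-aligned base."""
    return (load_addr - text_offset) % TWO_MB == 0
```

With an illustrative text_offset of 0x80000, `placement_ok(0x4080000, 0x80000)` is true, while a load address of 0x10000 (a 64KB offset) fails the check.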
Kind regards,
Ash.
[1] https://elixir.bootlin.com/linux/latest/source/Documentation/arm64/booting.…
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi Feng,
This is standard practice for operating systems, hypervisors, and firmware running on Armv8-A systems. A key distinction between SPSel,#0 and SPSel,#1 is that you can tell which stack pointer you were using when you took an exception, as they correspond to different offsets in the vector table. Often, taking an exception from the same EL when using SPSel,#0 is non-terminal, whereas taking an exception from the same EL when already using SPSel,#1 is considered terminal.
Take, for example, the scenario where your operating system, hypervisor, or firmware is running some task/thread code at EL1/EL2/EL3 and runs out of stack space, so that an attempt to stack some data triggers a translation fault (we’ll probably be using unmapped guard pages at the stack boundaries). The first thing the exception handler will try to do is stack the GPRs. If the reason you took the exception is that the stack pointed to by SP_EL1/2/3 has itself overflowed, this attempt to stack the GPRs will itself cause a translation fault and you’ll get stuck in a recursive exception.
In contrast, if the reason you took the exception is that the stack pointed to by SP_EL0 has overflowed, the exception handler will successfully stack the GPRs to the SP_EL1/2/3 stack and be able to diagnose and log what went wrong before rebooting gracefully.
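The vector table point can be illustrated with a small Python sketch. It decodes an offset into the AArch64 vector table (four groups of four 0x80-byte entries) into the exception source and type, and applies the "same EL on SP_ELx is terminal" policy described above. The function names are made up for illustration, and the policy is the convention described in this mail, not something the architecture mandates.

```python
SOURCES = [
    "current EL, SP_EL0",   # offsets 0x000-0x180
    "current EL, SP_ELx",   # offsets 0x200-0x380
    "lower EL, AArch64",    # offsets 0x400-0x580
    "lower EL, AArch32",    # offsets 0x600-0x780
]
KINDS = ["Synchronous", "IRQ", "FIQ", "SError"]


def describe_vector(offset):
    """Map a vector table offset to (exception source, exception type)."""
    if offset % 0x80 != 0 or not 0 <= offset < 0x800:
        raise ValueError("not a vector entry offset: 0x%x" % offset)
    return SOURCES[offset // 0x200], KINDS[(offset % 0x200) // 0x80]


def is_terminal(offset):
    """Policy sketch: an exception taken from the current EL while already
    running on SP_ELx is treated as unrecoverable."""
    return offset // 0x200 == 1
```

A synchronous exception arriving at offset 0x000 was taken while running on SP_EL0, so the handler still has a good SP_ELx stack to work with; the same exception arriving at offset 0x200 means the handler's own stack may be the problem.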
Hope that helps,
Ash.
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Chen Feng via TF-A <tf-a(a)lists.trustedfirmware.org>
Reply to: Chen Feng <puck.chen(a)hisilicon.com>
Date: Friday, 17 April 2020 at 09:02
To: "tf-a(a)lists.trustedfirmware.org" <tf-a(a)lists.trustedfirmware.org>, Alexei Fedorov <Alexei.Fedorov(a)arm.com>, Yatharth Kochar <Yatharth.Kochar(a)arm.com>, Sandrine Bailleux <Sandrine.Bailleux(a)arm.com>
Cc: "puck.chen(a)hisilicon.com" <puck.chen(a)hisilicon.com>, "lizhong11(a)hisilicon.com" <lizhong11(a)hisilicon.com>
Subject: [TF-A] sp select in atf
Hello all,
I see that ATF uses different stack pointers; in particular, for SMC64 it uses SP_EL0.
So the unhandled-exception handler must switch to SP_EL0 to do
the backtrace dump, since SP_EL3 is used by default when an exception happens.
My question is: why use different stack pointers in the ATF code? Just using
SP_EL3 in all cases seems simpler.
Cheers,
- feng
As of now, ARM_LINUX_KERNEL_AS_BL33 is only supported when RESET_TO_BL31=1; along with it you need to pass PRELOADED_BL33_BASE as well as ARM_PRELOADED_DTB_BASE.
AFAIK this feature is not tested on platforms which use all the BL stages (BL1/BL2/BL31) from TF-A. The most likely reason is the loading and authentication of the Linux image:
BL2, which is responsible for loading the various images, does not have support for loading a Linux image.
On platforms with RESET_TO_BL31, TF-A relies on a prior loader to place the kernel and device tree blob at their respective addresses.
In short, if your platform has RESET_TO_BL31=1, it will be quite easy; otherwise you need to understand the BL2 loading mechanism and see if you can extend it to load Linux and the DTB.
For the kernel output format, zImage/Image should work.
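As a rough illustration of the constraints listed above, the following Python sketch checks a set of build options before they are handed to make. Only the four option names come from this mail; the helper names and the example addresses are hypothetical and would need to match your platform.

```python
def check_linux_as_bl33(opts):
    """Return a list of problems for a dict of TF-A build options."""
    problems = []
    if opts.get("ARM_LINUX_KERNEL_AS_BL33") != "1":
        return problems  # feature not requested, nothing to check
    if opts.get("RESET_TO_BL31") != "1":
        problems.append("RESET_TO_BL31=1 is required")
    for name in ("PRELOADED_BL33_BASE", "ARM_PRELOADED_DTB_BASE"):
        if name not in opts:
            problems.append(name + " must be set")
    return problems


def make_args(opts):
    """Render the options as NAME=value arguments for a make command line."""
    return ["%s=%s" % (name, value) for name, value in sorted(opts.items())]


# Hypothetical addresses, for illustration only.
example = {
    "ARM_LINUX_KERNEL_AS_BL33": "1",
    "RESET_TO_BL31": "1",
    "PRELOADED_BL33_BASE": "0x80080000",
    "ARM_PRELOADED_DTB_BASE": "0x82000000",
}
print(check_linux_as_bl33(example), make_args(example))
```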
Hope this helps!
________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of François Ozog via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 15 April 2020 13:31
To: tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] Linux as BL33
I want to use Linux as BL33 on a Marvell Macchiatobin.
Currently I have the successful boot flow:
TFA (mainline v2.2) -> U-Boot (Mainline 2020.04rc5) -> Kernel (5.6.3)
with U-root initrd (6.0.0, https://github.com/u-root/u-root ) ->
Ubuntu 19.10
The 5.6.3 "intermediary" kernel is 5.5MB uncompressed; the u-root initrd
is 3.5MB compressed (some form of golang based busybox).
I was pointed to the ARM_LINUX_KERNEL_AS_BL33 option which is not
supported on the Macchiatobin.
It does not look too difficult to add, but I'd like to have some
feedback/guidance on how to do it:
- how to add the option to the TFA platform
- how to generate a usable kernel (compile options? non relocatable
kernel? output format, i.e. Image, zImage, uImage...)
Thanks for your help
-FF
+Harb
________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Vivek Prasad via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: Thursday, April 16, 2020 10:19 AM
To: Stuart Yoder <stuart.yoder(a)arm.com>; Alexei Fedorov <Alexei.Fedorov(a)arm.com>; tf-a <tf-a(a)lists.trustedfirmware.org>; Raghu Krishnamurthy <raghu.ncstate(a)icloud.com>
Cc: Loc Ho <loc.ho(a)amperecomputing.com>; Vivek Kumar <vivek(a)amperecomputing.com>; Benjamin Chaffin <bchaffin(a)amperecomputing.com>; Ard Biesheuvel <Ard.Biesheuvel(a)arm.com>; Mohamad Ammar <moe(a)amperecomputing.com>; Charles Garcia-Tobin <Charles.Garcia-Tobin(a)arm.com>
Subject: Re: [TF-A] Proposal for Measured Boot Implementation
Hello Stuart, Alexei,
Chiming-in here on Ampere's behalf...
We analysed this proposal internally, and we see a number of issues with it, some of which were already raised by Raghu in previous threads.
Here is a summary of the main issues that we see.
* Only mbedtls is supported, and this is a fixed configuration at compile time.
  * We propose that there should be a variable for the algorithm to be used, which can be set up at initialization time.
* This solution takes the hash directly from the digest as the measurement, instead of using a computed hash. This is not safe, especially considering that measured boot may use a different hash bank, so the digest hash may not be correct/valid.
* Only the BL2 image is measured. Per the ARM SBSG, we must measure and log *all* images/boot phases:
  * BL31
  * BL32 (all secure partitions)
  * BL33 (UEFI or any other non-secure boot loader)
  * Once we ERET into BL33, the measured boot flow continues and is owned by that boot loader.
* We only see support for PCR0; any/all unsigned config data must be logged to PCR1.
* Passing PCRs to non-secure software before logging is not compliant with TCG Static-Root-of-Trust Measurement (SRTM) requirements.
  * This was discussed before in separate conversations, especially for systems where you are talking about two different signing domains, where BL33 is a different trust/signing domain.
  * BL33 should only do hash-log-extend; there is no need for BL33 to be aware of the current PCR value (beyond what is provided in the boot event log).
* Based on comments on the mail thread, there seem to be bad assumptions/expectations around TPM accessibility from the non-secure world.
  * SPI/I2C TPMs are expected to be accessed directly from the non-secure world instead of abstracting hardware details via the TCG CRB interface (which has already been standardized as the de facto mechanism for ARM on past mobile, client, and server solutions).
  * CRB will "just work" for Aptio/EDK2/Linux/Windows/Hyper-V/VMWare.
  * NOTE: This goes back to what is a “productizable” TPM solution. We want it to be a turn-key solution for customers without having to support/develop proprietary drivers.
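For readers less familiar with the hash-log-extend flow mentioned in the list above, here is a minimal Python sketch. It computes each measurement from the image bytes, appends it to an event log, and extends a PCR value as hash(old PCR || measurement). SHA-256, the all-zero initial PCR, the function names, and the placeholder image contents are assumptions made for illustration; this is not the TF-A implementation and it does not talk to a TPM.

```python
import hashlib


def measure(image):
    """Compute the measurement from the image contents."""
    return hashlib.sha256(image).digest()


def extend(pcr, digest):
    """TPM-style extend: new PCR = SHA-256(old PCR || digest)."""
    return hashlib.sha256(pcr + digest).digest()


def hash_log_extend(pcr, log, name, image):
    """Measure an image, record it in the event log, and extend the PCR."""
    digest = measure(image)
    log.append((name, digest))
    return extend(pcr, digest)


def replay(log):
    """Recompute the PCR value from the event log alone."""
    pcr = bytes(32)
    for _name, digest in log:
        pcr = extend(pcr, digest)
    return pcr


log = []
pcr = bytes(32)
for name, image in [("BL31", b"bl31"), ("BL32", b"bl32"), ("BL33", b"bl33")]:
    pcr = hash_log_extend(pcr, log, name, image)
```

Because the final value can be recomputed from the log, a later stage such as BL33 only needs to keep appending and extending; it never needs to be handed the current PCR value.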
-Vivek/Harb
Hi all,
Thanks to all who have commented on this proposal so far. I've edited
the original document to try and incorporate all feedback gathered so
far (through the TSC meeting, this email thread and the TF-A tech call).
Please have another look and flag anything I might have missed:
https://developer.trustedfirmware.org/w/collaboration/project-maintenance-p…
The major changes are:
== Removed concept of self-review ==
This is proving too controversial, several people do not want to allow
self-review.
Roles of maintainer and code owner are still cumulative but cannot be
both exercised for the same patch.
The exact method of dealing with review bottleneck is still to be
decided. In addition to the current proposal of increasing the
maintainers pool, the most popular alternatives mentioned so far are:
- Set a minimum wait time for feedback before a patch can be merged
without any further delay.
- Mandate distinct reviewers for a patch.
== Enhanced the section "Patch contribution Guidelines" ==
Mentioned that patches should be small, on-topic, with comprehensive
commit messages.
== Added a note about how to deal with disagreement ==
If reviewers cannot find a common ground, the proposal is to call out a
3rd-party maintainer.
== Removed "out-of-date" platform state ==
Squashed it into "limited support" to reduce the number of states.
== Removed "orphan" state from platform support life cycle ==
This concept is orthogonal to the level of functionality.
Added a note in the "Code Owner" section instead.
== Per-project guidelines as a complementary document ==
Added a list of things that it would typically cover.
== Added requirement on fully supported platforms to document the
features they support ==
== Added todo mentioning that the proposal might cover branching
strategies in the future ==
The full diff may be seen here:
https://developer.trustedfirmware.org/phriction/diff/73/?l=4&r=5
This proposal is still open for discussion at this stage and further
feedback is most welcome!
Regards,
Sandrine
Looping in Thomas & Deepak, who are responsible for the RD-N1 landing team platform releases. They might be able to help.
Thanks
Matteo
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of 吴斌(郅隆) via TF-A
Sent: 14 April 2020 06:47
To: TF-A <tf-a-bounces(a)lists.trustedfirmware.org>; Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
Subject: [TF-A] Re: Re: [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Hi Raghu,
I really appreciate your help.
I downloaded this software stack from git.linaro.org. The stack includes ATF, the kernel, edk2 and so on.
The user guide I used from Linaro is: https://git.linaro.org/landing-teams/working/arm/arm-reference-platforms…
1) What platform you are running on? Can this issue be reproduced
outside your testing environment, perhaps on FVP or QEMU?
A: I am running on the ARM N1-Edge FVP platform. It can be reproduced on this FVP platform.
2) What version of TF-A and StandaloneMM is being used? Preferably the
commit-id, so that we can be sure we are looking at the same code.
A: TF-A: https://git.linaro.org/landing-teams/working/arm/arm-tf.git tag:RD-INFRA-20191024-RC0
StandaloneMM seems to be built from edk2 & edk2-platforms, so I have just given the edk2 and edk2-platforms version information. If I missed anything, please let me know.
edk2: https://git.linaro.org/landing-teams/working/arm/edk2.git tag:RD-INFRA-20191024-RC0
edk2-platforms: https://git.linaro.org/landing-teams/working/arm/edk2-platforms.git tag:RD-INFRA-20191024-RC0
3) What version of the kernel and sdei driver is being used?
A: kernel-release: https://git.linaro.org/landing-teams/working/arm/kernel-release.git tag:RD-INFRA-20191024-RC0
The sdei driver is included in the kernel. Do I need to provide the sdei driver version? If so, please let me know.
4) I can't tell from looking at the log but do you know if writing 0x123
to sdei_ras_poison causes a DMC620 interrupt or an SError or external
abort through memory access?
A: Sorry, the Linaro guide only says that it will inject a DMC-620 single-bit RAS error, so I am also not sure which exception type it will trigger.
BRs,
Bin Wu
------------------ Original Message ------------------
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org>
Sent: Tue Apr 14 01:25:47 2020
To: Raghu Krishnamurthy via TF-A <tf-a(a)lists.trustedfirmware.org>
Subject: Re: [TF-A] [RAS] BL32 UnRecognized Event - 0xC4000061 and BL31 Crashed
Hello,
>>Does BL31 need to send 0xC4000061 event to BL32 again?
I don't think it will. It is really odd that
0xC4000061(SP_EVENT_COMPLETE_AARCH64) ever reaches the BL32/MM handler.
This is from looking at the upstream code quickly, but it definitely
depends on the platform you are running, what version of TF-A you are
using, and the build options used. Is it possible that the unhandled exception
is occurring after successful handling of the DMC620 error, and that a
separate issue occurring right after is what causes the crash?
From the register dump it looks like there was an Instruction abort
exception at address 0 while running in EL3. Something seems to have
gone seriously wrong to have 0xC4000061 ever go back to BL32 and to get
an instruction abort at address 0.
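As an aside, the value 0xC4000061 can be read as an SMC function identifier. A small Python sketch, assuming the standard SMC Calling Convention bit layout (bit 31 fast call, bit 30 SMC64, bits 29:24 owning entity, bits 15:0 function number); the helper name is made up for illustration:

```python
def decode_smc_fid(fid):
    """Split a 32-bit SMC function ID into its SMCCC fields."""
    return {
        "fast_call": bool((fid >> 31) & 1),
        "smc64": bool((fid >> 30) & 1),
        "owner": (fid >> 24) & 0x3F,
        "function": fid & 0xFFFF,
    }


fields = decode_smc_fid(0xC4000061)
print(fields)
# {'fast_call': True, 'smc64': True, 'owner': 4, 'function': 97}
```

Owner 4 is the Standard Secure Service range, so this is a fast SMC64 call with function number 0x61 in that range.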
>>Does current TF-A support to run RAS test? It seems BL31 will crash.
See above. The answer really depends on the factors mentioned above.
The following would be helpful to know:
1) What platform you are running on? Can this issue be reproduced
outside your testing environment, perhaps on FVP or QEMU?
2) What version of TF-A and StandaloneMM is being used? Preferably the
commit-id, so that we can be sure we are looking at the same code.
3) What version of the kernel and sdei driver is being used?
4) I can't tell from looking at the log but do you know if writing 0x123
to sdei_ras_poison causes a DMC620 interrupt or an SError or external
abort through memory access?
Thanks
Raghu
On 4/13/20 12:16 AM, 吴斌(郅隆) via TF-A wrote:
> Dear Friends,
>
> I am using TF-A to test the RAS feature.
> When I trigger a DMC620 RAS error in Linux (echo 0x123 >
> /sys/kernel/debug/sdei_ras_poison),
> BL32 receives
> UnRecognized Event - 0xC4000061(SP_EVENT_COMPLETE_AARCH64) and finally
> BL31 crashes.
>
> In my understanding, 0xC4000061 should be consumed by BL31, not sent
> to BL32 again.
>
> An excerpt of the error log is below:
>
> *************************************
>
> CperWrite - CperAddress@0xFF610064
> CperWrite - 1 Section@FFBE91A8, Length 80, SectionType@FFBE9138
> CperWrite - Got Error Section: Platform Memory.
> MmEntryPoint Done
> Received delegated event
> X0 : 0xC4000061
> X1 : 0x0
> X2 : 0x0
> X3 : 0x0
> Received event - 0xC4000061 on cpu 0
> UnRecognized Event - 0xC4000061
> Failed delegated event 0xC4000061, Status 0x2
> Unhandled Exception in EL3.
> x30 = 0x0000000000000000
> x0 = 0x00000000ff007e00
> x1 = 0xfffffffffffffffe
> x2 = 0x00000000600003c0
> x3 = 0x0000000000000000
> x4 = 0x0000000000000000
> x5 = 0x0000000000000000
> x6 = 0x00000000ff015080
> x7 = 0x0000000000000000
> x8 = 0x00000000c4000061
> x9 = 0x0000000000000021
> x10 = 0x0000000000000040
> x11 = 0x00000000ff00f2b0
> x12 = 0x00000000ff0118c0
> x13 = 0x0000000000000002
> x14 = 0x00000000ff016b70
> x15 = 0x00000000ff003f20
> x16 = 0x0000000000000044
> x17 = 0x00000000ff010430
> x18 = 0x0000000000000e3c
> x19 = 0x0000000000000000
> For more of the error log, please refer to the attachment.
>
> My questions are:
> 1. Does BL31 need to send the 0xC4000061 event to BL32 again?
> 2. Does current TF-A support running the RAS test? It seems BL31 will crash.
>
> Appreciate your help.
>
> BRs,
> Bin Wu
>
Hi Varun,
1. The value of '1' selects the ‘standard’ type of branch protection, which according to the GCC documentation
"turns on all types of branch protection features. If a feature has additional tuning options, then ‘standard’ sets it to its standard level."
This is equivalent to "bti+pac-ret".
2. Yes. See above and use an option value of '1'.
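The mapping can be written down as a small lookup, here as a Python sketch built from the table in the TF-A documentation quoted in the original question further down this thread. The dictionary and helper names are illustrative only.

```python
# BRANCH_PROTECTION value -> (GCC -mbranch-protection option, PAuth, BTI)
BRANCH_PROTECTION = {
    0: ("none", False, False),
    1: ("standard", True, True),
    2: ("pac-ret", True, False),
    3: ("pac-ret+leaf", True, False),
}


def values_with(pauth, bti):
    """Return the BRANCH_PROTECTION values giving exactly these features."""
    return [
        value
        for value, (_option, has_pauth, has_bti) in sorted(BRANCH_PROTECTION.items())
        if has_pauth == pauth and has_bti == bti
    ]


print(values_with(pauth=True, bti=True))   # [1]
```

In this table, value 1 is the only setting that enables both PAuth and BTI.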
Regards.
Alexei
________________________________
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> on behalf of Varun Wadekar via TF-A <tf-a(a)lists.trustedfirmware.org>
Sent: 10 April 2020 19:28
To: tf-a(a)lists.trustedfirmware.org <tf-a(a)lists.trustedfirmware.org>
Cc: Kalyani Chidambaram Vaidyanathan <kalyanic(a)nvidia.com>; Anthony Zhou <anzhou(a)nvidia.com>
Subject: Re: [TF-A] BRANCH_PROTECTION
Hello,
Can someone please help clarify?
-Varun
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Varun Wadekar via TF-A
Sent: Tuesday, April 7, 2020 9:58 PM
To: tf-a(a)lists.trustedfirmware.org
Cc: Kalyani Chidambaram Vaidyanathan <kalyanic(a)nvidia.com>; Anthony Zhou <anzhou(a)nvidia.com>
Subject: [TF-A] BRANCH_PROTECTION
Hello,
Can someone please help me understand if
1. a ‘value’ of ‘1’ for BRANCH_PROTECTION covers the PAuth protection provided by a value of ‘2’ and/or ‘3’?
2. there is a way to enable BTI and “pac-ret” at the same time?
The docs provide this information.
<snip>
- ``BRANCH_PROTECTION``: Numeric value to enable ARMv8.3 Pointer Authentication
and ARMv8.5 Branch Target Identification support for TF-A BL images themselves.
If enabled, it is needed to use a compiler that supports the option
``-mbranch-protection``. Selects the branch protection features to use:
- 0: Default value turns off all types of branch protection
- 1: Enables all types of branch protection features
- 2: Return address signing to its standard level
- 3: Extend the signing to include leaf functions
The table below summarizes ``BRANCH_PROTECTION`` values, GCC compilation options
and resulting PAuth/BTI features.
+-------+--------------+-------+-----+
| Value | GCC option | PAuth | BTI |
+=======+==============+=======+=====+
| 0 | none | N | N |
+-------+--------------+-------+-----+
| 1 | standard | Y | Y |
+-------+--------------+-------+-----+
| 2 | pac-ret | Y | N |
+-------+--------------+-------+-----+
| 3 | pac-ret+leaf | Y | N |
+-------+--------------+-------+-----+
This option defaults to 0 and this is an experimental feature.
Note that Pointer Authentication is enabled for Non-secure world
irrespective of the value of this option if the CPU supports it.
<snip>
Thanks,
Varun
________________________________
This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.