Hi Andre,
Looks like the issue is related to unhandled Exception in EL3. I have enabled serial prints in BL31, that is pointing to this. Please see the below logs. Any ideas??
INFO: ARM GICv2 driver initialized NOTICE: BL2: RZ/G Initial Program Loader(CA57) Rev.2.0.6 NOTICE: BL2: PRR is RZG G2M Ver.1.3 NOTICE: BL2: Board is HiHope RZ/G2M Rev.4.0 NOTICE: BL2: Boot device is QSPI Flash(40MHz) NOTICE: BL2: LCM state is unknown NOTICE: BL2: DDR3200(rev.0.40) NOTICE: BL2: [COLD_BOOT] NOTICE: BL2: DRAM Split is 2ch NOTICE: BL2: QoS is default setting(rev.0.19) NOTICE: BL2: DRAM refresh interval 1.95 usec NOTICE: BL2: Periodic Write DQ Training NOTICE: BL2: CH0: 400000000 - 47fffffff, 2 GiB NOTICE: BL2: CH2: 600000000 - 67fffffff, 2 GiB NOTICE: BL2: FDT at 0xe63236c8 NOTICE: BL2: v2.3(release):v2.3-786-ge62d7c7e9-dirty NOTICE: BL2: Built : 14:06:08, Oct 11 2020 NOTICE: BL2: Normal boot INFO: BL2: Doing platform setup INFO: BL2: Loading image id 3 NOTICE: BL2: dst=0xe6323300 src=0x8180000 len=512(0x200) NOTICE: BL2: dst=0x43f00000 src=0x8180400 len=6144(0x1800) WARNING: r-car ignoring the BL31 size from certificate,using RCAR_TRUSTED_SRAM_SIZE instead INFO: Loading image id=3 at address 0x44000000 NOTICE: rcar_file_len: len: 0x0003e000 NOTICE: BL2: dst=0x44000000 src=0x81c0000 len=253952(0x3e000) INFO: Image id=3 loaded: 0x44000000 - 0x4403e000 INFO: BL2: Loading image id 5 INFO: Loading image id=5 at address 0x50000000 NOTICE: rcar_file_len: len: 0x00100000 NOTICE: BL2: dst=0x50000000 src=0x8300000 len=1048576(0x100000) INFO: Image id=5 loaded: 0x50000000 - 0x50100000 NOTICE: BL2: Booting BL31 INFO: Entry point address = 0x44000000 INFO: SPSR = 0x3cd Unhandled Exception in EL3. x30 = 0x0000000000000000 x0 = 0x0000000000000001 x1 = 0x0000000000000000 x2 = 0x0000000000000000 x3 = 0x0000000000000004 x4 = 0x0000000000000047 x5 = 0x0000000044053fc0 x6 = 0x0000000000000000 x7 = 0x0000000000000000 x8 = 0x00000000e6322218 x9 = 0x00000000440000e8 x10 = 0x000000004400546c x11 = 0x00000000e63220c8 x12 = 0x00000000000000b8 x13 = 0x00000000000000a4 x14 = 0x00000000e63234d8 x15 = 0x00000000e63236c8 x16 = 0x00000000e630e730 x17 = 0x0000000044005010 x18 = 0x0000000000000013 x19 = 0x000000004400003c x20 = 0x0000000000000000 x21 = 0x0000000000000000 x22 = 0x0000000000000000 x23 = 0x0000000000000000 x24 = 0x0000000000000000 x25 = 0x0000000000000000 x26 = 0x0000000000000000 x27 = 0x0000000000000000 x28 = 0x0000000000000000 x29 = 0x00000000440084d0 scr_el3 = 0x0000000000000238 sctlr_el3 = 0x0000000030c5183a cptr_el3 = 0x0000000000000000 tcr_el3 = 0x0000000000000000 daif = 0x00000000000003c0 mair_el3 = 0x44e048e000098aa4 spsr_el3 = 0x00000000800002cc elr_el3 = 0x0000000000000000 ttbr0_el3 = 0x0000000000000000 esr_el3 = 0x0000000002000000 far_el3 = 0x0000000000000000 spsr_el1 = 0x0000000000000000 elr_el1 = 0x0000000000000000 spsr_abt = 0x0000000000000010 spsr_und = 0x0000000000000010 spsr_irq = 0x0000000000000010 spsr_fiq = 0x0000000000000010 sctlr_el1 = 0x0000000000c50838 actlr_el1 = 0x0000000000000000 cpacr_el1 = 0x0000000000300000 csselr_el1 = 0x0000000000000000 sp_el1 = 0x00000000e6360000 esr_el1 = 0x0000000000000000 ttbr0_el1 = 0x0000000000000000 ttbr1_el1 = 0x0000000000000000 mair_el1 = 0x44e048e000098aa4 amair_el1 = 0x0000000000000000 tcr_el1 = 0x0000000000000000 tpidr_el1 = 0x0000000000000000 tpidr_el0 = 0x0000000000000000 tpidrro_el0 = 0x0000000000000000 par_el1 = 0x0000000000000000 mpidr_el1 = 0x0000000080000000 afsr0_el1 = 0x0000000000000000 afsr1_el1 = 0x0000000000000000 contextidr_el1 = 0x0000000000000000 vbar_el1 = 0x0000000000000000 cntp_ctl_el0 = 0x0000000000000000 cntp_cval_el0 = 0x0000000000000000 cntv_ctl_el0 = 0x0000000000000000 cntv_cval_el0 = 0x0000000000000000 cntkctl_el1 = 0x0000000000000000 sp_el0 = 0x00000000440084d0 isr_el1 = 0x0000000000000000 dacr32_el2 = 0x0000000055555555 ifsr32_el2 = 0x0000000000000000 cpuectlr_el1 = 0x0000001b00000040 cpumerrsr_el1 = 0x0000000000000000 l2merrsr_el1 = 0x0000000000000000 gicc_hppir = 0x00000000000003ff gicc_ahppir = 0x00000000000003ff gicc_ctlr = 0x0000000000000000 gicd_ispendr regs (Offsets 0x200 - 0x278) Offset: value 0000000000000200: 0x0000000000000000 0000000000000208: 0x0000000000000000 0000000000000210: 0x0000000000000000 0000000000000218: 0x0000000000000000 0000000000000220: 0x0000000000000000 0000000000000228: 0x0000000000000000 0000000000000230: 0x0000000000000000 0000000000000238: 0x0000000000000000 0000000000000240: 0x0000000000000000 0000000000000248: 0x0000000000000000 0000000000000250: 0x0000000000000000 0000000000000258: 0x0000000000000000 0000000000000260: 0x0000000000000000 0000000000000268: 0x0000000000000000 0000000000000270: 0x0000000000000000 0000000000000278: 0x0000000000000000 cci_snoop_ctrl_cluster0x10000000000000000 cci_snoop_ctrl_cluster1x10000000000000000
Working log by commenting memset in rcar_log_init ------------------------------------------------------------------------
INFO: ARM GICv2 driver initialized NOTICE: BL2: RZ/G Initial Program Loader(CA57) Rev.2.0.6 NOTICE: BL2: PRR is RZG G2M Ver.1.3 NOTICE: BLDEBUG:dst=0xe63234b8, val = 0, count=80 NOTICE: BL2: Board is HiHope RZ/G2M Rev.4.0 NOTICE: BL2: Boot device is QSPI Flash(40MHz) NOTICE: BL2: LCM state is unknown NOTICE: BL2: DDR3200(rev.0.40) NOTICE: BL2: [COLD_BOOT] NOTICE: BL2: DRAM Split is 2ch NOTICE: BL2: QoS is default setting(rev.0.19) NOTICE: BL2: DRAM refresh interval 1.95 usec NOTICE: BL2: Periodic Write DQ Training NOTICE: BLDEBUG:dst=0xe63236c8, val = 0, count=4096 NOTICE: BLDEBUG:dst=0xe63234b8, val = 0, count=80 NOTICE: BL2: CH0: 400000000 - 47fffffff, 2 GiB NOTICE: BL2: CH2: 600000000 - 67fffffff, 2 GiB NOTICE: BLDEBUG:dst=0xe6323760, val = 0, count=20 NOTICE: BLDEBUG:dst=0xe6323760, val = 0, count=16 NOTICE: BL2: FDT at 0xe63236c8 NOTICE: BL2: v2.3(release):v2.3-787-g662de50a8-dirty NOTICE: BL2: Built : 13:08:51, Oct 11 2020 NOTICE: BL2: Normal boot INFO: BL2: Doing platform setup INFO: BL2: Loading image id 3 NOTICE: BLDEBUG:dst=0xe6323300, val = 0, count=512 NOTICE: BLDEBUG:dst=0xe6324728, val = 0, count=24 NOTICE: BL2: dst=0xe6323300 src=0x8180000 len=512(0x200) NOTICE: BL2: dst=0x43f00000 src=0x8180400 len=6144(0x1800) NOTICE: BLDEBUG:dst=0xe6324728, val = 0, count=24 WARNING: r-car ignoring the BL31 size from certificate,using RCAR_TRUSTED_SRAM_SIZE instead INFO: Loading image id=3 at address 0x44000000 NOTICE: rcar_file_len: len: 0x0003e000 NOTICE: BL2: dst=0x44000000 src=0x81c0000 len=253952(0x3e000) NOTICE: BLDEBUG:dst=0xe6324728, val = 0, count=24 INFO: Image id=3 loaded: 0x44000000 - 0x4403e000 NOTICE: BLDEBUG:dst=0xe6324740, val = 0, count=40 NOTICE: BLDEBUG:dst=0x4403e200, val = 0, count=224 INFO: BL2: Loading image id 5 NOTICE: BLDEBUG:dst=0xe6323300, val = 0, count=512 NOTICE: BLDEBUG:dst=0xe6324728, val = 0, count=24 INFO: Loading image id=5 at address 0x50000000 NOTICE: rcar_file_len: len: 0x00100000 NOTICE: BL2: dst=0x50000000 src=0x8300000 len=1048576(0x100000) NOTICE: BLDEBUG:dst=0xe6324728, val = 0, count=24 INFO: Image id=5 loaded: 0x50000000 - 0x50100000 NOTICE: BLDEBUG:dst=0xe6324740, val = 0, count=40 NOTICE: BL2: Booting BL31 INFO: Entry point address = 0x44000000 INFO: SPSR = 0x3cd NOTICE: BL3-1 : Rev.2.0.6 NOTICE: BL31: v2.3(release):v2.3-787-g662de50a8-dirty NOTICE: BL31: Built : 13:08:55, Oct 11 2020 INFO: ARM GICv2 driver initialized INFO: BL31: Initializing runtime services NOTICE: BLDEBUG:dst=0x4400b9b0, val = -1, count=128 INFO: BL31: Preparing for EL3 exit to normal world INFO: Entry point address = 0x50000000 INFO: SPSR = 0x3c5
U-Boot 2020.10-00535-gd0e9210a2f (Oct 08 2020 - 12:34:09 +0100)
CPU: Renesas Electronics R8A774A1 rev 1.3 Model: HopeRun HiHope RZ/G2M with sub board DRAM: 3.9 GiB MMC: mmc@ee100000: 0, mmc@ee160000: 1 Loading Environment from MMC... OK In: serial@e6e88000 Out: serial@e6e88000 Err: serial@e6e88000 Net: eth0: ethernet@e6800000 Hit any key to stop autoboot: 0
Regards, Biju
-----Original Message----- From: Biju Das Sent: 11 October 2020 11:01 To: André Przywara andre.przywara@arm.com; TF- A@lists.trustedfirmware.org; Madhukar Pappireddy madhukar.pappireddy@arm.com; Joanna.Farley@arm.com Cc: Chris Paterson Chris.Paterson2@renesas.com; Prabhakar Mahadev Lad prabhakar.mahadev-lad.rj@bp.renesas.com; Marek Vasut marek.vasut@gmail.com; Marek Vasut marek.vasut+renesas@gmail.com Subject: RE: commit 75fab6496e5fce9a11 ("libc: memset: improve performance by avoiding single byte writes") causing BL31 boot failure on Renesas RZ/G2 platforms.
Hi Andre,
Subject: Re: commit 75fab6496e5fce9a11 ("libc: memset: improve performance by avoiding single byte writes") causing BL31 boot failure on Renesas RZ/G2 platforms.
>> >>>>>> Subject: Re: commit 75fab6496e5fce9a11 ("libc: memset: >>>>>> improve performance by avoiding single byte writes") >>>>>> causing >>>>>> BL31 boot failure on Renesas RZ/G2 platforms. >>>>>> >>>>>> On 08/10/2020 20:05, Biju Das wrote: >>>>>>> Then on target, found that BL31 is failed to boot[3], >>>>>>> which is fixed by reverting the commit 75fab6496e5fce9a11 >>>>>>> ("libc: memset: improve performance by avoiding single >>>>>>> byte writes"), see >>>>>> the logs [3]. >>>>>> >>>>>> Mmmh, interesting. Can you build with "DEBUG=1" to get
more
>>>>>> output from BL31? >>>>> >>>>> Sure Will do and provide feedback. >>> >>>>>> I see some calls to memset() from code in
drivers/renesas/rcar.
>>>>>> Can you add some debug prints at the top of memset() to
dump
>>>>>> the parameters on each call? To see what breaks it? >>>>> >>>>> OK. >>> >>> I believe commenting [1] fixed the issue which corresponds to >>> memset >> parameters[2]. >>> Looks like calling rcar_log_init-->memset from [3] is causing >>> the
issue.
>> >> Many thanks for debugging this, that's helpful! >> >> But without your patch it is working fine. Only thing is, you >> have added extra code to >>> improve the speed. I am not sure how it is impacting here. >>> Currently I don't >> have access to Lauterbach to debug further. Any thoughts ??? >> >> So this RCAR_BL31_LOG_BASE address is mapped as device
memory,
what >> kind of device is this? SRAM? Or something non-volatile, even? >> Even if it's SRAM, sometimes the controller managing this does >> not support all AMBA accesses (hence it's a device, not memory). > > It is DRAM only, with Security protected attribute.
Mmh, interesting. Why is it then mapped as device?
#define MAP_ATFW_LOG
MAP_REGION_FLAT(RCAR_BL31_LOG_BASE,
\ RCAR_BL31_LOG_SIZE, \ MT_DEVICE | MT_RW | MT_SECURE)
You could try using MT_NORMAL instead. Or ...
>> In general letting the compiler access device memory directly >> is asking for trouble, every access should go through MMIO
accessors.
>> Linux has >> memset_io() for those kind of problems. >> >> Can you try the attached patch? That should downgrade the >> stores to 32-bit ones. Not a solution, but would help to isolate the
problem.
> > Patch didn't fix the issue.
I see, bummer. Does it work if you apply this on top?
diff --git a/lib/libc/memset.c b/lib/libc/memset.c index 5ad6ede51..7478871e3 100644 --- a/lib/libc/memset.c +++ b/lib/libc/memset.c @@ -7,6 +7,7 @@ #include <stddef.h> #include <string.h> #include <stdint.h> +#include <lib/mmio.h>
void *memset(void *dst, int val, size_t count) { @@ -34,7 +35,8 @@ void *memset(void *dst, int val, size_t count) /* Use 32-bit writes for as long as possible. */ ptr32 = (void *)ptr; for (; count >= 4; count -= 4) { -*ptr32++ = fill; +mmio_write_32((uintptr_t)ptr32, fill); +ptr32++; }
/* Handle the remaining part byte-per-byte. */
> However I copied the same code emmc_memset to rcar_memset()
and
with that it works fine. > Looks like byte access works OK. Any thoughts??
That's weird, I just see "str w1, [x3, x5]" in the disassembly, and since everything is nicely aligned, it should actually work. That is still with the note that letting the compiler access device memory is wrong, but the generated code is actually very similar in this special case
here.
Issue is with your original commit[1], it is entering infinite loop. I
commented that while loop and it is fine.
[1]https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tre e/ li b/libc/memset.c#n43
That's very weird. I stared at the code for a while, and can't see the issue. I also put it through my unit tests again, and tried to mimic your special case (clearing 81840 bytes at 0xyyyy0010). That worked
fine for me every time.
Do you see the actual problem?
Can you send me the resulting binary:
build/rcar/release/libc/memset.o?
Just to check what your compiler made of it?
Here it is [1] https://gitlab.com/bijudas/debug
Thanks, that is actually identical to what my compiler produced, so that means toolchain-wise we are on the same page.
Just a suggestion, Why can't we use finite for loops with pointer arithmetic
instead of the while loops?
- first for loop with ptr64 and ptr64 +(remainingcount >> 3) and
- second for loop with ptr and ptr + (remainingcount & 7)
But what would that change?
Hmm. The result same, it didn't fix the issue.
Please see the code generated by compiler for working case and non
working case.
Just to clarify: with "working" you mean "commenting the last while loop"?
Yes. I just want to highlight that some kind of issue is happening just before the while loop.
Which breaks the algorithm, as we would lose the tail bytes to be filled, in case the end of the to-be-filled region is not 64-bit aligned? So it's not an option to just skip that. It just happens to work in your case because the end *is* 64-bit aligned, so the final loop is not needed.
Agreed.
And besides, I still don't see the problem. I stepped through the disassembly, with (<aligned ptr>, 0, 81840) for the parameters and don't see why there would be a problem. Once it's done with the 64-bit write section, both x2 and x6 are 0, so the cmp at 0x48 sets the Z bit, and "b.eq 68" branches to the
ret.
So the whole final loop is skipped, as intended.
Please see the line inserted by compiler for non working case. 3c:928000e4 movx4, #0xfffffffffffffff8 // #-8
Yes, that has a particular purpose (which is not very obvious because this is - Os optimised): It is used to *subtract* the already filled bytes from the original length, to then serve as the limit for the final while loop - which is actually turned in a for loop in assembly! There is no msub in AArch64, so the compiler uses madd and makes one of the factors negative.
So, can you please double check that: memset(0x44040010, 0, 81840); is really the call that hangs your system?
Yes, This is called from console.S-->rcar_log_int-->memset
And that it really does because so it's being stuck in the last while loop?
After further investigation, not sure it is related to while loop or some kind of corruption?, since I don't have access to JTAG further debugging. Mostly some kind of corruption leading to crash.
The new code introduced, some increased code size, which is causing bl31 boot failure, when we call memset from console.S
Case 1) console.S -->rcar_log_int -->memset --> bl31 not booting.
Case 2) If I comment, rcar_log_init from console.S --> bl31 boots fine.
Case 3) If I comment, rcar_log_init from console.S and call rcar_log_init from bl31_early_platform_setup2 --> bl31 boots fine.
Any thoughts?
And can you make this RCAR_BL31_LOG_BASE region MT_NORMAL
memory and
see if the problem still persists?
Yes , still the problem persists. The current mapping "MT_DEVICE" is as per "Use of coherent memory in TF- A" section as mentioned in [1]. By changing the region as MT_NORMAL also didn't help , see the patch [2]. Please share your opinion on this?
[1] https://github.com/ARM-software/arm-trusted- firmware/blob/master/docs/design/firmware-design.rst [2] diff --git a/include/lib/xlat_tables/xlat_tables_v2.h b/include/lib/xlat_tables/xlat_tables_v2.h index 359b9839a..c50727a42 100644 --- a/include/lib/xlat_tables/xlat_tables_v2.h +++ b/include/lib/xlat_tables/xlat_tables_v2.h @@ -86,6 +86,7 @@ #define MT_DEVICE U(0) #define MT_NON_CACHEABLE U(1) #define MT_MEMORY U(2) +#define MT_NORMAL U(4) /* Values up to 7 are reserved to add new memory types in the future */
#define MT_RO (U(0) << MT_PERM_SHIFT) diff --git a/plat/renesas/rcar/aarch64/platform_common.c b/plat/renesas/rcar/aarch64/platform_common.c index b0a88cb6b..744c79c98 100644 --- a/plat/renesas/rcar/aarch64/platform_common.c +++ b/plat/renesas/rcar/aarch64/platform_common.c @@ -64,7 +64,7 @@ const uint8_t version_of_renesas[VERSION_OF_RENESAS_MAXLEN]
#define MAP_ATFW_LOG MAP_REGION_FLAT(RCAR_BL31_LOG_BASE, \ RCAR_BL31_LOG_SIZE, \
MT_DEVICE | MT_RW | MT_SECURE)
MT_NORMAL | MT_RW | MT_SECURE)
Sure, having no tail loop simplifies the algorithm, but is wrong. So what is the point here?
Agreed, I just want to give some pointers related to this issue, so that we can narrow down the issue. that's all.
Renesas Electronics Europe GmbH, Geschaeftsfuehrer/President: Carsten Jauch, Sitz der Gesellschaft/Registered office: Duesseldorf, Arcadiastrasse 10, 40472 Duesseldorf, Germany, Handelsregister/Commercial Register: Duesseldorf, HRB 3708 USt-IDNr./Tax identification no.: DE 119353406 WEEE-Reg.-Nr./WEEE reg. no.: DE 14978647