Hi,
I'm testing with Hafnium as SPMC at S-EL2 and OP-TEE as an SP at S-EL1 on
QEMU v7.0.0. I've run into a few problems and fixed most of them.
I believe the setup is similar to what Shiju is using in this mail thread
https://lists.trustedfirmware.org/archives/list/hafnium@lists.trustedfirmwa…
My setup can be duplicated with:
repo init -u https://github.com/jenswi-linaro/manifest.git -m qemu_v8.xml \
-b qemu_sel2
repo sync -j8
(cd hafnium && git submodule init && git submodule update)
cd build
make -j8 toolchains
make -j8 all
make run-only
With this xtest -x 1034 passes, xtest 1034 often causes
ERROR: Data abort: pc=0xe1198b8, esr=0x96000006, ec=0x25, far=0x9c
Panic: EL2 exception
Xtest runs dreadfully slow, I haven't investigated why yet, but at
least it works.
This is based on patches provided by Olivier at:
[1] https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/16412/2
[2] https://review.trustedfirmware.org/c/hafnium/hafnium/+/16323/7
I've also encountered the problem cache maintenance problem Shiju
described in the mail thread above:
NOTICE: Trapped access to system register write: op0=1, op1=0, crn=7,
crm=14, op2=2, rt=11.
It can be worked around by compiling OP-TEE with
CFG_CORE_WORKAROUND_NSITR_CACHE_PRIME=n. I'm pretty sure we do dcache
clean+inv elsewhere so I'm surprised it fails here. Is Hafnium expected
to block dcache clean+inv?
For Hafnium I've added two patches on top of [2], available at
https://github.com/jenswi-linaro/hafnium/tree/qemu_sel2:
- 79b4d2cbe06e SPMC: add missing ME initialization for secondary cores
- 659c79d5eacf feat(mm): fix FEAT_LPA workaround
For TF-A I've added a few patches on top of [1], available at
https://github.com/jenswi-linaro/arm-trusted-firmware/tree/qemu_sel2:
- a040396cae9e feat(qemu): add tos-fw-config for sel2 spmc
- 4f7d91723485 fix(qemu): change TOS_FW_CONFIG_NAME value
- fbfc9a222c7f spmd_smc_handler() add s/ns state to SMC traces
- ca65081b9cdc feat(sptool): add dependency to SP image
- b1e1b46a0680 fix(qemu): restore code to added needed psci nodes
For OP-TEE I've also added a few patches, available at
https://github.com/jenswi-linaro/optee_os/tree/qemu_sel2:
- 1057def23777 plat-vexpress: sel2 spmc: update for hafnium
- f18a54ed3524 core: ffa: use hvc instead of smc with S-EL2
- d18bbc92f7c1 core: mobj_ffa_add_pages_at() trust addresses from SPMC
There's also one patch for QEMU on top of v7.0.0, available at:
https://github.com/jenswi-linaro/qemu/tree/qemu_sel2
- 0c1e39672dcb Read PS bits from VTCR_EL2
The QEMU problem is fixed in v.7.1.0, but I can't get that version of
QEMU to work with TF-A. I guess it's because of yet another new CPU
feature since I'm running with "-cpu max".
I'll try to upstream the Hafnium and TF-A patches that make sense on
their own.
What's the plan with the interrupt controller?
How will OP-TEE be able to handle secure interrupts?
The hafnium git pulls in a few git submodules and even the source code
for a Linux kernel.
I guess this is useful in your internal CI setup, but when used
isolated as in my setup it makes no sense at all.
It would also be nice to be able to build with an external toolchain.
I hope this is a temporary situation, I don't see why Hafnium should
be pickier about toolchain than for instance TF-A.
Speaking of building, I haven't been able to figure out how to build
only for the QEMU variant I need so right now I'm building for
everything and that takes a bit longer than necessary.
I'm going to maintain the setup above as long as it's relevant to me. I may
add more patches on the branches or even rebase as needed. So if anyone is
using this, keep in mind that my branches may change without warning.
Thanks,
Jens
Hi Jens,
I have a couple of Hafnium changes implementing the use of VSTTBR_EL2/VTTBR_EL2 to split an SP IPA into secure and non-secure IPA spaces.
They're very much in experimental stage so difficult to share just now (I will do some time later in February).
However I'd like to report some possible issue observed with qemu.
Essentially, when normal world driver inits, it performs a first share operation for a single NS page:
INFO: 1> 1 0 FFA_MEM_SHARE_32(84000073) 50 50 0 0 0 0 0
[...]
VERBOSE: Marked sending complete.
Current share states:
SHARE 0x0 (from VM 0x0, attributes 0x6f (NS), flags 0x8, tag 0, to 1 recipients [VM 0x8001: 0x6 (offset 48)]): fully sent with 1 fragments, 0 retrieved, sender's original mode: 0x87
INFO: 1< 1 0 FFA_SUCCESS_32(84000061) 0 0 0 0 0 0 0
INFO: 1> 1 0 FFA_MSG_SEND_DIRECT_REQ_32(8400006f) 8001 0 80000000 0 0 0 0
E/TC:1 0 mobj_ffa_get_by_cookie:387 Populating mobj from rx buffer, cookie 0
Retrieve operation happens:
E/TC:1 0 spmc_retrieve_req:1415 spmc_retrieve_req enter.
INFO: 1> 1 8001 FFA_MEM_RETRIEVE_REQ_32(84000074) 30 30 0 0 0 0 0
Current share states:
SHARE 0x0 (from VM 0x0, attributes 0x6f (NS), flags 0x8, tag 0, to 1 recipients [VM 0x8001: 0x6 (offset 48)]): fully sent with 1 fragments, 0 retrieved, sender's original mode: 0x87
Current share states:
SHARE 0x0 (from VM 0x0, attributes 0x6f (NS), flags 0x8, tag 0, to 1 recipients [VM 0x8001: 0x6 (offset 48)]): fully sent with 1 fragments, 1 retrieved, sender's original mode: 0x87
INFO: 1< 1 8001 Unknown(84000075) 50 50 0 0 0 0 0
Hafnium maps the NS page into OP-TEE's S2 page tables rooted to by VTTBR_EL2
0: e178003 S
1: e179003 S
f: e17a003 S
186: 240000041f867ff NS
(similar dump from VSTTBR_EL2 show OP-TEE secure pages properly mapped)
OP-TEE then maps the page in its S1 PTs as NS:
E/TC:1 0 spmc_retrieve_req:1428 spmc_retrieve_req exit.
E/TC:1 0 thread_spmc_populate_mobj_from_rx:1506 thread_spmc_populate_mobj_from_rx exit.
E/TC:1 0 set_pages:1461 set_pages 0 addr 41f86000 count 1
E/TC:1 0 mobj_ffa_add_pages_at:220 mobj_ffa_add_pages_at is_ns 0
INFO: 1> 1 8001 FFA_RX_RELEASE_32(84000065) 0 0 0 0 0 0 0
INFO: 1< 1 8001 FFA_SUCCESS_32(84000061) 0 0 0 0 0 0 0
E/TC:1 0 ffa_inc_map:566 ffa_inc_map addr fa00000 pages 0x90000000e3eadd0 sz 4096
D/TC:1 0 core_mmu_xlat_table_alloc:526 xlat tables used 4 / 5
A page fault is hit when OP-TEE accesses the page from its VA:
WARNING: Stage-2 page fault: pc=0xe30b764, vmid=0x8001, vcpu=1, vaddr=0xfa0001c, ipaddr=0x41f8601c, mode=0x1 0x40000000000007c
This issue is not observed with the TC2 FVP and similar Hafnium+OP-TEE SW stack, at the same point of initialization.
So it seems qemu is not doing the translation properly from VTTBR_EL2 for a page mapped NS by OP-TEE (hence NS IPA space).
Who should I report this problem to?
Regards,
Olivier.
Hi, Olivier,
Whether Boot Information Blob is a file similar to manifest.dtb or sp.pkg?
How do I generate this blob?
Does it need to be packaged into atf firmware at compile phase?
Thanks.
Regards,
Yuye.
------------------------------------------------------------------
发件人:Olivier Deprez <Olivier.Deprez(a)arm.com>
发送时间:2023年3月21日(星期二) 22:37
收件人:Jens Wiklander <jens.wiklander(a)linaro.org>
抄 送:hafnium(a)lists.trustedfirmware.org <hafnium(a)lists.trustedfirmware.org>; OP-TEE TrustedFirmware <op-tee(a)lists.trustedfirmware.org>
主 题:Re: [Hafnium] Boot arguments for S-EL1 with SPMC at S-EL2
Hi Jens,
See comments inline [OD].
Regards,
Olivier.
________________________________
From: Jens Wiklander via Hafnium <hafnium(a)lists.trustedfirmware.org>
Sent: 21 March 2023 09:30
To: Olivier Deprez <Olivier.Deprez(a)arm.com>
Cc: hafnium(a)lists.trustedfirmware.org <hafnium(a)lists.trustedfirmware.org>; OP-TEE TrustedFirmware <op-tee(a)lists.trustedfirmware.org>
Subject: [Hafnium] Boot arguments for S-EL1 with SPMC at S-EL2
Hi Olivier,
I'm trying to implement a relocatable OP-TEE binary so it can be
loaded at different physical addresses without the need to recompile
it. This means that in the case with Hafnium when changing
"load-address" or "entrypoint-offset" in the OP-TEE SP manifest
there's no need to recompile OP-TEE. For this to work OP-TEE must be
able to figure out which memory range it's supposed to reside in.
[OD] Fair enough, I believe that would conflate into some generic PIE support?
I can see how this helps with the multi OP-TEE instances use case, perhaps?
Currently, OP-TEE knows the entry point address from PC and "memory
size" from X0. However, the "memory size" is from the "load-address"
so "entrypoint-offset" must be subtracted from PC in order to know the
allocated memory range.
Do you have ideas on how OP-TEE at runtime can determine the allocated
memory range?
[OD] The standard way would be to use the FF-A boot protocol e.g. if you add this to OP-TEE's SP manifest:
+ /* Boot protocol */
+ gp-register-num = <0x0>;
+
+ /* Boot Info */
+ boot-info {
+ compatible = "arm,ffa-manifest-boot-info";
+ ffa_manifest;
+ };
See https://trustedfirmware-a.readthedocs.io/en/latest/components/secure-partit… <https://trustedfirmware-a.readthedocs.io/en/latest/components/secure-partit… >
At SP entry, x0 contains the address of a 'Boot Information Blob' (FF-A v1.1 REL0 section 5.4.2) starting with a Boot Information Header (Table 5.9).
The SP manifest DT address can be extracted from a Boot Information Descriptor (Table 5.8).
Eventually the SP is able to parse its own SP manifest, into which it finds FF-A standard fields (load-address...) in addition to any impdef field added.
Note the partition size is not standarded (it should probably be), so for starters you can add your own memsize field.
There is some sample usage in TF-a-tests:
https://git.trustedfirmware.org/TF-A/tf-a-tests.git/tree/spm/cactus/cactus_… <https://git.trustedfirmware.org/TF-A/tf-a-tests.git/tree/spm/cactus/cactus_… >
and also in context of EDKII/StMM:
https://github.com/odeprez/edk2/commit/7066c913cb227056d68ea99de5e92f2067fe… <https://github.com/odeprez/edk2/commit/7066c913cb227056d68ea99de5e92f2067fe… >
Thanks,
Jens
--
Hafnium mailing list -- hafnium(a)lists.trustedfirmware.org
To unsubscribe send an email to hafnium-leave(a)lists.trustedfirmware.org
Hi, experts,
We seem to have solved the problem. This problem is not caused by the load_address of optee, but by the security space configured in the ATF.
It seems that the security space we have configured is too large. And why will it causes the problem needs to be further identified.
Regards,
Yuye.
------------------------------------------------------------------
发件人:梅建强(禹夜) <meijianqiang.mjq(a)alibaba-inc.com>
发送时间:2023年3月22日(星期三) 11:50
收件人:Olivier Deprez <Olivier.Deprez(a)arm.com>; Jens Wiklander <jens.wiklander(a)linaro.org>
抄 送:hafnium <hafnium(a)lists.trustedfirmware.org>; op-tee <op-tee(a)lists.trustedfirmware.org>
主 题:load optee at 36bit address
Hi, experts,
We are currently moving optee to a 36bit address to boot with environment
that hafnium runs at sel2 as spmc and optee runs at sel1 as sp.
Now we have moved hafnium to 0x880000000 and run successfully.
Then we tried moving optee to a 36bit address (0x89000000) as well.
Although hafnium and optee were successfully initialized on the primary cpu,
psci_cpu_on does not seem to be entered into when the secondary cpu is started.
The error is as follows:
https://github.com/OP-TEE/optee_os/issues/5895 <https://github.com/OP-TEE/optee_os/issues/5895 >
Is there any difference between the two cases
where hafnium and optee initialize the secondary cpu with different load_address?
The log print shows that the secondary cpu has not entered hafnium.
Could Hafnium be affected by the 36bit address when dealing with psci related transactions?
Regards,
Yuye.
Hi, Olivier
I need to explain the query in more detail.
>For building/running hafnium , did you have to apply a change similar to:
>https://review.trustedfirmware.org/c/hafnium/hafnium/+/18510 <https://review.trustedfirmware.org/c/hafnium/hafnium/+/18510 >
Yes.
Some time ago, we used hafnium codebase based on the commit ca03054ba6a351534b93e6d64c12e671578eb340.
And I verified the fix. The compilation passed, but the system cannot boot successfully.
Later, we updated the hafnium codebase based on the commit dd883207ee9b31c19169adf97c918d561dcb9a5c.
And I verified the fix again. The system can boot successfully.
Regards,
Yuye.
------------------------------------------------------------------
发件人:Olivier Deprez <Olivier.Deprez(a)arm.com>
发送时间:2023年3月23日(星期四) 16:54
收件人:梅建强(禹夜) <meijianqiang.mjq(a)alibaba-inc.com>
主 题:Re: load optee at 36bit address
Hi Yuye, just to acknowledge receipt of this query.
Please accept a slow response as I'm currently away from my office.
I will do a few trials at my end to reproduce and try to help at my best.
For building/running hafnium , did you have to apply a change similar to:
https://review.trustedfirmware.org/c/hafnium/hafnium/+/18510 <https://review.trustedfirmware.org/c/hafnium/hafnium/+/18510 >
Regards,
Olivier.
From: 梅建强(禹夜) <meijianqiang.mjq(a)alibaba-inc.com>
Sent: 22 March 2023 04:50
To: Olivier Deprez <Olivier.Deprez(a)arm.com>; Jens Wiklander <jens.wiklander(a)linaro.org>
Cc: hafnium <hafnium(a)lists.trustedfirmware.org>; op-tee <op-tee(a)lists.trustedfirmware.org>
Subject: load optee at 36bit address
Hi, experts,
We are currently moving optee to a 36bit address to boot with environment
that hafnium runs at sel2 as spmc and optee runs at sel1 as sp.
Now we have moved hafnium to 0x880000000 and run successfully.
Then we tried moving optee to a 36bit address (0x89000000) as well.
Although hafnium and optee were successfully initialized on the primary cpu,
psci_cpu_on does not seem to be entered into when the secondary cpu is started.
The error is as follows:
https://github.com/OP-TEE/optee_os/issues/5895 <https://github.com/OP-TEE/optee_os/issues/5895 >
Is there any difference between the two cases
where hafnium and optee initialize the secondary cpu with different load_address?
The log print shows that the secondary cpu has not entered hafnium.
Could Hafnium be affected by the 36bit address when dealing with psci related transactions?
Regards,
Yuye.
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi Jens,
We're preparing for the Hafnium changes introducing FF-A v1.1 mem sharing structures, up to:
https://review.trustedfirmware.org/c/hafnium/hafnium/+/17399
This is done in a backwards compatible manner, in which a FF-A v1.0 endpoint can still use former mem sharing struct definitions.
The requirement is that the endpoint provides the right version either by its manifest, or by calling FFA_VERSION when booted.
In order for this transition to happen smoothly it would require OP-TEE to declare the v1.0 version here:
https://github.com/jenswi-linaro/build/blob/qemu_sel2/qemu_v8/optee_sp_mani…
or call FFA_VERSION first thing:
diff --git a/core/arch/arm/kernel/thread_spmc.c b/core/arch/arm/kernel/thread_spmc.c
index 240bcffe..893cb63b 100644
--- a/core/arch/arm/kernel/thread_spmc.c
+++ b/core/arch/arm/kernel/thread_spmc.c
@@ -1382,6 +1382,18 @@ static uint16_t spmc_get_id(void)
return args.a2;
}
+static uint16_t spmc_version(void)
+{
+ struct thread_smc_args args = {
+ .a0 = FFA_VERSION,
+ .a1 = MAKE_FFA_VERSION(FFA_VERSION_MAJOR, FFA_VERSION_MINOR),
+ };
+
+ thread_smccc(&args);
+
+ return (uint16_t)args.a0;
+}
+
static struct ffa_mem_transaction *spmc_retrieve_req(uint64_t cookie)
{
struct ffa_mem_transaction *trans_descr = nw_rxtx.tx;
@@ -1519,6 +1531,10 @@ out:
static TEE_Result spmc_init(void)
{
+ DMSG("OP-TEE FF-A version %x, SPMC version %x",
+ MAKE_FFA_VERSION(FFA_VERSION_MAJOR, FFA_VERSION_MINOR),
+ spmc_version());
+
spmc_rxtx_map(&nw_rxtx);
my_endpoint_id = spmc_get_id();
DMSG("My endpoint ID %#x", my_endpoint_id);
I also noticed PR #5359 introducing the v1.1 mem sharing structures, so this may be another way forward that I did not investigate.
https://github.com/jenswi-linaro/optee_os/commit/ddd107f019386d035488e3b4e7…
Let me know your opinions.
Regards,
Olivier.
Hi Olivier,
I'm trying to implement a relocatable OP-TEE binary so it can be
loaded at different physical addresses without the need to recompile
it. This means that in the case with Hafnium when changing
"load-address" or "entrypoint-offset" in the OP-TEE SP manifest
there's no need to recompile OP-TEE. For this to work OP-TEE must be
able to figure out which memory range it's supposed to reside in.
Currently, OP-TEE knows the entry point address from PC and "memory
size" from X0. However, the "memory size" is from the "load-address"
so "entrypoint-offset" must be subtracted from PC in order to know the
allocated memory range.
Do you have ideas on how OP-TEE at runtime can determine the allocated
memory range?
Thanks,
Jens
Hi, experts,
We are currently moving optee to a 36bit address to boot with environment
that hafnium runs at sel2 as spmc and optee runs at sel1 as sp.
Now we have moved hafnium to 0x880000000 and run successfully.
Then we tried moving optee to a 36bit address (0x89000000) as well.
Although hafnium and optee were successfully initialized on the primary cpu,
psci_cpu_on does not seem to be entered into when the secondary cpu is started.
The error is as follows:
https://github.com/OP-TEE/optee_os/issues/5895 <https://github.com/OP-TEE/optee_os/issues/5895 >
Is there any difference between the two cases
where hafnium and optee initialize the secondary cpu with different load_address?
The log print shows that the secondary cpu has not entered hafnium.
Could Hafnium be affected by the 36bit address when dealing with psci related transactions?
Regards,
Yuye.
Hi, Jens,
We seem to have found the answer following the comment you gave in the issue:
https://github.com/OP-TEE/optee_os/issues/5877 <https://github.com/OP-TEE/optee_os/issues/5877 >
The reason seems to be that the system has more non-secure memory than what OP-TEE is aware of.
When our system memory DRAM size is 32GB, optee has the following configuration for non-secure state memory:
#define DRAM1_BASE 0x880000000UL
#define DRAM1_SIZE 0x780000000UL
On another device, our system memory DRAM size is 64G, optee uses the above configuration for non-secure memory.
And xtest will not run stably.
In addition, I would like to ask you a few questions in order to understand the problem better.
1.
System has more non-secure memory than what OP-TEE is aware of, and optee will not run stably.
Why? What is the root cause of TEEC_ERROR_OUT_OF_MEMORY on the issue?
I'll do some research on it myself, meanwhile I hope you could give us some insights.
2.
When I execute the following command,
for i in {1.. 10}; do ./xtest&; done;
xtest prints out of order,
And the following error is reported:
E/TC:032 002 mobj_ffa_get_by_cookie:356 possible spinlock deadlock reminder 1
E/TC:070 mobj_ffa_unregister_by_cookie:310 possible spinlock deadlock reminder 1
E/TC:004 003 mobj_ffa_get_by_cookie:356 possible spinlock deadlock reminder 1
E/TC:058 001 mobj_ffa_get_by_cookie:356 possible spinlock deadlock reminder 1
Then the system hangs and restarts automatically.
Does optee currently support running multiple TAs in parallel?
Is the maximum number of TAs running in parallel equal to the number of vcpus?
3.
The values of MAX_XLAT_TABLES and CFG_CORE_HEAP_SIZE also seem to affect the stability of the system.
What configurations, such as the number of cores, should I pay attention to if I want to configure appropriate values for these options?
Thanks a lot.
Regards,
Yuye.
------------------------------------------------------------------
发件人:Jens Wiklander <jens.wiklander(a)linaro.org>
发送时间:2023年3月20日(星期一) 15:05
收件人:梅建强(禹夜) <meijianqiang.mjq(a)alibaba-inc.com>
抄 送:Olivier Deprez <Olivier.Deprez(a)arm.com>; hafnium <hafnium(a)lists.trustedfirmware.org>; op-tee <op-tee(a)lists.trustedfirmware.org>
主 题:Re: optee xtest cannot run success stably
Hi Yuye,
Comment below.
On Mon, Mar 20, 2023 at 4:43 AM 梅建强(禹夜)
<meijianqiang.mjq(a)alibaba-inc.com> wrote:
> Hi, experts
>
> Recently, we are testing the stability for running optee xtest with environment that hafnium runs as SPMC and optee runs on SPMC as SP.
> When we reboot the system, xtest failed on some cases with TEEC_ERROR_OUT_OF_MEMORY.
> It seems to be that there is an insufficient memory allocation somewhere in the chain.
> We tried the following:
> Using Single core startup,
> Increased optee MAX_XLAT_TABLES size to 16,
> Increased optee CFG_CORE_HEAP_SIZE to 0x2000000,
> Increasing the size of optee CFG_TEE_RAM_VA_SIZE to 0x4000000,
> Increasing the size of hafnium heap_pages to 8192,
> But nothing seems to be working.
> Can you offer any help or suggestions?
It would help if you could pinpoint the source of the out-of-memory
error. I guess it happens somewhere during mobj_ffa_get_by_cookie(),
where especially thread_spmc_populate_mobj_from_rx() is interesting.
It could also be worth setting CFG_CORE_DUMP_OOM=y, it's easy to
enable but I'm afraid it's more of a long shot.
Cheers,
Jens
> Some other configuration for optee is attached in the issue:
> https://github.com/OP-TEE/optee_os/issues/5893 <https://github.com/OP-TEE/optee_os/issues/5893 >
>
> Regards,
> Yuye.
>
Hi, experts
Recently, we are testing the stability for running optee xtest with environment that hafnium runs as SPMC and optee runs on SPMC as SP.
When we reboot the system, xtest failed on some cases with TEEC_ERROR_OUT_OF_MEMORY.
It seems to be that there is an insufficient memory allocation somewhere in the chain.
We tried the following:
Using Single core startup,
Increased optee MAX_XLAT_TABLES size to 16,
Increased optee CFG_CORE_HEAP_SIZE to 0x2000000,
Increasing the size of optee CFG_TEE_RAM_VA_SIZE to 0x4000000,
Increasing the size of hafnium heap_pages to 8192,
But nothing seems to be working.
Can you offer any help or suggestions?
Some other configuration for optee is attached in the issue:
https://github.com/OP-TEE/optee_os/issues/5893 <https://github.com/OP-TEE/optee_os/issues/5893 >
Regards,
Yuye.