Hi,
I'm testing with Hafnium as SPMC at S-EL2 and OP-TEE as an SP at S-EL1 on
QEMU v7.0.0. I've run into a few problems and fixed most of them.
I believe the setup is similar to what Shiju is using in this mail thread
https://lists.trustedfirmware.org/archives/list/hafnium@lists.trustedfirmwa…
My setup can be duplicated with:
repo init -u https://github.com/jenswi-linaro/manifest.git -m qemu_v8.xml \
-b qemu_sel2
repo sync -j8
(cd hafnium && git submodule init && git submodule update)
cd build
make -j8 toolchains
make -j8 all
make run-only
With this xtest -x 1034 passes, xtest 1034 often causes
ERROR: Data abort: pc=0xe1198b8, esr=0x96000006, ec=0x25, far=0x9c
Panic: EL2 exception
Xtest runs dreadfully slow, I haven't investigated why yet, but at
least it works.
This is based on patches provided by Olivier at:
[1] https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/16412/2
[2] https://review.trustedfirmware.org/c/hafnium/hafnium/+/16323/7
I've also encountered the problem cache maintenance problem Shiju
described in the mail thread above:
NOTICE: Trapped access to system register write: op0=1, op1=0, crn=7,
crm=14, op2=2, rt=11.
It can be worked around by compiling OP-TEE with
CFG_CORE_WORKAROUND_NSITR_CACHE_PRIME=n. I'm pretty sure we do dcache
clean+inv elsewhere so I'm surprised it fails here. Is Hafnium expected
to block dcache clean+inv?
For Hafnium I've added two patches on top of [2], available at
https://github.com/jenswi-linaro/hafnium/tree/qemu_sel2:
- 79b4d2cbe06e SPMC: add missing ME initialization for secondary cores
- 659c79d5eacf feat(mm): fix FEAT_LPA workaround
For TF-A I've added a few patches on top of [1], available at
https://github.com/jenswi-linaro/arm-trusted-firmware/tree/qemu_sel2:
- a040396cae9e feat(qemu): add tos-fw-config for sel2 spmc
- 4f7d91723485 fix(qemu): change TOS_FW_CONFIG_NAME value
- fbfc9a222c7f spmd_smc_handler() add s/ns state to SMC traces
- ca65081b9cdc feat(sptool): add dependency to SP image
- b1e1b46a0680 fix(qemu): restore code to added needed psci nodes
For OP-TEE I've also added a few patches, available at
https://github.com/jenswi-linaro/optee_os/tree/qemu_sel2:
- 1057def23777 plat-vexpress: sel2 spmc: update for hafnium
- f18a54ed3524 core: ffa: use hvc instead of smc with S-EL2
- d18bbc92f7c1 core: mobj_ffa_add_pages_at() trust addresses from SPMC
There's also one patch for QEMU on top of v7.0.0, available at:
https://github.com/jenswi-linaro/qemu/tree/qemu_sel2
- 0c1e39672dcb Read PS bits from VTCR_EL2
The QEMU problem is fixed in v.7.1.0, but I can't get that version of
QEMU to work with TF-A. I guess it's because of yet another new CPU
feature since I'm running with "-cpu max".
I'll try to upstream the Hafnium and TF-A patches that make sense on
their own.
What's the plan with the interrupt controller?
How will OP-TEE be able to handle secure interrupts?
The hafnium git pulls in a few git submodules and even the source code
for a Linux kernel.
I guess this is useful in your internal CI setup, but when used
isolated as in my setup it makes no sense at all.
It would also be nice to be able to build with an external toolchain.
I hope this is a temporary situation, I don't see why Hafnium should
be pickier about toolchain than for instance TF-A.
Speaking of building, I haven't been able to figure out how to build
only for the QEMU variant I need so right now I'm building for
everything and that takes a bit longer than necessary.
I'm going to maintain the setup above as long as it's relevant to me. I may
add more patches on the branches or even rebase as needed. So if anyone is
using this, keep in mind that my branches may change without warning.
Thanks,
Jens
Hi,
It is confirmed that the problem is related to the pmu register configuration and that pmccntr_el0 can be read at any exception level.
Regards,
Yuye.
------------------------------------------------------------------
发件人:梅建强(禹夜) <meijianqiang.mjq(a)alibaba-inc.com>
发送时间:2023年5月26日(星期五) 16:59
收件人:Jens Wiklander <jens.wiklander(a)linaro.org>; Olivier Deprez <olivier.deprez(a)arm.com>
抄 送:op-tee <op-tee(a)lists.trustedfirmware.org>; hafnium <hafnium(a)lists.trustedfirmware.org>; 黄明(连一) <hm281385(a)alibaba-inc.com>
主 题:optee_benchmark pmccfiltr_el0
Hi, Jens, Olivier,
In case of that optee runs at sel1 and hafnium runs at sel2, we want to test benchmark by executing the following command at optee_benchmark path:
./out/benchmark ../optee_examples/out/ca/optee_example_hello_world
After entering into the benchmark pta, the bm_timestamp function attempts to read the pmccfiltr_el0 register.
In cold boot, the following code will be executed during hafnium initialization:
vm->arch.trapped_features |= HF_FEATURE_PERFMON;
This will prevent the secondary vm from accessing the performance counter registers.
We remove the code, the bm_timestamp function can read pmccfiltr_el0 without trapping into hafnium.
But the value of pmccfiltr_el0 remains unchanged and cannot be counted.
We tried to read the register in hafnium and found that there was no change either.
In contrast, in the normal world, pmccfiltr_el0 counts normally.
Is it related to the pmu register configuration or does sel1 not support the pmccfiltr_el0 count at present?
Thanks for the support.
Regards,
Yuye.
Hi, Jens, Olivier,
In case of that optee runs at sel1 and hafnium runs at sel2, we want to test benchmark by executing the following command at optee_benchmark path:
./out/benchmark ../optee_examples/out/ca/optee_example_hello_world
After entering into the benchmark pta, the bm_timestamp function attempts to read the pmccfiltr_el0 register.
In cold boot, the following code will be executed during hafnium initialization:
vm->arch.trapped_features |= HF_FEATURE_PERFMON;
This will prevent the secondary vm from accessing the performance counter registers.
We remove the code, the bm_timestamp function can read pmccfiltr_el0 without trapping into hafnium.
But the value of pmccfiltr_el0 remains unchanged and cannot be counted.
We tried to read the register in hafnium and found that there was no change either.
In contrast, in the normal world, pmccfiltr_el0 counts normally.
Is it related to the pmu register configuration or does sel1 not support the pmccfiltr_el0 count at present?
Thanks for the support.
Regards,
Yuye.
Hi Jens,
I have a couple of Hafnium changes implementing the use of VSTTBR_EL2/VTTBR_EL2 to split an SP IPA into secure and non-secure IPA spaces.
They're very much in experimental stage so difficult to share just now (I will do some time later in February).
However I'd like to report some possible issue observed with qemu.
Essentially, when normal world driver inits, it performs a first share operation for a single NS page:
INFO: 1> 1 0 FFA_MEM_SHARE_32(84000073) 50 50 0 0 0 0 0
[...]
VERBOSE: Marked sending complete.
Current share states:
SHARE 0x0 (from VM 0x0, attributes 0x6f (NS), flags 0x8, tag 0, to 1 recipients [VM 0x8001: 0x6 (offset 48)]): fully sent with 1 fragments, 0 retrieved, sender's original mode: 0x87
INFO: 1< 1 0 FFA_SUCCESS_32(84000061) 0 0 0 0 0 0 0
INFO: 1> 1 0 FFA_MSG_SEND_DIRECT_REQ_32(8400006f) 8001 0 80000000 0 0 0 0
E/TC:1 0 mobj_ffa_get_by_cookie:387 Populating mobj from rx buffer, cookie 0
Retrieve operation happens:
E/TC:1 0 spmc_retrieve_req:1415 spmc_retrieve_req enter.
INFO: 1> 1 8001 FFA_MEM_RETRIEVE_REQ_32(84000074) 30 30 0 0 0 0 0
Current share states:
SHARE 0x0 (from VM 0x0, attributes 0x6f (NS), flags 0x8, tag 0, to 1 recipients [VM 0x8001: 0x6 (offset 48)]): fully sent with 1 fragments, 0 retrieved, sender's original mode: 0x87
Current share states:
SHARE 0x0 (from VM 0x0, attributes 0x6f (NS), flags 0x8, tag 0, to 1 recipients [VM 0x8001: 0x6 (offset 48)]): fully sent with 1 fragments, 1 retrieved, sender's original mode: 0x87
INFO: 1< 1 8001 Unknown(84000075) 50 50 0 0 0 0 0
Hafnium maps the NS page into OP-TEE's S2 page tables rooted to by VTTBR_EL2
0: e178003 S
1: e179003 S
f: e17a003 S
186: 240000041f867ff NS
(similar dump from VSTTBR_EL2 show OP-TEE secure pages properly mapped)
OP-TEE then maps the page in its S1 PTs as NS:
E/TC:1 0 spmc_retrieve_req:1428 spmc_retrieve_req exit.
E/TC:1 0 thread_spmc_populate_mobj_from_rx:1506 thread_spmc_populate_mobj_from_rx exit.
E/TC:1 0 set_pages:1461 set_pages 0 addr 41f86000 count 1
E/TC:1 0 mobj_ffa_add_pages_at:220 mobj_ffa_add_pages_at is_ns 0
INFO: 1> 1 8001 FFA_RX_RELEASE_32(84000065) 0 0 0 0 0 0 0
INFO: 1< 1 8001 FFA_SUCCESS_32(84000061) 0 0 0 0 0 0 0
E/TC:1 0 ffa_inc_map:566 ffa_inc_map addr fa00000 pages 0x90000000e3eadd0 sz 4096
D/TC:1 0 core_mmu_xlat_table_alloc:526 xlat tables used 4 / 5
A page fault is hit when OP-TEE accesses the page from its VA:
WARNING: Stage-2 page fault: pc=0xe30b764, vmid=0x8001, vcpu=1, vaddr=0xfa0001c, ipaddr=0x41f8601c, mode=0x1 0x40000000000007c
This issue is not observed with the TC2 FVP and similar Hafnium+OP-TEE SW stack, at the same point of initialization.
So it seems qemu is not doing the translation properly from VTTBR_EL2 for a page mapped NS by OP-TEE (hence NS IPA space).
Who should I report this problem to?
Regards,
Olivier.
Hi All,
We are pleased to announce the formal release of Trusted Firmware-A version 2.9 bundle of project deliverables.
This includes Trusted Firmware-A, Trusted Firmware-A Tests, Hafnium and TF-A OpenCI Scripts/Jobs 2.9 releases involving the tagging of multiple repositories. Aligned but not yet part of the release is Trusted Firmware-A Realm Management Monitor v0.3.0.
These went live on 23rd May 2023.
I would like to thank all of the contributors for their excellent work and achievements since the last release.
Thanks Joanna
Notable Features of the Version 2.9 Release are as follows:
TF-A/EL3 Root World
* New Features:
* Support for PSCI OS initiated mode
* Architecture feature support for FEAT_TCR2, FEAT_GCS, FEAT_HCX, FEAT_SME2, FEAT_PIE/POR, FEAT_MPAM.
* System registers access trap handler
* Introduction to dynamic detection of features
* Refactoring:
* Context management
* RAS extension exception handling and crash reporting.
* Distinguish between BL2 as TF-A entry point or BL2 running at EL3 exception level.
* General Support
* CPU Support for Chaberton and Blackhawk for TC2023
* Eighteen (18) Errata Mitigations for Cortex X2/X3/A710/A510/A78/A78C and Neoverse N2/V1 family CPU’S
* Errata Management Firmware Interface implementation supported for version 1.0 of the public specification
TF-A Boot BL1/BL2
* New Feature/Support
* Support for Trusted Boot rooted into RSS RoT on TC2022 platform.
* Support for PSA attestation scheme with Measured Boot rooted into RSS on TC2022 platform
* Migration to mbedTLS 3.x as the default cryptography library retaining backwards compatibility with mbedTLS 2.x
* Improvements and hardening of Arm CCA boot and attestation support.
* Hardening efforts in the X.509 certificate parser, including a security fix (TFV-10 CVE-2022-47630)
Hafnium/SEL2 SPM
* FF-A v1.2 ALP0 Specification Early Adoption Support
* Implemented ppartition info get ABI using GP registers.
* Group0 secure interrupt handling delegation.
* Improved console log ABI.
* FF-A v1.1 REL0 Specification Support
* Interrupt handling (S-EL0 partition signalling, added action to Other-S-Int, allow a physical interrupt to be routed to a specified PE).
* Memory sharing (structures updates supporting FF-A backwards compatibility, share/lend/donate memory to multiple borrowers, normal/secure fragmented memory sharing).
* Power management (events relayed to the SPMC and removed limitations).
* Indirect messaging (buffer synchronization and ownership transfer rules).
* General Support
* SPMC manifest to declare non-secure and secure system memory address ranges.
* Hardened SP manifest memory regions boot time validation.
* CI migration to LLVM/clang 15.0.6
* Removal of non-VHE build and test configurations.
* Added EL3 SPMC test configurations using the Hafnium's CI infrastructure.
TF-A Tests
* New Test Support
* Errata Management Firmware Interface testing
* FF-A v1.1 feature testing
* Realm Management Extension feature testing
* New Architecture Specific feature testing related to v8.8
* 1 new platform port (RD-N2-Cfg3)
TF-RMM/REL2
* New Feature/Support
* Added support to create Realms which can make use of SVE, if present in hardware.
* Refactor and improved the Stage 1 translation table library lib/xlat API to better fit RMM usage.
* Add PMU support for Realms as described by RMM v1.0 Beta0 specification.
* Support getting DRAM info from the Boot manifest dynamically at runtime.
* RMM can now support the 2nd DDR bank on FVP
* Define a unit test framework using CppUTest for RMM.
* Added unit tests for granule, slot-buffer and Stage 1 translation table lib xlat.
* Improvements to fake-host and unit tests framework.
* Build improvements in RMM
Platform Support
* 1 new platform added, the Allwinner T507 SoC
* 26 platforms updated from 14 providers
* 17 different driver updates
OpenCI
* First release done solely relying on Trustedfirmware.org OpenCI
Patch Statistics Across all Repositories
* 1403 Patches merged since v2.8 November 2022 release
Please refer to the TF-A [1], Hafnium [2] and TF-A Tests [3] changelogs for the complete summary of changes from the previous release.
TF-A [4], TF-A Test [5], Hafnium [6], TF-A OpenCI Scripts [7] and TF-A OpenCI Jobs [8] repositories are available along with the compatible TF-RMM repository [9] and changelog [10].
[1] https://trustedfirmware-a.readthedocs.io/en/v2.9/change-log.html#id1
[2] https://review.trustedfirmware.org/plugins/gitiles/hafnium/hafnium/+/HEAD/d…
[3] https://trustedfirmware-a-tests.readthedocs.io/en/v2.9/change-log.html#vers…
[4] https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tag/?h=v2.9
[5] https://git.trustedfirmware.org/TF-A/tf-a-tests.git/tag/?h=v2.9
[6] https://git.trustedfirmware.org/hafnium/hafnium.git/tag/?h=v2.9
[7] https://git.trustedfirmware.org/ci/tf-a-ci-scripts.git/tag/?h=v2.9
[8] https://git.trustedfirmware.org/ci/tf-a-job-configs.git/tag/?h=v2.9
[9] https://git.trustedfirmware.org/TF-RMM/tf-rmm.git/tag/?h=tf-rmm-v0.3.0
[10] https://tf-rmm.readthedocs.io/en/tf-rmm-v0.3.0/about/change-log.html#v0-3-0
Hi, Olivier,
I have a few questions for your reply.
> [OD] Unfortunately Hafnium is not designed for this. I can see few places where things can go wrong, but firstly as you highlighted in another thread, the init should take place on the same core that was used for initializing (aka the primary core). As I understand, the optee load smc can happen on any core as it's launched from a nwd/linux thread running on any core. This is not only a hafnium limitation, I guess OP-TEE would have the same problem further on, as it also must go through the primary core init on physical boot/primary core. Then it would require the normal world to coordinate the secondary core init. for all cores, the same way as it happens in the cold boot phase by duplicating TF-A's PSCI boot flows.
I have done a restrained test for optee hot update. By setting grub.cfg, only one physical core is enabled, in which case hafnium and optee could be initialized normally. But after spmd returning to the driver, the optee driver hang in optee_ffa_probe and cannot exit normally. Maybe hafnium changes the state of some registers during initialization. And these registers may be shared with the el2 host os kernel. What issues do you think I may have overlooked?
1. Primary core and secondary core should be concepts during boot phase. At runtime, primary core and secondary core should not be distinguished. Does this mean that I can initialize the core of nwd/linux running thread with hafnium as the primary core? This may require hafnium or Atf to do some initialization for the core.
2. Can a hafnium implement an optee hot update without fully initializing itself but with a few initializing actions for optee, such as manifest_init?
3. For all cores, the same way as it happens in the cold boot phase by duplicating TF-A's PSCI boot flows. Agree with you.
> [OD] No this isn't supported and not in our roadmap. As things stand with upstream components, it would be preferable going with a regular FW update flow (but requires a machine reboot).
1. When you say upstream_components, do you mean bl2u_image, ns_bl2u_image things like that or something else in other spec?
Although this is an alternative update solution, machine reboot does not have the significance and advantages that belong to the hot update.
Thanks for your support.
Regards,
Yuye.
------------------------------------------------------------------
发件人:Olivier Deprez <Olivier.Deprez(a)arm.com>
发送时间:2023年5月9日(星期二) 23:54
收件人:梅建强(禹夜) <meijianqiang.mjq(a)alibaba-inc.com>
抄 送:hafnium <hafnium(a)lists.trustedfirmware.org>
主 题:Re: optee hot update
Hi Yuye,
See few comments inline [OD].
Regards,
Olivier.
From: 梅建强(禹夜) <meijianqiang.mjq(a)alibaba-inc.com>
Sent: 06 May 2023 09:34
To: Olivier Deprez <Olivier.Deprez(a)arm.com>
Cc: hafnium <hafnium(a)lists.trustedfirmware.org>
Subject: optee hot update
Hi, Olivier,
As I can see in the Hafnium code,
Hafnium performs a number of one-time functions at boot phase,
like one_time_init_mm and one_time_init.
I want to initialize Hafnium again from image_entry and then initialize optee after optee is updated.
I wonder if there are any problems with this flow for hafnium at runtime phase.
[OD] Unfortunately Hafnium is not designed for this. I can see few places where things can go wrong, but firstly as you highlighted in another thread, the init should take place on the same core that was used for initializing (aka the primary core). As I understand, the optee load smc can happen on any core as it's launched from a nwd/linux thread running on any core. This is not only a hafnium limitation, I guess OP-TEE would have the same problem further on, as it also must go through the primary core init on physical boot/primary core. Then it would require the normal world to coordinate the secondary core init. for all cores, the same way as it happens in the cold boot phase by duplicating TF-A's PSCI boot flows.
Does Hafnium provide any interface to boot optee directly at runtime phase,
and if not, how do you think it makes sense to modify the above flow?
[OD] No this isn't supported and not in our roadmap. As things stand with upstream components, it would be preferable going with a regular FW update flow (but requires a machine reboot).
Does anyone else know how to fix it?
Thanks for your support.
Regards,
Yuye.
Hi, Olivier,
As I can see in the Hafnium code,
Hafnium performs a number of one-time functions at boot phase,
like one_time_init_mm and one_time_init.
I want to initialize Hafnium again from image_entry and then initialize optee after optee is updated.
I wonder if there are any problems with this flow for hafnium at runtime phase.
Does Hafnium provide any interface to boot optee directly at runtime phase,
and if not, how do you think it makes sense to modify the above flow?
Does anyone else know how to fix it?
Thanks for your support.
Regards,
Yuye.