Hi Olivier,
IMO, in VHE enabled linux, host apps are used to support an admin interface to the Host OS/Hypervisor. Or in other words, they're the user facing Hypervisor extension to create/run/destroy guests.
[RK] Possibly. As I understand it, if you take Ubuntu with KVM for example, you can run a full-blown desktop host OS, have the KVM hypervisor and run guests on it. All of the desktop services and apps are effectively arbitrary apps, since it is a full-blown desktop environment and not necessarily only an admin interface to manage VMs. VHE allows an unmodified kernel written for EL1 to run in EL2 and act as both an operating system and a hypervisor, since it will have all the EL2 controls for hypervisor functionality.
By removing one stage of translation, is there a threat where a compromised S-EL0 SP could influence another S-EL1 TOS residing in a VM?
[RK] I don't think the extra stage of translation would do anything to enhance security. We are trying to keep S-EL2 implementations from having the same issues as a TEE OS (as Achin mentioned in his other replies) by making sure the SPMC ONLY supports FF-A and no other fancy services or APIs, to reduce attack surface, while ALSO enabling legacy trusted OSes to run unmodified and in parallel. The interface to the SPMC is the same for an S-EL0 SP and an S-EL1 SP, and from that perspective both are equally powerful (or equally not powerful) in their ability to influence other partitions, irrespective of whether they are S-EL1 or S-EL0 partitions.
Right, but is this a S-EL0 partition property to define that it cannot receive interrupts? Physical interrupts can still happen within SPMC while a S-EL0 partition runs, but essentially an interrupt is never injected to a S-EL0 SP?
[RK] Let me know if I have misunderstood your question here. Stating the obvious here (apologies): an S-EL0 partition cannot receive a hardware interrupt directly due to the ARMv8 architecture (since it does not have its own VBAR etc., like any application on a regular OS). Physical interrupts can happen when an S-EL0 partition is running, but they do not get "injected" into the S-EL0 partition the way the SPMC can inject an interrupt into an S-EL1 partition/VM (through HCR_EL2.VI). The "injection" of an interrupt into an S-EL0 SP is really a software injection mechanism, i.e. completing a previous FFA_MSG_WAIT or FFA_DIRECT_RSP with the FFA_INTERRUPT error code. An S-EL0 SP will have to observe its error code to know if it has to dispatch an interrupt handler or a message handler.
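The dispatch described above could look roughly like the sketch below. This is illustrative only and not Hafnium or SP code: the struct ffa_ret type, the sp_action enum and sp_classify() are hypothetical names, and the sketch assumes the SP simply inspects the function ID the SPMC hands back in w0 when a previous FFA_MSG_WAIT (or direct response) completes. The function IDs are the SMC32 values from the FF-A 1.0 spec.

```c
#include <stdint.h>

/* FF-A 1.0 function IDs (SMC32 calling convention). */
#define FFA_INTERRUPT_32            0x84000062u
#define FFA_MSG_SEND_DIRECT_REQ_32  0x8400006Fu

/* Hypothetical: what the S-EL0 SP does once the SPMC resumes it. */
enum sp_action {
    SP_HANDLE_INTERRUPT, /* SPMC signalled an interrupt in software */
    SP_HANDLE_MESSAGE,   /* a direct request message is pending */
    SP_UNEXPECTED        /* anything else: treat as an error */
};

/* Hypothetical container for the value returned in w0 plus two arguments. */
struct ffa_ret {
    uint32_t func; /* w0: FF-A function ID completing the earlier call */
    uint32_t arg1;
    uint32_t arg2;
};

/* Decide which handler to run based on how the earlier call completed. */
enum sp_action sp_classify(struct ffa_ret r)
{
    switch (r.func) {
    case FFA_INTERRUPT_32:
        return SP_HANDLE_INTERRUPT;
    case FFA_MSG_SEND_DIRECT_REQ_32:
        return SP_HANDLE_MESSAGE;
    default:
        return SP_UNEXPECTED;
    }
}
```

An SP main loop would call its wait primitive, pass the result to sp_classify(), and branch to the interrupt handler or the message handler accordingly.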
Thanks Raghu
-----Original Message----- From: Olivier Deprez Olivier.Deprez@arm.com Sent: Monday, December 21, 2020 1:40 AM To: raghu.ncstate@icloud.com; hafnium@lists.trustedfirmware.org; Achin Gupta Achin.Gupta@arm.com Cc: 'Mayur Gudmeti' mgudmeti@nvidia.com Subject: Re: [Hafnium] VHE support
Hi Raghu,
See few comments inline [OD]
Thanks, Olivier.
________________________________________ From: raghu.ncstate@icloud.com raghu.ncstate@icloud.com Sent: 17 December 2020 21:51 To: Olivier Deprez; hafnium@lists.trustedfirmware.org; Achin Gupta Cc: 'Mayur Gudmeti' Subject: RE: [Hafnium] VHE support
Merging Achin's response.
Thanks Achin, Olivier.
Achin,
With VHE, the S-EL1 exception level disappears. The SPMC can only have awareness of S-EL0 SPs.
[RK] Do you mean this from the SPMC architecture perspective? If yes, does this mean that S-EL1 partitions cannot co-exist with S-EL0 partitions when VHE is enabled? Clarification would be helpful.
From the HW/ARM ARM perspective, When HCR_EL2.E2H=1(VHE enabled), S-EL1 "disappears" when HCR_EL2.TGE is set. If TGE is not set, S-EL2 can still ERET to S-EL1/S-EL0.
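As a minimal sketch of the E2H/TGE combinations discussed above (illustrative only, not taken from the patch series): the bit positions are the architectural ones for HCR_EL2 (TGE is bit 27, E2H is bit 34), while the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define HCR_EL2_TGE (UINT64_C(1) << 27) /* Trap General Exceptions */
#define HCR_EL2_E2H (UINT64_C(1) << 34) /* EL2 Host (VHE) */

/* Hypothetical helper: EL0 runs as a "host application" under the
 * EL2&0 translation regime only when both E2H and TGE are set. */
bool el0_is_host_app(uint64_t hcr)
{
    return (hcr & HCR_EL2_E2H) != 0 && (hcr & HCR_EL2_TGE) != 0;
}

/* Hypothetical helper: (S-)EL1 is still a usable ERET target from EL2
 * only while TGE is clear, whether or not E2H is set. */
bool el1_is_reachable(uint64_t hcr)
{
    return (hcr & HCR_EL2_TGE) == 0;
}
```

With E2H=1 and TGE=0, el1_is_reachable() is true, which matches the point that S-EL1 only "disappears" once TGE is also set.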
Olivier,
[OD] Yes, I got this. So in other words, the Hypervisor and the Host OS are conflated in this configuration.
[RK] Correct.
[OD] Right, so my understanding is that with this patch set Hafnium runs the EL2&0 translation regime. Although EL1/EL0 accesses are still made using the EL1&0 translation regime because HCR_EL2.TGE=0.
[RK] Correct.
[OD] Just curious, why restricting to a single vCPU? Is this to map with the existing StMM model? Anyways, there is always the possibility to set the VM's vCPU count to 1 in the SPMC manifest.
[RK] This requirement is from FF-A 1.0 spec, section 2.9, page 33, that says S-EL0 partitions must be UP. Agree, we can enforce this through the manifest and initialize only one "lightweight" vCPU(execution context really, modeled through vCPU structure).
[OD] Ack.
[OD] Right, notice there is a provision in partition manifests to state that a partition is an S-EL0 partition (exception-level field set to S_EL0 in https://trustedfirmware-a.readthedocs.io/en/latest/components/psa-ffa-manifest-binding.html). This property can be used to identify this configuration.
[RK] Noted. Thx.
[OD] What I still miss is how the VM isolation is done. If a S-EL0 partition is a host application, does it mean page tables are shared between the Hypervisor/Host OS and the S-EL0 partition? This looks to provide more privileges to S-EL0 partitions than regular S-EL1 SPs? Or in other words, memory accesses for the applicative part only go through one translation stage (the Stage-1 EL2&0) and not through the 2-stage MMU? Or maybe the model is that there is a unique more privileged S-EL0 SP?
[RK] Part of the confusion may be that I said I'll model it as a VM. It is not really a VM, it is the equivalent of a user process/thread. So yes, page tables will be shared between the Hypervisor/Host OS and the S-EL0 partition. Of course, each S-EL0 partition will have its own set of page tables and ttbr0_el2, and all of them will have the hypervisor memory mapped (note I'm assuming we will run on HW that does not require KPTI for side-channel mitigations). I'm also assuming we will turn on architectural knobs such as PSTATE.PAN and UAO, and that the UXN/PXN bits and AP bits will be set appropriately for the pages so that EL0 can never access EL2 memory and vice versa. So an S-EL0 SP will not be more privileged than an S-EL1 SP. The difference will be that it does not have or require any of the OS-like functionality of an S-EL1 SP and behaves like any trusted application on top of hafnium. Effectively, hafnium is turning into a more minimal FF-A (only) based trusted OS and S-EL0 SPs are FF-A based trusted apps (back to square 1??? 😝)
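To illustrate the AP/UXN/PXN point, here is a minimal sketch of how stage-1 page attributes might be chosen for hypervisor pages versus S-EL0 SP pages that share one set of tables. It is illustrative only and not Hafnium code: the function names are hypothetical. The bit positions are the architectural ones for VMSAv8-64 stage-1 block/page descriptors (AP[1] is bit 6, AP[2] is bit 7, PXN is bit 53, UXN is bit 54).

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_AP_EL0 (UINT64_C(1) << 6)  /* AP[1]: accessible from EL0 */
#define PTE_AP_RO  (UINT64_C(1) << 7)  /* AP[2]: read-only */
#define PTE_PXN    (UINT64_C(1) << 53) /* privileged execute-never */
#define PTE_UXN    (UINT64_C(1) << 54) /* unprivileged execute-never */

/* Hypothetical: attributes for hypervisor (EL2) pages. AP[1] stays
 * clear so EL0 cannot read or write them, and UXN is always set so
 * EL0 cannot execute them. */
uint64_t hyp_page_attrs(bool writable, bool executable)
{
    uint64_t attrs = PTE_UXN;

    if (!writable)
        attrs |= PTE_AP_RO;
    if (!executable)
        attrs |= PTE_PXN;
    return attrs;
}

/* Hypothetical: attributes for S-EL0 SP pages. AP[1] is set so EL0 can
 * access them, and PXN is always set so EL2 never executes SP memory.
 * With PSTATE.PAN set, EL2 data accesses to these pages fault too. */
uint64_t sp_page_attrs(bool writable, bool executable)
{
    uint64_t attrs = PTE_AP_EL0 | PTE_PXN;

    if (!writable)
        attrs |= PTE_AP_RO;
    if (!executable)
        attrs |= PTE_UXN;
    return attrs;
}
```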
[OD] Right I see. I still need to make up my mind in terms of threat modelling and "defense-in-depth" (just thinking out loud...):
- It seems the proposed model deviates a bit from how VHE was originally thought of in the normal world. IMO, in VHE enabled linux, host apps are used to support an admin interface to the Host OS/Hypervisor. Or in other words, they're the user facing Hypervisor extension to create/run/destroy guests. Now, with the proposal in the secure world, we intend to run arbitrary services in place of the Host apps (rather than SPMC bound services). Depending on the S-EL0 partition origins and level of audit, can this be a threat to compromise the SPMC? (In parallel to pre-Armv8.4, where an arbitrary TA attempts to compromise a TEE.) Maybe this is acceptable, as it was for a S-EL0 SP vs EL3 SPM-MM.
- Similarly, one intent of isolating partitions through Stage-2 translations is to defeat the pre-Armv8.4 case where a compromised S-EL1 TEE could compromise another TEE running cooperatively. By removing one stage of translation, is there a threat where a compromised S-EL0 SP could influence another S-EL1 TOS residing in a VM?
[OD]IIUC if we take linux, the host OS executes with E2H/TGE=11 (both set early at boot time). The KVM infrastructure provides the interface to create/run a VM which will then execute with TGE=0. So to make the parallel with secure Hafnium should we have TGE=1 from an early start, and toggle it to zero when switching to a S-EL1 SP?
[RK] Yep. Will add that on my immediate todo and submit as patch if the current patch series goes through.
[OD] With TGE=1 exceptions and interrupts would still trap at (S)EL2 There is still the possibility a secure interrupt is injected to an S-EL1 SP, or that a managed exit is required. Btw if I recall the model for StMM is run-to-completion without interrupt?
[RK] With TGE=1, there will be no injection of interrupts, since TGE=1 will apply only to S-EL0 SPs, which will not have access to VBARs, SCTLRs etc. Interrupt injection will remain the same for S-EL1 SPs, where TGE=0. As for StMM, my understanding is the same as yours: it is run-to-completion w.r.t. normal world interrupts (secure interrupts may still pre-empt it, but that's a separate discussion 😊).
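One way to picture the TGE handling discussed above is a helper that computes the HCR_EL2 value to program before resuming a partition. This is a sketch under stated assumptions, not what the patch series does: hcr_for_partition() and its parameters are hypothetical, and it assumes E2H stays set for every partition while TGE is set only for S-EL0 SPs.

```c
#include <stdbool.h>
#include <stdint.h>

#define HCR_EL2_TGE (UINT64_C(1) << 27)
#define HCR_EL2_E2H (UINT64_C(1) << 34)

/* Hypothetical: derive the HCR_EL2 value for the partition about to
 * run. S-EL0 SPs get TGE=1 (host application, no virtual interrupt
 * injection); S-EL1 SPs/VMs get TGE=0 so injection works as before. */
uint64_t hcr_for_partition(uint64_t base, bool is_el0_partition)
{
    uint64_t hcr = base | HCR_EL2_E2H;

    if (is_el0_partition)
        hcr |= HCR_EL2_TGE;
    else
        hcr &= ~HCR_EL2_TGE;
    return hcr;
}
```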
[OD] Right, but is this a S-EL0 partition property to define that it cannot receive interrupts? Physical interrupts can still happen within SPMC while a S-EL0 partition runs, but essentially an interrupt is never injected to a S-EL0 SP?
Thanks Raghu
-----Original Message----- From: Achin Gupta achin.gupta@arm.com Sent: Thursday, December 17, 2020 8:43 AM To: raghu.ncstate@icloud.com Cc: 'Olivier Deprez' Olivier.Deprez@arm.com; hafnium@lists.trustedfirmware.org; Mayur Gudmeti mgudmeti@nvidia.com Subject: Re: [Hafnium] VHE support
[snip]
Hi Achin,
Question for you. I interpreted the FF-A 1.0 spec as "requiring" VHE for S-EL0 partitions, based on options 1 and 2 in section 2.2.1, since option 2 does not mention S-EL0 partitions. Can you confirm this is the case? As I was thinking through this, it seems like we might be able to pull off EL0 partitions without using VHE, i.e by using HCR_EL2.E2H = 0 and HCR_EL2.TGE=1. HCR_EL2.TGE is present even on ARMv8.0 architecture and removes dependence on VHE, and effectively removes the necessity for this patch series. However, I'd like to understand your thoughts on why VHE is suggested in the spec and if hafnium should use VHE for S-EL0 partitions or if hafnium can use the approach suggested above.
The intent in the spec is not to mandate VHE for S-EL0 SPs. We can add a clarification if that helps.
The intent in the spec is to highlight that VHE is applicable to S-EL0 SPs only.
With VHE, the S-EL1 exception level disappears. The SPMC can only have awareness of S-EL0 SPs.
Without VHE, the S-EL1 exception level is present. As you state above, there are knobs in the architecture that reduce the role of S-EL1. But based upon my current understanding, the SPMC will see S-EL1 as the next lower exception level.
A S-EL0 SP can be "wrapped" as a S-EL1 SP such that the role of S-EL1 is reduced. This could be done in SW (e.g. a shim layer in S-EL1), HW (e.g. the TGE bit above) or both. But architecturally, the SPMC will still see S-EL1.
I hope this clarifies. That said, I do need to double check the original intent and impact of TGE. So please do correct me if I have misinterpreted anything.
cheers, Achin
-----Original Message----- From: Olivier Deprez Olivier.Deprez@arm.com Sent: Thursday, December 17, 2020 3:24 AM To: raghu.ncstate@icloud.com; hafnium@lists.trustedfirmware.org; Achin Gupta Achin.Gupta@arm.com Cc: Mayur Gudmeti mgudmeti@nvidia.com Subject: Re: [Hafnium] VHE support
Hi Raghu,
Comments inline [OD]
Regards, Olivier.
________________________________________ From: raghu.ncstate@icloud.com raghu.ncstate@icloud.com Sent: 15 December 2020 18:41 To: Olivier Deprez; hafnium@lists.trustedfirmware.org; Achin Gupta Cc: Mayur Gudmeti Subject: RE: [Hafnium] VHE support
Hi Olivier,
Sorry if my questions below sound obvious, I may miss bits of the VHE architecture.
[RK] Thanks. Please feel free to push me on these questions. This was done more or less in isolation so I'm happy to be corrected and my understanding checked. Please feel free to suggest alternative paths to take as well.
[OD] Sure.
I guess it requires an additional change in project/reference to eventually enable the feature for a platform (enable_vhe=1 in the gn build flow)?
[RK] Correct. I had this change locally and did not push it since the default would be to have enable_vhe=0
The changes are effectively toggling HCR_EL2.E2H=1. Currently when Hafnium ERETs resuming a SP, this happens with the secure EL1&0 translation regime. Though what's the next step forward? Does this require a host OS in secure world? (like linux does when booting under a Hypervisor with VHE enabled?) Would this be a VHE-enabled TOS?
[RK] Hafnium effectively becomes the host OS when we enable E2H, so we don't need anything special. Sorry if this is obvious or you already know: when linux boots with VHE enabled, it runs code written to work on EL1 in EL2 as the host OS, and EL1 register accesses redirect to the equivalent EL2 registers. Hafnium is effectively doing this, except hafnium's code is already written to access the EL2 system registers, so I think of hafnium as the host OS. Let me know if that does not make sense, we can discuss further. The next steps are answered below.
[OD] Yes, I got this. So in other words, the Hypervisor and the Host OS are conflated in this configuration.
Is the later goal to enable the secure EL2&0 translation regime? 4/ Is there anything to do with HCR_EL2.TGE?
[RK] The EL2&0 translation regime is enabled as soon as E2H is set, so the current patches already enable it. Note that I'm only using TTBR0_EL2 and not TTBR1_EL2, since there is no need for it as of today. I don't expect there to be a use for it in the near future either, given that we want to use identity mapping everywhere.
[OD] Right, so my understanding is that with this patch set Hafnium runs the EL2&0 translation regime. Although EL1/EL0 accesses are still made using the EL1&0 translation regime because HCR_EL2.TGE=0.
As for future steps, this is what I'm thinking (note that this needs more thought and exploration, and these are not foolproof thoughts): 1) Model S-EL0/EL0 partitions as "lightweight" VMs. By lightweight, what I mean is that we represent EL0 partitions using the existing VM and VCPU structures, except these VMs will be forced to have a maximum of one VCPU,
[OD] Just curious, why restricting to a single vCPU? Is this to map with the existing StMM model? Anyways, there is always the possibility to set the VM's vCPU count to 1 in the SPMC manifest.
and the context switch for these partitions would be "lightweight", i.e. only GPRs and ttbr0_el2. I still have to explore some of the other settings/system registers in EL2 that may have to be set up appropriately, so that part is open. The advantage of doing so is that we can reuse all the loading infrastructure for VMs on EL0 partitions too.
[OD] Right, notice there is a provision in partition manifests to state that a partition is an S-EL0 partition (exception-level field set to S_EL0 in https://trustedfirmware-a.readthedocs.io/en/latest/components/psa-ffa-manife...) This property can be used to identify this configuration.
2) As for TGE, HCR_EL2.TGE will be set only for EL0 partitions so that these partitions become the equivalent of "host" applications. This bit will not be set for S-EL1 partitions/VMs. Effectively, when E2H is set, this bit differentiates between a VM and a host application.
[OD] What I still miss is how the VM isolation is done. If a S-EL0 partition is a host application, does it mean page tables are shared between the Hypervisor/Host OS and the S-EL0 partition? This looks to provide more privileges to S-EL0 partitions than regular S-EL1 SPs? Or in other words, memory accesses for the applicative part only go through one translation stage (the Stage-1 EL2&0) and not through the 2-stage MMU? Or maybe the model is that there is a unique more privileged S-EL0 SP?
IIUC if we take linux, the host OS executes with E2H/TGE=11 (both set early at boot time). The KVM infrastructure provides the interface to create/run a VM which will then execute with TGE=0. So to make the parallel with secure Hafnium should we have TGE=1 from an early start, and toggle it to zero when switching to a S-EL1 SP?
3) There needs to be more thought around how hafnium will handle EL0 partitions vs VMs w.r.t. interrupts, since there is no injection of virtual interrupts in this case. Exception handling and TLB maintenance are other areas I need to explore.
[OD] With TGE=1 exceptions and interrupts would still trap at (S)EL2 There is still the possibility a secure interrupt is injected to an S-EL1 SP, or that a managed exit is required. Btw if I recall the model for StMM is run-to-completion without interrupt?
4) Anything else that comes up?? I plan to start prototyping this and see where it takes me but at this point it seems achievable without having to break hafnium entirely.
Hi Achin,
Question for you. I interpreted the FF-A 1.0 spec as "requiring" VHE for S-EL0 partitions, based on options 1 and 2 in section 2.2.1, since option 2 does not mention S-EL0 partitions. Can you confirm this is the case? As I was thinking through this, it seems like we might be able to pull off EL0 partitions without using VHE, i.e by using HCR_EL2.E2H = 0 and HCR_EL2.TGE=1. HCR_EL2.TGE is present even on ARMv8.0 architecture and removes dependence on VHE, and effectively removes the necessity for this patch series. However, I'd like to understand your thoughts on why VHE is suggested in the spec and if hafnium should use VHE for S-EL0 partitions or if hafnium can use the approach suggested above.
Thanks Raghu
-----Original Message----- From: Olivier Deprez Olivier.Deprez@arm.com Sent: Tuesday, December 15, 2020 12:18 AM To: hafnium@lists.trustedfirmware.org; raghu.ncstate@icloud.com; Olivier Deprez Olivier.Deprez@arm.com Subject: Re: [Hafnium] VHE support
Hi Raghu,
one more
5/ maybe answer to 2/3/4 is that it requires an EL1-shim embedded into Hafnium which itself ERETs to a S-EL0 partition?
BTW notice my questions are obviously oriented towards the secure implementation.
Regards, Olivier.
________________________________________ From: Hafnium hafnium-bounces@lists.trustedfirmware.org on behalf of Olivier Deprez via Hafnium hafnium@lists.trustedfirmware.org Sent: 15 December 2020 09:10 To: hafnium@lists.trustedfirmware.org; raghu.ncstate@icloud.com Subject: Re: [Hafnium] VHE support
Hi Raghu,
Thanks for sharing this work.
Few thoughts...
1/ I guess it requires an additional change in project/reference to eventually enable the feature for a platform (enable_vhe=1 in the gn build flow)?
Sorry if my questions below sound obvious, I may miss bits of the VHE architecture.
2/ The changes are effectively toggling HCR_EL2.E2H=1. Currently when Hafnium ERETs resuming a SP, this happens with the secure EL1&0 translation regime. Though what's the next step forward? Does this require a host OS in secure world? (like linux does when booting under a Hypervisor with VHE enabled?) Would this be a VHE-enabled TOS?
3/ Is the later goal to enable the secure EL2&0 translation regime?
4/ Is there anything to do with HCR_EL2.TGE?
Regards, Olivier.
________________________________________ From: Hafnium hafnium-bounces@lists.trustedfirmware.org on behalf of Raghu Krishnamurthy via Hafnium hafnium@lists.trustedfirmware.org Sent: 15 December 2020 04:57 To: hafnium@lists.trustedfirmware.org Subject: [Hafnium] VHE support
Hi All,
I have a series of patches pushed to Gerrit at https://review.trustedfirmware.org/c/hafnium/hafnium/+/7599 with topic "vhe_enable". The goal of this patch series is to enable VM's in both secure and normal world to run with VHE enabled(hcr_el2.e2h=1), without breaking any existing functionality. This is expected to be the first step in the long term goal of enabling S-EL0 partitions(and optionally EL0 partitions), that require VHE support, per the FF-A 1.0 Spec. I'd appreciate feedback on the patches and approach taken to nominally enabling VHE. Note that the FF-A 1.0 spec(AFAIK) does not expect VHE support in the normal world but this patch series enables it anyway due to the wealth of available tests in the hafnium test suite to help with providing confidence in the implementation.
The patch series has been tested as follows:
Hafnium tests using QEMU(prebuilt in the hafnium repo) - Without VHE, since the prebuilt QEMU does not support VHE.
Hafnium tests using QEMU(5.2-RC4, built from source) - With and without VHE, this version of QEMU supports VHE.
Hafnium tests using FVP 11.12.28 - With and Without VHE.
TFTF tests for secure hafnium using FVP 11.12.28 - With and without VHE.
Thanks
Raghu
-- Hafnium mailing list Hafnium@lists.trustedfirmware.org https://lists.trustedfirmware.org/mailman/listinfo/hafnium