Hi Soby,

Agree with the fix overall.

1) However, is an ISB sufficient? ISB does not prevent speculative execution of any instruction, before the ISB is complete, it only causes a refetch after the completion of ISB, IIUC. So in an implementation, an AT instruction could speculatively execute before the ISB(for whatever micro-architectural reason, and this may not apply to the cores under consideration), fill the TLB with garbage (the damage is done), the ISB is executed and the speculative instructions are flushed and refetch occurs.
The only barrier that guarantees this wont happen is a DSB, which is why i was suggesting it.

2) For (b) in the fix, would you not require a barrier after the restore? If you don't have it, you could have a speculative AT between the window of time where the SCTLR and TCR registers are restored and ERET.

3) "So prior to use of AT, ensure the PTW are re-enabled and disabled back again after the AT instructions" -
I assume this cannot be done on any entry to EL3 except through SMC's. Otherwise we would have the same issue of
not being sure that the EL1 registers are in a state that would not cause this issue.

Thanks
Raghu

On 7/2/20 7:15 AM, Soby Mathew wrote:
Thanks Andrew for raising the issue and Marc and Raghu for your inputs.
The effect of stage 2 when Stage 1 is disabled is something that was overlooked in the workaround implementation. After some discussions this is what we currently understand regarding the problem.

1. EL3 does not modify the EL2 stage 1 translations and hence any speculative execution of AT execution in EL3 and resulting PTW and TLB caching should not be a problem as we always return to the same context in EL2. TF-A in EL3 need not worry about this. Also, we expect that future CPUs having v8.4 extensions for S-EL2 will have already corrected this errata.

2. Whenever the EL1/0 Stage 1 translations are switched in EL3 as part of switching the worlds, it should take Stage 2 into account. Hence disabling the Stage 1 MMU is not effective, as Marc pointed out, because speculative PTW can still take place using Stage 2 identity mapping. After discussion with James, we also realized that flipping the SCR_EL3.NS bit could take Stage 2 out of context (Stage 2 is not valid in secure world)  leaving Stage 1 pointing to invalid translations. 

As discussed in previous emails, on Entry to EL3, if TCR_EL1.EPDx = 1 and SCTLR_EL3.M = 1, any speculative AT PTWs for EL1 context can be prevented (including Stage 2).

So the fix as we currently understand would involve the following sequence :

        a. On Entry to EL3, save the incoming SCTLR_EL1.M and TCR_EL1.EPDx bits and set them (ensure TCR_EL1.EPDx =1 prior to SCTLR_EL1.M =1 using isb())
        b. Prior to Exit from EL3, after the target context is restored, restore the SCTLR_EL1.M and TCR_EL1.EPDx bits.

The above sequence now means that any use of AT instruction targeted at lower EL from EL3 that require PTW will fault. So prior to use of AT, ensure the PTW are re-enabled and disabled back again after the AT instructions.

If the above sequence is agreed upon to resolve the errata, then we can work on a patch for the same. I suspect current el1 register save and restore sequence in TF-A is a bit unwieldy and we may need to analyze all the entry points to EL3 to ensure we cover everything.

Best Regards
Soby Mathew

-----Original Message-----
From: TF-A <tf-a-bounces@lists.trustedfirmware.org> On Behalf Of Andrew
Scull via TF-A
Sent: 02 July 2020 09:20
To: Raghu K <raghu.ncstate@icloud.com>
Cc: android-kvm@google.com; willdeacon@google.com; Marc Zyngier
<mzyngier@google.com>; tf-a@lists.trustedfirmware.org
Subject: Re: [TF-A] Erroneous speculative AT workaround

On Wed, Jul 01, 2020 at 05:47:00PM -0700, Raghu K wrote:
This is interesting. It appears that there is no way on entry to EL3
to guarantee that the out-of-context(el2 and el1) translation regimes
are in a consistent state and on every entry into EL3, we have to
conservatively assume that it is in an inconsistent state. This is
because of the situation Andrew mentioned(interrupts to EL3 can occur at any
time).
If this is the case, on EL3 entry:
1) For EL1, we will need to save SCTLR_EL1, set SCTLR_EL1.M = 1,.EPDx
= 0
This would still be racing against any potential speculative execution of an  AT
instruction upon the switch to EL3, IIUC. The window would be much smaller
but not entirely eliminated.

For KVM, this would be enough as KVM will have already applied this
workaround (with Marc's corrections) whenever it is going to enter an
inconsistent state.  However, other EL2 software may choose to handle the
errata differently, possibly going to the lengths of ensuring that no AT
instruction is ever mapped executable.
--
TF-A mailing list
TF-A@lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-a