New subject: Erroneous speculative AT workaround

2 Jul 2020


This is interesting. It appears that there is no way on entry to EL3 to 
guarantee that the out-of-context (EL2 and EL1) translation regimes are 
in a consistent state, so on every entry into EL3 we have to 
conservatively assume that they are in an inconsistent state. This is 
because of the situation Andrew mentioned (interrupts to EL3 can occur at 
any time).
If this is the case, on EL3 entry:
1) For EL1, we will need to save SCTLR_EL1 and TCR_EL1, then set SCTLR_EL1.M = 1 and TCR_EL1.EPDx = 1
2) Set whatever bits we need to for EL2 and S2 translations to not 
succeed (what are these?)
3) DSB, to ensure no speculative AT can be issued until the DSB 
completes, so any AT that occurs will not fill the TLB with bad translations.
On exit (right before ERET), we need to restore the registers saved on 
entry, and have the ERET followed by a DSB so that there can be no 
speculative execution of AT instructions.
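For illustration, the EL1 part of the entry/exit sequence above could be modelled roughly as follows. This is only a sketch in C that tracks register values; the struct and function names are hypothetical and are not TF-A code. It assumes EPD0/EPD1 are set to 1, which is the value that disables the stage 1 walks. The barriers and the EL2/S2 handling from step 2 are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural bit positions (Armv8-A). */
#define SCTLR_M_BIT   (UINT64_C(1) << 0)   /* SCTLR_EL1.M: stage 1 MMU enable */
#define TCR_EPD0_BIT  (UINT64_C(1) << 7)   /* TCR_EL1.EPD0: disable TTBR0 walks */
#define TCR_EPD1_BIT  (UINT64_C(1) << 23)  /* TCR_EL1.EPD1: disable TTBR1 walks */

/* Hypothetical stand-in for the live EL1 registers (not a TF-A type). */
struct el1_regs {
    uint64_t sctlr_el1;
    uint64_t tcr_el1;
};

/* Hypothetical save area for the values overwritten on EL3 entry. */
struct el1_saved {
    uint64_t sctlr_el1;
    uint64_t tcr_el1;
};

/*
 * EL3 entry: save the EL1 values, then force M = 1 and EPD0 = EPD1 = 1.
 * On hardware these would be MSR writes followed by the DSB from step 3.
 */
void el3_entry_block_s1_walks(struct el1_regs *r, struct el1_saved *s)
{
    assert(r != NULL && s != NULL);
    s->sctlr_el1 = r->sctlr_el1;
    s->tcr_el1 = r->tcr_el1;
    r->tcr_el1 |= TCR_EPD0_BIT | TCR_EPD1_BIT;
    r->sctlr_el1 |= SCTLR_M_BIT;
}

/* EL3 exit: put the saved values back right before ERET. */
void el3_exit_restore(struct el1_regs *r, const struct el1_saved *s)
{
    assert(r != NULL && s != NULL);
    r->sctlr_el1 = s->sctlr_el1;
    r->tcr_el1 = s->tcr_el1;
}
```

Calling el3_entry_block_s1_walks() and then el3_exit_restore() on the same pair of structs leaves the registers exactly as they were found, which is the property the exit path needs.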
Thanks
Raghu
On 7/1/20 5:54 AM, Marc Zyngier via TF-A wrote:
...
Hi Manish,
On Wed, 1 Jul 2020 at 13:14, Manish Badarkhe Manish.Badarkhe@arm.com wrote:
...
Hi Andrew,
As per the current implementation, the “el1_sysregs_context_restore” routine does the following:
   1.  TCR_EL1.EPD0 = 1
   2.  TCR_EL1.EPD1 = 1
   3.  SCTLR_EL1.M = 0
   4.  isb

Code snippet:
         mrs     x9, tcr_el1
         orr     x9, x9, #TCR_EPD0_BIT
         orr     x9, x9, #TCR_EPD1_BIT
         msr     tcr_el1, x9
         mrs     x9, sctlr_el1
         bic     x9, x9, #SCTLR_M_BIT
         msr     sctlr_el1, x9
         isb
This is to avoid a PTW while updating the system registers at step 5.
Unfortunately, this doesn't prevent anything.
If SCTLR_EL1.M is clear, the TCR_EL1.EPDx bits don't mean much (S1 MMU is
disabled, no S1 page table walk), and you can still have S2 PTWs
(using an idmap for S1), creating TLB corruption if these entries
alias with any S1 mapping that exists at EL1.
Which is why KVM does *set* SCTLR_EL1.M, which prevents the use of a
1:1 mapping at S1, and at which point the TCR_EL1.EPDx bits are
actually useful in preventing a PTW.
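To make the interaction between SCTLR_EL1.M and TCR_EL1.EPDx concrete, here is a minimal C model of the reasoning above. It is a sketch only: the function names are hypothetical, it models the logic rather than touching real registers, and it ignores the EL2 and stage 2 controls. The bit positions are the architectural ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_M_BIT   (UINT64_C(1) << 0)   /* SCTLR_EL1.M: stage 1 MMU enable */
#define TCR_EPD0_BIT  (UINT64_C(1) << 7)   /* TCR_EL1.EPD0: disable TTBR0 walks */
#define TCR_EPD1_BIT  (UINT64_C(1) << 23)  /* TCR_EL1.EPD1: disable TTBR1 walks */

/*
 * With M clear, stage 1 is a flat (1:1) mapping. No S1 walk happens, but a
 * speculative AT can still cause an S2 walk and allocate a TLB entry.
 */
bool s1_is_flat_mapped(uint64_t sctlr_el1)
{
    return (sctlr_el1 & SCTLR_M_BIT) == 0;
}

/* An S1 walk can start only if M is set and at least one EPD bit is clear. */
bool s1_walk_possible(uint64_t sctlr_el1, uint64_t tcr_el1)
{
    const uint64_t epd_both = TCR_EPD0_BIT | TCR_EPD1_BIT;

    if (s1_is_flat_mapped(sctlr_el1))
        return false;
    return (tcr_el1 & epd_both) != epd_both;
}

/*
 * In this model the EL1 regime is blocked against speculative AT only when
 * neither the flat mapping nor an S1 walk is available.
 */
bool el1_regime_blocked(uint64_t sctlr_el1, uint64_t tcr_el1)
{
    return !s1_is_flat_mapped(sctlr_el1) &&
           !s1_walk_possible(sctlr_el1, tcr_el1);
}
```

With M = 0 and both EPD bits set (the sequence in the quoted snippet), el1_regime_blocked() returns false; with M = 1 and both EPD bits set (the KVM approach), it returns true.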
...
   5. Restore all system registers for El1 except SCTLR_EL1 and TCR_EL1
   6. isb()
   7. restore SCTLR_EL1 and TCR_EL1

Code Snippet:
         ldr     x9, [x0, #CTX_SCTLR_EL1]     /* saved value from "el2_sysregs_context_save" */
         msr     sctlr_el1, x9
         ldr     x9, [x0, #CTX_TCR_EL1]
         msr     tcr_el1, x9
As per the above steps, SCTLR_EL1 gets restored to its actual settings at step 7.
A similar flow is present in “el2_sysregs_context_restore” to restore the SCTLR_EL1 register.
In conclusion, this routine temporarily clears the M bit of SCTLR_EL1 to avoid speculation, but restores it
to its original setting before returning to its caller. Please let us know whether this aligns with the KVM
workaround for the speculative AT erratum.
It doesn't, unfortunately. I believe this code actively creates
problems on a system that is affected by speculative AT execution.
I don't understand your rationale for touching SCTLR_EL2.M either if
you are not context-switching the EL2 S1 state: as far as I understand
no affected cores have S-EL2, so no switch should happen at this
stage.
Thanks,
     M.

Re: [TF-A] Erroneous speculative AT workaround