I was going through src/mm.c in the hafnium repo and noticed a couple of things that I think are problem:
mm_vm_defrag and mm_defrag, which eventually call mm_ptable_defrag
appears to be changing page entries into block entries. Per D5.10.1 of ARM ARM, this should use break-before-make, but I don't see it being done in code. Am I missing where this is done in code, or is there some reason break-before-make is not required during defrag(that is not obvious to me from code or comments)?
This does seem like it's just wrong. Despite there not being a change in the attributes, the ARM does say that a change of mapping size should also break-before-make, but that's not what it does. The places that mm_ptable_defrag_entry() is being assigned should also be using mm_replace_entry() if there was a change and there are likely some missing barriers there too.
In mm_replace_entry, line 278
(https://git.trustedfirmware.org/hafnium/hafnium.git/tree/src/mm.c#n278) is not followed by a DSB + ISB as required by the break-before-make. Without the DSB, a load/store that uses a VA translated by the PTE being replaced could be reordered before line 278 and fault. Once again, this looks like a bug to me but is there some reason that is not obvious due to which it does not require barriers?
From a quick scan, it looks like the barriers in this case are in the
arch code [1] for the TLB invalidation. The intermediate ones happen as the break is being made and there's a final invalidation after the entire mapping has been completed [2].
[1] -- https://git.trustedfirmware.org/hafnium/hafnium.git/tree/src/arch/aarch64/mm... [2] -- https://git.trustedfirmware.org/hafnium/hafnium.git/tree/src/mm.c#n478