there's a final invalidation after the entire mapping has been completed [2].
[RK] Agree. This works. However, is an invalidation really required if we did use break-before-make or add new entries? Break-before-make invalidates replaced entries and the architecture does not allow invalid translations to be cached in the TLB. With respect to line 478 in src/mm.c, It seemed odd to me to rely on that line for the final barriers, since it is under an if statement. It appears that after one_time_init, we always invalidate page tables whether it is new mappings or replacing mappings. What is the reason we wanted to delay invalidation until end of one_time_init? Did it give a significant performance boost? It seems cleaner to me to explicitly have a DSB+ISB before line 478 or remove the if condition around line 478 and have some code comments about previous PTE writes becoming entirely visible after the invalidation.
You're absolutely right that it's a roundabout way of hitting the right instructions (possibly because the non-existent x86 was in mind?). Most of the table updates did two passes, one to check the memory allocations and the second to commit the actual changes to the address space, hence the if statement. I don't know any good reason for the TLB invalidation being done in that way, it certainly wasn't a measured performance benefit
Let me know what you think.
I think things could be made much clearer, especially if an x86 port is no longer going to be an objective. Some of the arch abstraction here makes things more confusing.
I'd also wonder whether an existing page table library, e.g. xlat_v2 from TF-A, would be applicable so we can reduce, or at least concentrate, the amount of buggy PT code in the world. But I suspect there are issues with needing stage 1 and 2 being handled.