Hi,
On Mon, Jun 26, 2023 at 10:39 AM haiyuan.ghy--- via OP-TEE <op-tee@lists.trustedfirmware.org> wrote:
> Hi Jens, I compiled optee_os with CFG_CORE_PREALLOC_EL0_TBLS=y. Using the same optee_example_hello_world CA and TA, I ran InvokeCommand 10K times. The average time was 17 us, somewhat less than the previous 20 us.
A small improvement, at least it didn't make things worse. :-)
> Any suggestions? Thanks for the help.
With CFG_CORE_PREALLOC_EL0_TBLS=y each TA has preallocated translation tables that don't need to be completely reinitialized each time. I guess some of the saved time above comes from core_mmu_populate_user_map() taking a bit less time. It would be interesting to know how the time is spent in vm_set_ctx(), assuming that's still the main problem.
There are also the calls to tlbi_all() and icache_inv_all() in core_mmu_set_user_map() that might be a bit brutal. If you're only calling the same TA repeatedly, it could be interesting to see how much time can be saved by skipping those two calls. That would tell us whether it's worth trying to optimize that part.
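To get that kind of breakdown, something like the following could work. It's only a sketch: the probe names and helper functions are made up for illustration and are not part of optee_os, and clock_gettime() is just a stand-in so it builds on a host. In the core you would need another time source (the generic timer counter would be my assumption).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical probe IDs, one per code section being timed. */
enum probe_id {
	PROBE_POPULATE,	/* e.g. around core_mmu_populate_user_map() */
	PROBE_FLUSH,	/* e.g. around tlbi_all() + icache_inv_all() */
	PROBE_MAX
};

struct probe {
	uint64_t total_ns;	/* accumulated time spent in the section */
	uint64_t count;		/* number of samples accumulated */
};

static struct probe probes[PROBE_MAX];

/* Stand-in time source, replace with a core-side counter read. */
uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Record one sample for a section given its start and end timestamps. */
void probe_add(enum probe_id id, uint64_t start_ns, uint64_t end_ns)
{
	assert(id < PROBE_MAX && end_ns >= start_ns);
	probes[id].total_ns += end_ns - start_ns;
	probes[id].count++;
}

/* Average time per sample, or 0 if the section was never hit. */
uint64_t probe_avg_ns(enum probe_id id)
{
	assert(id < PROBE_MAX);
	return probes[id].count ? probes[id].total_ns / probes[id].count : 0;
}

/* Print one summary line per section. */
void probe_report(void)
{
	static const char * const names[PROBE_MAX] = {
		[PROBE_POPULATE] = "populate",
		[PROBE_FLUSH] = "flush",
	};
	int i;

	for (i = 0; i < PROBE_MAX; i++)
		printf("%-10s avg %" PRIu64 " ns over %" PRIu64 " samples\n",
		       names[i], probe_avg_ns((enum probe_id)i),
		       probes[i].count);
}
```

The idea would be to take now_ns() before and after each section, feed the pair to probe_add(), and call probe_report() once after the 10K invocations, so the printing itself doesn't disturb the measurement.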
Cheers, Jens