Hi,

 

Today I measured the call overhead on function entry to TF-M, using the MDK debugger on STM32L5. The overhead is significant and will cause side effects for time-deterministic MCU applications.

 

Compiler: AC6.14 -Oz (optimized for image size)

TFM configuration: TFM_LVL=1, library mode, TFM_NS_CLIENT_IDENTIFICATION = OFF

 

--- Execution time measurement:

Calling psa_open_key from the NS side through to the corresponding secure function, measured from three different start and end points:

NS: dispatch      ->  S: tfm_crypto_open_key    2135 cycles

NS: dispatch      ->  S: psa_open_key           2536 cycles

NS: psa_open_key  ->  S: psa_open_key           2825 cycles  (this includes the RTOS mutex overhead)

 

 

tfm_core_sfn_request(const struct tfm_sfn_req_s *desc_ptr)

{

    __ASM volatile(

        "PUSH   {r4-r12, lr}                \n" 

        "SVC    %[SVC_REQ]                  \n"   <--- effectively disables interrupts for 1970 Cycles

        "MOV    r4,  #0                     \n"

                               

On Musca (~48 MHz) the overhead is about 45 us per TF-M call.
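As a quick cross-check of that figure, here is a minimal sketch of the conversion from cycles to microseconds. It assumes the clock is exactly 48 MHz and that the cycle counts measured above carry over unchanged to Musca, which may not hold in practice. The helper name cycles_to_us is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: convert a cycle count to microseconds.
 * Dividing cycles by the clock in MHz gives microseconds directly,
 * because 1 MHz is one cycle per microsecond. */
static double cycles_to_us(uint32_t cycles, double clock_mhz)
{
    return (double)cycles / clock_mhz;
}
```

With the numbers above, 2135 cycles at 48 MHz comes out at roughly 44.5 us, which matches the 45 us figure. The 2536 and 2825 cycle paths would be roughly 53 us and 59 us under the same assumptions.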

 

--- Code Size overhead:

Each TF-M function call has the following flow:

tfm_ns_interface_dispatch  (a central function)

#33 result = fn(arg0, arg1, arg2, arg3);  -> calls each TF-M function through its individual veneer

tfm_core_partition_request  (again a central function)
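To make the shape of this flow easier to follow, here is a simplified model of it. This is not the TF-M source: all names starting with demo_ are made up for illustration, the lock is a plain counter standing in for the RTOS mutex, and the veneer body is a placeholder for the secure-side call.

```c
#include <stdint.h>

/* Assumed veneer signature: four word-sized arguments, one status word. */
typedef int32_t (*demo_veneer_fn)(uint32_t, uint32_t, uint32_t, uint32_t);

/* Stand-in for the RTOS mutex that serializes NS calls. */
static int demo_lock_depth;
static void demo_lock(void)   { demo_lock_depth++; }
static void demo_unlock(void) { demo_lock_depth--; }

/* Central NS dispatcher: a single copy shared by every call. */
static int32_t demo_ns_dispatch(demo_veneer_fn fn, uint32_t arg0,
                                uint32_t arg1, uint32_t arg2, uint32_t arg3)
{
    int32_t result;

    demo_lock();
    result = fn(arg0, arg1, arg2, arg3);  /* one veneer per function */
    demo_unlock();
    return result;
}

/* One such veneer exists per secure function. In the real system this is
 * where the central partition request would be entered; here it only
 * returns a placeholder value. */
static int32_t demo_open_key_veneer(uint32_t arg0, uint32_t arg1,
                                    uint32_t arg2, uint32_t arg3)
{
    (void)arg1; (void)arg2; (void)arg3;
    return (int32_t)arg0 + 1;
}
```

The point of the model is that the dispatcher and the partition request are shared, while the veneer in the middle is duplicated per function, so any code inlined into the veneer is paid for once per function.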

 

As function inlining is used, each veneer requires 180 bytes.

In my system there are 4 ITS and 46 Crypto functions, with the net result of ~10K of code for just the veneer entries.
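The arithmetic behind that estimate, as a minimal sketch. It assumes every veneer costs the same 180 bytes; the function name veneer_code_bytes is made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: total code size of the veneer entries, assuming
 * every veneer has the same size. */
static uint32_t veneer_code_bytes(uint32_t n_functions,
                                  uint32_t bytes_per_veneer)
{
    return n_functions * bytes_per_veneer;
}
```

For 4 + 46 = 50 functions at 180 bytes each this gives 9000 bytes, which is in line with the ~10K quoted above.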

 

Here are some suggestions:

 

I hope this helps to improve TF-M.

Reinhard