[PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

List overview All Threads
Download

newer

older

[PATCH v2] tee: handle lookup of...

[PATCH v2] tee: optee: Fix...

Xiaolei Wang

6 Dec 2021 6 Dec '21

12:05 p.m.

We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com --- drivers/tee/optee/smc_abi.c | 2 ++ 1 file changed, 2 insertions(+)

diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c index 6196d7c3888f..cf2e3293567d 100644 --- a/drivers/tee/optee/smc_abi.c +++ b/drivers/tee/optee/smc_abi.c @@ -23,6 +23,7 @@ #include "optee_private.h" #include "optee_smc.h" #include "optee_rpc_cmd.h" +#include <linux/kmemleak.h> #define CREATE_TRACE_POINTS #include "optee_trace.h"

@@ -783,6 +784,7 @@ static void optee_handle_rpc(struct tee_context *ctx, param->a4 = 0; param->a5 = 0; } + kmemleak_not_leak(shm); break; case OPTEE_SMC_RPC_FUNC_FREE: shm = reg_pair_to_ptr(param->a1, param->a2);

-- 2.25.1

Show replies by date

Sumit Garg

9 Dec 9 Dec

11:40 a.m.

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...

We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

-Sumit

...

Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com

drivers/tee/optee/smc_abi.c | 2 ++ 1 file changed, 2 insertions(+)

diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c index 6196d7c3888f..cf2e3293567d 100644 --- a/drivers/tee/optee/smc_abi.c +++ b/drivers/tee/optee/smc_abi.c @@ -23,6 +23,7 @@ #include "optee_private.h" #include "optee_smc.h" #include "optee_rpc_cmd.h" +#include <linux/kmemleak.h> #define CREATE_TRACE_POINTS #include "optee_trace.h"

@@ -783,6 +784,7 @@ static void optee_handle_rpc(struct tee_context *ctx, param->a4 = 0; param->a5 = 0; }
          kmemleak_not_leak(shm);
          break;
  case OPTEE_SMC_RPC_FUNC_FREE:
          shm = reg_pair_to_ptr(param->a1, param->a2);
-- 2.25.1

Wang, Xiaolei

2:40 p.m.

________________________________ From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:40 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...

We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...

How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...

IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit

You mean we need to check whether there is a real memory leak, if being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and until the end, there is no free it via OPTEE_SMC_RPC_FUNC_FREE, then we should judge it as a memory leak.

...

-Sumit

...

Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com

drivers/tee/optee/smc_abi.c | 2 ++ 1 file changed, 2 insertions(+)

diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c index 6196d7c3888f..cf2e3293567d 100644 --- a/drivers/tee/optee/smc_abi.c +++ b/drivers/tee/optee/smc_abi.c @@ -23,6 +23,7 @@ #include "optee_private.h" #include "optee_smc.h" #include "optee_rpc_cmd.h" +#include <linux/kmemleak.h> #define CREATE_TRACE_POINTS #include "optee_trace.h"

@@ -783,6 +784,7 @@ static void optee_handle_rpc(struct tee_context *ctx, param->a4 = 0; param->a5 = 0; }
          kmemleak_not_leak(shm);
          break;
  case OPTEE_SMC_RPC_FUNC_FREE:
          shm = reg_pair_to_ptr(param->a1, param->a2);
-- 2.25.1

Wang, Xiaolei

10 Dec 10 Dec

4:12 a.m.

-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...

We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...

How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...

IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Thanks Xiaolei

...

-Sumit

...

Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com

drivers/tee/optee/smc_abi.c | 2 ++ 1 file changed, 2 insertions(+)

diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c index 6196d7c3888f..cf2e3293567d 100644 --- a/drivers/tee/optee/smc_abi.c +++ b/drivers/tee/optee/smc_abi.c @@ -23,6 +23,7 @@ #include "optee_private.h" #include "optee_smc.h" #include "optee_rpc_cmd.h" +#include <linux/kmemleak.h> #define CREATE_TRACE_POINTS #include "optee_trace.h"

@@ -783,6 +784,7 @@ static void optee_handle_rpc(struct tee_context *ctx, param->a4 = 0; param->a5 = 0; }
          kmemleak_not_leak(shm);
          break;
  case OPTEE_SMC_RPC_FUNC_FREE:
          shm = reg_pair_to_ptr(param->a1, param->a2);
-- 2.25.1

Sumit Garg

5 a.m.

On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...

-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...
How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...
IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

-Sumit

...

Thanks Xiaolei

...
-Sumit

...
Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com

drivers/tee/optee/smc_abi.c | 2 ++ 1 file changed, 2 insertions(+)

diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c index 6196d7c3888f..cf2e3293567d 100644 --- a/drivers/tee/optee/smc_abi.c +++ b/drivers/tee/optee/smc_abi.c @@ -23,6 +23,7 @@ #include "optee_private.h" #include "optee_smc.h" #include "optee_rpc_cmd.h" +#include <linux/kmemleak.h> #define CREATE_TRACE_POINTS #include "optee_trace.h"

@@ -783,6 +784,7 @@ static void optee_handle_rpc(struct tee_context *ctx, param->a4 = 0; param->a5 = 0; }
          kmemleak_not_leak(shm);
          break;
  case OPTEE_SMC_RPC_FUNC_FREE:
          shm = reg_pair_to_ptr(param->a1, param->a2);
-- 2.25.1

Jerome Forissier

8:10 a.m.

+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...

On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...
How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...
IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

-- Jerome

Etienne Carriere

9:38 a.m.

Hello all,

On Fri, 10 Dec 2021 at 09:10, Jerome Forissier jerome@forissier.org wrote:

...

+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...
How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...
IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

It's been a while since OP-TEE caches some shm buffers to prevent re-allocting them on and on. OP-TEE does so for 1 shm buffer per "tee threads" OP-TEE has provisioned. Each thread can cache a shm reference. Note that used RPCs from optee to linux/u-boot/ree do not require such message buffer (IMO).

The main issue is the shm buffer are allocated per optee thread (thread context assigned to client invocation request when entreing optee). Therefore, if an optee thread caches a shm buffer, it makes the caller tee session to have a shm reference with a refcount held, until Optee thread releases its cached shm reference.

There are ugly side effects. Linux must disable the cache to release all resources. We recently saw some tee sessions may be left open because of such shm refcount held. It can lead to few misbehaviour of the TA service (restarting a service, releasing a resource)

Config switch CFG_PREALLOC_RPC_CACHE was introduced [pr4896] to disable the feature at boot time. There are means to not use it, or to explicitly enable/disable it at run time (already used optee smc services for that). Would maybe be a better default config. Note this discussion thread ending at his comment [issue1918]:

Comments are welcome. I may have missed something in the description (or understanding :).

[pr4896] https://github.com/OP-TEE/optee_os/pull/4896 [issue1918] https://github.com/OP-TEE/optee_os/issues/1918#issuecomment-968747738

Best regards, etienne

...

-- Jerome

Etienne Carriere

9:43 a.m.

fixing typos :(

On Fri, 10 Dec 2021 at 10:38, Etienne Carriere etienne.carriere@linaro.org wrote:

...

On Fri, 10 Dec 2021 at 09:10, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

(snip)

It's been a while since OP-TEE caches some shm buffers to prevent re-allocting them on and on. OP-TEE does so for 1 shm buffer per "tee threads" OP-TEE has provisioned. Each thread can cache a shm reference. Note that used RPCs from optee to linux/u-boot/ree do not require such message buffer (IMO).

I meant: "Note that **most of the** used RPCs from ..."

br, etienne

Sumit Garg

10:28 a.m.

On Fri, 10 Dec 2021 at 15:08, Etienne Carriere etienne.carriere@linaro.org wrote:

...

Hello all,

On Fri, 10 Dec 2021 at 09:10, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...
How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...
IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

It's been a while since OP-TEE caches some shm buffers to prevent re-allocting them on and on. OP-TEE does so for 1 shm buffer per "tee threads" OP-TEE has provisioned. Each thread can cache a shm reference. Note that used RPCs from optee to linux/u-boot/ree do not require such message buffer (IMO).

The main issue is the shm buffer are allocated per optee thread (thread context assigned to client invocation request when entreing optee). Therefore, if an optee thread caches a shm buffer, it makes the caller tee session to have a shm reference with a refcount held, until Optee thread releases its cached shm reference.

There are ugly side effects. Linux must disable the cache to release all resources. We recently saw some tee sessions may be left open because of such shm refcount held. It can lead to few misbehaviour of the TA service (restarting a service, releasing a resource)

Config switch CFG_PREALLOC_RPC_CACHE was introduced [pr4896] to disable the feature at boot time. There are means to not use it, or to explicitly enable/disable it at run time (already used optee smc services for that). Would maybe be a better default config. Note this discussion thread ending at his comment [issue1918]:

Thanks etienne for the detailed description and references. Although, we can set CFG_PREALLOC_RPC_CACHE=n by default but it feels like we would miss a valuable optimization.

How about we just allocate a shared memory page during the OP-TEE driver probe and share it with optee-os to use for RPC arguments? And later it can be freed during OP-TEE driver removal. This would avoid any refconting of this special memory to be associated with TA sessions.

-Sumit

...

Comments are welcome. I may have missed something in the description (or understanding :).

[pr4896] https://github.com/OP-TEE/optee_os/pull/4896 [issue1918] https://github.com/OP-TEE/optee_os/issues/1918#issuecomment-968747738

Best regards, etienne

...
-- Jerome

Etienne Carriere

10:39 a.m.

On Fri, 10 Dec 2021 at 11:29, Sumit Garg sumit.garg@linaro.org wrote:

...

On Fri, 10 Dec 2021 at 15:08, Etienne Carriere etienne.carriere@linaro.org wrote:

...
Hello all,

On Fri, 10 Dec 2021 at 09:10, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...
How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...
IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

It's been a while since OP-TEE caches some shm buffers to prevent re-allocting them on and on. OP-TEE does so for 1 shm buffer per "tee threads" OP-TEE has provisioned. Each thread can cache a shm reference. Note that used RPCs from optee to linux/u-boot/ree do not require such message buffer (IMO).

The main issue is the shm buffer are allocated per optee thread (thread context assigned to client invocation request when entreing optee). Therefore, if an optee thread caches a shm buffer, it makes the caller tee session to have a shm reference with a refcount held, until Optee thread releases its cached shm reference.

There are ugly side effects. Linux must disable the cache to release all resources. We recently saw some tee sessions may be left open because of such shm refcount held. It can lead to few misbehaviour of the TA service (restarting a service, releasing a resource)

Config switch CFG_PREALLOC_RPC_CACHE was introduced [pr4896] to disable the feature at boot time. There are means to not use it, or to explicitly enable/disable it at run time (already used optee smc services for that). Would maybe be a better default config. Note this discussion thread ending at his comment [issue1918]:

Thanks etienne for the detailed description and references. Although, we can set CFG_PREALLOC_RPC_CACHE=n by default but it feels like we would miss a valuable optimization.

How about we just allocate a shared memory page during the OP-TEE driver probe and share it with optee-os to use for RPC arguments? And later it can be freed during OP-TEE driver removal. This would avoid any refconting of this special memory to be associated with TA sessions.

True. The driver currently invokes OPTEE_SMC_ENABLE_SHM_CACHE to start caching some shm allocations. The optee_os part of that command could be changed to preallocate the required small rpc message buffer per provisioned tee thread.

Existing OPTEE_SMC_DISABLE_SHM_CACHE should behave accordingly.

etienne

...

-Sumit

...
Comments are welcome. I may have missed something in the description (or understanding :).

[pr4896] https://github.com/OP-TEE/optee_os/pull/4896 [issue1918] https://github.com/OP-TEE/optee_os/issues/1918#issuecomment-968747738

Best regards, etienne

...
-- Jerome

Jens Wiklander

10:41 a.m.

Hi,

On Fri, Dec 10, 2021 at 11:29 AM Sumit Garg sumit.garg@linaro.org wrote:

...

On Fri, 10 Dec 2021 at 15:08, Etienne Carriere etienne.carriere@linaro.org wrote:

...
Hello all,

On Fri, 10 Dec 2021 at 09:10, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...
How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...
IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

It's been a while since OP-TEE caches some shm buffers to prevent re-allocting them on and on. OP-TEE does so for 1 shm buffer per "tee threads" OP-TEE has provisioned. Each thread can cache a shm reference. Note that used RPCs from optee to linux/u-boot/ree do not require such message buffer (IMO).

The main issue is the shm buffer are allocated per optee thread (thread context assigned to client invocation request when entreing optee). Therefore, if an optee thread caches a shm buffer, it makes the caller tee session to have a shm reference with a refcount held, until Optee thread releases its cached shm reference.

There are ugly side effects. Linux must disable the cache to release all resources. We recently saw some tee sessions may be left open because of such shm refcount held. It can lead to few misbehaviour of the TA service (restarting a service, releasing a resource)

Config switch CFG_PREALLOC_RPC_CACHE was introduced [pr4896] to disable the feature at boot time. There are means to not use it, or to explicitly enable/disable it at run time (already used optee smc services for that). Would maybe be a better default config. Note this discussion thread ending at his comment [issue1918]:

Thanks etienne for the detailed description and references. Although, we can set CFG_PREALLOC_RPC_CACHE=n by default but it feels like we would miss a valuable optimization.

How about we just allocate a shared memory page during the OP-TEE driver probe and share it with optee-os to use for RPC arguments? And later it can be freed during OP-TEE driver removal. This would avoid any refconting of this special memory to be associated with TA sessions.

The FF-A ABI part of the driver avoids this problem. I've started on a similar solution for the SMC based ABI, but it's not ready to post yet.

Cheers, Jens

...

-Sumit

...
Comments are welcome. I may have missed something in the description (or understanding :).

[pr4896] https://github.com/OP-TEE/optee_os/pull/4896 [issue1918] https://github.com/OP-TEE/optee_os/issues/1918#issuecomment-968747738

Best regards, etienne

...
-- Jerome

Sumit Garg

9:38 a.m.

On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...

+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...
How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...
IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

Xiaolei,

Can you once test with CFG_PREALLOC_RPC_CACHE=n while compiling optee-os and see if the observed memory leak disappears or not?

-Sumit

...

[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

-- Jerome

Daniel Thompson

3:49 p.m.

On Fri, Dec 10, 2021 at 03:08:21PM +0530, Sumit Garg wrote:

...

On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...
How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...
IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

IIUC this patch adds kmemleak_not_leak() at (pretty much) the last possible point before *ownership* of the SHM block is passed from kernel to OP-TEE.

Sure, after we change ownership it could still be leaked... but it can no longer be leaked by the kernel because the kernel no longer owns it! More importantly, it makes no sense to run the kernel memory detector on the buffer because it simply can't work.

After the RPC completes, doesn't it become impossible for kmemleak to scan to see if the pointer is lost[1]? kmemleak is essentially a tracing garbage collector and needs to be able to scan all memory that could hold a pointer to leakable memory. After the RPC completes the only copy of the pointer will be stored in a memory region that the kernel is prohibited from reading. How could kmemleak possibly give you a useful answer in this circumstance?

In other words if there's nothing kmemleak could do to fix this situation then marking the memory as kmemleak_not_leak() seems entirely appropriate (although it should be prefaced with a big comment explaining the change of ownerhship and why kmemleak cannot work).

Daniel.

[1] Everything I've said here hinges on this being true... so if I've made a mistake about where OP-TEE stores this pointer then most of the rest of this post is junk ;-)

Sumit Garg

13 Dec 13 Dec

8:58 a.m.

On Fri, 10 Dec 2021 at 21:19, Daniel Thompson daniel.thompson@linaro.org wrote:

...

On Fri, Dec 10, 2021 at 03:08:21PM +0530, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

...
How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE.

...
IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

IIUC this patch adds kmemleak_not_leak() at (pretty much) the last possible point before *ownership* of the SHM block is passed from kernel to OP-TEE.

I wouldn't say it's a transfer of ownership from kernel to OP-TEE but rather a way for OP-TEE to access kernel's memory in order to pass info or execute further RPC commands.

...

Sure, after we change ownership it could still be leaked... but it can no longer be leaked by the kernel because the kernel no longer owns it! More importantly, it makes no sense to run the kernel memory detector on the buffer because it simply can't work.

After the RPC completes, doesn't it become impossible for kmemleak to scan to see if the pointer is lost[1]?

Apart from the special OP-TEE prealloc SHM cache stuff, I can't think of any scenario where an OP-TEE thread should hold off kernel's memory pointers for more than 5 seconds before being passed onto kernel for further RPC commands or RPC free action. So the kmemleak should be able to detect if a pointer is lost.

/* * Kmemleak configuration and common defines. */ <snip> #define MSECS_MIN_AGE 5000 /* minimum object age for reporting */ <snip>

...

kmemleak is essentially a tracing garbage collector and needs to be able to scan all memory that could hold a pointer to leakable memory. After the RPC completes the only copy of the pointer will be stored in a memory region that the kernel is prohibited from reading. How could kmemleak possibly give you a useful answer in this circumstance?

There is another aspect of kmemleak being the minimum age of an object to be reported as a memory leak as described above. Also, this case resembles where a pointer is stored on the CPU stack (see struct optee_rpc_param param = { };).

In most of the scenarios apart from special prealloc shm cache case, the flow should be as follows:

1) Alloc kernel memory via RPC 2) OP-TEE passes references to kernel memory for RPC action commands 3) Free kernel memory via RPC

kmemleak should be useful in case the 3rd step is skipped due to incorrect behaviour of a particular OP-TEE thread. And I can't think of any appropriate way in OP-TEE OS to detect this type of kernel memory leak caused by one of its threads.

-Sumit

...

In other words if there's nothing kmemleak could do to fix this situation then marking the memory as kmemleak_not_leak() seems entirely appropriate (although it should be prefaced with a big comment explaining the change of ownerhship and why kmemleak cannot work).

Daniel.

[1] Everything I've said here hinges on this being true... so if I've made a mistake about where OP-TEE stores this pointer then most of the rest of this post is junk ;-)

Daniel Thompson

1:04 p.m.

On Mon, Dec 13, 2021 at 02:28:01PM +0530, Sumit Garg wrote:

...

On Fri, 10 Dec 2021 at 21:19, Daniel Thompson daniel.thompson@linaro.org wrote:

...
On Fri, Dec 10, 2021 at 03:08:21PM +0530, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote: Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

IIUC this patch adds kmemleak_not_leak() at (pretty much) the last possible point before *ownership* of the SHM block is passed from kernel to OP-TEE.

I wouldn't say it's a transfer of ownership from kernel to OP-TEE but rather a way for OP-TEE to access kernel's memory in order to pass info or execute further RPC commands.

The RPC handler allocates a pointer (e.g. now the RPC handler owns the allocated memory). The RPC handler then passes that pointer to OP-TEE and forgets what it's value was.

That is a transfer of ownership: the RPC handler does not hold any pointer to the memory and is incapable of freeing it. Moreover this situation is what kmemleak_no_leak() is for! Its job it to inform kmemleak that the pointer is owned/stored somewhere that is does not scan.

...

...
Sure, after we change ownership it could still be leaked... but it can no longer be leaked by the kernel because the kernel no longer owns it! More importantly, it makes no sense to run the kernel memory detector on the buffer because it simply can't work.

After the RPC completes, doesn't it become impossible for kmemleak to scan to see if the pointer is lost[1]?

Apart from the special OP-TEE prealloc SHM cache stuff, I can't think of any scenario where an OP-TEE thread should hold off kernel's memory pointers for more than 5 seconds before being passed onto kernel for further RPC commands or RPC free action. So the kmemleak should be able to detect if a pointer is lost.

Or putting this a different way: there is known to be firmware in the field that allocates pointers for more then five seconds!

...

/*

Kmemleak configuration and common defines.

*/

<snip> #define MSECS_MIN_AGE 5000 /* minimum object age for reporting */ <snip>

...
kmemleak is essentially a tracing garbage collector and needs to be able to scan all memory that could hold a pointer to leakable memory. After the RPC completes the only copy of the pointer will be stored in a memory region that the kernel is prohibited from reading. How could kmemleak possibly give you a useful answer in this circumstance?

There is another aspect of kmemleak being the minimum age of an object to be reported as a memory leak as described above. Also, this case resembles where a pointer is stored on the CPU stack (see struct optee_rpc_param param = { };).

I can't see how this resembles pointers stored on the stack.

Firstly, stack memory is scanned by kmemleak meaning a thread is permitted to own memory for more than five seconds without provoking a warning. OP-TEE memory cannot be scanned like this.

Secondly, stacks don't have any concept of sessions. It is *really* buggy behaviour for a TA to allocate SHM memory during a session open so it can avoid critical path RPC round trips when operational?

...

In most of the scenarios apart from special prealloc shm cache case, the flow should be as follows:

Alloc kernel memory via RPC

OP-TEE passes references to kernel memory for RPC action commands

Free kernel memory via RPC

kmemleak should be useful in case the 3rd step is skipped due to incorrect behaviour of a particular OP-TEE thread. And I can't think of any appropriate way in OP-TEE OS to detect this type of kernel memory leak caused by one of its threads.

If OP-TEE is the only place the pointer is held and you can't think of any way for OP-TEE OS to detect if it has leaked the pointer then how can you expect the kernel to give a correct verdict when it has even less visibility than OP-TEE OS.

Note that, if you think OP-TEE routinely leaks memory, then there are ways that the corresponding kernel driver could track what memory it has handed to OP-TEE. However this should be described as a list of *allocations* rather than a list of *leaks* because the driver cannot distinguish the two.

Daniel.

Sumit Garg

14 Dec 14 Dec

7:03 a.m.

On Mon, 13 Dec 2021 at 18:34, Daniel Thompson daniel.thompson@linaro.org wrote:

...

On Mon, Dec 13, 2021 at 02:28:01PM +0530, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 21:19, Daniel Thompson daniel.thompson@linaro.org wrote:

...
On Fri, Dec 10, 2021 at 03:08:21PM +0530, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote: Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

IIUC this patch adds kmemleak_not_leak() at (pretty much) the last possible point before *ownership* of the SHM block is passed from kernel to OP-TEE.

I wouldn't say it's a transfer of ownership from kernel to OP-TEE but rather a way for OP-TEE to access kernel's memory in order to pass info or execute further RPC commands.

The RPC handler allocates a pointer (e.g. now the RPC handler owns the allocated memory). The RPC handler then passes that pointer to OP-TEE and forgets what it's value was.

That is a transfer of ownership: the RPC handler does not hold any pointer to the memory and is incapable of freeing it. Moreover this situation is what kmemleak_no_leak() is for! Its job it to inform kmemleak that the pointer is owned/stored somewhere that is does not scan.

Let me put this another way. If the memory allocator belongs to the kernel then how does OP-TEE get to know which memory is currently allocated and it is to be scanned?

I think the complete solution would be to extend kmemleak to support OP-TEE memory scanning via OP-TEE invocation to check if it's holding any kernel memory references.

...

...
...
Sure, after we change ownership it could still be leaked... but it can no longer be leaked by the kernel because the kernel no longer owns it! More importantly, it makes no sense to run the kernel memory detector on the buffer because it simply can't work.

After the RPC completes, doesn't it become impossible for kmemleak to scan to see if the pointer is lost[1]?

Apart from the special OP-TEE prealloc SHM cache stuff, I can't think of any scenario where an OP-TEE thread should hold off kernel's memory pointers for more than 5 seconds before being passed onto kernel for further RPC commands or RPC free action. So the kmemleak should be able to detect if a pointer is lost.

Or putting this a different way: there is known to be firmware in the field that allocates pointers for more then five seconds!

If it's known that upstream OP-TEE doesn't hold any kernel memory references for more than 5 seconds then IMO we should be fine to not disable kmemleak until we have a future kmemleak extension. Otherwise it would be very hard to keep track of kernel memory lost in this way.

...

...
/*

Kmemleak configuration and common defines.

*/

<snip> #define MSECS_MIN_AGE 5000 /* minimum object age for reporting */ <snip>

...
kmemleak is essentially a tracing garbage collector and needs to be able to scan all memory that could hold a pointer to leakable memory. After the RPC completes the only copy of the pointer will be stored in a memory region that the kernel is prohibited from reading. How could kmemleak possibly give you a useful answer in this circumstance?

There is another aspect of kmemleak being the minimum age of an object to be reported as a memory leak as described above. Also, this case resembles where a pointer is stored on the CPU stack (see struct optee_rpc_param param = { };).

I can't see how this resembles pointers stored on the stack.

Firstly, stack memory is scanned by kmemleak meaning a thread is permitted to own memory for more than five seconds without provoking a warning. OP-TEE memory cannot be scanned like this.

Secondly, stacks don't have any concept of sessions. It is *really* buggy behaviour for a TA to allocate SHM memory during a session open so it can avoid critical path RPC round trips when operational?

That's one optimization case for prealloc SHM we have currently. Jens is already working on an update for SMC ABI to avoid such allocations.

BTW, that could be disabled with CFG_PREALLOC_RPC_CACHE=n in upstream OP-TEE.

...

...
In most of the scenarios apart from special prealloc shm cache case, the flow should be as follows:

Alloc kernel memory via RPC

OP-TEE passes references to kernel memory for RPC action commands

Free kernel memory via RPC

kmemleak should be useful in case the 3rd step is skipped due to incorrect behaviour of a particular OP-TEE thread. And I can't think of any appropriate way in OP-TEE OS to detect this type of kernel memory leak caused by one of its threads.

If OP-TEE is the only place the pointer is held and you can't think of any way for OP-TEE OS to detect if it has leaked the pointer then how can you expect the kernel to give a correct verdict when it has even less visibility than OP-TEE OS.

Let me try to elaborate here. Since the nature of shared memory under consideration here are:

1) Allocated by kernel. 2) OP-TEE holds reference to it.

In order to detect its leakage, the memory leakage detector should be invoked by the kernel and OP-TEE should support that detector to say if it holds any references to the memory being scanned.

Also, as you may know OP-TEE is a passive firmware entity without any threads of its own and is scheduled by the Linux kernel. So it won't be possible to have an independent memory leak detector thread within OP-TEE.

...

Note that, if you think OP-TEE routinely leaks memory,

How can we be sure about any piece of complex software (lots of platform specific code) that it will leak memory or not? I guess that is the reason why the kernel has this runtime kmemleak debugging tool.

-Sumit

...

then there are ways that the corresponding kernel driver could track what memory it has handed to OP-TEE. However this should be described as a list of *allocations* rather than a list of *leaks* because the driver cannot distinguish the two.

Daniel.

Daniel Thompson

15 Dec 15 Dec

10:19 a.m.

On Tue, Dec 14, 2021 at 12:33:08PM +0530, Sumit Garg wrote:

...

On Mon, 13 Dec 2021 at 18:34, Daniel Thompson daniel.thompson@linaro.org wrote:

...
On Mon, Dec 13, 2021 at 02:28:01PM +0530, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 21:19, Daniel Thompson daniel.thompson@linaro.org wrote:

...
On Fri, Dec 10, 2021 at 03:08:21PM +0530, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
On 12/10/21 06:00, Sumit Garg wrote: > On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

IIUC this patch adds kmemleak_not_leak() at (pretty much) the last possible point before *ownership* of the SHM block is passed from kernel to OP-TEE.

I wouldn't say it's a transfer of ownership from kernel to OP-TEE but rather a way for OP-TEE to access kernel's memory in order to pass info or execute further RPC commands.

The RPC handler allocates a pointer (e.g. now the RPC handler owns the allocated memory). The RPC handler then passes that pointer to OP-TEE and forgets what it's value was.

That is a transfer of ownership: the RPC handler does not hold any pointer to the memory and is incapable of freeing it. Moreover this situation is what kmemleak_no_leak() is for! Its job it to inform kmemleak that the pointer is owned/stored somewhere that is does not scan.

Let me put this another way. If the memory allocator belongs to the kernel then how does OP-TEE get to know which memory is currently allocated and it is to be scanned?

OP-TEE explicitly requested that the be allocated and responsible for figuring out where to store the pointer. How could it *not* know this information? More specifically OP-TEE is perfectly capable of recording what memory it has allocated and where to scan to find out if it has been lost.

...

I think the complete solution would be to extend kmemleak to support OP-TEE memory scanning via OP-TEE invocation to check if it's holding any kernel memory references.

This is the part I get stuck on... and the reason I'm still posting on the thread.

I struggle to see any value in using kmemleak to identify this type of leak. That is the fundamental issue. False positives from kmemleak are damaging to user confidence in the tool and are especially harmful when it is complex and time consuming to verify that is actually is a false positive (which would certainly be the case for OP-TEE false positives). In short it is *not* always the case failure-to-detect is worse than false-positive.

As discussed already the firmware/kernel contract prevents kmemleak from working as it is designed to and I am unconvinced that relying on fragile timeouts is sufficient.

Extending kmemleak to support OP-TEE memory scanning is also, IMHO, pointless. The reason for this is that OP-TEE cannot perform any scan on behalf of kmemleak without first validating the information provided to it by the kernel (to avoid information leaks). However if OP-TEE tracks enough state to validate the kernel input than it already has enough state to do a scan for leaks independently anyway (apart from being donated an execution context). Therefore it follows that any OP-TEE extension to handle leaks should be independent of kmemleak because it would still be useful to be able to ask OP-TEE to run a self-consistency check even if kmemleak is disabled.

Or, in short, even if you implement improved leak detection for OP-TEE (whether that is based on timers or scanning) then kmemleak_not_leak() is still the right thing to do with pointers whose ownership we have transferred to OP-TEE.

...

...
...
...
Sure, after we change ownership it could still be leaked... but it can no longer be leaked by the kernel because the kernel no longer owns it! More importantly, it makes no sense to run the kernel memory detector on the buffer because it simply can't work.

After the RPC completes, doesn't it become impossible for kmemleak to scan to see if the pointer is lost[1]?

Apart from the special OP-TEE prealloc SHM cache stuff, I can't think of any scenario where an OP-TEE thread should hold off kernel's memory pointers for more than 5 seconds before being passed onto kernel for further RPC commands or RPC free action. So the kmemleak should be able to detect if a pointer is lost.

Or putting this a different way: there is known to be firmware in the field that allocates pointers for more then five seconds!

If it's known that upstream OP-TEE doesn't hold any kernel memory references for more than 5 seconds then IMO we should be fine to not disable kmemleak until we have a future kmemleak extension. Otherwise it would be very hard to keep track of kernel memory lost in this way.

In essence I am arguing for using the right tool for the right job (and against turning down a correct patch because the right tool isn't yet implemented). A memory scanning leak detector is the wrong tool to search for leaks in memory that cannot be scanned.

For me having to rely on fragile implied contracts and undocumented assumptions about timing further reinforces my view that kmemleak is not the wrong tool. Especially so when we know that those assumptions are not met by existing firmware.

Daniel.

Jens Wiklander

12:25 p.m.

On Wed, Dec 15, 2021 at 11:19 AM Daniel Thompson daniel.thompson@linaro.org wrote:

...

On Tue, Dec 14, 2021 at 12:33:08PM +0530, Sumit Garg wrote:

...
On Mon, 13 Dec 2021 at 18:34, Daniel Thompson daniel.thompson@linaro.org wrote:

...
On Mon, Dec 13, 2021 at 02:28:01PM +0530, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 21:19, Daniel Thompson daniel.thompson@linaro.org wrote:

...
On Fri, Dec 10, 2021 at 03:08:21PM +0530, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote: > On 12/10/21 06:00, Sumit Garg wrote: > > On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

IIUC this patch adds kmemleak_not_leak() at (pretty much) the last possible point before *ownership* of the SHM block is passed from kernel to OP-TEE.

I wouldn't say it's a transfer of ownership from kernel to OP-TEE but rather a way for OP-TEE to access kernel's memory in order to pass info or execute further RPC commands.

The RPC handler allocates a pointer (e.g. now the RPC handler owns the allocated memory). The RPC handler then passes that pointer to OP-TEE and forgets what it's value was.

That is a transfer of ownership: the RPC handler does not hold any pointer to the memory and is incapable of freeing it. Moreover this situation is what kmemleak_no_leak() is for! Its job it to inform kmemleak that the pointer is owned/stored somewhere that is does not scan.

Let me put this another way. If the memory allocator belongs to the kernel then how does OP-TEE get to know which memory is currently allocated and it is to be scanned?

OP-TEE explicitly requested that the be allocated and responsible for figuring out where to store the pointer. How could it *not* know this information? More specifically OP-TEE is perfectly capable of recording what memory it has allocated and where to scan to find out if it has been lost.

...
I think the complete solution would be to extend kmemleak to support OP-TEE memory scanning via OP-TEE invocation to check if it's holding any kernel memory references.

This is the part I get stuck on... and the reason I'm still posting on the thread.

I struggle to see any value in using kmemleak to identify this type of leak. That is the fundamental issue. False positives from kmemleak are damaging to user confidence in the tool and are especially harmful when it is complex and time consuming to verify that is actually is a false positive (which would certainly be the case for OP-TEE false positives). In short it is *not* always the case failure-to-detect is worse than false-positive.

As discussed already the firmware/kernel contract prevents kmemleak from working as it is designed to and I am unconvinced that relying on fragile timeouts is sufficient.

Extending kmemleak to support OP-TEE memory scanning is also, IMHO, pointless. The reason for this is that OP-TEE cannot perform any scan on behalf of kmemleak without first validating the information provided to it by the kernel (to avoid information leaks). However if OP-TEE tracks enough state to validate the kernel input than it already has enough state to do a scan for leaks independently anyway (apart from being donated an execution context). Therefore it follows that any OP-TEE extension to handle leaks should be independent of kmemleak because it would still be useful to be able to ask OP-TEE to run a self-consistency check even if kmemleak is disabled.

Or, in short, even if you implement improved leak detection for OP-TEE (whether that is based on timers or scanning) then kmemleak_not_leak() is still the right thing to do with pointers whose ownership we have transferred to OP-TEE.

...
...
...
...
Sure, after we change ownership it could still be leaked... but it can no longer be leaked by the kernel because the kernel no longer owns it! More importantly, it makes no sense to run the kernel memory detector on the buffer because it simply can't work.

After the RPC completes, doesn't it become impossible for kmemleak to scan to see if the pointer is lost[1]?

Apart from the special OP-TEE prealloc SHM cache stuff, I can't think of any scenario where an OP-TEE thread should hold off kernel's memory pointers for more than 5 seconds before being passed onto kernel for further RPC commands or RPC free action. So the kmemleak should be able to detect if a pointer is lost.

Or putting this a different way: there is known to be firmware in the field that allocates pointers for more then five seconds!

If it's known that upstream OP-TEE doesn't hold any kernel memory references for more than 5 seconds then IMO we should be fine to not disable kmemleak until we have a future kmemleak extension. Otherwise it would be very hard to keep track of kernel memory lost in this way.

In essence I am arguing for using the right tool for the right job (and against turning down a correct patch because the right tool isn't yet implemented). A memory scanning leak detector is the wrong tool to search for leaks in memory that cannot be scanned.

For me having to rely on fragile implied contracts and undocumented assumptions about timing further reinforces my view that kmemleak is not the wrong tool. Especially so when we know that those assumptions are not met by existing firmware.

I agree, this patch makes sense. It fixes a problem and I can't see a downside with that. In a not too distant future we may change the way this memory is passed to OP-TEE by keeping a reference in the driver, but until then this patch will fix a problem.

Cheers, Jens

Sumit Garg

1:42 p.m.

On Wed, 15 Dec 2021 at 17:55, Jens Wiklander jens.wiklander@linaro.org wrote:

...

On Wed, Dec 15, 2021 at 11:19 AM Daniel Thompson daniel.thompson@linaro.org wrote:

...
On Tue, Dec 14, 2021 at 12:33:08PM +0530, Sumit Garg wrote:

...
On Mon, 13 Dec 2021 at 18:34, Daniel Thompson daniel.thompson@linaro.org wrote:

...
On Mon, Dec 13, 2021 at 02:28:01PM +0530, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 21:19, Daniel Thompson daniel.thompson@linaro.org wrote:

...
On Fri, Dec 10, 2021 at 03:08:21PM +0530, Sumit Garg wrote: > On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote: > > On 12/10/21 06:00, Sumit Garg wrote: > > > On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote: IIUC this patch adds kmemleak_not_leak() at (pretty much) the last possible point before *ownership* of the SHM block is passed from kernel to OP-TEE.

I wouldn't say it's a transfer of ownership from kernel to OP-TEE but rather a way for OP-TEE to access kernel's memory in order to pass info or execute further RPC commands.

The RPC handler allocates a pointer (e.g. now the RPC handler owns the allocated memory). The RPC handler then passes that pointer to OP-TEE and forgets what it's value was.

That is a transfer of ownership: the RPC handler does not hold any pointer to the memory and is incapable of freeing it. Moreover this situation is what kmemleak_no_leak() is for! Its job it to inform kmemleak that the pointer is owned/stored somewhere that is does not scan.

Let me put this another way. If the memory allocator belongs to the kernel then how does OP-TEE get to know which memory is currently allocated and it is to be scanned?

OP-TEE explicitly requested that the be allocated and responsible for figuring out where to store the pointer. How could it *not* know this information? More specifically OP-TEE is perfectly capable of recording what memory it has allocated and where to scan to find out if it has been lost.

...
I think the complete solution would be to extend kmemleak to support OP-TEE memory scanning via OP-TEE invocation to check if it's holding any kernel memory references.

This is the part I get stuck on... and the reason I'm still posting on the thread.

I struggle to see any value in using kmemleak to identify this type of leak. That is the fundamental issue. False positives from kmemleak are damaging to user confidence in the tool and are especially harmful when it is complex and time consuming to verify that is actually is a false positive (which would certainly be the case for OP-TEE false positives). In short it is *not* always the case failure-to-detect is worse than false-positive.

As discussed already the firmware/kernel contract prevents kmemleak from working as it is designed to and I am unconvinced that relying on fragile timeouts is sufficient.

Extending kmemleak to support OP-TEE memory scanning is also, IMHO, pointless. The reason for this is that OP-TEE cannot perform any scan on behalf of kmemleak without first validating the information provided to it by the kernel (to avoid information leaks). However if OP-TEE tracks enough state to validate the kernel input than it already has enough state to do a scan for leaks independently anyway (apart from being donated an execution context). Therefore it follows that any OP-TEE extension to handle leaks should be independent of kmemleak because it would still be useful to be able to ask OP-TEE to run a self-consistency check even if kmemleak is disabled.

Or, in short, even if you implement improved leak detection for OP-TEE (whether that is based on timers or scanning) then kmemleak_not_leak() is still the right thing to do with pointers whose ownership we have transferred to OP-TEE.

...
...
...
...
Sure, after we change ownership it could still be leaked... but it can no longer be leaked by the kernel because the kernel no longer owns it! More importantly, it makes no sense to run the kernel memory detector on the buffer because it simply can't work.

After the RPC completes, doesn't it become impossible for kmemleak to scan to see if the pointer is lost[1]?

Apart from the special OP-TEE prealloc SHM cache stuff, I can't think of any scenario where an OP-TEE thread should hold off kernel's memory pointers for more than 5 seconds before being passed onto kernel for further RPC commands or RPC free action. So the kmemleak should be able to detect if a pointer is lost.

Or putting this a different way: there is known to be firmware in the field that allocates pointers for more then five seconds!

If it's known that upstream OP-TEE doesn't hold any kernel memory references for more than 5 seconds then IMO we should be fine to not disable kmemleak until we have a future kmemleak extension. Otherwise it would be very hard to keep track of kernel memory lost in this way.

In essence I am arguing for using the right tool for the right job (and against turning down a correct patch because the right tool isn't yet implemented). A memory scanning leak detector is the wrong tool to search for leaks in memory that cannot be scanned.

For me having to rely on fragile implied contracts and undocumented assumptions about timing further reinforces my view that kmemleak is not the wrong tool. Especially so when we know that those assumptions are not met by existing firmware.

I agree, this patch makes sense. It fixes a problem and I can't see a downside with that. In a not too distant future we may change the way this memory is passed to OP-TEE by keeping a reference in the driver, but until then this patch will fix a problem.

Fair enough, I was just trying to be more optimistic about leveraging existing kmemleak infrastructure as shared memory bugs are catching on us.

-Sumit

...

Cheers, Jens

wangxiaolei

13 Dec 13 Dec

8:55 a.m.

On 12/10/21 5:38 PM, Sumit Garg wrote:

...

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it. How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE. IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

Xiaolei,

Can you once test with CFG_PREALLOC_RPC_CACHE=n while compiling optee-os and see if the observed memory leak disappears or not?

-Sumit

Hi sumit

The version I am using has not increased the CFG_PREALLOC_RPC_CACHE

switch, I checked out to the latest version, but because of the need for

additional patches for the imx8 platform, I still have no way to test the

CFG_PREALLOC_RPC_CACHE=n situation

thanks

xiaolei

...

...
[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

-- Jerome

Sumit Garg

9:04 a.m.

On Mon, 13 Dec 2021 at 14:25, wangxiaolei xiaolei.wang@windriver.com wrote:

...

On 12/10/21 5:38 PM, Sumit Garg wrote:

...
[Please note: This e-mail is from an EXTERNAL e-mail address]

On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it. How about if it's actually a memory leak caused by the secure world? An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE. IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory.

Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

Xiaolei,

Can you once test with CFG_PREALLOC_RPC_CACHE=n while compiling optee-os and see if the observed memory leak disappears or not?

-Sumit

Hi sumit

The version I am using has not increased the CFG_PREALLOC_RPC_CACHE

switch, I checked out to the latest version, but because of the need for

additional patches for the imx8 platform, I still have no way to test the

CFG_PREALLOC_RPC_CACHE=n situation

Can you just try to backport this [1] patch to your imx8 optee-os tree and test?

[1] https://github.com/OP-TEE/optee_os/commit/8887663248ad

-Sumit

...

thanks

xiaolei

...
...
[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

-- Jerome

wangxiaolei

14 Dec 14 Dec

7:11 a.m.

On 12/13/21 5:04 PM, Sumit Garg wrote:

...

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 13 Dec 2021 at 14:25, wangxiaolei xiaolei.wang@windriver.com wrote:

...
On 12/10/21 5:38 PM, Sumit Garg wrote:

...
[Please note: This e-mail is from an EXTERNAL e-mail address]

On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:

...
-----Original Message----- From: Sumit Garg sumit.garg@linaro.org Sent: Thursday, December 9, 2021 7:41 PM To: Wang, Xiaolei Xiaolei.Wang@windriver.com Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote: > We observed the following kmemleak report: > unreferenced object 0xffff000007904500 (size 128): > comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) > hex dump (first 32 bytes): > 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... > 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... > backtrace: > [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 > [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 > [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 > [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc > [<00000000c35884da>] optee_open_session+0x128/0x1ec > [<000000001748f2ff>] tee_client_open_session+0x28/0x40 > [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 > [<000000003df18bf1>] optee_probe+0x674/0x6cc > [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 > [<000000000c51ce7d>] really_probe+0xe4/0x4d0 > [<000000002f04c865>] driver_probe_device+0x58/0xc0 > [<00000000b485397d>] device_driver_attach+0xc0/0xd0 > [<00000000c835f0df>] __driver_attach+0x84/0x124 > [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 > [<000000001735e8a8>] driver_attach+0x24/0x30 > [<000000006d94b04f>] bus_add_driver+0x104/0x1ec > > This is not a memory leak because we pass the share memory pointer to > secure world and would get it from secure world before releasing it. > How about if it's actually a memory leak caused by the secure world? > An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE. > IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory. Hi sumit,

You mean we need to check whether there is a real memleak, If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os?

Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

Xiaolei,

Can you once test with CFG_PREALLOC_RPC_CACHE=n while compiling optee-os and see if the observed memory leak disappears or not?

-Sumit

Hi sumit

The version I am using has not increased the CFG_PREALLOC_RPC_CACHE

switch, I checked out to the latest version, but because of the need for

additional patches for the imx8 platform, I still have no way to test the

CFG_PREALLOC_RPC_CACHE=n situation

Can you just try to backport this [1] patch to your imx8 optee-os tree and test?

[1] https://github.com/OP-TEE/optee_os/commit/8887663248ad

Hi sumit

I upgraded optee-os from version 3.2.0 to 3.13.0, and the kernel did not detect this problem.

I have not set CFG_PREALLOC_RPC_CACHE to n. This should be a problem that occurs when compatible

with lower versions.

thanks

xiaolei

...

-Sumit

...
thanks

xiaolei

...
...
[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

-- Jerome

Sumit Garg

7:29 a.m.

On Tue, 14 Dec 2021 at 12:41, wangxiaolei xiaolei.wang@windriver.com wrote:

...

On 12/13/21 5:04 PM, Sumit Garg wrote:

...
[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 13 Dec 2021 at 14:25, wangxiaolei xiaolei.wang@windriver.com wrote:

...
On 12/10/21 5:38 PM, Sumit Garg wrote:

...
[Please note: This e-mail is from an EXTERNAL e-mail address]

On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote:

...
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote: > -----Original Message----- > From: Sumit Garg sumit.garg@linaro.org > Sent: Thursday, December 9, 2021 7:41 PM > To: Wang, Xiaolei Xiaolei.Wang@windriver.com > Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org > Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc() > > [Please note: This e-mail is from an EXTERNAL e-mail address] > > On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote: >> We observed the following kmemleak report: >> unreferenced object 0xffff000007904500 (size 128): >> comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) >> hex dump (first 32 bytes): >> 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... >> 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... >> backtrace: >> [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 >> [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 >> [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 >> [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc >> [<00000000c35884da>] optee_open_session+0x128/0x1ec >> [<000000001748f2ff>] tee_client_open_session+0x28/0x40 >> [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 >> [<000000003df18bf1>] optee_probe+0x674/0x6cc >> [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 >> [<000000000c51ce7d>] really_probe+0xe4/0x4d0 >> [<000000002f04c865>] driver_probe_device+0x58/0xc0 >> [<00000000b485397d>] device_driver_attach+0xc0/0xd0 >> [<00000000c835f0df>] __driver_attach+0x84/0x124 >> [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 >> [<000000001735e8a8>] driver_attach+0x24/0x30 >> [<000000006d94b04f>] bus_add_driver+0x104/0x1ec >> >> This is not a memory leak because we pass the share memory pointer to >> secure world and would get it from secure world before releasing it. >> How about if it's actually a memory leak caused by the secure world? >> An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE. >> IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory. > Hi sumit, > > You mean we need to check whether there is a real memleak, > If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free > It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os? Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should keep hold-off shared memory indefinitely.

I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

Xiaolei,

Can you once test with CFG_PREALLOC_RPC_CACHE=n while compiling optee-os and see if the observed memory leak disappears or not?

-Sumit

Hi sumit

The version I am using has not increased the CFG_PREALLOC_RPC_CACHE

switch, I checked out to the latest version, but because of the need for

additional patches for the imx8 platform, I still have no way to test the

CFG_PREALLOC_RPC_CACHE=n situation

Can you just try to backport this [1] patch to your imx8 optee-os tree and test?

[1] https://github.com/OP-TEE/optee_os/commit/8887663248ad

Hi sumit

I upgraded optee-os from version 3.2.0 to 3.13.0, and the kernel did not detect this problem.

Can you check if CFG_TEE_CORE_EMBED_INTERNAL_TESTS is enabled in optee-os version 3.13.0? As we would require atleast one RPC prealloc SHM invocation from OP-TEE for kmemleak to detect the problem.

-Sumit

...

I have not set CFG_PREALLOC_RPC_CACHE to n. This should be a problem that occurs when compatible

with lower versions.

thanks

xiaolei

...
-Sumit

...
thanks

xiaolei

...
...
[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

-- Jerome

wangxiaolei

7:41 a.m.

On 12/14/21 3:29 PM, Sumit Garg wrote:

...

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Tue, 14 Dec 2021 at 12:41, wangxiaolei xiaolei.wang@windriver.com wrote:

...
On 12/13/21 5:04 PM, Sumit Garg wrote:

...
[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, 13 Dec 2021 at 14:25, wangxiaolei xiaolei.wang@windriver.com wrote:

...
On 12/10/21 5:38 PM, Sumit Garg wrote:

...
[Please note: This e-mail is from an EXTERNAL e-mail address]

On Fri, 10 Dec 2021 at 13:40, Jerome Forissier jerome@forissier.org wrote:

...
+CC Jens, Etienne

On 12/10/21 06:00, Sumit Garg wrote: > On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote: >> -----Original Message----- >> From: Sumit Garg sumit.garg@linaro.org >> Sent: Thursday, December 9, 2021 7:41 PM >> To: Wang, Xiaolei Xiaolei.Wang@windriver.com >> Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org >> Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc() >> >> [Please note: This e-mail is from an EXTERNAL e-mail address] >> >> On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote: >>> We observed the following kmemleak report: >>> unreferenced object 0xffff000007904500 (size 128): >>> comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) >>> hex dump (first 32 bytes): >>> 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... >>> 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... >>> backtrace: >>> [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 >>> [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 >>> [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 >>> [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc >>> [<00000000c35884da>] optee_open_session+0x128/0x1ec >>> [<000000001748f2ff>] tee_client_open_session+0x28/0x40 >>> [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 >>> [<000000003df18bf1>] optee_probe+0x674/0x6cc >>> [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 >>> [<000000000c51ce7d>] really_probe+0xe4/0x4d0 >>> [<000000002f04c865>] driver_probe_device+0x58/0xc0 >>> [<00000000b485397d>] device_driver_attach+0xc0/0xd0 >>> [<00000000c835f0df>] __driver_attach+0x84/0x124 >>> [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 >>> [<000000001735e8a8>] driver_attach+0x24/0x30 >>> [<000000006d94b04f>] bus_add_driver+0x104/0x1ec >>> >>> This is not a memory leak because we pass the share memory pointer to >>> secure world and would get it from secure world before releasing it. >>> How about if it's actually a memory leak caused by the secure world? >>> An example being secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and doesn't free it via OPTEE_SMC_RPC_FUNC_FREE. >>> IMO, we need to cross-check optee-os if it's responsible for leaking kernel memory. >> Hi sumit, >> >> You mean we need to check whether there is a real memleak, >> If being secure world just allocate kernel memory via OPTEE_SMC_PRC_FUNC_ALLOC and until the end, there is no free >> It via OPTEE_SMC_PRC_FUNC_FREE, then we should judge it as a memory leak, wo need to judge whether it is caused by secure os? > Yes. AFAICT, optee-os should allocate shared memory to communicate > with tee-supplicant. So once the communication is done, the underlying > shared memory should be freed. I can't think of any scenario where > optee-os should keep hold-off shared memory indefinitely. I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].

Okay, I see the reasoning. So during the OP-TEE driver's lifetime, the RPC shared memory remains allocated. I guess that is done primarily for performance reasons.

But still it doesn't feel appropriate that we term all RPC shm allocations as not leaking memory as we might miss obvious ones.

Xiaolei,

Can you once test with CFG_PREALLOC_RPC_CACHE=n while compiling optee-os and see if the observed memory leak disappears or not?

-Sumit

Hi sumit

The version I am using has not increased the CFG_PREALLOC_RPC_CACHE

switch, I checked out to the latest version, but because of the need for

additional patches for the imx8 platform, I still have no way to test the

CFG_PREALLOC_RPC_CACHE=n situation

Can you just try to backport this [1] patch to your imx8 optee-os tree and test?

[1] https://github.com/OP-TEE/optee_os/commit/8887663248ad

Hi sumit

I upgraded optee-os from version 3.2.0 to 3.13.0, and the kernel did not detect this problem.

Can you check if CFG_TEE_CORE_EMBED_INTERNAL_TESTS is enabled in optee-os version 3.13.0? As we would require atleast one RPC prealloc SHM invocation from OP-TEE for kmemleak to detect the problem.

CFG_TEE_CORE_EMBED_INTERNAL_TESTS is enabled ,I can see the "*.o" files compiled from the core/pta/tests directory

thanks xiaolei

...

-Sumit

...
I have not set CFG_PREALLOC_RPC_CACHE to n. This should be a problem that occurs when compatible

with lower versions.

thanks

xiaolei

...
-Sumit

...
thanks

xiaolei

...
...
[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709 [2] https://github.com/OP-TEE/optee_os/commit/8887663248ad

-- Jerome

Jens Wiklander

15 Dec 15 Dec

12:29 p.m.

On Mon, Dec 06, 2021 at 08:05:33PM +0800, Xiaolei Wang wrote:

...

We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com

drivers/tee/optee/smc_abi.c | 2 ++ 1 file changed, 2 insertions(+)

diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c index 6196d7c3888f..cf2e3293567d 100644 --- a/drivers/tee/optee/smc_abi.c +++ b/drivers/tee/optee/smc_abi.c @@ -23,6 +23,7 @@ #include "optee_private.h" #include "optee_smc.h" #include "optee_rpc_cmd.h" +#include <linux/kmemleak.h> #define CREATE_TRACE_POINTS #include "optee_trace.h" @@ -783,6 +784,7 @@ static void optee_handle_rpc(struct tee_context *ctx, param->a4 = 0; param->a5 = 0; }
kmemleak_not_leak(shm);

Eventually this pointer will be freed below with the call to tee_shm_free(). I assume than once the memory is freed it's not execused from being a leak any longer. Is that correct?

Thanks, Jens

...

break;
case OPTEE_SMC_RPC_FUNC_FREE: shm = reg_pair_to_ptr(param->a1, param->a2); -- 2.25.1

Wang, Xiaolei

1:33 p.m.

在 12/15/2021 8:29 PM, Jens Wiklander 写道:

...

[Please note: This e-mail is from an EXTERNAL e-mail address]

On Mon, Dec 06, 2021 at 08:05:33PM +0800, Xiaolei Wang wrote:

...
We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com

drivers/tee/optee/smc_abi.c | 2 ++ 1 file changed, 2 insertions(+)

diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c index 6196d7c3888f..cf2e3293567d 100644 --- a/drivers/tee/optee/smc_abi.c +++ b/drivers/tee/optee/smc_abi.c @@ -23,6 +23,7 @@ #include "optee_private.h" #include "optee_smc.h" #include "optee_rpc_cmd.h" +#include <linux/kmemleak.h> #define CREATE_TRACE_POINTS #include "optee_trace.h"

@@ -783,6 +784,7 @@ static void optee_handle_rpc(struct tee_context *ctx, param->a4 = 0; param->a5 = 0; }
        kmemleak_not_leak(shm);
Eventually this pointer will be freed below with the call to tee_shm_free(). I assume than once the memory is freed it's not execused from being a leak any longer. Is that correct?

Yes, it is the correct way to release memory through tee_shm_free, but if a memory leak is detected by the kernel before free memory, it is obviously a false alarm.

thanks

xiaolei

...

Thanks, Jens

...

           break;
   case OPTEE_SMC_RPC_FUNC_FREE:
           shm = reg_pair_to_ptr(param->a1, param->a2);

-- 2.25.1

Jens Wiklander

16 Dec 16 Dec

2:55 p.m.

On Mon, Dec 6, 2021 at 1:05 PM Xiaolei Wang xiaolei.wang@windriver.com wrote:

...

We observed the following kmemleak report: unreferenced object 0xffff000007904500 (size 128): comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s) hex dump (first 32 bytes): 00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00 .G......`....... 60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00 `............... backtrace: [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4 [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230 [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0 [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc [<00000000c35884da>] optee_open_session+0x128/0x1ec [<000000001748f2ff>] tee_client_open_session+0x28/0x40 [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0 [<000000003df18bf1>] optee_probe+0x674/0x6cc [<000000003a4a534a>] platform_drv_probe+0x54/0xb0 [<000000000c51ce7d>] really_probe+0xe4/0x4d0 [<000000002f04c865>] driver_probe_device+0x58/0xc0 [<00000000b485397d>] device_driver_attach+0xc0/0xd0 [<00000000c835f0df>] __driver_attach+0x84/0x124 [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0 [<000000001735e8a8>] driver_attach+0x24/0x30 [<000000006d94b04f>] bus_add_driver+0x104/0x1ec

This is not a memory leak because we pass the share memory pointer to secure world and would get it from secure world before releasing it.

Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com

drivers/tee/optee/smc_abi.c | 2 ++ 1 file changed, 2 insertions(+)

I'm picking up this.

Thanks, Jens

...

diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c index 6196d7c3888f..cf2e3293567d 100644 --- a/drivers/tee/optee/smc_abi.c +++ b/drivers/tee/optee/smc_abi.c @@ -23,6 +23,7 @@ #include "optee_private.h" #include "optee_smc.h" #include "optee_rpc_cmd.h" +#include <linux/kmemleak.h> #define CREATE_TRACE_POINTS #include "optee_trace.h"

@@ -783,6 +784,7 @@ static void optee_handle_rpc(struct tee_context *ctx, param->a4 = 0; param->a5 = 0; }
          kmemleak_not_leak(shm);
          break;
  case OPTEE_SMC_RPC_FUNC_FREE:
          shm = reg_pair_to_ptr(param->a1, param->a2);
-- 2.25.1

1543

days inactive

1553

days old

op-tee@lists.trustedfirmware.org

26 comments

participants

tags (0)

participants (9)

Daniel Thompson
Etienne Carriere
Jens Wiklander
Jerome Forissier
Sumit Garg
Wang, Xiaolei
Wang, Xiaolei
wangxiaolei
Xiaolei Wang