On Fri, 10 Dec 2021 at 15:08, Etienne Carriere etienne.carriere@linaro.org wrote:
Hello all,
On Fri, 10 Dec 2021 at 09:10, Jerome Forissier jerome@forissier.org wrote:
+CC Jens, Etienne
On 12/10/21 06:00, Sumit Garg wrote:
On Fri, 10 Dec 2021 at 09:42, Wang, Xiaolei Xiaolei.Wang@windriver.com wrote:
-----Original Message-----
From: Sumit Garg sumit.garg@linaro.org
Sent: Thursday, December 9, 2021 7:41 PM
To: Wang, Xiaolei Xiaolei.Wang@windriver.com
Cc: jens.wiklander@linaro.org; op-tee@lists.trustedfirmware.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] optee: Suppress false positive kmemleak report in optee_handle_rpc()
On Mon, 6 Dec 2021 at 17:35, Xiaolei Wang xiaolei.wang@windriver.com wrote:
We observed the following kmemleak report:

unreferenced object 0xffff000007904500 (size 128):
  comm "swapper/0", pid 1, jiffies 4294892671 (age 44.036s)
  hex dump (first 32 bytes):
    00 47 90 07 00 00 ff ff 60 00 c0 ff 00 00 00 00  .G......`.......
    60 00 80 13 00 80 ff ff a0 00 00 00 00 00 00 00  `...............
  backtrace:
    [<000000004c12b1c7>] kmem_cache_alloc+0x1ac/0x2f4
    [<000000005d23eb4f>] tee_shm_alloc+0x78/0x230
    [<00000000794dd22c>] optee_handle_rpc+0x60/0x6f0
    [<00000000d9f7c52d>] optee_do_call_with_arg+0x17c/0x1dc
    [<00000000c35884da>] optee_open_session+0x128/0x1ec
    [<000000001748f2ff>] tee_client_open_session+0x28/0x40
    [<00000000aecb5389>] optee_enumerate_devices+0x84/0x2a0
    [<000000003df18bf1>] optee_probe+0x674/0x6cc
    [<000000003a4a534a>] platform_drv_probe+0x54/0xb0
    [<000000000c51ce7d>] really_probe+0xe4/0x4d0
    [<000000002f04c865>] driver_probe_device+0x58/0xc0
    [<00000000b485397d>] device_driver_attach+0xc0/0xd0
    [<00000000c835f0df>] __driver_attach+0x84/0x124
    [<000000008e5a429c>] bus_for_each_dev+0x70/0xc0
    [<000000001735e8a8>] driver_attach+0x24/0x30
    [<000000006d94b04f>] bus_add_driver+0x104/0x1ec
This is not a memory leak because we pass the shared memory pointer to the secure world and get it back from the secure world before releasing it.
What if it's actually a memory leak caused by the secure world? For example, the secure world could allocate kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and never free it via OPTEE_SMC_RPC_FUNC_FREE.
IMO, we need to cross-check whether optee-os is responsible for leaking kernel memory.
Hi Sumit,
You mean we need to check whether there is a real memory leak? If the secure world just allocates kernel memory via OPTEE_SMC_RPC_FUNC_ALLOC and never frees it via OPTEE_SMC_RPC_FUNC_FREE, then we should judge it as a memory leak, and we need to determine whether it is caused by the secure OS?
Yes. AFAICT, optee-os should allocate shared memory to communicate with tee-supplicant. So once the communication is done, the underlying shared memory should be freed. I can't think of any scenario where optee-os should hold on to shared memory indefinitely.
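The alloc/free pairing discussed above can be illustrated with a small userspace model. This is only a sketch: `struct shm_ledger`, `ledger_handle_rpc()`, `ledger_outstanding()` and the `RPC_FUNC_*` enumerators are hypothetical names invented for this illustration and are not the OP-TEE driver's actual code. The idea is that any buffer still marked in use once the secure world is done with its communication is one it has kept hold of.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model, NOT the real OP-TEE driver structures. */
enum rpc_func { RPC_FUNC_ALLOC, RPC_FUNC_FREE };

#define MAX_SHM 8

struct shm_ledger {
	int in_use[MAX_SHM];	/* 1 while the secure world holds the buffer */
};

/*
 * Handle one RPC request from the (modelled) secure world.
 * For RPC_FUNC_ALLOC the id argument is ignored.
 * Returns the buffer id on success, -1 on error.
 */
int ledger_handle_rpc(struct shm_ledger *l, enum rpc_func func, int id)
{
	assert(l != NULL);

	if (func == RPC_FUNC_ALLOC) {
		for (int i = 0; i < MAX_SHM; i++) {
			if (!l->in_use[i]) {
				l->in_use[i] = 1;
				return i;
			}
		}
		return -1;	/* no free slot left */
	}

	/* RPC_FUNC_FREE: reject unknown ids and double frees */
	if (id < 0 || id >= MAX_SHM || !l->in_use[id])
		return -1;
	l->in_use[id] = 0;
	return id;
}

/* Number of buffers allocated by the secure world and not yet freed. */
int ledger_outstanding(const struct shm_ledger *l)
{
	int n = 0;

	for (int i = 0; i < MAX_SHM; i++)
		n += l->in_use[i];
	return n;
}
```

A non-zero `ledger_outstanding()` after all calls have returned would correspond to the case where the secure world allocated memory and never sent the matching free request.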
I believe it can happen when OP-TEE's CFG_PREALLOC_RPC_CACHE is y. See the config file [1] and the commit which introduced this config [2].
[1] https://github.com/OP-TEE/optee_os/blob/3.15.0/mk/config.mk#L709
[2] https://github.com/OP-TEE/optee_os/commit/8887663248ad
OP-TEE has been caching some shm buffers for a while now, to avoid re-allocating them over and over. It does so for 1 shm buffer per "tee thread" that OP-TEE has provisioned. Each thread can cache a shm reference. Note that the RPCs used from optee to linux/u-boot/ree do not require such a message buffer (IMO).
The main issue is that the shm buffers are allocated per optee thread (the thread context assigned to a client invocation request when entering optee). Therefore, if an optee thread caches a shm buffer, the calling tee session ends up with a shm reference whose refcount is held until the optee thread releases its cached shm reference.
There are ugly side effects. Linux must disable the cache to release all resources. We recently saw that some tee sessions may be left open because of such a held shm refcount. It can lead to some misbehaviour of the TA service (restarting a service, releasing a resource).
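The refcount problem described above can be modelled in a few lines of userspace C. This is a sketch under loud assumptions: `struct session`, `struct tee_thread`, `thread_serve()`, `session_can_release()` and `disable_cache()` are hypothetical names made up for this illustration, and neither optee-os nor the Linux driver is structured this way. It only shows why a buffer cached by a thread keeps the calling session pinned until the cache is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, NOT the real optee-os or Linux driver code. */
#define NUM_THREADS 2

struct session {
	int shm_refs;	/* shm buffers accounted to this session */
};

struct tee_thread {
	struct session *cached_owner;	/* session pinned by the cached buffer */
};

static struct tee_thread threads[NUM_THREADS];

/*
 * A thread serving an invocation caches at most one shm buffer.
 * The buffer is accounted to the session that made the call.
 */
void thread_serve(int tid, struct session *s)
{
	assert(tid >= 0 && tid < NUM_THREADS);

	if (!threads[tid].cached_owner) {
		threads[tid].cached_owner = s;
		s->shm_refs++;
	}
}

/* A session can be fully released only once no shm reference remains. */
bool session_can_release(const struct session *s)
{
	return s->shm_refs == 0;
}

/* Models disabling the cache: every thread drops its cached buffer. */
void disable_cache(void)
{
	for (int i = 0; i < NUM_THREADS; i++) {
		if (threads[i].cached_owner) {
			threads[i].cached_owner->shm_refs--;
			threads[i].cached_owner = NULL;
		}
	}
}
```

In this model a session that has been served once stays unreleasable until `disable_cache()` runs, which mirrors the sessions left open in the report above.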
Config switch CFG_PREALLOC_RPC_CACHE was introduced [pr4896] to disable the feature at boot time. There are means to not use it, or to explicitly enable/disable it at run time (using the optee smc services that already exist for that). Disabling it would maybe be a better default config. Note the discussion thread ending at this comment [issue1918].
Thanks Etienne for the detailed description and references. Although we can set CFG_PREALLOC_RPC_CACHE=n by default, it feels like we would miss a valuable optimization.
How about we just allocate a shared memory page during the OP-TEE driver probe and share it with optee-os to use for RPC arguments? Later it can be freed during OP-TEE driver removal. This would avoid any refcounting of this special memory being associated with TA sessions.
-Sumit
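The probe-time allocation proposed above could look roughly like the following userspace sketch. Everything here is an assumption for illustration: `struct optee_model`, `model_probe()`, `model_rpc_arg()`, `model_remove()` and `RPC_ARG_PAGE_SIZE` are hypothetical, and `calloc()` merely stands in for whatever shared memory allocation the real driver would use. The point is that the page is owned by the driver instance, so no TA session ever holds a reference to it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the proposal, NOT actual OP-TEE driver code. */
#define RPC_ARG_PAGE_SIZE 4096	/* assumed page size for the sketch */

struct optee_model {
	void *rpc_arg_page;	/* owned by the driver, not by any session */
};

/* Allocate the RPC argument page once, at probe time. */
int model_probe(struct optee_model *o)
{
	o->rpc_arg_page = calloc(1, RPC_ARG_PAGE_SIZE);
	return o->rpc_arg_page ? 0 : -1;
}

/* Every RPC reuses the same page; no per-session refcount is taken. */
void *model_rpc_arg(struct optee_model *o)
{
	return o->rpc_arg_page;
}

/* Free the page at driver removal. */
void model_remove(struct optee_model *o)
{
	free(o->rpc_arg_page);
	o->rpc_arg_page = NULL;
}
```

Since the page's lifetime is tied to probe and removal, closing a session never has to wait for the secure world to drop a cached buffer. A single page shared by all threads would need some form of partitioning or locking in practice, which this sketch deliberately leaves out.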
Comments are welcome. I may have missed something in the description (or understanding :).
[pr4896] https://github.com/OP-TEE/optee_os/pull/4896
[issue1918] https://github.com/OP-TEE/optee_os/issues/1918#issuecomment-968747738
Best regards, etienne
-- Jerome