v3:
- Tyler inherited the original series from Allen Pais
- New patch to fix memory leaks in OP-TEE's pool_op_alloc()
  + Unrelated to kexec/kdump
- New patch to refuse to load the OP-TEE driver when booting the kdump
  kernel
- Minor comment typo cleanups (s/alter/alert/) in the "optee: fix tee
  out of memory failure seen during kexec reboot" patch, as mentioned in
  v2 feedback
- New patch to clear stale cache entries during initialization to avoid
  crashes when kexec'ing from a buggy kernel, that didn't disable the
  shm cache, to a fixed kernel
- Three new patches to allow drivers to allocate a multi-page dynamic
  shm that's not dma-buf backed but is still fully registered with the
  TEE, ensuring that all driver private shms are unregistered during
  kexec
v2: https://lore.kernel.org/lkml/20210225090610.242623-1-allen.lkml@gmail.com/
v1: https://lore.kernel.org/lkml/20210217092714.121297-1-allen.lkml@gmail.com/
This series fixes several bugs uncovered while exercising the OP-TEE
(Open Portable Trusted Execution Environment), ftpm (firmware TPM), and
tee_bnxt_fw (Broadcom BNXT firmware manager) drivers with kexec and
kdump (emergency kexec) based workflows.
The majority of the problems are caused by missing .shutdown hooks in
the drivers. The .shutdown hooks are used by the normal kexec code path
to let the drivers clean up prior to executing the target kernel. The
.remove hooks, which are already implemented in these drivers, are not
called as part of the kexec code path. This resulted in shared memory
regions, that were cached and/or registered with OP-TEE, not being
cleared/unregistered prior to kexec. The new kernel would then run into
problems when handling the previously cached virtual addresses or trying
to register newly allocated shared memory objects that overlapped with
the previously registered virtual addresses. The TEE didn't receive
notification that the old virtual addresses were no longer meaningful
and that a new kernel, with a new address space, would soon be running.
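The difference between the two hooks can be illustrated with a small userspace model. This is only a sketch: every model_* name is hypothetical, nothing here is kernel or OP-TEE API, and the "TEE" is reduced to a counter of registered regions. It assumes, as described above, that the kexec path invokes .shutdown and never .remove.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel or OP-TEE APIs. */
struct model_tee {
	int registered_shms;	/* regions the secure world still tracks */
	bool cache_enabled;	/* stands in for the shm cache */
};

struct model_driver {
	void (*remove)(struct model_tee *tee);		/* unbind or module unload */
	void (*shutdown)(struct model_tee *tee);	/* reboot and kexec */
};

/* Cleanup shared by both hooks: drop the cache and all registrations. */
static void model_cleanup(struct model_tee *tee)
{
	assert(tee != NULL);
	tee->registered_shms = 0;
	tee->cache_enabled = false;
}

/* Models the kexec path: only .shutdown is invoked, never .remove. */
static void model_kexec(const struct model_driver *drv, struct model_tee *tee)
{
	assert(drv != NULL);
	if (drv->shutdown)
		drv->shutdown(tee);
}

/* Returns how many stale registrations the next kernel would inherit. */
static int model_stale_after_kexec(bool with_shutdown_hook)
{
	struct model_tee tee = { .registered_shms = 3, .cache_enabled = true };
	struct model_driver drv = {
		.remove = model_cleanup,
		.shutdown = with_shutdown_hook ? model_cleanup : NULL,
	};

	model_kexec(&drv, &tee);
	return tee.registered_shms;
}
```

In this model, a driver that only implements .remove leaves all three registrations behind, while one that also wires the same cleanup into .shutdown leaves none.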
However, implementing .shutdown hooks was not enough for supporting
kexec. There was an additional problem caused by the TEE driver's
reliance on the dma-buf subsystem for multi-page shared memory objects
that were registered with the TEE. Shared memory objects backed by a
dma-buf use a different mechanism for reference counting. When the final
reference is released, work is scheduled to be executed to unregister
the shared memory with the TEE, but that work is only completed prior to
the current task returning to userspace. In the case of a kexec
operation, the current task that's calling the driver .shutdown hooks
never returns to userspace prior to the kexec operation so the shared
memory was never unregistered. This eventually caused problems from
overlapping shared memory regions that were registered with the TEE
after several kexec operations. The large 4M contiguous region
allocated by the tee_bnxt_fw driver reliably ran into this issue on the
fourth kexec on a system with 8G of RAM.
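The deferred release described above can be sketched as a userspace model. The model_* names are hypothetical and this is not how the kernel implements task work; it only captures the ordering problem, assuming the release is queued on the final reference drop and runs when the task heads back to userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; names are hypothetical, not kernel APIs. */
struct model_task {
	bool release_queued;	/* final reference drop queued the release */
	bool shm_registered;	/* secure world still knows the buffer */
};

/* Dropping the last reference only queues the release work. */
static void model_put_last_ref(struct model_task *t)
{
	assert(t != NULL);
	t->release_queued = true;
}

/* Queued work runs when the task heads back to userspace. */
static void model_return_to_userspace(struct model_task *t)
{
	assert(t != NULL);
	if (t->release_queued) {
		t->shm_registered = false;	/* unregister with the TEE */
		t->release_queued = false;
	}
}

/* Returns true when the buffer is still registered at the point of kexec. */
static bool model_leaks_across_kexec(bool task_returns_to_userspace)
{
	struct model_task t = { .release_queued = false, .shm_registered = true };

	model_put_last_ref(&t);		/* what a .shutdown hook would trigger */
	if (task_returns_to_userspace)
		model_return_to_userspace(&t);
	return t.shm_registered;
}
```

The task running the .shutdown hooks corresponds to model_leaks_across_kexec(false): the release is queued but never executed, so the registration survives into the next kernel.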
The use of dma-buf makes sense for shared memory that's in use by
userspace but dma-bufs aren't needed for shared memory that will only be
used by the driver. This series separates dma-buf backed shared memory
allocated by the kernel from multi-page shared memory that the kernel
simply needs registered with the TEE for private use.
One other noteworthy change in this series is to completely refuse to
load the OP-TEE driver in the kdump kernel. This is needed because the
secure world may have had all of its threads in suspended state when the
regular kernel crashed. The kdump kernel would then hang during boot
because the OP-TEE driver's .probe function would attempt to use a
secure world thread when they're all in suspended state. Another problem
is that shared memory allocations could fail under the kdump kernel
because the previously registered shared memory was not unregistered (the .shutdown
hook is not called when kexec'ing into the kdump kernel).
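A minimal userspace sketch of the early bail-out follows. The kernel offers is_kdump_kernel() in <linux/crash_dump.h> for this kind of check; whether the patch uses exactly that helper and which error code it returns are assumptions here, and model_is_kdump_kernel() and model_optee_probe() are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's is_kdump_kernel(). */
static bool model_in_kdump_kernel;

static bool model_is_kdump_kernel(void)
{
	return model_in_kdump_kernel;
}

/* Model of a probe function that bails out before touching secure world. */
static int model_optee_probe(int *secure_world_calls)
{
	assert(secure_world_calls != NULL);

	/* Secure world threads may all be suspended: do not call in. */
	if (model_is_kdump_kernel())
		return -ENODEV;	/* assumed error code */

	(*secure_world_calls)++;	/* e.g. querying capabilities */
	return 0;
}
```

The point of checking first is that the probe never issues a call that would need a free secure world thread, so the kdump kernel cannot hang there.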
The first patch in the series fixes potential memory leaks that are not
directly related to kexec or kdump but were noticed during the
development of this series.
Tyler
Allen Pais (2):
optee: fix tee out of memory failure seen during kexec reboot
firmware: tee_bnxt: Release shm, session, and context during kexec
Tyler Hicks (5):
optee: Fix memory leak when failing to register shm pages
optee: Refuse to load the driver under the kdump kernel
optee: Clear stale cache entries during initialization
tee: Support shm registration without dma-buf backing
tpm_ftpm_tee: Free and unregister dynamic shared memory during kexec
drivers/char/tpm/tpm_ftpm_tee.c | 2 +-
drivers/firmware/broadcom/tee_bnxt_fw.c | 11 ++++++-
drivers/tee/optee/call.c | 11 ++++++-
drivers/tee/optee/core.c | 42 ++++++++++++++++++++++++-
drivers/tee/optee/optee_private.h | 2 +-
drivers/tee/optee/shm_pool.c | 17 +++++++---
drivers/tee/tee_shm.c | 11 ++++++-
7 files changed, 85 insertions(+), 11 deletions(-)
--
2.25.1
Hi all,
This adds support for asynchronous notifications from OP-TEE in secure
world to the OP-TEE driver. This allows a design with a top half and bottom
half type of driver where the top half runs in secure interrupt context and
a notification tells normal world to schedule a yielding call to do the
bottom half processing.
An SPI interrupt is used to notify the driver that there are asynchronous
notifications pending.
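The top half / bottom half split can be sketched as a userspace model. All MODEL_* and model_* names and values are hypothetical, and the pending bitmap is an assumption made for illustration, not the series' actual data structure or ABI.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Userspace model only; names and values are hypothetical. */
#define MODEL_NOTIF_DO_BOTTOM_HALF	0u
#define MODEL_NOTIF_MAX			32u

struct model_notif {
	uint32_t pending;		/* one bit per notification value */
	unsigned int bottom_half_runs;	/* yielding calls scheduled so far */
};

/* Top half: secure interrupt context posts a notification value. */
static void model_secure_post(struct model_notif *n, unsigned int value)
{
	assert(n != NULL && value < MODEL_NOTIF_MAX);
	n->pending |= UINT32_C(1) << value;
}

/*
 * Normal world handler for the notification interrupt: drain every
 * pending value and schedule the bottom half when asked to.
 * Returns the number of values handled.
 */
static unsigned int model_irq_handler(struct model_notif *n)
{
	unsigned int handled = 0;

	assert(n != NULL);
	for (unsigned int v = 0; v < MODEL_NOTIF_MAX; v++) {
		uint32_t bit = UINT32_C(1) << v;

		if (!(n->pending & bit))
			continue;
		n->pending &= ~bit;
		handled++;
		if (v == MODEL_NOTIF_DO_BOTTOM_HALF)
			n->bottom_half_runs++;	/* stands in for a yielding call */
	}
	return handled;
}
```

Here the secure side only records that work exists; the processing that may sleep is left to the yielding call made from normal world.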
Thanks,
Jens
Jens Wiklander (4):
tee: fix put order in teedev_close_context()
tee: add tee_dev_open_helper() primitive
optee: separate notification functions
optee: add asynchronous notifications
drivers/tee/optee/Makefile | 1 +
drivers/tee/optee/call.c | 27 ++++
drivers/tee/optee/core.c | 104 ++++++++++----
drivers/tee/optee/notif.c | 226 ++++++++++++++++++++++++++++++
drivers/tee/optee/optee_msg.h | 9 ++
drivers/tee/optee/optee_private.h | 23 +--
drivers/tee/optee/optee_rpc_cmd.h | 31 ++--
drivers/tee/optee/optee_smc.h | 79 ++++++++++-
drivers/tee/optee/rpc.c | 73 ++--------
drivers/tee/tee_core.c | 37 +++--
include/linux/tee_drv.h | 27 ++++
11 files changed, 512 insertions(+), 125 deletions(-)
create mode 100644 drivers/tee/optee/notif.c
--
2.31.1
Hi Everyone,
The documentation page says Zynq 7000 is not supported. I would like to
know why the support was stopped. Was there any specific technical reason?
Regards,
Manish Shakya
When the system is going to hibernate or suspend it might happen
that the tee-supplicant task is frozen first.
In this case a running OP-TEE task might get stuck in the loop using
wait_for_completion_interruptible to wait for response of tee-supplicant.
As a consequence other OP-TEE tasks waiting for the above or a
succeeding stuck OP-TEE task might get stuck as well
- waiting for call queue entry to be completed
- waiting for OPTEE_RPC_WAIT_QUEUE_WAKEUP
This will result in the tasks "refusing to freeze" and
the hibernate or suspend will fail.
OP-TEE issue: https://github.com/OP-TEE/optee_os/issues/4581
- Read back the object
PM: suspend entry (s2idle)
Filesystems sync: 0.000 seconds
Freezing user space processes ...
Freezing of tasks failed after 20.008 seconds (3 tasks refusing to freeze, wq_busy=0):
task:optee_example_s state:R running task stack: 0 pid: 124 ppid: 1 flags:0x00000001
[<807d3e24>] (__schedule) from [<841c4000>] (0x841c4000)
task:optee_example_s state:D stack: 0 pid: 126 ppid: 1 flags:0x00000001
[<807d3e24>] (__schedule) from [<807d41d0>] (schedule+0x60/0x120)
[<807d41d0>] (schedule) from [<807d7ffc>] (schedule_timeout+0x1f4/0x340)
[<807d7ffc>] (schedule_timeout) from [<807d56a0>] (wait_for_completion+0x94/0xfc)
[<807d56a0>] (wait_for_completion) from [<80692134>] (optee_cq_wait_for_completion+0x14/0x60)
[<80692134>] (optee_cq_wait_for_completion) from [<806924dc>] (optee_do_call_with_arg+0x14c/0x154)
[<806924dc>] (optee_do_call_with_arg) from [<80692edc>] (optee_shm_unregister+0x78/0xcc)
[<80692edc>] (optee_shm_unregister) from [<80690a9c>] (tee_shm_release+0x88/0x174)
[<80690a9c>] (tee_shm_release) from [<8057f89c>] (dma_buf_release+0x44/0xb0)
[<8057f89c>] (dma_buf_release) from [<8028e4e8>] (__dentry_kill+0x110/0x17c)
[<8028e4e8>] (__dentry_kill) from [<80276cfc>] (__fput+0xc0/0x234)
[<80276cfc>] (__fput) from [<80140b1c>] (task_work_run+0x90/0xbc)
[<80140b1c>] (task_work_run) from [<8010b1c8>] (do_work_pending+0x4a0/0x5a0)
[<8010b1c8>] (do_work_pending) from [<801000cc>] (slow_work_pending+0xc/0x20)
Exception stack(0x843f5fb0 to 0x843f5ff8)
5fa0: 00000000 7ef63448 fffffffe 00000000
5fc0: 7ef63448 76f163b0 7ef63448 00000006 7ef63448 7ef634e0 7ef63438 00000000
5fe0: 00000006 7ef63400 76e74833 76dff856 800e0130 00000004
task:optee_example_s state:D stack: 0 pid: 128 ppid: 1 flags:0x00000001
[<807d3e24>] (__schedule) from [<807d41d0>] (schedule+0x60/0x120)
[<807d41d0>] (schedule) from [<807d7ffc>] (schedule_timeout+0x1f4/0x340)
[<807d7ffc>] (schedule_timeout) from [<807d56a0>] (wait_for_completion+0x94/0xfc)
[<807d56a0>] (wait_for_completion) from [<8069359c>] (optee_handle_rpc+0x554/0x710)
[<8069359c>] (optee_handle_rpc) from [<806924cc>] (optee_do_call_with_arg+0x13c/0x154)
[<806924cc>] (optee_do_call_with_arg) from [<80692910>] (optee_invoke_func+0x110/0x190)
[<80692910>] (optee_invoke_func) from [<8068fe3c>] (tee_ioctl+0x113c/0x1244)
[<8068fe3c>] (tee_ioctl) from [<802892ec>] (sys_ioctl+0xe0/0xa24)
[<802892ec>] (sys_ioctl) from [<80100060>] (ret_fast_syscall+0x0/0x54)
Exception stack(0x8424ffa8 to 0x8424fff0)
ffa0: 00000000 7eb67584 00000003 8010a403 7eb67438 7eb675fc
ffc0: 00000000 7eb67584 7eb67604 00000036 7eb67448 7eb674e0 7eb67438 00000000
ffe0: 76ef7030 7eb6742c 76ee6469 76e83178
OOM killer enabled.
Restarting tasks ... done.
PM: suspend exit
sh: write error: Device or resource busy
The patch set will switch to interruptible waits and add try_to_freeze to allow the waiting
OP-TEE tasks to be frozen as well.
---
In my humble understanding without these patches OP-TEE tasks have only been frozen in user-space.
With these patches it is possible that OP-TEE tasks are frozen although the OP-TEE command
invocation didn't complete.
I'm unable to judge if there are any OP-TEE implementations relying on the fact that suspend won't
happen while the OP-TEE command invocation didn't complete.
The theoretical alternative would be to prevent that tee-supplicant is frozen first.
I was able to reproduce the issue in OP-TEE QEMU v7 using a modified version of
optee_example_secure_storage (loop around REE FS read, support multi-session).
See https://github.com/OP-TEE/optee_os/issues/4581 for details.
After applying these patches (minor adjustments of the includes) I was no longer able to
reproduce the issues.
In my tests OP-TEE QEMU v7 did suspend and resume without troubles.
I'm not able to test on other devices supporting OP-TEE.
I decided to handle each of the locations the OP-TEE task could get stuck as a separate commit.
The downside is that the above call stack doesn't really fit any of the commits.
Christoph Gellner (3):
tee: optee: Allow to freeze the task waiting for tee-supplicant
tee: optee: Allow to freeze while waiting for call_queue
tee: optee: Allow to freeze while waiting in
OPTEE_RPC_WAIT_QUEUE_SLEEP
drivers/tee/optee/call.c | 8 +++++++-
drivers/tee/optee/rpc.c | 9 ++++++++-
drivers/tee/optee/supp.c | 3 +++
3 files changed, 18 insertions(+), 2 deletions(-)
base-commit: c4681547bcce777daf576925a966ffa824edd09d
--
2.32.0.rc0
Hi,
LOC monthly meeting is planned to take place Thursday May 27@17.00 (UTC+2).
Looking for topics from people. If you have anything you'd like to discuss,
please let me know.
I have a couple of examples of things that could be worth having a chat
about if there are no other proposals.
- OP-TEE and MISRA C
- Rust in OP-TEE
- SDP (ION support removed in Linux kernel will affect OP-TEE's SDP
solution)
Meeting details:
---------------
Date/time: Thursday May 27@17.00 (UTC+2)
https://everytimezone.com/s/83944ce6
Connection details: https://www.trustedfirmware.org/meetings/
Meeting notes: http://bit.ly/loc-notes
Project page: https://www.linaro.org/projects/#LOC
Regards,
Joakim on behalf of the Linaro OP-TEE team
Hello arm-soc maintainers,
Please pull this small OP-TEE driver fix which uses export_uuid() to copy
the client UUID instead of making assumptions about the internal format of
uuid_t.
Thanks,
Jens
The following changes since commit 6efb943b8616ec53a5e444193dccf1af9ad627b5:
Linux 5.13-rc1 (2021-05-09 14:17:44 -0700)
are available in the Git repository at:
git://git.linaro.org/people/jens.wiklander/linux-tee.git tags/optee-fix-for-v5.13
for you to fetch changes up to 673c7aa2436bfc857b92417f3e590a297c586dde:
optee: use export_uuid() to copy client UUID (2021-05-18 07:59:27 +0200)
----------------------------------------------------------------
OP-TEE use export_uuid() to copy UUID
----------------------------------------------------------------
Jens Wiklander (1):
optee: use export_uuid() to copy client UUID
drivers/tee/optee/call.c | 6 ++++--
drivers/tee/optee/optee_msg.h | 6 ++++--
2 files changed, 8 insertions(+), 4 deletions(-)
From: Allen Pais <apais(a)linux.microsoft.com>
The following out of memory errors are seen on kexec reboot
from the optee core.
[ 0.368428] tee_bnxt_fw optee-clnt0: tee_shm_alloc failed
[ 0.368461] tee_bnxt_fw: probe of optee-clnt0 failed with error -22
tee_shm_release() is not invoked on dma shm buffer.
Implement .shutdown() in optee core as well as bnxt firmware driver
to handle the release of the buffers correctly.
More info:
https://github.com/OP-TEE/optee_os/issues/3637
v2:
keep the .shutdown() method simple. [Jens Wiklander]
Allen Pais (2):
optee: fix tee out of memory failure seen during kexec reboot
firmware: tee_bnxt: implement shutdown method to handle kexec reboots
drivers/firmware/broadcom/tee_bnxt_fw.c | 9 +++++++++
drivers/tee/optee/core.c | 20 ++++++++++++++++++++
2 files changed, 29 insertions(+)
--
2.25.1
Hello arm-soc maintainers,
Please pull this AMDTEE driver fix which adds reference counting to
loaded TAs which is needed for proper life cycle management of TAs.
Note that this isn't a usual Arm driver update. This targets AMD instead,
but is part of the TEE subsystem.
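The reference counting idea can be sketched as a small userspace model. The model_* names, the fixed-size table and the integer TA ids are all hypothetical; the real driver's data structures and locking are not shown in this mail and are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; names and sizes are hypothetical. */
#define MODEL_MAX_TAS 4

struct model_ta {
	int id;
	unsigned int refcount;
	bool loaded;
};

static struct model_ta model_tas[MODEL_MAX_TAS];
static unsigned int model_unload_cmds;	/* unload requests sent to the TEE */

static struct model_ta *model_find(int id)
{
	for (size_t i = 0; i < MODEL_MAX_TAS; i++)
		if (model_tas[i].loaded && model_tas[i].id == id)
			return &model_tas[i];
	return NULL;
}

/* Load on first use, otherwise take a reference. Returns 0, or -1 if full. */
static int model_get_ta(int id)
{
	struct model_ta *ta = model_find(id);

	if (ta) {
		ta->refcount++;
		return 0;
	}
	for (size_t i = 0; i < MODEL_MAX_TAS; i++) {
		if (!model_tas[i].loaded) {
			model_tas[i] = (struct model_ta){
				.id = id, .refcount = 1, .loaded = true,
			};
			return 0;
		}
	}
	return -1;
}

/* Drop a reference; unload only when the last user is gone. */
static void model_put_ta(int id)
{
	struct model_ta *ta = model_find(id);

	assert(ta != NULL && ta->refcount > 0);
	if (--ta->refcount == 0) {
		ta->loaded = false;
		model_unload_cmds++;
	}
}
```

With two sessions open on the same TA, closing the first one leaves the TA loaded; only the second close triggers the unload.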
Thanks,
Jens
The following changes since commit 9f4ad9e425a1d3b6a34617b8ea226d56a119a717:
Linux 5.12 (2021-04-25 13:49:08 -0700)
are available in the Git repository at:
git://git.linaro.org/people/jens.wiklander/linux-tee.git tags/amdtee-fixes-for-v5.13
for you to fetch changes up to 9f015b3765bf593b3ed5d3b588e409dc0ffa9f85:
tee: amdtee: unload TA only when its refcount becomes 0 (2021-05-05 13:00:11 +0200)
----------------------------------------------------------------
AMD-TEE reference count loaded TAs
----------------------------------------------------------------
Rijo Thomas (1):
tee: amdtee: unload TA only when its refcount becomes 0
drivers/tee/amdtee/amdtee_private.h | 13 +++++
drivers/tee/amdtee/call.c | 94 +++++++++++++++++++++++++++++++++----
drivers/tee/amdtee/core.c | 15 +++---
3 files changed, 106 insertions(+), 16 deletions(-)