On Thu, 13 Feb 2025 14:46:01 +0530 Sumit Garg <sumit.garg@linaro.org> wrote:
On Thu, 13 Feb 2025 at 14:06, Boris Brezillon <boris.brezillon@collabora.com> wrote:
On Thu, 13 Feb 2025 12:11:52 +0530 Sumit Garg <sumit.garg@linaro.org> wrote:
Hi Boris,
On Thu, 13 Feb 2025 at 01:26, Boris Brezillon <boris.brezillon@collabora.com> wrote:
+Florent, who's working on protected-mode support in Panthor.
Hi Jens,
On Tue, 17 Dec 2024 11:07:36 +0100 Jens Wiklander <jens.wiklander@linaro.org> wrote:
Hi,
This patch set allocates the restricted DMA-bufs via the TEE subsystem.
We're currently working on protected-mode support for Panthor [1], and it looks like your series (and the OP-TEE implementation that goes with it) would allow us to have a fully upstream/open solution for the protected content use case we're trying to support. I need a bit more time to play with the implementation, but this looks very promising (especially the lend rstmem feature, which might help us allocate our FW sections that are supposed to execute code accessing protected content).
Glad to hear that. If you can demonstrate an open-source use case based on this series, it will help land it. We would really love to see support for restricted DMA-buf consumers, be it GPUs, crypto accelerators, media pipelines, etc.
The TEE subsystem handles the DMA-buf allocations since it is the TEE (OP-TEE, AMD-TEE, TS-TEE, or perhaps a future QCOMTEE) which sets up the restrictions for the memory used for the DMA-bufs.
I've added a new IOCTL, TEE_IOC_RSTMEM_ALLOC, to allocate the restricted DMA-bufs. This IOCTL reaches the backend TEE driver, allowing it to choose how to allocate the restricted physical memory.
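For illustration, driving that ioctl from userspace could look roughly like the sketch below. Only the TEE_IOC_RSTMEM_ALLOC name comes from this cover letter; the struct layout, field names, and ioctl number here are guesses for the sketch (the real definitions would live in the uapi header the series adds):

#include <sys/ioctl.h>
#include <linux/types.h>

/*
 * Hypothetical stand-ins: with the series applied, the struct and the
 * TEE_IOC_RSTMEM_ALLOC define would come from <linux/tee.h>.
 */
struct tee_ioctl_rstmem_alloc_data {
        __u64 size;     /* in: requested buffer size */
        __u32 flags;    /* in: allocation flags */
        __u32 use_case; /* in: e.g. secure video playback */
        __s32 id;       /* out: dma-buf fd of the restricted buffer */
};
#define TEE_IOC_RSTMEM_ALLOC \
        _IOWR('t', 99, struct tee_ioctl_rstmem_alloc_data) /* nr/magic are guesses */

int alloc_rstmem(int tee_fd, __u64 size, __u32 use_case)
{
        struct tee_ioctl_rstmem_alloc_data data = {
                .size = size,
                .use_case = use_case,
        };

        if (ioctl(tee_fd, TEE_IOC_RSTMEM_ALLOC, &data) < 0)
                return -1;
        return data.id; /* dma-buf fd to hand to the GPU/codec driver */
}

The returned dma-buf fd can then be attached and mapped by device drivers like any other dma-buf, without the CPU ever getting readable access to the contents.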
I'll probably have more questions soon, but here's one to start: any particular reason you didn't go for a dma-heap to expose restricted buffer allocation to userspace? I see you already have a cdev you can take ioctl()s from, but my understanding was that dma-heap was the standard solution for these device-agnostic/central allocators.
This series actually started with the DMA heap approach [1], but later discussions [2] led us here. To point out specifically:
- DMA heaps rely on DT to discover static restricted-region carve-outs, whereas via the TEE implementation driver (e.g. OP-TEE) those can be discovered dynamically.
Hm, the system heap [1] doesn't rely on any DT information AFAICT.
Yeah, but all the prior vendor-specific secure/restricted DMA heaps relied on DT information.
Right, but there's nothing in the DMA heap provider API forcing that.
The dynamic allocation scheme, where the TEE implementation allocates a chunk of protected memory for us, would have a similar behavior, I guess.
In a dynamic scheme, the allocation will still be from CMA or the system heap, depending on TEE implementation capabilities, but the restriction will be enforced via interaction with the TEE.
Sorry, that's a wording issue. By dynamic allocation I meant the mode where allocations go through the TEE, not the lend rstmem thing. BTW, calling the lend mode dynamic allocation is kinda confusing, because in a sense both modes are dynamic allocation from the user PoV. I get that when the TEE allocates memory it's picking from its fixed address/size pool, hence the name, but when I first read this, I thought the dynamic mode was the other one, and the static mode was the one where you reserve a mem range from the DT, query it from the driver, and pass it to the TEE to restrict access post reservation/static allocation.
- Dynamic allocation of buffers and making them restricted requires vendor-specific driver hooks with DMA heaps, whereas the TEE subsystem abstracts that away, with the underlying TEE implementation (e.g. OP-TEE) managing the dynamic buffer restriction.
Yeah, the lend rstmem feature is clearly something TEE-specific, and I think it's okay to assume the user knows the protection request should go through the TEE subsystem in that case.
Yeah but how will the user discover that?
There's nothing to discover here. It would just be explicitly specified:
- for in-kernel users, it can be a module parameter (or a DT prop if that's deemed acceptable) (sketch below)
- for userspace, it can be an envvar, a config file, or whatever the app/lib uses to get config options
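As a sketch of the module-parameter flavor (the parameter name and its default value are made up for illustration, not anything from the series):

#include <linux/module.h>
#include <linux/moduleparam.h>

/* hypothetical knob: which backend handles buffer protection requests */
static char protect_backend[16] = "tee";
module_param_string(protect_backend, protect_backend,
                    sizeof(protect_backend), 0444);
MODULE_PARM_DESC(protect_backend,
                 "backend used to protect/restrict buffer memory");

The driver would then dispatch its protection requests to whatever backend it was told to use, instead of hardcoding one.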
Rather than that, it's better for the user to directly ask the TEE device to allocate restricted memory, without worrying about how the memory restriction gets enforced.
If the consensus is that restricted/protected memory allocation should always be routed through the TEE, sure, but I had the feeling this wasn't as clear-cut as that. OTOH, using a dma-heap to expose the TEE-SDP implementation provides the same benefits, without making potential future non-TEE-based implementations a pain for users. The dma-heap ioctl() being common to all implementations, it just becomes a configuration matter if we want to change the heap we rely on for protected/restricted buffer allocation. And because heaps have unique/well-known names, users can still default to (or rely solely on) the TEE-SDP implementation if they want.
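To make that concrete, here's a sketch of the heap-by-name flow. The "tee-rstmem" heap name and the RESTRICTED_HEAP envvar are made up for illustration; the dma-heap uapi itself (DMA_HEAP_IOCTL_ALLOC and struct dma_heap_allocation_data) is the standard one:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/dma-heap.h>

int alloc_protected_buf(size_t size)
{
        const char *name = getenv("RESTRICTED_HEAP"); /* hypothetical envvar */
        struct dma_heap_allocation_data data = {
                .len = size,
                .fd_flags = O_RDWR | O_CLOEXEC,
        };
        char path[128];
        int heap_fd, ret;

        if (!name)
                name = "tee-rstmem"; /* hypothetical default heap name */
        snprintf(path, sizeof(path), "/dev/dma_heap/%s", name);

        heap_fd = open(path, O_RDONLY | O_CLOEXEC);
        if (heap_fd < 0)
                return -1;

        ret = ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &data);
        close(heap_fd);
        return ret < 0 ? -1 : (int)data.fd; /* dma-buf fd */
}

Switching from a TEE-backed heap to any other restricted-memory provider is then just a matter of changing the name, with no code change in the consumers.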
- The TEE subsystem already has a well-defined user-space interface for managing shared memory buffers with the TEE, and restricted DMA-bufs will be yet another interface managed along similar lines.
Okay, so the very reason I'm asking about the dma-buf heap interface is that there might be cases where the protected/restricted allocation doesn't go through the TEE (Mediatek has a TEE-free implementation for instance, but I realize vendor implementations are probably not the best selling point :-/).
You can always have a system where memory and peripheral access permissions are set up during boot (or even pre-configured hardware as a special case), prior to booting the kernel. But even that somehow gets configured by a TEE implementation during boot, so calling it a TEE-free implementation seems over-simplified, and it isn't a scalable solution. However, this patchset [1] from Mediatek requires runtime TEE interaction too.
[1] https://lore.kernel.org/linux-arm-kernel/20240515112308.10171-1-yong.wu@medi...
If we expose things as a dma-heap, we have a solution where integrators can pick the dma-heap they think is relevant for protected buffer allocations, without the various drivers (GPU, video codec, ...) having to implement a dispatch function for all possible implementations. The same goes for userspace allocations, where passing a dma-heap name is simpler than supporting different ioctl()s based on the allocation backend.
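On the kernel side, exposing a TEE-backed provider through the dma-heap framework could look roughly like the sketch below. The heap name and the empty allocate() body are illustrative; dma_heap_add(), struct dma_heap_export_info, and struct dma_heap_ops are the real in-kernel API (the exact allocate() prototype has varied a little across kernel versions):

#include <linux/dma-buf.h>
#include <linux/dma-heap.h>
#include <linux/err.h>
#include <linux/module.h>

static struct dma_buf *rstmem_heap_allocate(struct dma_heap *heap,
                                            unsigned long len,
                                            u32 fd_flags, u64 heap_flags)
{
        /*
         * Backend-specific part: allocate the pages (CMA, system, fixed
         * pool, ...), ask the TEE (or whatever enforces the protection)
         * to restrict them, then export a dma-buf. Elided here.
         */
        return ERR_PTR(-EOPNOTSUPP);
}

static const struct dma_heap_ops rstmem_heap_ops = {
        .allocate = rstmem_heap_allocate,
};

static int __init rstmem_heap_init(void)
{
        struct dma_heap_export_info exp_info = {
                .name = "tee-rstmem", /* hypothetical well-known name */
                .ops = &rstmem_heap_ops,
        };

        return PTR_ERR_OR_ZERO(dma_heap_add(&exp_info));
}
module_init(rstmem_heap_init);

Userspace would then allocate from /dev/dma_heap/tee-rstmem by name, without knowing whether a TEE sits behind it.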
There have been several attempts with DMA heaps in the past, which all resulted in very vendor-specific, vertically integrated solutions. But the solution with the TEE subsystem aims to be generic and vendor-agnostic.
Just because all previous protected/restricted dma-heap efforts failed to make it upstream doesn't mean dma-heap is the wrong way of exposing this feature, IMHO.
Regards,
Boris