Skip manipulating the refcount in case of slab pages according commit b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page").
Fixes: b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page") Signed-off-by: Marco Felsch m.felsch@pengutronix.de --- v2: - Make use of page variable v1: - https://lore.kernel.org/all/20250325195021.3589797-1-m.felsch@pengutronix.de...
drivers/tee/tee_shm.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c index daf6e5cfd59a..35f0ac359b12 100644 --- a/drivers/tee/tee_shm.c +++ b/drivers/tee/tee_shm.c @@ -19,16 +19,24 @@ static void shm_put_kernel_pages(struct page **pages, size_t page_count) { size_t n;
- for (n = 0; n < page_count; n++) - put_page(pages[n]); + for (n = 0; n < page_count; n++) { + struct page *page = pages[n]; + + if (!PageSlab(page)) + put_page(page); + } }
static void shm_get_kernel_pages(struct page **pages, size_t page_count) { size_t n;
- for (n = 0; n < page_count; n++) - get_page(pages[n]); + for (n = 0; n < page_count; n++) { + struct page *page = pages[n]; + + if (!PageSlab(page)) + get_page(page); + } }
static void release_registered_pages(struct tee_shm *shm)
Hi Marco,
On Tue, Mar 25, 2025 at 9:07 PM Marco Felsch m.felsch@pengutronix.de wrote:
Skip manipulating the refcount in case of slab pages according commit b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page").
Fixes: b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page") Signed-off-by: Marco Felsch m.felsch@pengutronix.de
v2:
- Make use of page variable
v1:
drivers/tee/tee_shm.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c index daf6e5cfd59a..35f0ac359b12 100644 --- a/drivers/tee/tee_shm.c +++ b/drivers/tee/tee_shm.c @@ -19,16 +19,24 @@ static void shm_put_kernel_pages(struct page **pages, size_t page_count) { size_t n;
for (n = 0; n < page_count; n++)
put_page(pages[n]);
for (n = 0; n < page_count; n++) {
struct page *page = pages[n];
if (!PageSlab(page))
put_page(page);
}
}
static void shm_get_kernel_pages(struct page **pages, size_t page_count) { size_t n;
for (n = 0; n < page_count; n++)
get_page(pages[n]);
for (n = 0; n < page_count; n++) {
struct page *page = pages[n];
if (!PageSlab(page))
get_page(page);
b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page") mentions that more page types will have a zero refcount in the longer term. So we'll need to add exception after exception here. Is there a helper function somewhere to get all the pages we need? Or can we do this differently?
Cheers, Jens
}
}
static void release_registered_pages(struct tee_shm *shm)
2.39.5
Hi Jens,
On 25-03-26, Jens Wiklander wrote:
Hi Marco,
On Tue, Mar 25, 2025 at 9:07 PM Marco Felsch m.felsch@pengutronix.de wrote:
Skip manipulating the refcount in case of slab pages according commit b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page").
Fixes: b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page") Signed-off-by: Marco Felsch m.felsch@pengutronix.de
v2:
- Make use of page variable
v1:
drivers/tee/tee_shm.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c index daf6e5cfd59a..35f0ac359b12 100644 --- a/drivers/tee/tee_shm.c +++ b/drivers/tee/tee_shm.c @@ -19,16 +19,24 @@ static void shm_put_kernel_pages(struct page **pages, size_t page_count) { size_t n;
for (n = 0; n < page_count; n++)
put_page(pages[n]);
for (n = 0; n < page_count; n++) {
struct page *page = pages[n];
if (!PageSlab(page))
put_page(page);
}
}
static void shm_get_kernel_pages(struct page **pages, size_t page_count) { size_t n;
for (n = 0; n < page_count; n++)
get_page(pages[n]);
for (n = 0; n < page_count; n++) {
struct page *page = pages[n];
if (!PageSlab(page))
get_page(page);
b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page") mentions that more page types will have a zero refcount in the longer term. So we'll need to add exception after exception here. Is there a helper function somewhere to get all the pages we need? Or can we do this differently?
You're right, but in the long-term perspective the patch also mentions "... stop taking a refcount on the pages that it uses and rely on the caller to hold whatever references are necessary to make the memory stable."
As you mentioned, in the medium term more pages are going to have a zero refcount. I think that once mm is starting to add more zero refcounted page types, they will also add a helper like "PageRefcounted()" or so.
At the moment all users are changed to cover only the slab use-case. Therefore I would keep it as it is right now and change it to the new helper later on.
Regards, Marco
Cheers, Jens
}
}
static void release_registered_pages(struct tee_shm *shm)
2.39.5
On Tue, Mar 25, 2025 at 09:07:39PM +0100, Marco Felsch wrote:
Skip manipulating the refcount in case of slab pages according commit b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page").
This almost certainly isn't right. I know nothing about TEE, but that you are doing this indicates a problem. The hack that we put into networking should not be blindly replicated.
Why are you taking a reference on the pages to begin with? Is it copy and pasted from somewhere else, or was there actual thought put into it?
If it's "prevent the caller from freeing the allocation", well, it never accomplished that with slab allocations. So for callers that do kmalloc (eg setup_mm_hdr() in drivers/firmware/efi/stmm/tee_stmm_efi.c), you have to rely on them not freeing the allocation while the TEE driver has it.
And if that's your API contract, then there's no point in taking refcounts on other kinds of pages either; it's just unnecessary atomic instructions. So the right patch might be something like this:
+++ b/drivers/tee/tee_shm.c @@ -15,29 +15,11 @@ #include <linux/highmem.h> #include "tee_private.h"
-static void shm_put_kernel_pages(struct page **pages, size_t page_count) -{ - size_t n; - - for (n = 0; n < page_count; n++) - put_page(pages[n]); -} - -static void shm_get_kernel_pages(struct page **pages, size_t page_count) -{ - size_t n; - - for (n = 0; n < page_count; n++) - get_page(pages[n]); -} - static void release_registered_pages(struct tee_shm *shm) { if (shm->pages) { if (shm->flags & TEE_SHM_USER_MAPPED) unpin_user_pages(shm->pages, shm->num_pages); - else - shm_put_kernel_pages(shm->pages, shm->num_pages);
kfree(shm->pages); } @@ -321,13 +303,6 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags, goto err_free_shm_pages; }
- /* - * iov_iter_extract_kvec_pages does not get reference on the pages, - * get a reference on them. - */ - if (iov_iter_is_kvec(iter)) - shm_get_kernel_pages(shm->pages, num_pages); - shm->offset = off; shm->size = len; shm->num_pages = num_pages; @@ -341,10 +316,8 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags,
return shm; err_put_shm_pages: - if (!iov_iter_is_kvec(iter)) + if (iter_is_uvec(iter)) unpin_user_pages(shm->pages, shm->num_pages); - else - shm_put_kernel_pages(shm->pages, shm->num_pages); err_free_shm_pages: kfree(shm->pages); err_free_shm:
On 25-03-26, Matthew Wilcox wrote:
On Tue, Mar 25, 2025 at 09:07:39PM +0100, Marco Felsch wrote:
Skip manipulating the refcount in case of slab pages according commit b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page").
This almost certainly isn't right. I know nothing about TEE, but that you are doing this indicates a problem. The hack that we put into networking should not be blindly replicated.
Why are you taking a reference on the pages to begin with? Is it copy and pasted from somewhere else, or was there actual thought put into it?
Not sure, this belongs to the TEE maintainers.
If it's "prevent the caller from freeing the allocation", well, it never accomplished that with slab allocations. So for callers that do kmalloc (eg setup_mm_hdr() in drivers/firmware/efi/stmm/tee_stmm_efi.c), you have to rely on them not freeing the allocation while the TEE driver has it.
And if that's your API contract, then there's no point in taking refcounts on other kinds of pages either; it's just unnecessary atomic instructions. So the right patch might be something like this:
+++ b/drivers/tee/tee_shm.c @@ -15,29 +15,11 @@ #include <linux/highmem.h> #include "tee_private.h"
I had the same diff but didn't went this way since we can't be sure that iov's are always slab backed. As far as I understood IOVs. In 'worst-case' scenario an iov can be backed by different page types too.
Regards, Marco
-static void shm_put_kernel_pages(struct page **pages, size_t page_count) -{
size_t n;
for (n = 0; n < page_count; n++)
put_page(pages[n]);
-}
-static void shm_get_kernel_pages(struct page **pages, size_t page_count) -{
size_t n;
for (n = 0; n < page_count; n++)
get_page(pages[n]);
-}
static void release_registered_pages(struct tee_shm *shm) { if (shm->pages) { if (shm->flags & TEE_SHM_USER_MAPPED) unpin_user_pages(shm->pages, shm->num_pages);
else
shm_put_kernel_pages(shm->pages, shm->num_pages); kfree(shm->pages); }
@@ -321,13 +303,6 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags, goto err_free_shm_pages; }
/*
* iov_iter_extract_kvec_pages does not get reference on the pages,
* get a reference on them.
*/
if (iov_iter_is_kvec(iter))
shm_get_kernel_pages(shm->pages, num_pages);
shm->offset = off; shm->size = len; shm->num_pages = num_pages;
@@ -341,10 +316,8 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags,
return shm;
err_put_shm_pages:
if (!iov_iter_is_kvec(iter))
if (iter_is_uvec(iter)) unpin_user_pages(shm->pages, shm->num_pages);
else
shm_put_kernel_pages(shm->pages, shm->num_pages);
err_free_shm_pages: kfree(shm->pages); err_free_shm:
On Wed, Mar 26, 2025 at 12:07 PM Marco Felsch m.felsch@pengutronix.de wrote:
On 25-03-26, Matthew Wilcox wrote:
On Tue, Mar 25, 2025 at 09:07:39PM +0100, Marco Felsch wrote:
Skip manipulating the refcount in case of slab pages according commit b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page").
This almost certainly isn't right. I know nothing about TEE, but that you are doing this indicates a problem. The hack that we put into networking should not be blindly replicated.
Why are you taking a reference on the pages to begin with? Is it copy and pasted from somewhere else, or was there actual thought put into it?
Not sure, this belongs to the TEE maintainers.
I don't know. We were getting the user pages first, so I assume we just did the same thing when we added support for kernel pages.
If it's "prevent the caller from freeing the allocation", well, it never accomplished that with slab allocations. So for callers that do kmalloc (eg setup_mm_hdr() in drivers/firmware/efi/stmm/tee_stmm_efi.c), you have to rely on them not freeing the allocation while the TEE driver has it.
And if that's your API contract, then there's no point in taking refcounts on other kinds of pages either; it's just unnecessary atomic instructions. So the right patch might be something like this:
+++ b/drivers/tee/tee_shm.c @@ -15,29 +15,11 @@ #include <linux/highmem.h> #include "tee_private.h"
I had the same diff but didn't went this way since we can't be sure that iov's are always slab backed. As far as I understood IOVs. In 'worst-case' scenario an iov can be backed by different page types too.
We're only using kvec's. Briefly, before commit 7bdee4157591 ("tee: Use iov_iter to better support shared buffer registration") we checked with is_vmalloc_addr() || is_kmap_addr(). I like Matthew's suggestion, it's nice to fix problems by deleting code. :-)
Sumit, you know the callers better. What do you think?
Cheers, Jens
Regards, Marco
-static void shm_put_kernel_pages(struct page **pages, size_t page_count) -{
size_t n;
for (n = 0; n < page_count; n++)
put_page(pages[n]);
-}
-static void shm_get_kernel_pages(struct page **pages, size_t page_count) -{
size_t n;
for (n = 0; n < page_count; n++)
get_page(pages[n]);
-}
static void release_registered_pages(struct tee_shm *shm) { if (shm->pages) { if (shm->flags & TEE_SHM_USER_MAPPED) unpin_user_pages(shm->pages, shm->num_pages);
else
shm_put_kernel_pages(shm->pages, shm->num_pages); kfree(shm->pages); }
@@ -321,13 +303,6 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags, goto err_free_shm_pages; }
/*
* iov_iter_extract_kvec_pages does not get reference on the pages,
* get a reference on them.
*/
if (iov_iter_is_kvec(iter))
shm_get_kernel_pages(shm->pages, num_pages);
shm->offset = off; shm->size = len; shm->num_pages = num_pages;
@@ -341,10 +316,8 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags,
return shm;
err_put_shm_pages:
if (!iov_iter_is_kvec(iter))
if (iter_is_uvec(iter)) unpin_user_pages(shm->pages, shm->num_pages);
else
shm_put_kernel_pages(shm->pages, shm->num_pages);
err_free_shm_pages: kfree(shm->pages); err_free_shm:
On Wed, Mar 26, 2025 at 02:47:46PM +0100, Jens Wiklander wrote:
On Wed, Mar 26, 2025 at 12:07 PM Marco Felsch m.felsch@pengutronix.de wrote:
On 25-03-26, Matthew Wilcox wrote:
On Tue, Mar 25, 2025 at 09:07:39PM +0100, Marco Felsch wrote:
Skip manipulating the refcount in case of slab pages according commit b9c0e49abfca ("mm: decline to manipulate the refcount on a slab page").
This almost certainly isn't right. I know nothing about TEE, but that you are doing this indicates a problem. The hack that we put into networking should not be blindly replicated.
Why are you taking a reference on the pages to begin with? Is it copy and pasted from somewhere else, or was there actual thought put into it?
Not sure, this belongs to the TEE maintainers.
I don't know. We were getting the user pages first, so I assume we just did the same thing when we added support for kernel pages.
If it's "prevent the caller from freeing the allocation", well, it never accomplished that with slab allocations. So for callers that do kmalloc (eg setup_mm_hdr() in drivers/firmware/efi/stmm/tee_stmm_efi.c), you have to rely on them not freeing the allocation while the TEE driver has it.
It's not just about the TEE driver but rather if the TEE implementation (a trusted OS) to whom the page is registered with. We don't want the trusted OS to work on registered kernel pages if they gets free somehow in the TEE client driver. Having a reference in the TEE subsystem assured us that won't happen. But if you say slab allocations are still prone the kernel pages getting freed even after refcount then can you suggest how should we handle this better?
As otherwise it can cause very hard to debug problems if trusted OS can manipulate kernel pages that are no longer available.
And if that's your API contract, then there's no point in taking refcounts on other kinds of pages either; it's just unnecessary atomic instructions. So the right patch might be something like this:
+++ b/drivers/tee/tee_shm.c @@ -15,29 +15,11 @@ #include <linux/highmem.h> #include "tee_private.h"
I had the same diff but didn't went this way since we can't be sure that iov's are always slab backed. As far as I understood IOVs. In 'worst-case' scenario an iov can be backed by different page types too.
We're only using kvec's. Briefly, before commit 7bdee4157591 ("tee: Use iov_iter to better support shared buffer registration") we checked with is_vmalloc_addr() || is_kmap_addr(). I like Matthew's suggestion, it's nice to fix problems by deleting code. :-)
Sumit, you know the callers better. What do you think?
If we don't have a sane way to refcont registered kernel pages in TEE subsystem then yeah we have to solely rely on the client drivers to behave properly. Nevertheless, it's still within the kernel boundaries which we can rely upon.
-Sumit
op-tee@lists.trustedfirmware.org