Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the client wait killable so it can be interrupted during shutdown or after a supplicant crash. This changes the original lifetime expectations: the client task can now terminate while the supplicant is still processing its request.
If the client exits first, it removes the request from its queue and kfree()s it, while the request ID remains in supp->idr. A subsequent lookup on the supplicant path then dereferences freed memory, leading to a use-after-free.
Serialise access to the request with supp->mutex:
* Hold supp->mutex in optee_supp_recv() and optee_supp_send() while looking up and touching the request.
* Let optee_supp_thrd_req() notice that the client has terminated and signal optee_supp_send() accordingly.
With these changes the request cannot be freed while the supplicant still has a reference, eliminating the race.
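The three cases that optee_supp_thrd_req() has to tell apart once its killable wait is interrupted can be modelled in a few lines of userspace C. This is only a sketch: kill_model_req, kill_outcome and classify_killed_wait() are hypothetical names rather than driver symbols, and the caller is assumed to hold the equivalent of supp->mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the kill path; not kernel code. */
enum kill_outcome {
	KILL_BEFORE_RECV,    /* still queued: unlink it, report a communication error */
	KILL_AFTER_SEND,     /* supplicant already answered: deliver req->ret */
	KILL_DURING_PROCESS  /* supplicant holds the ID: park a placeholder in the IDR */
};

/* Only the two flags the patch consults on this path. */
struct kill_model_req {
	bool in_queue;
	bool processed;
};

/* Assumes the caller holds the lock that protects both flags. */
enum kill_outcome classify_killed_wait(const struct kill_model_req *req)
{
	if (req->in_queue)
		return KILL_BEFORE_RECV;
	if (req->processed)
		return KILL_AFTER_SEND;
	return KILL_DURING_PROCESS;
}
```

Only the last case needs the INVALID_REQ_PTR placeholder, because it is the only one in which the supplicant still holds an ID that refers to the request.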
Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@oss.qualcomm.com>
---
Changes in v3:
- Introduce processed flag instead of -1 for req->id.
- Update optee_supp_release() as reported by Michael Wu.
- Use mutex instead of guard.
- Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss....

Changes in v2:
- Replace the static variable with a sentinel value.
- Fix the issue with returning the popped request to the supplicant.
- Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss....
---
 drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 86 insertions(+), 36 deletions(-)
diff --git a/drivers/tee/optee/supp.c b/drivers/tee/optee/supp.c index d0f397c90242..0ec66008df19 100644 --- a/drivers/tee/optee/supp.c +++ b/drivers/tee/optee/supp.c @@ -10,7 +10,11 @@ struct optee_supp_req { struct list_head link;
+ int id; + bool in_queue; + bool processed; + u32 func; u32 ret; size_t num_params; @@ -19,6 +23,9 @@ struct optee_supp_req { struct completion c; };
+/* It is temporary request used for invalid pending request in supp->idr. */ +#define INVALID_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-ENOENT)) + void optee_supp_init(struct optee_supp *supp) { memset(supp, 0, sizeof(*supp)); @@ -46,6 +53,10 @@ void optee_supp_release(struct optee_supp *supp) /* Abort all request retrieved by supplicant */ idr_for_each_entry(&supp->idr, req, id) { idr_remove(&supp->idr, id); + /* Skip if request was already marked invalid */ + if (IS_ERR(req)) + continue; + req->ret = TEEC_ERROR_COMMUNICATION; complete(&req->c); } @@ -102,6 +113,7 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params, mutex_lock(&supp->mutex); list_add_tail(&req->link, &supp->reqs); req->in_queue = true; + req->processed = false; mutex_unlock(&supp->mutex);
/* Tell an eventual waiter there's a new request */ @@ -117,21 +129,40 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params, if (wait_for_completion_killable(&req->c)) { mutex_lock(&supp->mutex); if (req->in_queue) { + /* Supplicant has not seen this request yet. */ list_del(&req->link); req->in_queue = false; + + ret = TEEC_ERROR_COMMUNICATION; + } else if (req->processed) { + /* + * Supplicant has processed this request. Ignore the + * kill signal for now and submit the result. + */ + ret = req->ret; + } else { + /* + * Supplicant is in the middle of processing this + * request. Replace req with INVALID_REQ_PTR so that + * the ID remains busy, causing optee_supp_send() to + * fail on the next call to supp_pop_req() with this ID. + */ + idr_replace(&supp->idr, INVALID_REQ_PTR, req->id); + ret = TEEC_ERROR_COMMUNICATION; } + mutex_unlock(&supp->mutex); - req->ret = TEEC_ERROR_COMMUNICATION; + } else { + ret = req->ret; }
- ret = req->ret; kfree(req);
return ret; }
static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp, - int num_params, int *id) + int num_params) { struct optee_supp_req *req;
@@ -153,8 +184,8 @@ static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp, return ERR_PTR(-EINVAL); }
- *id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL); - if (*id < 0) + req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL); + if (req->id < 0) return ERR_PTR(-ENOMEM);
list_del(&req->link); @@ -214,7 +245,6 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params, struct optee *optee = tee_get_drvdata(teedev); struct optee_supp *supp = &optee->supp; struct optee_supp_req *req = NULL; - int id; size_t num_meta; int rc;
@@ -224,15 +254,48 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
while (true) { mutex_lock(&supp->mutex); - req = supp_pop_entry(supp, *num_params - num_meta, &id); - mutex_unlock(&supp->mutex);
- if (req) { - if (IS_ERR(req)) - return PTR_ERR(req); - break; + req = supp_pop_entry(supp, *num_params - num_meta); + if (!req) { + mutex_unlock(&supp->mutex); + goto wait_for_request; + } + + if (IS_ERR(req)) { + rc = PTR_ERR(req); + mutex_unlock(&supp->mutex); + + return rc; }
+ /* + * Process the request while holding the lock, so that + * optee_supp_thrd_req() doesn't pull the request from under us. + */ + + if (num_meta) { + /* + * tee-supplicant support meta parameters -> + * requests can be processed asynchronously. + */ + param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT | + TEE_IOCTL_PARAM_ATTR_META; + param->u.value.a = req->id; + param->u.value.b = 0; + param->u.value.c = 0; + } else { + supp->req_id = req->id; + } + + *func = req->func; + *num_params = req->num_params + num_meta; + memcpy(param + num_meta, req->param, + sizeof(struct tee_param) * req->num_params); + + mutex_unlock(&supp->mutex); + return 0; + +wait_for_request: /* * If we didn't get a request we'll block in * wait_for_completion() to avoid needless spinning. @@ -243,29 +306,10 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params, */ if (wait_for_completion_interruptible(&supp->reqs_c)) return -ERESTARTSYS; - }
- if (num_meta) { - /* - * tee-supplicant support meta parameters -> requsts can be - * processed asynchronously. - */ - param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT | - TEE_IOCTL_PARAM_ATTR_META; - param->u.value.a = id; - param->u.value.b = 0; - param->u.value.c = 0; - } else { - mutex_lock(&supp->mutex); - supp->req_id = id; - mutex_unlock(&supp->mutex); + /* Check for the next request in the queue. */ }
- *func = req->func; - *num_params = req->num_params + num_meta; - memcpy(param + num_meta, req->param, - sizeof(struct tee_param) * req->num_params); - return 0; }
@@ -297,12 +341,18 @@ static struct optee_supp_req *supp_pop_req(struct optee_supp *supp, if (!req) return ERR_PTR(-ENOENT);
+ /* optee_supp_thrd_req() already returned to optee. */ + if (IS_ERR(req)) + goto failed_req; + if ((num_params - nm) != req->num_params) return ERR_PTR(-EINVAL);
+ *num_meta = nm; +failed_req: idr_remove(&supp->idr, id); supp->req_id = -1; - *num_meta = nm; +
return req; } @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
mutex_lock(&supp->mutex); req = supp_pop_req(supp, num_params, param, &num_meta); - mutex_unlock(&supp->mutex); - if (IS_ERR(req)) { + mutex_unlock(&supp->mutex); /* Something is wrong, let supplicant restart. */ return PTR_ERR(req); } @@ -355,9 +404,10 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params, } } req->ret = ret; - + req->processed = true; /* Let the requesting thread continue */ complete(&req->c); + mutex_unlock(&supp->mutex);
return 0; }
--- base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279 change-id: 20250604-fix-use-after-free-8ff1b5d5d774
Best regards,
Hi Amir,
On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
I had forgotten about this. I'd like to prioritize getting this fixed soon. By the way, how did you test this?
diff --git a/drivers/tee/optee/supp.c b/drivers/tee/optee/supp.c index d0f397c90242..0ec66008df19 100644 --- a/drivers/tee/optee/supp.c +++ b/drivers/tee/optee/supp.c @@ -10,7 +10,11 @@ struct optee_supp_req { struct list_head link;
int id;bool in_queue;bool processed;u32 func; u32 ret; size_t num_params;@@ -19,6 +23,9 @@ struct optee_supp_req { struct completion c; };
+/* It is temporary request used for invalid pending request in supp->idr. */ +#define INVALID_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-ENOENT))
void optee_supp_init(struct optee_supp *supp) { memset(supp, 0, sizeof(*supp)); @@ -46,6 +53,10 @@ void optee_supp_release(struct optee_supp *supp) /* Abort all request retrieved by supplicant */ idr_for_each_entry(&supp->idr, req, id) { idr_remove(&supp->idr, id);
/* Skip if request was already marked invalid */if (IS_ERR(req))continue;req->ret = TEEC_ERROR_COMMUNICATION; complete(&req->c); }@@ -102,6 +113,7 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params, mutex_lock(&supp->mutex); list_add_tail(&req->link, &supp->reqs); req->in_queue = true;
req->processed = false; mutex_unlock(&supp->mutex); /* Tell an eventual waiter there's a new request */@@ -117,21 +129,40 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params, if (wait_for_completion_killable(&req->c)) { mutex_lock(&supp->mutex); if (req->in_queue) {
/* Supplicant has not seen this request yet. */ list_del(&req->link); req->in_queue = false;ret = TEEC_ERROR_COMMUNICATION;} else if (req->processed) {/** Supplicant has processed this request. Ignore the* kill signal for now and submit the result.*/ret = req->ret;} else {/** Supplicant is in the middle of processing this* request. Replace req with INVALID_REQ_PTR so that* the ID remains busy, causing optee_supp_send() to* fail on the next call to supp_pop_req() with this ID.*/idr_replace(&supp->idr, INVALID_REQ_PTR, req->id);ret = TEEC_ERROR_COMMUNICATION; }mutex_unlock(&supp->mutex);
req->ret = TEEC_ERROR_COMMUNICATION;
} else {ret = req->ret; }
ret = req->ret; kfree(req); return ret;}
static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp,
int num_params, int *id)
int num_params){ struct optee_supp_req *req;
@@ -153,8 +184,8 @@ static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp, return ERR_PTR(-EINVAL); }
*id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);if (*id < 0)
req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);if (req->id < 0) return ERR_PTR(-ENOMEM);
Since we're now storing the supplicant request ID, wouldn't it make sense to pre-allocate the ID when allocating the request to avoid this error case?
list_del(&req->link);@@ -214,7 +245,6 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params, struct optee *optee = tee_get_drvdata(teedev); struct optee_supp *supp = &optee->supp; struct optee_supp_req *req = NULL;
int id; size_t num_meta; int rc;@@ -224,15 +254,48 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
while (true) { mutex_lock(&supp->mutex);
req = supp_pop_entry(supp, *num_params - num_meta, &id);mutex_unlock(&supp->mutex);if (req) {if (IS_ERR(req))return PTR_ERR(req);break;
req = supp_pop_entry(supp, *num_params - num_meta);if (!req) {mutex_unlock(&supp->mutex);goto wait_for_request;}if (IS_ERR(req)) {rc = PTR_ERR(req);mutex_unlock(&supp->mutex);return rc; }/** Process the request while holding the lock, so that* optee_supp_thrd_req() doesn't pull the request from under us.*/if (num_meta) {/** tee-supplicant support meta parameters ->* requests can be processed asynchronously.*/param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |TEE_IOCTL_PARAM_ATTR_META;param->u.value.a = req->id;param->u.value.b = 0;param->u.value.c = 0;} else {supp->req_id = req->id;}*func = req->func;*num_params = req->num_params + num_meta;memcpy(param + num_meta, req->param,sizeof(struct tee_param) * req->num_params);mutex_unlock(&supp->mutex);return 0;
Do we really need to move this into the loop? The structure of the function becomes a bit unusual and harder to read.
+wait_for_request: /* * If we didn't get a request we'll block in * wait_for_completion() to avoid needless spinning. @@ -243,29 +306,10 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params, */ if (wait_for_completion_interruptible(&supp->reqs_c)) return -ERESTARTSYS;
}if (num_meta) {/** tee-supplicant support meta parameters -> requsts can be* processed asynchronously.*/param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |TEE_IOCTL_PARAM_ATTR_META;param->u.value.a = id;param->u.value.b = 0;param->u.value.c = 0;} else {mutex_lock(&supp->mutex);supp->req_id = id;mutex_unlock(&supp->mutex);
/* Check for the next request in the queue. */ }
*func = req->func;*num_params = req->num_params + num_meta;memcpy(param + num_meta, req->param,sizeof(struct tee_param) * req->num_params);return 0;}
@@ -297,12 +341,18 @@ static struct optee_supp_req *supp_pop_req(struct optee_supp *supp, if (!req) return ERR_PTR(-ENOENT);
/* optee_supp_thrd_req() already returned to optee. */if (IS_ERR(req))goto failed_req;if ((num_params - nm) != req->num_params) return ERR_PTR(-EINVAL);*num_meta = nm;+failed_req: idr_remove(&supp->idr, id); supp->req_id = -1;
*num_meta = nm;
return req;} @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
mutex_lock(&supp->mutex); req = supp_pop_req(supp, num_params, param, &num_meta);
mutex_unlock(&supp->mutex);if (IS_ERR(req)) {
mutex_unlock(&supp->mutex);
We need a way to tell the difference between an id not found and an id removed because of a killed requester. How about storing NULL for revoked requests instead of an err-pointer?
Cheers, Jens
/* Something is wrong, let supplicant restart. */ return PTR_ERR(req); }@@ -355,9 +404,10 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params, } } req->ret = ret;
req->processed = true; /* Let the requesting thread continue */ complete(&req->c);mutex_unlock(&supp->mutex); return 0;}
base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279 change-id: 20250604-fix-use-after-free-8ff1b5d5d774
Best regards,
Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com
Hi Jens,
On 2/2/2026 10:36 PM, Jens Wiklander wrote:
Hi Amir,
On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
I had forgotten about this. I'd like to prioritize getting this fixed soon. By the way, how did you test this?
Thanks for the update. I currently don't have access to the setup required to run the tests myself. My plan is to finalize the design and implementation, then ask Michael Wu to run his use case. Based on his earlier feedback, the patch appears to be working as expected.
https://lore.kernel.org/all/292653ba-3836-00f1-acd4-a28b1c54388c@allwinnerte...
diff --git a/drivers/tee/optee/supp.c b/drivers/tee/optee/supp.c index d0f397c90242..0ec66008df19 100644 --- a/drivers/tee/optee/supp.c +++ b/drivers/tee/optee/supp.c @@ -10,7 +10,11 @@ struct optee_supp_req { struct list_head link;
int id;bool in_queue;bool processed;u32 func; u32 ret; size_t num_params;@@ -19,6 +23,9 @@ struct optee_supp_req { struct completion c; };
+/* It is temporary request used for invalid pending request in supp->idr. */ +#define INVALID_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-ENOENT))
void optee_supp_init(struct optee_supp *supp) { memset(supp, 0, sizeof(*supp)); @@ -46,6 +53,10 @@ void optee_supp_release(struct optee_supp *supp) /* Abort all request retrieved by supplicant */ idr_for_each_entry(&supp->idr, req, id) { idr_remove(&supp->idr, id);
/* Skip if request was already marked invalid */if (IS_ERR(req))continue;req->ret = TEEC_ERROR_COMMUNICATION; complete(&req->c); }@@ -102,6 +113,7 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params, mutex_lock(&supp->mutex); list_add_tail(&req->link, &supp->reqs); req->in_queue = true;
req->processed = false; mutex_unlock(&supp->mutex); /* Tell an eventual waiter there's a new request */@@ -117,21 +129,40 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params, if (wait_for_completion_killable(&req->c)) { mutex_lock(&supp->mutex); if (req->in_queue) {
/* Supplicant has not seen this request yet. */ list_del(&req->link); req->in_queue = false;ret = TEEC_ERROR_COMMUNICATION;} else if (req->processed) {/** Supplicant has processed this request. Ignore the* kill signal for now and submit the result.*/ret = req->ret;} else {/** Supplicant is in the middle of processing this* request. Replace req with INVALID_REQ_PTR so that* the ID remains busy, causing optee_supp_send() to* fail on the next call to supp_pop_req() with this ID.*/idr_replace(&supp->idr, INVALID_REQ_PTR, req->id);ret = TEEC_ERROR_COMMUNICATION; }mutex_unlock(&supp->mutex);
req->ret = TEEC_ERROR_COMMUNICATION;
} else {ret = req->ret; }
ret = req->ret; kfree(req); return ret;}
static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp,
int num_params, int *id)
int num_params){ struct optee_supp_req *req;
@@ -153,8 +184,8 @@ static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp, return ERR_PTR(-EINVAL); }
*id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);if (*id < 0)
req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);if (req->id < 0) return ERR_PTR(-ENOMEM);
Since we're now storing the supplicant request ID, wouldn't it make sense to pre-allocate the ID when allocating the request to avoid this error case?
True, but allocating the ID at this stage has one advantage. If an ID is not available, the request can remain on the request list, allowing the supplicant to retry later when resources become available. If ID allocation fails during request creation, I have no choice but to drop the request and report an error to optee.
list_del(&req->link);@@ -214,7 +245,6 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params, struct optee *optee = tee_get_drvdata(teedev); struct optee_supp *supp = &optee->supp; struct optee_supp_req *req = NULL;
int id; size_t num_meta; int rc;@@ -224,15 +254,48 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
while (true) { mutex_lock(&supp->mutex);
req = supp_pop_entry(supp, *num_params - num_meta, &id);mutex_unlock(&supp->mutex);if (req) {if (IS_ERR(req))return PTR_ERR(req);break;
req = supp_pop_entry(supp, *num_params - num_meta);if (!req) {mutex_unlock(&supp->mutex);goto wait_for_request;}if (IS_ERR(req)) {rc = PTR_ERR(req);mutex_unlock(&supp->mutex);return rc; }/** Process the request while holding the lock, so that* optee_supp_thrd_req() doesn't pull the request from under us.*/if (num_meta) {/** tee-supplicant support meta parameters ->* requests can be processed asynchronously.*/param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |TEE_IOCTL_PARAM_ATTR_META;param->u.value.a = req->id;param->u.value.b = 0;param->u.value.c = 0;} else {supp->req_id = req->id;}*func = req->func;*num_params = req->num_params + num_meta;memcpy(param + num_meta, req->param,sizeof(struct tee_param) * req->num_params);mutex_unlock(&supp->mutex);return 0;
Do we really need to move this into the loop? The structure of the function becomes a bit unusual and harder to read.
Ack. I'll reorganize this function.
+wait_for_request: /* * If we didn't get a request we'll block in * wait_for_completion() to avoid needless spinning. @@ -243,29 +306,10 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params, */ if (wait_for_completion_interruptible(&supp->reqs_c)) return -ERESTARTSYS;
}if (num_meta) {/** tee-supplicant support meta parameters -> requsts can be* processed asynchronously.*/param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |TEE_IOCTL_PARAM_ATTR_META;param->u.value.a = id;param->u.value.b = 0;param->u.value.c = 0;} else {mutex_lock(&supp->mutex);supp->req_id = id;mutex_unlock(&supp->mutex);
/* Check for the next request in the queue. */ }
*func = req->func;*num_params = req->num_params + num_meta;memcpy(param + num_meta, req->param,sizeof(struct tee_param) * req->num_params);return 0;}
@@ -297,12 +341,18 @@ static struct optee_supp_req *supp_pop_req(struct optee_supp *supp, if (!req) return ERR_PTR(-ENOENT);
/* optee_supp_thrd_req() already returned to optee. */if (IS_ERR(req))goto failed_req;if ((num_params - nm) != req->num_params) return ERR_PTR(-EINVAL);*num_meta = nm;+failed_req: idr_remove(&supp->idr, id); supp->req_id = -1;
*num_meta = nm;
return req;} @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
mutex_lock(&supp->mutex); req = supp_pop_req(supp, num_params, param, &num_meta);
mutex_unlock(&supp->mutex);if (IS_ERR(req)) {
mutex_unlock(&supp->mutex);
We need a way to tell the difference between an id not found and an id removed because of a killed requester. How about storing NULL for revoked requests instead of an err-pointer?
Not sure I'm following correctly. Are you expecting supp_pop_req() to return NULL instead of an err-pointer when a request has been revoked?
Best regards, Amir
Cheers, Jens
/* Something is wrong, let supplicant restart. */ return PTR_ERR(req); }@@ -355,9 +404,10 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params, } } req->ret = ret;
req->processed = true; /* Let the requesting thread continue */ complete(&req->c);mutex_unlock(&supp->mutex); return 0;}
base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279 change-id: 20250604-fix-use-after-free-8ff1b5d5d774
Best regards,
Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com
Hi,
On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/2/2026 10:36 PM, Jens Wiklander wrote:
Hi Amir,
On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
I had forgotten about this. I'd like to prioritize getting this fixed soon. By the way, how did you test this?
Thanks for the update. I currently don't have access to the setup required to run the tests myself. My plan is to finalize the design and implementation, then ask Michael Wu to run his use case. Based on his earlier feedback, the patch appears to be working as expected.
https://lore.kernel.org/all/292653ba-3836-00f1-acd4-a28b1c54388c@allwinnerte...
OK
diff --git a/drivers/tee/optee/supp.c b/drivers/tee/optee/supp.c index d0f397c90242..0ec66008df19 100644 --- a/drivers/tee/optee/supp.c +++ b/drivers/tee/optee/supp.c @@ -10,7 +10,11 @@ struct optee_supp_req { struct list_head link;
int id;bool in_queue;bool processed;u32 func; u32 ret; size_t num_params;@@ -19,6 +23,9 @@ struct optee_supp_req { struct completion c; };
+/* It is temporary request used for invalid pending request in supp->idr. */ +#define INVALID_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-ENOENT))
void optee_supp_init(struct optee_supp *supp) { memset(supp, 0, sizeof(*supp)); @@ -46,6 +53,10 @@ void optee_supp_release(struct optee_supp *supp) /* Abort all request retrieved by supplicant */ idr_for_each_entry(&supp->idr, req, id) { idr_remove(&supp->idr, id);
/* Skip if request was already marked invalid */if (IS_ERR(req))continue;req->ret = TEEC_ERROR_COMMUNICATION; complete(&req->c); }@@ -102,6 +113,7 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params, mutex_lock(&supp->mutex); list_add_tail(&req->link, &supp->reqs); req->in_queue = true;
req->processed = false; mutex_unlock(&supp->mutex); /* Tell an eventual waiter there's a new request */@@ -117,21 +129,40 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params, if (wait_for_completion_killable(&req->c)) { mutex_lock(&supp->mutex); if (req->in_queue) {
/* Supplicant has not seen this request yet. */ list_del(&req->link); req->in_queue = false;ret = TEEC_ERROR_COMMUNICATION;} else if (req->processed) {/** Supplicant has processed this request. Ignore the* kill signal for now and submit the result.*/ret = req->ret;} else {/** Supplicant is in the middle of processing this* request. Replace req with INVALID_REQ_PTR so that* the ID remains busy, causing optee_supp_send() to* fail on the next call to supp_pop_req() with this ID.*/idr_replace(&supp->idr, INVALID_REQ_PTR, req->id);ret = TEEC_ERROR_COMMUNICATION; }mutex_unlock(&supp->mutex);
req->ret = TEEC_ERROR_COMMUNICATION;
} else {ret = req->ret; }
ret = req->ret; kfree(req); return ret;}
static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp,
int num_params, int *id)
int num_params){ struct optee_supp_req *req;
@@ -153,8 +184,8 @@ static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp, return ERR_PTR(-EINVAL); }
*id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);if (*id < 0)
req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);if (req->id < 0) return ERR_PTR(-ENOMEM);
Since we're now storing the supplicant request ID, wouldn't it make sense to pre-allocate the ID when allocating the request to avoid this error case?
True, but allocating the ID at this stage has one advantage. If an ID is not available, the request can remain on the request list, allowing the supplicant to retry later when resources become available. If ID allocation fails during request creation, I have no choice but to drop the request and report an error to optee.
We're allocating in the range 1..INT_MAX, and not more than a handful are expected to be active at a time. If we run out of IDs, we have bigger problems.
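To make the trade-off concrete, here is a minimal userspace sketch of creation-time ID allocation. It assumes a tiny fixed table in place of the kernel IDR, and every name in it (sketch_ids, sketch_req_create() and so on) is hypothetical. The point is only that a request then never exists without an ID, so the pop path has no allocation failure left to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_MAX_IDS 4 /* tiny on purpose; the driver allocates from 1..INT_MAX */

/* Hypothetical stand-in for the IDR: slot i is true while ID i is in use. */
struct sketch_ids {
	bool used[SKETCH_MAX_IDS + 1];
};

struct sketch_req {
	int id;
};

/* Returns the lowest free ID >= 1, or -1 when the table is full. */
int sketch_id_alloc(struct sketch_ids *ids)
{
	for (int i = 1; i <= SKETCH_MAX_IDS; i++) {
		if (!ids->used[i]) {
			ids->used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Creation-time allocation: fail here instead of when the request is popped. */
struct sketch_req *sketch_req_create(struct sketch_ids *ids)
{
	struct sketch_req *req = malloc(sizeof(*req));

	if (!req)
		return NULL;

	req->id = sketch_id_alloc(ids);
	if (req->id < 0) {
		free(req);
		return NULL; /* the caller reports the failure to the requester */
	}
	return req;
}

/* Releases the ID together with the request. */
void sketch_req_destroy(struct sketch_ids *ids, struct sketch_req *req)
{
	ids->used[req->id] = false;
	free(req);
}
```

With a realistic ID range the failure branch in sketch_req_create() should be practically unreachable, which is the argument made above.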
list_del(&req->link);@@ -214,7 +245,6 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params, struct optee *optee = tee_get_drvdata(teedev); struct optee_supp *supp = &optee->supp; struct optee_supp_req *req = NULL;
int id; size_t num_meta; int rc;@@ -224,15 +254,48 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
while (true) { mutex_lock(&supp->mutex);
req = supp_pop_entry(supp, *num_params - num_meta, &id);mutex_unlock(&supp->mutex);if (req) {if (IS_ERR(req))return PTR_ERR(req);break;
req = supp_pop_entry(supp, *num_params - num_meta);if (!req) {mutex_unlock(&supp->mutex);goto wait_for_request;}if (IS_ERR(req)) {rc = PTR_ERR(req);mutex_unlock(&supp->mutex);return rc; }/** Process the request while holding the lock, so that* optee_supp_thrd_req() doesn't pull the request from under us.*/if (num_meta) {/** tee-supplicant support meta parameters ->* requests can be processed asynchronously.*/param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |TEE_IOCTL_PARAM_ATTR_META;param->u.value.a = req->id;param->u.value.b = 0;param->u.value.c = 0;} else {supp->req_id = req->id;}*func = req->func;*num_params = req->num_params + num_meta;memcpy(param + num_meta, req->param,sizeof(struct tee_param) * req->num_params);mutex_unlock(&supp->mutex);return 0;
Do we really need to move this into the loop? The structure of the function becomes a bit unusual and harder to read.
Ack. I'll reorganize this function.
+
+wait_for_request:
 		/*
 		 * If we didn't get a request we'll block in
 		 * wait_for_completion() to avoid needless spinning.
@@ -243,29 +306,10 @@ int optee_supp_recv(struct tee_context *ctx, u32 *func, u32 *num_params,
 		 */
 		if (wait_for_completion_interruptible(&supp->reqs_c))
 			return -ERESTARTSYS;
-	}
-
-	if (num_meta) {
-		/*
-		 * tee-supplicant support meta parameters -> requsts can be
-		 * processed asynchronously.
-		 */
-		param->attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT |
-			      TEE_IOCTL_PARAM_ATTR_META;
-		param->u.value.a = id;
-		param->u.value.b = 0;
-		param->u.value.c = 0;
-	} else {
-		mutex_lock(&supp->mutex);
-		supp->req_id = id;
-		mutex_unlock(&supp->mutex);
+
+		/* Check for the next request in the queue. */
 	}
-
-	*func = req->func;
-	*num_params = req->num_params + num_meta;
-	memcpy(param + num_meta, req->param,
-	       sizeof(struct tee_param) * req->num_params);
-
-	return 0;
 }

@@ -297,12 +341,18 @@ static struct optee_supp_req *supp_pop_req(struct optee_supp *supp,
 	if (!req)
 		return ERR_PTR(-ENOENT);

+	/* optee_supp_thrd_req() already returned to optee. */
+	if (IS_ERR(req))
+		goto failed_req;
+
 	if ((num_params - nm) != req->num_params)
 		return ERR_PTR(-EINVAL);

+	*num_meta = nm;
+
+failed_req:
 	idr_remove(&supp->idr, id);
 	supp->req_id = -1;
-	*num_meta = nm;

 	return req;
 }
@@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,

 	mutex_lock(&supp->mutex);
 	req = supp_pop_req(supp, num_params, param, &num_meta);
-	mutex_unlock(&supp->mutex);
-
 	if (IS_ERR(req)) {
+		mutex_unlock(&supp->mutex);

We need a way to tell the difference between an id not found and an id removed because of a killed requester. How about storing NULL for revoked requests instead of an err-pointer?
Not sure I'm following correctly. Are you expecting supp_pop_req() to return NULL instead of an err-pointer when a request has been revoked?
I was looking at it again, and storing an err-pointer as you do in this patch has the advantage that we can tell whether the ID has been revoked or was never supplied. In the latter case, it suggests that the supplicant is doing something wrong and might as well restart in an attempt to recover. So, please keep using an err-pointer as a placeholder, but we must be able to distinguish a revoked request from other errors to make sure that the supplicant doesn't restart due to a revoked request.
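The three cases separated above (ID never issued, ID revoked, ID live) can be modelled outside the kernel. What follows is a minimal userspace sketch under stated assumptions, not the driver code: `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` are local stand-ins that mimic the kernel helpers, the fixed-size `slots` array stands in for supp->idr, and `REVOKED_REQ`, `slot_lookup()`, `classify()` and `slot_revoke()` are hypothetical names. The errno stored in the placeholder (-ECANCELED) is picked arbitrarily for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's err-pointer helpers (assumption). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct req { int id; };

/* Hypothetical placeholder stored when the requester was killed. */
#define REVOKED_REQ ((struct req *)ERR_PTR(-ECANCELED))

enum slot_state { SLOT_UNKNOWN, SLOT_REVOKED, SLOT_LIVE };

#define NUM_SLOTS 8
static struct req *slots[NUM_SLOTS];	/* stands in for supp->idr */

/* NULL means the ID was never handed out, like a lookup of a free ID. */
static struct req *slot_lookup(int id)
{
	if (id < 1 || id >= NUM_SLOTS)
		return NULL;
	return slots[id];
}

static enum slot_state classify(int id)
{
	struct req *r = slot_lookup(id);

	if (!r)
		return SLOT_UNKNOWN;	/* supplicant passed a bogus ID */
	if (IS_ERR(r))
		return SLOT_REVOKED;	/* requester went away mid-flight */
	return SLOT_LIVE;
}

/* Mark an in-flight request as revoked; the ID itself stays allocated. */
static void slot_revoke(int id)
{
	assert(classify(id) == SLOT_LIVE);
	slots[id] = REVOKED_REQ;
}
```

Because the placeholder keeps the slot occupied, a later lookup still finds something for that ID and can tell it apart from an ID that was never supplied.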
Cheers, Jens
Best Regards, Amir
Cheers, Jens
 		/* Something is wrong, let supplicant restart. */
 		return PTR_ERR(req);
 	}
@@ -355,9 +404,10 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
 		}
 	}
 	req->ret = ret;
-
+	req->processed = true;
 	/* Let the requesting thread continue */
 	complete(&req->c);
+	mutex_unlock(&supp->mutex);

 	return 0;
 }

base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279
change-id: 20250604-fix-use-after-free-8ff1b5d5d774
Best regards,
Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com
Hi Jens,
On 2/3/2026 5:59 PM, Jens Wiklander wrote:
Hi,
On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/2/2026 10:36 PM, Jens Wiklander wrote:
Hi Amir,
On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the client wait as killable so it can be interrupted during shutdown or after a supplicant crash. This changes the original lifetime expectations: the client task can now terminate while the supplicant is still processing its request.
If the client exits first it removes the request from its queue and kfree()s it, while the request ID remains in supp->idr. A subsequent lookup on the supplicant path then dereferences freed memory, leading to a use-after-free.
Serialise access to the request with supp->mutex:
- Hold supp->mutex in optee_supp_recv() and optee_supp_send() while looking up and touching the request.
- Let optee_supp_thrd_req() notice that the client has terminated and signal optee_supp_send() accordingly.
With these changes the request cannot be freed while the supplicant still has a reference, eliminating the race.
Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@oss.qualcomm.com>
Changes in v3:
- Introduce processed flag instead of -1 for req->id.
- Update optee_supp_release() as reported by Michael Wu.
- Use mutex instead of guard.
- Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss....
Changes in v2:
- Replace the static variable with a sentinel value.
- Fix the issue with returning the popped request to the supplicant.
- Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss....
 drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 86 insertions(+), 36 deletions(-)
I had forgotten about this. I'd like to prioritize getting this fixed soon. By the way, how did you test this?
Thanks for the update. I currently don't have access to the setup required to run the tests myself. My plan is to finalize the design and implementation, then ask Michael Wu to run his use case. Based on his earlier feedback, the patch appears to be working as expected.
https://lore.kernel.org/all/292653ba-3836-00f1-acd4-a28b1c54388c@allwinnerte...
OK
diff --git a/drivers/tee/optee/supp.c b/drivers/tee/optee/supp.c
index d0f397c90242..0ec66008df19 100644
--- a/drivers/tee/optee/supp.c
+++ b/drivers/tee/optee/supp.c
@@ -10,7 +10,11 @@
 struct optee_supp_req {
 	struct list_head link;

+	int id;
+
 	bool in_queue;
+	bool processed;
+
 	u32 func;
 	u32 ret;
 	size_t num_params;
@@ -19,6 +23,9 @@ struct optee_supp_req {
 	struct completion c;
 };

+/* It is temporary request used for invalid pending request in supp->idr. */
+#define INVALID_REQ_PTR ((struct optee_supp_req *)ERR_PTR(-ENOENT))
+
 void optee_supp_init(struct optee_supp *supp)
 {
 	memset(supp, 0, sizeof(*supp));
@@ -46,6 +53,10 @@ void optee_supp_release(struct optee_supp *supp)
 	/* Abort all request retrieved by supplicant */
 	idr_for_each_entry(&supp->idr, req, id) {
 		idr_remove(&supp->idr, id);
+		/* Skip if request was already marked invalid */
+		if (IS_ERR(req))
+			continue;
+
 		req->ret = TEEC_ERROR_COMMUNICATION;
 		complete(&req->c);
 	}
@@ -102,6 +113,7 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
 	mutex_lock(&supp->mutex);
 	list_add_tail(&req->link, &supp->reqs);
 	req->in_queue = true;
+	req->processed = false;
 	mutex_unlock(&supp->mutex);

 	/* Tell an eventual waiter there's a new request */
@@ -117,21 +129,40 @@ u32 optee_supp_thrd_req(struct tee_context *ctx, u32 func, size_t num_params,
 	if (wait_for_completion_killable(&req->c)) {
 		mutex_lock(&supp->mutex);
 		if (req->in_queue) {
+			/* Supplicant has not seen this request yet. */
 			list_del(&req->link);
 			req->in_queue = false;
+			ret = TEEC_ERROR_COMMUNICATION;
+		} else if (req->processed) {
+			/*
+			 * Supplicant has processed this request. Ignore the
+			 * kill signal for now and submit the result.
+			 */
+			ret = req->ret;
+		} else {
+			/*
+			 * Supplicant is in the middle of processing this
+			 * request. Replace req with INVALID_REQ_PTR so that
+			 * the ID remains busy, causing optee_supp_send() to
+			 * fail on the next call to supp_pop_req() with this ID.
+			 */
+			idr_replace(&supp->idr, INVALID_REQ_PTR, req->id);
+			ret = TEEC_ERROR_COMMUNICATION;
 		}
+
 		mutex_unlock(&supp->mutex);
-		req->ret = TEEC_ERROR_COMMUNICATION;
+	} else {
+		ret = req->ret;
 	}

-	ret = req->ret;
 	kfree(req);

 	return ret;
 }

 static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp,
-					     int num_params, int *id)
+					     int num_params)
 {
 	struct optee_supp_req *req;

@@ -153,8 +184,8 @@ static struct optee_supp_req *supp_pop_entry(struct optee_supp *supp,
 		return ERR_PTR(-EINVAL);
 	}

-	*id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
-	if (*id < 0)
+	req->id = idr_alloc(&supp->idr, req, 1, 0, GFP_KERNEL);
+	if (req->id < 0)
 		return ERR_PTR(-ENOMEM);

Since we're now storing the supplicant request ID, wouldn't it make sense to pre-allocate the ID when allocating the request to avoid this error case?
True, but allocating the ID at this stage has one advantage. If an ID is not available, the request can remain on the request list, allowing the supplicant to retry later when resources become available. If ID allocation fails during request creation, I have no choice but to drop the request and report an error to optee.
We're allocating in the range 1..INT_MAX, and not more than a handful are expected to be active at a time. If we run out of IDs, we have bigger problems.
Ack.
I was looking at it again, and storing an err-pointer as you do in this patch has the advantage that we can tell whether the ID has been revoked or was never supplied. In the latter case, it suggests that the supplicant is doing something wrong and might as well restart in an attempt to recover. So, please keep using an err-pointer as a placeholder, but we must be able to distinguish a revoked request from other errors to make sure that the supplicant doesn't restart due to a revoked request.
Understood. What if I switch the stored err-pointer to EBADF instead of ENOENT (which seems more natural), so it doesn't overlap with other supp_pop_req() error codes and the supplicant can reliably detect it.
Best Regards, Amir
Hi Amir,
On Tue, Feb 3, 2026 at 11:56 PM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Understood. What if I switch the stored err-pointer to EBADF instead of ENOENT (which seems more natural), so it doesn't overlap with other supp_pop_req() error codes and the supplicant can reliably detect it.
Any error returned by TEE_IOC_SUPPL_SEND (or TEE_IOC_SUPPL_RECV) will cause the tee-supplicant to exit. Even if we update it to ignore certain codes, we must also consider the installed base. There's not much tee-supplicant could do with this error, except logging it. But I don't think that is very useful either. Unless the tee-supplicant does anything wrong or if the device isn't working any longer, we shouldn't return an error.
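One reading of the constraint above, sketched in userspace C under stated assumptions: a reply for a revoked request is dropped and reported as success, while a reply for an ID that was never issued still fails so that the supplicant restarts. `send_result()` and the `enum slot_state` values are hypothetical names for illustration, not part of the driver or of the TEE ioctl interface; -ENOENT is the errno the patch already uses for an unknown ID.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum slot_state { SLOT_UNKNOWN, SLOT_REVOKED, SLOT_LIVE };

/*
 * Hypothetical policy helper: map the state of the looked-up request to the
 * value the send path would return. *deliver tells the caller whether the
 * reply should be handed to a waiting requester.
 */
static int send_result(enum slot_state state, bool *deliver)
{
	assert(deliver != NULL);
	*deliver = false;

	switch (state) {
	case SLOT_LIVE:
		*deliver = true;	/* wake the requester with the reply */
		return 0;
	case SLOT_REVOKED:
		return 0;		/* requester is gone: drop silently */
	case SLOT_UNKNOWN:
	default:
		return -ENOENT;		/* supplicant error: let it restart */
	}
}
```

With this mapping an installed tee-supplicant that exits on any ioctl error keeps running when a requester is killed, at the cost of never learning that its reply was discarded.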
Cheers, Jens
Hi Jens,
On 2/4/2026 6:46 PM, Jens Wiklander wrote:
Any error returned by TEE_IOC_SUPPL_SEND (or TEE_IOC_SUPPL_RECV) will cause the tee-supplicant to exit. Even if we update it to ignore certain codes, we must also consider the installed base. There's not much tee-supplicant could do with this error, except logging it. But I don't think that is very useful either. Unless the tee-supplicant does anything wrong or if the device isn't working any longer, we shouldn't return an error.
The direction of data flow in optee_supp_send is from the supplicant to optee, so the only way I can return meaningful information back to the supplicant is through the return value. I suppose I could simply ignore the revoked request and return success, but it might be useful for the supplicant to know about it in case it needs to roll back something.
At this point I'm out of ideas :). Do you have any suggestions on how I can inform the supplicant about a revoked request in optee_supp_send while returning success return value?
Best regards, Amir
Cheers, Jens
Best Regards, Amir
Cheers, Jens
Best Regards, Amir
Cheers, Jens
		/* Something is wrong, let supplicant restart. */
		return PTR_ERR(req);
	}
@@ -355,9 +404,10 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
		}
	}
	req->ret = ret;
	req->processed = true;
	/* Let the requesting thread continue */
	complete(&req->c);
	mutex_unlock(&supp->mutex);
	return 0;
}
base-commit: 3f24e4edcd1b8981c6b448ea2680726dedd87279
change-id: 20250604-fix-use-after-free-8ff1b5d5d774
Best regards,
Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com
Hi Amir,
On Thu, Feb 5, 2026 at 3:13 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/4/2026 6:46 PM, Jens Wiklander wrote:
Hi Amir,
On Tue, Feb 3, 2026 at 11:56 PM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/3/2026 5:59 PM, Jens Wiklander wrote:
Hi,
On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/2/2026 10:36 PM, Jens Wiklander wrote:
Hi Amir,
On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
>
> Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the
> client wait as killable so it can be interrupted during shutdown or
> after a supplicant crash. This changes the original lifetime expectations:
> the client task can now terminate while the supplicant is still processing
> its request.
>
> If the client exits first it removes the request from its queue and
> kfree()s it, while the request ID remains in supp->idr. A subsequent
> lookup on the supplicant path then dereferences freed memory, leading to
> a use-after-free.
>
> Serialise access to the request with supp->mutex:
>
> * Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
>   looking up and touching the request.
> * Let optee_supp_thrd_req() notice that the client has terminated and
>   signal optee_supp_send() accordingly.
>
> With these changes the request cannot be freed while the supplicant still
> has a reference, eliminating the race.
>
> Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
> Signed-off-by: Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com
> ---
> Changes in v3:
> - Introduce processed flag instead of -1 for req->id.
> - Update optee_supp_release() as reported by Michael Wu.
> - Use mutex instead of guard.
> - Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss....
>
> Changes in v2:
> - Replace the static variable with a sentinel value.
> - Fix the issue with returning the popped request to the supplicant.
> - Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss....
> ---
> drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
> 1 file changed, 86 insertions(+), 36 deletions(-)
[snip]
> @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
>
>         mutex_lock(&supp->mutex);
>         req = supp_pop_req(supp, num_params, param, &num_meta);
> -       mutex_unlock(&supp->mutex);
> -
>         if (IS_ERR(req)) {
> +               mutex_unlock(&supp->mutex);
We need a way to tell the difference between an id not found and an id removed because of a killed requester. How about storing NULL for revoked requests instead of an err-pointer?
Not sure I'm following correctly. Are you expecting supp_pop_req() to return NULL instead of an err-pointer when a request has been revoked?
I was looking at it again, and storing an err-pointer as you do in this patch has the advantage that we can tell whether the ID has been revoked or was never supplied. In the latter case, it suggests that the supplicant is doing something wrong and might as well restart in an attempt to recover. So, please keep using an err-pointer as a placeholder, but we must be able to distinguish a revoked request from other errors to make sure that the supplicant doesn't restart due to a revoked request.
Understood. What if I switch the stored err-pointer to EBADF instead of ENOENT (which seems more natural), so it doesn't overlap with other supp_pop_req() error codes and the supplicant can reliably detect it.
Any error returned by TEE_IOC_SUPPL_SEND (or TEE_IOC_SUPPL_RECV) will cause the tee-supplicant to exit. Even if we update it to ignore certain codes, we must also consider the installed base. There's not much tee-supplicant could do with this error, except logging it. But I don't think that is very useful either. Unless the tee-supplicant does anything wrong or if the device isn't working any longer, we shouldn't return an error.
The direction of data flow in optee_supp_send is from the supplicant to optee, so the only way I can return meaningful information back to the supplicant is through the return value. I suppose I could simply ignore the revoked request and return success, but it might be useful for the supplicant to know about it in case it needs to roll back something.
At this point I'm out of ideas :). Do you have any suggestions on how I can inform the supplicant about a revoked request in optee_supp_send while returning success return value?
This became a bit harder than I first thought. At this point, to fix the possible use-after-free, we have two options:
1. Returning an error code: tee-supplicant will exit
2. Returning OK: worst case, tee-supplicant can leak memory
During reboot, neither case is a problem. During normal operation, it's annoying if tee-supplicant exits, but you still need some privileges to kill the client. If we return an error, it's enough to update tee-supplicant to handle that error case, and we're done. The advantage is no added code to the kernel.
I think we should do what you suggested and return an error. This will not happen during normal operation. We'll fix tee-supplicant to handle the return error properly. tee-supplicant doesn't care about what error code it gets. If there's an error in TEE_IOC_SUPPL_SEND, it knows that no one will receive whatever was sent, and cleanup is needed.
Sumit and Jerome, what do you think?
Cheers, Jens
Hi,
On Fri, Feb 6, 2026 at 9:54 AM Jens Wiklander jens.wiklander@linaro.org wrote:
Hi Amir,
On Thu, Feb 5, 2026 at 3:13 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/4/2026 6:46 PM, Jens Wiklander wrote:
Hi Amir,
On Tue, Feb 3, 2026 at 11:56 PM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/3/2026 5:59 PM, Jens Wiklander wrote:
Hi,
On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/2/2026 10:36 PM, Jens Wiklander wrote:
> Hi Amir,
>
> On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi
> amirreza.zarrabi@oss.qualcomm.com wrote:
>>
>> Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the
>> client wait as killable so it can be interrupted during shutdown or
>> after a supplicant crash. This changes the original lifetime expectations:
>> the client task can now terminate while the supplicant is still processing
>> its request.
>>
>> If the client exits first it removes the request from its queue and
>> kfree()s it, while the request ID remains in supp->idr. A subsequent
>> lookup on the supplicant path then dereferences freed memory, leading to
>> a use-after-free.
>>
>> Serialise access to the request with supp->mutex:
>>
>> * Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
>>   looking up and touching the request.
>> * Let optee_supp_thrd_req() notice that the client has terminated and
>>   signal optee_supp_send() accordingly.
>>
>> With these changes the request cannot be freed while the supplicant still
>> has a reference, eliminating the race.
>>
>> Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
>> Signed-off-by: Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com
>> ---
>> Changes in v3:
>> - Introduce processed flag instead of -1 for req->id.
>> - Update optee_supp_release() as reported by Michael Wu.
>> - Use mutex instead of guard.
>> - Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss....
>>
>> Changes in v2:
>> - Replace the static variable with a sentinel value.
>> - Fix the issue with returning the popped request to the supplicant.
>> - Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss....
>> ---
>> drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
>> 1 file changed, 86 insertions(+), 36 deletions(-)
>
[snip]
>> @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
>>
>>         mutex_lock(&supp->mutex);
>>         req = supp_pop_req(supp, num_params, param, &num_meta);
>> -       mutex_unlock(&supp->mutex);
>> -
>>         if (IS_ERR(req)) {
>> +               mutex_unlock(&supp->mutex);
>
> We need a way to tell the difference between an id not found and an id
> removed because of a killed requester.
> How about storing NULL for revoked requests instead of an err-pointer?
>
Not sure I'm following correctly. Are you expecting supp_pop_req() to return NULL instead of an err-pointer when a request has been revoked?
I was looking at it again, and storing an err-pointer as you do in this patch has the advantage that we can tell whether the ID has been revoked or was never supplied. In the latter case, it suggests that the supplicant is doing something wrong and might as well restart in an attempt to recover. So, please keep using an err-pointer as a placeholder, but we must be able to distinguish a revoked request from other errors to make sure that the supplicant doesn't restart due to a revoked request.
Understood. What if I switch the stored err-pointer to EBADF instead of ENOENT (which seems more natural), so it doesn't overlap with other supp_pop_req() error codes and the supplicant can reliably detect it.
Any error returned by TEE_IOC_SUPPL_SEND (or TEE_IOC_SUPPL_RECV) will cause the tee-supplicant to exit. Even if we update it to ignore certain codes, we must also consider the installed base. There's not much tee-supplicant could do with this error, except logging it. But I don't think that is very useful either. Unless the tee-supplicant does anything wrong or if the device isn't working any longer, we shouldn't return an error.
The direction of data flow in optee_supp_send is from the supplicant to optee, so the only way I can return meaningful information back to the supplicant is through the return value. I suppose I could simply ignore the revoked request and return success, but it might be useful for the supplicant to know about it in case it needs to roll back something.
At this point I'm out of ideas :). Do you have any suggestions on how I can inform the supplicant about a revoked request in optee_supp_send while returning success return value?
This became a bit harder than I first thought. At this point, to fix the possible use-after-free, we have two options:
- Returning an error code: tee-supplicant will exit
- Returning OK: worst case, tee-supplicant can leak memory
During reboot, neither case is a problem. During normal operation, it's annoying if tee-supplicant exits, but you still need some privileges to kill the client. If we return an error, it's enough to update tee-supplicant to handle that error case, and we're done. The advantage is no added code to the kernel.
I think we should do what you suggested and return an error. This will not happen during normal operation. We'll fix tee-supplicant to handle the return error properly. tee-supplicant doesn't care about what error code it gets. If there's an error in TEE_IOC_SUPPL_SEND, it knows that no one will receive whatever was sent, and cleanup is needed.
Sumit and Jerome, what do you think?
Sounds good.
Thanks,
Hi Jens,
On 2/6/2026 7:54 PM, Jens Wiklander wrote:
Hi Amir,
On Thu, Feb 5, 2026 at 3:13 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/4/2026 6:46 PM, Jens Wiklander wrote:
Hi Amir,
On Tue, Feb 3, 2026 at 11:56 PM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/3/2026 5:59 PM, Jens Wiklander wrote:
Hi,
On Tue, Feb 3, 2026 at 3:09 AM Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com wrote:
Hi Jens,
On 2/2/2026 10:36 PM, Jens Wiklander wrote:
> Hi Amir,
>
> On Thu, Jan 29, 2026 at 4:22 AM Amirreza Zarrabi
> amirreza.zarrabi@oss.qualcomm.com wrote:
>>
>> Commit 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop") made the
>> client wait as killable so it can be interrupted during shutdown or
>> after a supplicant crash. This changes the original lifetime expectations:
>> the client task can now terminate while the supplicant is still processing
>> its request.
>>
>> If the client exits first it removes the request from its queue and
>> kfree()s it, while the request ID remains in supp->idr. A subsequent
>> lookup on the supplicant path then dereferences freed memory, leading to
>> a use-after-free.
>>
>> Serialise access to the request with supp->mutex:
>>
>> * Hold supp->mutex in optee_supp_recv() and optee_supp_send() while
>>   looking up and touching the request.
>> * Let optee_supp_thrd_req() notice that the client has terminated and
>>   signal optee_supp_send() accordingly.
>>
>> With these changes the request cannot be freed while the supplicant still
>> has a reference, eliminating the race.
>>
>> Fixes: 70b0d6b0a199 ("tee: optee: Fix supplicant wait loop")
>> Signed-off-by: Amirreza Zarrabi amirreza.zarrabi@oss.qualcomm.com
>> ---
>> Changes in v3:
>> - Introduce processed flag instead of -1 for req->id.
>> - Update optee_supp_release() as reported by Michael Wu.
>> - Use mutex instead of guard.
>> - Link to v2: https://lore.kernel.org/r/20250617-fix-use-after-free-v2-1-1fbfafec5917@oss....
>>
>> Changes in v2:
>> - Replace the static variable with a sentinel value.
>> - Fix the issue with returning the popped request to the supplicant.
>> - Link to v1: https://lore.kernel.org/r/20250605-fix-use-after-free-v1-1-a70d23bff248@oss....
>> ---
>> drivers/tee/optee/supp.c | 122 +++++++++++++++++++++++++++++++++--------------
>> 1 file changed, 86 insertions(+), 36 deletions(-)
>
[snip]
>> @@ -328,9 +378,8 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
>>
>>         mutex_lock(&supp->mutex);
>>         req = supp_pop_req(supp, num_params, param, &num_meta);
>> -       mutex_unlock(&supp->mutex);
>> -
>>         if (IS_ERR(req)) {
>> +               mutex_unlock(&supp->mutex);
>
> We need a way to tell the difference between an id not found and an id
> removed because of a killed requester.
> How about storing NULL for revoked requests instead of an err-pointer?
>
Not sure I'm following correctly. Are you expecting supp_pop_req() to return NULL instead of an err-pointer when a request has been revoked?
I was looking at it again, and storing an err-pointer as you do in this patch has the advantage that we can tell whether the ID has been revoked or was never supplied. In the latter case, it suggests that the supplicant is doing something wrong and might as well restart in an attempt to recover. So, please keep using an err-pointer as a placeholder, but we must be able to distinguish a revoked request from other errors to make sure that the supplicant doesn't restart due to a revoked request.
Understood. What if I switch the stored err-pointer to EBADF instead of ENOENT (which seems more natural), so it doesn't overlap with other supp_pop_req() error codes and the supplicant can reliably detect it.
Any error returned by TEE_IOC_SUPPL_SEND (or TEE_IOC_SUPPL_RECV) will cause the tee-supplicant to exit. Even if we update it to ignore certain codes, we must also consider the installed base. There's not much tee-supplicant could do with this error, except logging it. But I don't think that is very useful either. Unless the tee-supplicant does anything wrong or if the device isn't working any longer, we shouldn't return an error.
The direction of data flow in optee_supp_send is from the supplicant to optee, so the only way I can return meaningful information back to the supplicant is through the return value. I suppose I could simply ignore the revoked request and return success, but it might be useful for the supplicant to know about it in case it needs to roll back something.
At this point I'm out of ideas :). Do you have any suggestions on how I can inform the supplicant about a revoked request in optee_supp_send while returning success return value?
This became a bit harder than I first thought. At this point, to fix the possible use-after-free, we have two options:
- Returning an error code: tee-supplicant will exit
- Returning OK: worst case, tee-supplicant can leak memory
During reboot, neither case is a problem. During normal operation, it's annoying if tee-supplicant exits, but you still need some privileges to kill the client. If we return an error, it's enough to update tee-supplicant to handle that error case, and we're done. The advantage is no added code to the kernel.
I think we should do what you suggested and return an error. This will not happen during normal operation. We'll fix tee-supplicant to handle the return error properly. tee-supplicant doesn't care about what error code it gets. If there's an error in TEE_IOC_SUPPL_SEND, it knows that no one will receive whatever was sent, and cleanup is needed.
Sumit and Jerome, what do you think?
Thanks Jens, I'll proceed with v4.
Best Regards, Amir
Cheers, Jens
op-tee@lists.trustedfirmware.org