Hi.
While doing my fuzzing work, I found the following kernel crash, caused by a NULL pointer dereference, in Linux 6.17-rc5.
[ 16.143987] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[ 16.144141] Mem abort info:
[ 16.144215] ESR = 0x0000000096000004
[ 16.144246] EC = 0x25
** replaying previous printk message **
[ 16.144246] EC = 0x25: DABT (current EL), IL = 32 bits
[ 16.144271] SET = 0, FnV = 0
[ 16.144289] EA = 0, S1PTW = 0
[ 16.144308] FSC = 0x04: level 0 translation fault
[ 16.144325] Data abort info:
[ 16.144335] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 16.144346] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 16.144358] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 16.144412] user pgtable: 4k pages, 52-bit VAs, pgdp=0000000048b34b00
[ 16.144432] [0000000000000008] pgd=0800000040bb3403, p4d=0000000000000000
[ 16.144876] Internal error: Oops: 0000000096000004 [#1] SMP
[ 16.146429] Modules linked in:
[ 16.146775] CPU: 0 UID: 0 PID: 148 Comm: xtest Not tainted 6.17.0-rc5 #58 PREEMPT
[ 16.146995] Hardware name: linux,dummy-virt (DT)
[ 16.147181] pstate: 21402005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 16.147330] pc : unpin_user_pages+0x78/0xd0
[ 16.147763] lr : unpin_user_pages+0xa0/0xd0
[ 16.147842] sp : ffff800084403d20
[ 16.147912] x29: ffff800084403d20 x28: fff00000054aa300 x27: 0000000000000000
[ 16.148089] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 16.148235] x23: fff00000004fb5a8 x22: 0000000000000001 x21: 000000000000000d
[ 16.148401] x20: fff0000000b2f9c0 x19: 0000000000000011 x18: 0000000000000001
[ 16.148544] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 16.148659] x14: 0000000000000002 x13: 0000000000000002 x12: 0000000000037d0f
[ 16.148786] x11: fff0000001dad708 x10: 000000000000003f x9 : 0000000000000d1b
[ 16.148925] x8 : 00000000000007e0 x7 : 0000000000000001 x6 : 000000000000000d
[ 16.149039] x5 : ffffffffffffffff x4 : ffffffffffffffff x3 : 000000000000000e
[ 16.149167] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffc1ffc0fd68c0
[ 16.149351] Call trace:
[ 16.149520] unpin_user_pages+0x78/0xd0 (P)
[ 16.149684] tee_shm_put+0x134/0x184
[ 16.149783] tee_shm_fop_release+0x14/0x24
[ 16.149866] __fput+0xcc/0x2dc
[ 16.149925] fput_close_sync+0x40/0x108
[ 16.149991] __arm64_sys_close+0x38/0x7c
[ 16.150058] invoke_syscall+0x48/0x110
[ 16.150127] el0_svc_common.constprop.0+0x40/0xe8
[ 16.150227] do_el0_svc+0x20/0x2c
[ 16.150303] el0_svc+0x34/0xf0
[ 16.150369] el0t_64_sync_handler+0xa0/0xe4
[ 16.150439] el0t_64_sync+0x198/0x19c
[ 16.150629] Code: aa0203e3 eb02027f 54000109 f8627a82 (f9400444)
[ 16.150940] ---[ end trace 0000000000000000 ]---
[ 16.151230] Kernel panic - not syncing: Oops: Fatal exception
[ 16.151466] SMP: stopping secondary CPUs
[ 16.151838] Kernel Offset: disabled
[ 16.151911] CPU features: 0x000000,0000d180,2bbe33e1,957e7f3f
[ 16.152019] Memory Limit: none
[ 16.152284] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
decode_stacktrace.sh shows the following call sequence.
[ 20.554057] unpin_user_pages (./include/linux/page-flags.h:284 mm/gup.c:259 mm/gup.c:420 mm/gup.c:400) (P)
[ 20.554204] tee_shm_put (drivers/tee/tee_shm.c:42 drivers/tee/tee_shm.c:57 drivers/tee/tee_shm.c:587)
[ 20.554291] tee_shm_fop_release (drivers/tee/tee_shm.c:437)
[ 20.554366] __fput (fs/file_table.c:469)
[ 20.554429] fput_close_sync (fs/file_table.c:574)
[ 20.554496] __arm64_sys_close (fs/open.c:1590 fs/open.c:1572 fs/open.c:1572)
[ 20.554565] invoke_syscall (./arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:54)
[ 20.554639] el0_svc_common.constprop.0 (./include/linux/thread_info.h:135 arch/arm64/kernel/syscall.c:140)
[ 20.554719] do_el0_svc (arch/arm64/kernel/syscall.c:152)
[ 20.554782] el0_svc (./arch/arm64/include/asm/irqflags.h:55 ./arch/arm64/include/asm/irqflags.h:76 arch/arm64/kernel/entry-common.c:169 arch/arm64/kernel/entry-common.c:182 arch/arm64/kernel/entry-common.c:880)
[ 20.554842] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:899)
[ 20.554914] el0t_64_sync (arch/arm64/kernel/entry.S:596)
I set up a test environment using qemu_v8.xml in the OP-TEE/manifest repository.
## Test case
This test is based on xtest_tee_test_1004, which is part of xtest in the optee_test repository. For fuzzing, a modified xtest_crypto_test() creates the parameters like this:

    op.params[0].tmpref.buffer = crypt_in;   // crypt_in = malloc(sizeof(uint8_t) * 0xff);
    op.params[0].tmpref.size = input_len;    // 0xffff
    op.params[1].tmpref.buffer = crypt_out;  // crypt_out = malloc(sizeof(uint8_t) * 0xff);
    op.params[1].tmpref.size = input_len;    // 0xffff

When TEEC_InvokeCommand() in libteec was called with the above parameters, it printed the following error and then the Linux kernel crashed.
ERR [152] LT:TEEC_InvokeCommand:730: TEE_IOC_INVOKE failed
## Crash scenario
1. A modified version of xtest_crypto_test() creates the parameters.
2. xtest_crypto_test() calls TEEC_InvokeCommand() in libteec.
3. TEEC_InvokeCommand() calls ioctl(2).
4. ioctl(2) calls tee_ioctl_invoke() in tee_core.c.
5. tee_ioctl_invoke() calls params_from_user().
6. params_from_user() returns -EINVAL because the following if condition is true:

       if ((ip.a + ip.b) < ip.a || (ip.a + ip.b) > shm->size) {
               tee_shm_put(shm);
               return -EINVAL;
       }

7. tee_ioctl_invoke() receives the error from params_from_user() and runs its cleanup path (at the out label).
8. tee_ioctl_invoke() returns -EINVAL.
9. TEEC_InvokeCommand() prints an error log and then calls teec_free_temp_refs() in libteec.
10. teec_free_temp_refs() calls TEEC_ReleaseSharedMemory().
11. TEEC_ReleaseSharedMemory() calls close(2).
12. tee_shm_put() in tee_shm.c is called.
13. tee_shm_put() calls tee_shm_release().
14. tee_shm_release() calls release_registered_pages().
15. release_registered_pages() calls unpin_user_pages().
16. A NULL pointer dereference occurs in unpin_user_pages().
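The condition in step 6 is an overflow-safe range check: it rejects the request when offset plus length wraps around, or when the end of the range lies past the end of the shared memory object. A minimal user-space sketch of that pattern follows. The name range_fits() and its parameters are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel: returns true when the
 * byte range [offs, offs + len) lies entirely inside a buffer of
 * shm_size bytes.
 */
bool range_fits(uint64_t offs, uint64_t len, size_t shm_size)
{
	uint64_t end = offs + len;	/* unsigned addition may wrap around */

	if (end < offs)			/* wrapped: offs + len overflowed */
		return false;
	return end <= shm_size;
}

/* Usage example: the three cases the check distinguishes. */
void range_fits_examples(void)
{
	assert(range_fits(0, 0xd690, 0xd690));		/* exact fit */
	assert(!range_fits(0x10, 0xd690, 0xd690));	/* runs past the end */
	assert(!range_fits(UINT64_MAX, 2, 0xd690));	/* addition wraps */
}
```

The wrap test has to come first, because a wrapped sum can be small enough to pass the size comparison on its own.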
## Debugging
I added the following debug logs to tee_shm_release():

	} else if (shm->flags & TEE_SHM_DYNAMIC) {
		int rc = teedev->desc->ops->shm_unregister(shm->ctx, shm);
		size_t i;

		if (rc)
			dev_err(teedev->dev.parent, "unregister shm %p failed: %d", shm, rc);

		pr_info("%s:%d: shm->num_pages: 0x%lx, shm->size: 0x%lx\n", __func__, __LINE__, shm->num_pages, shm->size);
		for (i = 0; i < shm->num_pages; i++) {
			if (!shm->pages[i]) {
				pr_info("%s:%d: shm->pages[%ld] is NULL", __func__, __LINE__, i);
			}
		}
		release_registered_pages(shm);
	}

It showed the following logs.

[ 21.350894] tee_shm_release:57: shm->num_pages: 0x11, shm->size: 0xd690
[ 21.350977] tee_shm_release:60: shm->pages[14] is NULL
[ 21.351003] tee_shm_release:60: shm->pages[15] is NULL
[ 21.351012] tee_shm_release:60: shm->pages[16] is NULL

According to the above logs, shm->num_pages is 17, but the last three entries of the pages array (indices 14 to 16) are NULL, which causes the NULL pointer dereference.

I have confirmed that the following change prevents the NULL pointer dereference.
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
index 2a7d253d9c55..c5d39a0efbdb 100644
--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -34,9 +34,16 @@ static void shm_get_kernel_pages(struct page **pages, size_t page_count)
 static void release_registered_pages(struct tee_shm *shm)
 {
 	if (shm->pages) {
-		if (shm->flags & TEE_SHM_USER_MAPPED)
-			unpin_user_pages(shm->pages, shm->num_pages);
-		else
+		if (shm->flags & TEE_SHM_USER_MAPPED) {
+			size_t num_pages = 0;
+			size_t i;
+			for (i = 0; i < shm->num_pages; i++, num_pages++) {
+				if (!shm->pages[i])
+					break;
+			}
+
+			unpin_user_pages(shm->pages, num_pages);
+		} else
 			shm_put_kernel_pages(shm->pages, shm->num_pages);

 		kfree(shm->pages);
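The loop in the change above counts the array entries that precede the first NULL and passes only that count on. A small user-space sketch of the same counting idea is shown below. count_leading_nonnull() is a hypothetical stand-in that works on plain pointers, not on struct page, and it is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical user-space stand-in for the loop in the change above:
 * returns how many entries precede the first NULL in pages[].
 */
size_t count_leading_nonnull(void *const *pages, size_t num_pages)
{
	size_t i;

	assert(pages != NULL);
	for (i = 0; i < num_pages; i++) {
		if (!pages[i])
			break;
	}
	return i;
}
```

With the array from the debug log (17 slots, of which indices 14 to 16 are NULL), this returns 14, so only the entries that hold a page would be handed on.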
Regards,
Hi Masami,
Thanks for the report. Does the problem also occur with earlier versions of the kernel? I have more comments inline below.
On Sat, Sep 13, 2025 at 8:07 AM Masami Ichikawa masami256@gmail.com wrote:
## Test case
This test is based on xtest_tee_test_1004 included in xtest in the optee_test repository. For fuzzing, the following data is created using xtest_crypto_test(). It creates parameters like this.
    op.params[0].tmpref.buffer = crypt_in;   // crypt_in = malloc(sizeof(uint8_t) * 0xff);
    op.params[0].tmpref.size = input_len;    // 0xffff
    op.params[1].tmpref.buffer = crypt_out;  // crypt_out = malloc(sizeof(uint8_t) * 0xff);
    op.params[1].tmpref.size = input_len;    // 0xffff

So you have a small allocation and then report a larger one in tmpref.size. This is an out-of-bounds reference. Memory might follow the buffer, or it might be unmapped, depending on the state of the heap. However, this alone isn't enough for params_from_user() to return -EINVAL as described below.

I've tried making changes in teec_pre_process_tmpref() so that it can trigger the condition in params_from_user() below. I think I succeeded in that, but I'm unable to reproduce the crash.
Can you double-check to see if something is missing to reproduce the problem?
When TEEC_InvokeCommand() in libteec was called with the above parameters, it printed the following error and then the Linux kernel crashed.
ERR [152] LT:TEEC_InvokeCommand:730: TEE_IOC_INVOKE failed
## Crash scenario
1. A modified version of xtest_crypto_test() creates the parameters.
2. xtest_crypto_test() calls TEEC_InvokeCommand() in libteec.
3. TEEC_InvokeCommand() calls ioctl(2).
4. ioctl(2) calls tee_ioctl_invoke() in tee_core.c.
5. tee_ioctl_invoke() calls params_from_user().
6. params_from_user() returns -EINVAL because the following if condition is true:

       if ((ip.a + ip.b) < ip.a || (ip.a + ip.b) > shm->size) {
               tee_shm_put(shm);
               return -EINVAL;
       }
The tee_shm_put() is supposed to balance with the earlier tee_shm_get_from_id(), so this in itself should be OK.
7. tee_ioctl_invoke() receives the error from params_from_user() and runs its cleanup path (at the out label).
8. tee_ioctl_invoke() returns -EINVAL.
9. TEEC_InvokeCommand() prints an error log and then calls teec_free_temp_refs() in libteec.
10. teec_free_temp_refs() calls TEEC_ReleaseSharedMemory().
11. TEEC_ReleaseSharedMemory() calls close(2).
12. tee_shm_put() in tee_shm.c is called.
13. tee_shm_put() calls tee_shm_release().
14. tee_shm_release() calls release_registered_pages().
15. release_registered_pages() calls unpin_user_pages().
16. A NULL pointer dereference occurs in unpin_user_pages().
## Debugging
I added the following debug logs to tee_shm_release():

	} else if (shm->flags & TEE_SHM_DYNAMIC) {
		int rc = teedev->desc->ops->shm_unregister(shm->ctx, shm);
		size_t i;

		if (rc)
			dev_err(teedev->dev.parent, "unregister shm %p failed: %d", shm, rc);

		pr_info("%s:%d: shm->num_pages: 0x%lx, shm->size: 0x%lx\n", __func__, __LINE__, shm->num_pages, shm->size);
		for (i = 0; i < shm->num_pages; i++) {
			if (!shm->pages[i]) {
				pr_info("%s:%d: shm->pages[%ld] is NULL", __func__, __LINE__, i);
			}
		}
		release_registered_pages(shm);
	}

It showed the following logs.

[ 21.350894] tee_shm_release:57: shm->num_pages: 0x11, shm->size: 0xd690
[ 21.350977] tee_shm_release:60: shm->pages[14] is NULL
[ 21.351003] tee_shm_release:60: shm->pages[15] is NULL
[ 21.351012] tee_shm_release:60: shm->pages[16] is NULL

According to the above logs, shm->num_pages is 17, but the last three entries of the pages array (indices 14 to 16) are NULL, which causes the NULL pointer dereference.

This doesn't make sense: shm->size == 0xd690 should require 14 or 15 pages, depending on shm->offset. The tee_shm is in an inconsistent state, besides the fact that we have NULL pointers in shm->pages[].
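The "14 or 15 pages" figure can be checked with a short calculation. The sketch below assumes 4 KiB pages, as reported in the oops ("user pgtable: 4k pages"). pages_spanned() and SKETCH_PAGE_SIZE are illustrative names, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE ((size_t)4096)	/* assumption: 4 KiB pages, as in the oops */

/*
 * Hypothetical helper: number of pages touched by `size` bytes that
 * start `offset` bytes into the first page.
 */
size_t pages_spanned(size_t offset, size_t size)
{
	assert(offset < SKETCH_PAGE_SIZE);
	/* Round (offset + size) up to a whole number of pages. */
	return (offset + size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For size 0xd690 this gives 14 pages at offset 0 and 15 pages at the largest in-page offset (0xfff), so no offset yields the 17 pages recorded in shm->num_pages.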
I have confirmed that I can prevent NULL pointer dereference by making the following changes.
The patch below works around the problem rather than fixing it. The problem occurred earlier when the tee_shm became inconsistent.
Cheers, Jens
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
index 2a7d253d9c55..c5d39a0efbdb 100644
--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -34,9 +34,16 @@ static void shm_get_kernel_pages(struct page **pages, size_t page_count)
 static void release_registered_pages(struct tee_shm *shm)
 {
 	if (shm->pages) {
-		if (shm->flags & TEE_SHM_USER_MAPPED)
-			unpin_user_pages(shm->pages, shm->num_pages);
-		else
+		if (shm->flags & TEE_SHM_USER_MAPPED) {
+			size_t num_pages = 0;
+			size_t i;
+			for (i = 0; i < shm->num_pages; i++, num_pages++) {
+				if (!shm->pages[i])
+					break;
+			}
+
+			unpin_user_pages(shm->pages, num_pages);
+		} else
 			shm_put_kernel_pages(shm->pages, shm->num_pages);

 		kfree(shm->pages);

Regards,
-- Masami Ichikawa
Hi Jens,
Thank you for taking the time to check my report.
On Tue, Sep 16, 2025 at 4:46 PM Jens Wiklander jens.wiklander@linaro.org wrote:
Hi Masami,
Thanks for the report. Does the problem also occur with earlier versions of the kernel? I have more comments inline below.
On Sat, Sep 13, 2025 at 8:07 AM Masami Ichikawa masami256@gmail.com wrote:
## Test case
This test is based on xtest_tee_test_1004 included in xtest in the optee_test repository. For fuzzing, the following data is created using xtest_crypto_test(). It creates parameters like this.
    op.params[0].tmpref.buffer = crypt_in;   // crypt_in = malloc(sizeof(uint8_t) * 0xff);
    op.params[0].tmpref.size = input_len;    // 0xffff
    op.params[1].tmpref.buffer = crypt_out;  // crypt_out = malloc(sizeof(uint8_t) * 0xff);
    op.params[1].tmpref.size = input_len;    // 0xffff

So you have a small allocation and then report a larger one in tmpref.size. This is an out-of-bounds reference. Memory might follow the buffer, or it might be unmapped, depending on the state of the heap. However, this alone isn't enough for params_from_user() to return -EINVAL as described below.

I've tried making changes in teec_pre_process_tmpref() so that it can trigger the condition in params_from_user() below. I think I succeeded in that, but I'm unable to reproduce the crash.
Can you double-check to see if something is missing to reproduce the problem?
I wrote a test program and ran it on both 6.17-rc5 and 6.14. I was able to reproduce the crash on both kernels.
I uploaded test code and test results to my gist. https://gist.github.com/masami256/11e21a7503812af7ee1e890080093a2c
The test code is crash_test.c. This program takes two arguments: the first is the malicious buffer size and the second is the actual buffer size. I can reproduce the crash with the following pair.

malicious buffer size: 0xffffff
actual buffer size: 0xff
The test_log.md file contains the test results for 6.17-rc5.
The patch below works around the problem rather than fixing it. The problem occurred earlier when the tee_shm became inconsistent.
Yes, that's right.
Please let me know if I can help with the debugging.
Regards,
Hi Masami,
[+Sumit in CC]
On Wed, Sep 17, 2025 at 10:58:11PM +0900, Masami Ichikawa wrote: [snip]
I wrote a test program and ran it on both 6.17-rc5 and 6.14. I was able to reproduce the crash on both kernels.
I uploaded test code and test results to my gist. https://gist.github.com/masami256/11e21a7503812af7ee1e890080093a2c
The test code is crash_test.c. This program takes two arguments: the first is the malicious buffer size and the second is the actual buffer size. I can reproduce the crash with the following pair.

malicious buffer size: 0xffffff
actual buffer size: 0xff

Thanks, that easily reproduces the problem. The following diff should fix it:

--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -318,7 +318,16 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags,

 	len = iov_iter_extract_pages(iter, &shm->pages, LONG_MAX, num_pages, 0, &off);
-	if (unlikely(len <= 0)) {
+	if (unlikely(len < num_pages * PAGE_SIZE)) {
+		if (len > 0) {
+			/*
+			 * If we only got a few pages, update to release
+			 * the correct amount below.
+			 */
+			shm->num_pages = len / PAGE_SIZE;
+			ret = ERR_PTR(-ENOMEM);
+			goto err_put_shm_pages;
+		}
 		ret = len ? ERR_PTR(len) : ERR_PTR(-ENOMEM);
 		goto err_free_shm_pages;
 	}
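The three outcomes this diff distinguishes (error or nothing extracted, partial extraction, full extraction) can be modelled in user space as follows. This is only a sketch of the decision logic under stated assumptions: classify_extract(), struct extract_outcome and SKETCH_PAGE_SIZE are hypothetical names, the page size is assumed to be 4 KiB, and the in-page offset is ignored just as in the diff. The sketch tests the sign of len before comparing it with the unsigned page count, so a negative value never takes part in that comparison.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE ((size_t)4096)	/* assumption: 4 KiB pages */

/* Hypothetical result type for this sketch only. */
struct extract_outcome {
	int err;			/* 0 on success, negative errno otherwise */
	size_t pages_to_release;	/* pinned pages to drop on the error path */
};

/*
 * Models the decision taken after the page extraction call: `len` is
 * the byte count it returned (negative errno on failure) and
 * `num_pages` is the number of pages that were requested.
 */
struct extract_outcome classify_extract(long len, size_t num_pages)
{
	struct extract_outcome out = { 0, 0 };

	assert(num_pages > 0);
	if (len <= 0) {
		/* Nothing was pinned: propagate the error, or -ENOMEM for 0. */
		out.err = len ? (int)len : -ENOMEM;
		return out;
	}
	if ((size_t)len < num_pages * SKETCH_PAGE_SIZE) {
		/* Partial result: only the pages actually obtained are released. */
		out.err = -ENOMEM;
		out.pages_to_release = (size_t)len / SKETCH_PAGE_SIZE;
		return out;
	}
	return out;	/* every requested page was obtained */
}
```

With the values from the report (17 pages requested, 14 obtained), the model yields -ENOMEM and 14 pages to release, so the release path never reaches the NULL entries.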
Cheers, Jens
Hi Jens,
On Thu, Sep 18, 2025 at 9:25 PM Jens Wiklander jens.wiklander@linaro.org wrote:
Thanks, that easily reproduces the problem. The following diff should fix it:

--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -318,7 +318,16 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags,

 	len = iov_iter_extract_pages(iter, &shm->pages, LONG_MAX, num_pages, 0, &off);
-	if (unlikely(len <= 0)) {
+	if (unlikely(len < num_pages * PAGE_SIZE)) {
+		if (len > 0) {
+			/*
+			 * If we only got a few pages, update to release
+			 * the correct amount below.
+			 */
+			shm->num_pages = len / PAGE_SIZE;
+			ret = ERR_PTR(-ENOMEM);
+			goto err_put_shm_pages;
+		}
 		ret = len ? ERR_PTR(len) : ERR_PTR(-ENOMEM);
 		goto err_free_shm_pages;
 	}

Cheers, Jens
Thank you for the fix. I tested it on both 6.17-rc5 and 6.14 and confirmed that your patch solves the problem.

Tested-by: Masami Ichikawa <masami256@gmail.com>
Regards, -- Masami Ichikawa
On Thu, Sep 18, 2025 at 02:25:41PM +0200, Jens Wiklander wrote:
Hi Masami,
[+Sumit in CC]
On Wed, Sep 17, 2025 at 10:58:11PM +0900, Masami Ichikawa wrote: [snip]
I wrote a test program and ran it on both 6.17-rc5 and 6.14. I was able to reproduce the crash on both kernels.
I uploaded test code and test results to my gist. https://gist.github.com/masami256/11e21a7503812af7ee1e890080093a2c
The test code is crash_test.c. This program takes two arguments: the first is the malicious buffer size and the second is the actual buffer size. I can reproduce the crash with the following pair.

malicious buffer size: 0xffffff
actual buffer size: 0xff
Thanks Masami for the report and the bug reproducer here.
Thanks, that easily reproduces the problem. The following diff should fix it:

--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -318,7 +318,16 @@ register_shm_helper(struct tee_context *ctx, struct iov_iter *iter, u32 flags,

 	len = iov_iter_extract_pages(iter, &shm->pages, LONG_MAX, num_pages, 0, &off);
-	if (unlikely(len <= 0)) {
+	if (unlikely(len < num_pages * PAGE_SIZE)) {
+		if (len > 0) {
+			/*
+			 * If we only got a few pages, update to release
+			 * the correct amount below.
+			 */
+			shm->num_pages = len / PAGE_SIZE;
+			ret = ERR_PTR(-ENOMEM);
+			goto err_put_shm_pages;
+		}
 		ret = len ? ERR_PTR(len) : ERR_PTR(-ENOMEM);
 		goto err_free_shm_pages;
 	}
Thanks Jens for the fix, it sounds appropriate to me. I think this commit [1] introduced the bug in the first place, as the earlier check for pin_user_pages_fast() would have caught this issue without crashing the kernel.

Jens, can you please send a proper fix here? I hope we can get it merged for v6.17, since it sounds critical to me.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
-Sumit
op-tee@lists.trustedfirmware.org