[RFC 0/3] core: riscv: Implement Floating-Point and Vector extension context tracking

23 Jun 2026


From: Dave Patel <dave.patel@riscstar.com>

Dear OP-TEE Developers,
This proposal introduces architectural runtime context management structures and
tracking mechanisms for standard scalar Floating-Point (F/D) and Vector (V) ISA
extensions on RISC-V platforms.
### 1. Architectural Alignment and Shift
Currently, the thread scheduling layer in OP-TEE OS implements a tightly coupled
VFP model specific to the ARM architecture (e.g., thread_kernel_enable_vfp). This
model relies heavily on a software-driven "lazy context trap" mechanism, where
the kernel disables the FPU to catch subsequent execution faults.
On RISC-V architectures, context tracking cannot rely on software-initiated lazy
traps via execution faults due to structural opcode overlap across custom extensions
and the severe pipeline execution penalties of traps. Instead, RISC-V provides
native hardware status state machines managed via the `sstatus.FS` (Floating-Point)
and `sstatus.VS` (Vector) bitfields.
To fit into OP-TEE's existing `thread_*_vfp` scheduling hooks, we use these
hardware flags to implement "eager-on-dirty" context saving. The kernel leaves the
extensions enabled during execution and writes the register state out at context
switch only if the hardware reports a `Dirty` status.
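
As an illustration of the "eager-on-dirty" check described above, here is a minimal
sketch in portable C. It models `sstatus` as a plain integer so the logic can be
exercised off-target; the field positions (FS in bits 14:13, VS in bits 10:9) and
the Off/Initial/Clean/Dirty encoding follow the RISC-V privileged specification.
The helper names (`ext_get`, `ext_set`, `save_if_dirty`) are hypothetical and are
not taken from the patches, and the counter only stands in for the real register
save.

```c
#include <stdbool.h>
#include <stdint.h>

/* sstatus field positions: FS = bits 14:13, VS = bits 10:9 */
#define SSTATUS_FS_SHIFT 13
#define SSTATUS_VS_SHIFT 9
#define EXT_FIELD_MASK   UINT64_C(3)

/* Two-bit extension status encoding shared by FS and VS */
enum ext_status { EXT_OFF = 0, EXT_INITIAL = 1, EXT_CLEAN = 2, EXT_DIRTY = 3 };

/* Extract the two-bit status field located at 'shift' */
enum ext_status ext_get(uint64_t sstatus, unsigned int shift)
{
	return (enum ext_status)((sstatus >> shift) & EXT_FIELD_MASK);
}

/* Return 'sstatus' with the field at 'shift' replaced by 'st' */
uint64_t ext_set(uint64_t sstatus, unsigned int shift, enum ext_status st)
{
	sstatus &= ~(EXT_FIELD_MASK << shift);
	return sstatus | ((uint64_t)st << shift);
}

/*
 * Hypothetical context-switch hook: save only when the hardware reports
 * Dirty, then mark the state Clean so an untouched unit is skipped next
 * time. Returns true when a save was performed.
 */
bool save_if_dirty(uint64_t *sstatus, unsigned int shift,
		   unsigned int *save_count)
{
	if (ext_get(*sstatus, shift) != EXT_DIRTY)
		return false;

	(*save_count)++; /* stand-in for the actual register save */
	*sstatus = ext_set(*sstatus, shift, EXT_CLEAN);
	return true;
}
```

On real hardware the read and write of `sstatus` would go through CSR accessors
rather than a pointer; the pointer is used here only to keep the sketch testable.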
### 2. Implementation Subsystem Architecture
To maintain a modular design, the implementation explicitly separates scalar
floating-point handling from scalable vector configurations:
1. Modularity & Headers:
   - <kernel/riscv_fp.h> specifies bitmasks and structures (`struct riscv_fp_state`)
     for scalar execution.
   - <kernel/riscv_vector.h> encapsulates the scalable vector state layout
     (`struct riscv_vector_state`). This allows devices without a vector unit to omit
     the vector footprint and its dependencies entirely.
2. Bitwidth and Layout Adaptability:
   - Scalar Low-Level Assembly (`riscv_fp.S`): Adapts to both 32-bit (`rv32`)
     and 64-bit (`rv64`) configurations via the compiler-provided `__riscv_xlen`
     preprocessor macro.
   - Vector Extension Optimization (`riscv_vector.c`): Fully isolated from the primary
     thread files. Instead of storing vector registers one at a time, it uses the
     whole-register block transfer instructions (`vs8r.v` and `vl8r.v`). Grouping
     registers into blocks of eight reduces the save/restore of all 32 vector
     registers to four transfers, starting at `v0`, `v8`, `v16` and `v24`.
3. Unified Scheduling Abstraction (`thread_vfp.c`):
   A single set of top-level functions (`thread_kernel_save_vfp`,
   `thread_user_enable_vfp`, etc.) is used for context routing.
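
To make the four-group layout above concrete, the following sketch computes where
each group of eight registers would land in a save area for a given `VLENB`. It is
plain C arithmetic only and does not issue any vector instructions. The function
and macro names are hypothetical, and the contiguous layout (v0 first, v31 last)
is an assumption made for illustration, not necessarily what `riscv_vector.c` does.

```c
#include <stddef.h>

#define NUM_VREGS      32u /* v0..v31 */
#define REGS_PER_GROUP 8u  /* vs8r.v / vl8r.v move eight registers each */
#define NUM_GROUPS     (NUM_VREGS / REGS_PER_GROUP)

/* Total bytes needed to hold all 32 vector registers */
size_t vregs_size(size_t vlenb)
{
	return (size_t)NUM_VREGS * vlenb;
}

/* First register number of group 0..3, i.e. 0, 8, 16, 24 */
unsigned int vreg_group_base(unsigned int group)
{
	return group * REGS_PER_GROUP;
}

/* Byte offset of a group inside the save area (assumed contiguous) */
size_t vreg_group_offset(size_t vlenb, unsigned int group)
{
	return (size_t)vreg_group_base(group) * vlenb;
}
```

For example, with VLEN = 128 bits (`VLENB` = 16) the save area is 512 bytes and the
groups start at byte offsets 0, 128, 256 and 384.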
### 3. Feedback Requested
We are seeking early design feedback from the community regarding:
- Structural Convention: Should we keep the universal "vfp" naming scheme within the global
  `thread.h` header interfaces for structural backward compatibility, or is an explicit upstream
  refactoring toward a generic name (e.g., `thread_kernel_enable_coproc_regs`) preferred?
- Vector Bounds Memory Allocation: What is the preferred approach for safely managing the
  dynamic heap footprint of the vector register state (`vregs`), whose size scales with the
  CPU's `VLENB` value read at runtime?
- Eager Context Switching: This series proposes eager (save-on-dirty) context switching
  rather than lazy trapping. Are there any reservations about this approach?
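
To give the allocation question something concrete to react to, here is one possible
shape for a runtime-sized save area, sketched in portable C. It is only a starting
point for discussion: `struct demo_vector_state` and both functions are hypothetical
and are not the `struct riscv_vector_state` from the patches, and `calloc()` stands
in for whatever OP-TEE allocator would be appropriate. The bounds check relies on
the V specification's requirement that VLEN is a power of two no larger than 65536
bits (so `VLENB` is at most 8192).

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_NUM_VREGS 32u
#define DEMO_MAX_VLENB 8192u /* VLEN <= 65536 bits per the V spec */

/* Hypothetical per-thread vector save area sized at runtime */
struct demo_vector_state {
	size_t vlenb;     /* bytes per vector register */
	size_t vregs_len; /* total bytes in vregs */
	uint8_t *vregs;   /* backing store for v0..v31 */
};

/* Returns 0 on success, -1 on invalid VLENB or allocation failure */
int demo_vector_state_init(struct demo_vector_state *st, size_t vlenb)
{
	if (!st)
		return -1;

	/* Reject zero, out-of-range, and non-power-of-two values */
	if (vlenb == 0 || vlenb > DEMO_MAX_VLENB || (vlenb & (vlenb - 1)))
		return -1;

	st->vregs = calloc(DEMO_NUM_VREGS, vlenb);
	if (!st->vregs)
		return -1;

	st->vlenb = vlenb;
	st->vregs_len = (size_t)DEMO_NUM_VREGS * vlenb;
	return 0;
}

/* Release the save area and reset the descriptor */
void demo_vector_state_free(struct demo_vector_state *st)
{
	if (!st)
		return;

	free(st->vregs);
	st->vregs = NULL;
	st->vregs_len = 0;
	st->vlenb = 0;
}
```

Validating `VLENB` once at allocation time keeps the save/restore path free of size
checks; whether the buffer should instead come from a per-thread pool sized at boot
is part of the question above.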
Looking forward to your suggestions and design critiques.
Dave Patel (3):
  Floating Point changes
  RISCV Vector changes
  Thread changes for RISCV floating point and vector changes
 core/arch/riscv/include/riscv_fp.h     |  30 +++++
 core/arch/riscv/include/riscv_vector.h |  32 +++++
 core/arch/riscv/kernel/riscv_fp.S      | 159 +++++++++++++++++++++++++
 core/arch/riscv/kernel/riscv_vector.c  |  77 ++++++++++++
 core/arch/riscv/kernel/thread_vfp.c    | 142 ++++++++++++++++++++++
 5 files changed, 440 insertions(+)
 create mode 100644 core/arch/riscv/include/riscv_fp.h
 create mode 100644 core/arch/riscv/include/riscv_vector.h
 create mode 100644 core/arch/riscv/kernel/riscv_fp.S
 create mode 100644 core/arch/riscv/kernel/riscv_vector.c
 create mode 100644 core/arch/riscv/kernel/thread_vfp.c
--
2.43.0
