Hi all,

I am working with FVP (Base RevC AEM) and the Arm reference integration solution (https://gitlab.arm.com/arm-reference-solutions/arm-reference-solutions-docs/-/blob/master/docs/aemfvp-a-rme/install-fvp.rst). I want to measure the overhead of a target ML workload in a realm VM versus a normal world VM. Both VMs are created with this command (dropping --realm for the normal world VM):

    nice -n -20 taskset -c 1 lkvm run --realm -c 1 -m 350 -k /root/VM_image/Image -i /root/VM_image/VM-fs.cpio --irqchip=gicv3

The target workload's code and data are provisioned into VM-fs.cpio. I use GenericTrace to measure the number of instructions executed by core 1 (taskset -c 1 pins the VM process to core 1), and ToggleMTIPlugin to enable/disable tracing at particular points (at the beginning and end of the target workload inside the VM); the relevant FVP options are sketched below.

What I am experiencing is that the numbers in the normal world VM are very stable (271 million instructions), but the numbers in the realm VM vary widely between runs (from 314 million to 463 million, and even 7,671 million!). I take all measurements in the same run of the FVP: I create a NW VM, run the target workload, and destroy it; then I create a realm VM, run the target workload, and destroy it; I repeat these steps several times and then terminate the FVP. My guess is that something on the path from the realm to the hypervisor (either the RMM or the secure monitor) makes the numbers unstable. Have you ever seen such a problem, or found a way to measure instruction counts for realm workloads reliably?
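For reference, here is a minimal sketch of how the trace plugins are attached on the model command line. The parameter and trace-source names below are from memory and may differ between FVP versions (--list-params and the ListTraceSources plugin show the authoritative names), so please treat this as illustrative rather than exact:

    FVP_Base_RevC-2xAEMvA \
        --plugin GenericTrace.so \
        --plugin ToggleMTIPlugin.so \
        -C TRACE.GenericTrace.trace-sources="cpu1.INST" \
        -C TRACE.ToggleMTIPlugin.use_hlt=1 \
        -C TRACE.ToggleMTIPlugin.hlt_imm16=0x2711 \
        -C TRACE.ToggleMTIPlugin.disable_mti_from_start=1 \
        ... (rest of the platform options)

With disable_mti_from_start=1 tracing is off at boot, and each HLT with the configured immediate executed by the software under trace toggles it on/off.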
Thanks, Sina
Hi Sina,

We haven't tried this trace mechanism within the RMM team; this sounds like an FVP issue. Could you please try with the latest FVP, available here: https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms ?
Best Regards Soby Mathew
Dear Soby,
I am working with the latest release (FVP_Base_RevC-2xAEMvA_11.25_15_Linux64) and still get very diverse numbers from the realm VM experiment. As I mentioned earlier, I did not face any problem with the normal world VM. I have tried changing many configurations outside the FVP (e.g., pinning the FVP process to a core, or changing its command line parameters) and inside the FVP (e.g., disabling SVE for the realm, changing the realm RAM size, etc.), but I still get unstable numbers when running the same experiment in a realm VM. I think it is really worth investigating this problem in your team or the FVP team; I would be happy to share more details if that helps.
Cheers, Sina
Hi,
Unfortunately we haven't tried the trace plugin, as Soby mentioned, for measuring the instruction count. However, there is certainly overhead in realm VM creation (compared to a normal VM) with respect to RMI calls (e.g., populating stage-2 mappings), but I wouldn't expect that to vary over such a large window! It might be a good idea to get insight from the Model team; we could check with them internally here in Arm.
Kind regards Suzuki
Dear Suzuki,

Thanks for your response.
Just to give you more information and measurements that may help. At the beginning, I run the FVP (FVP_Base_RevC-2xAEMvA_11.25_15_Linux64). The normal world hypervisor is booted and a new realm VM is created using:

    nice -n -20 taskset -c 1 lkvm run --realm -c 1 -m 350 -k /root/VM_image/Image -i /root/VM_image/VM-fs.cpio --irqchip=gicv3

After the realm's kernel has booted, five machine learning inferences are executed; the realm's computation is entirely local to the VM (the model and input data are provisioned into VM-fs.cpio). The realm is then terminated and a new realm VM is created. This process is repeated six times, and each time I measure the number of instructions executed by core 1 between specific points in the VM lifecycle using GenericTrace combined with ToggleMTIPlugin (the guest-side toggle is sketched at the end of this mail). For each VM run I measure:

1- The number of instructions executed during each inference (plus its mean over the 5 inferences and the standard deviation (std)).
2- The number of instructions executed during realm creation (from the point lkvm is executed until command line access to the VM file system is available). This measurement is broken down into two parts: before and after the activation point of the VM (if it is a realm VM).

I get these instruction counts over 6 consecutive runs of the realm VM (all in millions of instructions):
    Run | Mean of 5 inferences |  Std | Realm creation | Before activation | After activation
     1  |        8128          |  189 |     41715      |       19580       |      22146
     2  |         448          |    2 |     23446      |       19531       |       3917
     3  |         313          |    1 |     23126      |       19515       |       3611
     4  |         450          |    1 |     23442      |       19515       |       3928
     5  |         449          |    1 |     23438      |       19516       |       3923
     6  |        6968          |   34 |     40140      |       19516       |      20634
I also run six normal world VMs, one in between each pair of realm VM experiments, and I get roughly these very stable numbers: mean of 5 inferences: 271, std of inferences: 1, VM creation: 859.
Based on the numbers:

a) The number of instructions executed before realm activation is very stable across experiments (between 19515 and 19580), but whenever the number of instructions after activation during realm creation increases, the mean of the inferences also increases. So we can claim that the cause of the increase arises after the realm activation point and persists during the realm's runtime.

b) As the std of the inferences is always small compared to their mean, we can claim that the realm's behaviour stays consistent within a given run, whether that run is a fast or a slow one.

c) The normal world numbers are very stable across the different runs of the normal world VM experiment, and I use the same filesystem for both experiments.
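For reference, the toggle points inside the guest are just HLT instructions that ToggleMTIPlugin intercepts. Below is a minimal sketch, assuming the plugin is configured with use_hlt=1 and hlt_imm16=0x2711 on the model side (the immediate value here is my choice for illustration; it must match whatever the model is configured with):

    /* guest_toggle.c - guest-side trace toggling sketch.
     * With ToggleMTIPlugin configured as above, each
     * HLT #0x2711 flips the MTI trace sources on/off
     * instead of halting the core. */
    static inline void toggle_trace(void)
    {
        __asm__ volatile("hlt #0x2711");
    }

    int main(void)
    {
        toggle_trace();            /* trace on: counting starts */
        /* ... run one ML inference here ... */
        toggle_trace();            /* trace off: counting stops */
        return 0;
    }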
Looking forward to hearing back from you. Sina