Hi Federico,
Thanks. I see your patch fixes the test flakiness, but it does not fix the original problem described in the first email of this thread. Were you able to try the repro patch I provided and confirm that your change fixes the problem? I don't believe it does, since the problem I reported has nothing to do with the telnet port. It is caused by re-running the tests without reloading the test binaries, so global variables that have been changed from their initial/default values still hold the values from previous runs.
Thanks Raghu
-----Original Message-----
From: Hafnium hafnium-bounces@lists.trustedfirmware.org On Behalf Of Federico Recanati via Hafnium
Sent: Wednesday, November 17, 2021 6:34 AM
To: hafnium@lists.trustedfirmware.org
Subject: Re: [Hafnium] Bug in hftest.py
Hi Raghu,
a fix for hftest.py has been merged: https://review.trustedfirmware.org/c/hafnium/hafnium/+/12481
addressing the random failures of both-world tests and supporting connection to telnet ports other than 5000
Cheers, Federico
____________________________________
From: raghu.ncstate@icloud.com raghu.ncstate@icloud.com
Sent: 04 August 2021 21:01
To: Olivier Deprez; 'Raghu Krishnamurthy via Hafnium'
Subject: RE: [Hafnium] Bug in hftest.py
Thanks Olivier. I've created https://developer.trustedfirmware.org/T955 to track this. Understood that all of this is new. I have local fixes to work around the issue, so there is no hurry to have a fix merged, but it is something to consider and fix since it will eventually show up.
the both worlds test scenario is not 100% stable on my machine
[RK] Likewise. I've noticed that this is caused by lingering FVP processes. Usually I run ps -ax | grep to find FVP instances, kill them, and then run the tests, and I never see failures after that. The issue I faced was that the lingering FVP instances would take up telnet ports, so the newly spawned ones use different ports (>5004) than what hftest.py expects. It appears that when tests fail, we may not be cleaning up/exiting processes properly, but I haven't checked. Or the code may be just fine and a ctrl+c leaves those processes lingering.
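To illustrate the cleanup concern above, here is a minimal Python sketch of one generic way to make sure a child process is terminated when a test fails or the run is interrupted with ctrl+c. It is not taken from hftest.py, and how hftest.py actually launches and tears down the FVP has not been checked here. The name managed_process is hypothetical, and a sleeping Python child stands in for the FVP.

```python
import contextlib
import subprocess
import sys


@contextlib.contextmanager
def managed_process(args):
    """Start a child process and always stop it when the block exits.

    Hypothetical helper for illustration; not part of hftest.py.
    """
    proc = subprocess.Popen(args)
    try:
        yield proc
    finally:
        # Runs on normal exit, on a test failure (exception), and on
        # KeyboardInterrupt (ctrl+c), so the child does not linger.
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=5)
            except subprocess.TimeoutExpired:
                # The child ignored the terminate request; force it.
                proc.kill()
                proc.wait()


# Stand-in for a long-running model process (assumption: any command works).
STAND_IN_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]

if __name__ == "__main__":
    try:
        with managed_process(STAND_IN_CMD) as child:
            raise RuntimeError("simulated test failure")
    except RuntimeError:
        pass
    # The child has been reaped even though the "test" failed.
    print("child exited:", child.poll() is not None)
```

With a pattern like this, a failed or interrupted run would not leave a process holding a telnet port for the next run.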
Thanks Raghu
-----Original Message-----
From: Olivier Deprez Olivier.Deprez@arm.com
Sent: Tuesday, August 3, 2021 11:51 PM
To: 'Raghu Krishnamurthy via Hafnium' hafnium@lists.trustedfirmware.org; raghu.ncstate@icloud.com
Subject: Re: [Hafnium] Bug in hftest.py
Hi Raghu,
Thanks for reporting. This part of the test infrastructure (testing the SPMC) is still very fresh and requires improvement iterations so please bear with us. Also a reason it's not yet part of the automated non-regression with jenkins (as opposed to the legacy kokoro/test.sh). For the time being we still mostly rely on the TF-A CI for testing on the secure side.
IIUC this change was made to help with the test time, as the FVP takes long to reload on every test. But indeed it might have the side effect you describe. So either we revert to reloading the FVP on every test, or another (somewhat hackish) possibility is to clear the mentioned variables from within the test (or make them part of BSS)?
To be fair, the both worlds test scenario is not 100% stable on my machine (for some reason the connection between the FVP and hftest is not always successful), which limits the confidence in and robustness of my testing and investigations. So I wonder if the scripting is still a bit fragile.
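As a generic illustration of coping with a connection that does not always succeed on the first attempt, here is a minimal Python sketch that retries a TCP connection a few times before giving up. It is an assumption-laden example, not the way hftest.py connects to the FVP: the function name connect_with_retry, the attempt count, and the delay are all hypothetical.

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, delay_s=0.2):
    """Open a TCP connection, retrying if the peer is not ready yet.

    Hypothetical helper for illustration; not part of hftest.py.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=1.0)
        except OSError as exc:
            # Connection refused or timed out; wait and try again.
            last_error = exc
            time.sleep(delay_s)
    raise ConnectionError(
        f"could not connect to {host}:{port} after {attempts} attempts"
    ) from last_error
```

A retry only helps when the listener is slow to come up. It would not help if the listener ended up on a different port than the one the script expects.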
Regards, Olivier.
From: Hafnium hafnium-bounces@lists.trustedfirmware.org on behalf of Raghu Krishnamurthy via Hafnium hafnium@lists.trustedfirmware.org
Sent: 03 August 2021 23:47
To: 'Raghu Krishnamurthy via Hafnium'
Subject: [Hafnium] Bug in hftest.py
Hi All,
Wanted to report to you that commit 18a25f9241f86ba2d637011ff465ce3869e8651b in hafnium "appears" broken. The issue with the optimization in this patch is that the partition images are not reloaded for each test run. This means a previous test could have written data to, say, SRAM, and when the same image is executed again from SRAM, the following test would see the old values left by the previous test. This would be a problem for pretty much anything in the data section of a partition. In my case, I have a counter in the data section of my partition, which does not get reset back to its original value.
I've attached a patch to help repro the issue. Fix is to disable the optimization or somehow reload the images for each run. This affects only "both world" tests.
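The effect described above can be shown with a toy Python model. This is only a sketch of the idea, not Hafnium or hftest.py code and not the attached repro patch: the Sram class, INITIAL_DATA, and run_suite are hypothetical names, and a dictionary stands in for the partition's data section.

```python
# Toy model only: a dictionary stands in for a partition's .data section.
INITIAL_DATA = {"counter": 0}  # hypothetical initial values stored in the image


class Sram:
    """Stands in for the memory the partition image executes from."""

    def __init__(self):
        self.data = {}

    def load_image(self):
        # Loading the image copies the initial .data values into memory.
        self.data = dict(INITIAL_DATA)


def run_test(sram):
    # The "test" increments a counter in .data and reports what it saw.
    sram.data["counter"] += 1
    return sram.data["counter"]


def run_suite(num_tests, reload_each_test):
    sram = Sram()
    sram.load_image()  # the image is always loaded once at startup
    results = []
    for i in range(num_tests):
        if reload_each_test and i > 0:
            sram.load_image()  # restore initial values before each test
        results.append(run_test(sram))
    return results


if __name__ == "__main__":
    print(run_suite(3, reload_each_test=False))  # [1, 2, 3]
    print(run_suite(3, reload_each_test=True))   # [1, 1, 1]
```

Without a reload, each test sees the counter left behind by the one before it. With a reload per test, every test starts from the image's initial value.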
Let me know if I'm missing something here.
Apply the patch and run:
timeout --foreground 300s ./test/hftest/hftest.py --out_partitions out/reference/secure_aem_v8a_fvp_vm_clang --log out/reference/kokoro_log --spmc out/reference/secure_aem_v8a_fvp_clang/hafnium.bin --driver=fvp --hypervisor out/reference/aem_v8a_fvp_clang/hafnium.bin --partitions_json test/vmapi/ffa_secure_partitions/ffa_both_world_partitions_test.json
The command line is from kokoro/test_spmc.sh.
Thanks
Raghu
-- Hafnium mailing list Hafnium@lists.trustedfirmware.org https://lists.trustedfirmware.org/mailman/listinfo/hafnium