Hello,
Overloading and scalability problems have been a known issue with OpenCI for a long time, usually around release cycles due to elevated CI activity, but also from time to time during normal workflow. That's why last quarter we worked on an actionable plan to leverage TuxSuite, a recent Linaro technology for cloud-based building and testing, which has proved itself well with other projects. For this initial pilot project, we're looking to route TF-A FVP tests (90+% of our test load) away from LAVA to TuxSuite, to alleviate load on the existing build and physical device test infrastructure.
This was actively worked on this month, and I'm happy to report that initial development and preliminary testing on staging show encouraging results. I'd still like to perform larger-scale testing on staging, but otherwise I think we should be ready to deploy to production next.
As it's the holiday season with not many working days left, I'm sending this a bit early to make sure it isn't forgotten and doesn't come as a surprise later. The plan is otherwise to test on staging today and/or over the weekend, and then proceed with deployment and validation on production over the next week(s).
Please let me know if you have questions or concerns.
Happy holidays, Paul
Linaro.org | Open source software for ARM SoCs Follow Linaro: http://www.facebook.com/pages/Linaro http://twitter.com/#%21/linaroorg - http://www.linaro.org/linaro-blog
Hi All,
Thanks Paul, a solution in this space is very much needed. Copying Dean, head of the Arm CE-SW Infrastructure team, so he is aware of the work you are doing, as in the new year he and his team will be providing the infrastructure solution for a TF-RMM and TrustedServices CI for integration into OpenCI.
Joanna
From: Paul Sokolovsky via Tf-openci <tf-openci@lists.trustedfirmware.org>
Date: Friday, 22 December 2023 at 08:33
To: tf-openci@lists.trustedfirmware.org <tf-openci@lists.trustedfirmware.org>, Don Harbin <don.harbin@linaro.org>, Glen Valante <glen.valante@linaro.org>, Karen Power <karen.power@linaro.org>, Arthur She <arthur.she@linaro.org>, Ben Copeland <ben.copeland@linaro.org>
Cc: tf-openci-triage@lists.trustedfirmware.org <tf-openci-triage@lists.trustedfirmware.org>
Subject: [Tf-openci] Deploying OpenCI TuxSuite FVP integration
Thanks Paul; this is good news. Looking forward to seeing the results.
-g
Hello,
Sorry for the lack of further updates on TuxSuite FVP progress. Long story short, extended testing during the short Xmas/NY week revealed that some share of TF-A configurations fail in TuxSuite. Of those, some failed fairly deterministically, while others failed randomly. Last week, after the holidays, I made sure to capture all the triaging I could do on my side in https://linaro.atlassian.net/browse/TFC-569 and https://linaro.atlassian.net/browse/TFC-570, and to involve the Tux maintainers. So all-hands work started to investigate it. I didn't want to post an update à la "sorry, it doesn't work yet", but waited for a more specific outlook. Today there was confirmation that the root cause seems to be identified and a patch has been submitted. It still needs to go through review and more testing, so I don't have an ETA as of now, but given the priority assigned to it, I'm sure there will be further updates this week or next. It also likely covers only TFC-570 of the issues above. For TFC-569, we may need to try some changes on the TF-A CI scripts side (I believe Chris Kay wanted to look into it).
All in all, TuxSuite remains a priority and progress on it continues; we're working on delivering not just improved scalability for TF-A testing, but also improved reliability.
Thanks, Paul
Thanks for the continued updates Paul.
On Mon, 8 Jan 2024 at 14:01, Paul Sokolovsky paul.sokolovsky@linaro.org wrote: