<Adding TF-A mailing list to the discussion>
Thanks, Soby. I agree that this needs to be re-evaluated for platforms. I think we should introduce an option to disable the locks, if required.
We plan to try some more experiments and hopefully remove the locks at least for Tegra platforms.
Looking forward to the elaborate answer.
From: Soby Mathew <Soby.Mathew@arm.com>
Sent: Tuesday, 2 November 2021 10:18 AM
To: Varun Wadekar <vwadekar@nvidia.com>; Manish Pandey2 <Manish.Pandey2@arm.com>; Dan Handley <Dan.Handley@arm.com>
Cc: Joanna Farley <Joanna.Farley@arm.com>; Matteo Carlini <Matteo.Carlini@arm.com>
Subject: RE: PSCI lock contention
Hi Varun,
The short answer is that the locks are used to differentiate the last CPU to suspend, and similarly the first CPU to power up, at a given power domain level. Recent CPU features like DynamIQ mean that we don't need to make this differentiation up to cluster level, which TF-A hasn't optimized for yet, AFAICS. I am happy to elaborate further, but could you please send the query to the TF-A mailing list, as I would prefer this discussion to happen in the open if possible.
Best Regards Soby Mathew
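To illustrate the last-CPU/first-CPU differentiation described in the reply above, here is a minimal sketch in C. It is not TF-A code: `pd_node_t`, `pd_cpu_suspend` and `pd_cpu_wakeup` are hypothetical names, and the spinlock built on C11 `atomic_flag` only stands in for whatever lock a real implementation would use. The assumed model is a per-power-domain counter of active CPUs that must be updated under a lock so that exactly one CPU sees the transition to or from zero.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of one non-CPU power domain (for example a cluster). */
typedef struct {
    atomic_flag lock;         /* protects active_cpus */
    unsigned int active_cpus; /* CPUs in this domain that are currently on */
} pd_node_t;

static void pd_lock(pd_node_t *node)
{
    /* Spin until the flag was previously clear, i.e. we now own the lock. */
    while (atomic_flag_test_and_set_explicit(&node->lock, memory_order_acquire)) {
    }
}

static void pd_unlock(pd_node_t *node)
{
    atomic_flag_clear_explicit(&node->lock, memory_order_release);
}

/* Called by a CPU entering suspend. Returns true only for the last CPU
 * going down in this domain, which would then power the domain off. */
bool pd_cpu_suspend(pd_node_t *node)
{
    pd_lock(node);
    node->active_cpus--;
    bool last = (node->active_cpus == 0);
    pd_unlock(node);
    return last;
}

/* Called by a CPU waking up. Returns true only for the first CPU coming
 * up in this domain, which would then restore the domain-level state. */
bool pd_cpu_wakeup(pd_node_t *node)
{
    pd_lock(node);
    bool first = (node->active_cpus == 0);
    node->active_cpus++;
    pd_unlock(node);
    return first;
}
```

In this sketch, CPUs that wake at the same time serialize on the domain lock. That would be consistent with later CPUs taking longer to power up, although the sketch makes no claim about where the time actually goes on real hardware.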
From: Varun Wadekar <vwadekar@nvidia.com>
Sent: 01 November 2021 20:14
To: Soby Mathew <Soby.Mathew@arm.com>; Manish Pandey2 <Manish.Pandey2@arm.com>; Dan Handley <Dan.Handley@arm.com>
Cc: Joanna Farley <Joanna.Farley@arm.com>; Matteo Carlini <Matteo.Carlini@arm.com>
Subject: PSCI lock contention
Hi,
We were benchmarking CPU_SUSPEND performance on Tegra platforms. We take all CPU cores to CPU_SUSPEND and then wake them up with an IPI, both all at once and in serial order. From the numbers, we see that the CPUs powering up later take more time than the first one. We have narrowed most of the time consumed down to the PSCI locks, which are documented at docs/perf/psci-performance-juno.rst.
Can you please help me understand why these locks were added? As a quick experiment we tried the same benchmarking *without* the locks and the firmware does not blow up, but I would like to understand the impact from the analysis on Juno (docs/perf/psci-performance-juno.rst).
Happy to hop on a call to discuss further.
Thanks.