Re: [TF-A] psci shutdown do not follow graceful power off sequence

9 Dec 2019

Hi Sudeep,
On Sat, Dec 7, 2019 at 5:07 PM Sudeep Holla sudeep.holla@arm.com wrote:
...
On Sat, Dec 07, 2019 at 12:45:41PM +0530, Sandeep Tripathy wrote:
...
Hi Sudeep,
My intention here was to figure out all the reasons to deliberately
skip the power down sequence.
Looking at it the other way around, do we have reasons to execute the
power down sequence for reboot? Other than the *special* use case / design
you have in your system?
...
I feel the only reason we do not do a graceful power down sequence is:

In many-core server systems (maybe hundreds of cores, for the impact to
be perceivable) it can increase the reboot time.
For sure, this is one of the main reasons, but not the only one.
...

we do not see a valid generic use case where things can fail.

Indeed, even in your case as you have not given complete system design
details and not explained why other alternatives suggested don't work,
we have to isolate your case as custom.
...

complexity. I think this is a matter of intent for tf-a folks :)

If the above is true I can close the thread and suggest keeping the
solution in a plat-specific hook.
Sorry, but we can't help unless this is a generic use case.
IIUC, since the secondaries are parked in the OS, you need to pull
them into the secure side using an IPI and then execute the so-called
power down sequence. To be honest, I don't like it, as it's being pushed
into TF-A because the OS refuses to do so for *very valid* reasons.
...
On Fri, Dec 6, 2019 at 4:21 PM Sudeep Holla sudeep.holla@arm.com wrote:
...
On Fri, Dec 06, 2019 at 12:21:47PM +0530, Sandeep Tripathy wrote:
...
On Thu, Dec 5, 2019 at 11:13 PM Dan Handley Dan.Handley@arm.com wrote:
...
Hi Sandeep
(I accidentally dropped the TF-A list in my last reply - now re-adding).
...
-----Original Message-----
From: Sandeep Tripathy sandeep.tripathy@broadcom.com
Sent: 05 December 2019 17:17
On Thu, Dec 5, 2019 at 9:54 PM Dan Handley Dan.Handley@arm.com wrote:
>
> Hi Sandeep
>
> > -----Original Message-----
> > From: TF-A tf-a-bounces@lists.trustedfirmware.org On Behalf Of
> > Sandeep Tripathy via TF-A
> > Sent: 05 December 2019 12:00
> >
> > My query is more on the spec.
> > The OS (e.g. Linux), ATF and the PSCI spec seem to have assumed that
> > they are managing an independent system, or managing 'all' the masters
> > in a coherent domain.
> > What other
> > reason could possibly encourage one to not follow a shutdown sequence?
> >
> Do you mean "to not follow a *graceful* shutdown sequence"?
Yes, exactly. Thanks!
> If so I can think of 3 reasons:
> 1. It's much slower than a non-graceful shutdown.
But this is certainly not a concern for smaller embedded systems.
True, but TF-A tries to be a reference for all systems.
...
> 2. There is no observable difference between a graceful and non-graceful
> shutdown from the calling OS's point of view. The OS presumably has no
> knowledge of other masters it does not manage.
Can the CCN state machine go bad because one participating entity just goes
off without marking its exit?
Please note I have not seen the issue; it is my assumption.
It depends on the interconnect. Arm interconnects designed for pre-v8.2
systems required explicit programming to take the master out of the
coherency domain. Arm interconnects for v8.2+ systems do this
automatically via hardware system coherency signals. The TF-A off/reset
platform interfaces have provision to do this programming if necessary,
but only for the running cluster, which is another reason not to use these
PSCI functions in this scenario.
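To illustrate the "only for the running cluster" point, here is a minimal sketch in plain C. It is a host-side model, not TF-A code: the register, the cluster count and every `mock_` name are hypothetical and do not correspond to any real TF-A or interconnect driver API.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CLUSTERS 4u /* hypothetical cluster count */

/*
 * Stand-in for an interconnect snoop-enable register (assumption):
 * bit n set means cluster n is still in the coherency domain.
 */
static uint32_t mock_snoop_enable = (1u << NUM_CLUSTERS) - 1u;

/* Hypothetical helper: take one cluster out of the coherency domain. */
static void mock_coherency_exit(unsigned int cluster)
{
    assert(cluster < NUM_CLUSTERS);
    mock_snoop_enable &= ~(1u << cluster);
}

/*
 * Hypothetical platform reset hook. It executes on the calling core,
 * so the only cluster it can sensibly take out of coherency is its own.
 */
static void mock_plat_system_reset(unsigned int running_cluster)
{
    mock_coherency_exit(running_cluster);
    /* A real platform would trigger the reset here and not return. */
}
```

In this model, calling `mock_plat_system_reset(1u)` clears only bit 1 of the mock register; the bits for the other clusters stay set, which is the gap being discussed for pre-v8.2 style interconnects.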
We use the reset/reset2 platform interface for the coherency exit. I
thought there might be some dependency on a proper core and cluster
power down sequence, such as clearing the SMP bit and flushing the local
caches.
So let's get into details. The OS has either initiated, or been asked
by other masters in the system to perform, either SYSTEM_OFF or _RESET.
Now, IIUC, the OS can make sure all the data it needs to preserve is
written to non-volatile memory, and then the poweroff/reboot sequence you
described in earlier mails gets executed. So what in volatile memory
(RAM or caches) do you have to preserve at that point?
/me still struggling to understand the use case. I am asking again as
you mentioned the requirements have diverged since the original thread
on LKML. If it's the same, can we please continue there by first getting
quite a few open questions in that thread answered.
I will dig into that thread and check where it ended up as I was not
directly involved in that.
tl;dr:
 1- If the secondary cores do not exit the coherency domain, the
interconnect can fail.
So just doing 'reboot' can potentially make such a system fail.
solution:
 'system_off'/'system_reset'/'system_reset2' from any core may take
care of doing this for all the cores
in the firmware implementation? And the implementation can be in
plat-specific hooks for the respective call.
 2- Maybe the CPU caches (NS and S dirty lines) can be of value, e.g.
maybe some logs updated by a core are still in cache.
One thing that I have not seen answered so far is what data the OS
maintains in RAM/caches, that it's responsible for, and fails to write
to non-volatile memory before executing shutdown. It looks like
some design flaw, and the OS *must* take care to ensure that it has saved
all the data. The whole discussion is based on that, and I have never got
a response to that question.
*what data the OS maintains in RAM/caches, that it's responsible for*
Can any software, be it an application or a driver, sharing coherent
memory with other masters assume that it never needs to do explicit
cache-ops, and that coherency is guaranteed by the platform
(firmware/hardware/OS)?
The data updated in the coherent memory region may be in the L1/L2 D$, and
we want a graceful/abrupt shutdown/reboot of this
*slave system* where other master(s) not managed by the *slave system*
'linux/tf-a' are still functional and can snoop the data.
In this case such application(s) have to do an explicit cache flush of
the coherent memory on a reboot/shutdown event.
...
...
solution:
  follow the graceful power down sequence for all cores. I don't know how.
Maybe do an IPI to bring all the others to EL3 and force the
power down sequence for them in order.
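The sequence proposed above can be modelled roughly as follows. This is only a host-side sketch in plain C under stated assumptions: the per-CPU state, the direct function call standing in for the IPI, and all `mock_` names are hypothetical, and nothing here is an existing TF-A or PSCI interface.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CPUS 4u /* hypothetical core count */

/* Hypothetical per-CPU state tracked by the model. */
struct mock_cpu {
    bool caches_flushed;
    bool in_coherency;
    bool powered;
};

static struct mock_cpu cpus[NUM_CPUS];

/* Put every modelled CPU into a "running, coherent, dirty caches" state. */
static void mock_boot_all(void)
{
    for (unsigned int i = 0; i < NUM_CPUS; i++) {
        cpus[i].caches_flushed = false;
        cpus[i].in_coherency = true;
        cpus[i].powered = true;
    }
}

/* Per-core power down: flush local caches, leave coherency, power off. */
static void mock_power_down(unsigned int cpu)
{
    assert(cpu < NUM_CPUS);
    cpus[cpu].caches_flushed = true;
    cpus[cpu].in_coherency = false;
    cpus[cpu].powered = false;
}

/*
 * Models the proposal: the calling core forces every other powered core
 * through the power down sequence (the direct call stands in for an IPI
 * handled at EL3), then powers itself down last.
 * Returns the number of cores powered down.
 */
static unsigned int mock_graceful_system_off(unsigned int caller)
{
    unsigned int count = 0;

    assert(caller < NUM_CPUS);
    for (unsigned int cpu = 0; cpu < NUM_CPUS; cpu++) {
        if (cpu == caller || !cpus[cpu].powered)
            continue;
        mock_power_down(cpu);
        count++;
    }
    mock_power_down(caller);
    return count + 1u;
}
```

The model ignores everything that makes this hard on real hardware (cores parked in the OS with interrupts masked, ordering within a cluster, and the reboot-time cost on many-core systems).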
As I already guessed this and mentioned above, the idea sounds bad.
...
solution2:  avoid 'reboot'.  initiate cpuhp / suspend if possible.
Maybe, but it depends on your design.
--
Regards,
Sudeep
Thanks
Sandeep
