Re: [TF-A] psci shutdown do not follow graceful power off sequence

9 Dec 2019


Hi Sudeep,
I am being very specific about the core caches. Applications, drivers,
or any other entities may be updating cacheable coherent memory
range(s). What ought they to do on a shutdown/reboot notification?
If we ask them to flush their respective ranges (cache maintenance on
coherent memory), that is less generic than having the core
infrastructure give the coherency guarantee:
1- OS: do more than just halt the secondary (other) cores, or maybe
cpuhp the secondary (other) cores, etc.
2- TF-A PSCI shutdown/reset/reset2: do a graceful power-down to take
care of the initiating core.
Maybe this could be under a flag to choose between a faster reboot and
a graceful power-down sequence, if it qualifies as generic at all?
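To make the flag idea concrete, here is a minimal sketch, assuming a
host-side C model rather than real TF-A code. Every name in it
(pwrdown_mode, system_off_sequence, the step strings) is hypothetical;
it only records which steps the initiating core would run in each mode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy flag; not an existing TF-A build option. */
enum pwrdown_mode { PWRDOWN_FAST, PWRDOWN_GRACEFUL };

#define MAX_STEPS 4

/* Ordered record of the steps the initiating core performed. */
struct step_log {
    const char *steps[MAX_STEPS];
    size_t count;
};

static void log_step(struct step_log *log, const char *name)
{
    assert(log->count < MAX_STEPS);
    log->steps[log->count++] = name;
}

/*
 * Model of the initiating core's path on a system off/reset request.
 * In graceful mode the core cleans its own caches and leaves the
 * snoop domain before power is cut; in fast mode it skips both.
 */
void system_off_sequence(enum pwrdown_mode mode, struct step_log *log)
{
    log->count = 0;
    if (mode == PWRDOWN_GRACEFUL) {
        log_step(log, "disable_dcache");
        log_step(log, "clean_core_caches");
        log_step(log, "exit_coherency");
    }
    log_step(log, "platform_power_off");
}
```

In this model the fast path logs only the final power-off step, while
the graceful path logs the cache and coherency steps first, which is
the trade-off the flag would expose.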
Thanks
Sandeep
On Mon, Dec 9, 2019 at 5:45 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
...
On Mon, Dec 09, 2019 at 05:29:21PM +0530, Sandeep Tripathy wrote:
...
Hi Sudeep,
On Mon, Dec 9, 2019 at 3:53 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
...
On Mon, Dec 09, 2019 at 02:50:43PM +0530, Sandeep Tripathy wrote:
...
Hi Sudeep,
[...]
...
*What is the data that the OS maintains in RAM/caches and that it is
responsible for?*
Can any software, be it an application or a driver, that shares coherent
memory with other masters assume that it never needs to do explicit cache
ops and that coherency is guaranteed by the platform (firmware/hardware/OS)?
OK, you are missing something obvious in such a design. If there are other
slaves and masters depending on this slave (OSPM), then the master
initiating the shutdown of this slave (OSPM) must be aware of it and can
broadcast the same across.
Of course it will. The issue is not about the notification mechanism.
And what's done in those masters with *this particular* notification?
Anything, but preferably not cache maintenance.
...
Why can't it stop snooping into caches (or request firmware to) that
belong to, or are maintained by, the other slave (OS)?
Sure. The respective clusters are to be taken out of the snoop domain by
firmware as part of the PSCI platform-specific hooks.
But what about their caches? I don't think there is a pull mechanism by
which the CCN or another interconnect can voluntarily pull the dirty lines
from the exiting cores' caches, as an alternative to having them flushed by
the respective cores.
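The concern can be illustrated with a toy model, a sketch only and not
how any real interconnect or core is implemented. It assumes a
write-back cache with one word per line and no "pull" of dirty lines by
the interconnect; all names (toy_core, master_read, core_power_down)
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES 4

/* Toy write-back cache: one word per line, mapped 1:1 onto `ram`. */
struct toy_core {
    uint32_t data[NUM_LINES];
    bool dirty[NUM_LINES];
    bool in_snoop_domain;
};

static uint32_t ram[NUM_LINES];

void core_init(struct toy_core *c)
{
    for (unsigned i = 0; i < NUM_LINES; i++) {
        c->data[i] = ram[i];
        c->dirty[i] = false;
    }
    c->in_snoop_domain = true;
}

/* Cached store: the new value stays in the core until it is cleaned. */
void core_write(struct toy_core *c, unsigned idx, uint32_t value)
{
    assert(idx < NUM_LINES);
    c->data[idx] = value;
    c->dirty[idx] = true;
}

/* What another master sees: snooped data while the core is coherent
 * and holds the line dirty, otherwise whatever is in RAM. */
uint32_t master_read(const struct toy_core *c, unsigned idx)
{
    assert(idx < NUM_LINES);
    if (c->in_snoop_domain && c->dirty[idx])
        return c->data[idx];
    return ram[idx];
}

/* Clean performed by the exiting core itself. */
void core_clean_all(struct toy_core *c)
{
    for (unsigned i = 0; i < NUM_LINES; i++) {
        if (c->dirty[i]) {
            ram[i] = c->data[i];
            c->dirty[i] = false;
        }
    }
}

/* Firmware removes the core from the snoop domain and powers it off.
 * Dirty lines that were not cleaned beforehand are simply lost. */
void core_power_down(struct toy_core *c, bool clean_first)
{
    if (clean_first)
        core_clean_all(c);
    c->in_snoop_domain = false;
}
```

In this model a master reading after core_power_down(&c, false) gets the
stale RAM value, while after core_power_down(&c, true) it gets the value
the core last wrote.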
...
...
...
If the master can't, then the firmware dealing with this slave's shutdown
must. You simply can't assume the things you currently are assuming.
Sounds like a design gap in such a multi master-slave system to me.
...
The data updated in the coherent memory region may be in L1/L2 D$ and
we want a graceful/abrupt shutdown/reboot of this *slave system*
where other master(s) not managed by the *slave system*
Yes, of course slaves don't manage masters. Not sure how the master and
slave communicate in such a setup. Looks like some communication
gap between them :)
...
'linux/tf-a' are still functional and can snoop the data.
In this case such application(s) have to do an explicit cache flush of
coherent memory on a reboot/shutdown event.
Absence of Shutdown/Reboot notification in such a system seems to be the
root of all such problems to me.
I did not say that the notification does not exist, or that applications
can't do cache ops along with the many other things or communication
protocols they might have to handle on a shutdown/reboot (not relevant).
OK
...
Trust me, the various approaches we have discussed here so far, and other
CENH, work :).
I am not saying other approaches were not tried/discussed. But I was not
aware of them. Also, I am still not aware of the full design of your system
yet.
I think we have now narrowed the discussion down to a very specific
cache maintenance ownership issue.
...
CENH?
Sorry, please ignore: cute embedded .. hacks!
...
...
The generic discussion is: is it the responsibility of an application
to do cache maintenance on *coherent memory* in the shutdown/reboot path
when it never has to do so in its normal course?
The OS doesn't (or can't, as it's about to shut down) care about the data
in this case. The notification is an indication to the application or
other masters.
...
Is it not valid to expect such a mechanism from the underlying platform
firmware or hardware?
It is the core which is going down, and it is expected to do so in a
graceful manner if possible. If the limitation is time, that is
understandable but not exciting for smaller systems.
Not sure if that's the only reason. The core has also notified that it's
about to power off or reboot; the OS takes care to save what it needs,
and the platform may give others a chance to do the same via notifications.
--
Regards,
Sudeep
