-----Original Message----- From: Soby Mathew [mailto:Soby.Mathew@arm.com] Sent: Wednesday, June 10, 2020 3:15 PM To: Sandeep Tripathy; Raghu Krishnamurthy; Olivier Deprez; tf- a@lists.trustedfirmware.org Cc: nd Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
-----Original Message----- From: Sandeep Tripathy sandeep.tripathy@broadcom.com Sent: 09 June 2020 17:04 To: Soby Mathew Soby.Mathew@arm.com; Raghu Krishnamurthy raghu.ncstate@icloud.com; Olivier Deprez Olivier.Deprez@arm.com;
tf-
a@lists.trustedfirmware.org Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
Hi Soby,
-----Original Message----- From: Soby Mathew [mailto:Soby.Mathew@arm.com] Sent: Tuesday, June 09, 2020 6:42 PM To: Sandeep Tripathy; Raghu Krishnamurthy; Olivier Deprez; tf- a@lists.trustedfirmware.org Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
-----Original Message----- From: Sandeep Tripathy sandeep.tripathy@broadcom.com Sent: 09 June 2020 06:59 To: Soby Mathew Soby.Mathew@arm.com; Raghu Krishnamurthy raghu.ncstate@icloud.com; Olivier Deprez
tf- a@lists.trustedfirmware.org Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
Hi Soby,
-----Original Message----- From: Soby Mathew [mailto:Soby.Mathew@arm.com] Sent: Monday, June 08, 2020 9:46 PM To: Raghu Krishnamurthy; Sandeep Tripathy; Olivier Deprez; tf- a@lists.trustedfirmware.org Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
Hi Sandeep, Raghu It is good that the ARM support has confirmed the same. From my reading of the GICv3 specification, it says the read of ICC_IAR_ELx is self-synchronizing with respect to the interrupt it is activating. It does not mention that writes are self-synchronizing or whether it would be synchronized wrt to a Device memory access. So a dsb() is needed prior to EOI.
For GICv2, Device memory attribute only guarantees " the order of memory accesses arriving at a single peripheral of IMPLEMENTATION DEFINED size, as defined by the peripheral, must be the same order that occurs in a simple sequential execution of the program" (section B2.7.2 - Device memory - reordering, version E.a of ARM ARM). This means that a dmb() is needed if there has to be ordering enforced between peripherals.
I don’t find details about the 'size' in some TRMs ? And it does not seem to
me
DMB will have any additional constraint on such reordering possibilities between DEVICE memories accesses. "The IMPLEMENTATION DEFINED size of the single peripheral is the same as applies for the ordering guarantee provided by the DMB
instruction. "
Hmm, yes, IMPLEMENTATION DEFINED is not defined any apparently.
From
the description of DMB as you have mentioned, looks like a DSB would be needed.
My interpretation of the documentation in reference manual is there is no issue with reordering between different devices(nR) which can be managed by memory barriers. If they exist then that would need other techniques to ensure the completion such as a dummy read followed by the write as the peripheral end might travel different path (protocol/bridges/buffers). But those are certainly things for platforms and devices to worry about.
Hi Sandeep Right, but given that GICv2 CPU interface for EOI does have this ordering dependency with the Peripheral, I would think it would be good if the GICv2 driver API cater for it.
That is something which cannot be ensured with memory barriers (DMB/DSB) as these instructions only prevents reordering within same IMPLEMENTATION DEFINED size range as applicable to PERIPHERAL. This is just my interpretation of the document and needs confirmation.
So while you may reconfirm from arm support about this (that will help)
we
can focus on GICv3 case ie. ordering icc_eoi write (msr) against DEVICE (nR) writes by introducing 'dsbishst especially where should we keep it.
I think it is better for the GICv3 API to cater for the ordering requirements without relying on the caller to worry about this. So your patch as it is stands looks good to me.
Best Regards Soby Mathew
The reordering between different devices may not be a real issue on GICv2 systems. So I would hold back on making any changes for GICv2 atleast for the moment. I will query ARM support on this and check if they have more to say on this.
Best Regards Soby Mathew
Thanks Sandeep
If we take that GICv2 and the interrupting device are 2 peripherals, then there has to be a dmb() between them to ensure ordering between peripheral write and GIC_IAR write for EOI.
Best Regards Soby Mathew
Thanks Sandeep
-----Original Message----- From: Raghu Krishnamurthy raghu.ncstate@icloud.com Sent: 08 June 2020 16:16 To: Sandeep Tripathy sandeep.tripathy@broadcom.com; Soby Mathew Soby.Mathew@arm.com; Olivier Deprez Olivier.Deprez@arm.com;
tf-
a@lists.trustedfirmware.org Subject: Re: [TF-A] GICV3: system interface EOI ordering RFC
Thanks Sandeep. Good enough for me! Very interesting that it is seemingly
not
observed more often on real hardware.
Thanks -Raghu
On 6/8/20 8:11 AM, Sandeep Tripathy wrote: > Hi Raghu, >> -----Original Message----- >> From: TF-A [mailto:tf-a-bounces@lists.trustedfirmware.org] On >> Behalf Of Raghu Krishnamurthy via TF-A >> Sent: Monday, June 08, 2020 7:09 PM >> To: tf-a@lists.trustedfirmware.org >> Subject: Re: [TF-A] GICV3: system interface EOI ordering RFC >> >> Hello, >> >> Sorry for chiming in late. Are we sure that the msr/mrs to >> the >> GICv3 EOI register is not a context synchronizing event or >> ensures ordering itself ? Also did the addition of DSB() fix >> the issue ? If yes, can you try adding a delay or a few >> NOP's/dummy instructions or an ISB to see if the issue goes
away?
> > Yes these experiments were done to observe the behavior which > led to this change. > A definitive confirmation of how core treats these accesses > can be only be from arm core team or by actually probing the
transfers.
> I had a response from arm support on the very same question > which reads as follows: > "In theory, the memory barrier should be placed between the > clearing of the peripheral and the access to the EOI > register". > In a slightly unrelated case I had observed this with NVIC on > axi probe and there is no feasibility to repeat this with > GICv3 at the moment. > >> >> We need to know the right thing to do architecturally and >> confirm that the issue is really due to reordering at the >> core before adding the DSB. >> Adding a DSB might be masking a timing issue. > > Is the above response and what Soby had confirmed sufficient ? > or could someone from arm get this reconfirmed from core > design unless it is obvious to them ? And this is equally > applicable to linux as
well.
> >> >> -Raghu >> > Thanks > Sandeep >> >> >> On 6/8/20 4:47 AM, Sandeep Tripathy via TF-A wrote: >>>> -----Original Message----- >>>> From: Soby Mathew [mailto:Soby.Mathew@arm.com] >>>> Sent: Monday, June 08, 2020 3:36 PM >>>> To: Sandeep Tripathy; Olivier Deprez; >>>> tf-a@lists.trustedfirmware.org >>>> Cc: nd >>>> Subject: RE: [TF-A] GICV3: system interface EOI ordering >>>> RFC >>>> >>>> Just some inputs from my side: >>>> I agree we need at least a dsb() after the write to MMIO >>>> region to silence the peripheral and before the EOI at GIC >>>> sysreg interface. >>>> Adding it to the GIC EOI >>> >>> Thanks for the confirmation. Now it’s about choosing the >>> right place. >>> >>>> API seems the logical thing to do but as Olivier mentions, >>>> there are interrupt handler which do not deal with MMIO >>>> (eg: >>>> the Systimer >>>> interrupt) so adding it to GICv3 API might be arduous for >>>> such handlers. >>>> >>>> So there is a choice here to let the interrupt handlers to >>>> deal with ensuring completeness before EOI at sysreg >>>> interface or adding it to GICv3 EOI API and take the >>>> overhead for interrupt handlers which do not have to deal
with
MMIO.
>>>> >>> >>> Yes I feel either of these is a must to guarantee >>> functionality architecturally though both approach end up >>> with some unnecessary overhead. >>> >>> If GICv3 api takes care then it is an overhead for some ISRs >>> dealing with non-MMIO. >>> At present I do not see an active use case in reference >>> implementation where sys timer ISR is in a performance >>> intensive path where one additional DSB will be perceivable. >>> But there may be some I could be totally wrong in this >>> assumption >> (pmu/debug >>> or.. not sure). >>> Whereas I can certainly imagine some MMIO ISRs in >>> performance critical path where unnecessary DSB is not
acceptable at all.
>>> >>> If the interrupt handler needs to ensure then it will >>> generically add 'DSB' >>> as I think >>> the driver cannot and should not make assumptions about how >>> EOI is done afterwards. >>> That will be overhead for the ISRs dealing with MMIO >>> peripherals and non GIC-v3. >>> If we consider only GICv3+ then good. Otherwise I would >>> prefer the 'plat_ic_end_of_interrupt' >>> like Olivier mentioned with a #if GICv3 instead of each >>> ISRs dealing with it. >>> >>>> The GICv3 legacy MMIO CPU interface is deprecated for TF-A >>>> and the sys interface is the only one GICv3 driver in TF-A >>>> supports. >>> >>> Right we can ignore the GICv3 legacy mode but GICv2 will >>> still remain ? >>> >>>> >>>> Best Regards >>>> Soby Mathew >>>> >>> >>> Thanks >>> Sandeep >>> >>>>> -----Original Message----- >>>>> From: TF-A tf-a-bounces@lists.trustedfirmware.org On >>>>> Behalf Of >> Sandeep >>>>> Tripathy via TF-A >>>>> Sent: 08 June 2020 10:34 >>>>> To: Olivier Deprez Olivier.Deprez@arm.com; tf- >>>> a@lists.trustedfirmware.org >>>>> Subject: Re: [TF-A] GICV3: system interface EOI ordering >>>>> RFC >>>>> >>>>> Hi Olivier, >>>>>> -----Original Message----- >>>>>> From: Olivier Deprez [mailto:Olivier.Deprez@arm.com] >>>>>> Sent: Monday, June 08, 2020 1:14 PM >>>>>> To: tf-a@lists.trustedfirmware.org; Sandeep Tripathy >>>>>> Subject: Re: [TF-A] GICV3: system interface EOI ordering >>>>>> RFC >>>>>> >>>>>> Hi Sandeep, >>>>>> >>>>>> gicv3_end_of_interrupt_sel1 is a static access helper >>>>>> macro. >>>>>> Its naming precisely tells what it does at gicv3 module >>>>>> level. It is called from >>>>> the weak >>>>>> plat_ic_end_of_interrupt function for BL32 image. >>>>>> >>>>>> I tend to think it is the driver responsibility to ensure >>>>>> the module >>>>> interrupt >>>>>> acknowledge register write is reaching HW in order (or >>>>>> "be visible to >>>>> all >>>>>> observers"). >>>>> >>>>> The driver should be agnostic of what interrupt controller >>>>> is used and its behavior. >>>>> And since 'all' writes were to mmio ranges mapped >>>>> Device(nGnRE) >>>>> /strongly-ordered(nGnRnE) >>>>> there was no such need. This is a special case for GICv3 >>>>> system interface only. >>>>> Adding at driver level a DSB is unnecessary for other >>>>> interrupt controllers. >>>>> >>>>>> Also I suspect adding a dsb might not be required >>>>>> generically for all >>>>> kind of IP. >>>>> >>>>> Here are you referring to the peripheral IP or interrupt >>>>> controller IP ? >>>>> The issue is reordering at arm core itself (STR to device >>>>> address Vs msr(sysreg write)). >>>>> So I think Its applicable for all IP. >>>>> >>>>>> Adding a barrier in generic code might incur an >>>>>> unnecessary bottleneck. >>>>> >>>>> But if there is a need to *ensure* presence of at least >>>>> one DSB between the write transfer from core to device >>>>> clear and >>>>> gicv3 eoi icc register write in a generic way then what >>>>> other option we have. >>>>>> >>>>>> Thus wouldn't it be better to add the barrier to the >>>>>> overridden platform function rather than in generic gicv3 >>>>>> code? >>>>> >>>>> That can be done but I feel this is more to do with gicv3 >>>>> system interface only. >>>>> Inside plat_xxx one has to check #if GICV3 ...and system >>>>> interface. >>>>>> >>>>>> I have a put a comment in the review, we can continue the >>>>>> discussion >>>>> there. >>>>>> >>>>>> Regards, >>>>>> Olivier. >>>>>> >>>>> Thanks >>>>> Sandeep >>>>>> ________________________________________ >>>>>> From: TF-A tf-a-bounces@lists.trustedfirmware.org on >>>>>> behalf of Sandeep Tripathy via TF-A >>>>>> tf-a@lists.trustedfirmware.org >>>>>> Sent: 05 June 2020 19:43 >>>>>> To: tf-a@lists.trustedfirmware.org >>>>>> Subject: [TF-A] GICV3: system interface EOI ordering RFC >>>>>> >>>>>> Hi, >>>>>> In a typical interrupt handling sequence we do 1-Read >>>>>> IAR >>>>>> 2-Do interrupt handling >>>>>> 3-Clear the interrupt at source , so that the device >>>>>> de-asserts IRQ >>>>> request to >>>>>> GIC >>>>>> >> I am suggesting a need of DSB here in case >>>>>> of >>>>>> GICv3 >>>>>> +. >>>>>> 4-Write to GIC cpu interface to do EOI. >>>>>> >>>>>> Till GICv2 and with GICv3 legacy interface ICC_EOI write >>>>>> is a write over >>>>> AMBA >>>>>> interface. The >>>>>> Addresses are mapped with (nR) attribute. Hence the >>>>>> write transfers >>>>> from the >>>>>> core will be >>>>>> generated at step 3 and 4 in order. Please ignore the >>>>>> additional >>>>> buffers/bridges >>>>>> in path from >>>>>> core till peripheral which has to be dealt separately as >>>>>> per SOC. >>>>>> >>>>>> Query: I understand GICv3 system interface accesses are >>>>>> not over this >>>>> protocol >>>>>> and core will not >>>>>> follow the ordering rule ? >>>>>> >>>>>> I have observed spurious interrupt issue/manifestation ( >>>>>> I don't have >>>>> the >>>>>> transfers probed) in >>>>>> RTOS environment where I have a primitive GICv3 driver >>>>>> but I wonder why things does not fail in Linux or tf-a. >>>>>> If it is working because from step(3) to >>>>> step(4) we have >>>>>> barriers by chace >>>>>> due to other device register writes then I would suggest >>>>>> to have one in >>>>> the EOI >>>>>> clearing API itself. >>>>>> >>>>>> RFC: >>>>> https://review.trustedfirmware.org/c/TF-A/trusted-firmware >>>>> -a/+ >>>>> /445 >>>>> 4 >>>>>> >>>>>> Thanks >>>>>> Sandeep >>>>> -- >>>>> TF-A mailing list >>>>> TF-A@lists.trustedfirmware.org >>>>> https://lists.trustedfirmware.org/mailman/listinfo/tf-a >> -- >> TF-A mailing list >> TF-A@lists.trustedfirmware.org >> https://lists.trustedfirmware.org/mailman/listinfo/tf-a