Thank Soby, I also had more discussions to understand in detail and agree that we should have the barrier in both GICv2 and GICv3 cases. GICv3 case it’s a must because there is no ordering even from core point of view regardless of what happens downstream.
In rare care the DEVICEnGnRnE writes can *complete* out-of-order depending on the system design. The core as such cannot have much control over it but this dsb can make sure further instructions do not execute till previous transfers are architecturally complete and not just ordering.
Updated the patch to add v2 as well. https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/4454
Thanks Sandeep
-----Original Message----- From: Soby Mathew [mailto:Soby.Mathew@arm.com] Sent: Friday, June 12, 2020 7:21 PM To: Sandeep Tripathy; Raghu Krishnamurthy; Olivier Deprez; tf- a@lists.trustedfirmware.org Cc: nd Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
-----Original Message----- From: Sandeep Tripathy sandeep.tripathy@broadcom.com Sent: 10 June 2020 12:55 To: Soby Mathew Soby.Mathew@arm.com; Raghu Krishnamurthy raghu.ncstate@icloud.com; Olivier Deprez Olivier.Deprez@arm.com;
tf-
a@lists.trustedfirmware.org Cc: nd nd@arm.com Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
-----Original Message----- From: Soby Mathew [mailto:Soby.Mathew@arm.com] Sent: Wednesday, June 10, 2020 3:15 PM To: Sandeep Tripathy; Raghu Krishnamurthy; Olivier Deprez; tf- a@lists.trustedfirmware.org Cc: nd Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
-----Original Message----- From: Sandeep Tripathy sandeep.tripathy@broadcom.com Sent: 09 June 2020 17:04 To: Soby Mathew Soby.Mathew@arm.com; Raghu Krishnamurthy raghu.ncstate@icloud.com; Olivier Deprez
tf-
a@lists.trustedfirmware.org Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
Hi Soby,
-----Original Message----- From: Soby Mathew [mailto:Soby.Mathew@arm.com] Sent: Tuesday, June 09, 2020 6:42 PM To: Sandeep Tripathy; Raghu Krishnamurthy; Olivier Deprez; tf- a@lists.trustedfirmware.org Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
-----Original Message----- From: Sandeep Tripathy sandeep.tripathy@broadcom.com Sent: 09 June 2020 06:59 To: Soby Mathew Soby.Mathew@arm.com; Raghu Krishnamurthy raghu.ncstate@icloud.com; Olivier Deprez
tf- a@lists.trustedfirmware.org Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC
Hi Soby,
> -----Original Message----- > From: Soby Mathew [mailto:Soby.Mathew@arm.com] > Sent: Monday, June 08, 2020 9:46 PM > To: Raghu Krishnamurthy; Sandeep Tripathy; Olivier Deprez; tf- > a@lists.trustedfirmware.org > Subject: RE: [TF-A] GICV3: system interface EOI ordering RFC > > Hi Sandeep, Raghu > It is good that the ARM support has confirmed the same. From > my reading of the GICv3 specification, it says the read of > ICC_IAR_ELx is self-synchronizing with respect to the > interrupt it is activating. It does not mention that writes > are self-synchronizing or whether it would be synchronized wrt > to a Device memory access. So a dsb() is needed prior to EOI. > > For GICv2, Device memory attribute only guarantees " the order > of memory accesses arriving at a single peripheral of > IMPLEMENTATION DEFINED size, as defined by the peripheral, > must be the same order that occurs in a simple sequential
execution of the program"
> (section > B2.7.2 - Device memory - reordering, version E.a of ARM ARM). > This means that a dmb() is needed if there has to be ordering > enforced between peripherals. >
I don’t find details about the 'size' in some TRMs ? And it does not seem to
me
DMB will have any additional constraint on such reordering possibilities between DEVICE memories accesses. "The IMPLEMENTATION DEFINED size of the single peripheral is the same as applies for the ordering guarantee provided by the DMB
instruction. "
Hmm, yes, IMPLEMENTATION DEFINED is not defined any apparently.
From
the description of DMB as you have mentioned, looks like a DSB would be needed.
My interpretation of the documentation in reference manual is there is no issue with reordering between different devices(nR) which can be managed by memory barriers. If they exist then that would need other techniques to ensure the completion such as a dummy read followed by the write as the peripheral end might travel different path (protocol/bridges/buffers). But those are certainly things for platforms and devices to worry about.
Hi Sandeep Right, but given that GICv2 CPU interface for EOI does have this ordering dependency with the Peripheral, I would think it would be good if the GICv2 driver API cater for it.
That is something which cannot be ensured with memory barriers (DMB/DSB) as these instructions only prevents reordering within same IMPLEMENTATION DEFINED size range as applicable to PERIPHERAL. This is just my interpretation of the document and needs confirmation.
Hi Sandeep, I emailed Arm support and have some additional info.
If the device memory is mapped with nGnRnE attributes, then the ARM ARM says the following :
".. a DSB barrier instruction, executed by the PE that performed the write to the No Early Write Acknowledgement Location, completes only after the write has reached its endpoint in the memory system."
Although there is a note that writes under PCIe will not support write acknowledgement from the endpoint. Once the device write is done, the dsb() can ensure the write to the device has "completed" prior to GIC_IAR write.
So we could modify the GICv2 API as well to include the dsb() and bring it inline with the GICv3 API.
Best Regards Soby Mathew
So while you may reconfirm from arm support about this (that will help)
we
can focus on GICv3 case ie. ordering icc_eoi write (msr) against DEVICE (nR) writes by introducing 'dsbishst especially where should we keep it.
I think it is better for the GICv3 API to cater for the ordering requirements without relying on the caller to worry about this. So your patch as it is stands looks good to me.
Best Regards Soby Mathew
The reordering between different devices may not be a real issue on GICv2 systems. So I would hold back on making any changes for GICv2 atleast for the moment. I will query ARM support on this and check if they have more to say on this.
Best Regards Soby Mathew
Thanks Sandeep
> If we take that GICv2 and the interrupting device are 2 > peripherals, then there has to be a dmb() between them to > ensure ordering between peripheral write and GIC_IAR write for
EOI.
> > Best Regards > Soby Mathew >
Thanks Sandeep
> > > -----Original Message----- > > From: Raghu Krishnamurthy raghu.ncstate@icloud.com > > Sent: 08 June 2020 16:16 > > To: Sandeep Tripathy sandeep.tripathy@broadcom.com; Soby > > Mathew Soby.Mathew@arm.com; Olivier Deprez > > Olivier.Deprez@arm.com;
tf-
> > a@lists.trustedfirmware.org > > Subject: Re: [TF-A] GICV3: system interface EOI ordering RFC > > > > Thanks Sandeep. Good enough for me! Very interesting that it > > is seemingly > not > > observed more often on real hardware. > > > > Thanks > > -Raghu > > > > On 6/8/20 8:11 AM, Sandeep Tripathy wrote: > > > Hi Raghu, > > >> -----Original Message----- > > >> From: TF-A > > >> [mailto:tf-a-bounces@lists.trustedfirmware.org] On Behalf > > >> Of Raghu Krishnamurthy via TF-A > > >> Sent: Monday, June 08, 2020 7:09 PM > > >> To: tf-a@lists.trustedfirmware.org > > >> Subject: Re: [TF-A] GICV3: system interface EOI ordering > > >> RFC > > >> > > >> Hello, > > >> > > >> Sorry for chiming in late. Are we sure that the msr/mrs > > >> to the > > >> GICv3 EOI register is not a context synchronizing event > > >> or ensures ordering itself ? Also did the addition of > > >> DSB() fix the issue ? If yes, can you try adding a delay > > >> or a few NOP's/dummy instructions or an ISB to see if the > > >> issue goes
away?
> > > > > > Yes these experiments were done to observe the behavior > > > which led to this change. > > > A definitive confirmation of how core treats these > > > accesses can be only be from arm core team or by actually > > > probing the
transfers.
> > > I had a response from arm support on the very same > > > question which reads as follows: > > > "In theory, the memory barrier should be placed between > > > the clearing of the peripheral and the access to the EOI > > > register". > > > In a slightly unrelated case I had observed this with NVIC > > > on axi probe and there is no feasibility to repeat this > > > with > > > GICv3 at the moment. > > > > > >> > > >> We need to know the right thing to do architecturally and > > >> confirm that the issue is really due to reordering at the > > >> core before adding the DSB. > > >> Adding a DSB might be masking a timing issue. > > > > > > Is the above response and what Soby had confirmed > > > sufficient
?
> > > or could someone from arm get this reconfirmed from core > > > design unless it is obvious to them ? And this is equally > > > applicable to linux as
well.
> > > > > >> > > >> -Raghu > > >> > > > Thanks > > > Sandeep > > >> > > >> > > >> On 6/8/20 4:47 AM, Sandeep Tripathy via TF-A wrote: > > >>>> -----Original Message----- > > >>>> From: Soby Mathew [mailto:Soby.Mathew@arm.com] > > >>>> Sent: Monday, June 08, 2020 3:36 PM > > >>>> To: Sandeep Tripathy; Olivier Deprez; > > >>>> tf-a@lists.trustedfirmware.org > > >>>> Cc: nd > > >>>> Subject: RE: [TF-A] GICV3: system interface EOI > > >>>> ordering RFC > > >>>> > > >>>> Just some inputs from my side: > > >>>> I agree we need at least a dsb() after the write to > > >>>> MMIO region to silence the peripheral and before the > > >>>> EOI at GIC sysreg interface. > > >>>> Adding it to the GIC EOI > > >>> > > >>> Thanks for the confirmation. Now it’s about choosing the > > >>> right place. > > >>> > > >>>> API seems the logical thing to do but as Olivier > > >>>> mentions, there are interrupt handler which do not deal > > >>>> with MMIO > > >>>> (eg: > > >>>> the Systimer > > >>>> interrupt) so adding it to GICv3 API might be arduous > > >>>> for such handlers. > > >>>> > > >>>> So there is a choice here to let the interrupt handlers > > >>>> to deal with ensuring completeness before EOI at sysreg > > >>>> interface or adding it to GICv3 EOI API and take the > > >>>> overhead for interrupt handlers which do not have to > > >>>> deal
with
MMIO.
> > >>>> > > >>> > > >>> Yes I feel either of these is a must to guarantee > > >>> functionality architecturally though both approach end > > >>> up with some unnecessary overhead. > > >>> > > >>> If GICv3 api takes care then it is an overhead for some > > >>> ISRs dealing with non-MMIO. > > >>> At present I do not see an active use case in reference > > >>> implementation where sys timer ISR is in a performance > > >>> intensive path where one additional DSB will be
perceivable.
> > >>> But there may be some I could be totally wrong in this > > >>> assumption > > >> (pmu/debug > > >>> or.. not sure). > > >>> Whereas I can certainly imagine some MMIO ISRs in > > >>> performance critical path where unnecessary DSB is not
acceptable at all.
> > >>> > > >>> If the interrupt handler needs to ensure then it will > > >>> generically add 'DSB' > > >>> as I think > > >>> the driver cannot and should not make assumptions about > > >>> how EOI is done afterwards. > > >>> That will be overhead for the ISRs dealing with MMIO > > >>> peripherals and non GIC-v3. > > >>> If we consider only GICv3+ then good. Otherwise I would > > >>> prefer the 'plat_ic_end_of_interrupt' > > >>> like Olivier mentioned with a #if GICv3 instead of each > > >>> ISRs dealing with it. > > >>> > > >>>> The GICv3 legacy MMIO CPU interface is deprecated for > > >>>> TF-A and the sys interface is the only one GICv3 driver > > >>>> in TF-A supports. > > >>> > > >>> Right we can ignore the GICv3 legacy mode but GICv2 will > > >>> still remain ? > > >>> > > >>>> > > >>>> Best Regards > > >>>> Soby Mathew > > >>>> > > >>> > > >>> Thanks > > >>> Sandeep > > >>> > > >>>>> -----Original Message----- > > >>>>> From: TF-A tf-a-bounces@lists.trustedfirmware.org On > > >>>>> Behalf Of > > >> Sandeep > > >>>>> Tripathy via TF-A > > >>>>> Sent: 08 June 2020 10:34 > > >>>>> To: Olivier Deprez Olivier.Deprez@arm.com; tf- > > >>>> a@lists.trustedfirmware.org > > >>>>> Subject: Re: [TF-A] GICV3: system interface EOI > > >>>>> ordering RFC > > >>>>> > > >>>>> Hi Olivier, > > >>>>>> -----Original Message----- > > >>>>>> From: Olivier Deprez [mailto:Olivier.Deprez@arm.com] > > >>>>>> Sent: Monday, June 08, 2020 1:14 PM > > >>>>>> To: tf-a@lists.trustedfirmware.org; Sandeep Tripathy > > >>>>>> Subject: Re: [TF-A] GICV3: system interface EOI > > >>>>>> ordering RFC > > >>>>>> > > >>>>>> Hi Sandeep, > > >>>>>> > > >>>>>> gicv3_end_of_interrupt_sel1 is a static access helper > > >>>>>> macro. > > >>>>>> Its naming precisely tells what it does at gicv3 > > >>>>>> module level. It is called from > > >>>>> the weak > > >>>>>> plat_ic_end_of_interrupt function for BL32 image. > > >>>>>> > > >>>>>> I tend to think it is the driver responsibility to > > >>>>>> ensure the module > > >>>>> interrupt > > >>>>>> acknowledge register write is reaching HW in order > > >>>>>> (or "be visible to > > >>>>> all > > >>>>>> observers"). > > >>>>> > > >>>>> The driver should be agnostic of what interrupt > > >>>>> controller is used and its behavior. > > >>>>> And since 'all' writes were to mmio ranges mapped > > >>>>> Device(nGnRE) > > >>>>> /strongly-ordered(nGnRnE) there was no such need. This > > >>>>> is a special case for GICv3 system interface only. > > >>>>> Adding at driver level a DSB is unnecessary for other > > >>>>> interrupt controllers. > > >>>>> > > >>>>>> Also I suspect adding a dsb might not be required > > >>>>>> generically for all > > >>>>> kind of IP. > > >>>>> > > >>>>> Here are you referring to the peripheral IP or > > >>>>> interrupt controller IP ? > > >>>>> The issue is reordering at arm core itself (STR to > > >>>>> device address Vs msr(sysreg write)). > > >>>>> So I think Its applicable for all IP. > > >>>>> > > >>>>>> Adding a barrier in generic code might incur an > > >>>>>> unnecessary bottleneck. > > >>>>> > > >>>>> But if there is a need to *ensure* presence of at > > >>>>> least one DSB between the write transfer from core to > > >>>>> device clear and > > >>>>> gicv3 eoi icc register write in a generic way then > > >>>>> what other option we have. > > >>>>>> > > >>>>>> Thus wouldn't it be better to add the barrier to the > > >>>>>> overridden platform function rather than in generic > > >>>>>> gicv3 code? > > >>>>> > > >>>>> That can be done but I feel this is more to do with > > >>>>> gicv3 system interface only. > > >>>>> Inside plat_xxx one has to check #if GICV3 ...and > > >>>>> system interface. > > >>>>>> > > >>>>>> I have a put a comment in the review, we can continue > > >>>>>> the discussion > > >>>>> there. > > >>>>>> > > >>>>>> Regards, > > >>>>>> Olivier. > > >>>>>> > > >>>>> Thanks > > >>>>> Sandeep > > >>>>>> ________________________________________ > > >>>>>> From: TF-A tf-a-bounces@lists.trustedfirmware.org > > >>>>>> on behalf of Sandeep Tripathy via TF-A > > >>>>>> tf-a@lists.trustedfirmware.org > > >>>>>> Sent: 05 June 2020 19:43 > > >>>>>> To: tf-a@lists.trustedfirmware.org > > >>>>>> Subject: [TF-A] GICV3: system interface EOI ordering > > >>>>>> RFC > > >>>>>> > > >>>>>> Hi, > > >>>>>> In a typical interrupt handling sequence we do > > >>>>>> 1-Read IAR > > >>>>>> 2-Do interrupt handling > > >>>>>> 3-Clear the interrupt at source , so that the > > >>>>>> device de-asserts IRQ > > >>>>> request to > > >>>>>> GIC > > >>>>>> >> I am suggesting a need of DSB here in > > >>>>>> case of > > >>>>>> GICv3 > > >>>>>> +. > > >>>>>> 4-Write to GIC cpu interface to do EOI. > > >>>>>> > > >>>>>> Till GICv2 and with GICv3 legacy interface ICC_EOI > > >>>>>> write is a write over > > >>>>> AMBA > > >>>>>> interface. The > > >>>>>> Addresses are mapped with (nR) attribute. Hence the > > >>>>>> write transfers > > >>>>> from the > > >>>>>> core will be > > >>>>>> generated at step 3 and 4 in order. Please ignore > > >>>>>> the additional > > >>>>> buffers/bridges > > >>>>>> in path from > > >>>>>> core till peripheral which has to be dealt separately > > >>>>>> as per SOC. > > >>>>>> > > >>>>>> Query: I understand GICv3 system interface accesses > > >>>>>> are not over this > > >>>>> protocol > > >>>>>> and core will not > > >>>>>> follow the ordering rule ? > > >>>>>> > > >>>>>> I have observed spurious interrupt > > >>>>>> issue/manifestation ( I don't have > > >>>>> the > > >>>>>> transfers probed) in > > >>>>>> RTOS environment where I have a primitive GICv3 > > >>>>>> driver but I wonder why things does not fail in Linux > > >>>>>> or
tf-
a.
> > >>>>>> If it is working because from step(3) to > > >>>>> step(4) we have > > >>>>>> barriers by chace > > >>>>>> due to other device register writes then I would > > >>>>>> suggest to have one in > > >>>>> the EOI > > >>>>>> clearing API itself. > > >>>>>> > > >>>>>> RFC: > > >>>>> https://review.trustedfirmware.org/c/TF-A/trusted-firm > > >>>>> ware > > >>>>> -a/+ > > >>>>> /445 > > >>>>> 4 > > >>>>>> > > >>>>>> Thanks > > >>>>>> Sandeep > > >>>>> -- > > >>>>> TF-A mailing list > > >>>>> TF-A@lists.trustedfirmware.org > > >>>>> https://lists.trustedfirmware.org/mailman/listinfo/tf- > > >>>>> a > > >> -- > > >> TF-A mailing list > > >> TF-A@lists.trustedfirmware.org > > >> https://lists.trustedfirmware.org/mailman/listinfo/tf-a