Hi Julius,
Sandrine is away on vacation so I thought I would follow up as we had a chat in one of our team meetings about the point below.
> On 07/07/2020, 23:05, "TF-A on behalf of Julius Werner via TF-A" <tf-a-bounces(a)lists.trustedfirmware.org on behalf of tf-a(a)lists.trustedfirmware.org> wrote:
>> IOW, you're fine with both options but you'd prefer to have different
>> people setting MR+1 and COR+1 (or CR+1 in the special cases mentioned
>> above), right? Just want to double-check I understood your position.
>Right.
So when we had +1/+2 CR before these changes we endeavoured to ensure that these were done by different people and if a maintainer gave a +1 they resisted also giving a +2 as we recognise having multiple eyes is always advantageous. So the general feeling was that this should be the recommended best practice. However, the feeling was also that this specific recommendation should not be strictly enforced by gerrit checking as on occasions it is possible to give both on simple patches and maintainers and code owners in this case can be trusted to make the correct choice.
The enforcing of them in Gerrit should be viewed as an aid to try and avoid mistakes rather than strict control for as you say we don't expect people to intentionally break the rules. Perhaps a warning in this case can be made to remind folks what's best practice here.
For now though as mentioned as we practice the new rules and recommendations we will use an honour system rather than strict enforcement along with updating the documented rules and recomendations.
Thanks
Joanna
Hi Raghu,
The Exception Handling framework (EHF) was designed to provide prioritization between the EL3 interrupts and EA although EA will always have highest priority since it cannot be blocked. In the case you describe, the driver in EL3 which is handling the RAS errors would need to ensure serialization of the events delivered to the S-EL0 payload somehow. This could be either via holding the event in a queue in EL3 till EL0 is done with processing the first event or the EL0 payload is capable of re-entry and can manage a queue internally. In case of MM, I suppose re-entry is not an option and hence a holding queue in EL3 driver needs to be implemented.
The current implementation in sgi_ras.c doesn't do this currently as this was a PoC to showcase the RAS flow.
Best Regards
Soby Mathew
> -----Original Message-----
> From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Raghu K
> via TF-A
> Sent: 13 July 2020 22:28
> To: tf-a(a)lists.trustedfirmware.org
> Subject: [TF-A] Deadlock in SGI RAS handling
>
> Hi All,
>
> I was going through some code in sgi_ras.c and was wondering if the
> situation mentioned below could cause a deadlock or if i'm missing
> something. It seems like it is possible to deadlock if we enter MM in S-
> EL0(say through an MM_COMMUNICATE SMC or perhaps an initial RAS
> interrupt) followed by a SYNC EA or ASYNC EA on the same core. sgi_ras.c
> seems like it registers the same handler for both interrupts and aborts.
> While interrupts can be blocked/masked, SYNC EA's cannot be blocked(not
> that i know of), and i don't see SErrors being blocked on the path to the EA
> handler and entry to MM. If this situation does occur, it seems like we could
> deadlock when the EA attempts to enter MM again in the interrupt handler.
> Is there something that would prevent this situation from happening?
>
>
> Thanks
> Raghu
> --
> TF-A mailing list
> TF-A(a)lists.trustedfirmware.org
> https://lists.trustedfirmware.org/mailman/listinfo/tf-a
Hi All,
I was going through some code in sgi_ras.c and was wondering if the
situation mentioned below could cause a deadlock or if i'm missing
something. It seems like it is possible to deadlock if we enter MM in
S-EL0(say through an MM_COMMUNICATE SMC or perhaps an initial RAS
interrupt) followed by a SYNC EA or ASYNC EA on the same core. sgi_ras.c
seems like it registers the same handler for both interrupts and aborts.
While interrupts can be blocked/masked, SYNC EA's cannot be blocked(not
that i know of), and i don't see SErrors being blocked on the path to
the EA handler and entry to MM. If this situation does occur, it seems
like we could deadlock when the EA attempts to enter MM again in the
interrupt handler.
Is there something that would prevent this situation from happening?
Thanks
Raghu
Hi,
>From the initial read of the Arm ARM, there are multiple keys (instruction, data) provided for authenticating pointers. But the current implementation only writes IA key [1].
I would like to understand the thought process behind programming only one key here. AFAIU, we should enable all keys - IA, IB, DA, DB.
-Varun
[1] https://review.trustedfirmware.org/plugins/gitiles/TF-A/trusted-firmware-a/…
Hi All,
The next TF-A Tech Forum is scheduled for Thu 16th July 2020 16:00 – 17:00 (BST). A reoccurring meeting invite has been sent out to the subscribers of this TF-A mailing list. If you don’t have this please let me know.
Agenda:
* Secure EL2 SPM (Secure Partition Manager) Hafnium-based
* In this TF-A Tech Forum session we present the status and open roadmap for the Secure Partition Manager firmware development. The TF-A SPM is the reference open source implementation for the PSA FF-A (Platform Security Architecture Firmware Framework for A-class) specification in the Secure world. It leverages the Armv8.4-Secure EL2 extension bringing virtualization technology in the Secure world (S-EL2 exception level). The development derives originally from the Google Hafnium project, which has been recently transitioned to https://www.trustedfirmware.org/ under the BSD 3-Clause license.
* Optional TF-A Mailing List Topic Discussions
If TF-A contributors have anything they wish to present at any future TF-A tech forum please contact me to have that scheduled.
Previous sessions, both recording and presentation material can be found on the trustedfirmware.org TF-A Technical meeting webpage: https://www.trustedfirmware.org/meetings/tf-a-technical-forum/
A scheduling tracking page is also available to help track sessions suggested and being prepared: https://developer.trustedfirmware.org/w/tf_a/tf-a-tech-forum-scheduling/ Final decisions on what will be presented will be shared a few days before the next meeting and shared on the TF-A mailing list.
Thanks
Joanna
+1 for re-introduction of 'Code-Review' label in addition to 'Code-Owner-Review' label.
I would like to see the enforcement of authors of patches not being allowed to self review eventually for all labels. Hopefully we can enable maintainers who are not the author of deadlocked patches to allow the setting of all required +1's to labels is situations where this is necessary for admin purposes after due consideration. I agree though while this process is settling down an honour system is appropriate.
Cheers
Joanna
On 01/07/2020, 08:56, "TF-A on behalf of Sandrine Bailleux via TF-A" <tf-a-bounces(a)lists.trustedfirmware.org on behalf of tf-a(a)lists.trustedfirmware.org> wrote:
Hi Julius,
On 7/1/20 2:13 AM, Julius Werner wrote:
> Hi Sandrine,
>
> Sounds like good changes in general! I'm curious what the ACLs are for
> Code-Owner-Review? Is it tied to docs/about/maintainers.rst or just
> based on the honor system? (I notice that I seem to be able to give a
> +1 for code I'm not owning, but maybe that's because I am a
> maintainer?)
Right now, the ACLs are not tied to docs/about/maintainers.rst (any
registered user could vote on this label) and it is just based on the
honor system. However, I'd like this to be enforced in the future, we
just haven't had the time to put the right tooling in place for that.
Also we did not want to spend time on developing such scripts before we
tried out these process changes in practice. If they proved too heavy or
inconvenient, part of this work would have gone to waste.
BTW, any help for the tooling is welcome! If you've got plugin
configuration files or scripts we could reuse (with an appropriate
license) or even tips on how to best set this up, please feel free to
share these.
> Also, are code owners allowed to +1 themselves (I think
> we said we didn't want maintainers to do that, but for code owners I
> could see how we might want to allow it since there are usually not
> that many)? What do we do when someone uploads the first patch for a
> new platform, do they just COR+1 themselves (since there is no code
> owner yet)?
That is indeed a grey area. As you say, many modules have a single code
owner and we need to agree on how we would like such code reviews to be
conducted. Thanks for starting this discussion!
One option as you say would be to allow code owners to self-review their
patches but I am not convinced we would gain anything out of this. It
sounds like a tick-box exercise to me, an admin overhead just to get the
patch through the system and I would like to avoid that as much as
possible. It is likely that a patch submitter has already self-reviewed
his code before posting a patch anyway.
The alternative we've been discussing in the team is to call out another
reviewer in these situations. I think that there is still value in
having a second fresh pair of eyes on a patch. Even if the reviewer has
no particular expertise on this specific module, they can still catch
potential logic problems or structural issues in the code.
The latter would be my preference. What do others think?
> I think it might still be useful to retain the existing Code-Review as
> a +1/-1 label next to the two new ones, just to allow other interested
> parties to record their opinion (without it having any enforcing
> effect on the merge). In other Gerrit instances I have used people
> often like to give CR+1 as a "I'm not the one who needs to approve
> this but I have looked at it and think it's fine" or a CR-1 as a "I
> can't stop you from doing this but I still want to be clear that I
> don't think it's a good idea". It allows people outside the official
> approval process a clearer way to participate and can aid the official
> approvers in their decisions (e.g. when I am reviewing a patch as a
> maintainer that already has a CR-1 from someone else I know to pay
> extra attention to their concerns, and it's more visible than just
> some random comment further up in the list). What do you think?
That's a very good idea, thanks! It also aligns with Javier's concerns,
which sound perfectly valid to me.
I think your proposal of having 3 distinct labels is better than having
code owners and non-code owners voting on the same generic 'Code-Review'
label, which is what Javier was suggesting. It clarifies further the
intent of each label and as you say allows us to configure different
rules for each (i.e. make the generic 'Code-Review' label
informative/optional, while making the 'Code-Owner-Review' label
mandatory for patch submission).
I am gonna wait for others' opinions on this before changing the
configuration in Gerrit again. I see Raghu agrees with this approach. If
nobody disagrees by the end of the week, I'll do these changes on Monday.
In the meantime, as discussed above, any registered user can vote on the
new 'Code-Owner-Review' label so let's continue to use that for the rest
of this week.
Regards,
Sandrine
--
TF-A mailing list
TF-A(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-a
Hello Sandrine,
IMO, the code review process has now evolved from the previous version. I am not too clear on the final outcome.
Does it make sense to add a "lifecycle of a review" section to the initial proposal?
-Varun
-----Original Message-----
From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Sandrine Bailleux via TF-A
Sent: Monday, July 6, 2020 1:53 AM
To: Julius Werner <jwerner(a)chromium.org>
Cc: tf-a <tf-a(a)lists.trustedfirmware.org>
Subject: Re: [TF-A] Announcing some changes around the code review label in Gerrit
External email: Use caution opening links or attachments
Hi Julius and all,
As agreed and announced last week, I've now reintroduced the Code-Review label in addition to the Code-Owner-Review and Maintainer-Review ones.
The Code-Review label is purely informational and won't influence whether a patch is submittable in Gerrit. Anyone should be able to vote on this label, if you face any issue please let me know.
On 7/1/20 11:12 PM, Julius Werner wrote:
>> BTW, any help for the tooling is welcome! If you've got plugin
>> configuration files or scripts we could reuse (with an appropriate
>> license) or even tips on how to best set this up, please feel free to
>> share these.
>
> The Chromium Gerrit has an owners-enforcement system but I'm not
> familiar with how exactly they set it up. I think they're using this
> plugin
> https://gerrit.googlesource.com/plugins/find-owners/+/refs/heads/maste
> r and this is where it's integrated into Gerrit submission rules:
> https://chromium.googlesource.com/chromiumos/+/refs/meta/config/rules.
> pl . It works by having a file called OWNERS in certain directories
> which sets the owners for the subtree below it, so we would have to
> rewrite the current code owner list in that format if we wanted to use
> it.
This "find-owners" plugin sounds promising! Thanks for the pointers.
I notice the "reviewers" plugin [1] is installed on review.tf.org but it does not sound as configurable as "find-owners".
Rewriting the current code owners list to use the OWNERS format does not sound like a big task to me (just a dull one... could be scripted,
though) so this approach definitely sounds worth investigating to me.
[1]
https://review.trustedfirmware.org/plugins/reviewers/Documentation/index.ht…
> There also seems to be this completely separate plugin that claims to
> do roughly the same thing, I have no idea where the difference is
> between the two:
> https://gerrit.googlesource.com/plugins/owners/+/refs/heads/master
I notice the "find-owners" plugin documentation mentions that "this plugin works with Gerrit projects that use Android or Chromium compatible OWNERS files" so maybe they started off from the generic "owners" plugin and customized it for Android/Chromium's needs? Just a guess.
Anyway, worth a look as well, thanks!
[2]
https://gerrit.googlesource.com/plugins/find-owners/+/refs/heads/master/src…
>> One option as you say would be to allow code owners to self-review their
>> patches but I am not convinced we would gain anything out of this. It
>> sounds like a tick-box exercise to me, an admin overhead just to get the
>> patch through the system and I would like to avoid that as much as
>> possible. It is likely that a patch submitter has already self-reviewed
>> his code before posting a patch anyway.
>>
>> The alternative we've been discussing in the team is to call out another
>> reviewer in these situations. I think that there is still value in
>> having a second fresh pair of eyes on a patch. Even if the reviewer has
>> no particular expertise on this specific module, they can still catch
>> potential logic problems or structural issues in the code.
>
> I definitely did not mean to suggest that these patches should not be
> reviewed at all -- I was just talking about the Code-Owner-Review
> label. There would still be a maintainer review, of course, and it
> seems like a good idea to get additional reviews from other people
> too. They just couldn't set the COR+1 bit once we start ACLing it.
Right, I see (and agree with you).
> Basically, each of these bits have their own purpose, and I would see
> the purpose of the COR+1 bit to be that the person most familiar with
> that particular piece of code has made sure that it fits in and
> doesn't cause unexpected problems with certain quirky configurations
> or something like that. That's important when, say, I make a generic
> refactoring that touches a platform I'm unfamiliar with, but if the
> code owner adds to their own platform that's probably already a given
> and they are likely still the best person to judge that, even for
> their own code.
Yes, that makes sense to me.
> That all code gets reviewed by people other than the author I would
> see as a somewhat orthogonal concern that should be checked
> independently. So maybe the rule could be that if the code owner sets
> COR+1 for themselves, there needs to be at least one Code-Review+1 (if
> we reintroduce that label) from another person (who is also not the
> one setting Maintainer-Review+1) to make sure the minimum amount of
> reviews stays the same. Or maybe these should all be completely
> independent checks so every patch needs a Code-Owner-Review+1 (can be
> from the author), a Maintainer-Review+1 (not from the author) and at
> least two Code-Review+1 from people other than the author -- with the
> normal expectation that when a maintainer or code owner reviews a
> patch, they would also set Code-Review+1 (as a general sign that they
> reviewed the patch) in addition to their special meaning flag.
I still don't see the benefit of code owners setting COR+1 for themselves. I would think that a patch owner already reviews his own patch before posting it for review so a self COR+1 is just reinstating the obvious in my eyes.
Your first suggestion (i.e. having at least one CR+1 from another person when a patch author cannot find another code owner to review their
patches) sounds aligned with what I proposed earlier.
Your second suggestion sounds a bit too heavy to me. The fact that people would have to replay their vote on several labels sounds like an admin overhead to me. I guess this would simplify the ACL configuration but I would prefer to avoid this if we can.
So I suggest the following. We could have 2 ways to get a patch approved.
1) For patches that have multiple code owners:
* COR+1 (not from patch owner)
* MR+1 (not from patch owner, not from the person setting COR+1)
2) For patches that a have single code owner:
* CR+1 (not from patch owner)
* MR+1 (not from patch owner, not from the person setting CR+1)
In 2), I like your suggestion of mandating 2 CR+1 but I fear this might be difficult to achieve in practice (simply because we might not have enough reviewers for this).
Note that in some cases, we would need to mix both policies. If a patch modifies several modules, some of them having multiple code owners and others having a single code owner, we would apply one or the other policy for each module.
How does that sound?
Also, 1) assumes we always want different individuals doing the code-owner review and the maintainer review. I personally don't feel strongly about this split and I would be fine with the same person doing both types of review, as they are typically looking at different criteria.
What do others think?
Regards,
Sandrine
--
TF-A mailing list
TF-A(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-a
Thanks Andrew for raising the issue and Marc and Raghu for your inputs.
The effect of stage 2 when Stage 1 is disabled is something that was overlooked in the workaround implementation. After some discussions this is what we currently understand regarding the problem.
1. EL3 does not modify the EL2 stage 1 translations and hence any speculative execution of AT execution in EL3 and resulting PTW and TLB caching should not be a problem as we always return to the same context in EL2. TF-A in EL3 need not worry about this. Also, we expect that future CPUs having v8.4 extensions for S-EL2 will have already corrected this errata.
2. Whenever the EL1/0 Stage 1 translations are switched in EL3 as part of switching the worlds, it should take Stage 2 into account. Hence disabling the Stage 1 MMU is not effective, as Marc pointed out, because speculative PTW can still take place using Stage 2 identity mapping. After discussion with James, we also realized that flipping the SCR_EL3.NS bit could take Stage 2 out of context (Stage 2 is not valid in secure world) leaving Stage 1 pointing to invalid translations.
As discussed in previous emails, on Entry to EL3, if TCR_EL1.EPDx = 1 and SCTLR_EL3.M = 1, any speculative AT PTWs for EL1 context can be prevented (including Stage 2).
So the fix as we currently understand would involve the following sequence :
a. On Entry to EL3, save the incoming SCTLR_EL1.M and TCR_EL1.EPDx bits and set them (ensure TCR_EL1.EPDx =1 prior to SCTLR_EL1.M =1 using isb())
b. Prior to Exit from EL3, after the target context is restored, restore the SCTLR_EL1.M and TCR_EL1.EPDx bits.
The above sequence now means that any use of AT instruction targeted at lower EL from EL3 that require PTW will fault. So prior to use of AT, ensure the PTW are re-enabled and disabled back again after the AT instructions.
If the above sequence is agreed upon to resolve the errata, then we can work on a patch for the same. I suspect current el1 register save and restore sequence in TF-A is a bit unwieldy and we may need to analyze all the entry points to EL3 to ensure we cover everything.
Best Regards
Soby Mathew
> -----Original Message-----
> From: TF-A <tf-a-bounces(a)lists.trustedfirmware.org> On Behalf Of Andrew
> Scull via TF-A
> Sent: 02 July 2020 09:20
> To: Raghu K <raghu.ncstate(a)icloud.com>
> Cc: android-kvm(a)google.com; willdeacon(a)google.com; Marc Zyngier
> <mzyngier(a)google.com>; tf-a(a)lists.trustedfirmware.org
> Subject: Re: [TF-A] Erroneous speculative AT workaround
>
> On Wed, Jul 01, 2020 at 05:47:00PM -0700, Raghu K wrote:
> > This is interesting. It appears that there is no way on entry to EL3
> > to guarantee that the out-of-context(el2 and el1) translation regimes
> > are in a consistent state and on every entry into EL3, we have to
> > conservatively assume that it is in an inconsistent state. This is
> > because of the situation Andrew mentioned(interrupts to EL3 can occur at any
> time).
> >
> > If this is the case, on EL3 entry:
> > 1) For EL1, we will need to save SCTLR_EL1, set SCTLR_EL1.M = 1,.EPDx
> > = 0
>
> This would still be racing against any potential speculative execution of an AT
> instruction upon the switch to EL3, IIUC. The window would be much smaller
> but not entirely eliminated.
>
> For KVM, this would be enough as KVM will have already applied this
> workaround (with Marc's corrections) whenever it is going to enter an
> inconsistent state. However, other EL2 software may choose to handle the
> errata differently, possibly going to the lengths of ensuring that no AT
> instruction is ever mapped executable.
> --
> TF-A mailing list
> TF-A(a)lists.trustedfirmware.org
> https://lists.trustedfirmware.org/mailman/listinfo/tf-a