Hi Julius,
Let me try...
This relates to TF-A continuous integration model spanning multiple trees incl. TF-A, TF-A-tests, tf-a-ci-scripts, Hafnium, job configs etc. When Allow-CI+1 label is applied to a TF-A change, jenkins clones the tree at the TF-A change (which can be old vs tip of integration), and clones tip of master for others. This can easily get out of sync when a change is left unattended even for few days, see the example below and attached picture (apologies I didn't manage to express my ascii art directly within the email).
At some point in time a TFA0 change is developed (e.g. branched from master). CI master head is CI0 at this very moment. Allow-CI+1 is applied to TF-A change and passes. Later a new test build config CI1 change is merged to tf-a-ci-scripts requiring a companion TFA1 change to build properly. When we come later to review TFA0 we launch Allow CI+1 but now takes CI1 as tip of master. Build breaks because TFA0 tree misses TFA1 in its history. A TFA0 rebase to TFA0' solves the issue.
Now add all aforementioned trees in this picture, the possible occurrence for this problem grows significantly.
Regards, Olivier.
From: Julius Werner via TF-A tf-a@lists.trustedfirmware.org Sent: 12 August 2022 01:09 To: tf-a tf-a@lists.trustedfirmware.org; Glen Valante glen.valante@linaro.org; Don Harbin don.harbin@linaro.org Subject: [TF-A] TF-A CI often fail due to missing rebase Hi Glen, Don, and others,
I've seen that a couple of TF-A patches I've been CCed on recently often seem to fail the CI run (Allow-CI+1) due to some strange build-time errors that don't seem to have anything to do with the patch at hand, and then one of the maintainers usually suggests that the patch needs a rebase, and the next CI run succeeds after that. Here are two recent examples:
https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/16160 https://review.trustedfirmware.org/c/TF-A/trusted-firmware-a/+/14666
I was wondering if this is a known problem and if the CI can do anything here to mitigate it and save developers from this extra layer of friction? I'm not really sure why a rebase was necessary in either of these examples or why the CI run failed before it (unless the whole repository was in a bad state that didn't build, but since all submissions are guarded by the CI that shouldn't have been possible?). But I also don't really understand why the rebase would make a difference for the CI anyway. Generally, when a patch is submitted in Gerrit, that means it is cherry-picked onto the current master (regardless of what parent commit it was uploaded with). Since the CI is supposed to be a test run for submission, I would expect that the CI should also test a patch by cherry-picking it onto the current master, not just by building the patch on top of whatever parent commit it was uploaded with. But since rebasing a patch evidently seems to make a difference to the CI, that suggests that it's currently doing the latter strategy? Should that maybe be changed to the former to avoid these kinds of issues?
If this isn't a known problem yet maybe it would be worth adding it to the JIRA?
Thanks, Julius