Hi,
I will be collecting for new review comments till this Friday.
If there were no new comments, I will get the patches merged after addressing the existing comments.
Best Regards,
Kevin
-----Original Message-----
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> On Behalf Of Kevin Peng (Arm Technology China) via TF-M
Sent: Wednesday, December 11, 2019 11:02 AM
To: 'tf-m(a)lists.trustedfirmware.org' <tf-m(a)lists.trustedfirmware.org>
Cc: nd <nd(a)arm.com>
Subject: [TF-M] Support for optional build for secure partitions and test suites
Hi,
I've made some patches to support optional build for secure partitions and test suites:
https://review.trustedfirmware.org/q/topic:%22optional_build_sp_and_tests%2…
With this patch set, you can optionally build secure partitions by setting the TFM_PARTITION_XXX in CommonConfig.cmake around line 152 - 162.
And for test suites, by setting ENABLE_XXX_TESTS in test/TestConfig.cmake.
I'm collecting for review comments. Thanks.
Best Regards,
Kevin
--
TF-M mailing list
TF-M(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-m
Hi Matt,
Thanks for your email.
Let me try to answer your question because of the designer of the TF-M interrupt handling is on leave now.
First, you are right about this: "If the interrupted partition is the same as the handler partition, interrupted_ctx_stack_frame_t and handler_ctx_stack_frame_t should be pushed at different location."
Let's start from an example, there are 3 SPs with different interrupt priority, SP1 < SP2 < SP3. And there is one scenario like this:
* SP1 is running, SP2 interrupt SP1. Current, the interrupted partition is SP1, and the handler partition is SP2.
As the below code show in tfm_spm_partition_push_handler_ctx(), the stack pointer will be moved upper as below:
SP2 stack pointer --->
Caller_partition_indx(handler)
Partition_sate(handler)
* SP2 starts to run, and it is interrupted by SP3. Current, the interrupted partition is SP2, and the handler partition is SP3.
As the code shown in tfm_spm_partition_push_interrupted_ctx(), it uses the current stack member directly without moving its pointer.
SP2 stack pointer --->
Partition_state(interrupted)
Caller_partition_indx(handler)
Partition_sate(handler)
And after SP2 is interrupted by SP3, its partition status will be changed to "SPM_PARTITION_STATE_SUSPENDED". So that this partition will not be interrupted by others anymore unless it comes back to running state. Which means it does not need more stack anymore.
We can consider that after one SP is interrupted by others, there is no more stack is needed for it before it comes back to running status.
So I think it should be ok in the current design for the interrupt in the library model.
I am not sure if this can solve your confuse. Please feel free to give feedback. We can discuss more about it.
Thanks,
Edison
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> On Behalf Of lg via TF-M
Sent: Monday, December 16, 2019 1:08 PM
To: tf-m(a)lists.trustedfirmware.org
Subject: [TF-M] irq handling in library mode
Hi TFM experts,
I have a question about the code logic of irq handling in library mode, code blocks in spm_api_func.c are as follows:
void tfm_spm_partition_push_interrupted_ctx(uint32_t partition_idx)
{
struct spm_partition_runtime_data_t *runtime_data =
&g_spm_partition_db.partitions[partition_idx].runtime_data;
struct interrupted_ctx_stack_frame_t *stack_frame =
(struct interrupted_ctx_stack_frame_t *)runtime_data->ctx_stack_ptr;
stack_frame->partition_state = runtime_data->partition_state;
}
void tfm_spm_partition_push_handler_ctx(uint32_t partition_idx)
{
struct spm_partition_runtime_data_t *runtime_data =
&g_spm_partition_db.partitions[partition_idx].runtime_data;
struct handler_ctx_stack_frame_t *stack_frame =
(struct handler_ctx_stack_frame_t *)
runtime_data->ctx_stack_ptr;
stack_frame->partition_state = runtime_data->partition_state;
stack_frame->caller_partition_idx = runtime_data->caller_partition_idx;
runtime_data->ctx_stack_ptr +=
sizeof(struct handler_ctx_stack_frame_t) / sizeof(uint32_t);
}
My question is why there is not the following such code logic at the end of function tfm_spm_partition_push_interrupted_ctx.
runtime_data->ctx_stack_ptr +=
sizeof(struct interrupted_ctx_stack_frame_t ) / sizeof(uint32_t);
If the interrupted partition is the same as the handler partition, interrupted_ctx_stack_frame_t and handler_ctx_stack_frame_t should be pushed at different location.
And when pop the stack frame after handling irq, there is the following code logic in tfm_spm_partition_pop_handler_ctx
runtime_data->ctx_stack_ptr -=
sizeof(struct handler_ctx_stack_frame_t) / sizeof(uint32_t);
I think the same logic of changing ctx_stack_ptr should be added the begining of the function tfm_spm_partition_pop_interrupted_ctx like the above code logic in tfm_spm_partition_pop_handler_ctx.
runtime_data->ctx_stack_ptr -=
sizeof(struct interrupted_ctx_stack_frame_t ) / sizeof(uint32_t);
Please help to check.
Thanks,
Matt
Hi Thomas,
What I can see from your description is that the problem should is caused the MPU configure.
You are working on the isolation level 2 model by using ConfigRegressionIPCTfmLevel2 configure file.
In isolation level 2, PSA FF says this: "the PSA Root of Trust is also protected from access by the Application Root of Trust". You can see more detail about the isolation information from PSA FF with this link: https://pages.arm.com/psa-resources-ff.html?_ga=2.156169596.61580709.154261….
So, we need to configure MPU(MSUCA_A board which you are using) for APP RoT to limit the source the APP RoT can access in isolation level 2.
You can see from code:
__attribute__((naked, section("SFN")))
psa_signal_t psa_wait(psa_signal_t signal_mask, uint32_t timeout)
{
__ASM volatile("SVC %0 \n"
"BX LR \n"
: : "I" (TFM_SVC_PSA_WAIT)); }
The psa_wait() is put in the "SFN" region, and the "SFN" region in put in the "TFM_UNPRIV_CODE" section. You can see the information about the "TFM_UNPRIV_CODE" in platform/ext/common/armclang/tfm_common_s.sct.
You can refer tfm_spm_mpu_init() in platfrom/ext/target/musca_a/spm_hal.c. There are some MPU region we need to configure for APP RoT in isolation level 2, include the "TFM_UNPRIV_CODE".
If you are interested, you can see some detail about the design of isolation level 2 from here: https://developer.trustedfirmware.org/w/tf_m/design/trusted_firmware-m_isol…
Please check this first. If cannot work, please feel free to tell us.
Thanks,
Edison
-----Original Message-----
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> On Behalf Of Thomas Törnblom via TF-M
Sent: Friday, December 13, 2019 4:25 AM
To: tf-m(a)lists.trustedfirmware.org
Subject: [TF-M] MPU issues, was Re: Regression test issues with IAR port
Next issue.
For some reason the secure image runs into a MemManage exception fairly early in the irq test of the ConfigRegressionIPCTfmLevel2 config and I have not yet been able to figure out why.
It happens in the psa_wait() call in:
---
int32_t tfm_irq_test_1_init(void)
{
tfm_enable_irq(SPM_CORE_IRQ_TEST_1_SIGNAL_TIMER_0_IRQ);
#ifdef TFM_PSA_API
psa_signal_t signals = 0;
while (1) {
signals = psa_wait(PSA_WAIT_ANY, PSA_BLOCK);
---
The exact point of the exception is the SVC call in:
---
__attribute__((naked, section("SFN")))
psa_signal_t psa_wait(psa_signal_t signal_mask, uint32_t timeout)
{
__ASM volatile("SVC %0 \n"
"BX LR \n"
: : "I" (TFM_SVC_PSA_WAIT)); }
---
The cause is IACCVIOL, "The processor attempted an instruction fetch from a location that does not permit execution."
The stack frame indicates that it happened on the SVC instruction, but I as far as I can see none of the MPU regions maps the address so I assumed it should be allowed as it should be handled by the background map, which should allow secure access.
If I don't enable the mpu (just skipping the enable call) then all tests run without problems.
I have tried to compare it with an image built with ARMCLANG, and I don't see anything obviously different. The regions are roughly the same, all regions with fixed addresses are the same, the enable bits are the same and the SVC handler is not mapped to any MPU region there either. I wish there were an MPU status register that would tell exactly what region was causing the exception.
The odd thing is that there is an SVC call in the tfm_enable_irq() call prior to the psa_wait() call, and that works.
This is on a Musca A by the way.
Ideas?
--
*Thomas Törnblom*, /Product Engineer/
IAR Systems AB
Box 23051, Strandbodgatan 1
SE-750 23 Uppsala, SWEDEN
Mobile: +46 76 180 17 80 Fax: +46 18 16 78 01
E-mail: thomas.tornblom(a)iar.com <mailto:thomas.tornblom@iar.com>
Website: www.iar.com <http://www.iar.com>
Twitter: www.twitter.com/iarsystems <http://www.twitter.com/iarsystems>
--
TF-M mailing list
TF-M(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-m
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi TFM experts,
I have a question about the code logic of irq handling in library mode, code blocks in spm_api_func.c are as follows:
void tfm_spm_partition_push_interrupted_ctx(uint32_t partition_idx)
{
struct spm_partition_runtime_data_t *runtime_data =
&g_spm_partition_db.partitions[partition_idx].runtime_data;
struct interrupted_ctx_stack_frame_t *stack_frame =
(struct interrupted_ctx_stack_frame_t *)runtime_data->ctx_stack_ptr;
stack_frame->partition_state = runtime_data->partition_state;
}
void tfm_spm_partition_push_handler_ctx(uint32_t partition_idx)
{
struct spm_partition_runtime_data_t *runtime_data =
&g_spm_partition_db.partitions[partition_idx].runtime_data;
struct handler_ctx_stack_frame_t *stack_frame =
(struct handler_ctx_stack_frame_t *)
runtime_data->ctx_stack_ptr;
stack_frame->partition_state = runtime_data->partition_state;
stack_frame->caller_partition_idx = runtime_data->caller_partition_idx;
runtime_data->ctx_stack_ptr +=
sizeof(struct handler_ctx_stack_frame_t) / sizeof(uint32_t);
}
My question is why there is not the following such code logic at the end of function tfm_spm_partition_push_interrupted_ctx.
runtime_data->ctx_stack_ptr +=
sizeof(struct interrupted_ctx_stack_frame_t ) / sizeof(uint32_t);
If the interrupted partition is the same as the handler partition, interrupted_ctx_stack_frame_t and handler_ctx_stack_frame_t should be pushed at different location.
And when pop the stack frame after handling irq, there is the following code logic in tfm_spm_partition_pop_handler_ctx
runtime_data->ctx_stack_ptr -=
sizeof(struct handler_ctx_stack_frame_t) / sizeof(uint32_t);
I think the same logic of changing ctx_stack_ptr should be added the begining of the function tfm_spm_partition_pop_interrupted_ctx like the above code logic in tfm_spm_partition_pop_handler_ctx.
runtime_data->ctx_stack_ptr -=
sizeof(struct interrupted_ctx_stack_frame_t ) / sizeof(uint32_t);
Please help to check.
Thanks,
Matt
Hi all,
After several rounds of review, I'd like to merge Cypress PSoC 64 support on master branch this Tuesday.
If you have more comments or opinions, please share them before Tuesday.
Thank you.
Best regards,
Hu Ziji
-----Original Message-----
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> On Behalf Of Christopher Brand via TF-M
Sent: Saturday, December 7, 2019 4:13 AM
To: tf-m(a)lists.trustedfirmware.org
Subject: [TF-M] Adding Cypress PSoC64 platform support
Hi,
I recently pushed patches to add support for a platform based on Cypress' PSoC64 SoC to gerrit.
Given that this is the first non-Arm platform to be posted, it seems worth drawing attention to.
Comments very much appreciated.
I do anticipate a few small updates to the patchset, even in the absence of comments. In particular, there are some documentation improvements to come.
There are four patches in total, ending with https://review.trustedfirmware.org/c/trusted-firmware-m/+/2728https://review.trustedfirmware.org/c/trusted-firmware-m/+/2725/1 adds files to the platform/ext/cmsis directory, and so will affect/be affected by https://review.trustedfirmware.org/c/trusted-firmware-m/+/2578
Thanks,
Chris
This message and any attachments may contain confidential information from Cypress or its subsidiaries. If it has been received in error, please advise the sender and immediately delete this message.
--
TF-M mailing list
TF-M(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-m
Hi all,
I'd like to merge the patch set https://review.trustedfirmware.org/q/topic:%22template_plat_files%22+(statu… soon if no further comments.
Please share your comments before this Tuesday.
Best regards,
Hu Ziji
-----Original Message-----
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> On Behalf Of David Hu (Arm Technology China) via TF-M
Sent: Thursday, December 12, 2019 11:57 AM
To: tf-m(a)lists.trustedfirmware.org
Cc: nd <nd(a)arm.com>
Subject: [TF-M] Please review the patch set to extract duplicated template files from platforms
Hi all,
I submit a patch set to extract the duplicated identical template files dummy_xxx.c from targets and put them under platform/ext/common/template folder.
The purpose is to collect a common template of booting/attestation example for platforms and each platform doesn't need to keep a copy under its folder anymore.
Since it is a general change related to all platforms using template files, I'd like to ask for review here. Any comments would be appreciated.
Please check the patch details in https://review.trustedfirmware.org/q/topic:%22template_plat_files%22+(statu…
The background is described in https://developer.trustedfirmware.org/T539.
Best regards,
Hu Ziji
--
TF-M mailing list
TF-M(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-m
The current RTOS integration with TZ API support is to make it generic. You can use "empty" implementation for these API if you don't use multiple secure context (SFC or IPC model) and have no multiple NS client IDs requirements.
Besides that, the users can leverage TZ API for some other purposes, e.g. policy control for which NS task can access which secure partitions and etc. But that's quite use case specific. Just FYI.
Regards,
David Wang
ARM Electronic Technology (Shanghai) Co., Ltd
Phone: +86-21-6154 9142 (ext. 59142)
________________________________
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> on behalf of Reinhard Keil via TF-M <tf-m(a)lists.trustedfirmware.org>
Sent: 13 December 2019 19:41
To: tf-m(a)lists.trustedfirmware.org <tf-m(a)lists.trustedfirmware.org>
Subject: Re: [TF-M] Simplify RTOS / TF-M interface (single thread execution)
Ken,
thanks for all your swift answers.
Sorry, I need to check on this part of the answer again:
* What happens worst case when an RTOS does not implement TZ RTOS Context Management?
Ken.L: If there is no locking protection in NS and multiple ns calling would panic.
TZ RTOS Context Management does not prevent from that. Correct.
So the only feature that is enabled with TZ RTOS Context Management is 'client ID identification' for Protected Storage (and potentially other services).
Reinhard
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
--
TF-M mailing list
TF-M(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-m
Hi,
Hi,
I agree, the CI shall not dictate how we use the version control system. It shall adapt.
Regarding your suggestions, I think the main problem is we are mixing stuff, this time quality with version control. Before we make decisions we shall understand where we are.
The current quality policy is that we only make releases for communication purposes. To give a clean interface for tf-m users and to allow planning their work. Releases allow them to execute their tf-m integration process less frequently. Only for each release or specific releases and not for each commit. The current quality policy identifies a single quality level only, and says any patch we publish is "golden quality", it matches the highest quality standard we can achieve (with sane constraints). Also to make our life easy we decided to use the master branch to hold these patches.
At the same time we use the master branch for development. Any change we make is made against master. This means each pull request and thus each review targets master. For review purposes the best is to have a chain of small modifications, otherwise the review content becomes too large to follow.
The TF-A "branching strategy" tries to address this issue by introducing an integration branch used for development. This allows master to be more release specific.
I suggest to take the following approach (details to be discussed):
- introduce more quality levels i.e.:
- none: content of a topic branch, or content pushed to review.
- bronze: content passed code review and patch specific testing.
- silver: content passed a more though daily testing.
- gold: a release. A pack of source-code, feature state document (release notes), reviewed documentation (user manual, reference manual), test evidence, documentation of test efforts to allow repeatability. The version control system can be used to store content, and to provide identification info (i.e. tagging), but most likely the release will need other kind of storage to be used (i.e. documentation).
- platina: reaching extra quality level trough passing PSA or some FUSA qualification. Or we may simply use extra release for this.
Naming the quality levels allows us to have a cleaner definition of what can be expected at a specific level (set of quality measures, i.e. static analysis, code review, test configuration). It would also allow us cleaner communication and to find how we use the version system for quality purposes.
I also expect this discussion to help defining how the version system is used for development purposes.
The current state works ok, but is a sort of naturally grown. We might have reached the point when more pragmatic approach may be needed.
/George
-----Original Message-----
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> On Behalf Of Minos Galanakis via TF-M
Sent: 13 December 2019 12:23
To: Edison Ai (Arm Technology China) via TF-M <tf-m(a)lists.trustedfirmware.org>; Soby Mathew <Soby.Mathew(a)arm.com>
Cc: nd <nd(a)arm.com>
Subject: Re: [TF-M] Create another branch for feature development
Hi all.
My personal comments on this.
I would like to point out that the CI is a tool, not the core project. I do not believe we should be changing our development strategy based on what the tool is doing. We should instead adjust the tool to fit our requirements.
* No patches should be/ are merged to master when CI fails. If master breaks it should most commonly be because of something we are not testing for. Using an integration branch would not change that.
* As a developer I find it more convoluted to work with projects who use different integration strategies. The most common assumption in open source projects is that you have a master branch which is the bleeding edge, but can contain untested bugs, and the release immutable git tags for versioning. Using branch merges as versioning is a design for the pull request model which is not quite compatible with Gerrit.
* Most of the CI downtime has nothing to do with the merge strategy, they are more of a chicken and the egg philosophical problem. If your patch or branch introduced a change which affects the tests outputs, how will you test it if the CI expects the old output? An integration branch would not solve the merge freeze periods, would just affect a different branch from master.
* I believe feature branches are quite useful, since changes to master do not disrupt the development flow of a big change, and even though they will require some maintenance to re-sync before the final patch , it will be handled by an engineer who knows the feature, and full understands the regression vectors.
If I were to suggest some changes for stability purposes, I would start smaller:
* Update documentation to instruct users to check out from release tags, warning then that master is constantly changing.
* Adjust the CI to detect an Allow-CI flag from every branch. That way developers can test any patch from any feature branch. The logic for that is already present in the code, but requires Gerrit to be configured accordingly.
* Add an undo process. This would be the only case for an integration branch. All patches are merged to a temporary branch, after confirming they have passed testing individually. On the once per day nightly test, the group of different patches, will be tested against an extensive job, in models and hardware, and only if successful it will fast forward master to that state.
Regards
Minos
________________________________
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> on behalf of Edison Ai (Arm Technology China) via TF-M <tf-m(a)lists.trustedfirmware.org>
Sent: 13 December 2019 08:55
To: Soby Mathew <Soby.Mathew(a)arm.com>; 'tf-m(a)lists.trustedfirmware.org' <tf-m(a)lists.trustedfirmware.org>
Cc: nd <nd(a)arm.com>
Subject: Re: [TF-M] Create another branch for feature development
Hi Soby,
Thanks for your detail description.
> Integration is a temporary merge branch to merge several patches and run the CI against. Usually once CI passes, the master will be fast forwarded to integration within a day.
> This helps us to test integration of patches and detect any failure before master is updated. This means the master will pass CI at any given merge point.
I think it's a good method like this so that we can double confirm the "master" branch is stable.
And this also can fix one case even the CI can work normally: one patch is ready to merge, and it is not based on the latest HEAD, but there is no conflict. We can merge the patch directly and let gerrit do rebase by itself. But we cannot confirm the CI test can pass.
Any comment for this from others?
For multiple feature branches, I think we can stop to discuss about it now until we have some strong demands for it. It is indeed a big change for us now.
Thanks,
Edison
-----Original Message-----
From: Soby Mathew <Soby.Mathew(a)arm.com>
Sent: Friday, December 13, 2019 5:14 AM
To: Edison Ai (Arm Technology China) <Edison.Ai(a)arm.com>; 'tf-m(a)lists.trustedfirmware.org' <tf-m(a)lists.trustedfirmware.org>
Cc: nd <nd(a)arm.com>
Subject: Re: [TF-M] Create another branch for feature development
On 11/12/2019 09:05, Edison Ai (Arm Technology China) via TF-M wrote:
> Hi Gyorgy,
>
> Thanks to point it out. I agree with you that it will be better if we can align these two projects in this. I had a quick check the branches from TF-A: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/.
> There are three branches in TF-A:
> - "integration" branch, should be used for new features.
> - "master" branch, which is behind of "integration" branch. But I am nor sure what is the strategy to update it.
Hi Edison,
Integration is a temporary merge branch to merge several patches and run the CI against. Usually once CI passes, the master will be fast forwarded to integration within a day.
This helps us to test integration of patches and detect any failure before master is updated. This means the master will pass CI at any given merge point.
> - "topics/epic_beta0_spmd", I thinks it should like a feature branch for big feature.
> @Soby Mathew Could you help to share more information about it? Thanks very much.
We usually do not have feature branches in TF-A. The topics/epic_beta0_spmd is a prototyping branch where we wanted to share code with collaborators outside TF-A. The patches on this branch are not visible in gerrit review and no patches in gerrit review will be merged to this branch. Once the prototyping is complete, then patches on this branch will be reworked and pushed to gerrit review and finally get merged to integration and this branch will be deleted.
Our experience have been, long running development branches are generally a maintenance overhead. Merging these development branches before a release may also be risky as some of the changes may have unknown interactions and may become difficult to resolve.
The "topic" in gerrit review is effectively a branch. For example, this
review:
https://review.trustedfirmware.org/#/q/topic:od/debugfs+(status:open+OR+sta…
is a set of patches on topic "od/debugfs" and can be treated as development branch. This branch is alive as long as patches are not merged.
We need to understand the motivations for the change. Broken CI is an argument but development branches will only exacerbate that problem since we don't know the stability of each of those branches. Also merge conflict will not reduce due to development branches. Its just delaying the merge conflict to a later point.
There may be other reasons, but generally it is better to merge sensible patches (+2ed) within a feature even before the feature is complete as it will reduce merge conflicts (we have to ensure testing/build coverage for the patch). These are my thoughts on this.
Best Regards
Soby Mathew
--
TF-M mailing list
TF-M(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-m
--
TF-M mailing list
TF-M(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-m
Ken,
thanks for all your swift answers.
Sorry, I need to check on this part of the answer again:
* What happens worst case when an RTOS does not implement TZ RTOS Context Management?
Ken.L: If there is no locking protection in NS and multiple ns calling would panic.
TZ RTOS Context Management does not prevent from that. Correct.
So the only feature that is enabled with TZ RTOS Context Management is 'client ID identification' for Protected Storage (and potentially other services).
Reinhard
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi all.
My personal comments on this.
I would like to point out that the CI is a tool, not the core project. I do not believe we should be changing our development strategy based on what the tool is doing. We should instead adjust the tool to fit our requirements.
* No patches should be/ are merged to master when CI fails. If master breaks it should most commonly be because of something we are not testing for. Using an integration branch would not change that.
* As a developer I find it more convoluted to work with projects who use different integration strategies. The most common assumption in open source projects is that you have a master branch which is the bleeding edge, but can contain untested bugs, and the release immutable git tags for versioning. Using branch merges as versioning is a design for the pull request model which is not quite compatible with Gerrit.
* Most of the CI downtime has nothing to do with the merge strategy, they are more of a chicken and the egg philosophical problem. If your patch or branch introduced a change which affects the tests outputs, how will you test it if the CI expects the old output? An integration branch would not solve the merge freeze periods, would just affect a different branch from master.
* I believe feature branches are quite useful, since changes to master do not disrupt the development flow of a big change, and even though they will require some maintenance to re-sync before the final patch , it will be handled by an engineer who knows the feature, and full understands the regression vectors.
If I were to suggest some changes for stability purposes, I would start smaller:
* Update documentation to instruct users to check out from release tags, warning then that master is constantly changing.
* Adjust the CI to detect an Allow-CI flag from every branch. That way developers can test any patch from any feature branch. The logic for that is already present in the code, but requires Gerrit to be configured accordingly.
* Add an undo process. This would be the only case for an integration branch. All patches are merged to a temporary branch, after confirming they have passed testing individually. On the once per day nightly test, the group of different patches, will be tested against an extensive job, in models and hardware, and only if successful it will fast forward master to that state.
Regards
Minos
________________________________
From: TF-M <tf-m-bounces(a)lists.trustedfirmware.org> on behalf of Edison Ai (Arm Technology China) via TF-M <tf-m(a)lists.trustedfirmware.org>
Sent: 13 December 2019 08:55
To: Soby Mathew <Soby.Mathew(a)arm.com>; 'tf-m(a)lists.trustedfirmware.org' <tf-m(a)lists.trustedfirmware.org>
Cc: nd <nd(a)arm.com>
Subject: Re: [TF-M] Create another branch for feature development
Hi Soby,
Thanks for your detail description.
> Integration is a temporary merge branch to merge several patches and run the CI against. Usually once CI passes, the master will be fast forwarded to integration within a day.
> This helps us to test integration of patches and detect any failure before master is updated. This means the master will pass CI at any given merge point.
I think it's a good method like this so that we can double confirm the "master" branch is stable.
And this also can fix one case even the CI can work normally: one patch is ready to merge, and it is not based on the latest HEAD, but there is no conflict. We can merge the patch directly and let gerrit do rebase by itself. But we cannot confirm the CI test can pass.
Any comment for this from others?
For multiple feature branches, I think we can stop to discuss about it now until we have some strong demands for it. It is indeed a big change for us now.
Thanks,
Edison
-----Original Message-----
From: Soby Mathew <Soby.Mathew(a)arm.com>
Sent: Friday, December 13, 2019 5:14 AM
To: Edison Ai (Arm Technology China) <Edison.Ai(a)arm.com>; 'tf-m(a)lists.trustedfirmware.org' <tf-m(a)lists.trustedfirmware.org>
Cc: nd <nd(a)arm.com>
Subject: Re: [TF-M] Create another branch for feature development
On 11/12/2019 09:05, Edison Ai (Arm Technology China) via TF-M wrote:
> Hi Gyorgy,
>
> Thanks to point it out. I agree with you that it will be better if we can align these two projects in this. I had a quick check the branches from TF-A: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/.
> There are three branches in TF-A:
> - "integration" branch, should be used for new features.
> - "master" branch, which is behind of "integration" branch. But I am nor sure what is the strategy to update it.
Hi Edison,
Integration is a temporary merge branch to merge several patches and run the CI against. Usually once CI passes, the master will be fast forwarded to integration within a day.
This helps us to test integration of patches and detect any failure before master is updated. This means the master will pass CI at any given merge point.
> - "topics/epic_beta0_spmd", I thinks it should like a feature branch for big feature.
> @Soby Mathew Could you help to share more information about it? Thanks very much.
We usually do not have feature branches in TF-A. The topics/epic_beta0_spmd is a prototyping branch where we wanted to share code with collaborators outside TF-A. The patches on this branch are not visible in gerrit review and no patches in gerrit review will be merged to this branch. Once the prototyping is complete, then patches on this branch will be reworked and pushed to gerrit review and finally get merged to integration and this branch will be deleted.
Our experience have been, long running development branches are generally a maintenance overhead. Merging these development branches before a release may also be risky as some of the changes may have unknown interactions and may become difficult to resolve.
The "topic" in gerrit review is effectively a branch. For example, this
review:
https://review.trustedfirmware.org/#/q/topic:od/debugfs+(status:open+OR+sta…
is a set of patches on topic "od/debugfs" and can be treated as development branch. This branch is alive as long as patches are not merged.
We need to understand the motivations for the change. Broken CI is an argument but development branches will only exacerbate that problem since we don't know the stability of each of those branches. Also merge conflict will not reduce due to development branches. Its just delaying the merge conflict to a later point.
There may be other reasons, but generally it is better to merge sensible patches (+2ed) within a feature even before the feature is complete as it will reduce merge conflicts (we have to ensure testing/build coverage for the patch). These are my thoughts on this.
Best Regards
Soby Mathew
--
TF-M mailing list
TF-M(a)lists.trustedfirmware.org
https://lists.trustedfirmware.org/mailman/listinfo/tf-m