*Attendees*: Don, Glen, Anton, Riku, Joanna, Matteo, Xinyu, Dave Rodgman,
Shebu
*Minutes*:
- Glen: Shared TFC Board
- Boards:
- Chromebooks - being returned to get them out of customs, then will be
shipped again.
- Cypress/NXP being integrated in October. Received h/w in lab
- TFC-82: Arthur working on this as time allows
- TFC-36 is ready for the Arm team to review (Leonardo)
- TFC-176: Intermittent failures. Leonardo continuing to isolate this
- TFC-20 integrated including a resolved regression. Any issues noticed?
- Arm provided AMI files. Go back to mbedTLS?
- Matteo: Will we finish TFC-36? Need to make sure this finishes.
- Glen: Agreed
- Joanna: Would like TFC-176 to continue at least part
time.
- Glen recommends re-evaluating after his work day today
- Joanna: Agree to keep it as a background task
- TFC-172: In backlog; the ticket has no data quantifying the slowdown
or how to reproduce it.
- Xinyu: No longer reproducing. Xinyu to update ticket and we will
close
- TFC-171: Seems to be solved from other infrastructure improvements
- Joanna: OK to close resolved. Not sure what it was but not
occurring now
- Arthur may have bandwidth for another task
- Joanna: TFC-87 may be a good one to work on. Currently the team
performs a work-around.
- Moved TFC-87 to SC Approved from Backlog
- *Action: Anton* to evaluate moving TFC-173 to SC Approved.
Don
Attendees: Riku, Don, Glen, Matteo, Shebu, Anton, Joanna, Dave Rodgman
Minutes:
- TFC-20: Git performance - infrastructure changes have landed, including
Leonardo's infrastructure changes. Tested on staging. Should keep an eye on it
for next few days. New machine already added back in. Better performance
will be seen as well since more jobs can focus on builds and not clones.
- Expect scripts: Should wrap up this week
- Joanna: Brought up LAVA timeouts TFC-176. Pass on 2nd or 3rd
attempts. Initial analysis is too many parallel LAVA jobs. Starting out by
increasing timeout. Would prevent re-running jobs.
- Riku: Should add LAVA lab folks to this ticket since adjusting
timeouts
- Boards status:
- Chromebooks still dealing w/ import issues.
- *ACTION: Don* ask Julius to reject and resend the 3 boards to the
Cambridge lab
- Cypress and NXP platforms now in the lab
- LAVA team will be updating to latest release with the new board
configs included. Want this done in the next couple of weeks
- Arthur will be available for some other tasks.
- MBedTLS
- Dave: AMI images almost ready. Expect it soon.
- Glen: Linaro support prepared to copy to AMIs when they are queued
up.
- Glen: Joanna's new list of issues
- LAVA timeout was one.
- TFC-87: Joanna's team reviewing that one. CI reporting ticket. Need
some guidance/access from Leonardo
- Glen will let him know
Thanks
Don
*Attendees*: Joanna, Xinyu, Matteo, Janos, Glen, Riku, Shebu, Anton, Ben,
Don
*Actions*:
- Glen: Follow up on notes
- Glen: Set up sync meeting to hear Riku/Leonardo/Anton/Joanna on
proposing a solution on the git clone performance issue.
*Minutes*:
- Glen: TFC Kanban board review
- Glen: Chromebooks stuck in customs - working paperwork now
- Glen: Cypress & NXP platforms both underway
- Glen: Performance issues update: (TFC-171, 172, 164)
- Ben: Limited the number of CI jobs to help relieve a performance issue.
- Riku: Impact - slower builds.
- Anton: were we testing on staging?
- Ben: No
- Should we allocate resources to work on performance?
- Joanna: Would work on server scaling versus Expect scripts
- Riku: TF-M build, launches over 100 builds, then git clones turn
into 400 simultaneous git clones - need to re-factor to do clone
up front.
- Riku/Leonardo - 1-2 week estimate
- Anton has some ideas - sync w/ him on potential solution. Once
agreed, begin the work.
- Glen: Meeting set up for tomorrow to discuss code coverage state and
how Arm might be able to help.
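
A minimal sketch of the "clone up front" refactor Riku describes above, not
the actual CI change. It assumes each build job can be described by a
repository and a revision; the job fields, repository names, and the figure
of four repositories per build are placeholders chosen only so the numbers
line up with the 100 builds / 400 clones mentioned in the minutes.

```python
from collections import defaultdict


def plan_upfront_clones(jobs):
    """Group build jobs by (repo, rev) so each pair is fetched only once.

    `jobs` is a list of dicts with hypothetical keys "name", "repo", "rev".
    Returns {(repo, rev): [names of jobs that can share that checkout]}.
    """
    plan = defaultdict(list)
    for job in jobs:
        plan[(job["repo"], job["rev"])].append(job["name"])
    return dict(plan)


# Illustrative numbers only: 100 builds that each need the same 4 repos
# would issue 400 clones if every build cloned for itself.
REPOS = ["repo-a", "repo-b", "repo-c", "repo-d"]  # placeholder names
jobs = [
    {"name": f"build-{i}", "repo": repo, "rev": "main"}
    for i in range(100)
    for repo in REPOS
]
plan = plan_upfront_clones(jobs)
print(f"{len(jobs)} per-build clones -> {len(plan)} up-front clones")
```

The parent job would perform one clone per key in `plan` and hand the
resulting checkout (or an archive of it) to the listed builds, so the git
server sees a handful of clones instead of one per build.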
Hi Sherry,
I'm adding the triage mailing list to the thread. As a best practice, let's
cc that list on items like this, as it includes the stakeholders who
prioritize OpenCI tasks on (at minimum) a weekly basis, so it's helpful info
in that decision making.
I see you're already subscribed to the list which is great! :) Reviewing
the Aug 31st sync minutes, Expect scripts were determined to be the
priority. Code Coverage next steps are also discussed. Looks like Glen
was going to set up a sync meeting to further discuss this one... The
minutes could have called out this action more clearly:
- Code Coverage:
- A sync w/ Leonardo and Joanna to discuss next steps/Current status
on CC shall be planned. Glen to set up
> Is it suspended for pure priority reason, or any technical reason?
So with the above said, this is a prioritization decision made by the
triage stakeholders, not technical.
Hope this helps, please let me know if any questions or suggestions on
improving the process. :)
Best,
don
On Thu, 2 Sept 2021 at 06:35, Leonardo Sandoval <
leonardo.sandoval(a)linaro.org> wrote:
> Hi Sherry,
>
> In summary, for priority reasons.
>
> Right now I am working on some pending tickets for TF-A (expect scripts
> migration, TFC-36 <https://linaro.atlassian.net/browse/TFC-36>). Once I
> complete TFC-36, and if MbedTLS work is still on hold, I will move to TFC-7
> <https://linaro.atlassian.net/browse/TFC-7> immediately.
>
> Regards,
> lsg
>
>
>
> On Thu, 2 Sept 2021 at 02:15, Sherry Wu <Sherry.Wu(a)arm.com> wrote:
>
>> Hi Leonardo and Don,
>>
>>
>>
>> Just noticed that https://linaro.atlassian.net/browse/TFC-7 changed to
>> “TODO”. Wondering what’s latest update for the code coverage tool
>> integration on Open CI.
>>
>>
>> Thanks,
>>
>> Sherry
>>
>>
>> IMPORTANT NOTICE: The contents of this email and any attachments are
>> confidential and may also be privileged. If you are not the intended
>> recipient, please notify the sender immediately and do not disclose the
>> contents to any other person, use it for any purpose, or store or copy the
>> information in any medium. Thank you.
>>
>
Hi Xinyu,
Thanks for the escalation.
I see Ben is in the loop, so that's the correct first step. I've also cc'd
the triage mailing list to make sure all stakeholders are looped in.
In general, as a best practice, I would suggest having as much quantifiable
data as possible upfront in tickets like this to help better understand the
magnitude as well as how to reproduce. It looks like the tickets have
already started to capture this, but I also see Ben in the ticket
requesting more. Datapoints of interest in my mind:
- Clear details on how to reproduce: The task(s) where noticeable
degradation is seen - is this in parallel to when large builds have kicked
off? etc.
- Tasks invoked and level of degradation: for example, "Gerrit reviews -
previously took xyz seconds/minutes, now taking 20% more time (or 2x, 10x?,
failing & never completing?)," frequency, etc. The more details the
better! This will help determine the priority we place on resolving. :)
Perhaps coming up with a general template for this could be helpful.
- Are there other degradations beyond Gerrit?
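
As one possible shape for the template suggested above, here is a minimal
sketch that turns before/after timings into the "20% more time (or 2x,
10x?)" style of figure. The function name, field names, task label, and
sample timings are all placeholders, not measurements from
trustedfirmware.org.

```python
from statistics import median


def degradation_report(task, baseline_s, current_s):
    """Summarize slowdown from two lists of timings, in seconds.

    Medians are used so a single outlier does not dominate the result.
    """
    base = median(baseline_s)
    now = median(current_s)
    return {
        "task": task,
        "baseline_s": base,
        "current_s": now,
        "slowdown_x": round(now / base, 2),
        "change_pct": round((now - base) / base * 100, 1),
    }


# Placeholder values purely to show the call; replace with real timings.
report = degradation_report(
    "example task", baseline_s=[10.0, 11.0, 9.0], current_s=[12.0, 13.0, 11.0]
)
print(report)
```

Pasting a report like this into the ticket, along with when the samples were
taken and what else was running, would cover most of the datapoints listed
above.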
Ben is most certainly much more qualified than me in knowing what support
is needed to isolate/resolve, and, as noted in the ticket, he is asking for
more details as well. Let's get the details captured in the ticket(s) in
preparation for next Tuesday's Triage meeting where we can prioritize the
resolution over other TFC tasks. :)
Ben, feel free to chime in and correct any of my assumptions/suggestions.
Regards,
Don
On Thu, 2 Sept 2021 at 01:41, Xinyu Zhang <Xinyu.Zhang(a)arm.com> wrote:
> Hi Don,
>
>
>
> We found that trustedfirmware.org is getting slow. Daily work of some
> developers would be influenced.
>
> Could you please help to take a look on this issue? Here is the TFC link:
> https://linaro.atlassian.net/browse/TFC-172
>
>
>
> BR,
>
> Xinyu
>
*Attendees*: Riku, Anton, Glen, Shebu, Ben C, Joanna, Janos, Matteo, Don,
Xinyu
*Minutes*:
- Glen: Begin w/ TFC Kanban Board
- Glen: H/W Status
- Working thru getting Chromebooks imported. In progress
- Cypress boards flashing and booting, starting on jobs work. In a
week ready for boards to be shipped to Cambridge.
- NXP board is in the queue.
- MBedTLS:
- Glen: Have advanced as far as we can. Need AMI files from Dean A.
Blocked awaiting AMI files.
- Janos: Progressed since last week, but working on resolving
internal issues. No estimate yet from Dean.
- Glen: Plan to move to expect script efforts with team approval
- Matteo: Agree Expect Scripts are the next item to work. Would like
completed, so good direction to go.
- Code Coverage:
- A sync w/ Leonardo and Joanna to discuss next steps/Current status
on CC shall be planned. Glen to set up
- Glen: Performance/Fixes needed?
- Riku: Nightly jobs failing due to too many parallel checkouts of
same repo/same version. TFC-20. Need to decide how to implement
- For TF-M, looks currently like 1 week work to modify CI to reduce
parallel checkouts, TF-A still looking to get an estimate.
- Ben: TF-A may be a bit larger task, but still need to look.
- Ben: Other solutions?
- Glen: Could Arm TF-A / TF-M team do this to offload the work?
- Not currently
- Ben: Need a short-term and long-term solution. Short-term: added
a server; long-term: make the configuration changes.
- Can use Staging Server to test it out.
- Git checkout taking lots of the build time.
- Shebu: Failing every night?
- Not sure.
- Riku: Last week, 1 success all other failed.
- Ben: The new server increasing capacity has caused this issue to
show itself.
- Work around is to potentially limit the number of parallel jobs.
- Next Steps:
- Leonardo scope TF-A.
- Create two subtasks for TF-A and TF-M
- Riku: Put in work-around to limit number of parallel jobs
- Need a TFC ticket here?
<end>
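
A minimal sketch of the work-around discussed above (limiting the number of
parallel jobs), using a bounded thread pool. This is not the CI
configuration change itself; `run_with_limit`, `Probe`, and the limit of 4
are placeholders, and the checkout is simulated with a short sleep.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_limit(tasks, max_parallel):
    """Run zero-argument callables, at most `max_parallel` at a time.

    Results are returned in submission order.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


class Probe:
    """Stand-in for a checkout that records how many run at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0

    def checkout(self, job_id):
        with self._lock:
            self.running += 1
            self.peak = max(self.peak, self.running)
        time.sleep(0.01)  # pretend to talk to the git server
        with self._lock:
            self.running -= 1
        return job_id


probe = Probe()
tasks = [lambda i=i: probe.checkout(i) for i in range(20)]
results = run_with_limit(tasks, max_parallel=4)
print(f"jobs run: {len(results)}, peak concurrency: {probe.peak}")
```

The trade-off is the one raised in the meeting: the server sees a bounded
load, at the cost of slower overall builds, which is why this is a
short-term measure until the parallel checkouts themselves are reduced.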
Attendees: Janos Fallath, Matteo, Joanna, Don, Glen, Xinyu, Ben C
Minutes:
- Cancel next week instance - folks are out
Platform enablement
- Chromebook: Arthur wrapping up the work; it will be ready when the
platforms are received.
- Cypress: Got the info and moving forward again.
- NXP Platform: Some background work to see best way to integrate
- Focus back on Cypress
Other
- expect scripts. in holding pattern with focus on mbedTLS
- MC: Two months away from end of FY. Would like expect scripts
finalized. Would like it as a background task when things get blocked
- MC: Would like to at a minimum know what needs to be done and what
help Leonardo may need. LAVA help for example? Anything that can be done
in parallel. So what is left and who needs to do which task?
- MBedTLS
- JF: Don't have AMIs yet, working to get them out. So blocked.
Can't set up environment.
- GV: Jumped forward to M3's for now. Will need to get to having
this up on a real system.
- GV: By end of the week, M3 should be done, having the AMI files
will be key at that point to move on.
- JF: Originally, scripts / files were to be pulled in. Now thinking
of a restricted repo and sharing it - may not have that option. What can
be used beyond a public repo?
- JF: Can Linaro see the private Arm instance?
- Glen: Have to check if can see the private repo *ACTION Glen*.
- Glen: Set up a sync to review repo sharing options *ACTION Glen*
- Moving to GitLab?
- JF: Not at this point
- JF: Could move the test repo to GitLab, but same issue.
- SC Approved:
- Code Coverage tool
- Joanna: A new ticket raised, TFC-160. Can we disable code coverage
for now until gitlab.arm.com is back up? Without it, jobs are
failing.
- MC: Should we wait?
- Joanna: TF-A has been down since Friday.
- MC: Can we see what Dean does to resolve?
- Plan to see if can disable code coverage until resolved.
- MC: Can we be explicit in the ticket on what Linaro is to do, i.e.
request disabling code coverage on TF-A?
Attendees: Janos Fallath, Don, Glen, Xinyu, Ben, Joanna
Minutes:
- Server is ready to bring online, Ben just prioritizing time to get it in
now.
- TFC-110: Licenses available in October - pushed out till then
- TFC-92 Complete - Can now create docker containers for FVP models
- TFC-5 - Add additional FVP models
Is this needed still or will developers add their own?
- MBedTLS
Need to add Janos to the Jira
Need AMI files
Forward email to Janos
- Boards
May start NXP board since Chromebook and Cypress platforms blocked
Chromebooks in shipping
- Expect scripts
Still need some modifications, but not being worked on - MBedTLS is higher
priority
Joanna - Will these get worked on if gaps in board come up?
Glen - yes this could happen
- Code coverage: The bigger aggregated reports are still open
Attendees: Don, Anton, Joanna, Ben, Dave R, Xinyu
FVP: TFC-92 updates?
Ben: Working in staging, needs Docs, tested and moved to staging. Leonardo
is back and will work this. Subtasks are TFC-107,108,115
Anton - Planned to complete TF-M release this week. Within the next two
days.
After the release Ben to bring up the new Server. Already set up and
waiting. Should it be all production, partial staging? Plan to start out
as all Production. Can adjust if still need more staging.
Anton: Is it possible to connect two CI's to one LAVA backend?
Ben: It's possible with correct priorities. LAVA has its own Scheduler.
Anton has a private build from an internal CI and will send its binaries to
be tested in LAVA.
Dave R: Notes on MBed TLS
- There have been some changes to how MBed TLS tests are run. Will make
this public soon. No impacts, but should be easier to use the public repo
versus a private copy
- Unable to host private repos on GitHub - the impact is to migrate
private repos to GitLab. Expect this to change CI triggering. May require
a detailed call on impacts.
- Dave is here Wednesday and Thursday next week, then out for 4 weeks. Dave will
provide a delegate.
Hello;
The test with migrating Jira to the Cloud version went well. We will
migrate over on Thursday, 7/29. The transition should only take an hour
and I will send out an email when it's completed. The new Jira
instance will be at https://linaro.atlassian.net/jira/projects
Thanks;
-g
On 7/27/2021 12:17 PM, Don Harbin via Tf-openci-triage wrote:
> Attendees: Don, Glen, Ben C., Joanna, Xinyu, Shebu, Anton
>
> Minutes:
>
> - Glen: Migrating Jira to Cloud - Atlassian is going Cloud only.
> - Glen: New H/W in place?
> - Joanna: No conflicts for next month or two, no issues
> - TF-M release - Anton: First Friday of August
> - Ben: Open a ticket and Ben will bring it up after 1.4.0
> - Glen: Would it help benefit TF-M to do it now?
> - Discussed the split of resources between staging vs production
> - Server load overview
> - https://tinyurl.com/486325k5
> - Change the "01" to "02" to see both servers
> - Anton: ST board can now be used to run release tests?
> - Glen: Yes, it's ready.
> - Glen: Chromebooks. Getting the final platforms to Cambridge lab.
> Arthur to finish in next week prior to board arrival. Will start Cypress
> next.
> - Joanna: Need to work out who develops tests for the Chromebooks.
> - Glen: Will check with Arthur on basic boot tests
> - Joanna: Expect Scripts and Code Coverage patches reviewed. They look
> fine. What about uploading our own FVP models (TFC-108)?
> - Ben: Riku has done some work on this (created CI job). See ticket
> comments. No ETA yet, but doesn't look like a long task. Leonardo will be
> tasked with moving Riku's job over, but he's out this week.
> - Joanna: How do new users leverage the download tool?
> - Raise a TFC ticket.
> - MBedTLS - SOW updated, milestones charted out. Jira cards created.
> - Support for IAR compiler?
> - Shebu: Got some licenses for Arm use, but will be a future milestone.
> - Shebu out for month of August. Dave R will need to identify a POC for
> MBedTLS guidance.
>
> Don
--
Linaro <http://www.linaro.org>
Glen Valante | Sr. Technical Program Manager
T: +1.508.517.3461
glen.valante(a)linaro.org | Skype: gvalante