Process: support tiers for hardware for Zephyr v3.3

Open mbolivar-nordic opened this issue 3 years ago • 10 comments

This issue covers a process discussion for introducing support tiers for hardware and "leaf" modules.

The introduction of "tier 2" support would create a body of code not tested in CI that might be useful but for which we make minimal or no representations.

Previous discussion:

https://docs.google.com/document/d/1LOhgronKx3TxUypYLy9nex0_3UC58FMMjbXV4wbiuiE/edit#

mbolivar-nordic avatar Sep 15 '21 14:09 mbolivar-nordic

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

github-actions[bot] avatar Nov 15 '21 00:11 github-actions[bot]

related discussion:

  • https://github.com/zephyrproject-rtos/zephyr/discussions/35872

nashif avatar Feb 09 '22 13:02 nashif

Process WG:

  • @nashif : want to avoid repeating testing work on two different boards with the same hardware
  • @mbolivar-nordic : designating this at the board level misses some things, like sensor drivers
  • @nashif : also ties to missing documentation at the SoC level, not just board level. E.g. a change to an SoC right now rebuilds everything. If we can maintain meaningful information on what SoCs are present on a board, that helps for coverage, documentation, drivers
  • @mbolivar-nordic : propose we move the conversation over to how we write this metadata down in a machine-readable way for the documentation rework that's ongoing
  • @dleach02 : how to start tracking which platforms are priority for different HW features, though?
  • @nashif : probably worth defining deeper criteria for testing and associating them with release criteria
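
As a rough illustration of the machine-readable board/SoC metadata discussed above, here is a minimal Python sketch. The board and SoC names, the `BOARD_SOCS` mapping, and all function names are hypothetical; this is not an existing Zephyr format or tool. It only shows how such metadata could narrow the set of boards affected by an SoC change and avoid repeating tests on boards that share the same hardware.

```python
from collections import defaultdict

# Hypothetical metadata: which SoCs are present on each board.
# None of these names correspond to real Zephyr boards or SoCs.
BOARD_SOCS = {
    "board_a": ["soc_x"],
    "board_b": ["soc_x"],
    "board_c": ["soc_y"],
}


def boards_by_soc(board_socs):
    """Invert the board -> SoCs mapping into SoC -> sorted list of boards."""
    index = defaultdict(list)
    for board, socs in sorted(board_socs.items()):
        for soc in socs:
            index[soc].append(board)
    return dict(index)


def boards_to_rebuild(changed_soc, board_socs):
    """Return only the boards that carry the changed SoC."""
    return boards_by_soc(board_socs).get(changed_soc, [])


def one_board_per_soc(board_socs):
    """Pick a single representative board per SoC to avoid duplicate test runs."""
    return {soc: boards[0] for soc, boards in boards_by_soc(board_socs).items()}
```

With this metadata, a change to `soc_x` would select `board_a` and `board_b` rather than every board, and `one_board_per_soc` would test `soc_x` on only one of them.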

mbolivar-nordic avatar Feb 16 '22 17:02 mbolivar-nordic

Proposal for board tier names and descriptions:

  • Tier 0: emulation boards used in CI; failures can block PRs. Supported by the Zephyr project itself, commitment to fix bugs in releases. One is required for each new architecture.
  • Tier 1: "real hardware" with commitment from a specific team to run tests using twister device testing for the "Zephyr compatibility test suite" (details TBD) mentioned above on a regular basis using open drivers. Commitment to fix bugs in time for releases. Not supported by "Zephyr Project" itself. Other quality of developer experience criteria TBD. General availability for purchase (modulo silicon supply chain crisis).
  • Tier 2: board implementation is available in upstream, no commitment to testing, may not be generally available. TL;DR no guarantees, if it breaks you get to keep both pieces. Has a dedicated maintainer who commits to respond to issues / review patches, however.
  • Tier 3: Deprecated board. Board implementation is available, but no owner or unresponsive owner. No commitment to support is available. May be removed from upstream if no one works to bring it up to tier 2 or better. If the last board for a particular SoC is removed, the SoC and drivers will be removed as well.
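
The tier descriptions above can be read as a simple decision procedure. The following Python sketch is one possible reading, not an agreed definition: the `Tier` member names, the `Board` fields, and the `classify` function are all assumptions made for illustration, and criteria still marked TBD in the proposal (the compatibility test suite contents, developer experience criteria) are left out.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    EMULATION_CI = 0     # emulation board used in CI; failures can block PRs
    TESTED_HARDWARE = 1  # real hardware with a regular device-testing commitment
    MAINTAINED = 2       # responsive maintainer, no testing commitment
    DEPRECATED = 3       # no owner or unresponsive owner; may be removed


@dataclass
class Board:
    # All fields are hypothetical stand-ins for the criteria in the proposal.
    name: str
    emulated_and_in_ci: bool = False
    has_responsive_maintainer: bool = False
    runs_regular_device_tests: bool = False
    generally_available: bool = False


def classify(board: Board) -> Tier:
    """Map a board to a tier following the order of the proposal."""
    if board.emulated_and_in_ci:
        return Tier.EMULATION_CI
    if not board.has_responsive_maintainer:
        return Tier.DEPRECATED
    if board.runs_regular_device_tests and board.generally_available:
        return Tier.TESTED_HARDWARE
    return Tier.MAINTAINED
```

For example, a board with a responsive maintainer but no testing commitment would land in tier 2, and would drop to tier 3 if the maintainer stopped responding.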

Minutes:

Support tiers:

  • @stephanosio : highest would be supported by company, second community support, etc
  • @dleach02 what does community support mean?
  • @nashif let's look at https://github.com/zephyrproject-rtos/zephyr/discussions/35872 again
  • @mbolivar-nordic I would push back on "silicon vendors" being a requirement for "reference" / "tier 1" boards -- the main thing that is needed is a committed maintainer or team of maintainers who commit to testing and fixing bugs on their boards for every release
  • @stephanosio need commitment to testing and reporting results as well
  • @nashif the least we should expect from tier 1 is to use device testing via twister on a regular basis and submit results. In the future we should work towards making boards available on the cloud, but let's not block this effort on that idea.
  • @dleach02 important to note the consumers for this data: a primary one is the release management team
  • @keith-zephyr what is the state of hardware testing now?
  • @dleach02 depends on the vendor. @hakehuang on NXP does our testing on a shared range; he says he tries to run once a week right before the testing wg meeting. Intel does a daily run, right?
  • @nashif we do, yes -- results reported on https://github.com/zephyrproject-rtos/test_results
  • @mbolivar-nordic this is an ask for people to commit to; I think we should list the tier 1 boards in a prominent place so their vendors can claim that their boards are well supported to customers etc.
  • @galak what about other drivers?
  • @nashif we do need this for e.g. sensor drivers, but let's cover platform support first
  • @mbolivar-nordic we agreed in the binary blobs discussion (https://docs.google.com/document/d/1heqcv7dzGvM5rA9xpTMW3kyKJLpZsl2Gjsmr5eqBje8/edit#) to define a suite of tests that targets must be able to pass. I think Tier 1 boards should be able to pass all of these tests on every release.
  • @nashif yes, at least this.
  • @galak should we require open source implementations for key features for tier 1 boards?
  • @nashif I think that makes sense
  • @galak what about public documentation for the SoC?
  • @nashif it's tricky. This is a plus, but not everybody does it the same way; not everybody has permissions to share these.
  • @galak perhaps public availability of documentation for the SoC could give you some points. I bring this up because when I review a vendor driver, if I don't have HW docs, I can't fully review.
  • @nashif this would also enable third parties to write drivers. This is a plus, but certain items on the checklist shouldn't outweigh others
  • @dleach02 we're also really talking about general availability for purchase here. Some board owners don't want to release the board directory, but may want to report testing results. Those results are still useful.
  • @galak additional things that might matter here are schematics, etc.
  • @stephanosio two relevant things for board "tier level" here: 1. support level, 2. developer experience (docs, schematics, etc.)
  • @galak there's a separate conversation about lifecycle to be had here.
  • @stephanosio we can do a periodic re-evaluation with a nomination process for moving between tiers.

Test selection:

  • @nashif : another issue is deciding which tests to run on a board; need to establish the scope of things to cover for testing. It's more about hardware testing of specific functionality: MPU, SMP, etc.

mbolivar-nordic avatar Jul 27 '22 17:07 mbolivar-nordic

Process WG:

  • @mbolivar-nordic: Nordic is interested in having Tier 1 boards. We are looking for a process owner for test reporting infrastructure.
  • @nashif the testing WG needs to define requirements
  • @dleach02 : release managers are a stakeholder here as well

mbolivar-nordic avatar Aug 03 '22 17:08 mbolivar-nordic

Process WG:

  • @nashif we discussed this at the testing WG this week. I asked everyone to explore and come up with proposals; hopefully we will know more at the testing WG meeting tomorrow.
  • @mbolivar-nordic any leading candidates for test management systems, or still open?
  • @nashif we had been looking at an open source solution, but on its own it was not enough, as it was an open-core solution with attempted lock-in via a commercial "brother". The commercial solution looked good, but we need more testing and a decision on cloud-based vs self-hosted. Infrastructure "team" (@stephanosio ) will need to manage whatever is decided.
  • @mbolivar-nordic I think the test suite mentioned in https://github.com/zephyrproject-rtos/zephyr/pull/47678/ is going to form the initial scope for testing for some tier here, and we'll need to move the list of tests in the binary blobs PR over to a dedicated "tier support" test suite definition

mbolivar-nordic avatar Aug 10 '22 16:08 mbolivar-nordic

Process WG:

  • @stephanosio : Testing WG is still discussing
  • @dleach02 : I discussed with @hakehuang ; this is ongoing for tomorrow's meeting as well. General feedback on the NXP side is to start looking for boards to promote internally.

mbolivar-nordic avatar Aug 17 '22 16:08 mbolivar-nordic

Process WG:

  • @stephanosio : "opensearch" was proposed by Nordic at the last testing WG. @nashif was not there so no feedback from him yet; discussion continues. Opensearch is like grafana but has a filtering function that allows you to view what tests ran on what platforms.
  • @nashif let's keep this in progress here and revisit next week; need to discuss some other process related items

mbolivar-nordic avatar Aug 24 '22 16:08 mbolivar-nordic

Process WG:

  • @stephanosio no updates from testing WG on opensearch. I think @nashif is looking into something else

mbolivar-nordic avatar Aug 31 '22 16:08 mbolivar-nordic

Process WG:

  • @nashif (who could not attend) requests that we make the board tier level recommendations to the TSC, with details to be sorted out by the testing WG assuming TSC approval
  • no objections, moving forward with that

mbolivar-nordic avatar Sep 21 '22 16:09 mbolivar-nordic

We've had TSC approval for this in principle for a while, and have now moved the implementation work to the testing WG's domain

mbolivar-nordic avatar Mar 08 '23 15:03 mbolivar-nordic