
Yank and remove support for `clockworklabs/spacetimedb` docker image

Open · gefjon opened this issue 9 months ago • 16 comments

Bounty: Enable & Validate mistralai/Ministral-8B-Instruct-2410 on Tenstorrent Hardware

Description

This bounty involves enabling and validating mistralai/Ministral-8B-Instruct-2410 on Tenstorrent’s hardware. The goal is to ensure that mistralai/Ministral-8B-Instruct-2410 compiles, runs end-to-end (inference), meets minimal performance benchmarks, and includes sufficient documentation for community adoption.

Target Model & Difficulty Ratings

  • Model: mistralai/Ministral-8B-Instruct-2410
  • Model Hardware Target: Wormhole (N150/N300)
  • Theoretical Maximum Throughput: 23 t/s/u (tokens per second per user)

Difficulty Ratings (Performance Targets):

  • Easy ($500): Achieve ≥ 25% of theoretical maximum throughput (~6 t/s/u).
  • Medium ($1500): Achieve ≥ 50% of theoretical maximum throughput (~12 t/s/u).
  • Hard ($2500): Achieve ≥ 70% of theoretical maximum throughput (~16 t/s/u).
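The rounded figures in the tiers above follow directly from the 23 t/s/u theoretical maximum. A quick sketch of the arithmetic (the tier fractions are taken from this issue; nothing here is a tt-metal API):

```python
# Derive the per-tier throughput targets from the theoretical maximum
# of 23 tokens/second/user quoted above.
THEORETICAL_MAX_TPSU = 23.0

TIERS = {"Easy": 0.25, "Medium": 0.50, "Hard": 0.70}

for name, fraction in TIERS.items():
    target = THEORETICAL_MAX_TPSU * fraction
    print(f"{name}: >= {target:.2f} t/s/u")
# Easy: >= 5.75, Medium: >= 11.50, Hard: >= 16.10 — rounded to ~6/~12/~16 above.
```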

Success Criteria

1. Functional Bring-Up

  • mistralai/Ministral-8B-Instruct-2410 compiles and runs on Tenstorrent hardware with no blocking errors.

2. Performance Validation

  • Clearly document throughput and latency.
  • Achieve at least the baseline targets for the selected difficulty rating.
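For the throughput and latency numbers, a minimal timing harness is usually enough. The sketch below assumes an opaque per-token decode callable (`generate_fn` is a placeholder, not a real tt-metal entry point):

```python
import time

def measure_throughput(generate_fn, prompt, num_tokens):
    """Time a decode loop and report tokens/sec and mean per-token latency.

    `generate_fn` stands in for whatever single-step decode function the
    bring-up exposes; substitute the real call for your model runner.
    """
    start = time.perf_counter()
    for _ in range(num_tokens):
        generate_fn(prompt)  # one decode step
    elapsed = time.perf_counter() - start
    throughput = num_tokens / elapsed           # t/s/u for a single user
    latency_ms = 1000.0 * elapsed / num_tokens  # mean per-token latency
    return throughput, latency_ms
```

Reporting both numbers side by side (and the batch size used) makes the difficulty-tier claim verifiable by reviewers.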

3. Accuracy Validation

  • Validate accuracy by running mistralai/Ministral-8B-Instruct-2410 both on Tenstorrent hardware and a CPU baseline.
  • Document comparative accuracy clearly, meeting or exceeding:
    • Top-1 accuracy greater than 80% and top-5 accuracy greater than 95% when compared to the CPU baseline.
    • Reference current model benchmarks here: Tenstorrent Transformer Performance
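One common way to phrase the top-1/top-5 comparison is: at each position, does the CPU baseline's argmax token appear in the device's top-k predictions? A small sketch under that interpretation (the container shapes are illustrative, not a tt-metal interface):

```python
def top_k_agreement(device_logits, cpu_top1_tokens, k=5):
    """Fraction of positions where the CPU baseline's top-1 token
    appears in the device's top-k predictions (k=1 gives top-1 accuracy).

    device_logits: list of per-position {token_id: score} dicts
    cpu_top1_tokens: CPU baseline's argmax token id at each position
    """
    hits = 0
    for logits, ref_token in zip(device_logits, cpu_top1_tokens):
        top_k = sorted(logits, key=logits.get, reverse=True)[:k]
        hits += ref_token in top_k
    return hits / len(cpu_top1_tokens)
```

Running this at k=1 and k=5 over a shared evaluation prompt set yields the two percentages the criteria above ask for.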

4. Documentation

  • Provide clear instructions for building, installing dependencies, and running on Tenstorrent.
  • Note any relevant parameter tuning or known issues.

:red_circle: Strict (hard requirement, PRs may be rejected)

5. Code Reuse & Clean Structure

  • PRs that copy-paste entire codebases or duplicate existing functionality will not be accepted.
  • If the model is transformer-based and similar to Llama or Qwen, it must use the tt-transformers base in `tt-metal/models/tt_transformers`, reusing modules with minimal changes.
  • Code should follow a clean, modular structure for internal reuse and customer deliverables.
  • Contributors must avoid unnecessary duplication — bring-up is only successful if done in a maintainable, production-ready way.

Bring-up Checklist

  • [ ] Model functionality verified
  • [ ] Post-commit CI passes
  • [ ] Model regression CI testing passes (if applicable)
  • [ ] Device performance regression CI testing passes (if applicable)
  • [ ] New/Existing tests provide coverage for changes
  • [ ] Demo script to show the inference output

Notes

  • Work may involve multiple pull requests across relevant repositories (e.g., TT-metal or others).
  • If additional patches or dependencies are required, please include a short guide for maintainers.

Helpful Links

gefjon · Mar 26 '25 17:03

The `spacetimedb` image is created automatically by our GH workflows. We should consider migrating them to create the `spacetime` one instead, which I believe we currently create manually.

bfops · Mar 26 '25 18:03

We've made it private.

bfops · Apr 16 '25 19:04

We haven't done everything described in this issue. We should either leave this issue open, or create a new one to track the remaining pieces.

bfops · Apr 18 '25 16:04

Completing this ticket: https://github.com/clockworklabs/SpacetimeDB/pull/2643

jdetter · Apr 18 '25 21:04

Context transferred to this ticket: https://github.com/clockworklabs/SpacetimeDBPrivate/issues/1627

jdetter · Apr 22 '25 20:04

This ticket is marked done.

bfops · Apr 28 '25 17:04