SafeTensors serialization for PyTorch models

Open kunigori opened this issue 1 month ago • 5 comments

Adds SafeTensors-based serialization for PyTorch models (addresses #2532) and implements metadata-driven loading to integrate cleanly with the materializer workflow (per @bcdurak's feedback).

Changes

  • ✅ Add safetensors optional extra in pyproject.toml
  • ✅ Save state_dict to .safetensors when available; fall back to .pt with a warning
  • ✅ Write minimal metadata.json (class_path, serialization_format)
  • ✅ Use TemporaryDirectory + copy_dir() for remote stores
  • ✅ load() always returns nn.Module
  • ✅ Backward compat: supports weights.pt, checkpoint.pt, and legacy entire_model.pt
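
The save path described in the list above can be sketched roughly as follows. This is a standard-library illustration, not the materializer's actual code: `choose_weights_filename` and `stage_and_copy` are hypothetical names, the payload is plain bytes instead of a real state_dict, and `shutil.copytree` stands in for the artifact store's copy_dir().

```python
import importlib.util
import shutil
import tempfile
import warnings
from pathlib import Path


def choose_weights_filename() -> str:
    """Pick the weights file name depending on whether safetensors is importable."""
    if importlib.util.find_spec("safetensors") is not None:
        return "weights.safetensors"
    # Fallback path: warn, then use the pickle-based format.
    warnings.warn(
        "safetensors is not installed; falling back to pickle-based weights.pt",
        stacklevel=2,
    )
    return "weights.pt"


def stage_and_copy(payload: bytes, destination: Path) -> Path:
    """Write into a local temporary directory, then copy it to `destination`."""
    filename = choose_weights_filename()
    with tempfile.TemporaryDirectory() as tmp:
        # A real implementation would serialize the state_dict here.
        (Path(tmp) / filename).write_bytes(payload)
        # Assumption: stand-in for copy_dir(); shutil only handles local paths.
        shutil.copytree(tmp, destination, dirs_exist_ok=True)
    return destination / filename
```

Staging locally first means the serializer only ever writes to a normal filesystem path, and the upload to a remote store happens as one directory copy.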

New artifact layout

artifact_uri/
├─ weights.safetensors   # or weights.pt on fallback
└─ metadata.json         # class_path + format

Metadata

{
  "class_path": "my_package.models.MyModel",
  "serialization_format": "safetensors",
  "init_args": [],
  "init_kwargs": {},
  "factory_path": null
}
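
A minimal sketch of how metadata like this could be written and then used to rebuild the model class at load time. The helper names (`write_metadata`, `resolve_class`, `instantiate_from_metadata`) are hypothetical and only the standard library is used; loading weights into the instance is left out.

```python
import importlib
import json
from pathlib import Path
from typing import Any


def write_metadata(directory: Path, cls: type, serialization_format: str) -> Path:
    """Write metadata.json with the fields shown in the example above."""
    metadata = {
        "class_path": f"{cls.__module__}.{cls.__qualname__}",
        "serialization_format": serialization_format,
        "init_args": [],
        "init_kwargs": {},
        "factory_path": None,  # serialized as null
    }
    path = directory / "metadata.json"
    path.write_text(json.dumps(metadata, indent=2))
    return path


def resolve_class(class_path: str) -> type:
    """Import 'package.module.ClassName' and return the class object.

    Assumes a top-level class; nested classes would need extra handling.
    """
    module_name, _, class_name = class_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def instantiate_from_metadata(directory: Path) -> Any:
    """Rebuild an instance using only metadata.json (Phase 1: zero-arg init)."""
    metadata = json.loads((directory / "metadata.json").read_text())
    cls = resolve_class(metadata["class_path"])
    # Phase 1 ignores init_args / init_kwargs / factory_path.
    return cls()
```

Because the class is imported by its dotted path, the module in class_path has to be importable in the environment that loads the artifact.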

Why SafeTensors?

  • Security: Avoids pickle-based code execution risks
  • Performance: Faster, memory-mapped weight loads
  • Compatibility: Works with S3/GCS/Azure via artifact stores
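
To make the security point concrete, here is a small, harmless demonstration of why unpickling untrusted data is risky: an object can tell pickle to call an arbitrary callable during load. The `Payload` class and `demo` function are illustrative only, and the side effect is limited to creating a directory inside a temporary folder.

```python
import os
import pickle
import tempfile


class Payload:
    """Object whose unpickling triggers a function call chosen by its author."""

    def __init__(self, marker: str) -> None:
        self.marker = marker

    def __reduce__(self):
        # pickle stores "call os.makedirs(marker)" instead of object state.
        return (os.makedirs, (self.marker,))


def demo() -> bool:
    """Return True if merely loading the pickle created the marker directory."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "created_by_unpickling")
        blob = pickle.dumps(Payload(marker))
        assert not os.path.exists(marker)
        pickle.loads(blob)  # executes os.makedirs(marker)
        return os.path.isdir(marker)
```

A format that stores only tensor data and a header has no equivalent hook, which is the motivation for preferring it over pickle-based .pt files.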

Tests

Local run:

pytest tests/unit/integrations/pytorch/materializers/test_pytorch_module_materializer.py -v
# 4 passed in 1.88s

Coverage:

  • Round-trip with safetensors
  • Pickle fallback path
  • Metadata-driven load
  • Legacy formats (weights.pt, checkpoint.pt, entire_model.pt)
  • Clear error when safetensors extra is missing at load
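
As an illustration of the last item, a load-time guard might look like the sketch below. `ensure_format_available` is a hypothetical helper, and the `find_spec` parameter exists only so a test can simulate a missing package; the error message is an assumption, not the text used in the PR.

```python
import importlib.util
from typing import Callable, Optional


def ensure_format_available(
    serialization_format: str,
    find_spec: Callable[[str], Optional[object]] = importlib.util.find_spec,
) -> None:
    """Raise a descriptive ImportError if a safetensors artifact cannot be read."""
    if serialization_format != "safetensors":
        return
    if find_spec("safetensors") is None:
        raise ImportError(
            "This artifact was saved in the safetensors format, but the "
            "safetensors package is not installed. Install the optional "
            "safetensors extra and try again."
        )
```

In a unit test, passing `find_spec=lambda name: None` forces the missing-package branch without uninstalling anything.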

Known limitations (Phase 1)

  • Zero-argument __init__() requirement: Models must be constructible without arguments. Models that need constructor configuration are not supported yet; factory-based construction is planned for Phase 2

  • Legacy artifacts without metadata (weights.pt / checkpoint.pt) require:

      model = materializer.load(data_type=MyModel)

  • Legacy entire_model.pt is loaded and returned as a Module directly (no data_type needed)
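
The load rules above can be summarized as a dispatch on which files are present. The sketch below only decides which path would be taken and returns a label; `plan_load` is a hypothetical name, and the precedence order (metadata first, then legacy state dicts, then entire_model.pt) is an assumption rather than something the PR states.

```python
import json
from pathlib import Path
from typing import Optional


def plan_load(artifact_dir: Path, data_type: Optional[type] = None) -> str:
    """Describe how an artifact directory would be loaded."""
    metadata_file = artifact_dir / "metadata.json"
    if metadata_file.exists():
        # New layout: the class comes from class_path, so data_type is optional.
        fmt = json.loads(metadata_file.read_text())["serialization_format"]
        return f"metadata:{fmt}"

    for legacy_name in ("weights.pt", "checkpoint.pt"):
        if (artifact_dir / legacy_name).exists():
            # A bare state_dict carries no class information.
            if data_type is None:
                raise ValueError(
                    f"{legacy_name} has no metadata; pass data_type=<model class>."
                )
            return f"legacy-state-dict:{legacy_name}"

    if (artifact_dir / "entire_model.pt").exists():
        # The pickled object already is the module.
        return "legacy-entire-model"

    raise FileNotFoundError(f"No recognized model files in {artifact_dir}")
```

Raising early when data_type is missing gives a clearer message than failing later while trying to construct the model.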

Documentation

Happy to add a short guide covering why/how/limits/troubleshooting. Which file should I update?

  • docs/book/component-guide/materializers/pytorch.md (materializer behavior)?
  • docs/book/integration-guide/pytorch.md (integration landing)?

Or would you prefer a new section?

Future work (separate PRs)

  • Phase 2: Support init_args / init_kwargs / factory functions
  • Phase 3: PyTorch Lightning materializer
  • Phase 4: HuggingFace Transformers support

Checklist

  • [x] Tests pass locally
  • [x] Code formatted (ruff check --fix + ruff format)
  • [x] Also ran project scripts: bash scripts/format.sh and bash scripts/lint.sh
  • [x] Type hints added (mypy clean)
  • [x] Backward compatibility maintained
  • [x] Rebased on develop
  • [ ] Documentation updated (pending guidance on location)
  • [x] CLA signed

kunigori avatar Nov 11 '25 04:11 kunigori

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


yusuke kunimitsu seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

CLAassistant avatar Nov 11 '25 04:11 CLAassistant

Hey @kunigori, thanks for the PR! Can you please base your changes on the develop branch and then also change the target of this PR?

schustmi avatar Nov 11 '25 04:11 schustmi

⚠️ This PR has been inactive for 2 weeks and has been marked as stale. Timeline:

  • Week 2 (now): First reminder - PR marked as stale
  • Week 4: PR will be automatically closed if no activity

Please update this PR or leave a comment to keep it active. Any activity will reset the timer and remove the stale label.

github-actions[bot] avatar Dec 10 '25 02:12 github-actions[bot]

Thanks for the update. I'm still actively working on this PR and will push revisions soon.

kunigori avatar Dec 11 '25 07:12 kunigori

⚠️ This PR has been inactive for 2 weeks and has been marked as stale. Timeline:

  • Week 2 (now): First reminder - PR marked as stale
  • Week 4: PR will be automatically closed if no activity

Please update this PR or leave a comment to keep it active. Any activity will reset the timer and remove the stale label.

github-actions[bot] avatar Dec 27 '25 02:12 github-actions[bot]