torchtune
Plans for fp8 tuning going forward? E.g., DeepSeek-V3
As foundation models move towards being trained in 8-bit (fp8) precision, is there a plan on the roadmap to support this kind of training?
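For context on what fp8 training looks like in the PyTorch ecosystem today, here is a minimal sketch using torchao's float8 training API (`torchao.float8.convert_to_float8_training`). This is not torchtune's integration, just an illustration of the general approach; it assumes a recent torchao install and fp8-capable hardware (e.g., H100).

```python
# Sketch only: swaps eligible nn.Linear layers for float8 variants
# that run their matmuls in fp8 with dynamic scaling, while the
# rest of the model stays in high precision.
import torch
import torch.nn as nn
from torchao.float8 import convert_to_float8_training

model = nn.Sequential(
    nn.Linear(4096, 4096),
    nn.ReLU(),
    nn.Linear(4096, 4096),
).to(device="cuda", dtype=torch.bfloat16)

# Requires fp8-capable hardware (e.g., compute capability >= 8.9).
convert_to_float8_training(model)

# Training then proceeds as usual; gradients flow through the fp8 matmuls.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(8, 4096, device="cuda", dtype=torch.bfloat16)
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
```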
Related to DeepSeek-V3: are there plans to support mixture-of-experts (MoE) architectures? I could fully understand if this is too far away from a coherent roadmap. (A sketch of the layer type in question follows below.)
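To make the request concrete, here is a minimal sketch of a top-k routed MoE layer of the kind DeepSeek-V3 uses (simplified, without shared experts or load-balancing losses). All class and parameter names here are illustrative, not torchtune APIs.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Token-level top-k routing over a set of expert MLPs (illustrative)."""

    def __init__(self, dim: int, hidden: int, num_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Pick each token's top-k experts by router score.
        scores = self.router(x)                            # (tokens, num_experts)
        weights, idx = scores.softmax(-1).topk(self.k, dim=-1)
        weights = weights / weights.sum(-1, keepdim=True)  # renormalize over the k chosen
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # Gather the tokens routed to expert e and mix in its output.
            tok, slot = (idx == e).nonzero(as_tuple=True)
            if tok.numel():
                out[tok] += weights[tok, slot, None] * expert(x[tok])
        return out

moe = TopKMoE(dim=256, hidden=1024)
y = moe(torch.randn(16, 256))  # -> (16, 256)
```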
Docs are broken? Almost all buttons in the "basic" menu lead to 404 errors.
And the return-home button on 404 pages points to "docs.crawl4ai.com", which cannot be found.
@AADaoud Try "docs.crawl4ai.com" again, and make sure to clear your cache.