Ewout ter Hoeven

Results 174 issues of Ewout ter Hoeven

I came across the concept of surrogate models, and it sounds very interesting (and tricky), especially after having encountered a model with a runtime of an hour myself. Especially this...

I was utterly amazed to read this in the ROCm 5.6 [release notes](https://rocm.docs.amd.com/en/latest/CHANGELOG.html#rocm-5-6-0):
> - AMD Instinct MI50, Radeon Pro VII, and Radeon VII products (collectively referred to as gfx906...

The [Phi-3 Technical Report](https://arxiv.org/abs/2404.14219) was just published by the Microsoft team, in which they introduce a family of three state-of-the-art models:
- phi-3-mini (3.8B)
- phi-3-small (7B)
- phi-3-medium...

Currently the [Reka](https://www.reka.ai/) Flash model can be compared in the Chatbot Arena, but the smaller Edge and larger Core cannot. It would be interesting to see how these two models...

A bit of maintenance on the CI:
- Update the actions used to their latest versions
- Remove Python 3.6 (has been end-of-life for over 2 years) and add Python...

Currently [docs](https://mesa.readthedocs.io/) aren't built for features in [mesa/experimental](https://github.com/projectmesa/mesa/tree/main/mesa/experimental), particularly the [cell_space](https://github.com/projectmesa/mesa/tree/main/mesa/experimental/cell_space) and [devs](https://github.com/projectmesa/mesa/tree/main/mesa/experimental/devs). It would be really useful to have those docs built on Read the Docs. It probably requires...

docs
Sprints!

The error
```
Warning: binaries did not work: Error: HTTP GET failed: Not Found
```
has been [popping up](https://github.com/ilammy/setup-nasm/actions?query=branch%3Amaster) on Ubuntu runs. Now it builds from source, which is fine...

### Describe the feature
Deepinfra is currently one of the cheapest API providers of LLM inference (together with groq, see #108), which makes it interesting to support their API. They...

➕ feature

Some maintenance to the CI:
- Update the checkout and setup-python actions to their latest versions
- Allow triggering manually and run once a week (on Monday morning)
- Use...

Microsoft released a new paper, which contains details and tips on training a ternary LLM. Might be useful! - https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf