open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
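Since OpenLLaMA is a drop-in reproduction of LLaMA, it can be loaded with the standard Hugging Face LLaMA classes. A minimal sketch is below; the repo id `openlm-research/open_llama_7b`, the prompt, and the generation settings are assumptions for illustration, not the project's official example.

```python
# Minimal sketch: load an OpenLLaMA checkpoint with transformers and generate text.
# Assumes the "openlm-research/open_llama_7b" repo id and a GPU with enough memory.
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_id = "openlm-research/open_llama_7b"  # assumed checkpoint id

tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Q: What is the largest animal?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```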
StarCoderPlus uses StarCoder plus the RefinedWeb dataset for training, but with a longer context length. Are there plans to release a version of this model with a longer context length, such as 8192?
I know the repo's readme mentions that this model apparently can't code because consecutive spaces are merged, and this has been discussed in #40...
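For context, the space-merging behaviour is usually attributed to the auto-converted *fast* tokenizer rather than the model itself; loading the slow SentencePiece tokenizer with `use_fast=False` is the commonly suggested workaround. A minimal sketch to compare the two (the repo id is an assumption):

```python
# Sketch: compare fast vs. slow tokenization of code with significant whitespace.
# Assumes the "openlm-research/open_llama_7b" repo id; the fast tokenizer may
# merge consecutive spaces, which breaks indentation-sensitive code.
from transformers import AutoTokenizer

model_id = "openlm-research/open_llama_7b"  # assumed checkpoint id

slow_tok = AutoTokenizer.from_pretrained(model_id, use_fast=False)
fast_tok = AutoTokenizer.from_pretrained(model_id, use_fast=True)

code = "def f(x):\n    return x  # the four leading spaces matter"
print(slow_tok.tokenize(code))
print(fast_tok.tokenize(code))  # compare how runs of spaces are handled
```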
| Task          | Version | Metric   | Value  |   | Stderr |
|---------------|--------:|----------|-------:|---|-------:|
| anli_r1       |       0 | acc      | 0.3330 | ± | 0.0149 |
| anli_r2       |       0 | acc      | 0.3320 | ± | 0.0149 |
| anli_r3       |       0 | acc      | 0.3367 | ± | 0.0136 |
| arc_challenge |       0 | acc      | 0.2099 | ± | 0.0119 |
|               |         | acc_norm | 0.2705 | ± | ...    |
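These numbers look like output from EleutherAI's lm-evaluation-harness. A minimal sketch of running the same tasks is below; the harness's Python API differs between versions, and the checkpoint id and batch size here are assumptions, so treat this as an outline rather than the exact command used to produce the table.

```python
# Sketch: evaluate an OpenLLaMA checkpoint on the tasks shown above with
# lm-evaluation-harness (API shown is the 0.4.x-style simple_evaluate entry point).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=openlm-research/open_llama_7b",  # assumed checkpoint id
    tasks=["anli_r1", "anli_r2", "anli_r3", "arc_challenge"],
    batch_size=8,  # assumed; adjust for your GPU
)
print(results["results"])  # per-task metrics (acc, acc_norm, stderr, ...)
```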
I've just learned about this [USB stick from Intel](https://www.ebay.co.uk/itm/334715824949?itmmeta=01HZC22YC6GXAV6T0A054HNQXX&hash=item4dee9e2b35:g:rxAAAOSwZGpjz8qo&itmprp=enc%3AAQAJAAAA4MdspbnsfeUlmRKNHn5xIlD17cTXRaY7QazjwNPBbMKWvmMFj9ucFTUQ7gYxg1KnRE0IoAVBk3UamDUMOivD1UZLTviDtiGsZgVKZ%2Fwp4ie4P63BqYKHNWJl49KDby0M05A2jjtvYdQzbgB%2F5QC9ju%2BRwahD6mOmmQ2p710E7KXrqpnDQvWzQME9ZJbCOxDhQTqGG1%2BUqpC3dgMRPPgWoJA08BH4vlK%2BbjOU8DpfTOBQu8rhZBX%2FgtUteJf%2FaqvtIbb5LKAIYmCvTAmiR9Cc3FJjdU115jj9gS4HEv8o85AY%7Ctkp%3ABk9SR5jmi4L7Yw). Would I be able to boost performance with one of these, or wouldn't it be compatible?