Yi
A series of large language models trained from scratch by developers @01-ai
Hello! Is there any information about how to fine-tune the 6B-200K context-window model?
Because both Yi-6B-200K and Yi-34B-200K appear to have 200K-length context capability, this is quite exciting (GPT-4-Turbo only offers 128K). It has attracted attention from many practitioners at home and abroad, and I'm curious how I should go about verifying the claim. The official team doesn't seem to have provided any experimental data or a guide for this yet. Could anyone offer some help? Thanks.
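One common way to verify a long-context claim is a "needle in a haystack" retrieval probe, as in the gkamradt/LLMTest_NeedleInAHaystack test mentioned below. A minimal sketch of the prompt construction (the model name, generation settings, and helper function here are assumptions, not official Yi tooling):

```python
# Minimal sketch of a "needle in a haystack" long-context probe.
# The model id and generation call below are assumptions and are
# left commented out since they require a GPU to actually run.

def build_haystack(needle: str, filler: str, total_chars: int, depth: float) -> str:
    """Repeat `filler` up to ~total_chars and bury `needle` at a
    relative position `depth` (0.0 = start, 1.0 = end)."""
    body = (filler * (total_chars // len(filler) + 1))[:total_chars]
    pos = int(len(body) * depth)
    return body[:pos] + " " + needle + " " + body[pos:]

needle = "The magic number is 42417."
haystack = build_haystack(
    needle,
    filler="The grass is green. The sky is blue. ",
    total_chars=200_000,  # raise this to approach the 200K-token claim
    depth=0.5,            # sweep depth over [0, 1] to map retrieval accuracy
)
prompt = haystack + "\n\nWhat is the magic number? Answer with the number only."

# Hypothetical model call using the transformers library:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("01-ai/Yi-6B-200K")
# model = AutoModelForCausalLM.from_pretrained("01-ai/Yi-6B-200K", device_map="auto")
# inputs = tok(prompt, return_tensors="pt").to(model.device)
# out = model.generate(**inputs, max_new_tokens=16)
# A successful retrieval should mention "42417" in the decoded output.
```

Repeating this over a grid of context lengths and needle depths gives the heat-map style plots shown in the linked test results.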
Hi, I've tried the 34B-Chat model on Replicate and found that the model's safety guardrails can be bypassed quite easily with minimal adversarial prompting. The same prompt will fail on any...
Could you please run this context test for the `Yi-6B-Chat` model? Here is the code: [link](https://github.com/gkamradt/LLMTest_NeedleInAHaystack) Below are the results for the `Qwen-72B-Chat` model: ![image](https://github.com/QwenLM/Qwen/raw/main/assets/qwen_72b_needle_in_a_haystack.png) This request is not a...
Could you run this long context test for `Yi-34B-200k`? code: [link](https://github.com/gkamradt/LLMTest_NeedleInAHaystack) The results for `GPT-4-128k` and `Claude 2.1` are as follows: ![image](https://github.com/gkamradt/LLMTest_NeedleInAHaystack/raw/main/img/GPT_4_testing.png) ![image](https://github.com/gkamradt/LLMTest_NeedleInAHaystack/raw/main/img/Claude_2_1_testing.png)
What kind of performance could merging the 6B model with the 6B-200K model achieve? https://github.com/cg123/mergekit
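A merge like the one asked about above is typically described to mergekit with a YAML config. The sketch below assumes a simple linear (weight-averaged) merge of the two checkpoints; the exact schema and supported merge methods should be checked against the mergekit README:

```yaml
# Hypothetical mergekit config: average Yi-6B and Yi-6B-200K weights.
merge_method: linear
models:
  - model: 01-ai/Yi-6B
    parameters:
      weight: 0.5
  - model: 01-ai/Yi-6B-200K
    parameters:
      weight: 0.5
dtype: float16
```

Whether the merged model retains the 200K context behavior would still need to be verified empirically, e.g. with the needle-in-a-haystack test discussed in this thread.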