CodeGen2 issues

Feedback on Salesforce use of GitHub Repos

Please close this repo and redirect it to [salesforce/CodeGen](https://github.com/salesforce/CodeGen)

Question of unk, bos, and eos tokens.

Hi, when I loaded the codegen2-7b vocabulary, I found that the unk, bos, and eos tokens are identical, which is confusing to me since I think these three special tokens should be...

TomasAndersonFang
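A minimal sketch of how one might check for the overlap described in this question, assuming the special-token ids have already been read into a plain dictionary. The function name `shared_special_token_ids` and all id values below are hypothetical and chosen only for illustration; they are not taken from the actual codegen2-7b vocabulary.

```python
from collections import defaultdict


def shared_special_token_ids(token_ids):
    """Return {id: [names]} for every id used by more than one special token."""
    groups = defaultdict(list)
    for name, token_id in token_ids.items():
        groups[token_id].append(name)
    # Keep only the ids that are shared by at least two special tokens.
    return {tid: sorted(names) for tid, names in groups.items() if len(names) > 1}


# Hypothetical ids: all three special tokens collapse onto one id.
overlapping = {"unk": 2, "bos": 2, "eos": 2}
# Hypothetical ids: each special token has its own id.
distinct = {"unk": 0, "bos": 1, "eos": 2}

print(shared_special_token_ids(overlapping))
print(shared_special_token_ids(distinct))
```

With a tokenizer loaded through Hugging Face `transformers`, the dictionary could be filled from the tokenizer's `unk_token_id`, `bos_token_id`, and `eos_token_id` attributes.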

Question of eos token in codegen2-7B model

Hi, I found it weird that **the eos token** in config.json of the codegen2-7B model is set to 2, but in codegen I think it was 50256, is it for...

edwardelric1202

Will the training implementation of CodeGen 2 be released?

Dear CodeGen Team, Thanks for the amazing work and congrats on your ICLR 2023 acceptance! As the paper mentioned in Section 1.4 and Section 4, as a valuable property and...

Robin-Y-Ding

Bump transformers from 4.25.1 to 4.30.0

Bumps [transformers](https://github.com/huggingface/transformers) from 4.25.1 to 4.30.0. Release notes, sourced from transformers's releases: v4.30.0: 100k, Agents improvements, Safetensors core dependency, Swiftformer, Autoformer, MobileViTv2, timm-as-a-backbone. 100k: Transformers has just reached 100k stars...

dependabot[bot]

dependencies

Is CodeGen2-16B the final and complete version?

Thanks for the great work. From the paper I noticed that CodeGen-16 was still being trained at the time of submission. I am curious whether the current version on huggingface is the complete one...

ganler

How is it compared with the StarCoder

https://huggingface.co/blog/starcoder They published some results on HumanEval. Not sure how the two compare.

allanj

Can't this model be trained using multiple nodes and multiple cards?


My machine is a single node with 4 cards and 16G of graphics memory, and running the 16B model with multiple nodes results in OOM regardless of how the number of nodes is...

Tengfei9228
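A rough back-of-the-envelope sketch of why the setup in this question is likely to run out of memory. It assumes that "16B" means 16 billion parameters, that "16G" means 16 GB per card, that weights are stored in fp16 (2 bytes per parameter), and that mixed-precision Adam training needs about 16 bytes per parameter for weights, gradients, and optimizer states. Activations and framework overhead are ignored, so real usage would be higher. The helper name `memory_gb` is hypothetical.

```python
def memory_gb(num_params, bytes_per_param):
    """Memory in GB (10^9 bytes) needed to hold num_params at bytes_per_param each."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 16_000_000_000  # assumption: "16B" = 16 billion parameters
NUM_CARDS = 4                # from the question: 4 cards on one node
GB_PER_CARD = 16             # assumption: 16 GB per card

weights_only_gb = memory_gb(NUM_PARAMS, 2)   # fp16 weights only
training_gb = memory_gb(NUM_PARAMS, 16)      # assumption: mixed-precision Adam
available_gb = NUM_CARDS * GB_PER_CARD

print(f"fp16 weights alone:    {weights_only_gb:.0f} GB")
print(f"training state (Adam): {training_gb:.0f} GB")
print(f"total GPU memory:      {available_gb} GB")
print(f"training state fits:   {training_gb <= available_gb}")
```

Under these assumptions the fp16 weights alone exceed a single 16 GB card, and the full training state exceeds the combined memory of all four cards, so some form of sharding or offloading would be needed in addition to adding devices.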

How many tokens were used for training?

Curious to know how many tokens the models have seen. The repo mentions the dataset, but not the totals. > This checkpoint is trained on the stricter permissive subset of...

edward-io

checkpoints of NL and MIX

Hello, I would like to express my appreciation for your outstanding work. I was reading your research on the influence of DATA MIXING and came across Figure 1, which shows...

Wangpeiyi9979
