
Public repo for HF blog posts

Results: 236 blog issues

As the model is on 'cuda', the test data and labels should also be on the same device for evaluation. I guess this is just a copy-paste issue.
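The fix the issue describes can be sketched as follows; this is a minimal illustration with a toy model and hypothetical helper name, not the notebook's actual code:

```python
import torch

# Move evaluation tensors to whatever device the model lives on
# (CPU here, 'cuda' in the notebook) before computing predictions.
def to_model_device(model, *tensors):
    device = next(model.parameters()).device
    return [t.to(device) for t in tensors]

# Toy model and CPU tensors standing in for the test data and labels.
model = torch.nn.Linear(4, 2)
x = torch.randn(3, 4)
y = torch.tensor([0, 1, 0])

x, y = to_model_device(model, x, y)
preds = model(x).argmax(dim=-1)
print(preds.shape)  # torch.Size([3])
```

Reading the device off `model.parameters()` avoids hard-coding `'cuda'`, so the same evaluation code runs on CPU-only machines.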

As per updates here: https://ai.meta.com/blog/code-llama-large-language-model-coding/#:~:text=Update%3A%20Jan%2029,on%20code%20tasks

Want to copy the AWS case study on HF x Fetch into our case study section: https://aws.amazon.com/fr/partners/success/fetch-hugging-face/

As I was reading through your fine-tuning article for Whisper, I saw that both `large-v2` and `large-v3` are missing from the table showcase. I have added these, so new readers...

@mehdiir, we tried to reproduce your work in our environment and found one odd issue: with your code, `gradient_checkpointing=True` runs much faster than `gradient_checkpointing=False`, which contradicts our intuition (2 hr...

Hello, I wanted to bring to your attention an issue I encountered while working with the notebook provided for training time series models. The results I obtained do not match...

The Chinese translation in the figure caption is not consistent with the paragraph above.

I saw the Falcon blog at https://github.com/huggingface/blog/blob/main/falcon.md and https://huggingface.co/blog/falcon. I tried using it, but I noticed that setting eos = pad leads to an issue where a fine-tuned model never generates...
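A plausible mechanism for the reported behavior, sketched in plain Python (illustrative token ids and helper name, not the blog's actual code): data collators for causal LM fine-tuning typically replace pad positions in the labels with -100 so they are ignored by the loss. If eos shares the pad id, every eos token is ignored too, and the model never learns to emit it.

```python
EOS_ID = 2
PAD_ID = 2  # pad reused as eos, as in the reported setup

def mask_pad_labels(token_ids, pad_id):
    # Replace pad positions with -100, the id ignored by cross-entropy loss.
    return [-100 if t == pad_id else t for t in token_ids]

seq = [5, 6, 7, EOS_ID, PAD_ID, PAD_ID]
print(mask_pad_labels(seq, PAD_ID))
# [5, 6, 7, -100, -100, -100]  <- eos is masked along with the padding

# With a distinct pad id (0 here), eos survives in the labels:
print(mask_pad_labels([5, 6, 7, EOS_ID, 0, 0], 0))
# [5, 6, 7, 2, -100, -100]
```

The usual workaround is to give the tokenizer a dedicated pad token (or reuse a token other than eos) so that eos positions keep contributing to the training loss.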

Hi @ylacombe, Thank you for the new blog post about fine-tuning w2v-BERT. However, I have some doubts about the "average duration seen by each token", or perhaps I might be...

When I run the example in https://huggingface.co/HuggingFaceM4/idefics-9b-instruct

```python
import torch
from transformers import IdeficsForVisionText2Text, AutoProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
checkpoint = "HuggingFaceM4/idefics-9b"
model = IdeficsForVisionText2Text.from_pretrained(checkpoint, torch_dtype=torch.bfloat16).to(device)
processor = ...
```