Thomas Capelle
I would also use this PR to remove/replace the old content in wandb-artifacts (and put this file in there as a getting-started guide).
Can you make both `wandbcode` entries consistent?
I would also like more info about this. Do you use DeepSpeed to increase the batch size? A 7B model fits nicely on 80GB GPUs without any model parallelism.
Thanks for the prompt response =). BTW, outstanding presentation at DL.ai, @edbeeching! What I am curious about is why use DeepSpeed ZeRO-3 when using 80GB GPUs: is it faster? or...
Yes, but in the README:

> Full fine-tuning on a multi-GPU machine with DeepSpeed ZeRO-3 (tested on an 8 x A100 (80GB) node)

I am curious about why you chose...
The DPO recipe with a 7B model with config_full gets me OOM, so I was wondering what I should reduce to keep the recipe consistent.

> I am on 8xA100...
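For reference, I assume the relevant knobs are the usual TrainingArguments/DPOTrainer fields in the recipe YAML; a minimal sketch of the kind of overrides I mean (the values are placeholders, not the recipe's defaults):

```yaml
# Hypothetical memory-saving overrides for the DPO full recipe.
# Keys are standard TrainingArguments/DPOTrainer fields; values are placeholders.
per_device_train_batch_size: 1    # smaller micro-batch per GPU
gradient_accumulation_steps: 16   # keep the effective global batch size roughly constant
gradient_checkpointing: true      # trade recompute for activation memory
max_length: 1024                  # shorter sequences also cut activation memory
```

The gradient accumulation bump is meant to keep the global batch size in line with the published recipe; is that the right direction?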
@jamie-rasmussen I was missing the sidebar.ts update
I have read the CLA Document and I hereby sign the CLA
Great! Missing the docs, though.
Hey, it depends. We probably should merge this into a working branch instead of main, as it introduces breaking changes and removes a lot of files. The scores obtained are...