stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
To execute locally:

```bash
pyflyte run train.py train --model_args='{}' --data_args='{}' --training_args='{"output_dir":"/tmp"}'
```
Please correct me if I got anything wrong; I am trying to learn more about LLM research. Alpaca contribution: your research team instruction-tuned LLaMA via the self-instruct method and...
When I use two A100 nodes (8 × 80GB each), training on two nodes is slower than on one node. I launch with torchrun xxx. Has anyone else run into this?
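For reference, a typical two-node torchrun launch looks like the sketch below; the hostname, port, and script arguments are placeholders, not taken from the original post. A common cause of the reported slowdown is the gradient all-reduce crossing a slow inter-node link (e.g. Ethernet instead of InfiniBand), so it is worth checking NCCL's chosen transport first.

```shell
# Run this on EACH of the two nodes; only --node_rank differs (0 on the
# rendezvous/master node, 1 on the other). --master_addr must be the master
# node's address, reachable from both nodes. All values here are placeholders.
torchrun \
  --nnodes=2 \
  --nproc_per_node=8 \
  --node_rank=0 \
  --master_addr=node0.example.com \
  --master_port=29500 \
  train.py --model_name_or_path <path> --output_dir /tmp/out
```

Setting `NCCL_DEBUG=INFO` in the environment makes NCCL log which interconnect it picked, which usually explains whether communication is the bottleneck.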
Hi everyone, I tried to reproduce the finetuning of Alpaca, but I ran into the following error. Could you please help me?

```
Running command git clone --quiet https://github.com/huggingface/transformers /tmp/4267942.1.nvidiagpu.q/pip-req-build-317x2j5l
ERROR:...
```
Getting this error while loading llama-7b on a single A100 80GB. I tried reducing the batch size and also changing **--gradient_accumulation_steps**, but could not work around it. I was...
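For context, running out of memory here is expected when fully fine-tuning llama-7b in fp32 with Adam on a single 80GB card: weights, gradients, and the two Adam moment buffers alone need roughly 16 bytes per parameter, before activations. A back-of-the-envelope sketch (the 7e9 parameter count is an approximation):

```python
def full_finetune_memory_gb(n_params: float, bytes_per_param: int = 16) -> float:
    """Rough lower bound for full fine-tuning with Adam in fp32:
    4 B weights + 4 B gradients + 8 B optimizer state = 16 B/param.
    Activations and CUDA overhead come on top of this."""
    return n_params * bytes_per_param / 1024**3

print(round(full_finetune_memory_gb(7e9)))  # ~104 GB, already above 80 GB
```

This is why reducing the batch size alone does not help: the fixed per-parameter cost dominates, and techniques like mixed precision, sharded optimizers (FSDP/DeepSpeed), or parameter-efficient tuning are needed instead.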
We propose a new learning paradigm named RRHF (Rank Responses to Align Human Feedback) which does not need reinforcement learning and can perform on par with PPO to align human...
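The core of RRHF is a pairwise ranking objective: whenever a lower-reward response scores higher than a higher-reward one, the gap is penalized. A minimal pure-Python sketch of that hinge-style loss (variable names are mine, not from the paper's code; `scores` stand in for the model's length-normalized log-probabilities):

```python
def rrhf_rank_loss(scores, rewards):
    """Pairwise ranking loss: for every pair where response i has a lower
    reward than response j, add max(0, score_i - score_j), i.e. penalize
    the model only when it prefers the worse response."""
    loss = 0.0
    for s_i, r_i in zip(scores, rewards):
        for s_j, r_j in zip(scores, rewards):
            if r_i < r_j:
                loss += max(0.0, s_i - s_j)
    return loss

# The worse response (reward 0.0) scores higher than the better one, so the
# inverted pair contributes a penalty of 1.0:
print(rrhf_rank_loss([-2.0, -1.0], [1.0, 0.0]))  # 1.0
```

When the model's score ordering already matches the reward ordering, every hinge term is zero and the loss vanishes, so no reinforcement-learning machinery (rollouts, value functions, PPO clipping) is required.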
ERROR:

```
Using .cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Emitting ninja build file .cache/torch_extensions/py310_cu117/utils/build.ninja...
Building extension module utils...
Allowing ninja to set a default number of workers... (overridable by setting the...
```

I use 8 V100s to train the model, but the saved model is broken; the size of the model is: The command is as follows:

```
torchrun --nproc_per_node=1 --master_port=12345 train.py \
    --model_name_or_path...
```
I cannot start running the train.py script (on 2 × 4090 GPUs). I got this error:

```
File ".../alp/lib/python3.10/site-packages/transformers/hf_argparser.py", line 341, in parse_args_into_dataclasses
    raise ValueError(f"Some specified arguments are not used by the...
```
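HfArgumentParser raises that ValueError when the command line contains flags that none of the supplied dataclasses declare, so the usual fix is to remove or rename the unrecognized flag. A stdlib-only analog of the check (this mimics the behavior; it is not transformers' actual code, and `TrainingArgs` here is a toy dataclass):

```python
import argparse
from dataclasses import dataclass, fields

@dataclass
class TrainingArgs:
    output_dir: str = "/tmp"
    learning_rate: float = 2e-5

def parse_into_dataclass(cls, argv):
    """Build an argparse parser from the dataclass fields and, like
    HfArgumentParser.parse_args_into_dataclasses, raise if any argument
    on the command line was not consumed."""
    parser = argparse.ArgumentParser()
    for f in fields(cls):
        parser.add_argument(f"--{f.name}", type=f.type, default=f.default)
    namespace, remaining = parser.parse_known_args(argv)
    if remaining:
        raise ValueError(f"Some specified arguments are not used: {remaining}")
    return cls(**vars(namespace))

parse_into_dataclass(TrainingArgs, ["--output_dir", "/tmp/out"])   # parses fine
# parse_into_dataclass(TrainingArgs, ["--fsdp", "full_shard"])     # raises ValueError
```

In practice the error usually means the installed transformers version is older than the one the training command's flags were written for, so the corresponding dataclass field simply does not exist yet.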