LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
Hi, when I run "Step 2: Building the gradient datastore" I get FileNotFoundError: [Errno 2] No such file or directory: '../out/llama2-7b-p0.05-lora-seed3/checkpoint-1688/optimizer.bin'. I checked the folder "llama2-7b-p0.05-lora-seed3" generated in Step 1, and it only contains files...
When loading optimizer.pt, the keys are different and I get KeyError: 'base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight'; the keys in the optimizer.pt state are the integers 0~255.
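A note on the integer keys: in a PyTorch optimizer state dict, the entries under "state" are indexed by integer position rather than by parameter name, so keys 0~255 likely correspond to the trainable LoRA tensors in the order they were handed to the optimizer. Below is a minimal, hypothetical sketch of recovering that mapping; it assumes the optimizer was built over the trainable parameters in model.named_parameters() order (a single param group), which may not hold if the trainer split parameters into weight-decay groups.
```
import torch

# Hypothetical sketch: map integer optimizer-state keys back to parameter names.
# Assumes `model` is the same PEFT/LoRA-wrapped model used in warmup training and
# that the optimizer saw the trainable parameters in named_parameters() order.
opt_state = torch.load("optimizer.pt", map_location="cpu")

# Integer parameter ids, in param-group order.
param_ids = [pid for group in opt_state["param_groups"] for pid in group["params"]]

# Names of the trainable (LoRA) parameters, in the order they appear in the model.
trainable_names = [n for n, p in model.named_parameters() if p.requires_grad]

id_to_name = dict(zip(param_ids, trainable_names))
named_state = {id_to_name[k]: v for k, v in opt_state["state"].items()}
print(list(named_state)[:3])  # e.g. '...q_proj.lora_A.default.weight' if the assumption holds
```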
Following the script provided in the second step of "Selecting data for a task" in your readme, the command I need to run is shown below:...
How can we utilize multiple GPUs for the gradient feature collection step? The current implementation only works with a single GPU.
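Not an official answer, but one common workaround is to shard the training split by rank, launch one single-GPU collection process per shard, and merge the saved gradients afterwards. A minimal sketch of the index sharding only (the collection call itself would be whatever the existing single-GPU script does):
```
import os
import torch

# Hypothetical sketch: each process (launched e.g. via torchrun) handles a strided
# shard of the training set with the existing single-GPU gradient-collection code.
rank = int(os.environ.get("LOCAL_RANK", "0"))
world_size = int(os.environ.get("WORLD_SIZE", "1"))
torch.cuda.set_device(rank)

num_train = 10000  # placeholder for the size of the training split
shard_indices = list(range(rank, num_train, world_size))

# Run the unmodified single-GPU collection over `shard_indices` only, saving to a
# rank-specific file (e.g. grads_rank{rank}.pt), then concatenate the per-rank
# files in index order once all processes finish.
print(f"rank {rank}/{world_size} handles {len(shard_indices)} examples")
```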
Thanks for sharing your code. In https://github.com/princeton-nlp/LESS/blob/main/less/data_selection/write_selected_data.py#L76, a small mistake in this code version causes sorted.csv to be written incorrectly; to fix it, lines 76 and 77 should swap positions.
CUDA error: the provided PTX was compiled with an unsupported toolchain. What might be the cause of this error?
```
N_SUBTASKS = {"mmlu": 57, "bbh": 27, "tydiqa": 9}
influence_score = influence_score.reshape(
    influence_score.shape[0],
    N_SUBTASKS[target_task_name],
    -1).mean(-1).max(-1)[0]
```
What is the meaning of N_SUBTASKS, and why is this done? Can I change it to "...
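For context, N_SUBTASKS records how many evaluation subtasks each benchmark has (57 MMLU subjects, 27 BBH tasks, 9 TydiQA languages). The reshape groups the per-validation-example influence scores by subtask, the mean averages within each subtask, and the max keeps, for every training example, the score of the subtask it helps most. A toy illustration of the shape bookkeeping (not the repo's code, with made-up sizes):
```
import torch

# influence_score: [num_train, num_eval_examples], eval examples ordered subtask by subtask.
num_train, n_subtasks, per_subtask = 4, 3, 5
influence_score = torch.randn(num_train, n_subtasks * per_subtask)

per_subtask_mean = influence_score.reshape(num_train, n_subtasks, -1).mean(-1)  # [num_train, n_subtasks]
final_score = per_subtask_mean.max(-1)[0]                                       # [num_train]
print(final_score.shape)  # torch.Size([4])
```
Changing N_SUBTASKS would only make sense if the validation set's subtask grouping changes accordingly, since the reshape requires the number of evaluation examples to be divisible by the subtask count.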
Hello, I have some questions about the accuracy of llama2-7b. In Table 5, the accuracies of llama2-7b-base on MMLU/TYDIQA/BBH are 46.7/52.1/39.8, but when we use llama2-7b from "https://huggingface.co/meta-llama/Llama-2-7b-hf/tree/main" to test...
Hi, I'm trying to run experiments following the instructions given in the README. I find that in Step 1 (warmup training), 5% of the samples are randomly selected to train $M_S$. But...
Hi, I've been attempting to reproduce an experiment finetuning the Llama-2-7b-hf model on a random 5% of the training data, using open-instruct's [finetune_with_accelerate.sh](https://github.com/allenai/open-instruct/blob/main/scripts/finetune_with_accelerate.sh). I adhered to...