dspy
Even though I am handling this case by using the format attribute of dspy.OutputField(), I am still getting an error.
12 frames
TypeError: sequence item 0: expected str instance, list found
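For reference, this exact message is what Python's `str.join` raises when the first item of the sequence is a list rather than a string. A minimal sketch, assuming the field value ended up as a list of lists; the variable names are hypothetical and not taken from the DSPy code or from the program in this issue:

```python
# Hypothetical data: a field value that is a list of lists
# instead of a list of strings.
nested = [["token_a", "token_b"], ["token_c"]]

message = None
try:
    ", ".join(nested)
except TypeError as err:
    # str.join refuses non-string items and reports the first offender.
    message = str(err)

# One possible workaround: flatten to strings before joining.
flat = [str(item) for group in nested for item in group]
joined = ", ".join(flat)
```

Whether flattening is the right fix depends on where the nested list comes from, which the thread does not show.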
Hm, that's strange. Can you share a small example to replicate the error?
@okhat Never mind, I worked around that one, but now I am getting a new error, "index out of range in self", when I use more than 14 examples to train. When I decrease the number of examples, it all works fine.
I am using gpt2 from Hugging Face, and a small snippet of the code is:
# Teleprompter to optimize the weights of this program using a metric defined as validate_Pred_Imp_words
from dspy.teleprompt import BootstrapFewShot, Ensemble

# Define the validation logic for the hate speech classification task
def validate_hate_speech(example, pred, trace=None):
    for i in range(3):
        if i+1 <= len(example.extracted_tokens):
            if not find_similarity(example.extracted_tokens[i], pred.pred_extracted_tokens[i]):
                return False
        if i+1 <= len(example.label):
            if str(example.label[i]) != str(pred.pred_label[i]):
                return False
        if i+1 <= len(example.target):
            if not find_similarity(example.target[i], pred.pred_target[i]):
                return False
    return True

# Set up the teleprompter
teleprompter = BootstrapFewShot(metric=validate_hate_speech)

# Compile the program
compiled_program = teleprompter.compile(HateSpeechClassifier(), trainset=extract_words_dataset_train[:14])
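A side note on the metric above: it checks the length of the gold fields but indexes the predicted fields without a length check, so a short prediction would raise a plain `IndexError` inside the metric. That is a separate robustness issue, not necessarily the error reported here. A minimal sketch of a bounds-checked variant, where `find_similarity` is a hypothetical stand-in (the original helper is not shown in this thread) and `validate_hate_speech_safe` is an illustrative name:

```python
from types import SimpleNamespace


def find_similarity(a, b):
    # Hypothetical stand-in: the real helper is not shown in the thread.
    return str(a).strip().lower() == str(b).strip().lower()


def validate_hate_speech_safe(example, pred, trace=None):
    # (gold field, predicted field, comparison) for each checked attribute.
    fields = [
        ("extracted_tokens", "pred_extracted_tokens", find_similarity),
        ("label", "pred_label", lambda a, b: str(a) == str(b)),
        ("target", "pred_target", find_similarity),
    ]
    for gold_name, pred_name, same in fields:
        gold = list(getattr(example, gold_name))[:3]  # first 3 items, as in the original
        guess = list(getattr(pred, pred_name, None) or [])
        if len(guess) < len(gold):
            return False  # prediction too short: fail instead of raising
        if not all(same(g, p) for g, p in zip(gold, guess)):
            return False
    return True


# Illustrative objects standing in for a dspy Example and a prediction.
example = SimpleNamespace(extracted_tokens=["a", "b"], label=[1], target=["x"])
good = SimpleNamespace(pred_extracted_tokens=["A", "b"], pred_label=["1"], pred_target=["x"])
short = SimpleNamespace(pred_extracted_tokens=["a"], pred_label=["1"], pred_target=["x"])
```

Returning `False` for a short prediction treats it as a failed example rather than aborting the run; whether that is the desired behavior is a judgment call for the task.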
Hi @anushka192001 , could you share the full error trace for this program? It's unclear what "index out of range in self" refers to
@arnavsinghvi11 Can you please share your email address? I will send you the code that produces this error.
@arnavsinghvi11 The error:
0%| | 0/100 [00:00<?, ?it/s]Token indices sequence length is longer than the specified maximum sequence length for this model (1037 > 1024). Running this sequence through the model will result in indexing errors
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Setting pad_token_id to eos_token_id:50256 for open-end generation.
Failed to run or to evaluate example Example({'speech': #REDACTED (input_keys={'speech'}) with <function validate_hate_speech at 0x7eeb746a1090> due to index out of range in self.
Failed to run or to evaluate example Example({'speech': #REDACTED(input_keys={'speech'}) with <function validate_hate_speech at 0x7eeb746a1090> due to index out of range in self.
3%|▎ | 3/100 [00:07<04:04, 2.53s/it]Setting pad_token_id to eos_token_id:50256 for open-end generation.
Failed to run or to evaluate example Example({'speech': #REDACTED (input_keys={'speech'}) with <function validate_hate_speech at 0x7eeb746a1090> due to index out of range in self.
4%|▍ | 4/100 [00:16<07:29, 4.68s/it]Setting pad_token_id to eos_token_id:50256 for open-end generation.
4%|▍ | 4/100 [00:16<06:43, 4.21s/it]Failed to run or to evaluate example Example({'speech'#REDACTED(input_keys={'speech'}) with <function validate_hate_speech at 0x7eeb746a1090> due to index out of range in self.
IndexError Traceback (most recent call last)
25 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   2235         # remove once script supports set_grad_enabled
   2236         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2237     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   2238
   2239
IndexError: index out of range in self
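The first warning in the log (1037 > 1024) points at a likely cause: GPT-2 has 1024 position embeddings, so a prompt longer than that asks the embedding layer for a row that does not exist, and `torch.embedding` then raises `IndexError: index out of range in self`. More few-shot examples make the prompt longer, which would fit the observation that 14 training examples work and more do not. The sketch below is plain Python with a list standing in for the embedding table. `MAX_POSITIONS`, `lookup`, `truncate_left`, and `reserve` are hypothetical names, and this is not how DSPy or transformers handle the situation internally.

```python
MAX_POSITIONS = 1024  # GPT-2 context size, per the warning in the log

# Stand-in for a position-embedding table with one row per position.
position_table = [[0.0] for _ in range(MAX_POSITIONS)]


def lookup(position_ids):
    # Mimics an embedding lookup: fails when an id has no row in the table.
    return [position_table[i] for i in position_ids]


def truncate_left(token_ids, max_positions=MAX_POSITIONS, reserve=50):
    # Keep the most recent tokens, leaving `reserve` positions for generation.
    budget = max_positions - reserve
    if budget <= 0:
        raise ValueError("reserve must be smaller than max_positions")
    return token_ids[-budget:]


prompt_ids = list(range(1037))  # same length as in the warning

try:
    lookup(range(len(prompt_ids)))
    overflowed = False
except IndexError:
    overflowed = True  # positions 1024..1036 have no row

safe_ids = truncate_left(prompt_ids)
rows = lookup(range(len(safe_ids)))  # now fits within the table
```

Truncating from the left drops the start of the prompt, which is usually where the instructions live, so it is a blunt fix. Lowering the number of demonstrations that go into the prompt (for example via the `max_bootstrapped_demos` and `max_labeled_demos` arguments of `BootstrapFewShot`) or using a model with a longer context may be a better option.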
Hi @anushka192001,
Please refrain from pasting code that has harmful content going forward. I have corrected this on the various issues you raised.
The issue is still unclear, as the stack trace you've pasted does not highlight the impacted DSPy components. I see that 25 frames have not been shown. Can you please post only the ones related to DSPy?
Closing duplicate issues related to this.