Still getting an error even though I handle this case using the format attribute of dspy.OutputField()

Open anushka192001 opened this issue 4 months ago • 6 comments

12 frames

    in (x)
          3
          4 speech = dspy.InputField()
    ----> 5 extracted_tokens = dspy.OutputField(desc="extract words from text",format=lambda x: ' '.join(x) if isinstance(x, list) else str(x))

TypeError: sequence item 0: expected str instance, list found
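
For context, this `TypeError` is the one `str.join` raises when an item of the iterable is itself a list, so the `format` callable fails whenever the field value is a list of lists. Below is a minimal sketch of a formatter that flattens nested lists before joining. `flatten_to_text` is a hypothetical helper name, not part of DSPy, and the idea that a nested list reaches the formatter is an assumption based only on the error message.

```python
def flatten_to_text(value):
    """Join a possibly nested list of tokens into one space-separated string."""
    if isinstance(value, (list, tuple)):
        # Recurse so inner lists are flattened instead of being handed to str.join.
        return " ".join(flatten_to_text(item) for item in value)
    return str(value)


# Joining a list of lists directly reproduces the reported error:
try:
    " ".join([["some", "words"], ["here"]])
except TypeError as exc:
    print(exc)  # sequence item 0: expected str instance, list found

print(flatten_to_text([["some", "words"], ["here"]]))  # some words here
```

If the diagnosis holds, passing `format=flatten_to_text` in place of the lambda might avoid the error.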

Apr 20 '24 09:04 anushka192001

Hm, that’s strange. Can you share a small example to replicate the error?

Apr 20 '24 16:04 okhat

@okhat Never mind, I worked around that one, but now I am getting a new error, "index out of range in self", when I use more than 14 examples to train. When I decrease the number of examples, it all works fine.

I am using gpt2 from Hugging Face, and a small snippet of the code is:

    # Teleprompter to optimize the weights of this program using a metric defined validate_Pred_Imp_words
    from dspy.teleprompt import BootstrapFewShot, Ensemble

    # Define the validation logic for the hate speech classification task
    def validate_hate_speech(example, pred, trace=None):
        for i in range(3):
            if i+1 <= len(example.extracted_tokens):
                if not find_similarity(example.extracted_tokens[i], pred.pred_extracted_tokens[i]):
                    return False
            if i+1 <= len(example.label):
                if str(example.label[i]) != str(pred.pred_label[i]):
                    return False
            if i+1 <= len(example.target):
                if not find_similarity(example.target[i], pred.pred_target[i]):
                    return False
        return True

    # Set up the teleprompter
    teleprompter = BootstrapFewShot(metric=validate_hate_speech)

    # Compile the program
    compiled_program = teleprompter.compile(HateSpeechClassifier(), trainset=extract_words_dataset_train[:14])
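
One thing this metric does not guard against is a prediction with fewer items than the example: `pred.pred_extracted_tokens[i]` is indexed after checking only the example's length. A length-safe variant could look like the sketch below. `find_similarity` is not shown in the thread, so a case-insensitive equality check stands in for it here as an assumption, and `items_match` and `validate_hate_speech_safe` are illustrative names. A short prediction would surface as `list index out of range` rather than `index out of range in self`, so this is a hardening step, not a fix for the reported error.

```python
from types import SimpleNamespace


def find_similarity(a, b):
    # Stand-in (assumption): the thread does not show the real helper.
    return str(a).strip().lower() == str(b).strip().lower()


def items_match(expected, predicted, same, limit=3):
    """Compare up to `limit` expected items against the prediction by position."""
    for i, want in enumerate(expected[:limit]):
        # A missing predicted item counts as a mismatch instead of raising IndexError.
        if i >= len(predicted) or not same(want, predicted[i]):
            return False
    return True


def validate_hate_speech_safe(example, pred, trace=None):
    return (
        items_match(example.extracted_tokens, pred.pred_extracted_tokens, find_similarity)
        and items_match(example.label, pred.pred_label, lambda a, b: str(a) == str(b))
        and items_match(example.target, pred.pred_target, find_similarity)
    )


# Plain namespaces stand in for dspy.Example and the prediction object.
example = SimpleNamespace(extracted_tokens=["a", "b"], label=[1], target=["x"])
short_pred = SimpleNamespace(pred_extracted_tokens=["a"], pred_label=[1], pred_target=["x"])
print(validate_hate_speech_safe(example, short_pred))  # False
```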

Apr 21 '24 07:04 anushka192001

Hi @anushka192001, could you share the full error trace for this program? It's unclear what "index out of range in self" refers to.

Apr 27 '24 23:04 arnavsinghvi11

@arnavsinghvi11 Can you please share your email address? I will send you the code that produces this error.

Apr 28 '24 10:04 anushka192001

@arnavsinghvi11 The error:

    0%| | 0/100 [00:00<?, ?it/s]
    Token indices sequence length is longer than the specified maximum sequence length for this model (1037 > 1024). Running this sequence through the model will result in indexing errors
    Setting pad_token_id to eos_token_id:50256 for open-end generation.
    Setting pad_token_id to eos_token_id:50256 for open-end generation.
    Setting pad_token_id to eos_token_id:50256 for open-end generation.
    Failed to run or to evaluate example Example({'speech': #REDACTED (input_keys={'speech'}) with <function validate_hate_speech at 0x7eeb746a1090> due to index out of range in self.
    Failed to run or to evaluate example Example({'speech': #REDACTED(input_keys={'speech'}) with <function validate_hate_speech at 0x7eeb746a1090> due to index out of range in self.

    3%|▎ | 3/100 [00:07<04:04, 2.53s/it]
    Setting pad_token_id to eos_token_id:50256 for open-end generation.
    Failed to run or to evaluate example Example({'speech': #REDACTED (input_keys={'speech'}) with <function validate_hate_speech at 0x7eeb746a1090> due to index out of range in self.

    4%|▍ | 4/100 [00:16<07:29, 4.68s/it]
    Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
    4%|▍ | 4/100 [00:16<06:43, 4.21s/it]
    Failed to run or to evaluate example Example({'speech'#REDACTED(input_keys={'speech'}) with <function validate_hate_speech at 0x7eeb746a1090> due to index out of range in self.

    IndexError                                Traceback (most recent call last)
    in <cell line: 37>()
         35
         36 # Compile the program
    ---> 37 compiled_program = teleprompter.compile(HateSpeechClassifier(), trainset=extract_words_dataset_train[:100])

    25 frames

    /usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
       2235         # remove once script supports set_grad_enabled
       2236         no_grad_embedding_renorm(weight, input, max_norm, norm_type)
    -> 2237     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
       2238
       2239

IndexError: index out of range in self
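
The first warning in the log (`1037 > 1024`) suggests that the prompt exceeds GPT-2's 1024-position context window. An out-of-range index into an embedding table is what produces `index out of range in self` in `torch.embedding`, and an overlong prompt would fit the error appearing only once more examples (and so longer prompts) are involved. The thread does not confirm this diagnosis. Below is a minimal sketch of budgeting the prompt, operating on a plain list of token IDs. `fit_to_context`, `max_positions`, and `max_new_tokens` are illustrative names, not DSPy or Hugging Face APIs.

```python
def fit_to_context(token_ids, max_positions=1024, max_new_tokens=64):
    """Keep the most recent tokens so prompt plus generation fits the context window."""
    budget = max_positions - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than max_positions")
    # Drop the oldest tokens; the end of the prompt holds the current question.
    return token_ids[-budget:]


prompt_ids = list(range(1037))  # same length as the prompt in the warning
trimmed = fit_to_context(prompt_ids)
print(len(trimmed))  # 960
```

Lowering the number of few-shot demonstrations, for example through `BootstrapFewShot`'s `max_bootstrapped_demos` and `max_labeled_demos` settings, or switching to a model with a longer context window, would likely serve the same purpose.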

Apr 28 '24 10:04 anushka192001

Hi @anushka192001,

Please refrain from pasting code that contains harmful content going forward. I have edited it out of the various issues you raised. The issue is still unclear, as the stack trace you've pasted does not show the affected DSPy components. I see that 25 frames are hidden. Can you please post only the ones related to DSPy?

Closing duplicate issues related to this.

Apr 28 '24 18:04 arnavsinghvi11
