Kaijie Zhu
Hi, our current approach directs the model to output the desired labels ('positive' or 'negative') enclosed within a unique pattern (e.g., ''). This format allows...
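As a rough sketch of the parsing step this format enables: the actual wrapper pattern is not visible in the comment above (the tag was stripped when the page rendered it), so the `<ans>` tag and the `extract_label` helper below are hypothetical, not PromptBench's real implementation.

```python
import re

# Hypothetical wrapper tag -- the real pattern used by PromptBench is not
# shown in the comment above, so <ans>...</ans> is only an illustration.
ANSWER_PATTERN = re.compile(r"<ans>(positive|negative)</ans>", re.IGNORECASE)

def extract_label(model_output):
    """Return the sentiment label wrapped in the answer tag, or None."""
    match = ANSWER_PATTERN.search(model_output)
    return match.group(1).lower() if match else None

print(extract_label("The review is upbeat, so <ans>Positive</ans>."))  # positive
print(extract_label("No wrapped label here."))  # None
```

Because the label is anchored inside a unique tag, the regex ignores any free-form reasoning the model emits around it.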
Hi, I have tested the qqp dataset with the Flan-T5 model using the StressTest attack, and it works for me. Could you please test it again and paste the detailed error messages here?
I have the same issue. The safety settings do not work for me; instead, changing the temperature from 0 to 0.7 works. The generated contents may be blocked since I found...
Thank you very much for the contribution! We will look into this.
Hi, could you please indicate which model you are using for the attack? The difference may arise from the use of a different model.
Could you please check and compare the results [here](https://huggingface.co/spaces/March07/PromptBench)? On that website, the results for T5 on the SST-2 dataset are around 95%.
Hi, can you share the code you are running with me? I tested the following code, and it worked well.

```python
import promptbench as pb

dataset = pb.DatasetLoader.load_dataset("sst2")
print(dataset)
```

BTW,...
You can ask the generative models to output in certain fixed formats, e.g., wrapping the answer in a unique tag, and then use a regex to parse the answer.
Can you show your reproduced results? Also, have you checked the test set? I saw that you revised the code for loading the test set. Please also pay attention to the...
> @Ki-Zhang As of January 2024, the entire list of Harm Categories can be found [here](https://ai.google.dev/api/rest/v1beta/HarmCategory). The implementation for `gemini-pro` or `gemini-pro-vision` can be carried out as follows in Python:...
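The Python snippet in the quoted comment is cut off. As a rough, hypothetical illustration (not the original code), the v1beta REST request body for `gemini-pro` can carry per-category safety overrides as below; the category and threshold strings come from the HarmCategory reference linked above, while the `build_payload` helper name is our own:

```python
import json

# The four harm categories the Gemini API accepts safety settings for,
# per the HarmCategory reference linked in the comment above.
HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]

def build_payload(prompt, threshold="BLOCK_NONE"):
    """Build a generateContent request body that sets every safety filter
    to the given threshold (BLOCK_NONE relaxes blocking the most)."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "safetySettings": [
            {"category": category, "threshold": threshold}
            for category in HARM_CATEGORIES
        ],
    }

# POST this JSON to the v1beta endpoint, e.g.
#   https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent?key=API_KEY
print(json.dumps(build_payload("Classify: great movie!"), indent=2))
```

The same category/threshold pairs can equivalently be passed to the official Python SDK's `safety_settings` parameter; the REST shape is shown here only because it is SDK-version independent.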