Error when output contains emoji

Open Betswish opened this issue 1 month ago • 0 comments

🐛 Bug Report

When the output of the model (Llama2-7b-chat-hf) contains an emoji, PECORE raises the following error:

Traceback (most recent call last):
  File "/gpfs/home5/jqi2/research/ALCE/compare_pecora.py", line 212, in <module>
    main()
  File "/gpfs/home5/jqi2/research/ALCE/compare_pecora.py", line 153, in main
    gen = attribute_context_with_model(lm_rag_prompting_example, model_pecora)
  File "/home/jqi2/anaconda3/envs/ALCE/lib/python3.9/site-packages/inseq/commands/attribute_context/attribute_context.py", line 195, in attribute_context_with_model
    cci_attrib_out = model.attribute(
  File "/home/jqi2/anaconda3/envs/ALCE/lib/python3.9/site-packages/inseq/models/attribution_model.py", line 457, in attribute
    attribution_outputs = attribution_method.prepare_and_attribute(
  File "/home/jqi2/anaconda3/envs/ALCE/lib/python3.9/site-packages/inseq/attr/attribution_decorators.py", line 72, in batched_wrapper
    out = f(self, *args, **kwargs)
  File "/home/jqi2/anaconda3/envs/ALCE/lib/python3.9/site-packages/inseq/attr/feat/feature_attribution.py", line 247, in prepare_and_attribute
    attribution_output = self.attribute(
  File "/home/jqi2/anaconda3/envs/ALCE/lib/python3.9/site-packages/inseq/attr/feat/feature_attribution.py", line 411, in attribute
    attr_pos_start, attr_pos_end = check_attribute_positions(
  File "/home/jqi2/anaconda3/envs/ALCE/lib/python3.9/site-packages/inseq/attr/feat/attribution_utils.py", line 84, in check_attribute_positions
    raise ValueError("Start and end attribution positions cannot be the same.")
ValueError: Start and end attribution positions cannot be the same.

🔬 How To Reproduce

input_context_text: "Document [1](Title: Horse Rescue Warning Signs \u2013 Mackenzie Kincaid): to fight way too much and at a rescue, where there is no one to love them and nurse them back to health, it would be kind to let a sick horse go, rahter than trying to fight for it. Putting old horses through operations is never fair. Still, some horses want to live more than others. when do you know when it\u2019s time to go? I have had quite a few horses put down over the years- my own as well as my friends- somehow I always end up holding the horse in the end, and it is always\nDocument [2](Title: College grad's equine degree leads her back home to 'Tecumseh!'): "They know what they're supposed to do," said Wheaton. "And they're willing to do anything as long as the rider shows confidence." The relationship that develops between the equine and the human is the exact reason why Wheaton went into training rather than rehabilitation. She says that when working with animals the size of the horse, it's important to understand what they're physically and mentally capable of. The bond that develops between the horse and their trainer is beneficial to the animal further down the line when they have other riders. And all of the hard work to build trust\nDocument [3](Title: Melbourne Cup runner Cliffsofmoher dies, history of race horse deaths): sparked outrage. \u201c6 in 6 years. That\u2019s absolutely disgusting. Too many horses in one race. It\u2019s not safe for them when they get all bunched up. One step on another horses leg and they are put down,\u201d wrote one person on Twitter. 
The only type of horse sport I want to watch today: #NupToTheCup pic.twitter.com/LwnnKVIalp \u2014 Larissa Waters (@larissawaters) November 6, 2018 When \ud83d\udc0e regularly and predictably succumb to the mistreatment they endure during racing, why do some act surprised and say they \u2018broke down\u2019 rather than saying they were severely injured and in many cases had to be killed\nDocument [4](Title: The Decision - Baltimore Sun): they feared that Barbaro was about to be euthanized. Some shouted angrily that the horse should not be put down. "Take him home!" one wailed. In fact, the idea of putting down Barbaro at that point was never discussed by those on the track. The severity of the injury wasn't yet clear, although it looked bad. In any case, because valuable horses are always insured, they're routinely taken back to the barn after being injured so X-rays can be taken. "When I got the screen out, it was never with the thought that he would be put down there on\nDocument [5](Title: Horse Sense | The New Yorker): a horse was down on the track. Jones had just seen Eight Belles gallop past, looking fine, and he assumed that the injured horse was one of the late finishers. Then he saw his jockey, Saez, riding toward him on a lead pony. \u201cWhat\u2019s the matter?\u201d Jones asked. \u201cThey\u2019re putting her down,\u201d Saez told him. Jones\u2019s first reaction was anger. It is customary to consult with an injured horse\u2019s trainer before administering treatment, and if Eight Belles had already been euthanized Jones wanted to know why. He ran to the two equine ambulances at the back of the track. When\n\n"

input_current_text: Why are horses put down when they're injured rather than nursed back to health?

input_template: Instruction: Write an accurate, engaging, and concise answer for the given question using only the provided search results (some of which might be irrelevant) and cite them properly. Use an unbiased and journalistic tone. Always cite for any factual claim. When citing several search results, use [1][2][3]. Cite at least one document and at most three documents in each sentence. If multiple documents support the sentence, only cite a minimum sufficient subset of the documents.

Question: {current}

{context} Answer:

contextless_input_current_text: Instruction: Write an accurate, engaging, and concise answer for the given question using only the provided search results (some of which might be irrelevant) and cite them properly. Use an unbiased and journalistic tone. Always cite for any factual claim. When citing several search results, use [1][2][3]. Cite at least one document and at most three documents in each sentence. If multiple documents support the sentence, only cite a minimum sufficient subset of the documents.

Question: {current}

Answer:

output_current_text: Horses are typically put down when they are injured rather than nursed back to health because it is often more humane and less costly to end their suffering than to attempt to treat their injuries. According to [1], "They know what they're supposed to do," said Wheaton. "And they're willing to do anything as long as the rider shows confidence." The bond that develops between the equine and the human is the exact reason why Wheaton went into training rather than rehabilitation. However, as [3] notes, "When \ud83d\udc0e regularly and predictably succumb to the mistreatment they endure during racing, why do some act surprised and say they \u2018broke down\u2019 rather than saying they were severely injured and in many cases had to be killed." In cases where a horse is severely injured, euthanasia may be the most humane option to prevent further suffering. As [4] explains, "the severity of the injury wasn't yet clear, although it looked bad. In any case, because valuable horses are always insured, they're routinely taken back to the barn after being injured so X-rays can be taken."\n\nIn addition, as [2] notes, "The relationship that develops between the equine and the human is the exact reason why Wheaton went into training rather than rehabilitation."

decoder_input_output_separator: ' '
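
For reference, the `\ud83d\udc0e` escape in input_context_text and output_current_text is a UTF-16 surrogate pair. The following is a minimal sketch using only the Python standard library, independent of inseq, that shows what it decodes to. Whether this multi-byte character is what actually triggers the error is an assumption and is not confirmed here.

```python
import json
import unicodedata

# The escape sequence exactly as it appears in the fields above,
# wrapped in quotes so that it is a valid JSON string literal.
escaped = '"\\ud83d\\udc0e"'

# json.loads combines the surrogate pair into a single Unicode code point.
emoji = json.loads(escaped)

print(len(emoji))               # number of code points in the decoded string
print(hex(ord(emoji)))          # code point value
print(unicodedata.name(emoji))  # Unicode character name
print(emoji.encode("utf-8"))    # UTF-8 byte representation
```

The decoded character lies outside the Basic Multilingual Plane, so it is encoded as four bytes in UTF-8, which is the kind of character a tokenizer may split into several pieces.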

Code sample

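        import torch
        from inseq.commands.attribute_context.attribute_context import AttributeContextArgs, attribute_context_with_model

        def get_max_memory():
            """Build a per-GPU max_memory map from the free memory on the current GPU, keeping 6 GB of headroom."""
            # mem_get_info() returns (free, total) in bytes for the current device.
            free_in_GB = int(torch.cuda.mem_get_info()[0] / 1024**3)
            per_gpu_limit = f'{free_in_GB - 6}GB'
            n_gpus = torch.cuda.device_count()
            # The same limit is applied to every visible GPU.
            return {i: per_gpu_limit for i in range(n_gpus)}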

        lm_rag_prompting_example = AttributeContextArgs(
                model_name_or_path='meta-llama/Llama-2-7b-chat-hf',
                input_context_text=input_context_text,
                input_current_text=input_current_text,
                output_template="{current}",
                input_template=input_template,
                contextless_input_current_text=contextless_input_current_text,
                show_intermediate_outputs=False,
                attributed_fn="contrast_prob_diff",
                context_sensitivity_std_threshold=0,
                output_current_text=output_current_text,
                attribution_method="saliency",
                attribution_kwargs={"logprob": True},
                save_path='./out.json',
                tokenizer_kwargs={"use_fast": False},
                model_kwargs={
                    "device_map": 'auto',
                    "torch_dtype": torch.float16,
                    "max_memory": get_max_memory(),
                    "load_in_8bit": False,
                    },
                decoder_input_output_separator=decoder_input_output_separator,
                special_tokens_to_keep=[],
                show_viz=False,
                )

        # model_pecora is defined elsewhere in compare_pecora.py and is not shown in this report.
        gen = attribute_context_with_model(lm_rag_prompting_example, model_pecora)
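
To help narrow down which text field is involved, a small check like the one below can list the characters outside the Basic Multilingual Plane in a string. This is only a sketch: `find_non_bmp_chars` is a hypothetical helper and not part of inseq, and the excerpt is a shortened piece of output_current_text with the escape already decoded.

```python
def find_non_bmp_chars(text):
    """Return (index, character) pairs for code points above U+FFFF."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFFFF]


# Shortened excerpt of output_current_text; \U0001F40E is the decoded emoji.
excerpt = 'However, as [3] notes, "When \U0001F40E regularly and predictably succumb'

hits = find_non_bmp_chars(excerpt)
for index, char in hits:
    print(f"position {index}: U+{ord(char):04X}")
```

Running the same check on input_context_text and output_current_text, then re-running the attribution with those characters removed, would show whether the error depends on them.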

Betswish avatar May 13 '24 12:05 Betswish