SEV

Results: 11 issues of SEV

Editing tasks fall into three categories: binary classification, QA, and generation. For batched edits, why was QA chosen? And what is the result under fine-tuning (FT)? Thanks!

Hi, in your paper, must the model edited with the MEND method be fine-tuned first? How can MEND be applied to a model without fine-tuning? When I try to do...

Hi, in your paper, must the model edited with ROME or MEND be fine-tuned first? How can ROME be applied to a model without fine-tuning? For example, the...
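For reference, a minimal sketch of what "no fine-tuning" usually means here: the editor receives the stock pretrained checkpoint directly. The `apply_edit` call and the request fields below are placeholders, not the repo's actual API; check the ROME/MEND code for the exact entry point and signature.

```python
# Sketch (hypothetical helper name `apply_edit`): the model handed to an editing
# method can be the stock pretrained checkpoint; no task fine-tuning beforehand.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2-xl"  # assumption: any causal LM checkpoint the editor supports
model = AutoModelForCausalLM.from_pretrained(model_name)
tok = AutoTokenizer.from_pretrained(model_name)

# One edit request in the subject / prompt / new-target style used by the
# ROME/MEND reference code; field names may differ between repo versions.
request = {
    "prompt": "{} plays the sport of",
    "subject": "LeBron James",
    "target_new": {"str": "football"},
}

# `apply_edit` is a stand-in for the repo's entry point; consult the repo docs.
# edited_model, _ = apply_edit(model, tok, [request], hparams)
```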

https://github.com/kssteven418/LTP/blob/f1d5ec88aba913de5e2b4aa502af9cf0ab7bb13f/src/transformers/models/ltp/modeling_ltp.py#L247

```python
if self.training and not self.hard_masking:
    if pruner_outputs is not None:
        threshold, pruning_scores = pruner_outputs['threshold'], pruner_outputs['scores']
        self.mask = torch.sigmoid((pruning_scores - threshold) / self.temperature)
        layer_output = layer_output * self.mask.unsqueeze(-1)
```
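The soft mask above is a sigmoid of (score − threshold) / temperature. A toy sketch of the values it produces (the scores, threshold, and temperature below are made up for illustration):

```python
import torch

# Tokens whose pruning score is well below the threshold get a mask near 0,
# those well above it get a mask near 1; temperature controls the sharpness.
scores = torch.tensor([0.1, 0.5, 0.9])  # hypothetical per-token importance scores
threshold = 0.5
temperature = 0.05
mask = torch.sigmoid((scores - threshold) / temperature)
print(mask)  # approximately [0.0003, 0.5000, 0.9997]
```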

And what GPU memory would I need, at a minimum, to train two 7B models?
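For a rough sense of scale, here is a back-of-envelope estimate under common assumptions (bf16 weights and gradients, Adam with fp32 master weights and moments, activations ignored); the numbers are illustrative, not a hardware recommendation:

```python
# Back-of-envelope memory estimate for full fine-tuning of a 7B-parameter model.
params = 7e9

weights_gb   = params * 2 / 1e9            # bf16 weights              ~14 GB
grads_gb     = params * 2 / 1e9            # bf16 gradients            ~14 GB
optimizer_gb = params * (4 + 4 + 4) / 1e9  # fp32 master + 2 Adam moments ~84 GB

per_model_gb = weights_gb + grads_gb + optimizer_gb  # ~112 GB before activations
print(f"one 7B model: ~{per_model_gb:.0f} GB, two models: ~{2 * per_model_gb:.0f} GB")
# LoRA-style parameter-efficient tuning or ZeRO/FSDP sharding reduces this substantially.
```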

Hi, I found this line in rlattention.py: `from thumt.layers.gumbel import gumbel_softmax`. But in the layers folder there is no gumbel module, is there?
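If the gumbel module really is missing, a standard straight-through Gumbel-Softmax (not necessarily the authors' version) could serve as a drop-in; PyTorch also ships `torch.nn.functional.gumbel_softmax` with the same semantics:

```python
import torch
import torch.nn.functional as F

def gumbel_softmax(logits, tau=1.0, hard=False, dim=-1):
    """Standard (straight-through) Gumbel-Softmax; a stand-in for the missing
    thumt.layers.gumbel helper, not necessarily the original implementation."""
    gumbels = -torch.empty_like(logits).exponential_().log()  # samples ~ Gumbel(0, 1)
    y_soft = F.softmax((logits + gumbels) / tau, dim=dim)
    if hard:
        index = y_soft.argmax(dim, keepdim=True)
        y_hard = torch.zeros_like(logits).scatter_(dim, index, 1.0)
        return y_hard - y_soft.detach() + y_soft  # straight-through estimator
    return y_soft
```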

The Vicuna model generates some unrelated output, so how should I control the max_length in `model.generate(inputs.input_ids.cuda(), max_length=??)`?
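One common way to keep generations short is to bound the number of newly generated tokens rather than the total sequence length. A minimal sketch, assuming a Hugging Face causal LM and tokenizer are already loaded as `model` and `tokenizer` (the prompt is made up):

```python
# Cap generated tokens with max_new_tokens and stop at the EOS token.
prompt = "Tell me about Vicuna."  # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=256,                  # limits only the newly generated tokens
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the continuation, dropping the echoed prompt.
text = tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(text)
```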

Can simple-knn and diff-gaussian-rasterization be installed with CUDA 12?

```
./RWKU/KnowledgeCircuits-main/KnowledgeCircuits-main/transformer_lens/components.py:625, in AbstractAttention.forward(self, query_input, key_input, value_input, past_kv_cache_entry, additive_attention_mask, attention_mask)
    616     result = self.hook_result(
    617         bnb.matmul_4bit(
    618             z.reshape(z.shape[0], z.shape[1], self.cfg.d_model),
        (...)
    622         )
    623     )
    624 else:
--> 625     result...
```
