Zhengfu He

Mechanistic Interpretability Research @OpenMOSS Team @FudanNLP.

Results: 39 comments of Zhengfu He

How to evaluate BLEU score on LM1B?

@yujianll Hi, 1. Yes, we sum the NLL over all tokens in the sequence to get the NLL of the sequence. 2. The validation ELBO is around 110. And the average number...
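
As an illustration of the first point, here is a minimal sketch of a sequence-level NLL computed as the sum of per-token NLLs. The function name `sequence_nll` and the toy probabilities are assumptions made for this example; they are not taken from the DiffusionBERT code.

```python
import math


def sequence_nll(token_log_probs):
    """Sum per-token negative log-likelihoods into one sequence-level NLL.

    `token_log_probs` holds log p(token) for each token in the sequence
    (hypothetical input format, chosen for illustration only).
    """
    return -sum(token_log_probs)


# Toy sequence of three tokens with probabilities 0.5, 0.25 and 0.5.
log_probs = [math.log(0.5), math.log(0.25), math.log(0.5)]
nll = sequence_nll(log_probs)  # equals log(16), since 0.5 * 0.25 * 0.5 = 1/16
```

Because the value is a sum and not a mean, longer sequences naturally have a larger NLL.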

How to evaluate BLEU score on LM1B?

@yujianll Hi, we trained DiffusionBERT with 512 steps and used DDIM sampling to uniformly sample 128 steps on the test set, both for NLL calculation and generation. Hope this helps!
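
A rough sketch of what uniformly choosing 128 of 512 steps could look like is given below. The function `uniform_step_subset` and its indexing convention (0-based timesteps, starting from the last one) are assumptions for illustration, and the actual DiffusionBERT schedule may differ.

```python
def uniform_step_subset(num_train_steps=512, num_sample_steps=128):
    """Pick evenly spaced timesteps, ordered from the noisiest to the cleanest.

    Hypothetical helper: assumes 0-based timesteps and that the number of
    training steps is a multiple of the number of sampling steps.
    """
    if num_train_steps % num_sample_steps != 0:
        raise ValueError("num_train_steps must be a multiple of num_sample_steps")
    stride = num_train_steps // num_sample_steps
    return list(range(num_train_steps - 1, -1, -stride))


steps = uniform_step_subset()  # 128 timesteps with a stride of 4
```

With the defaults, the sampler would visit every fourth timestep, so sampling takes a quarter of the 512 training steps.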

GPT mentioned in Figure 3

Hi, thank you for your question! We include both models in Figure 3. The red curve, which is rather close to our DiffusionBERT, stands for an AR model trained from...

Pretrained models and inference scripts

Hello, we need some time to reorganize things. We will share our cloud drive link within a week.

About task_id

Hi, our model implements an `inform_model` function to get task_id from the batch in [this part of our code](https://github.com/Hzfinfdu/MPMP/blob/master/Deep/model.py#L35C14-L35C26). The current task_id is included in the batch [here](https://github.com/Hzfinfdu/MPMP/blob/master/Deep/trainer.py#L111). In the...

KeyError: hook_point_in

Hi! Our codebase is constantly being updated, and we have realized that we are not able to maintain a stable version for external use XD. Apologies for this! However, you can...

[Proposal] Add Automatic (Unit) Testing and CI Workflows

You did not forget where you started. Great engineers do great jobs.

Error loading SAE with SAElens library

Hi, we have received several such issue reports recently. We will look into this as soon as possible, within 24 hours.

Error loading SAE with SAElens library

I ran your script in two different environments I have at hand and failed to reproduce this error. Could you please provide more details? Running `pip list` gives...
