kongds

Results 44 comments of kongds

Another problem is that the magnitude of `delta_grad * self.step_size` is too small to influence the `noise` during the noise update. For example, `delta_grad * self.step_size` is around 1e-13, but...
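A quick way to see why an update of that size vanishes: the sketch below emulates float32 addition with `ctypes` (the 1e-13 scale is taken from the comment above; the noise magnitude of 1e-2 is an assumption, but any value many orders of magnitude larger behaves the same).

```python
import ctypes

def f32(x):
    # round a Python float to the nearest representable float32 value
    return ctypes.c_float(x).value

noise = f32(1e-2)    # assumed noise magnitude (hypothetical)
update = f32(1e-13)  # observed scale of delta_grad * self.step_size

# float32 carries roughly 7 decimal digits of precision, so an addend
# 11 orders of magnitude smaller than the value is rounded away entirely.
print(f32(noise + update) == noise)  # True: the update never changes the noise
```

If the tensors in question are float32, updates at this scale are a no-op regardless of how many steps are taken; either the step size or the gradient scaling would need to change.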

> One annoyance I have is the following scenario:
> 1. Enter insert mode, type some text...

@alexmurray I found that the macro **with-minibuffer-input** doesn't work because of a bug in **minibuffer-input-provider**. Maybe it can be changed like this:

```emacs-lisp
(defun minibuffer-input-provider (inputs)
  (fset 'hook (lambda (hook_inputs)
    (eval `(remove-hook 'post-command-hook...
```

Thank you. For observation 1, we find that representing sentences by the [CLS] token or by averaging token embeddings is not effective. By reformulating the sentence embedding task as a masked language task, we can...
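For context, the two baselines mentioned can be sketched in pure Python as below. Here `hidden` and `mask` are hypothetical stand-ins for an encoder's per-token outputs and attention mask; the masked-language reformulation would instead read the vector at the [MASK] position rather than pooling.

```python
def cls_pooling(hidden):
    # take the first ([CLS]) token's vector as the sentence embedding
    return hidden[0]

def mean_pooling(hidden, mask):
    # average token vectors, skipping padding positions (mask[i] in {0, 1})
    dim = len(hidden[0])
    total = [0.0] * dim
    count = 0
    for vec, m in zip(hidden, mask):
        if m:
            count += 1
            for j in range(dim):
                total[j] += vec[j]
    return [t / count for t in total]

# toy example: 3 tokens of dimension 2, last token is padding
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mask = [1, 1, 0]
print(cls_pooling(hidden))        # [1.0, 2.0]
print(mean_pooling(hidden, mask)) # [2.0, 3.0]
```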

For the large model, I only trained unsupervised `bert-large-uncased`, but I can't find its checkpoint. The result is as follows: | STS12 | STS13 | STS14 | STS15 |...

I used 4 × 16GB V100s and torch==1.6.1+cu101 with apex. By the way, the `bert-large` result is from the old codebase, which may be slightly different. But I don't...

I found the `trainer_state.json` of `bert-large`; this might help.

```json
{
  "best_metric": 0.8650828143261685,
  "best_model_checkpoint": "result/unsup-bert-large",
  "epoch": 1.0,
  "global_step": 3907,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    { "epoch":...
```

Thanks for your answer. Another concern is that the F1 (87.7) does not seem to match the accuracy (91.6) in CAP-m, which means the FN (false negative) and TN (true negative) counts are hugely unbalanced compared to...
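The gap can be made concrete with a hypothetical confusion matrix. The counts below are invented; they are just one matrix consistent with roughly 91.6 accuracy and 87.7 F1, and they show how a TN count far larger than FN pulls accuracy above F1.

```python
def metrics(tp, fp, fn, tn):
    # standard binary-classification definitions
    acc = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return acc, f1

# Invented counts: TN=616 dwarfs FN=44, so accuracy is lifted by the
# easy negative class while F1 only credits the positive class.
acc, f1 = metrics(tp=300, fp=40, fn=44, tn=616)
print(round(acc, 3), round(f1, 3))  # 0.916 0.877
```

Since F1 ignores true negatives entirely, any dataset where negatives dominate can show exactly this accuracy/F1 spread.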

Hello, I ran CAP-m 0.10 on QQP based on `run_glue_topk_kd.sh`, but got the following results (90.5/87.2):

```
07/14/2022 23:41:19 - INFO - __main__ - ***** Eval results *****
07/14/2022...
```

I used the checkpoint provided by [DynaBERT](https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/DynaBERT). The performance is 90.9 accuracy on QQP.