
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

Results: 62 issues

# Overview

I am attempting to train the small version of ELECTRA on a custom vocabulary. Looking at the code, I see that `max_predictions_per_seq` is set by a heuristic formula:...
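For context, a minimal sketch of the heuristic being asked about, assuming it is the one in `configure_pretraining.py` (the default values below are assumptions based on the small model's published settings):

```python
# Sketch of the max_predictions_per_seq heuristic (paraphrased, not copied):
# the number of masked positions scales with the mask rate and sequence length.
mask_prob = 0.15        # assumed default fraction of tokens to mask
max_seq_length = 128    # assumed ELECTRA-small default

# The +0.005 nudges the product up slightly before truncating to an int.
max_predictions_per_seq = int((mask_prob + 0.005) * max_seq_length)
print(max_predictions_per_seq)  # 19 for the defaults above
```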

Hello, could someone please explain what each evaluation metric indicates?
```
disc_accuracy = 0.86676794
disc_auc = 0.6815034
disc_loss = 0.35936752
disc_precision = 0.7586109
disc_recall = 0.04827826
global_step = 5000
...
```
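For orientation, a minimal sketch (the data and variable names are illustrative, not from the repo) of what these discriminator metrics measure: ELECTRA's discriminator makes a per-token binary prediction of whether each token was replaced by the generator, and the `disc_*` values are standard binary-classification scores over those predictions. The low `disc_recall` above likely reflects that most tokens are originals, so the discriminator flags replacements conservatively.

```python
# Illustrative sketch of the discriminator metrics: per-token binary
# classification of "replaced" (1) vs. "original" (0).
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)

labels = np.array([0, 0, 1, 0, 1, 0])               # 1 = token was replaced
probs  = np.array([0.1, 0.2, 0.7, 0.4, 0.3, 0.05])  # discriminator P(replaced)
preds  = (probs > 0.5).astype(int)                  # thresholded predictions

print("disc_accuracy ", accuracy_score(labels, preds))   # tokens classified correctly
print("disc_precision", precision_score(labels, preds))  # of flagged tokens, how many were replaced
print("disc_recall   ", recall_score(labels, preds))     # of replaced tokens, how many were flagged
print("disc_auc      ", roc_auc_score(labels, probs))    # threshold-free ranking quality
```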

When I run `python3 run_pretraining.py --data-dir $DATA_DIR --model-name electra_small_owt`, I get the following error:
```
ERROR:tensorflow:Error recorded from training_loop: 2 root error(s) found.
(0) Data loss: truncated record at 10035180
[[node...
```
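A `DataLossError: truncated record` usually means one of the pretraining TFRecord files was cut short (for example, by an interrupted dataset build). A minimal sketch for locating the corrupt file, assuming the records live under `data/pretrain_tfrecords` (the path is an assumption based on the repo's layout):

```python
# Sketch: iterate every TFRecord file and report which one raises DataLossError.
# Adjust the glob pattern to wherever your pretraining records actually live.
import glob
import tensorflow.compat.v1 as tf

for path in sorted(glob.glob("data/pretrain_tfrecords/*")):
    try:
        count = 0
        for _ in tf.io.tf_record_iterator(path):  # deprecated but available in TF1
            count += 1
        print("OK ", path, count, "records")
    except tf.errors.DataLossError as e:
        print("BAD", path, "-", e)  # rebuild or delete this file
```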

Running
```
python3 build_openwebtext_pretraining_dataset.py --data-dir data --num-processes 8
```
gives the error:
```
Traceback (most recent call last):
  File "build_openwebtext_pretraining_dataset.py", line 103, in <module>
    main()
  File "build_openwebtext_pretraining_dataset.py", line 89, in main
...
```

Please add a package configuration so that the ELECTRA repository can be easily installed and used. The purpose is to remove the need to clone the repository...
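A minimal sketch of what such packaging could look like (the package name, version, and dependency pins are assumptions; the repo does not currently ship a `setup.py`):

```python
# setup.py -- hypothetical packaging sketch, not an existing file in the repo.
from setuptools import setup, find_packages

setup(
    name="electra",                # assumed package name
    version="0.1.0",               # assumed version
    packages=find_packages(),
    install_requires=[
        "tensorflow>=1.15,<2.0",   # the repo targets TF1
        "numpy",
        "scikit-learn",            # assumed, for evaluation metrics
    ],
)
```

With this in place, `pip install -e .` (or installing from a Git URL) would make the code importable without manually cloning it into every project.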

- add a variable `init_checkpoint` to `configure_pretraining.py`
- add code for continuing pre-training from an ELECTRA checkpoint to `run_pretraining.py`
- update README (instructions for continuing pre-training from an ELECTRA checkpoint...
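A minimal sketch of what the checkpoint-loading step in `run_pretraining.py` could look like (the function and config names here are assumptions, not the PR's actual diff):

```python
# Hypothetical sketch: warm-start pre-training from an existing ELECTRA checkpoint.
import tensorflow.compat.v1 as tf

def maybe_init_from_checkpoint(init_checkpoint):
    """Initialize trainable variables from `init_checkpoint`, if one is set."""
    if not init_checkpoint:
        return
    # Map each variable to its own name; every mapped name must exist in the
    # checkpoint, so the model configuration has to match the saved one.
    assignment_map = {
        var.name.split(":")[0]: var for var in tf.trainable_variables()
    }
    tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
```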

I noticed there are only GLUE test set results for ELECTRA-small and ELECTRA-small++ in Table 8, and GLUE dev set overall results for BERT-small and ELECTRA-small in Table 1. Could you...

Hello, I was wondering whether it is possible to add some loss metrics to the training loop. The only thing I see while training the ELECTRA model is `1275000/3000000 = 42.5%,...
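One way to surface losses during training is a `LoggingTensorHook` attached to the estimator; the sketch below is illustrative, and whether it slots cleanly into the repo's `run_pretraining.py` (and the tensor names used) is an assumption:

```python
# Hypothetical sketch: log the total loss every N steps of estimator training.
import tensorflow.compat.v1 as tf

def loss_logging_hook(total_loss, every_n_steps=100):
    """Return a hook that prints `total_loss` every `every_n_steps` steps."""
    return tf.train.LoggingTensorHook(
        {"total_loss": total_loss}, every_n_iter=every_n_steps)

# Assumed integration point, inside a model_fn:
#   hooks = [loss_logging_hook(total_loss)]
#   return tf.estimator.EstimatorSpec(mode, loss=total_loss, training_hooks=hooks)
```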

The function signature for `tf.nn.dropout` in TF1 is:
```python
tf.nn.dropout(
    x,
    keep_prob=None,
    noise_shape=None,
    seed=None,
    name=None,
    rate=None
)
```
while TF2 has:
```python
tf.nn.dropout(
    x,
    rate,
    noise_shape=None,
    seed=None,
    name=None
)
```
...
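If the goal is code that runs under both signatures, a small shim along these lines (an illustrative workaround, not something the repo provides) pins everything to the TF2-style `rate` convention:

```python
import tensorflow as tf

def dropout_compat(x, rate, **kwargs):
    """Apply dropout with the `rate` argument on either TF version."""
    try:
        # TF2, and TF1 >= 1.13 where `rate` was added alongside `keep_prob`.
        return tf.nn.dropout(x, rate=rate, **kwargs)
    except TypeError:
        # Older TF1: only `keep_prob` (probability of *keeping* a unit) exists.
        return tf.nn.dropout(x, keep_prob=1.0 - rate, **kwargs)
```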

Avoid masking `[PAD]` during dynamic masking (Issue #59)
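A minimal sketch of the idea (illustrative, not the PR's actual diff): when sampling positions to mask dynamically, draw only from a candidate set that excludes padding positions, assuming an `input_ids` array and a known `pad_id`:

```python
# Illustrative sketch of excluding [PAD] positions from dynamic-masking candidates.
import numpy as np

def sample_mask_positions(input_ids, pad_id, num_to_mask, rng=np.random):
    """Sample positions to mask, never choosing padding positions."""
    candidates = np.where(input_ids != pad_id)[0]  # indices of non-pad tokens
    num_to_mask = min(num_to_mask, len(candidates))
    return rng.choice(candidates, size=num_to_mask, replace=False)

# Example: a sequence padded to length 8 with pad_id=0.
ids = np.array([101, 2054, 2003, 102, 0, 0, 0, 0])
print(sample_mask_positions(ids, pad_id=0, num_to_mask=2))  # only indices 0..3
```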