Sifal

Results 3 issues of Sifal

Hi! I'm currently using the map function to tokenize a somewhat large dataset, so I need to use the cache to save ~25 mins. Changing num_proc induces the recomputation of...

Hi, first off, thanks for this great contribution! There seems to be an issue with the handling of the encoder_outputs at the pooler level when passing output_all_encoded_layers = True. https://github.com/mosaicml/examples/blob/daddaeff7a535273ff984b0134da4839c70a45b3/examples/benchmarks/bert/src/bert_layers.py#L676-L689...