milesial
Spice up your profiles with optional domains, categories and colors

```
from torch.cuda.nvtx import range_push, range_pop, range, range_start, range_end
import time

range_push('default domain')
time.sleep(.1)
range_pop()

range_push('custom domain', domain="mydomain")
time.sleep(.1)...
```
Faster gradient clipping using the foreach functions

```
[------------------------ (tensors, scalar) -------------------------]
                          |  without foreach  |  with foreach  |  apex
1 threads: ----------------------------------------------------------------------
      10 tensors of size 4  |  120.5...
```
## What does this PR do?

When training with native AMP and an LR scheduler, we get this warning, which indicates that an LR step has been taken when an...
Hi, using version 0.10.3 and the llama3 tokenizer with vLLM, I can't seem to get constrained generation to produce emojis.

```
curl --request POST \
  --url http://localhost:8000/v1/chat/completions \
  --header 'Content-Type: application/json' \...
```