Stive

Results: 6 comments by Stive

Confirming the low GPU utilization problem: it seems that some computation running on a single CPU core is the bottleneck.

I tried the following options one by one:

1) Running without accelerator and with accelerator
2) Increasing num_processes from 1 to 2
3) Decreasing max_len from 600 to...

I did a little research and ran the profiler. Pay attention to the % Time of MAIN LOOP:

```text
Line #  Hits  Time  Per Hit  % Time  Line Contents
==============================================================...
```
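For anyone who wants to reproduce this kind of measurement without extra dependencies, here is a minimal sketch using the standard-library `cProfile` (function-level, not the line-level profiler whose output is quoted above). `slow_step` and `main_loop` are hypothetical stand-ins, not code from the project; in a real run the inference loop would take their place.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Hypothetical CPU-heavy stand-in for the work done inside the main loop.
    return sum(i * i for i in range(n))


def main_loop(steps=20, n=20_000):
    # Hypothetical main loop; replace with the real inference loop.
    total = 0
    for _ in range(steps):
        total += slow_step(n)
    return total


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_report(main_loop)
print(report)
```

Sorting by cumulative time puts the loop and its most expensive callee at the top of the report, which is usually enough to see whether the time goes to CPU-side Python code rather than to GPU kernels.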

It looks like folding the weights from weight norm into plain Conv1d weights might help, but it will take time. I will check.
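To illustrate what folding means here, a minimal pure-Python sketch, assuming the usual weight norm parameterization `w = g * v / ||v||` with one norm per output channel. The helper names (`fold_weight_norm`, `conv1d`, `weight_norm_conv1d`) are made up for this example and are not the project's code.

```python
import math


def fold_weight_norm(g, v):
    """Fold weight norm parameters into plain weights: w = g * v / ||v||.

    g: one gain per output channel.
    v: nested lists shaped [out_channels][in_channels][kernel].
    """
    folded = []
    for gain, channel in zip(g, v):
        norm = math.sqrt(sum(x * x for row in channel for x in row))
        folded.append([[gain * x / norm for x in row] for row in channel])
    return folded


def conv1d(x, w):
    """Plain 1-D convolution (cross-correlation, stride 1, no padding, no bias).

    x: [in_channels][length], w: [out_channels][in_channels][kernel].
    """
    kernel = len(w[0][0])
    length = len(x[0]) - kernel + 1
    return [
        [
            sum(w_o[c][k] * x[c][t + k] for c in range(len(x)) for k in range(kernel))
            for t in range(length)
        ]
        for w_o in w
    ]


def weight_norm_conv1d(x, g, v):
    """Unfolded variant: recomputes w from (g, v) on every call."""
    return conv1d(x, fold_weight_norm(g, v))


# Fold once, then reuse the plain weights for every forward pass.
g = [10.0]
v = [[[3.0, 4.0]]]
x = [[1.0, 2.0, 3.0]]
w = fold_weight_norm(g, v)
folded_out = conv1d(x, w)
unfolded_out = weight_norm_conv1d(x, g, v)
```

Both variants produce the same output; the folded one skips the per-call normalization. In PyTorch, `torch.nn.utils.remove_weight_norm` does this kind of fold for modules wrapped with `torch.nn.utils.weight_norm`. Whether that is the right call for this model is an assumption that still needs checking.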

Pay attention to the sound of the very first letter. [playground.zip](https://github.com/user-attachments/files/17643867/playground.zip)

https://github.com/user-attachments/assets/49b7bf64-28e0-4cee-82c1-373188138647