Teven

Results: 11 comments by Teven

Update: this version is ~4 times slower with the change from column format to row format, compared to a hacky version that just gets the column for a certain feature....

Hey Maruf, sorry, not yet. I'm a bit swamped at the moment, and the priority has switched to doing additional cleaning of OSCAR-ml ourselves before launching anything on it. Maybe @ibeltagy can review...

Hey! The number without embeddings is actually just that: the number of non-embedding parameters, not the number of unique parameters. This is the relevant number to estimate the loss...
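To make the distinction concrete, here is a rough sketch of how a non-embedding count could be computed from a table of parameter shapes. The parameter names, the toy shapes, and the idea of spotting embeddings by a name substring are all assumptions for illustration, not how any particular codebase does it.

```python
from math import prod


def count_parameters(shapes, embedding_markers=("embed",)):
    """Return (total, non_embedding) parameter counts.

    `shapes` maps a parameter name to its shape tuple. A parameter is
    treated as an embedding if its name contains any marker substring
    (a hypothetical naming convention, adjust to the model at hand).
    """
    total = 0
    non_embedding = 0
    for name, shape in shapes.items():
        n = prod(shape)
        total += n
        if not any(marker in name for marker in embedding_markers):
            non_embedding += n
    return total, non_embedding


# Toy, made-up model: one embedding table and two weight matrices.
toy_shapes = {
    "embed_tokens.weight": (1000, 64),
    "layers.0.attn.weight": (64, 64),
    "layers.0.mlp.weight": (64, 256),
}

total, non_embedding = count_parameters(toy_shapes)
print(f"total={total} non_embedding={non_embedding}")
```

Note that this only filters out embeddings; it does not deduplicate tied weights, which is what a "unique parameters" count would have to do.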

Ah yes, saw it on the other issue then forgot about it - I can take a look at the end of this week.

We did use mc4 for early multilingual experiments before switching to OSCAR - let's keep the code for future reference. Thanks for catching this!

Is there perhaps some way to compile without `doxygen`?

`swig -version` returns 4.0.2, but maybe there's a conflicting installation issue. How can one remove the `-doxygen` flag? Is it something to edit in the code, or a flag...

Passing `broadcast_buffers=False` to `DistributedDataParallel` fixed this for me. I've opened a PR at #24326 to surface that argument to the Trainer user.

Hey @tianyil1, this looks like a different issue to me, and I'm not seeing it in my case. If you send your file here, it could be easier to run it...