Oleh Shliazhko

Results 7 issues of Oleh Shliazhko

This bug prevents the Megatron-LM 10B offload training example from running.

This works:
```
>>> AddressExtractor()('проспекте ленина 39').matches
[Match([MorphTagToken('проспекте', [0, 9), 'RU', 'O', [Form('проспект', Grams(NOUN,inan,loct,masc,sing))]), MorphTagToken('ленина', [10, 16), 'RU', 'I', [Form('ленин', Grams(NOUN,Sgtm,Surn,anim,gent,masc,sing)), Form('ленин', Grams(NOUN,Sgtm,Surn,accs,anim,masc,sing)), Form('ленин', Grams(NOUN,Sgtm,Surn,anim,femn,nomn,sing)), Form('ленин', Grams(ADJF,Infr,Poss,gent,masc,sing)), Form('ленин', Grams(ADJF,Infr,Poss,accs,anim,masc,sing)), Form('ленин', Grams(ADJF,Poss,femn,nomn,sing)),...
```

1. Set model.dropoute from the value given in the CLI arguments. 2. Stop overwriting lr with its initial value on resume, because this leads to model divergence when training has been...
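
The second point can be illustrated with a minimal sketch. It does not reproduce the repository's code: the `resume` function, the checkpoint layout, and the `overwrite_lr` flag are all hypothetical names chosen for illustration. It only shows why resetting the learning rate on resume discards whatever decay the interrupted run had already applied.

```python
def resume(checkpoint, cli_lr, overwrite_lr=False):
    """Return the optimizer state to continue training from `checkpoint`.

    Hypothetical helper: `checkpoint` is assumed to be a dict holding an
    "optimizer" dict with an "lr" entry.
    """
    state = dict(checkpoint["optimizer"])  # copy so the checkpoint is untouched
    if overwrite_lr:
        # Old behavior (assumed): reset lr to the initial CLI value,
        # throwing away any decay applied before the interruption.
        state["lr"] = cli_lr
    return state


# A run that started at lr=1.0 and had decayed to 0.25 before being saved.
checkpoint = {"epoch": 12, "optimizer": {"lr": 0.25}}

resumed = resume(checkpoint, cli_lr=1.0)
reset = resume(checkpoint, cli_lr=1.0, overwrite_lr=True)
```

With the flag off, the resumed run continues at the decayed rate stored in the checkpoint. With it on, training jumps back to the initial rate, which is the situation the change is described as removing.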

cla:signed

It is based on the existing [FLAN eval script](https://github.com/hendrycks/test/blob/master/evaluate_flan.py). Minor changes: - load models as Float16; - put the samples on the same device as the model;...

#### What does this implement? This PR introduces the ability to cache the environment created for each SWE-bench task as a Docker image. It saves the filesystem and environment variables...

✨ enhancement

1. Separation of the miniwob task and underlying browser backend 2. MCP-Playwright as browser ## Description by Korbit AI ### What change is being made? Introduce a new Browser/MCP-based backend...

- Use tool calling - Eval Apriel model - Added readme ## Description by Korbit AI ### What change is being made? Update Gaia evaluation configs and flow to use...