Junjie Mao
Hi all, I've recently been building lablgtk (a GTK2 wrapper for OCaml) using the mingw64 toolchain provided by msys2. The package uses ocamlmklib (and thus flexlink) to create a DLL called...
Bindgen can be configured through environment variables that control how it searches for clang and which extra arguments it passes to the compiler. The former is the only way to change which...
**Describe the bug**
When training Llama2 with DeepCompile enabled, the backward engine seems to pass two losses to the backward graph, with one of them being None and not actually...
# Problem Statement
Today we have multiple sets of ProcessGroup management in the codebase for different parallel scenarios, namely:
* `PipelineModule` uses `PipelineParallelGrid` for pipeline parallelism.
* AutoTP uses one...
**Describe the bug**
Trying to deep-compile a model calling tensor.expand() triggers the following guard error:
```
[rank0]: Traceback (most recent call last):
[rank0]: File "/Playground/gist/deepcompile/extend.py", line 46, in
[rank0]: o...
```
# Description
Real-life training data may differ in size across ranks and across iterations. When DeepCompile is active, training with variable-length data can hang because...
**Describe the bug**
With a training script like the following:
```
import deepspeed
import deepspeed.comm as dist

def main(args):
    deepspeed.init_distributed()
    model = Model()
    ......
    model.destroy()
    dist.destroy_process_group()
```
The following exception...