Sherlock Huang
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #83134
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #83050 AOTAutograd retraces the graph module produced by TorchDynamo; this PR preserves the stack trace recorded on the original fx.Node.
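A minimal sketch of how per-node stack traces can be produced and inspected with torch.fx's `record_stack_traces` flag; the `StackTraceTracer` subclass and the traced function are illustrative, not part of the PR:

```python
import torch
import torch.fx

class StackTraceTracer(torch.fx.Tracer):
    # TracerBase exposes record_stack_traces; enabling it makes the
    # tracer populate node.stack_trace for each traced call.
    record_stack_traces = True

def f(x):
    return x.relu() + 1

graph = StackTraceTracer().trace(f)
gm = torch.fx.GraphModule({}, graph)
for node in gm.graph.nodes:
    # stack_trace points back at the Python source line that created the node;
    # preserving it across a retrace keeps this provenance intact.
    print(node.op, node.name, getattr(node, "stack_trace", None))
```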
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #84896 We can now get a C++ stack trace by calling torch.utils.get_cpp_backtrace(). Sample output when called from a torch_dispatch stack: ``` frame #23:...
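A minimal sketch of calling it from inside a `__torch_dispatch__` handler; the `CppBacktraceMode` class here is illustrative, only `torch.utils.get_cpp_backtrace()` comes from the PR:

```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode

class CppBacktraceMode(TorchDispatchMode):
    # Illustrative mode: dump the C++ frames active at each dispatch.
    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        print(f"--- {func} ---")
        print(torch.utils.get_cpp_backtrace())
        return func(*args, **(kwargs or {}))

with CppBacktraceMode():
    torch.randn(2) + 1  # each aten op prints a C++ backtrace
```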
Draft of the PT2 Export schema. This is the logical representation of the schema; the actual schema will be written with FlatBuffers or another serialization library. An example of the readable format for ResNet18...
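A hypothetical sketch of what such a logical schema could look like as Python dataclasses; every name and field below (TensorMeta, Node, Graph, ...) is illustrative and does not reflect the draft's actual definitions:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class TensorMeta:
    dtype: str           # e.g. "float32"
    sizes: List[int]     # static shape; symbolic dims would need richer types

@dataclass
class Node:
    target: str          # op name, e.g. "aten.conv2d"
    inputs: List[str]    # names of consumed values
    outputs: List[str]   # names of produced values
    metadata: Dict[str, str] = field(default_factory=dict)  # e.g. stack trace

@dataclass
class Graph:
    nodes: List[Node]
    tensor_values: Dict[str, TensorMeta]  # value name -> tensor metadata
    inputs: List[str]
    outputs: List[str]
```

The logical layer stays serialization-agnostic, so the same structure can later be lowered to FlatBuffers or any other wire format.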
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #87662
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #91919
Need to run with the fix in https://github.com/pytorch/pytorch/pull/166702

```
NGPU=8 CONFIG_FILE=./torchtitan/models/llama3/train_configs/debug_model.toml ./run_train.sh --model.name compiler_toolkit.llama3 --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=4
```

Current output: P2016557983

Observations
- I see that each TransformerBlock becomes one subgraph; look for...
As titled, compile.enable should be False in the compiler-toolkit-style workflow.