Wei-Sheng Chin

Results 11 issues of Wei-Sheng Chin

According to our recent study, Newton's method can also be applied to solve FM. I am not sure if this result is helpful to fastFM, but if you...

question

Fix #12040. This PR removes most of the `opset` uses in C# test code. The only remaining use of `opset` is to work around a special filename only used in a...

When converting models from Core ML, we should calculate the padding amounts and create an ONNX Pad operator to replace the use of deprecated attributes (e.g., SAME_LOWER). For example of...

enhancement
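
The padding calculation mentioned in the Core ML conversion issue above can be sketched roughly as follows. This is only an illustration, assuming ONNX-style `SAME_UPPER`/`SAME_LOWER` semantics (output size is `ceil(input / stride)`, and the odd leftover pixel goes to the beginning for `SAME_LOWER`); the helper names `same_pads` and `onnx_pads_nchw` are hypothetical and are not taken from any converter's code.

```python
import math


def same_pads(input_size, kernel_size, stride=1, dilation=1, lower=True):
    """Return (pad_begin, pad_end) for one spatial axis.

    Assumes SAME-style padding: the output length is ceil(input / stride).
    """
    out = math.ceil(input_size / stride)
    effective_kernel = (kernel_size - 1) * dilation + 1
    total = max((out - 1) * stride + effective_kernel - input_size, 0)
    small, large = total // 2, total - total // 2
    # SAME_LOWER puts the extra padding at the beginning, SAME_UPPER at the end.
    return (large, small) if lower else (small, large)


def onnx_pads_nchw(height, width, kernel, strides, lower=True):
    """Build a `pads` list for an ONNX Pad node on a 4-D NCHW tensor.

    ONNX Pad expects all begin values first, then all end values:
    [N_begin, C_begin, H_begin, W_begin, N_end, C_end, H_end, W_end].
    """
    h_begin, h_end = same_pads(height, kernel[0], strides[0], lower=lower)
    w_begin, w_end = same_pads(width, kernel[1], strides[1], lower=lower)
    return [0, 0, h_begin, w_begin, 0, 0, h_end, w_end]
```

With explicit pads computed this way, the following convolution or pooling node would presumably use zero padding of its own, so that the deprecated attribute is no longer needed.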

The types of parameters are not clear. For example, for [FFM](https://github.com/microsoft/NimbusML/blob/210b220f74d13ccb6586e034e83a5939ef395cef/src/python/nimbusml/decomposition/factorizationmachinebinaryclassifier.py#L53), we have doc ``` :param feature: see `Columns `_. :param label: see `Columns `_. :param weight: see `Columns `_....

### Describe the issue A test, `TensorrtExecutionProviderTest.MultiThreadsTestWithOneSessionSingleThreadInference`, frequently fails in `Windows GPU TensorRT CI`. From its title, it looks like a multi-threading test, so I feel there might be a...

ep:TensorRT

I count the number of sub-graphs (for tiny-GPT2 in huggingface) by
```
class GraphCaptureCompiler:
    def __init__(self):
        self.captured_graphs = []

    def compile(self, gm, example_inputs):
        self.captured_graphs.append(gm)
        return gm

compiler = GraphCaptureCompiler()
torch._dynamo.optimize(compiler,...
```

triaged
open source
Merged
Reverted
ciflow/trunk
bug
module: dynamo
ciflow/inductor

repro: execute `bash run.sh` from `DeepSpeedExamples/training/HelloDeepSpeed`.
**error**
root@9824d79a444b:/home/DeepSpeedExamples/training/HelloDeepSpeed# sh run_ds.sh
[2024-04-15 21:59:04,124] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
File "/usr/local/bin/deepspeed", line 3, in...

bug

As title. We have pre-built executables for Windows and want to move them out of this repo.

The same code as #19769. I can't update that PR anymore since I force-pushed a change.

It's often hard to apply torch.compile directly to a model because of its many limitations. However, there are only a few key nn.Modules we really need to compile to get max speed....

needs-rebase
stale