Ma Mingfei comments

Results: 93 comments of Ma Mingfei

V2 Performance Signal Detected by TorchBench CI on '1.13.0.dev20220811+cu113'

> @mingfeima the profiler will affect the e2e runs. In general, did you repro the regression on your side? I can't reproduce the regression on my side. `hf_BigBird` is slightly...

V2 Performance Signal Detected by TorchBench CI on '1.13.0.dev20220811+cu113'

@xuzhao9 Hi, any update on this issue? One more thing to clarify is that the performance improvement from https://github.com/pytorch/pytorch/pull/84404 cannot be properly reflected by the listed modes in this issue...

[RFC] torchvision performance optimization on CPU

@NicolasHug First of all, yes, our priority is inference. And the models most requested by our customers are `MaskedRCNN` and its variants. So from this point of view, the key...

[RFC] torchvision performance optimization on CPU

@NicolasHug @vfdev-5 Oh sorry for the late response, super busy recently, just got time to take a look at this last weekend ... I opened https://github.com/pytorch/pytorch/pull/87053 to address `mode=bilinear (3,...

[RFC] torchvision performance optimization on CPU

> Hopefully there will be support for uint8 type input and an accelerated version of it for `interpolate()` as mentioned in [pytorch/pytorch#86361 (comment)](https://github.com/pytorch/pytorch/pull/86361#issuecomment-1269822386) and [pytorch/pytorch#5580](https://github.com/pytorch/pytorch/issues/5580). Sum up the status...

[Roadmap] CPU Performance Optimization for PyG

Ok, I see. Then it is better to optimize scatter_reduce in torch. Just checked the code, scatter_add and scatter_reduce share the same kernel in torch so they have the same...

[Roadmap] CPU Performance Optimization for PyG

> > I'm a complete newbie to this, so my question is to learn not suggest something. Can you explain what you're intending to change here? It sounds like you...

[Roadmap] CPU Performance Optimization for PyG

The current benchmark profiling result uses the default setting. Some scripts, for example [to_hetero_mag](https://github.com/pyg-team/pytorch_geometric/blob/master/examples/hetero/to_hetero_mag.py#L29), explicitly set `num_workers`; if not, the pytorch default setting will be 4. `DataLoader` time in...

[Roadmap] CPU Performance Optimization for PyG

Updates on `scatter_add` optimizations: PR submitted at https://github.com/pytorch/pytorch/pull/82703 ### Initiative Depending on the type of `edge_index`, message passing will choose different paths: a) `scatter_add` for dense tensor; b) `spmm` for `SparseTensor`....

[Roadmap] CPU Performance Optimization for PyG

> Just to understand: Does this mean that we first sort `index` and then do a segment reduction? In that case it might be good to preserve the information that...
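
To make the question above concrete, here is a minimal pure-Python sketch of the two strategies being contrasted: a direct scatter-add versus sorting by `index` and then reducing each contiguous segment. It is only an illustration of the idea, not the kernel from the PR; the function names `scatter_add` and `sorted_segment_add` and the sample data are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def scatter_add(src, index, size):
    """Direct scatter-add: out[index[i]] += src[i] for every i."""
    out = [0.0] * size
    for value, target in zip(src, index):
        out[target] += value
    return out


def sorted_segment_add(src, index, size):
    """Sort (index, value) pairs by index, then sum each contiguous run."""
    out = [0.0] * size
    pairs = sorted(zip(index, src), key=itemgetter(0))
    # After sorting, all values that share a target index are adjacent,
    # so each group can be reduced independently of the others.
    for target, run in groupby(pairs, key=itemgetter(0)):
        out[target] = sum(value for _, value in run)
    return out


# Hypothetical sample data: five values scattered into three output slots.
src = [1.0, 2.0, 3.0, 4.0, 5.0]
index = [2, 0, 2, 1, 0]

print(scatter_add(src, index, 3))         # [7.0, 4.0, 4.0]
print(sorted_segment_add(src, index, 3))  # [7.0, 4.0, 4.0]
```

Both functions return the same result; the sorted variant trades an up-front sort for segments that no longer write to overlapping output slots, which is why knowing that `index` is already sorted would let the sort be skipped.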