Liao Peiyuan issues

Results: 10 issues of Liao Peiyuan

Is model parallelism supported for PyTorch?

If I write my own multi-GPU model or use `torch.distributed.pipeline.sync.Pipe`, would multi-node training still work with byteps?

[misc] python tests error that doesn't affect program execution

**Describe the bug**
When running ti test, the main program throws an error saying that it is unable to recognize the argument ``-n1``.

**Log/Screenshots**
Running python tests... ERROR: usage: ti...

python

potential bug

good first issue

[Content additions and extensions] Collective communication

- [ ] Intra-node and inter-node network details: how RDMA, RoCE, Infiniband, NVLink, and AWS EFA relate to one another
- [ ] Allreduce algorithm details: tree, ring, CollNet
- [ ] Other common collective operations: AllGather, Reduce-Scatter, Broadcast
- [ ]...
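As a rough illustration of the "ring" item in the checklist above, here is a minimal single-process simulation of ring allreduce (a reduce-scatter phase followed by an all-gather phase). The function name `ring_allreduce` and the restriction that the buffer length divides evenly by the worker count are assumptions made for this sketch; it is not taken from any particular library.

```python
from typing import List


def ring_allreduce(buffers: List[List[float]]) -> List[List[float]]:
    """Simulate a sum-allreduce over a ring of len(buffers) workers.

    Hypothetical helper for illustration only; real implementations
    overlap communication and handle uneven chunk sizes.
    """
    n = len(buffers)
    length = len(buffers[0])
    assert all(len(b) == length for b in buffers), "buffers must match in size"
    assert length % n == 0, "sketch assumes length divisible by worker count"
    chunk = length // n
    data = [list(b) for b in buffers]  # copy so inputs are not modified

    def span(c: int) -> range:
        # Element indices that belong to chunk c.
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1, reduce-scatter: in step s, worker r sends chunk (r - s) mod n
    # to worker r + 1, which adds it into its own copy of that chunk.
    for s in range(n - 1):
        # Snapshot all payloads first so every worker "sends" simultaneously.
        sends = []
        for r in range(n):
            c = (r - s) % n
            sends.append((c, [data[r][i] for i in span(c)]))
        for r in range(n):
            c, payload = sends[r]
            dst = (r + 1) % n
            for i, v in zip(span(c), payload):
                data[dst][i] += v

    # Worker r now holds the fully reduced chunk (r + 1) mod n.
    # Phase 2, all-gather: in step s, worker r forwards chunk (r + 1 - s) mod n
    # to worker r + 1, which overwrites its copy with the reduced values.
    for s in range(n - 1):
        sends = []
        for r in range(n):
            c = (r + 1 - s) % n
            sends.append((c, [data[r][i] for i in span(c)]))
        for r in range(n):
            c, payload = sends[r]
            dst = (r + 1) % n
            for i, v in zip(span(c), payload):
                data[dst][i] = v

    return data
```

Each worker sends and receives `2 * (n - 1)` chunks in total, which is why the per-worker traffic stays roughly constant as the ring grows.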

great suggestion

confirmed

Update brisque.py

Version of Protobuf

I saw in the Dockerfile that the install command puts protobuf=3.9 in the container, but locally only 3.6.1 worked for me (on Anaconda). `INSTALL.md` also suggests 3.6.1 (or later, but only...

Access Matrix

Access matrix on ComputeDAG for all tensors, following a [Tiramisu](https://arxiv.org/pdf/2104.04955.pdf)-style construction and flattened, then applied to each TIR-buffer during prediction. Results on a 1000-subset in e5: Model: xgb RMSE: 0.0793...

Graph Model

Graph Neural Networks on lowered TIR Stmt.

Assembly-level Feature Extraction

Known records that result in OOM/Segfault on `e5-2673`:

```
'([6e24397e6a0f017cdf546cb2ec2116a5,4,8,8,512,1,1,512,2048,1,1,1,2048,4,8,8,2048,4,8,8,2048],llvm).json'
'([96a9c78ecf376dfe9479c94de3b5e43a,2,8,8,2048,2,8,8,2048],llvm).json'
'([8c674f26f66543069d1e1c56cda249f9,8,15,15,1024,1,1,1024,2048,1,1,1,2048,8,8,8,2048],llvm).json'
'([96a9c78ecf376dfe9479c94de3b5e43a,4,8,8,2048,4,8,8,2048],llvm).json'
'([96a9c78ecf376dfe9479c94de3b5e43a,1,8,8,2048,1,8,8,2048],llvm).json'
'([c985575127b3ad80868c19dff585d60b,1,100,36864,100,1,36864],llvm).json'
'([6e24397e6a0f017cdf546cb2ec2116a5,8,7,7,512,1,1,512,2048,1,1,1,2048,8,7,7,2048,8,7,7,2048],llvm).json'
```

Approximate time taken on a 64-core machine: 100 hours

Full-scale results on e5:...

md5 checksum for pre-trained model?

What is the meaning of an empty `pos_candidate`?

There are 761 rows in the [HuggingFace dataset osunlp/Multimodal-Mind2Web](https://huggingface.co/datasets/osunlp/Multimodal-Mind2Web) that have an empty `pos_candidate`. The rows span 497 tasks:

```
{'test_domain': 164, 'test_task': 47, 'test_website': 34, 'train': 252}
```

...
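A tally like the one above could be reproduced along these lines. This is only a sketch on in-memory rows: the helper name `summarize_empty` and the `split` and `task_id` keys are hypothetical, and it assumes the per-split numbers count distinct tasks (they sum to 497), which the issue text does not state explicitly.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping, Tuple


def summarize_empty(
    rows: Iterable[Mapping[str, Any]],
    field: str = "pos_candidate",  # field name as written in the issue
    split_key: str = "split",      # hypothetical key naming the split
    task_key: str = "task_id",     # hypothetical key naming the task
) -> Tuple[int, Dict[str, int]]:
    """Return (rows with an empty field, distinct affected tasks per split)."""
    empty_rows = 0
    seen = set()  # distinct (split, task) pairs with at least one empty row
    for row in rows:
        if not row.get(field):  # missing, None, or empty list all count as empty
            empty_rows += 1
            seen.add((row[split_key], row[task_key]))
    tasks_per_split = Counter(split for split, _ in seen)
    return empty_rows, dict(tasks_per_split)


# Toy rows standing in for the real dataset.
toy_rows = [
    {"split": "train", "task_id": "a", "pos_candidate": []},
    {"split": "train", "task_id": "a", "pos_candidate": []},
    {"split": "train", "task_id": "b", "pos_candidate": ["x"]},
    {"split": "test_task", "task_id": "c", "pos_candidate": []},
]
empty_count, per_split = summarize_empty(toy_rows)
```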