Results 10 issues of Liao Peiyuan

If I write my own multi-GPU model or use `torch.distributed.pipeline.sync.Pipe`, would multi-node training still work with byteps?

**Describe the bug** When running `ti test`, the main program throws an error saying that it is unable to recognize the argument `-n1`. **Log/Screenshots** Running python tests... ERROR: usage: ti...

python
potential bug
good first issue

- [ ] Intra-node and inter-node networking details: how RDMA, RoCE, InfiniBand, NVLink, and AWS EFA relate to one another
- [ ] Allreduce algorithm details: tree, ring, CollNet
- [ ] Other common collective operations: AllGather, Reduce-Scatter, Broadcast
- [ ]...
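To make the "ring" item in the checklist concrete, here is a minimal single-process sketch of ring allreduce (a reduce-scatter phase followed by an allgather phase). It only simulates the data movement with Python lists; the function name `ring_allreduce` and the requirement that the buffer length be divisible by the number of workers are assumptions made for illustration, and this is not how any particular library such as NCCL implements it.

```python
def ring_allreduce(buffers):
    """Simulate a sum-allreduce over a ring of len(buffers) workers.

    Hypothetical helper: each entry of `buffers` is one worker's local vector.
    Returns a new list in which every worker holds the element-wise sum.
    """
    n = len(buffers)
    if n == 0:
        return []
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all workers must hold buffers of equal length")
    if length % n != 0:
        # Simplifying assumption for this sketch: one equal chunk per worker.
        raise ValueError("buffer length must be divisible by the worker count")

    size = length // n
    data = [list(b) for b in buffers]  # copy so the inputs stay untouched

    def chunk(c):
        return slice(c * size, (c + 1) * size)

    # Phase 1: reduce-scatter. At each step, worker i sends chunk (i - step) % n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot all payloads first so the sends behave as if simultaneous.
        sends = [(i, (i - step) % n) for i in range(n)]
        payloads = [data[i][chunk(c)] for i, c in sends]
        for (src, c), payload in zip(sends, payloads):
            dst = (src + 1) % n
            data[dst][chunk(c)] = [a + b for a, b in zip(data[dst][chunk(c)], payload)]

    # After phase 1, worker i holds the fully reduced chunk (i + 1) % n.
    # Phase 2: allgather. Reduced chunks travel around the ring and overwrite
    # the stale partial sums on each receiver.
    for step in range(n - 1):
        sends = [(i, (i + 1 - step) % n) for i in range(n)]
        payloads = [data[i][chunk(c)] for i, c in sends]
        for (src, c), payload in zip(sends, payloads):
            dst = (src + 1) % n
            data[dst][chunk(c)] = payload

    return data
```

Each worker sends and receives one chunk per step for `2 * (n - 1)` steps in total, which is why the per-worker traffic stays close to twice the buffer size regardless of the number of workers.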

great suggestion
confirmed

I saw in the Dockerfile that the command installs protobuf=3.9 in the container, but locally only 3.6.1 worked for me (on Anaconda). `INSTALL.md` also suggests 3.6.1 (or later, but only...

Adds an access matrix on the ComputeDAG for all tensors, following a [Tiramisu](https://arxiv.org/pdf/2104.04955.pdf)-style construction; the matrix is flattened and then applied to each TIR buffer during prediction. Results on a 1000-subset in e5: Model: xgb RMSE: 0.0793...

Graph Neural Networks on lowered TIR Stmt.

Known records that result in OOM/Segfault on `e5-2673`:
```
'([6e24397e6a0f017cdf546cb2ec2116a5,4,8,8,512,1,1,512,2048,1,1,1,2048,4,8,8,2048,4,8,8,2048],llvm).json'
'([96a9c78ecf376dfe9479c94de3b5e43a,2,8,8,2048,2,8,8,2048],llvm).json'
'([8c674f26f66543069d1e1c56cda249f9,8,15,15,1024,1,1,1024,2048,1,1,1,2048,8,8,8,2048],llvm).json'
'([96a9c78ecf376dfe9479c94de3b5e43a,4,8,8,2048,4,8,8,2048],llvm).json'
'([96a9c78ecf376dfe9479c94de3b5e43a,1,8,8,2048,1,8,8,2048],llvm).json'
'([c985575127b3ad80868c19dff585d60b,1,100,36864,100,1,36864],llvm).json'
'([6e24397e6a0f017cdf546cb2ec2116a5,8,7,7,512,1,1,512,2048,1,1,1,2048,8,7,7,2048,8,7,7,2048],llvm).json'
```
Approximate time taken on a 64-core machine: 100 hours. Full-scale results on e5:...

There are 761 rows in the [HuggingFace dataset osunlp/Multimodal-Mind2Web](https://huggingface.co/datasets/osunlp/Multimodal-Mind2Web) that have an empty `pos_candidate`. The rows span 497 tasks:
```
{'test_domain': 164, 'test_task': 47, 'test_website': 34, 'train': 252}
```
...
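A count like the one above could be reproduced with a small helper along these lines. This is only a sketch: it works on any mapping from split name to an iterable of row dictionaries, the field name `pos_candidate` is taken from the issue text, and the task identifier field `annotation_id` as well as the function name are assumptions that may need adjusting to the dataset's actual schema.

```python
def summarize_empty_candidates(rows_by_split,
                               candidate_field="pos_candidate",
                               task_field="annotation_id"):
    """Count rows with an empty candidate list and the distinct tasks affected.

    rows_by_split: mapping of split name -> iterable of row dicts.
    candidate_field / task_field: assumed column names (see the note above).
    Returns (number_of_empty_rows, {split: number_of_affected_tasks}).
    """
    empty_rows = 0
    tasks_per_split = {}
    for split, rows in rows_by_split.items():
        affected_tasks = set()
        for row in rows:
            # Treat a missing field, None, and an empty list all as "empty".
            if not row.get(candidate_field):
                empty_rows += 1
                affected_tasks.add(row[task_field])
        if affected_tasks:
            tasks_per_split[split] = len(affected_tasks)
    return empty_rows, tasks_per_split


# Toy rows, made up purely to show the call shape (not real dataset content).
toy = {
    "train": [
        {"annotation_id": "t1", "pos_candidate": []},
        {"annotation_id": "t1", "pos_candidate": []},
        {"annotation_id": "t2", "pos_candidate": ["node"]},
    ],
    "test_task": [
        {"annotation_id": "t3", "pos_candidate": []},
    ],
}
print(summarize_empty_candidates(toy))
```

Because the helper accepts plain iterables of dictionaries, the splits of a loaded dataset can be passed in directly, provided the column names match.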