caixiiaoyang comments

Results: 18 comments of caixiiaoyang

How to build debug-version Alpa-modified jaxlib

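> """To build with debug information, add the flag --bazel_options='--copt=/Z7'.""" FROM https://jax.readthedocs.io/en/latest/developer.html#id1 However, this answer is based on the latest openxla, and I'm not sure whether it works with 0.3.22. btw, could we add each other on WeChat to chat?

Sure, my WeChat ID is 13140163867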

Check failed: operand_dim < ins->operand(0)->shape().rank() (2 vs. 2) Does not support this kind of Gather.

> I also met this issue when trying to use `alpa.ShardParallel()` or `alpa.PipeshardParallel()` to auto parallelize my llama model. > > ![image](https://private-user-images.githubusercontent.com/8370601/286101881-14749d05-9b5c-4f52-ab3b-7566908f3b7a.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTEiLCJleHAiOjE3MDE5MTQwNzUsIm5iZiI6MTcwMTkxMzc3NSwicGF0aCI6Ii84MzcwNjAxLzI4NjEwMTg4MS0xNDc0OWQwNS05YjVjLTRmNTItYWIzYi03NTY2OTA4ZjNiN2EucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQUlXTkpZQVg0Q1NWRUg1M0ElMkYyMDIzMTIwNyUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyMzEyMDdUMDE0OTM1WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MTk1NDU1OTllMjAyYmI3N2RlZmMzYzNjMDM3MWUyZWJmNjFhZmJjZDZjZTc4NzgwOTQ5YjY1NTJmMDMyZWJjOCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.VucGPq6WLTOZO7S9U_N866Pfr4690sPXRNFd6IABzzQ) I also encountered this problem in the process...

Does not support this kind of Gather

I encountered the same problem. Has yours been solved?

Failed With Installation Check

Has your problem been solved? I'm running into the same issue.

When I check the installation by running `python3 -m alpa.test_install`, an AssertionError happens

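How many GPUs do you have? `num_stages` defaults to 2, and this problem occurs when `num_devices` is not divisible by `num_stages`. It is best to make sure `num_devices` is even.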
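The divisibility rule described above can be sketched as a quick pre-check before running the install test. This is only an illustration: `is_divisible_by_stages` and `largest_usable_device_count` are hypothetical helper names, not part of Alpa, and the default of 2 stages is taken from the comment above rather than verified against the Alpa source.

```python
def is_divisible_by_stages(num_devices: int, num_stages: int = 2) -> bool:
    """Return True if the devices can be split evenly across the stages.

    Hypothetical helper; the default num_stages=2 follows the comment above.
    """
    if num_devices <= 0 or num_stages <= 0:
        raise ValueError("num_devices and num_stages must be positive")
    return num_devices % num_stages == 0


def largest_usable_device_count(num_devices: int, num_stages: int = 2) -> int:
    """Return the largest device count <= num_devices that splits evenly."""
    if num_devices <= 0 or num_stages <= 0:
        raise ValueError("num_devices and num_stages must be positive")
    return (num_devices // num_stages) * num_stages


if __name__ == "__main__":
    for n in (2, 4, 5):
        print(n, is_divisible_by_stages(n), largest_usable_device_count(n))
```

With 5 GPUs and 2 stages, the check fails and the largest evenly divisible count is 4, which is why an even number of devices is suggested.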

When I check the installation by running `python3 -m alpa.test_install`, an AssertionError happens

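> > How many GPUs do you have? `num_stages` defaults to 2, and this problem occurs when `num_devices` is not divisible by `num_stages`. It is best to make sure `num_devices` is even.
>
> My server has 5 GPUs, as shown below ![image](https://user-images.githubusercontent.com/48151874/276118962-d4777b10-e090-4c5a-961b-1f9db404343e.png)
>
> How should I change the value of `num_devices`? When I try running `CUDA_VISIBLE_DEVICES=0,1 python3 -m alpa.test_install`, I still get the same problem.

I don't know how to change the number of GPUs. You have five GPUs, and they are not all the same model. If you compiled on an A100, running on a 3090 may cause problems.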
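To confirm how many GPUs a setting such as `CUDA_VISIBLE_DEVICES=0,1` is supposed to expose, the variable can be counted directly. This is a minimal sketch: `visible_device_count` is a hypothetical helper, and it only counts the comma-separated entries in the variable. It does not query the driver or check that the listed devices actually exist.

```python
import os
from typing import Mapping, Optional


def visible_device_count(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Count the entries listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (all GPUs are visible),
    otherwise the number of non-empty comma-separated entries.
    """
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    entries = [part.strip() for part in value.split(",") if part.strip()]
    return len(entries)


if __name__ == "__main__":
    print(visible_device_count({"CUDA_VISIBLE_DEVICES": "0,1"}))
```

If this reports an even count but the assertion still fails, the device count is probably not the cause, which fits the suggestion above that mixed GPU models may be the issue.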

IndexError: `InlinedVector::at(size_type) const` failed bounds check

Have you solved this? I'm getting the same error too.

Can you train yolov5?

> I cannot train yolov5 with your code; it reports the error `ValueError: Target size (torch.Size([16, 1200, 1])) must be the same as input size (torch.Size([16, 17000, 1]))`. It seems that...