CodeGeeX
Hardware requirements?
You mention "NVIDIA V100 or A100", but could a newer consumer card like the RTX 3080 work? How much VRAM do the models need?
Hi! CodeGeeX has 13B parameters; even in FP16 format, the model weights alone take around 27 GB of memory. Thus it requires a GPU with at least 32 GB of VRAM.
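For intuition, the back-of-envelope arithmetic behind that figure looks roughly like this (a sketch only; the exact footprint also depends on activations, KV cache, and runtime overhead):

```python
# Rough VRAM estimate for a 13B-parameter model in FP16 (illustrative only).
params = 13e9           # 13 billion parameters
bytes_per_param = 2     # FP16 = 2 bytes per parameter
weights_gb = params * bytes_per_param / 1e9
print(f"FP16 weights alone: ~{weights_gb:.0f} GB")  # ~26 GB

# Activations and framework overhead push the working footprint toward
# 27 GB, which is why a 32 GB card (V100 32GB / A100) is recommended.
```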
If you want to run it on an RTX 3080, there are two possibilities:

1. Run it on a single card by frequently swapping weights between CPU RAM and GPU VRAM. This is feasible but extremely slow (the speed depends on the PCIe bandwidth of your hardware).
2. Run it on multiple cards (e.g., 4 × 3080) using pipeline parallelism: divide the model into multiple parts, put them on different cards, and run the forward pass of each part sequentially (see the sketch below). This is faster than the first option.
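Here is a minimal sketch of the second approach in plain PyTorch, assuming a generic stack of decoder layers; the class and stage layout are hypothetical placeholders, not CodeGeeX's actual API:

```python
import torch
import torch.nn as nn

class PipelinedModel(nn.Module):
    """Naive pipeline parallelism: split a stack of layers across GPUs
    and run the forward pass stage by stage."""

    def __init__(self, layers: nn.ModuleList, num_gpus: int = 4):
        super().__init__()
        # Divide the layers into `num_gpus` contiguous stages,
        # each placed on its own card.
        per_stage = (len(layers) + num_gpus - 1) // num_gpus
        self.stages = nn.ModuleList(
            nn.Sequential(*layers[i * per_stage:(i + 1) * per_stage]).to(f"cuda:{i}")
            for i in range(num_gpus)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only the activations cross the PCIe bus between stages.
        for i, stage in enumerate(self.stages):
            x = stage(x.to(f"cuda:{i}"))
        return x
```

With 4 stages, each card holds only about a quarter of the weights (roughly 7 GB in FP16), which fits in a 3080's 10 GB of VRAM.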
Hello, was optimizing the number of model parameters part of your research priorities? If so, what methods or strategies did you use? If not, do you plan to look into this soon? Thank you.
Optimizing model inference is definitely important future work. It is not necessarily about reducing the number of parameters, since the model needs enough capacity to learn different languages (especially in the multilingual case). Instead, we plan to use methods like quantization (e.g., compressing FP16 to INT8) and faster implementations (a C++ implementation, kernel fusion, etc.). We are currently working on this, and the solution will also be made publicly available.
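As a rough illustration of the FP16 → INT8 idea (not the team's actual solution), PyTorch's built-in dynamic quantization halves the weight storage of Linear layers; note that this particular API targets CPU inference, so GPU INT8 serving needs dedicated kernels:

```python
import torch
import torch.nn as nn

# `model` stands in for any Linear-heavy network (hypothetical sizes).
model = nn.Sequential(nn.Linear(4096, 16384), nn.GELU(), nn.Linear(16384, 4096))

# Store Linear weights as INT8 (half the size of FP16);
# activations are quantized dynamically at runtime.
quantized = torch.quantization.quantize_dynamic(
    model,
    {nn.Linear},        # which module types to quantize
    dtype=torch.qint8,  # INT8 weight storage
)

x = torch.randn(1, 4096)
print(quantized(x).shape)  # forward pass works as before: torch.Size([1, 4096])
```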
Has anyone tried an RTX 4090? Would the model fit in its 24 GB of VRAM?