一叶飘零

Results 13 issues of 一叶飘零

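==>result[key] = (values[0] + min(left[d], right[d])) * values[1] I can't understand what this step is doing. My understanding is that it would be enough to take the smaller of the left and right entropies as the value assigned in this step. def find_word(self, N): # Get the mutual information via search # e.g. dict{ "a_b": (PMI, occurrence probability), .. } bi = self.search_bi() # Get the left and right entropy via search left...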
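One way to read the quoted line, sketched under assumptions: `values[0]` is the PMI and `values[1]` the occurrence probability (as the comment in the snippet says), and `left`/`right` hold the left and right entropies. The function name `score_candidates` and the choice to key the entropy tables by the same key as `bi` are hypothetical, since the original code is truncated. The toy numbers are made up only to show that the combined score can rank candidates differently from the minimum entropy alone.

```python
def score_candidates(bi, left, right):
    """Combine PMI, boundary entropy and probability into one score per candidate."""
    result = {}
    for key, values in bi.items():
        pmi, prob = values  # values[0] = PMI, values[1] = occurrence probability
        # The weaker of the two boundaries limits how "free" the candidate is.
        boundary = min(left[key], right[key])
        # Same shape as the quoted line: (PMI + min entropy) * probability.
        result[key] = (pmi + boundary) * prob
    return result


# Toy data (hypothetical): two candidates with different cohesion and boundaries.
bi = {"a_b": (2.0, 0.5), "c_d": (6.0, 0.5)}
left = {"a_b": 1.0, "c_d": 3.0}
right = {"a_b": 1.5, "c_d": 0.5}

scores = score_candidates(bi, left, right)
min_entropy_only = {key: min(left[key], right[key]) for key in bi}

best_by_score = max(scores, key=scores.get)
best_by_entropy = max(min_entropy_only, key=min_entropy_only.get)
print(scores, best_by_score, best_by_entropy)
```

In this reading, the minimum entropy only measures how freely the candidate's boundaries vary. Adding the PMI brings in internal cohesion, and multiplying by the probability favors frequent candidates, so the result is not the same as assigning the minimum entropy alone.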

When will a Java API be supported?

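We want to do continued pretraining on our company's data. Could you provide a pretraining script, including lr, max_seq_len, and any other details to watch out for?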

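A question about distributed training: this distributed code runs fine on a single machine with multiple GPUs, but on multiple machines with multiple GPUs it keeps failing when saving a checkpoint, and I have not been able to locate the problem. Do you have experience with this? Where does the problem lie? 2024-04-15 09:40 File "/opt/conda/lib/python3.10/shutil.py", line 679, in _rmtree_safe_fd 2024-04-15 09:40 os.unlink(entry.name, dir_fd=topfd) 2024-04-15 09:40 FileNotFoundError: [Errno 2] No such file or directory: 'rng_state_28.pth'...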
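The traceback shows `shutil` failing to unlink a file that is already gone. One possible cause (an assumption, since the report is truncated) is that several processes delete the same old checkpoint directory on a shared filesystem at the same time. A minimal sketch of a guard follows: only one process rotates checkpoints, and files that are already missing are tolerated. The helper name `rotate_checkpoints`, the `checkpoint-<step>` directory naming, and the `is_main_process` flag are all hypothetical and would need to be mapped onto the training framework in use.

```python
import shutil
import tempfile
from pathlib import Path


def rotate_checkpoints(output_dir, keep_last, is_main_process):
    """Delete all but the newest `keep_last` checkpoint directories."""
    if not is_main_process:
        # Other ranks skip deletion so they never race on the same files.
        return []
    checkpoints = sorted(
        Path(output_dir).glob("checkpoint-*"),  # assumed naming scheme
        key=lambda p: int(p.name.split("-")[-1]),
    )
    stale = checkpoints[:-keep_last] if keep_last > 0 else checkpoints
    for path in stale:
        # ignore_errors=True tolerates entries that vanished in the meantime.
        shutil.rmtree(path, ignore_errors=True)
    return [p.name for p in stale]


# Small local demonstration with throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    for step in (1, 2, 3):
        ckpt = Path(tmp) / f"checkpoint-{step}"
        ckpt.mkdir()
        (ckpt / "rng_state_0.pth").write_bytes(b"")
    skipped = rotate_checkpoints(tmp, keep_last=1, is_main_process=False)
    removed = rotate_checkpoints(tmp, keep_last=1, is_main_process=True)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
    print(skipped, removed, remaining)
```

In a real multi-node run the other ranks would also need to wait at a barrier until the deletion has finished. That step is left out here because it depends on the framework.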

### Is your feature request related to a problem? Please describe. https://zhuanlan.zhihu.com/p/639362627 ### Solutions Introduce a flash attention transformer for acceleration ### Additional context _No response_

How can this be solved? Is it a CUDA version problem? ****************************************************************************************************************************** CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64... CUDA SETUP: CUDA version lower than 11 are currently not supported for LLM.int8(). You will be...

As the title says: can other languages such as Java use the base LLM?

![image](https://github.com/deepseek-ai/DeepSeek-Coder/assets/20754435/53540b6d-7ca6-437f-be72-5c8318182843) ![garbled tokenizer output](https://github.com/deepseek-ai/DeepSeek-Coder/assets/20754435/59db7e6c-029a-4bd8-b8f5-09407affb0cd)

### System Info ```Shell - `Accelerate` version: 0.29.3 - Platform: Linux-3.10.0-1160.el7.x86_64-x86_64-with-glibc2.35 - `accelerate` bash location: /opt/conda/bin/accelerate - Python version: 3.10.13 - Numpy version: 1.26.4 - PyTorch version (GPU?): 2.1.2+cu121 (True)...

https://github.com/triton-lang/triton/blob/95623038c75463286aa5d4a44782ba7492cc1afa/python/triton/language/semantic.py#L761C1-L763C1 How can this be resolved?