glowwormX
sdk1.0.3
1. The JSON returned by TxService.getTransactionsCountByContractAddr should not be deserialized with TxResponse
```
@Override
public Request getTransactionsCountByContractAddr(String from, String to, String contractAddress, boolean txExtra, int... nodeIds) {
    TxRequest txRequest = new TxRequest(TX_PREFIX + "getTransactionsCountByContractAddr", providerManager, TxResponse.class, nodeIds);
    HashMap params = ...
```
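For context, a minimal sketch of the kind of change the report seems to ask for, keeping the `TxRequest` constructor shown above. The response type `TxCountResponse`, the `addParams` call, and the parameter names are all hypothetical and only illustrate swapping in a count-specific deserialization class:

```java
@Override
public Request getTransactionsCountByContractAddr(String from, String to, String contractAddress,
                                                  boolean txExtra, int... nodeIds) {
    // Hypothetical: pass a count-specific response class instead of TxResponse,
    // so the JSON returned by this API (a count, not a full transaction list)
    // is deserialized into matching fields.
    TxRequest txRequest = new TxRequest(TX_PREFIX + "getTransactionsCountByContractAddr",
            providerManager, TxCountResponse.class, nodeIds);

    // Parameter handling below is an assumption for illustration only.
    HashMap<String, Object> params = new HashMap<>();
    params.put("from", from);
    params.put("to", to);
    params.put("address", contractAddress);
    params.put("txExtra", txExtra);
    txRequest.addParams(params);

    return txRequest;
}
```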
In the demo, every contract invocation requires writing a new class. I tried the lambda-expression style instead, and `org.apache.bcel.util` reports a class-not-found error (see the sketch after this snippet).
```
// invoke: registration
//Transaction transaction1 = new Transaction.HVMBuilder(account.getAddress()).invoke(contractAddress, new InvokeStudentReg()).build();
// use a lambda expression to avoid writing a new class
BaseInvoke register = iStudent -> iStudent.registerStudent(Arrays.asList(new Student("id1", "name1", 20), new Student("id2", "name2", 20)));
Transaction transaction1 = new Transaction.HVMBuilder(account.getAddress()).invoke(contractAddress,...
```
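One possible explanation (an assumption, not confirmed here) is that a lambda compiles to a class generated at runtime, so bytecode-level tooling such as BCEL cannot find a `.class` file for it on the classpath, whereas an anonymous inner class is compiled to its own class file. A minimal sketch of that workaround, reusing the demo's `BaseInvoke`, `IStudent` and `Student` types; the generic parameters, the `invoke` method name, and the `Boolean` return type of `registerStudent` are assumptions for illustration:

```java
// Anonymous inner class instead of a lambda: it is compiled to a real
// .class file (e.g. Demo$1.class), which bytecode readers such as BCEL
// can locate, unlike the runtime-generated class behind a lambda.
BaseInvoke<Boolean, IStudent> register = new BaseInvoke<Boolean, IStudent>() {
    @Override
    public Boolean invoke(IStudent iStudent) {
        return iStudent.registerStudent(Arrays.asList(
                new Student("id1", "name1", 20),
                new Student("id2", "name2", 20)));
    }
};

Transaction transaction1 = new Transaction.HVMBuilder(account.getAddress())
        .invoke(contractAddress, register)
        .build();
```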
### Reminder

- [x] I have read the above rules and searched the existing issues.

### System Info

main

### Reproduction

I saw in the Changelog that DeepSeek V3 and R1 are supported, but the commits only add a template, with no other implementation. Is there a working example? My understanding is that getting this to run requires expert parallelism: DeepSeek-V3 in bf16 is 1.3T, and with ZeRO-3 every expert's parameters would have to be all-gathered, which is far too much communication. In addition, [huggingface's modeling_deepseek](https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/modeling_deepseek.py) does not support training (line 439: `assert not self.training`), so a separate implementation would be needed as well...
I trained on the NPU using FP16 and found many NaN values in step 1 of the training results.
```
(TaskRunner pid=1218449) [2025-11-19 17:41:06,020] [INFO] [aggregate_logger.py:54:log]: step:1 actor/entropy:0.8346855640411377 training/rollout_probs_diff_valid:1 training/rollout_probs_diff_max:nan...
```
I am using the verl main code from 10/30, with the environment installed following [Dockerfile.ascend_8.2.rc1_a2](https://github.com/volcengine/verl/blob/main/docker/Dockerfile.ascend_8.2.rc1_a2). Running recipe/dapo/run_dapo_qwen3_moe_30b_megatron_npu.sh, initialization and rollout both complete, but training fails with:
```
ray.exceptions.RayTaskError(RuntimeError): ray::WorkerDict.actor_rollout_update_actor() (pid=490071, ip=172.16.2.11, actor_id=dee0d43a6f32372ec4ff655e04000000, repr=)
  File "/cache/ray_temp/session_2025-10-31_17-44-09_992100_1145212/runtime_resources/working_dir_files/_ray_pkg_d85728c4d7bda8f2/verl/single_controller/ray/base.py", line 700, in func
    return getattr(self.worker_dict[key], name)(*args, **kwargs)
  File "/cache/ray_temp/session_2025-10-31_17-44-09_992100_1145212/runtime_resources/working_dir_files/_ray_pkg_d85728c4d7bda8f2/verl/single_controller/base/decorator.py", line 442, in inner
    return func(*args, **kwargs)
...
```
Running 30b with:
```
actor_rollout_ref.rollout.tensor_model_parallel_size=2 \
actor_rollout_ref.rollout.data_parallel_size=2 \
actor_rollout_ref.rollout.expert_parallel_size=4 \
```
On A2 I want to run 235b with tp=8 dp=2, or some other higher-performance sharding strategy; testing on 30b first, it errors out. Removing dp and ep makes it run normally; both fsdp and megatron report the error.

verl code from 11/18, commit 51d2104ecb61563c41123a8f0bce2f06b18387dc
vllm 0.11.0.rc2
cann 8.3.rc2

Log:
```
(WorkerDict pid=3381096) INFO 12-04 17:33:17 [layer.py:332] FlashInfer CUTLASS...
```