Doiiars

250 comments by Doiiars

https://chromewebstore.google.com/detail/crawlemon/omhiddaacemadihfgbdmpioljgfhddkp?hl=zh-CN is done. Crawlemon exports page data in one click.

> Are you running inference in fp16? The qwen2-7B-instruct model overflows under fp16 inference on 910B devices, so after prefill all the logits become nan, and the first sampled token is always token 0 (the exclamation mark). The fix is to switch to bf16. As I recall, Ascend vLLM's FX backend supports bf16. If your...
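The nan cascade described above can be reproduced with plain numpy (a minimal sketch; the magnitudes and kernels on the actual 910B differ). float16 saturates to inf past 65504, while bfloat16 keeps float32's exponent range; once one logit is inf, the softmax subtraction produces inf - inf = nan across the whole row, and a greedy argmax over nans lands on index 0:

```python
import numpy as np

# float16 overflows past 65504 (bfloat16 would not, since it keeps
# float32's ~3.4e38 exponent range).
x = np.float16(70000.0)          # a plausible overflowing activation magnitude
assert np.isinf(x)               # saturates to inf

# One inf logit poisons the softmax for the entire row:
logits = np.array([x, 1.0], dtype=np.float16)
shifted = logits - logits.max()  # inf - inf = nan in slot 0, -inf in slot 1
probs = np.exp(shifted) / np.exp(shifted).sum()  # sum is nan, so all probs are nan

print(probs)                     # [nan, nan]
print(np.argmax(probs))          # 0 -- argmax over nans picks index 0, i.e. token 0
```

This matches the symptom in the comment: every first sampled token is token 0.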

> If you're using vLLM, JSON mode is built into vLLM itself: https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html (search that page for "json"). > > If you mean Alibaba Cloud's model service, it is still in development. Ollama already supports it; how much longer until Alibaba Cloud's models do?
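For reference, vLLM's OpenAI-compatible server accepts OpenAI's JSON-mode flag in the request body. A minimal sketch of such a request payload (the model name and prompt here are assumptions, not taken from the thread):

```python
import json

# Request body for POST /v1/chat/completions on a vLLM OpenAI-compatible server.
payload = {
    "model": "qwen2-7b-instruct",  # hypothetical served model name
    "messages": [
        {"role": "user", "content": "List three fruits as JSON under the key 'fruits'."}
    ],
    # OpenAI-style JSON mode: constrains the output to be valid JSON.
    "response_format": {"type": "json_object"},
}
print(json.dumps(payload, indent=2))
```

The same body works with any OpenAI SDK pointed at the vLLM server's base URL, since the endpoint is API-compatible.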

> Be specific about the model size you want to use. With 24 GB of VRAM you can always run a 7B model without quantization. The qwen2-7b model is weaker in code capabilities than the quantized llama3-70b-IQ2XS (22 GB), and also weaker than deepseek-coder-v2:16b-lite-instruct-q8_0 (16 GB). However, I need more...

> Would you mind sharing some cases where you found Qwen2-7B underperformed? I have a strong need for Mermaid and UML diagrams, and Qwen does not perform well on them.

> Could you check Qwen2.5 or Qwen2.5-Coder? cc: @huybery Thank you for your work! You are a true hero. I will go check this! Long live!

I asked qwen2.5 to plan my evening tasks, but it mixed up the alias order in Mermaid's `participant ... as ...` syntax. This is the output...

> sequenceDiagram
> participant 用户 as User
> participant 系统 as System
> participant 邮件服务 as MailService

It needs to be:

sequenceDiagram
participant User as 用户
participant System as 系统...

> Actually, a single 4090 is the minimum needed for fine-tuning, using LoRA; if resources are no issue, use 8x A100. Then my 3090 should be fine too, right? Can 24 GB fine-tune a 70B model?
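A back-of-envelope check answers the 70B question (a rough sketch: this counts only the frozen base weights and ignores KV cache, activations, gradients, and LoRA adapter/optimizer state, so these figures are lower bounds):

```python
def weight_vram_gib(params_billion: float, bits_per_weight: float) -> float:
    """GiB needed just to hold the model weights at a given precision."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

# 7B in fp16/bf16: ~13 GiB of weights, so it fits comfortably in 24 GB.
print(round(weight_vram_gib(7, 16), 1))   # ~13.0

# 70B even quantized to 4-bit: ~32.6 GiB of weights alone,
# already over 24 GB before any training overhead is counted.
print(round(weight_vram_gib(70, 4), 1))   # ~32.6
```

So a 24 GB card can LoRA-tune a 7B model, but a 70B model does not fit even at 4-bit; that is why the quoted advice names a single 4090 for 7B-scale LoRA but 8x A100 for larger runs.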