DiegoD94

Results 14 comments of DiegoD94

@Spandan-Madan Hi, this is DhDiego. I have changed my account to this one, and the above account is no longer active. What's more, I have already decrypted the WeChat chat history database and the...

@Spandan-Madan I have already submitted a pull request for the WeChat extension (English only). I am now working on constructing a Chinese version with fastText or GloVe.

@Spandan-Madan That's great. I'm working on my final projects and have another final exam on 12.20; I'll work on the Chinese version after that. I'll try to introduce this fun project to...

@Spandan-Madan That sounds great. You can reach me via [email protected] and we can do it together. Sorry for the late response.

> This is cool, good find! Do you have any idea how much cloning + load-balancing is helping vs. not cloning the expert and instead just running an unbalanced fused-moe?...

> > > This is cool, good find! Do you have any idea how much cloning + load-balancing is helping vs. not cloning the expert and instead just running an...

@LucasWilkinson @tlrmchlsmth @robertgshaw2-redhat Hi all, I have pushed a second revision that makes the replica configurable. One can now set VLLM_ENABLE_SHARED_FUSION to 0 to disable the feature and 1, 2, 3, 4, 5, 6, 7, 8, or...
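
For readers unfamiliar with this kind of switch, here is a minimal sketch of how an integer environment variable like this could be read, assuming only what the comment states (0 disables the feature, positive integers select a setting). The helper name `read_shared_fusion_setting`, the default of `0` when the variable is unset, and the validation rules are hypothetical and not taken from the PR.

```python
import os


def read_shared_fusion_setting(env=None):
    """Return VLLM_ENABLE_SHARED_FUSION as an int; 0 means the feature is off.

    Hypothetical helper for illustration only. The default of "0" when the
    variable is unset and the rejection of negative values are assumptions.
    """
    env = os.environ if env is None else env
    raw = env.get("VLLM_ENABLE_SHARED_FUSION", "0")
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(
            f"VLLM_ENABLE_SHARED_FUSION must be an integer, got {raw!r}"
        ) from None
    if value < 0:
        raise ValueError("VLLM_ENABLE_SHARED_FUSION must not be negative")
    return value


if __name__ == "__main__":
    setting = read_shared_fusion_setting()
    print("disabled" if setting == 0 else f"enabled with value {setting}")
```

Passing a plain dict instead of `os.environ` keeps the helper easy to check without touching the real process environment.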

Added a new commit to address the current comments. Let me know if there are any potential blockers for merging this in, thanks! @LucasWilkinson @tlrmchlsmth @maobaolong @robertgshaw2-redhat

> Curious what the performance is like without kernel config tuning? This is an internal experiment that I benchmarked on vLLM 0.7.2, with and without the tuned config. I think we...

Rebased to catch up with mainline.