MagicSource

Results: 1251 comments of MagicSource

@matthoffner
```
async def stream_response(tokens, llm):
    try:
        iterator: Generator = llm.generate(tokens)
        for chat_chunk in iterator:
            print(llm.detokenize(chat_chunk), end='', flush=True)
            response = {
                'choices': [
                    {
                        'message': {
                            'role': 'system',
                            'content': llm.detokenize(chat_chunk)...
```
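For reference, a completed sketch of what that truncated snippet appears to be building, assuming a ctransformers-style `llm` that exposes `generate`/`detokenize`; the OpenAI-style payload shape here is my assumption, not the library's API:

```python
async def stream_response(tokens, llm):
    # Assumed ctransformers-style API: llm.generate(tokens) yields token ids
    # one at a time, and llm.detokenize(...) turns them back into text.
    for chat_chunk in llm.generate(tokens):
        text = llm.detokenize(chat_chunk)
        print(text, end='', flush=True)
        # Hypothetical OpenAI-style chunk; the exact schema is an assumption.
        yield {
            'choices': [
                {'message': {'role': 'system', 'content': text}}
            ]
        }
```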

BTW, I'm using the ctransformers CLI without any issue:
```
outputs = ''
for text in llm(prompt, stream=True):
    print(text, end="", flush=True)
    outputs += text
print()
```
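For context, a minimal self-contained version of that loop; the model path and `model_type` below are placeholders for whatever local model you are loading:

```python
from ctransformers import AutoModelForCausalLM

# Placeholder path and model_type; point these at your own model file.
llm = AutoModelForCausalLM.from_pretrained('path/to/model.bin', model_type='llama')

prompt = 'Hello, world'
outputs = ''
for text in llm(prompt, stream=True):
    print(text, end='', flush=True)
    outputs += text
print()
```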

Does VILA randomly sample frames and send them to the ViT? Or does it use all 631 frames directly for training?

@XueFuzhao is it evenly resampling 8 frames out of the 631 in the above example? How are the multiple images fed into S2-SigLIP? Thanks for the pointers.
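If it is even resampling, something like the following would pick 8 evenly spaced frames out of 631 (a sketch of the idea only; `sample_frames_evenly` is a hypothetical name, not VILA's code):

```python
import numpy as np

def sample_frames_evenly(num_frames: int, num_samples: int = 8) -> np.ndarray:
    # Evenly spaced frame indices across the whole clip, endpoints included.
    return np.linspace(0, num_frames - 1, num_samples).round().astype(int)

print(sample_frames_evenly(631))  # [  0  90 180 270 360 450 540 630]
```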

Hi, it looks like the comparison is on a segmentation task; was a LLaVA task compared as well? Also, what's the most effective way to reduce the final token count when using S2?

I saw the code just has an mlp_downsample, and the ViT outputs don't change. Is the avg pooling you mentioned the mlp_downsample? Specifically, is `flat_square` the avg_pool you referred to?
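For illustration, 2x2 average pooling over the ViT token grid is one way to cut the final token count by 4x. This is only a sketch of the avg-pool idea being discussed, not the actual `mlp_downsample`/`flat_square` implementation:

```python
import torch
import torch.nn.functional as F

def avg_pool_tokens(x: torch.Tensor, grid: int) -> torch.Tensor:
    # x: [B, N, C] ViT tokens, where N == grid * grid.
    B, N, C = x.shape
    x = x.transpose(1, 2).reshape(B, C, grid, grid)  # tokens -> [B, C, H, W]
    x = F.avg_pool2d(x, kernel_size=2, stride=2)     # 2x2 pooling: 4x fewer tokens
    return x.flatten(2).transpose(1, 2)              # back to [B, N/4, C]

tokens = torch.randn(1, 24 * 24, 1024)               # e.g. a 24x24 token grid
print(avg_pool_tokens(tokens, 24).shape)             # torch.Size([1, 144, 1024])
```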

@bfshi Hi, does that mean that in S2, if the input scales are [1x, 2x, 3x], then just the 2x and 3x features are interpolated back to the 1x size to get a normal output size?...
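My reading of the question, as a sketch (not the actual S2 code): the larger-scale feature maps are interpolated down to the 1x spatial size and concatenated along channels, so the token count stays at the 1x size while the channel dimension grows:

```python
import torch
import torch.nn.functional as F

def merge_s2_features(feats):
    # feats: per-scale feature maps [B, C, H_i, W_i] for e.g. 1x, 2x, 3x inputs.
    base_hw = feats[0].shape[-2:]
    resized = [feats[0]] + [F.interpolate(f, size=base_hw, mode='area')
                            for f in feats[1:]]
    # Channel-wise concat keeps the spatial (token) count at the 1x size.
    return torch.cat(resized, dim=1)

f1 = torch.randn(1, 1024, 24, 24)  # 1x
f2 = torch.randn(1, 1024, 48, 48)  # 2x
f3 = torch.randn(1, 1024, 72, 72)  # 3x
print(merge_s2_features([f1, f2, f3]).shape)  # torch.Size([1, 3072, 24, 24])
```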

@DuGuYifei Mediapipe + Metahuman is exactly I want. Looking forward to your work!

Nice idea, but you could also consider using ORT or Eigen as different inference backends sharing the same data input, e.g. a custom Tensor (the tensor itself doesn't need to be especially complex); that way you get decoupling.
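As a Python sketch of that decoupling idea (the original suggestion targets C++ with ORT/Eigen, and all names here are hypothetical): backends implement one interface and share a deliberately simple Tensor type, so swapping the backend doesn't touch the calling code:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

import numpy as np

@dataclass
class Tensor:
    # The shared, deliberately simple tensor: just the raw array
    # (shape and dtype ride along on the ndarray itself).
    data: np.ndarray

class Backend(ABC):
    @abstractmethod
    def run(self, inputs: dict) -> dict:
        """Map input name -> Tensor to output name -> Tensor."""

class OrtBackend(Backend):
    # Wraps onnxruntime behind the common interface.
    def __init__(self, model_path: str):
        import onnxruntime as ort  # imported only if this backend is chosen
        self.session = ort.InferenceSession(model_path)

    def run(self, inputs: dict) -> dict:
        feeds = {name: t.data for name, t in inputs.items()}
        names = [o.name for o in self.session.get_outputs()]
        outs = self.session.run(names, feeds)
        return {n: Tensor(o) for n, o in zip(names, outs)}
```

A second backend (say, an Eigen-backed native extension) would implement the same `run` contract, and callers would only ever see `Backend` and `Tensor`.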