Dmitry Chichkov
Thank you! For SEEM/X-Decoder, I see the checkpoint, but I can't seem to find the config in the [repository](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once/tree/main/demo_code). Should this be something like xdecoder_focall_lang.yaml? Sure, if...
I've tried converting it to ONNX and TorchScript (to deploy it on Triton/TensorRT) and ran into quite a few issues. It seems other people have had trouble with the ONNX export too: https://github.com/microsoft/onnxruntime/issues/12594
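For reference, a minimal sketch of the kind of export call I was attempting; the stand-in module, file name, and input shapes below are placeholders, not the actual SEEM/X-Decoder entry point:

```python
import torch
import torch.nn as nn

# Minimal stand-in module just to show the export call; the real
# SEEM/X-Decoder model and its input signature are more involved.
class DummySegmenter(nn.Module):
    def forward(self, image):
        return image.mean(dim=1, keepdim=True)

model = DummySegmenter().eval()
dummy_image = torch.randn(1, 3, 512, 512)

torch.onnx.export(
    model,
    (dummy_image,),
    "segmenter.onnx",
    opset_version=17,
    input_names=["image"],
    output_names=["mask"],
    dynamic_axes={"image": {0: "batch", 2: "height", 3: "width"}},
)
```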
It has been quite some time since 2016; it would be great for this to get higher priority. To give some context, I'm on a 2TB drive and have to keep fighting with...
Hi @nataliaElv & @dvsrepo. An example could be https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K. The format there (this is a single multi-turn conversation) is: `[ { "id": "000000033471", "image": "000000033471.jpg", "conversations": [ { "from":...
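To illustrate, here is roughly how a single entry is laid out; the conversation values below are placeholders I'm using to show the structure, not actual dataset content:

```python
# Illustrative reconstruction of one LLaVA-Instruct-150K entry; the
# question/answer strings are placeholders, not real dataset content.
example_entry = {
    "id": "000000033471",
    "image": "000000033471.jpg",
    "conversations": [
        {"from": "human", "value": "<image>\nFirst question about the image..."},
        {"from": "gpt", "value": "Assistant answer to the first question..."},
        {"from": "human", "value": "A follow-up question..."},
        {"from": "gpt", "value": "Assistant answer to the follow-up..."},
    ],
}
```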
In terms of the number of conversations, turns, images, participants, and participant names, it varies for every conversation. To put some rough numbers on it:
- Conversations: 100k
- Conversation turns: 1...
Hi @Harsha-Nori. Can you tell us whether this is effectively abandoned, and if so, what it was abandoned in favor of? Is there another library that can do it...
I've implemented a workaround that enables image support when using vLLM/OpenAI inference - https://github.com/guidance-ai/guidance/issues/1077
If this helps, I've implemented a hook so the URLs get expanded, as a workaround for vLLM / OpenAI-hosted VLMs:
```python
import json

def hook(request):
    j = json.loads(request.content)

    def process_content(input_str):
        # ...
```
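For context, a rough self-contained sketch of what such a hook can do (the helper name and the assumption that requests use the OpenAI chat format with image_url content parts are mine, not code from the issue): walk the outgoing request, find image_url entries that point at local files, and inline them as base64 data URIs.

```python
import base64
import json
import mimetypes
from pathlib import Path

def expand_image_urls(request_body: bytes) -> bytes:
    """Inline local image paths in an OpenAI-style chat request as data URIs."""
    payload = json.loads(request_body)
    for message in payload.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for part in content:
            if part.get("type") != "image_url":
                continue
            url = part["image_url"]["url"]
            path = Path(url)
            if path.is_file():  # leave http(s) and data: URLs untouched
                mime = mimetypes.guess_type(path.name)[0] or "image/png"
                data = base64.b64encode(path.read_bytes()).decode("ascii")
                part["image_url"]["url"] = f"data:{mime};base64,{data}"
    return json.dumps(payload).encode("utf-8")
```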
I expect `str(lm)` to produce a valid ChatML / Markdown string, as it did in the text-only case. It seems that ChatML was designed to work well with Markdown...
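To make the expectation concrete, something along these lines; this is my own sketch of the ChatML layout, not guidance's actual output:

```python
# Roughly what I'd expect str(lm) to render to; hand-written ChatML sketch,
# not the library's actual output.
expected = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Describe the attached image.<|im_end|>\n"
    "<|im_start|>assistant\n"
    "The image shows ...<|im_end|>\n"
)
```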
Yes, this seems similar to what I've observed with 15B / 24.07. I've noticed that locally, checkpointing a 15B model with TP4 (checkpoint size of 205GB) reserves 70GB of process memory and 290GB...