Dmitry Chichkov
Thank you! For SEEM/X-Decoder, I see the checkpoint, but I can't seem to find the config in the [repository](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once/tree/main/demo_code). Should this be something like xdecoder_focall_lang.yaml? Sure, if...
I've tried converting it to ONNX and TorchScript (to deploy it on Triton/TensorRT) and ran into quite a few issues. It seems other people have had trouble with the ONNX export too: https://github.com/microsoft/onnxruntime/issues/12594
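For reference, a minimal sketch of the kind of export call I was attempting; the stand-in module, file name, and input shapes below are placeholders, not the actual SEEM/X-Decoder entry point:

```python
import torch
import torch.nn as nn

# Minimal stand-in module just to show the export call; the real
# SEEM/X-Decoder model and its input signature are more involved.
class DummySegmenter(nn.Module):
    def forward(self, image):
        return image.mean(dim=1, keepdim=True)

model = DummySegmenter().eval()
dummy_image = torch.randn(1, 3, 512, 512)

torch.onnx.export(
    model,
    (dummy_image,),
    "segmenter.onnx",
    opset_version=17,
    input_names=["image"],
    output_names=["mask"],
    dynamic_axes={"image": {0: "batch", 2: "height", 3: "width"}},
)
```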
It has been quite some time since 2016; it would be great for this to get higher priority. To give some context, I'm on a 2TB drive and have to keep fighting with...
Hi @nataliaElv & @dvsrepo. An example could be https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K. The format there (this is a single multi-turn conversation) is: `[ { "id": "000000033471", "image": "000000033471.jpg", "conversations": [ { "from":...
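To illustrate, here is roughly how a single entry is laid out; the conversation values below are placeholders I'm using to show the structure, not actual dataset content:

```python
# Illustrative reconstruction of one LLaVA-Instruct-150K entry; the
# question/answer strings are placeholders, not real dataset content.
example_entry = {
    "id": "000000033471",
    "image": "000000033471.jpg",
    "conversations": [
        {"from": "human", "value": "<image>\nFirst question about the image..."},
        {"from": "gpt", "value": "Assistant answer to the first question..."},
        {"from": "human", "value": "A follow-up question..."},
        {"from": "gpt", "value": "Assistant answer to the follow-up..."},
    ],
}
```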
In terms of the number of conversations, turns, images, participants, and participant names, it varies for every conversation. To put some rough numbers on it:
- Conversations: 100k
- Conversation turns: 1...
Hi @Harsha-Nori. Can you tell us whether this is effectively abandoned, and if so, what it was abandoned in favor of? Is there another library that can do it...
I've implemented a workaround that enables image support when using vLLM/OpenAI inference - https://github.com/guidance-ai/guidance/issues/1077
If this helps, I've implemented a hook so the URLs get expanded, as a workaround for vLLM / OpenAI-hosted VLMs:
```python
import json

def hook(request):
    j = json.loads(request.content)

    def process_content(input_str):
        # ...
```
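For context, a rough self-contained sketch of what such a hook can do (the helper name and the assumption that requests use the OpenAI chat format with image_url content parts are mine, not code from the issue): walk the outgoing request, find image_url entries that point at local files, and inline them as base64 data URIs.

```python
import base64
import json
import mimetypes
from pathlib import Path

def expand_image_urls(request_body: bytes) -> bytes:
    """Inline local image paths in an OpenAI-style chat request as data URIs."""
    payload = json.loads(request_body)
    for message in payload.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for part in content:
            if part.get("type") != "image_url":
                continue
            url = part["image_url"]["url"]
            path = Path(url)
            if path.is_file():  # leave http(s) and data: URLs untouched
                mime = mimetypes.guess_type(path.name)[0] or "image/png"
                data = base64.b64encode(path.read_bytes()).decode("ascii")
                part["image_url"]["url"] = f"data:{mime};base64,{data}"
    return json.dumps(payload).encode("utf-8")
```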
I expect `str(lm)` to produce a valid ChatML / Markdown string, as it did in the text-only case. It seems that ChatML was designed to work well with Markdown...
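To make the expectation concrete, something along these lines; this is my own sketch of the ChatML layout, not guidance's actual output:

```python
# Roughly what I'd expect str(lm) to render to; hand-written ChatML sketch,
# not the library's actual output.
expected = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Describe the attached image.<|im_end|>\n"
    "<|im_start|>assistant\n"
    "The image shows ...<|im_end|>\n"
)
```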
Yes, this seems similar to what I've observed with 15B / 24.07. I've noticed that locally, checkpointing a 15B model with TP4 (checkpoint size of 205GB) reserves 70GB of process memory and 290GB...