
Grok-1.5V code release

Open fabiopoiesi opened this issue 10 months ago • 4 comments

Hi, when are you planning to release the source code of Grok-1.5V?

Thanks

fabiopoiesi avatar Apr 14 '24 06:04 fabiopoiesi

Does it matter? The amount of GPU power required to test the code for Grok is enormous. Even if the code is released (just like Grok-1), there is no way you can test it on your local PC. I believe you need a subscription on X to test it.

pattang56892 avatar May 17 '24 02:05 pattang56892

It does matter: I want to learn how the vision and language components are made to communicate.

fabiopoiesi avatar May 17 '24 03:05 fabiopoiesi

I’ve encountered some challenges while working with Grok. After downloading the weights (approximately 300 GB) and setting the project up in my IDE, my PC froze as soon as I ran the run.py script. Upon inspecting the code, it appears that this LLM requires a platform with at least 8 GPUs (Linux/Unix). Given these requirements, it seems impractical for my current setup.
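For context, a back-of-the-envelope sketch of why the checkpoint overwhelms a single machine. The 314B parameter count is Grok-1's published figure; the ~1 byte per parameter (consistent with the ~300 GB download) and the 80 GB per-GPU figure are assumptions, not numbers from the repo:

```python
# Rough arithmetic for why Grok-1 cannot run on a desktop PC.
# Assumptions: ~1 byte per parameter (8-bit weights, matching the
# ~300 GB checkpoint) and 80 GB of memory per datacenter-class GPU.
PARAMS = 314e9           # Grok-1's published parameter count
BYTES_PER_PARAM = 1      # assumption: 8-bit weights
GPU_MEMORY_GB = 80       # assumption: one A100/H100-class card

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
gpus_for_weights = -(-weights_gb // GPU_MEMORY_GB)  # ceiling division
print(f"~{weights_gb:.0f} GB of weights -> at least "
      f"{gpus_for_weights:.0f} x {GPU_MEMORY_GB} GB GPUs, "
      f"before activations or KV cache")
```

Weights alone would fill about four 80 GB cards; run.py shards the model across 8 devices, which leaves headroom for activations and parallel execution.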

However, I can see how this can be achieved with Ollama. Because Ollama runs models on local hardware, it is possible to build a GUI over various Llama models. This provides a user-friendly frontend, lets users interact with different models, and serves as an effective learning platform. This, in my opinion, is fantastic.

Given these constraints, however, how would you learn about the communication between the GUI and the backend in Grok, when it cannot be run locally due to its high GPU requirements? Can you provide more details?
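On the GUI-to-backend question: with Ollama the connection is plain JSON over HTTP. A minimal sketch, assuming a server started with `ollama serve` on Ollama's default port 11434; the model name `llama3` is only an example:

```python
# A GUI frontend talks to an Ollama backend via its local HTTP API:
# the frontend POSTs JSON to /api/generate and reads JSON back.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> bytes:
    """Serialize the JSON body Ollama's /api/generate endpoint expects."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")

def ask(model: str, prompt: str) -> str:
    """Send the prompt to a locally running Ollama server."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# The GUI only needs to swap the "model" field to switch models.
payload = json.loads(build_request("llama3", "Why is the sky blue?"))
print(payload["model"])
```

The same pattern applies to any local backend: the frontend never touches the weights, it only exchanges small JSON messages with the serving process, which is why it is a good place to study the GUI/backend connection even without Grok-scale hardware.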

pattang56892 avatar May 17 '24 13:05 pattang56892

I'm asking about 1.5V because I'm interested in the multimodal model: vision + language

fabiopoiesi avatar May 22 '24 19:05 fabiopoiesi