Daniel Han


From LinkedIn chat (Daniel Han-Chen :) )

```python
import cpuinfo

def get_gflops():
    """
    Uses https://github.com/workhorsy/py-cpuinfo to get CPU info.
    We make some assumptions on the FLOPs for each CPU. We...
    """
```
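Since that snippet is cut off, here is a minimal sketch of such an estimator (my reconstruction, not the original helper), assuming a recent `py-cpuinfo` (where `hz_advertised` is a `(hz, scale)` tuple) and an AVX2 CPU with FMA; the formula mirrors the hand calculation further down: cores * GHz * 2 (FMA) * SIMD lanes.

```python
# Minimal sketch of a peak fp32 GFLOPS estimate from py-cpuinfo output.
# Assumes AVX2 (256-bit registers, 8 fp32 lanes) and FMA (2 FLOPs/lane/cycle).
import cpuinfo

def estimate_gflops_fp32():
    info = cpuinfo.get_cpu_info()
    cores = info["count"]                 # logical cores (over-counts vs physical)
    ghz = info["hz_advertised"][0] / 1e9  # advertised clock, (hz, scale) tuple
    lanes = 8 if "avx2" in info.get("flags", []) else 4  # fp32 lanes per register
    return cores * ghz * 2 * lanes        # 2 FLOPs per lane per cycle from FMA

print(f"~{estimate_gflops_fp32():.1f} GFLOPS (rough fp32 peak)")
```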

PS I found that `psutil.cpu_freq()` doesn't work on Google Colab, but `cpuinfo` does - very weird - I'll probably report this issue to `psutil`

My bad, Google Colab was using an old psutil version
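For reference, a small sketch (not from the original threads) of the workaround this implies: check the `psutil` version, try `psutil.cpu_freq()`, and fall back to `py-cpuinfo` when it returns nothing.

```python
# Sketch: psutil.cpu_freq() can return None on some platforms/versions,
# so fall back to py-cpuinfo's advertised clock in that case.
import psutil
import cpuinfo

print("psutil", psutil.__version__)
freq = psutil.cpu_freq()
if freq is not None:
    print(f"{freq.current:.0f} MHz via psutil")
else:
    hz, scale = cpuinfo.get_cpu_info()["hz_advertised"]
    print(f"{hz * 10**scale / 1e6:.0f} MHz via cpuinfo")
```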

PS I calculated by hand the GFLOPS of the 16-core Intel(R) Xeon(R) CPU E5-2673 v3 (Haswell, AVX2): 16 cores * 2.4 GHz * 2 * 256 / 8 / 8 =...
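Reading the factors off that truncated line (my interpretation: 2 for FMA, 256-bit registers, divided by 8 bits per byte and 8 bytes per double, i.e. an fp64 peak), the product works out like so:

```python
# Working the truncated arithmetic above; the lane count implies fp64:
# 256-bit register / 8 bits-per-byte / 8 bytes-per-double = 4 lanes.
cores, ghz, fma = 16, 2.4, 2
lanes = 256 / 8 / 8
print(cores * ghz * fma * lanes)  # 307.2 -> ~307 GFLOPS fp64 peak
```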

Heyy! Ohhh oops! So we're still in the process of fixing things up - I think I might have accidentally not pushed a newer update. So currently the package is...

@gracehubai Oh I'm working on some chat completion notebooks as we speak! For now, a community member made one for Mistral: https://colab.research.google.com/drive/1bMOKOBzxQWUIGZBs_B0zm8pimuEnZdfM?usp=sharing - you can most likely copy the data...

@gracehubai No problems! Thanks to the community for the notebook :) I'll add a few in the following days to address other models (like TinyLlama) :)

@gracehubai Oh wait, sadly GGUF models don't work. Ohhh, you're referring to the chat template: https://www.reddit.com/r/LocalLLaMA/comments/19c75cp/what_magic_does_ollama_do_to_models_tinyllama/ I.e. you need to use TinyLlama Chat's `apply_chat_template` so it doesn't output gibberish...
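For context, a minimal sketch of what that looks like with Hugging Face Transformers (the message content here is a made-up placeholder):

```python
# Format a conversation with TinyLlama Chat's template so the prompt
# matches the format the model was fine-tuned on (avoids gibberish).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
messages = [{"role": "user", "content": "Hello! Who are you?"}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize = False,              # return the formatted string, not token ids
    add_generation_prompt = True,  # append the assistant turn header
)
print(prompt)
```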

@gracehubai I'm pretty sure https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/tree/main is a GGUF file, hence the error message `RuntimeError: Unsloth: TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF is not a full model or a PEFT model.` TinyLlama is...

@gracehubai Oh so I tried TinyLlama Chat - hope this helps :)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    load_in_4bit = True,
    max_seq_length = 2048,
)
messages ...
```
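The snippet is cut off at `messages`; a plausible continuation (my sketch, not the original comment) using the usual `apply_chat_template` + `generate` flow, with Unsloth's `FastLanguageModel.for_inference` enabled:

```python
# Hypothetical continuation of the truncated snippet above,
# assuming a CUDA GPU and the standard chat-template + generate flow.
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference path
messages = [{"role": "user", "content": "Hello! Who are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
).to("cuda")
outputs = model.generate(input_ids, max_new_tokens = 64)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))
```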