Model only outputs G repeatedly in interactive mode with ggml-model-i2_s.gguf
When I run run_inference.py in interactive mode using the provided ggml-model-i2_s.gguf from Hugging Face, the model only outputs the character G in a loop, no matter what prompt I use.
Command used
python run_inference.py -m models/BitNet-b1.58-2B-4T/ggml-model-i2_s.gguf -p "You are a helpful assistant" -cnv
System info
Ubuntu 22.04
Python 3.9 (Conda env)
CPU only (no AVX support)
Is this expected with the i2_s quantized model? Could this be a tokenizer issue or metadata mismatch?
Thanks for your help!
I'm facing the exact same issue on a system with the same specifications.
I believe that VirtualBox does not expose AVX or AVX2 instructions to the guest machine, even though the host CPU supports them (in my case: Ryzen 7 7730U).
I checked with the command lscpu | grep avx. It returned no output in the VM, which confirms the absence of AVX support.
However, I tested under WSL2 (Windows Subsystem for Linux), and the AVX/AVX2 instructions are properly detected. I was able to successfully run BitNet.
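For anyone who wants to script the lscpu | grep avx check, here is a minimal Python sketch. It assumes an x86 Linux system where /proc/cpuinfo has a "flags" line. The helper names parse_cpu_flags and simd_support are made up for illustration and are not part of BitNet. Unlike grep, it matches whole flag names, so avx and avx2 are reported separately.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" line is enough; all cores normally report the same set.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_support(flags, wanted=("avx", "avx2", "avx512f")):
    """Map each wanted flag name to whether it is present."""
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    # Fall back to empty text on systems without /proc/cpuinfo.
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    for name, present in simd_support(parse_cpu_flags(text)).items():
        print(f"{name}: {'yes' if present else 'no'}")
```

If every entry prints "no" inside a VM while the host CPU supports AVX, that matches the VirtualBox behaviour described above.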
I'm running the model directly on an Ubuntu server. The command lscpu | grep avx shows avx in its output, but the model still only outputs G repeatedly. What else could I check?
PS: The run command logs this:
llm_load_vocab: missing pre-tokenizer type, using: 'default'
llm_load_vocab:
llm_load_vocab: ************************************
llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!
llm_load_vocab: CONSIDER REGENERATING THE MODEL
llm_load_vocab: ************************************
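Warnings like these are easy to miss in a long loader log, so here is a rough sketch of a filter that picks them out of captured output. The marker strings are copied from logs quoted in this thread (the special_eos_id one appears in a log further down). The function name find_loader_warnings is hypothetical, and the marker list is not a complete catalogue of loader warnings.

```python
# Substrings copied from loader logs quoted in this thread; extend as needed.
WARNING_MARKERS = (
    "missing pre-tokenizer type",
    "GENERATION QUALITY WILL BE DEGRADED",
    "special_eos_id is not in special_eog_ids",
)


def find_loader_warnings(log_text, markers=WARNING_MARKERS):
    """Return (line_number, line) pairs for log lines containing a marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample_log = (
        "llm_load_vocab: missing pre-tokenizer type, using: 'default'\n"
        "llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!\n"
        "main: llama backend init\n"
    )
    for number, line in find_loader_warnings(sample_log):
        print(f"line {number}: {line}")
```

To use it on a real run, save the combined stdout and stderr of run_inference.py to a file and pass the file's contents to find_loader_warnings.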
Per https://github.com/microsoft/BitNet: did you install the required dependencies (Python >= 3.9, CMake >= 3.22, Clang >= 18, and Conda) and then follow the installation steps as described?
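A quick way to check those minimums is a small script like the sketch below. It assumes cmake and clang are on PATH and print their version on standard output when called with --version. The helper names are made up for illustration, and it does not check Conda or which compiler the build actually used.

```python
import re
import shutil
import subprocess
import sys

# Minimum versions quoted in the comment above.
MINIMUMS = {"cmake": (3, 22), "clang": (18, 0)}


def parse_version(text):
    """Extract the first 'X.Y' style version in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def meets_minimum(version_output, minimum):
    """True if a version is found in the output and is at least minimum."""
    version = parse_version(version_output)
    return version is not None and version >= minimum


def check_tool(name, minimum):
    """Run '<name> --version' if the tool is on PATH and compare with minimum."""
    path = shutil.which(name)
    if path is None:
        return f"{name}: not found on PATH"
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    ok = meets_minimum(result.stdout, minimum)
    return f"{name}: {'ok' if ok else 'too old or unrecognized'}"


if __name__ == "__main__":
    print(f"python: {'ok' if sys.version_info >= (3, 9) else 'too old'}")
    for tool, minimum in MINIMUMS.items():
        print(check_tool(tool, minimum))
```

Run it inside the same Conda environment used for the build, since PATH may differ between shells.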
I'm running BitNet on Ubuntu 24 with Python 3.9 and facing the same problem. Here's the inference log:
$ python3.9 run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Implement a python function to get the factorial of a number." -t 12 -n 900
warning: not compiled with GPU offload support, --gpu-layers option will be ignored
warning: see main README.md for information on enabling GPU BLAS support
build: 3957 (5eb47b72) with cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 for x86_64-linux-gnu
main: llama backend init
main: load the model and apply lora adapter, if any
llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = llama
llama_model_loader: - kv 1: general.name str = Llama3-8B-1.58-100B-tokens
llama_model_loader: - kv 2: llama.block_count u32 = 32
llama_model_loader: - kv 3: llama.context_length u32 = 8192
llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 10: general.file_type u32 = 40
llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 17: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv 18: tokenizer.ggml.bos_token_id u32 = 128000
llama_model_loader: - kv 19: tokenizer.ggml.eos_token_id u32 = 128009
llama_model_loader: - kv 20: tokenizer.chat_template str = {% set loop_messages = messages %}{% ...
llama_model_loader: - kv 21: general.quantization_version u32 = 2
llama_model_loader: - type f32: 65 tensors
llama_model_loader: - type f16: 2 tensors
llama_model_loader: - type i2_s: 224 tensors
llm_load_vocab: control token: 128253 '<|begin_of_text|><|reserved_special_token_248|>' is not marked as EOG
llm_load_vocab: control token: 128252 '<|begin_of_text|><|reserved_special_token_247|>' is not marked as EOG
llm_load_vocab: control token: 128251 '<|begin_of_text|><|reserved_special_token_246|>' is not marked as EOG
llm_load_vocab: control token: 128246 '<|begin_of_text|><|reserved_special_token_241|>' is not marked as EOG
llm_load_vocab: control token: 128245 '<|begin_of_text|><|reserved_special_token_240|>' is not marked as EOG
llm_load_vocab: control token: 128244 '<|begin_of_text|><|reserved_special_token_239|>' is not marked as EOG
llm_load_vocab: control token: 128243 '<|begin_of_text|><|reserved_special_token_238|>' is not marked as EOG
llm_load_vocab: control token: 128242 '<|begin_of_text|><|reserved_special_token_237|>' is not marked as EOG
llm_load_vocab: control token: 128241 '<|begin_of_text|><|reserved_special_token_236|>' is not marked as EOG
llm_load_vocab: control token: 128240 '<|begin_of_text|><|reserved_special_token_235|>' is not marked as EOG
llm_load_vocab: control token: 128237 '<|begin_of_text|><|reserved_special_token_232|>' is not marked as EOG
llm_load_vocab: control token: 128236 '<|begin_of_text|><|reserved_special_token_231|>' is not marked as EOG
llm_load_vocab: control token: 128235 '<|begin_of_text|><|reserved_special_token_230|>' is not marked as EOG
llm_load_vocab: control token: 128234 '<|begin_of_text|><|reserved_special_token_229|>' is not marked as EOG
llm_load_vocab: control token: 128231 '<|begin_of_text|><|reserved_special_token_226|>' is not marked as EOG
llm_load_vocab: control token: 128230 '<|begin_of_text|><|reserved_special_token_225|>' is not marked as EOG
llm_load_vocab: control token: 128228 '<|begin_of_text|><|reserved_special_token_223|>' is not marked as EOG
llm_load_vocab: control token: 128226 '<|begin_of_text|><|reserved_special_token_221|>' is not marked as EOG
llm_load_vocab: control token: 128225 '<|begin_of_text|><|reserved_special_token_220|>' is not marked as EOG
llm_load_vocab: control token: 128221 '<|begin_of_text|><|reserved_special_token_216|>' is not marked as EOG
llm_load_vocab: control token: 128219 '<|begin_of_text|><|reserved_special_token_214|>' is not marked as EOG
llm_load_vocab: control token: 128218 '<|begin_of_text|><|reserved_special_token_213|>' is not marked as EOG
llm_load_vocab: control token: 128215 '<|begin_of_text|><|reserved_special_token_210|>' is not marked as EOG
llm_load_vocab: control token: 128214 '<|begin_of_text|><|reserved_special_token_209|>' is not marked as EOG
llm_load_vocab: control token: 128211 '<|begin_of_text|><|reserved_special_token_206|>' is not marked as EOG
llm_load_vocab: control token: 128209 '<|begin_of_text|><|reserved_special_token_204|>' is not marked as EOG
llm_load_vocab: control token: 128205 '<|begin_of_text|><|reserved_special_token_200|>' is not marked as EOG
llm_load_vocab: control token: 128204 '<|begin_of_text|><|reserved_special_token_199|>' is not marked as EOG
llm_load_vocab: control token: 128202 '<|begin_of_text|><|reserved_special_token_197|>' is not marked as EOG
llm_load_vocab: control token: 128200 '<|begin_of_text|><|reserved_special_token_195|>' is not marked as EOG
llm_load_vocab: control token: 128196 '<|begin_of_text|><|reserved_special_token_191|>' is not marked as EOG
llm_load_vocab: control token: 128195 '<|begin_of_text|><|reserved_special_token_190|>' is not marked as EOG
llm_load_vocab: control token: 128194 '<|begin_of_text|><|reserved_special_token_189|>' is not marked as EOG
llm_load_vocab: control token: 128193 '<|begin_of_text|><|reserved_special_token_188|>' is not marked as EOG
llm_load_vocab: control token: 128191 '<|begin_of_text|><|reserved_special_token_186|>' is not marked as EOG
llm_load_vocab: control token: 128189 '<|begin_of_text|><|reserved_special_token_184|>' is not marked as EOG
llm_load_vocab: control token: 128188 '<|begin_of_text|><|reserved_special_token_183|>' is not marked as EOG
llm_load_vocab: control token: 128187 '<|begin_of_text|><|reserved_special_token_182|>' is not marked as EOG
llm_load_vocab: control token: 128185 '<|begin_of_text|><|reserved_special_token_180|>' is not marked as EOG
llm_load_vocab: control token: 128184 '<|begin_of_text|><|reserved_special_token_179|>' is not marked as EOG
llm_load_vocab: control token: 128183 '<|begin_of_text|><|reserved_special_token_178|>' is not marked as EOG
llm_load_vocab: control token: 128181 '<|begin_of_text|><|reserved_special_token_176|>' is not marked as EOG
llm_load_vocab: control token: 128175 '<|begin_of_text|><|reserved_special_token_170|>' is not marked as EOG
llm_load_vocab: control token: 128172 '<|begin_of_text|><|reserved_special_token_167|>' is not marked as EOG
llm_load_vocab: control token: 128171 '<|begin_of_text|><|reserved_special_token_166|>' is not marked as EOG
llm_load_vocab: control token: 128170 '<|begin_of_text|><|reserved_special_token_165|>' is not marked as EOG
llm_load_vocab: control token: 128162 '<|begin_of_text|><|reserved_special_token_157|>' is not marked as EOG
llm_load_vocab: control token: 128160 '<|begin_of_text|><|reserved_special_token_155|>' is not marked as EOG
llm_load_vocab: control token: 128156 '<|begin_of_text|><|reserved_special_token_151|>' is not marked as EOG
llm_load_vocab: control token: 128155 '<|begin_of_text|><|reserved_special_token_150|>' is not marked as EOG
llm_load_vocab: control token: 128153 '<|begin_of_text|><|reserved_special_token_148|>' is not marked as EOG
llm_load_vocab: control token: 128152 '<|begin_of_text|><|reserved_special_token_147|>' is not marked as EOG
llm_load_vocab: control token: 128150 '<|begin_of_text|><|reserved_special_token_145|>' is not marked as EOG
llm_load_vocab: control token: 128148 '<|begin_of_text|><|reserved_special_token_143|>' is not marked as EOG
llm_load_vocab: control token: 128146 '<|begin_of_text|><|reserved_special_token_141|>' is not marked as EOG
llm_load_vocab: control token: 128145 '<|begin_of_text|><|reserved_special_token_140|>' is not marked as EOG
llm_load_vocab: control token: 128143 '<|begin_of_text|><|reserved_special_token_138|>' is not marked as EOG
llm_load_vocab: control token: 128141 '<|begin_of_text|><|reserved_special_token_136|>' is not marked as EOG
llm_load_vocab: control token: 128140 '<|begin_of_text|><|reserved_special_token_135|>' is not marked as EOG
llm_load_vocab: control token: 128139 '<|begin_of_text|><|reserved_special_token_134|>' is not marked as EOG
llm_load_vocab: control token: 128135 '<|begin_of_text|><|reserved_special_token_130|>' is not marked as EOG
llm_load_vocab: control token: 128133 '<|begin_of_text|><|reserved_special_token_128|>' is not marked as EOG
llm_load_vocab: control token: 128132 '<|begin_of_text|><|reserved_special_token_127|>' is not marked as EOG
llm_load_vocab: control token: 128129 '<|begin_of_text|><|reserved_special_token_124|>' is not marked as EOG
llm_load_vocab: control token: 128127 '<|begin_of_text|><|reserved_special_token_122|>' is not marked as EOG
llm_load_vocab: control token: 128126 '<|begin_of_text|><|reserved_special_token_121|>' is not marked as EOG
llm_load_vocab: control token: 128120 '<|begin_of_text|><|reserved_special_token_115|>' is not marked as EOG
llm_load_vocab: control token: 128118 '<|begin_of_text|><|reserved_special_token_113|>' is not marked as EOG
llm_load_vocab: control token: 128115 '<|begin_of_text|><|reserved_special_token_110|>' is not marked as EOG
llm_load_vocab: control token: 128113 '<|begin_of_text|><|reserved_special_token_108|>' is not marked as EOG
llm_load_vocab: control token: 128111 '<|begin_of_text|><|reserved_special_token_106|>' is not marked as EOG
llm_load_vocab: control token: 128110 '<|begin_of_text|><|reserved_special_token_105|>' is not marked as EOG
llm_load_vocab: control token: 128108 '<|begin_of_text|><|reserved_special_token_103|>' is not marked as EOG
llm_load_vocab: control token: 128107 '<|begin_of_text|><|reserved_special_token_102|>' is not marked as EOG
llm_load_vocab: control token: 128104 '<|begin_of_text|><|reserved_special_token_99|>' is not marked as EOG
llm_load_vocab: control token: 128103 '<|begin_of_text|><|reserved_special_token_98|>' is not marked as EOG
llm_load_vocab: control token: 128102 '<|begin_of_text|><|reserved_special_token_97|>' is not marked as EOG
llm_load_vocab: control token: 128101 '<|begin_of_text|><|reserved_special_token_96|>' is not marked as EOG
llm_load_vocab: control token: 128100 '<|begin_of_text|><|reserved_special_token_95|>' is not marked as EOG
llm_load_vocab: control token: 128099 '<|begin_of_text|><|reserved_special_token_94|>' is not marked as EOG
llm_load_vocab: control token: 128097 '<|begin_of_text|><|reserved_special_token_92|>' is not marked as EOG
llm_load_vocab: control token: 128096 '<|begin_of_text|><|reserved_special_token_91|>' is not marked as EOG
llm_load_vocab: control token: 128094 '<|begin_of_text|><|reserved_special_token_89|>' is not marked as EOG
llm_load_vocab: control token: 128093 '<|begin_of_text|><|reserved_special_token_88|>' is not marked as EOG
llm_load_vocab: control token: 128090 '<|begin_of_text|><|reserved_special_token_85|>' is not marked as EOG
llm_load_vocab: control token: 128086 '<|begin_of_text|><|reserved_special_token_81|>' is not marked as EOG
llm_load_vocab: control token: 128085 '<|begin_of_text|><|reserved_special_token_80|>' is not marked as EOG
llm_load_vocab: control token: 128082 '<|begin_of_text|><|reserved_special_token_77|>' is not marked as EOG
llm_load_vocab: control token: 128080 '<|begin_of_text|><|reserved_special_token_75|>' is not marked as EOG
llm_load_vocab: control token: 128077 '<|begin_of_text|><|reserved_special_token_72|>' is not marked as EOG
llm_load_vocab: control token: 128076 '<|begin_of_text|><|reserved_special_token_71|>' is not marked as EOG
llm_load_vocab: control token: 128072 '<|begin_of_text|><|reserved_special_token_67|>' is not marked as EOG
llm_load_vocab: control token: 128069 '<|begin_of_text|><|reserved_special_token_64|>' is not marked as EOG
llm_load_vocab: control token: 128068 '<|begin_of_text|><|reserved_special_token_63|>' is not marked as EOG
llm_load_vocab: control token: 128067 '<|begin_of_text|><|reserved_special_token_62|>' is not marked as EOG
llm_load_vocab: control token: 128063 '<|begin_of_text|><|reserved_special_token_58|>' is not marked as EOG
llm_load_vocab: control token: 128059 '<|begin_of_text|><|reserved_special_token_54|>' is not marked as EOG
llm_load_vocab: control token: 128058 '<|begin_of_text|><|reserved_special_token_53|>' is not marked as EOG
llm_load_vocab: control token: 128056 '<|begin_of_text|><|reserved_special_token_51|>' is not marked as EOG
llm_load_vocab: control token: 128054 '<|begin_of_text|><|reserved_special_token_49|>' is not marked as EOG
llm_load_vocab: control token: 128053 '<|begin_of_text|><|reserved_special_token_48|>' is not marked as EOG
llm_load_vocab: control token: 128051 '<|begin_of_text|><|reserved_special_token_46|>' is not marked as EOG
llm_load_vocab: control token: 128050 '<|begin_of_text|><|reserved_special_token_45|>' is not marked as EOG
llm_load_vocab: control token: 128046 '<|begin_of_text|><|reserved_special_token_41|>' is not marked as EOG
llm_load_vocab: control token: 128044 '<|begin_of_text|><|reserved_special_token_39|>' is not marked as EOG
llm_load_vocab: control token: 128043 '<|begin_of_text|><|reserved_special_token_38|>' is not marked as EOG
llm_load_vocab: control token: 128042 '<|begin_of_text|><|reserved_special_token_37|>' is not marked as EOG
llm_load_vocab: control token: 128041 '<|begin_of_text|><|reserved_special_token_36|>' is not marked as EOG
llm_load_vocab: control token: 128040 '<|begin_of_text|><|reserved_special_token_35|>' is not marked as EOG
llm_load_vocab: control token: 128036 '<|begin_of_text|><|reserved_special_token_31|>' is not marked as EOG
llm_load_vocab: control token: 128035 '<|begin_of_text|><|reserved_special_token_30|>' is not marked as EOG
llm_load_vocab: control token: 128033 '<|begin_of_text|><|reserved_special_token_28|>' is not marked as EOG
llm_load_vocab: control token: 128030 '<|begin_of_text|><|reserved_special_token_25|>' is not marked as EOG
llm_load_vocab: control token: 128029 '<|begin_of_text|><|reserved_special_token_24|>' is not marked as EOG
llm_load_vocab: control token: 128027 '<|begin_of_text|><|reserved_special_token_22|>' is not marked as EOG
llm_load_vocab: control token: 128023 '<|begin_of_text|><|reserved_special_token_18|>' is not marked as EOG
llm_load_vocab: control token: 128020 '<|begin_of_text|><|reserved_special_token_15|>' is not marked as EOG
llm_load_vocab: control token: 128019 '<|begin_of_text|><|reserved_special_token_14|>' is not marked as EOG
llm_load_vocab: control token: 128018 '<|begin_of_text|><|reserved_special_token_13|>' is not marked as EOG
llm_load_vocab: control token: 128014 '<|begin_of_text|><|reserved_special_token_9|>' is not marked as EOG
llm_load_vocab: control token: 128012 '<|begin_of_text|><|reserved_special_token_7|>' is not marked as EOG
llm_load_vocab: control token: 128008 '<|begin_of_text|><|reserved_special_token_4|>' is not marked as EOG
llm_load_vocab: control token: 128002 '<|begin_of_text|><|reserved_special_token_0|>' is not marked as EOG
llm_load_vocab: control token: 128087 '<|begin_of_text|><|reserved_special_token_82|>' is not marked as EOG
llm_load_vocab: control token: 128250 '<|begin_of_text|><|reserved_special_token_245|>' is not marked as EOG
llm_load_vocab: control token: 128116 '<|begin_of_text|><|reserved_special_token_111|>' is not marked as EOG
llm_load_vocab: control token: 128159 '<|begin_of_text|><|reserved_special_token_154|>' is not marked as EOG
llm_load_vocab: control token: 128091 '<|begin_of_text|><|reserved_special_token_86|>' is not marked as EOG
llm_load_vocab: control token: 128136 '<|begin_of_text|><|reserved_special_token_131|>' is not marked as EOG
llm_load_vocab: control token: 128028 '<|begin_of_text|><|reserved_special_token_23|>' is not marked as EOG
llm_load_vocab: control token: 128017 '<|begin_of_text|><|reserved_special_token_12|>' is not marked as EOG
llm_load_vocab: control token: 128011 '<|begin_of_text|><|reserved_special_token_6|>' is not marked as EOG
llm_load_vocab: control token: 128223 '<|begin_of_text|><|reserved_special_token_218|>' is not marked as EOG
llm_load_vocab: control token: 128147 '<|begin_of_text|><|reserved_special_token_142|>' is not marked as EOG
llm_load_vocab: control token: 128066 '<|begin_of_text|><|reserved_special_token_61|>' is not marked as EOG
llm_load_vocab: control token: 128247 '<|begin_of_text|><|reserved_special_token_242|>' is not marked as EOG
llm_load_vocab: control token: 128052 '<|begin_of_text|><|reserved_special_token_47|>' is not marked as EOG
llm_load_vocab: control token: 128169 '<|begin_of_text|><|reserved_special_token_164|>' is not marked as EOG
llm_load_vocab: control token: 128117 '<|begin_of_text|><|reserved_special_token_112|>' is not marked as EOG
llm_load_vocab: control token: 128203 '<|begin_of_text|><|reserved_special_token_198|>' is not marked as EOG
llm_load_vocab: control token: 128092 '<|begin_of_text|><|reserved_special_token_87|>' is not marked as EOG
llm_load_vocab: control token: 128071 '<|begin_of_text|><|reserved_special_token_66|>' is not marked as EOG
llm_load_vocab: control token: 128047 '<|begin_of_text|><|reserved_special_token_42|>' is not marked as EOG
llm_load_vocab: control token: 128217 '<|begin_of_text|><|reserved_special_token_212|>' is not marked as EOG
llm_load_vocab: control token: 128213 '<|begin_of_text|><|reserved_special_token_208|>' is not marked as EOG
llm_load_vocab: control token: 128249 '<|begin_of_text|><|reserved_special_token_244|>' is not marked as EOG
llm_load_vocab: control token: 128212 '<|begin_of_text|><|reserved_special_token_207|>' is not marked as EOG
llm_load_vocab: control token: 128000 '<|begin_of_text|><|begin_of_text|>' is not marked as EOG
llm_load_vocab: control token: 128075 '<|begin_of_text|><|reserved_special_token_70|>' is not marked as EOG
llm_load_vocab: control token: 128124 '<|begin_of_text|><|reserved_special_token_119|>' is not marked as EOG
llm_load_vocab: control token: 128180 '<|begin_of_text|><|reserved_special_token_175|>' is not marked as EOG
llm_load_vocab: control token: 128151 '<|begin_of_text|><|reserved_special_token_146|>' is not marked as EOG
llm_load_vocab: control token: 128079 '<|begin_of_text|><|reserved_special_token_74|>' is not marked as EOG
llm_load_vocab: control token: 128013 '<|begin_of_text|><|reserved_special_token_8|>' is not marked as EOG
llm_load_vocab: control token: 128248 '<|begin_of_text|><|reserved_special_token_243|>' is not marked as EOG
llm_load_vocab: control token: 128048 '<|begin_of_text|><|reserved_special_token_43|>' is not marked as EOG
llm_load_vocab: control token: 128034 '<|begin_of_text|><|reserved_special_token_29|>' is not marked as EOG
llm_load_vocab: control token: 128074 '<|begin_of_text|><|reserved_special_token_69|>' is not marked as EOG
llm_load_vocab: control token: 128015 '<|begin_of_text|><|reserved_special_token_10|>' is not marked as EOG
llm_load_vocab: control token: 128022 '<|begin_of_text|><|reserved_special_token_17|>' is not marked as EOG
llm_load_vocab: control token: 128081 '<|begin_of_text|><|reserved_special_token_76|>' is not marked as EOG
llm_load_vocab: control token: 128144 '<|begin_of_text|><|reserved_special_token_139|>' is not marked as EOG
llm_load_vocab: control token: 128173 '<|begin_of_text|><|reserved_special_token_168|>' is not marked as EOG
llm_load_vocab: control token: 128038 '<|begin_of_text|><|reserved_special_token_33|>' is not marked as EOG
llm_load_vocab: control token: 128138 '<|begin_of_text|><|reserved_special_token_133|>' is not marked as EOG
llm_load_vocab: control token: 128003 '<|begin_of_text|><|reserved_special_token_1|>' is not marked as EOG
llm_load_vocab: control token: 128166 '<|begin_of_text|><|reserved_special_token_161|>' is not marked as EOG
llm_load_vocab: control token: 128025 '<|begin_of_text|><|reserved_special_token_20|>' is not marked as EOG
llm_load_vocab: control token: 128178 '<|begin_of_text|><|reserved_special_token_173|>' is not marked as EOG
llm_load_vocab: control token: 128131 '<|begin_of_text|><|reserved_special_token_126|>' is not marked as EOG
llm_load_vocab: control token: 128130 '<|begin_of_text|><|reserved_special_token_125|>' is not marked as EOG
llm_load_vocab: control token: 128165 '<|begin_of_text|><|reserved_special_token_160|>' is not marked as EOG
llm_load_vocab: control token: 128149 '<|begin_of_text|><|reserved_special_token_144|>' is not marked as EOG
llm_load_vocab: control token: 128232 '<|begin_of_text|><|reserved_special_token_227|>' is not marked as EOG
llm_load_vocab: control token: 128216 '<|begin_of_text|><|reserved_special_token_211|>' is not marked as EOG
llm_load_vocab: control token: 128137 '<|begin_of_text|><|reserved_special_token_132|>' is not marked as EOG
llm_load_vocab: control token: 128055 '<|begin_of_text|><|reserved_special_token_50|>' is not marked as EOG
llm_load_vocab: control token: 128109 '<|begin_of_text|><|reserved_special_token_104|>' is not marked as EOG
llm_load_vocab: control token: 128039 '<|begin_of_text|><|reserved_special_token_34|>' is not marked as EOG
llm_load_vocab: control token: 128177 '<|begin_of_text|><|reserved_special_token_172|>' is not marked as EOG
llm_load_vocab: control token: 128176 '<|begin_of_text|><|reserved_special_token_171|>' is not marked as EOG
llm_load_vocab: control token: 128208 '<|begin_of_text|><|reserved_special_token_203|>' is not marked as EOG
llm_load_vocab: control token: 128233 '<|begin_of_text|><|reserved_special_token_228|>' is not marked as EOG
llm_load_vocab: control token: 128197 '<|begin_of_text|><|reserved_special_token_192|>' is not marked as EOG
llm_load_vocab: control token: 128061 '<|begin_of_text|><|reserved_special_token_56|>' is not marked as EOG
llm_load_vocab: control token: 128084 '<|begin_of_text|><|reserved_special_token_79|>' is not marked as EOG
llm_load_vocab: control token: 128163 '<|begin_of_text|><|reserved_special_token_158|>' is not marked as EOG
llm_load_vocab: control token: 128134 '<|begin_of_text|><|reserved_special_token_129|>' is not marked as EOG
llm_load_vocab: control token: 128224 '<|begin_of_text|><|reserved_special_token_219|>' is not marked as EOG
llm_load_vocab: control token: 128192 '<|begin_of_text|><|reserved_special_token_187|>' is not marked as EOG
llm_load_vocab: control token: 128016 '<|begin_of_text|><|reserved_special_token_11|>' is not marked as EOG
llm_load_vocab: control token: 128227 '<|begin_of_text|><|reserved_special_token_222|>' is not marked as EOG
llm_load_vocab: control token: 128199 '<|begin_of_text|><|reserved_special_token_194|>' is not marked as EOG
llm_load_vocab: control token: 128201 '<|begin_of_text|><|reserved_special_token_196|>' is not marked as EOG
llm_load_vocab: control token: 128089 '<|begin_of_text|><|reserved_special_token_84|>' is not marked as EOG
llm_load_vocab: control token: 128174 '<|begin_of_text|><|reserved_special_token_169|>' is not marked as EOG
llm_load_vocab: control token: 128049 '<|begin_of_text|><|reserved_special_token_44|>' is not marked as EOG
llm_load_vocab: control token: 128255 '<|begin_of_text|><|reserved_special_token_250|>' is not marked as EOG
llm_load_vocab: control token: 128157 '<|begin_of_text|><|reserved_special_token_152|>' is not marked as EOG
llm_load_vocab: control token: 128128 '<|begin_of_text|><|reserved_special_token_123|>' is not marked as EOG
llm_load_vocab: control token: 128083 '<|begin_of_text|><|reserved_special_token_78|>' is not marked as EOG
llm_load_vocab: control token: 128167 '<|begin_of_text|><|reserved_special_token_162|>' is not marked as EOG
llm_load_vocab: control token: 128095 '<|begin_of_text|><|reserved_special_token_90|>' is not marked as EOG
llm_load_vocab: control token: 128064 '<|begin_of_text|><|reserved_special_token_59|>' is not marked as EOG
llm_load_vocab: control token: 128164 '<|begin_of_text|><|reserved_special_token_159|>' is not marked as EOG
llm_load_vocab: control token: 128114 '<|begin_of_text|><|reserved_special_token_109|>' is not marked as EOG
llm_load_vocab: control token: 128105 '<|begin_of_text|><|reserved_special_token_100|>' is not marked as EOG
llm_load_vocab: control token: 128026 '<|begin_of_text|><|reserved_special_token_21|>' is not marked as EOG
llm_load_vocab: control token: 128112 '<|begin_of_text|><|reserved_special_token_107|>' is not marked as EOG
llm_load_vocab: control token: 128179 '<|begin_of_text|><|reserved_special_token_174|>' is not marked as EOG
llm_load_vocab: control token: 128060 '<|begin_of_text|><|reserved_special_token_55|>' is not marked as EOG
llm_load_vocab: control token: 128210 '<|begin_of_text|><|reserved_special_token_205|>' is not marked as EOG
llm_load_vocab: control token: 128186 '<|begin_of_text|><|reserved_special_token_181|>' is not marked as EOG
llm_load_vocab: control token: 128024 '<|begin_of_text|><|reserved_special_token_19|>' is not marked as EOG
llm_load_vocab: control token: 128031 '<|begin_of_text|><|reserved_special_token_26|>' is not marked as EOG
llm_load_vocab: control token: 128125 '<|begin_of_text|><|reserved_special_token_120|>' is not marked as EOG
llm_load_vocab: control token: 128161 '<|begin_of_text|><|reserved_special_token_156|>' is not marked as EOG
llm_load_vocab: control token: 128065 '<|begin_of_text|><|reserved_special_token_60|>' is not marked as EOG
llm_load_vocab: control token: 128123 '<|begin_of_text|><|reserved_special_token_118|>' is not marked as EOG
llm_load_vocab: control token: 128122 '<|begin_of_text|><|reserved_special_token_117|>' is not marked as EOG
llm_load_vocab: control token: 128007 '<|begin_of_text|><|end_header_id|>' is not marked as EOG
llm_load_vocab: control token: 128206 '<|begin_of_text|><|reserved_special_token_201|>' is not marked as EOG
llm_load_vocab: control token: 128142 '<|begin_of_text|><|reserved_special_token_137|>' is not marked as EOG
llm_load_vocab: control token: 128098 '<|begin_of_text|><|reserved_special_token_93|>' is not marked as EOG
llm_load_vocab: control token: 128021 '<|begin_of_text|><|reserved_special_token_16|>' is not marked as EOG
llm_load_vocab: control token: 128254 '<|begin_of_text|><|reserved_special_token_249|>' is not marked as EOG
llm_load_vocab: control token: 128037 '<|begin_of_text|><|reserved_special_token_32|>' is not marked as EOG
llm_load_vocab: control token: 128045 '<|begin_of_text|><|reserved_special_token_40|>' is not marked as EOG
llm_load_vocab: control token: 128190 '<|begin_of_text|><|reserved_special_token_185|>' is not marked as EOG
llm_load_vocab: control token: 128057 '<|begin_of_text|><|reserved_special_token_52|>' is not marked as EOG
llm_load_vocab: control token: 128239 '<|begin_of_text|><|reserved_special_token_234|>' is not marked as EOG
llm_load_vocab: control token: 128121 '<|begin_of_text|><|reserved_special_token_116|>' is not marked as EOG
llm_load_vocab: control token: 128222 '<|begin_of_text|><|reserved_special_token_217|>' is not marked as EOG
llm_load_vocab: control token: 128182 '<|begin_of_text|><|reserved_special_token_177|>' is not marked as EOG
llm_load_vocab: control token: 128207 '<|begin_of_text|><|reserved_special_token_202|>' is not marked as EOG
llm_load_vocab: control token: 128001 '<|begin_of_text|><|end_of_text|>' is not marked as EOG
llm_load_vocab: control token: 128088 '<|begin_of_text|><|reserved_special_token_83|>' is not marked as EOG
llm_load_vocab: control token: 128154 '<|begin_of_text|><|reserved_special_token_149|>' is not marked as EOG
llm_load_vocab: control token: 128032 '<|begin_of_text|><|reserved_special_token_27|>' is not marked as EOG
llm_load_vocab: control token: 128229 '<|begin_of_text|><|reserved_special_token_224|>' is not marked as EOG
llm_load_vocab: control token: 128238 '<|begin_of_text|><|reserved_special_token_233|>' is not marked as EOG
llm_load_vocab: control token: 128005 '<|begin_of_text|><|reserved_special_token_3|>' is not marked as EOG
llm_load_vocab: control token: 128106 '<|begin_of_text|><|reserved_special_token_101|>' is not marked as EOG
llm_load_vocab: control token: 128158 '<|begin_of_text|><|reserved_special_token_153|>' is not marked as EOG
llm_load_vocab: control token: 128062 '<|begin_of_text|><|reserved_special_token_57|>' is not marked as EOG
llm_load_vocab: control token: 128070 '<|begin_of_text|><|reserved_special_token_65|>' is not marked as EOG
llm_load_vocab: control token: 128119 '<|begin_of_text|><|reserved_special_token_114|>' is not marked as EOG
llm_load_vocab: control token: 128078 '<|begin_of_text|><|reserved_special_token_73|>' is not marked as EOG
llm_load_vocab: control token: 128004 '<|begin_of_text|><|reserved_special_token_2|>' is not marked as EOG
llm_load_vocab: control token: 128006 '<|begin_of_text|><|start_header_id|>' is not marked as EOG
llm_load_vocab: control token: 128010 '<|begin_of_text|><|reserved_special_token_5|>' is not marked as EOG
llm_load_vocab: control token: 128220 '<|begin_of_text|><|reserved_special_token_215|>' is not marked as EOG
llm_load_vocab: control token: 128073 '<|begin_of_text|><|reserved_special_token_68|>' is not marked as EOG
llm_load_vocab: control token: 128168 '<|begin_of_text|><|reserved_special_token_163|>' is not marked as EOG
llm_load_vocab: control token: 128198 '<|begin_of_text|><|reserved_special_token_193|>' is not marked as EOG
llm_load_vocab: control token: 128009 '<|begin_of_text|><|eot_id|>' is not marked as EOG
llm_load_vocab: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
llm_load_vocab: special tokens cache size = 256
llm_load_vocab: token to piece cache size = 0.8041 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = llama
llm_load_print_meta: vocab type = BPE
llm_load_print_meta: n_vocab = 128256
llm_load_print_meta: n_merges = 280147
llm_load_print_meta: vocab_only = 0
llm_load_print_meta: n_ctx_train = 8192
llm_load_print_meta: n_embd = 4096
llm_load_print_meta: n_layer = 32
llm_load_print_meta: n_head = 32
llm_load_print_meta: n_head_kv = 8
llm_load_print_meta: n_rot = 128
llm_load_print_meta: n_swa = 0
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = 4
llm_load_print_meta: n_embd_k_gqa = 1024
llm_load_print_meta: n_embd_v_gqa = 1024
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 14336
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 0
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 500000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 8192
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: ssm_dt_b_c_rms = 0
llm_load_print_meta: model type = 8B
llm_load_print_meta: model ftype = I2_S - 2 bpw ternary
llm_load_print_meta: model params = 8.03 B
llm_load_print_meta: model size = 3.58 GiB (3.83 BPW)
llm_load_print_meta: general.name = Llama3-8B-1.58-100B-tokens
llm_load_print_meta: BOS token = 128000 '<|begin_of_text|><|begin_of_text|>'
llm_load_print_meta: EOS token = 128009 '<|begin_of_text|><|eot_id|>'
llm_load_print_meta: LF token = 128 'Ä'
llm_load_print_meta: EOG token = 128009 '<|begin_of_text|><|eot_id|>'
llm_load_print_meta: max token length = 256
llm_load_tensors: ggml ctx size = 0.14 MiB
llm_load_tensors: CPU buffer size = 3669.02 MiB
................................................
llama_new_context_with_model: n_batch is less than GGML_KQ_MASK_PAD - increasing to 32
llama_new_context_with_model: n_ctx = 2048
llama_new_context_with_model: n_batch = 32
llama_new_context_with_model: n_ubatch = 32
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 500000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: CPU KV buffer size = 256.00 MiB
llama_new_context_with_model: KV self size = 256.00 MiB, K (f16): 128.00 MiB, V (f16): 128.00 MiB
llama_new_context_with_model: CPU output buffer size = 0.49 MiB
llama_new_context_with_model: CPU compute buffer size = 16.16 MiB
llama_new_context_with_model: graph nodes = 1030
llama_new_context_with_model: graph splits = 1
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
main: llama threadpool init, n_threads = 12
system_info: n_threads = 12 (n_threads_batch = 12) / 40 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 0 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | RISCV_VECT = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
sampler seed: 1710881535
sampler params:
repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampler chain: logits -> logit-bias -> penalties -> top-k -> tail-free -> typical -> top-p -> min-p -> temp-ext -> softmax -> dist
generate: n_ctx = 2048, n_batch = 1, n_predict = 900, n_keep = 1
Implement a python function to get the factorial of a number.GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
llama_perf_sampler_print: sampling time = 102.44 ms / 913 runs ( 0.11 ms per token, 8912.36 tokens per second)
llama_perf_context_print: load time = 1011.92 ms
llama_perf_context_print: prompt eval time = 671.10 ms / 13 tokens ( 51.62 ms per token, 19.37 tokens per second)
llama_perf_context_print: eval time = 48369.96 ms / 899 runs ( 53.80 ms per token, 18.59 tokens per second)
llama_perf_context_print: total time = 49436.53 ms / 912 tokens
Any clue on what to do?
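In case it helps when comparing reports, here is a minimal sketch for pulling the CPU feature flags out of the `system_info` line of a log like the one above. `parse_system_info` is a hypothetical helper name, not part of BitNet or llama.cpp, and it assumes the `NAME = 0/1 |` layout shown in this log. For the log above it reports AVX = 1 and AVX2 = 0. Whether the missing AVX2 is what actually causes the G output is not confirmed here.

```python
import re


def parse_system_info(line: str) -> dict:
    """Extract 'NAME = <int>' feature flags from a system_info log line.

    Assumes the layout seen in this thread's log:
    'system_info: n_threads = ... | AVX = 1 | AVX2 = 0 | ... |'
    """
    # Everything before the first '|' is the thread-count preamble; skip it.
    _, _, features = line.partition("|")
    flags = {}
    for field in features.split("|"):
        match = re.fullmatch(r"\s*(\w+)\s*=\s*(\d+)\s*", field)
        if match:
            flags[match.group(1)] = int(match.group(2))
    return flags


# Shortened copy of the system_info line from the log above.
line = (
    "system_info: n_threads = 12 (n_threads_batch = 12) / 40 | AVX = 1 | "
    "AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | F16C = 1 | SSE3 = 1 | "
    "SSSE3 = 1 | LLAMAFILE = 1 |"
)
flags = parse_system_info(line)
print(flags["AVX"], flags["AVX2"])  # 1 0
```

Posting the resulting flags alongside each report would make it easier to see whether every affected machine has the same AVX-without-AVX2 pattern.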
Ran into the same issue: installation was flawless, but the output is a long string of Gs.
Ran into the same error. I tried various arguments, but I only get Gs and sometimes random characters.
I have the same problem on an i5-3450, using a conda environment. It installed fine, but I only get Gs. The CPU has AVX, but not AVX2.
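One thing to keep in mind when checking this: `lscpu | grep avx` matches any flag containing "avx", so it prints the flags line whether the CPU has AVX2 or only AVX. A minimal sketch for checking the flags as exact tokens follows. It assumes an x86 Linux system where `/proc/cpuinfo` has a `flags` line. The function names and the list of flags to report are my own choice for illustration, not something BitNet documents as its requirements.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Return the set of CPU flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first 'flags' line is enough; it repeats for every core.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_report(cpuinfo_text: str, wanted=("avx", "avx2", "fma", "f16c")) -> dict:
    """Map each flag name in `wanted` to True/False using exact token matches."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # assumption: Linux on x86
    if path.exists():
        for name, present in simd_report(path.read_text()).items():
            print(f"{name}: {'yes' if present else 'no'}")
```

On a machine like the one described above, this would be expected to show `avx: yes` and `avx2: no`, which a plain `grep avx` cannot distinguish.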
I got the same issue on Fedora 42. Is this expected to work?