
The command line does not advance beyond that - Is everything okay? Is it supposed to be like this?

Open Yoshizito1 opened this issue 11 months ago • 2 comments

D:\grok-1>python run.py
INFO:jax._src.xla_bridge:Unable to initialize backend 'cuda':
INFO:jax._src.xla_bridge:Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
INFO:jax._src.xla_bridge:Unable to initialize backend 'tpu': UNIMPLEMENTED: LoadPjrtPlugin is not implemented on windows yet.
INFO:rank:Initializing mesh for self.local_mesh_config=(1, 1) self.between_hosts_config=(1, 1)...
INFO:rank:Detected 1 devices in mesh
INFO:rank:partition rules: <bound method LanguageModelConfig.partition_rules of LanguageModelConfig(model=TransformerConfig(emb_size=6144, key_size=128, num_q_heads=48, num_kv_heads=8, num_layers=64, vocab_size=131072, widening_factor=8, attn_output_multiplier=0.08838834764831845, name=None, num_experts=8, capacity_factor=1.0, num_selected_experts=2, init_scale=1.0, shard_activations=True, data_axis='data', model_axis='model'), vocab_size=131072, pad_token=0, eos_token=2, sequence_len=8192, model_size=6144, embedding_init_scale=1.0, embedding_multiplier_scale=78.38367176906169, output_multiplier_scale=0.5773502691896257, name=None, fprop_dtype=<class 'jax.numpy.bfloat16'>, model_type=None, init_scale_override=None, shard_embeddings=True)>
INFO:rank:(1, 256, 6144)
INFO:rank:(1, 256, 131072)
INFO:rank:State sharding type: <class 'model.TrainingState'>
INFO:rank:(1, 256, 6144)
INFO:rank:(1, 256, 131072)
INFO:rank:Loading checkpoint at ./checkpoints/ckpt-0

Yoshizito1 avatar Mar 22 '24 10:03 Yoshizito1

Are you running on an A100? It took nearly 1.5 hours for me to load the checkpoints on an A100 compute instance.

novaturient95 avatar Mar 22 '24 12:03 novaturient95

Not even close. USD$100K or ¥2400万円 is needed. See this link. For software, Linux is preferred, and CUDA should be installed.
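
On the CUDA point: the log in the original post reports that the 'cuda', 'rocm' and 'tpu' backends could not be initialized, which suggests (though the log does not say so outright) that JAX was left with a CPU-only device. As a rough sketch, a saved log can be scanned for those messages like this. `failed_backends` and `SAMPLE_LOG` are hypothetical names made up for illustration, and the sample lines are copied from the log in this issue.

```python
import re

# Sample lines copied from the log in the original post.
SAMPLE_LOG = """\
INFO:jax._src.xla_bridge:Unable to initialize backend 'cuda':
INFO:jax._src.xla_bridge:Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
INFO:jax._src.xla_bridge:Unable to initialize backend 'tpu': UNIMPLEMENTED: LoadPjrtPlugin is not implemented on windows yet.
INFO:rank:Detected 1 devices in mesh
"""

# Matches the backend name inside the quotes of each failure message.
_PATTERN = re.compile(r"Unable to initialize backend '([^']+)'")


def failed_backends(log_text):
    """Return the backend names the log reports as failing to initialize.

    Hypothetical helper for illustration; it only looks at the log text and
    does not query JAX itself.
    """
    return _PATTERN.findall(log_text)


if __name__ == "__main__":
    print(failed_backends(SAMPLE_LOG))
```

If 'cuda' shows up in that list, the GPU is not being used. On a machine with JAX installed, calling `jax.devices()` lists the devices JAX can actually see.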

superguo avatar Mar 22 '24 12:03 superguo