During the evaluation phase, once the GPUs have created the merge and finished running lm_eval on the task, they sit idle. Running this on 8xMI300X:
```
mergekit-evolve --task-search-path task_dir --max-fevals...
```
For this part:
```
# Crop the padded images to the desired resolution and number of frames
(pad_left, pad_right, pad_top, pad_bottom) = padding
pad_bottom = -pad_bottom
pad_right = -pad_right
if...
```
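A minimal sketch of how this cropping logic plausibly continues: negating the pad widths turns them into end indices for slicing, and the truncated `if` most likely guards the zero-pad case, since a slice ending at `-0` selects nothing. The `crop_padding` helper and the (frames, height, width, channels) layout are assumptions for illustration, not part of the original code.
```
import numpy as np

def crop_padding(images, padding):
    """Hypothetical helper: undo spatial padding on a stack of frames.

    Assumes `images` has shape (frames, height, width, channels) and
    `padding` is (pad_left, pad_right, pad_top, pad_bottom).
    """
    (pad_left, pad_right, pad_top, pad_bottom) = padding
    # Negative end indices crop from the bottom/right edges. Zero pads
    # must map to None: images[:, :-0] would select an empty slice.
    pad_bottom = -pad_bottom if pad_bottom > 0 else None
    pad_right = -pad_right if pad_right > 0 else None
    return images[:, pad_top:pad_bottom, pad_left:pad_right, :]

# Example: strip 2px of padding on every side of 8 RGB frames.
frames = np.zeros((8, 68, 68, 3))
assert crop_padding(frames, (2, 2, 2, 2)).shape == (8, 64, 64, 3)
```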
Performance of 40 steps vs 100 steps.
```
Step 1/40
  2%|██▏       | 1/40 [00:01
```
Flash Attention 3 now works with these platforms; would it be easy for the LMDeploy team to implement this? @lvhan028 https://github.com/Dao-AILab/flash-attention/issues/1049#issuecomment-2695283567
```
  File "/home/kojoe/miniconda3/envs/vllm/lib/python3.12/site-packages/gemma/gm/text/_sampler.py", line 311, in sample
    init_state = _prefill.prefill(
                 ^^^^^^^^^^^^^^^^^
  File "/home/kojoe/miniconda3/envs/vllm/lib/python3.12/site-packages/gemma/gm/text/_prefill.py", line 110, in prefill
    out = model.apply(
          ^^^^^^^^^^^^
  File "/home/kojoe/miniconda3/envs/vllm/lib/python3.12/site-packages/kauldron/utils/train_property.py", line 141, in decorated
    return fn(*args, **kwargs)
...
```
```
# Common imports
import os

import jax.numpy as jnp
import tensorflow_datasets as tfds

# Gemma imports
from gemma import gm

os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = "1.00"

ds = tfds.data_source("oxford_flowers102", split="train")
image1 =...
```
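For reference, a minimal sketch of how this setup typically continues, following the sampling API shown in the gemma README; the model class, checkpoint name, prompt, and dataset index below are assumptions for illustration, not part of the original snippet.
```
import numpy as np
import tensorflow_datasets as tfds

from gemma import gm

ds = tfds.data_source("oxford_flowers102", split="train")
image1 = np.asarray(ds[0]["image"])  # assumed: image of the first record

# Assumed model/checkpoint pair, per the gemma README examples.
model = gm.nn.Gemma3_4B()
params = gm.ckpts.load_params(gm.ckpts.CheckpointPath.GEMMA3_4B_IT)

sampler = gm.text.ChatSampler(model=model, params=params)

# One <start_of_image> token per image passed via `images=`.
out = sampler.chat("What flower is this? <start_of_image>", images=[image1])
print(out)
```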