MotionGPT
Demo error: invalid literal for int() with base 10:
Hi all,
Thanks for your work. I'm encountering an issue when I attempt to run the demo:
At line 114:
tokens = torch.tensor([int(token) for token in output.split(',')]).cuda()
I encounter the error message:
ValueError: invalid literal for int() with base 10: 'однакоMediaalling proceeduestopower dig impl courFincija heraus са XIIoshiicznênternoon Jimmy soap Weit rueadel KomSERT моло official comingyy SloLayoutInflaterstate domains waronds анти alcune(\'. a'
To filter out non-integer values, I tried modifying the line to
tokens = torch.tensor([int(token) for token in output.split(',') if token.isdigit()], dtype=torch.long).cuda()
but then encountered a RuntimeError:
Calculated padded input size per channel: (2). Kernel size: (3).
Kernel size can't be greater than actual input size
I followed the installation instructions as listed. Any suggestions? Thanks!
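(My guess at the second error: after the isdigit filter only a couple of numeric tokens survive, so the decoded sequence is shorter than the VQ-VAE's temporal convolution kernel, hence input size (2) vs. kernel size (3). A minimal guard I sketched, assuming output and torch are in scope as in generate_motion.py and that MIN_TOKENS = 3 matches the kernel size, is:

# Hypothetical guard, not part of the repo: refuse to decode when the LLM
# produced too few numeric motion tokens (the temporal conv needs >= 3 steps).
MIN_TOKENS = 3
parsed = [int(token) for token in output.split(',') if token.strip().isdigit()]
if len(parsed) < MIN_TOKENS:
    raise ValueError(
        f"only {len(parsed)} numeric motion tokens parsed; "
        "the LLM output probably contains no motion codes"
    )
tokens = torch.tensor(parsed, dtype=torch.long).cuda()

This at least fails with a clearer message instead of the kernel-size error, but it does not explain why the model outputs gibberish in the first place.)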
@qiqiApink
I'm getting the same error: "ValueError: invalid literal for int() with base 10" when executing the demo for text-to-motion. There seem to be no valid integers in my output. I'm using pretrained-7B.pth.
@felipe-parodi could you solve it?
Still haven't solved it. I tried fine-tuning LLaMA on the KIT dataset, and that didn't solve it either.
Thanks for the quick response
try:
    output = re.findall(r'\d+', output)
    for j, num in enumerate(output):
        if int(num) > 511:
            output = output[:j]
            break
    if len(output) == 0:
        tokens = torch.ones(1, max_new_tokens).cuda().long()
    else:
        tokens = torch.tensor([[int(num) for num in output]]).cuda().long()
except:
    tokens = torch.ones(1, max_new_tokens).cuda().long()

Put this in generate_motion.py (it needs import re at the top of the file).
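(The 511 cutoff presumably corresponds to the VQ-VAE codebook size of 512, i.e. valid motion token ids are 0-511.)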
That doesn't solve the issue though; it simply creates a tensor of ones when it doesn't get the desired output.
This might be because your model training is broken. You can try the pretrained 7B model that the author updated; it can generate the desired output.
Hi @SHUWEI-HO, thanks for your quick response. The same happens for me with the pretrained 7B model as well as the fine-tuned model. Could you share your generate_motion script? It's unclear why it isn't working on my end.
import os
import sys
import time
import re
import warnings
from pathlib import Path
from typing import Optional
import lightning as L
import torch
import numpy as np
import models.vqvae as vqvae
from generate import generate
from lit_llama import Tokenizer, LLaMA, LLaMAConfig
from lit_llama.lora import lora
from lit_llama.utils import EmptyInitOnDevice, lazy_load
from scripts.prepare_motion import generate_prompt
from options import option
import imageio
from utils.evaluate import plot
from visualization.render import render
warnings.filterwarnings('ignore')
args = option.get_args_parser()
def main(
    quantize: Optional[str] = None,
    dtype: str = "float32",
    max_new_tokens: int = 200,
    top_k: int = 200,
    temperature: float = 0.8,
    accelerator: str = "auto",
) -> None:
    lora_path = Path(args.lora_path)
    pretrained_path = Path(f"./checkpoints/lit-llama/{args.pretrained_llama}/lit-llama.pth")
    tokenizer_path = Path("./checkpoints/lit-llama/tokenizer.model")
    assert lora_path.is_file()
    assert pretrained_path.is_file()
    assert tokenizer_path.is_file()

    if quantize is not None:
        raise NotImplementedError("Quantization in LoRA is not supported yet")

    fabric = L.Fabric(accelerator=accelerator, devices=1)

    dt = getattr(torch, dtype, None)
    if not isinstance(dt, torch.dtype):
        raise ValueError(f"{dtype} is not a valid dtype.")
    dtype = dt

    net = vqvae.HumanVQVAE(args,  ## use args to define different parameters in different quantizers
                           args.nb_code,
                           args.code_dim,
                           args.output_emb_width,
                           args.down_t,
                           args.stride_t,
                           args.width,
                           args.depth,
                           args.dilation_growth_rate)
    print('loading checkpoint from {}'.format(args.vqvae_pth))
    ckpt = torch.load(args.vqvae_pth, map_location='cpu')
    net.load_state_dict(ckpt['net'], strict=True)
    net.eval()
    net.cuda()

    print("Loading model ...", file=sys.stderr)
    t0 = time.time()
    # with EmptyInitOnDevice(
    #     device=fabric.device, dtype=dtype, quantization_mode=quantize
    # ), lora(r=args.lora_r, alpha=args.lora_alpha, dropout=args.lora_dropout, enabled=True):
    with fabric.device, lora(r=args.lora_r, alpha=args.lora_alpha, dropout=args.lora_dropout, enabled=True):
        config = LLaMAConfig.from_name(args.pretrained_llama)
        torch.set_default_tensor_type(torch.HalfTensor)
        model = LLaMA(config).bfloat16()
        torch.set_default_tensor_type(torch.FloatTensor)
        # model = LLaMA(LLaMAConfig())  # TODO: Support different model sizes
    print(f"Time to load model: {time.time() - t0:.02f} seconds.", file=sys.stderr)

    model.eval()
    model = fabric.setup_module(model)
    tokenizer = Tokenizer(tokenizer_path)

    sample = {"instruction": args.prompt, "input": args.input}
    prompt = generate_prompt(sample)
    encoded = tokenizer.encode(prompt, bos=True, eos=False, device=model.device)

    t0 = time.perf_counter()
    output = generate(
        model,
        idx=encoded,
        max_seq_length=max_new_tokens,
        max_new_tokens=max_new_tokens,
        temperature=temperature,
        top_k=top_k,
        eos_id=tokenizer.eos_id
    )
    output = tokenizer.decode(output)
    output = output.split("### Response:")[1].strip()

    t = time.perf_counter() - t0
    print(f"\n\nTime for inference: {t:.02f} sec total, {max_new_tokens / t:.02f} tokens/sec", file=sys.stderr)
    print(f"Memory used: {torch.cuda.max_memory_reserved() / 1e9:.02f} GB", file=sys.stderr)

    try:
        output = re.findall(r'\d+', output)
        for j, num in enumerate(output):
            if int(num) > 511:
                output = output[:j]
                break
        if len(output) == 0:
            tokens = torch.ones(1, max_new_tokens).cuda().long()
        else:
            tokens = torch.tensor([[int(num) for num in output]]).cuda().long()
    except:
        tokens = torch.ones(1, max_new_tokens).cuda().long()

    generated_pose, img = plot(tokens, net, args.dataname)

    n = str(input("Enter the name :"))
    gif_name = f"round{n}.gif"
    os.makedirs(args.out_dir, exist_ok=True)
    if gif_name in os.listdir():
        print("Exist! Enter Again!")
        n = str(input("Enter the name :"))
        gif_name = f"{n}.gif"
        np.save(os.path.join(args.out_dir, f'{n}.npy'), generated_pose)
    else:
        gif_name = f"{n}.gif"
        np.save(os.path.join(args.out_dir, f'{n}.npy'), generated_pose)
    imageio.mimsave(os.path.join(args.out_dir, gif_name), np.array(img), fps=20)

    if args.render:
        print("Rendering...")
        render(generated_pose, n, outdir=args.out_dir)
Thanks. Why did you remove the original decoding line 106
output = tokenizer.decode(output)
just before output = output.split("### Response:")[1].strip()?
Sorry, I missed the line output = tokenizer.decode(output) above. I have edited my previous reply.
I always get a tensor of ones with the pretrained 7B model.
@SHUWEI-HO in your code you have args = option.get_argsdi_parser(), while the current generate_motion.py has args = option.get_args_parser(). Maybe you have different arguments configured.
Hi @felipe-parodi, I fine-tuned the 7B LLaMA with KIT and encountered the same error. I tried different regex parsing, but was still unable to solve the issue. Did you find a workaround by any chance? Thanks!
--
Update: @felipe-parodi @qiqiApink
I found this issue is solved by training the model for longer, i.e. 35000+ iterations. Then it manages to output the required numerical tokens, and the line
tokens = torch.tensor([int(token) for token in output.split(',')]).cuda()
receives the format it needs to convert the tokens into ints.
The tokens can also be loaded like this:
tokens = torch.tensor([int(token) for token in output.strip(',').split(',')]).cuda()
which strips leading/trailing commas so the split does not produce empty entries from the output.
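If you still need to parse noisy output, here is a rough helper that combines the suggestions in this thread (regex extraction, stopping at the first out-of-range id, and stripping stray commas). The 512-code assumption and the helper name are mine, not from the repo:

import re
import torch

def parse_motion_tokens(output: str, codebook_size: int = 512) -> torch.Tensor:
    # Hypothetical helper, not part of MotionGPT: keep only numeric ids in
    # [0, codebook_size) and stop at the first out-of-range number.
    ids = []
    for num in re.findall(r'\d+', output.strip(',')):
        value = int(num)
        if value >= codebook_size:
            break
        ids.append(value)
    if not ids:
        raise ValueError("no valid motion tokens found in the model output")
    return torch.tensor([ids], dtype=torch.long)

# usage in generate_motion.py (assumed variable names):
# tokens = parse_motion_tokens(output).cuda()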
Hope this helps
The LLaMA weights downloaded from pyllama may have changed, so the provided fine-tuned weights will not match the LLaMA you downloaded. You can fine-tune the model yourself.