Train for custom dataset

Open sandhyalaxmiK opened this issue 6 years ago • 5 comments

Hi piaozhx, thanks for sharing the code. I cloned the repo and was able to run it. The explanations, even for the installation, are very clear and made the process easy to follow. I want to train on my own custom videos. Can you tell me how to proceed? Also, what is the process for high-quality images?

Thanks in advance. Regards, SandhyaLaxmi

sandhyalaxmiK, Oct 15 '19 06:10

First, your video frames should be preprocessed to be square, that is, the height and width should be the same. Second, extract the SMPL information of your frames. You can use the pre-trained hmr model provided in the repo. My code is below.

#!/usr/bin/env python3
import os
import torch
import cv2
import numpy as np
import glob
import pickle
from networks.hmr import HumanModelRecovery
import utils.cv_utils as cv_utils

def creat_hmr():
    # initialize the hmr network
    smpl_model = "assets/pretrains/smpl_model.pkl"
    hmr_model = "assets/pretrains/hmr_tf2pt.pth"

    hmr = HumanModelRecovery(smpl_pkl_path=smpl_model)
    hmr.load_state_dict(torch.load(hmr_model))
    hmr.eval()
    return hmr.cuda()

if __name__ == "__main__":
    imgs_path = "/mnt/4T/sunyangtian/train_data/images/"
    smpls_path = "/mnt/4T/sunyangtian/train_data/smpls"

    hmr = creat_hmr()

    # create the output root (and any missing parents) if it does not exist yet
    os.makedirs(smpls_path, exist_ok=True)


    size = 256

    for actor in sorted(os.listdir(imgs_path)):
        actor_path = os.path.join(imgs_path, actor)
        for action in sorted(os.listdir(actor_path)):
            action_path = os.path.join(actor_path, action)

            imgs_fn = sorted([os.path.join(action_path, fn) for fn in os.listdir(action_path)])

            pose = []
            shape = []
            cams = []
            vertices = []
            kps = []

            for idx, img_path in enumerate(imgs_fn):
                # if idx > 20:
                #     break
                img = cv_utils.read_cv2_img(img_path)*2 - 1.0
                print(img_path)
                # size = img.shape[1] # fetch width
                img = cv_utils.transform_img(img,image_size=224,transpose=True)
                img = torch.tensor(img, dtype=torch.float32).cuda()[None, ...]
                # print(img.shape)
                with torch.no_grad():
                    smpl = hmr(img)
                smpl_info = hmr.get_details(smpl)

                pose.append(smpl_info['pose'].cpu().numpy())
                shape.append(smpl_info['shape'].cpu().numpy())
                cams.append(smpl_info['cam'].cpu().numpy())
                vertices.append(smpl_info['verts'].cpu().numpy())
                kps.append(smpl_info['j2d'].cpu().numpy())

            pose = np.concatenate(pose, axis=0)
            shape = np.concatenate(shape, axis=0)
            cams = np.concatenate(cams, axis=0)
            vertices = np.concatenate(vertices, axis=0)
            kps = np.concatenate(kps, axis=0)

            pose_shape = {"pose":pose, "shape":shape, "cams":cams, "vertices":vertices}        
            kps_dict = {"kps":kps}

            pose_shape_dir = os.path.join(smpls_path, actor, action)
            if not os.path.exists(pose_shape_dir):
                os.makedirs(pose_shape_dir)

            # use context managers so the pickle files are flushed and closed
            with open(os.path.join(pose_shape_dir, "pose_shape.pkl"), "wb") as f:
                pickle.dump(pose_shape, f)
            with open(os.path.join(pose_shape_dir, "kps.pkl"), "wb") as f:
                pickle.dump(kps_dict, f)

Third, arrange your data in the format described in train.md in this repo. Then change the data_dir in train_iPER.sh and you can start training.
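
For the first step, one possible way to make frames square is a center crop. This is only a minimal sketch and not code from the repo: `center_crop_square` is a hypothetical helper, the zero array stands in for a decoded video frame, and the approach assumes the person stays near the middle of the frame. Padding the shorter side instead would be an alternative if the crop cuts off the subject.

```python
import numpy as np

def center_crop_square(frame: np.ndarray) -> np.ndarray:
    """Crop the longer side of an H x W x C frame so that H == W."""
    height, width = frame.shape[:2]
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return frame[top:top + side, left:left + side]

# Dummy 720p frame standing in for one decoded video frame.
frame = np.zeros((720, 1280, 3), dtype=np.uint8)
square = center_crop_square(frame)
print(square.shape)  # (720, 720, 3)
```

The cropped frames can then be written back to disk and passed through the extraction script above, which resizes them to 224 for hmr.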

SunYangtian, Nov 07 '19 12:11

Thanks a lot @SunYangtian. This code helps me move one step forward. If I want to change the size from 256 to 1024, what do I have to modify in hmr so that I get correspondingly resized values of pose and shape (theta)?

sandhyalaxmiK, Nov 07 '19 13:11

The size of the image sent to hmr cannot be changed because the model is pre-trained. In fact, it must be 224, as in my code. The pose and shape parameters do not depend on the image size. So all you need to do is resize the frame to 224*224 and send it to hmr.
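
To see why the source resolution does not change the pose and shape values, it may help to look at the output layout. The sketch below assumes the usual HMR convention of an 85-value vector per frame (3 camera, 72 pose, 10 shape values); `split_theta` is a hypothetical helper rather than a function from this repo, and the zero vector merely stands in for a real prediction.

```python
import numpy as np

def split_theta(theta: np.ndarray) -> dict:
    """Split an (N, 85) HMR-style vector into camera, pose and shape parts."""
    return {
        "cam": theta[:, :3],     # weak-perspective camera: scale, tx, ty
        "pose": theta[:, 3:75],  # 24 joints x 3 axis-angle values
        "shape": theta[:, 75:],  # 10 SMPL shape coefficients
    }

# Placeholder prediction for a single frame (assumed layout, not real output).
theta = np.zeros((1, 85), dtype=np.float32)
parts = split_theta(theta)
print({name: value.shape for name, value in parts.items()})
```

Under this assumption the vector has the same length whether the original frame was 256 or 1024 pixels wide, since the network only ever sees the 224*224 input.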

SunYangtian, Nov 08 '19 02:11

@SunYangtian There seems to be a bug in the image preprocessing in your code:

img = cv_utils.read_cv2_img(img_path)*2 - 1.0
print(img_path)
# size = img.shape[1] # fetch width
img = cv_utils.transform_img(img,image_size=224,transpose=True)
img = torch.tensor(img, dtype=torch.float32).cuda()[None, ...]

I think it should be:

img = cv_utils.read_cv2_img(img_path)
print(img_path)
# size = img.shape[1] # fetch width
img = cv_utils.transform_img(img,image_size=224,transpose=True)*2 - 1.0
img = torch.tensor(img, dtype=torch.float32).cuda()[None, ...]

This is because cv_utils.transform_img first divides the image by 255 to scale it to [0, 1], so the mapping to [-1, 1] has to be applied after that call, not before it.
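
A small numeric check can make the difference between the two orders visible. This is a sketch under assumptions: `scale_to_unit` is a hypothetical stand-in that only mimics the divide-by-255 step described above (no resize or transpose), and the three pixel values are illustrative.

```python
import numpy as np

def scale_to_unit(image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the divide-by-255 step of transform_img."""
    return image.astype(np.float32) / 255.0

# Three sample 8-bit pixel values: darkest, mid-gray, brightest.
pixels = np.array([0, 128, 255], dtype=np.uint8)

# Order in the original script: shift first, then scale by 1/255.
wrong = scale_to_unit(pixels * 2 - 1.0)

# Corrected order: scale to [0, 1] first, then map to [-1, 1].
right = scale_to_unit(pixels) * 2 - 1.0

print("wrong range:", wrong.min(), wrong.max())
print("right range:", right.min(), right.max())
```

With the corrected order the values span exactly -1 to 1, while the original order never gets close to -1. With 8-bit input the early multiplication can also wrap around before the subtraction, which distorts the values further.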

HrsPythonix, Dec 10 '19 20:12

@HrsPythonix You're right. Thanks!

SunYangtian, Dec 12 '19 01:12