DanceRevolution

Training Help

Open jkurian49 opened this issue 4 years ago • 8 comments

Hi! I am running into a segmentation fault when trying to run train.sh, so I wanted to ask what the preprocessing steps are in case I am missing something. Besides installing the dependencies and running extractor.py and prepro.py, are there any steps required before running train.sh?

jkurian49 avatar Nov 08 '20 03:11 jkurian49

Same Questions @stonyhu

wtnan2003 avatar Nov 09 '20 06:11 wtnan2003

Same Questions @stonyhu

It seems that the longformer is giving me a seg fault. Are you experiencing the same thing?

CrystalWang1225 avatar Nov 15 '20 18:11 CrystalWang1225

Same Questions @stonyhu

I have solved the issue by reinstalling transformers. I looked into the longformer package, and it looks like the transformers import wasn't successful.

CrystalWang1225 avatar Nov 17 '20 15:11 CrystalWang1225

@CrystalWang1225 Actually, you can implement your own local self-attention network in plain PyTorch. But you can also implement it based on longformer, which provides a custom CUDA kernel to accelerate computation and reduce GPU memory cost.
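For anyone curious what "local self-attention" means here, below is a minimal NumPy sketch of the windowed-attention idea (illustrative only, not the repo's implementation; a single head, with the query/key/value projections omitted): each position attends only to neighbors within a fixed window, which is exactly the access pattern longformer's CUDA kernel accelerates.

```python
import numpy as np

def local_self_attention(x, w):
    """Naive windowed self-attention over x of shape (seq_len, dim):
    position i attends only to positions j with |i - j| <= w.
    Single head; query/key/value projections omitted for brevity."""
    seq_len, dim = x.shape
    scores = x @ x.T / np.sqrt(dim)                    # (seq_len, seq_len)
    idx = np.arange(seq_len)
    outside = np.abs(idx[:, None] - idx[None, :]) > w  # local-window mask
    scores = np.where(outside, -np.inf, scores)
    # Row-wise softmax; masked positions get zero weight.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x                                 # (seq_len, dim)

x = np.random.randn(10, 4)
out = local_self_attention(x, w=2)
```

The naive version still builds the full (seq_len, seq_len) score matrix, which is why a dedicated kernel helps on long sequences: it only ever materializes the in-window scores.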

stonyhu avatar Dec 09 '20 04:12 stonyhu

Hello @CrystalWang1225, could you please share your method for reinstalling transformers? I reinstalled it with pip uninstall transformers; pip install transformers==2.5.1 but still run into a segmentation fault. I also tried installing transformers from source with the following commands, but the segmentation fault still occurs.

git clone https://github.com/huggingface/transformers.git
cd transformers
pip install -e .
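To narrow down whether the import itself is what crashes, I run the import in a child process so a segfault kills the child rather than my session; on POSIX systems a negative return code means the child died from a signal (e.g. -11 for SIGSEGV on Linux):

```python
import subprocess
import sys

def import_exit_code(module):
    """Run `python -c "import <module>"` in a child process; a crash
    such as a segfault only kills the child, and shows up as a
    negative return code on POSIX systems."""
    proc = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        capture_output=True,
    )
    return proc.returncode

code = import_exit_code("transformers")
print(f"`import transformers` exited with {code}")
```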

Thanks in advance!

klong121 avatar Dec 13 '20 14:12 klong121

@klong121 So I downgraded transformers with pip install transformers==3.5.1 and train.py is working for us for now. These are the requirements we have in our CUDA environment:

absl-py==0.11.0
appdirs==1.4.4
audioread==2.1.9
boto3==1.16.35
botocore==1.19.35
cachetools==4.2.0
certifi==2020.12.5
cffi==1.14.4
chardet==3.0.4
click==7.1.2
decorator==4.4.2
distlib==0.3.1
essentia==2.1b6.dev234
filelock==3.0.12
future==0.18.2
google-auth==1.24.0
google-auth-oauthlib==0.4.2
grpcio==1.34.0
idna==2.10
imageio==2.9.0
importlib-metadata==2.0.0
jmespath==0.10.0
joblib==0.17.0
librosa==0.8.0
llvmlite==0.35.0
Markdown==3.3.3
numba==0.52.0
numpy==1.19.4
oauthlib==3.1.0
packaging==20.8
pandas==1.1.5
Pillow==8.0.1
pooch==1.3.0
protobuf==3.14.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
pyparsing==2.4.7
python-dateutil==2.8.1
pytorch-lightning==0.6.0
pytz==2020.4
PyYAML==5.3.1
regex==2020.11.13
requests==2.25.0
requests-oauthlib==1.3.0
resampy==0.2.2
rsa==4.6
s3transfer==0.3.3
sacremoses==0.0.43
scikit-learn==0.23.2
scipy==1.5.4
sentencepiece==0.1.91
six==1.15.0
SoundFile==0.10.3.post1
tensorboard==2.4.0
tensorboard-plugin-wit==1.7.0
tensorboardX==2.1
test-tube==0.7.5
threadpoolctl==2.1.0
tokenizers==0.9.3
torch==1.3.1
torchvision==0.4.2
tqdm==4.54.1
transformers==3.5.1
urllib3==1.26.2
virtualenv==20.0.35
Werkzeug==1.0.1
zipp==3.3.1
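To quickly compare the critical pins above against what is actually installed in your environment, a small stdlib-only helper (the packages and versions in PINS are just the ones from our list; adjust as needed):

```python
from importlib import metadata

# A few of the pins from the list above; extend as needed.
PINS = {"transformers": "3.5.1", "torch": "1.3.1", "numpy": "1.19.4"}

def check_pins(pins):
    """Return {package: (wanted, installed, matches)} for each pin;
    installed is None when the package is not found at all."""
    report = {}
    for pkg, wanted in pins.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            installed = None
        report[pkg] = (wanted, installed, installed == wanted)
    return report

for pkg, (wanted, got, ok) in check_pins(PINS).items():
    print(f"{pkg}: pinned {wanted}, installed {got}, {'OK' if ok else 'MISMATCH'}")
```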

CrystalWang1225 avatar Dec 14 '20 03:12 CrystalWang1225

Thanks for your help! I downgraded the transformers and the code is working for me now. @CrystalWang1225

klong121 avatar Dec 14 '20 09:12 klong121

Hi, I am confused about the meaning of BOS_POSE in pose.py, which is used during training. Is it the initialization of the skeleton? How is it obtained?

BOS_POSE = np.array([
    -0.00647944, -0.540766, -0.00655532, -0.3829, -0.0862469, -0.388304,
    -0.107649, -0.137507, -0.0648178, 0.0423361, 0.0791047, -0.382726,
    0.0882645, -0.148286, 0.0699688, 0.0641147, 0.0056175, 0.05319,
    -0.0464647, 0.0476973, -0.0465082, 0.467254, -0.031224, 0.854397,
    0.0576743, 0.0585055, 0.0394117, 0.478282, 0.0208931, 0.854322,
    -0.0158758, -0.568058, 0.0147899, -0.568106, -0.0465137, -0.568036,
    0.0391742, -0.562693, 0.0209782, 0.936187, 0.0424097, 0.930442,
    0.0148087, 0.886806, -0.0249588, 0.935932, -0.0434218, 0.935805,
    -0.0280986, 0.886854
])
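For what it's worth, the flat array has 50 values; my assumption is that it stores (x, y) pairs, i.e. 25 two-dimensional keypoints (OpenPose-style joints), and that it acts like a BOS token prepended to seed autoregressive decoding. A quick layout check (the zero arrays below are just stand-ins with the same shapes):

```python
import numpy as np

# Stand-in with the same length as the BOS_POSE array above; any
# 50-value pose would do for checking the layout.
bos_pose = np.zeros(50)

joints = bos_pose.reshape(-1, 2)   # assumed (x, y) pairs
print(joints.shape)                # (25, 2)

def prepend_bos(seq, bos):
    """Prepend the begin-of-sequence pose to a pose sequence of shape
    (T, 50), the way a BOS token seeds an autoregressive decoder."""
    return np.vstack([bos[None, :], seq])

seq = np.zeros((8, 50))                  # a hypothetical 8-frame sequence
print(prepend_bos(seq, bos_pose).shape)  # (9, 50)
```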

Thanks very much!

aryanna384 avatar Dec 25 '20 15:12 aryanna384