Sign-Language-Datasets
An introduction to several sign language datasets suitable for research
Sign Language Datasets
Here, we introduce several publicly available sign language datasets. They are suitable for a range of sign language (SL) processing tasks, including SL recognition, translation, and generation.
We also describe how to build LMDB databases, which save space and load quickly. All frames are converted to JPG format and stored as binary data in the LMDB database.
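The storage scheme described above can be sketched roughly as follows. This is a minimal illustration, not the repository's script: it assumes the third-party lmdb package is installed, and the key layout (`<video>/<frame index>`), the default `map_size`, and the helper names `frame_key` and `write_video` are all hypothetical.

```python
from typing import Iterable


def frame_key(video_name: str, frame_idx: int) -> bytes:
    """Build an LMDB key for one frame (hypothetical layout: '<video>/<index>')."""
    return f"{video_name}/{frame_idx:06d}".encode("utf-8")


def write_video(lmdb_path: str, video_name: str, jpeg_frames: Iterable[bytes],
                map_size: int = 1 << 30) -> int:
    """Store already-encoded JPEG frames as raw bytes and return the frame count."""
    import lmdb  # third-party; imported lazily so frame_key works without it

    env = lmdb.open(lmdb_path, map_size=map_size)  # map_size is an assumed upper bound
    count = 0
    try:
        with env.begin(write=True) as txn:  # one write transaction per video
            for idx, jpeg in enumerate(jpeg_frames):
                txn.put(frame_key(video_name, idx), jpeg)
                count += 1
    finally:
        env.close()
    return count
```

Storing the compressed JPEG bytes rather than decoded arrays is what keeps the database compact; frames are decoded only when they are read back.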
Usage
usage: lmdb_dataset_modality.py [-h] [-nw NUM_WORKERS] [-tt TARGET_TMP_PATH] source_path target_path
- source_path: the path where the original data is stored
- target_path: the path where the LMDB will be stored
- target_tmp_path: the path where the transformed images are stored. If -tt ... is not set, the temporary .jpg files are deleted after being stored in LMDB.
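For reference, a command-line interface with the usage line shown above could be declared with argparse roughly as below. This is a sketch under assumptions: the long option names (`--num_workers`, `--target_tmp_path`), the default values, and the example paths are hypothetical and may differ from the actual scripts.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the arguments listed in the usage line (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="lmdb_dataset_modality.py")
    parser.add_argument("source_path",
                        help="path where the original data is stored")
    parser.add_argument("target_path",
                        help="path where the LMDB will be stored")
    parser.add_argument("-nw", "--num_workers", type=int, default=1,
                        help="number of worker processes (assumed default: 1)")
    parser.add_argument("-tt", "--target_tmp_path", default=None,
                        help="where to keep the transformed .jpg files; "
                             "if unset, they are treated as temporary")
    return parser


# Example invocation with placeholder paths.
args = build_parser().parse_args(["data/frames", "lmdb/out", "-nw", "4"])
print(args.num_workers, args.target_tmp_path)
```

Leaving `target_tmp_path` as `None` gives the script a simple way to tell that the intermediate images should be cleaned up once they are in the database.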
RWTH-PHOENIX-Weather 2014 (German SL)
Keywords: continuous SL, sign gloss
Links: Homepage, Paper (CVIU'2015)
phoenix-2014-multisigner
LMDB Database
fullFrame-210x260px
python scripts/lmdb_ph14_full_rgb.py .../fullFrame-210x260px lmdb/ph14/full_rgb_224 -nw 4
trackedRightHand-92x132px
python scripts/lmdb_ph14_hand_rgb.py .../trackedRightHand-92x132px lmdb/ph14/hand_rgb_112 -nw 4
RWTH-PHOENIX-Weather 2014 T (German SL)
Keywords: continuous SL, sign gloss, German translation
Links: Homepage, Paper (CVPR'2018)
LMDB Database
fullFrame-210x260px
python scripts/lmdb_ph14-t_full_rgb.py .../fullFrame-210x260px lmdb/ph14T/full_rgb_224 -nw 4
Pose Annotation
In STMC (AAAI'20), the authors use HRNet (CVPR'19) for automatic pose annotation.
The estimated upper-body keypoints of each video form an array of shape (T, 7, 2); the arrays are saved in a Dict indexed by video name.
Each keypoint is recorded as (w, h) and normalized to [0, 1].
Download Links
Dataset | HRNet |
---|---|
PHOENIX-2014 | GoogleDrive |
PHOENIX-2014-T | GoogleDrive |
How to read
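import pickle as pkl

# Load the dictionary that maps video names to keypoint arrays.
with open('pose_phoenix2014_up_hrnet_TxN_wh.pkl', 'rb') as f:
    dict_pose = pkl.load(f)
# Shape of one video's keypoints: (T, 7, 2).
print(dict_pose['01April_2010_Thursday_heute_default-0'].shape)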
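Because the keypoints are stored as normalized (w, h) pairs, they usually need to be scaled back to pixel coordinates before drawing or cropping. The sketch below shows one way to do this with NumPy. It uses a synthetic array in place of a real entry of `dict_pose`; the helper name `to_pixels` is hypothetical, and both the 210x260 frame size (taken from the fullFrame-210x260px directory name) and scaling by the full width and height are assumptions.

```python
import numpy as np


def to_pixels(pose: np.ndarray, width: int, height: int) -> np.ndarray:
    """Scale normalized (w, h) keypoints of shape (T, 7, 2) to pixel coordinates."""
    scale = np.array([width, height], dtype=np.float32)
    return pose.astype(np.float32) * scale  # broadcasts over the last axis


# Synthetic stand-in for one entry of dict_pose: 3 frames, 7 keypoints each.
pose = np.full((3, 7, 2), 0.5, dtype=np.float32)
pixels = to_pixels(pose, width=210, height=260)
print(pixels.shape)  # (3, 7, 2)
print(pixels[0, 0])  # [105. 130.]
```

Since the stored values are normalized, the same array can be scaled to any resized version of the frames by passing a different width and height.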