Intro of some sign language datasets suitable for research
Sign Language Datasets
Here, we introduce several publicly available sign language datasets. They are suitable for multiple sign language (SL) processing tasks, including SL recognition, translation and generation.
We also provide scripts for building LMDB databases, which save disk space and load quickly. All frames are converted to JPG format and stored as binary values in the LMDB database.
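As a rough illustration of the idea, the sketch below stores already-encoded JPG files as raw bytes in an LMDB database. It is not the repository's script: the key layout (`video_name/000000`), the function names, and the `map_size` default are all assumptions made for this example, and it presumes the frames were converted to `.jpg` beforehand. It relies on the third-party `lmdb` package.

```python
from pathlib import Path


def frame_key(video_name: str, frame_index: int) -> bytes:
    """Build a hypothetical LMDB key such as b'video/000003'."""
    return f"{video_name}/{frame_index:06d}".encode("utf-8")


def write_jpgs_to_lmdb(jpg_dir: str, lmdb_path: str, video_name: str,
                       map_size: int = 1 << 30) -> int:
    """Store every .jpg in jpg_dir as raw bytes; return the frame count."""
    import lmdb  # third-party (pip install lmdb), imported only when needed

    # Sort by file name so frame indices follow the on-disk order.
    frames = sorted(Path(jpg_dir).glob("*.jpg"))
    env = lmdb.open(lmdb_path, map_size=map_size)
    # The write transaction commits when the with-block exits cleanly.
    with env.begin(write=True) as txn:
        for index, frame in enumerate(frames):
            txn.put(frame_key(video_name, index), frame.read_bytes())
    env.close()
    return len(frames)
```

Keeping the JPG bytes as values means a reader only has to decode the frames it needs, at the cost of one decode per access.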
Usage
usage: lmdb_dataset_modality.py [-h] [-nw NUM_WORKERS] [-tt TARGET_TMP_PATH] source_path target_path
source_path: the path where the original data is stored
target_path: the path where the LMDB database will be stored
target_tmp_path: the path where the transformed images are stored. If -tt ... is not set, the temporary .jpg files are deleted after they are stored in LMDB.
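For readers who want to see how such a command line fits together, here is a minimal `argparse` sketch that accepts the same arguments as the usage line above. The actual parser in the scripts is not shown in this document, so the defaults, help strings, and the `build_parser` name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the usage line (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="lmdb_dataset_modality.py")
    parser.add_argument("-nw", dest="num_workers", type=int, default=1,
                        help="number of worker processes (default assumed)")
    parser.add_argument("-tt", dest="target_tmp_path", default=None,
                        help="where transformed images are kept, if set")
    parser.add_argument("source_path", help="where the original data is stored")
    parser.add_argument("target_path", help="where the LMDB will be stored")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # When -tt is omitted, target_tmp_path stays None, which a script could
    # treat as "delete the temporary .jpg files after writing them to LMDB".
    print(args)
```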
RWTH-PHOENIX-Weather 2014 (German SL)
Keywords: continuous SL, sign gloss
Links: Homepage, Paper (CVIU'2015)
phoenix-2014-multisigner
LMDB Database
fullFrame-210x260px
python scripts/lmdb_ph14_full_rgb.py .../fullFrame-210x260px lmdb/ph14/full_rgb_224 -nw 4
trackedRightHand-92x132px
python scripts/lmdb_ph14_hand_rgb.py .../trackedRightHand-92x132px lmdb/ph14/hand_rgb_112 -nw 4
RWTH-PHOENIX-Weather 2014 T (German SL)
Keywords: continuous SL, sign gloss, German translation
Links: Homepage, Paper (CVPR'2018)
LMDB Database
fullFrame-210x260px
python scripts/lmdb_ph14-t_full_rgb.py .../fullFrame-210x260px lmdb/ph14T/full_rgb_224 -nw 4
Pose Annotation
In STMC (AAAI'20), the authors use HRNet (CVPR'19) for automatic pose annotation.
The estimated upper-body keypoints of each video are stored as an array of shape (T, 7, 2), and the arrays are saved in a dict indexed by video name.
Each keypoint is recorded as (w, h), with both values normalized to [0, 1].
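To overlay the keypoints on a frame, the normalized values have to be scaled back to pixel units. The sketch below assumes that T is the number of frames, that w and h are fractions of the frame width and height, and that the frame size is passed in by the caller (for example 210 and 260 for the fullFrame-210x260px images). The `to_pixels` helper is hypothetical and not part of the repository.

```python
import numpy as np


def to_pixels(keypoints: np.ndarray, width: int, height: int) -> np.ndarray:
    """Scale normalized (w, h) keypoints of shape (T, 7, 2) to pixel units."""
    # Broadcasting multiplies the last axis by (width, height).
    scale = np.array([width, height], dtype=np.float32)
    return keypoints.astype(np.float32) * scale


if __name__ == "__main__":
    # Synthetic stand-in for one video's keypoints: 3 frames, 7 joints.
    demo = np.full((3, 7, 2), 0.5, dtype=np.float32)
    print(to_pixels(demo, width=210, height=260)[0, 0])
```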
Download Links
| Dataset | HRNet |
|---|---|
| PHOENIX-2014 | GoogleDrive |
| PHOENIX-2014-T | GoogleDrive |
How to read
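import pickle as pkl

# Load the dict that maps each video name to its keypoint array.
with open('pose_phoenix2014_up_hrnet_TxN_wh.pkl', 'rb') as f:
    dict_pose = pkl.load(f)
# Print the shape of one video's keypoints: (T, 7, 2).
print(dict_pose['01April_2010_Thursday_heute_default-0'].shape)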