Feature Request: datum-to-Pose

Open cleong110 opened this issue 1 year ago • 2 comments

For a number of applications, it would be nice to be able to work with poses using the Pose-format library. So, given a datum, it would be ideal to be able to reconstruct the Pose format, or at least retrieve the Header to know which point is which.

Applications include

  • Visualization of the pose sequences.
  • Being able to apply pose normalization or select specific points, as in SignCLIP.
  • Ability to save off the data as .pose files.

I tried a few methods for reconstructing the pose from the datum and wasn't able to figure it out. Currently, what you get from a tfds.load with pose="holistic" is a Tensor, without the accompanying header that explains things like the fps, which point is the NOSE point, and so forth. Eventually I just edited the data loader in question to also save off .pose files.

cleong110 avatar Dec 12 '24 15:12 cleong110

I absolutely agree. It is even possible we could do it as part of the decoding in https://github.com/sign-language-processing/datasets/blob/master/sign_language_datasets/utils/features/pose_feature.py but I am not sure.

What I do, and it is terrible that this is not part of the library, is:

import importlib

from pose_format import Pose
from pose_format.numpy import NumPyPoseBody
from pose_format.pose_header import PoseHeader
from pose_format.utils.reader import BufferReader

dataset_name = "dgs_corpus"  # for example

# Dynamically import the dataset module
dataset_module = importlib.import_module(f"sign_language_datasets.datasets.{dataset_name}.{dataset_name}")

# Read the pose header from the dataset's predefined file
with open(dataset_module._POSE_HEADERS["holistic"], "rb") as buffer:
    pose_header = PoseHeader.read(BufferReader(buffer.read()))

# Build the pose body from the tensors in a datum loaded with tfds.load
pose_body = NumPyPoseBody(
    fps=float(datum["pose"]["fps"].numpy()),
    data=datum["pose"]["data"].numpy(),
    confidence=datum["pose"]["conf"].numpy(),
)

# Construct the Pose object
pose = Pose(pose_header, pose_body)

AmitMY avatar Dec 13 '24 19:12 AmitMY

The thought occurs to me that we could detect some common formats, e.g. holistic, from the shape of the body data. We could possibly throw up a visualization as well for a manual check.
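
A rough sketch of what that shape check might look like, not something that exists in the library today. It assumes the body data is laid out as (frames, people, points, dims), and the function name guess_pose_format and the table of point counts are hypothetical. The counts are the usual landmark totals for full MediaPipe Holistic and OpenPose output; variants such as a reduced face mesh or added world landmarks would need their own entries, checked against the headers each dataset actually ships.

```python
from typing import Optional, Sequence

# Hypothetical lookup table: total number of points -> format name.
# These counts are assumptions and should be verified against real headers.
KNOWN_POINT_COUNTS = {
    543: "holistic",  # MediaPipe Holistic: 33 pose + 468 face + 2 x 21 hand
    137: "openpose",  # OpenPose: 25 body + 70 face + 2 x 21 hand
}


def guess_pose_format(shape: Sequence[int]) -> Optional[str]:
    """Guess the pose format from a body data shape.

    Assumes the layout (frames, people, points, dims). Returns None when
    the shape is not 4-dimensional or the point count is not recognized.
    """
    if len(shape) != 4:
        return None
    return KNOWN_POINT_COUNTS.get(shape[2])
```

With a datum from tfds.load, this could be called as guess_pose_format(datum["pose"]["data"].shape). A None result would be the cue to fall back to a manual check, such as the visualization mentioned above.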

cleong110 avatar Mar 12 '25 14:03 cleong110