On-boarding Multimodal transforms to DPK
Search before asking
- [x] I searched the issues and found no similar issues.
Component
Transforms/Other
Feature
Work has been done at IBM on multimodal and multilingual transforms. We need to coordinate with the owners (Dhiraj, Pengyuan, Rogerio, Juergen) to bring these transforms to DPK.
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
Some initial thoughts on priority transforms:
- People Detect + Face Blur
- NSFW (Not Safe for Work)
- p2j (converts Parquet to JSON with custom fields, including writing out the blurred images)
Question: Should the current j2p be renamed to llava2parquet and p2j to parquet2llava, assuming we want to stick with llava? cc: @daw3rd
Summary of discussion with Dhiraj and Pengyuan on 4/22
The key to bringing the multimodal transforms from inner to outer (that is, from IBM-internal to the open DPK) is to rewrite the abstraction-layer code that the individual transforms use. This code was developed by @daw3rd, and we would need David's help, or that of someone on the DPK team, to rewrite it. Once that is done, it should be relatively simple to bring the NSFW transform (done by Michele) or p2j (done by Pengyuan) into the open DPK.
cc: @touma-I
PR #1278