deepsparse
[CLIP] Validation Script
Question - should this script include some sort of metric generation for zeroshot classification on some dataset? Right now, the script just takes in local samples
Summary
- Validation script to run the CLIP Zeroshot and CLIP Caption generation pipelines
- Both tasks are supported; the default task is Zeroshot classification
- The script adds an extra layer where the user can provide text files with paths to images and sample classes. Examples of both text files are provided in this PR: `src/deepsparse/clip/sample_classes.txt` and `src/deepsparse/clip/sample_images.txt`. Both are used as defaults.
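As a rough sketch of that extra layer (the helper name `read_lines` is hypothetical and may not match the actual script), each text file holds one entry per line and can be parsed with the standard library alone:

```python
import tempfile
from pathlib import Path

def read_lines(path):
    """Return one entry per non-blank line (hypothetical helper, stdlib only)."""
    return [line.strip() for line in Path(path).read_text().splitlines() if line.strip()]

# Throwaway file standing in for sample_classes.txt from the PR
tmp = Path(tempfile.mkdtemp()) / "sample_classes.txt"
tmp.write_text("a dog\nan elephant\n\na cat\n")

print(read_lines(tmp))  # blank lines are dropped
```

The same reader would cover both defaults, since `sample_images.txt` (image paths) and `sample_classes.txt` (class labels) share the one-entry-per-line layout.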
Testing
- Tested locally using the following commands. The default `sample_classes.txt` was used, and `sample_images.txt` was updated to include `thailand.jpg` (an image of elephants) from the YOLACT sample images.
Caption Generation:
python validation.py --task caption --decoder-model /home/dsikka/deepsparse/tests/deepsparse/pipelines/final_models/clip_text_decoder.onnx --text-model /home/dsikka/deepsparse/tests/deepsparse/pipelines/final_models/clip_text.onnx --visual-model /home/dsikka/deepsparse/tests/deepsparse/pipelines/final_models/clip_visual.onnx
Output:
Class prediction for /home/dsikka/deepsparse/src/deepsparse/yolact/sample_images/thailand.jpg, an adult elephant and a baby elephant
Zeroshot Classification:
python validation.py --visual-model ~/deepsparse/tests/deepsparse/pipelines/zeroshot_research/visual/model.onnx --text-model ~/deepsparse/tests/deepsparse/pipelines/zeroshot_research/text/model.onnx
Output:
Class prediction for /home/dsikka/deepsparse/src/deepsparse/yolact/sample_images/thailand.jpg, an elephant
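For reference, the prediction above boils down to picking the class whose text embedding is most similar to the image embedding. A minimal, framework-free sketch of that idea with toy 3-d vectors (these stand in for the actual CLIP visual/text model outputs; the real pipeline works on much larger embeddings):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def zeroshot_predict(image_emb, class_embs, labels):
    """Return the label whose text embedding is closest to the image embedding."""
    scores = [cosine(image_emb, emb) for emb in class_embs]
    return labels[scores.index(max(scores))]

# Toy embeddings: the image vector leans heavily toward the first class
image_emb = [0.9, 0.1, 0.0]
class_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
labels = ["an elephant", "a dog", "a cat"]

print(zeroshot_predict(image_emb, class_embs, labels))  # -> an elephant
```

The actual pipeline may apply a softmax over scaled similarities rather than a raw argmax, but the winning class is the same either way.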