deepsparse
[CLIP] Validation Script
Question - should this script include some sort of metric generation for zeroshot classification on some dataset? Right now, the script just takes in local samples
Summary
- Validation script to run the CLIP Zeroshot and CLIP Caption generation pipelines
- Both tasks are supported; the default task is Zeroshot classification
- The script adds an extra layer where the user can provide text files with paths to images and sample classes. Examples of both text files are provided in this PR: `src/deepsparse/clip/sample_classes.txt` and `src/deepsparse/clip/sample_images.txt`. Both are used as defaults.
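As a rough sketch of that extra layer (the helper name `read_lines` is hypothetical and may not match the actual script), each text file holds one entry per line and can be parsed with the standard library alone:

```python
import tempfile
from pathlib import Path

def read_lines(path):
    """Return one entry per non-blank line (hypothetical helper, stdlib only)."""
    return [line.strip() for line in Path(path).read_text().splitlines() if line.strip()]

# Throwaway file standing in for sample_classes.txt from the PR
tmp = Path(tempfile.mkdtemp()) / "sample_classes.txt"
tmp.write_text("a dog\nan elephant\n\na cat\n")

print(read_lines(tmp))  # blank lines are dropped
```

The same reader would cover both defaults, since `sample_images.txt` (image paths) and `sample_classes.txt` (class labels) share the one-entry-per-line layout.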
Testing
- Tested locally using the following commands. The default `sample_classes.txt` was used, and `sample_images.txt` was updated to include `thailand.jpg` (an image of elephants) from the YOLACT sample images.
Caption Generation:
python validation.py --task caption --decoder-model /home/dsikka/deepsparse/tests/deepsparse/pipelines/final_models/clip_text_decoder.onnx --text-model /home/dsikka/deepsparse/tests/deepsparse/pipelines/final_models/clip_text.onnx --visual-model /home/dsikka/deepsparse/tests/deepsparse/pipelines/final_models/clip_visual.onnx
Output:
Class prediction for /home/dsikka/deepsparse/src/deepsparse/yolact/sample_images/thailand.jpg, an adult elephant and a baby elephant
Zeroshot Classification:
python validation.py --visual-model ~/deepsparse/tests/deepsparse/pipelines/zeroshot_research/visual/model.onnx --text-model ~/deepsparse/tests/deepsparse/pipelines/zeroshot_research/text/model.onnx
Output:
Class prediction for /home/dsikka/deepsparse/src/deepsparse/yolact/sample_images/thailand.jpg, an elephant
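For reference, the prediction above boils down to picking the class whose text embedding is most similar to the image embedding. A minimal, framework-free sketch of that idea with toy 3-d vectors (these stand in for the actual CLIP visual/text model outputs; the real pipeline works on much larger embeddings):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def zeroshot_predict(image_emb, class_embs, labels):
    """Return the label whose text embedding is closest to the image embedding."""
    scores = [cosine(image_emb, emb) for emb in class_embs]
    return labels[scores.index(max(scores))]

# Toy embeddings: the image vector leans heavily toward the first class
image_emb = [0.9, 0.1, 0.0]
class_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
labels = ["an elephant", "a dog", "a cat"]

print(zeroshot_predict(image_emb, class_embs, labels))  # -> an elephant
```

The actual pipeline may apply a softmax over scaled similarities rather than a raw argmax, but the winning class is the same either way.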