
Support for loading an 8-bit quantized OWL-ViT model in FiftyOne

Open solomonmanuelraj opened this issue 3 months ago • 1 comment

Hi team,

I would like to load the 8-bit quantized OWL-ViT model in FiftyOne.

############################################################################################
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F
from transformers import BitsAndBytesConfig

# The 80 COCO classes, used both to filter the dataset and as model prompts
classes = [
    'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train',
    'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign',
    'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
    'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag',
    'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite',
    'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
    'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon',
    'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot',
    'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant',
    'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
    'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink',
    'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
    'hair drier', 'toothbrush',
]

quant_dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    label_types=["detections"],
    max_samples=200,
    classes=classes,
    only_matching=True,
)

# Loading the 8-bit quantized model
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model_type = "zero-shot-detection-transformer-torch"
name_or_path = "google/owlvit-base-patch32"  ## <- Owl-ViT

# load model
quant_model = foz.load_zoo_model(
    model_type,
    name_or_path=name_or_path,
    quantization_config=bnb_config,
    device_map="auto",
)

# can set classes at any time
quant_model.classes = classes

quant_dataset.apply_model(quant_model, label_field="owlvit_quant")
############################################################################################

Is this supported in FiftyOne?

solomonmanuelraj, Mar 12 '24 06:03

Hey @solomonmanuelraj,

Thanks for your interest, and great question!

The functionality of loading zero-shot models from the zoo is actually part of the core FiftyOne library, not this plugin. This plugin just provides a simple and streamlined interface specifically for zero-shot tasks.

As for your specific question, I think it should be possible to load the model directly from Hugging Face transformers (with 8-bit quantization) and then pass it to the function convert_transformers_model from here. You can then set the classes and apply the model to your data. Want to give this a try?
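A minimal, untested sketch of that approach follows. It assumes a CUDA GPU with the bitsandbytes and accelerate packages installed, that convert_transformers_model is importable from fiftyone.utils.transformers, and that it recognizes a quantized OWL-ViT as a zero-shot detection model. The helper names load_quantized_owlvit and apply_quantized_owlvit are hypothetical, chosen only for illustration.

```python
def load_quantized_owlvit(name_or_path="google/owlvit-base-patch32"):
    """Load OWL-ViT in 8-bit via transformers and wrap it as a FiftyOne model.

    Hypothetical helper. Assumes a CUDA GPU plus the `bitsandbytes` and
    `accelerate` packages, which 8-bit loading in transformers relies on.
    """
    # Heavy imports are kept local so that defining the helper has no side effects
    from transformers import BitsAndBytesConfig, OwlViTForObjectDetection
    import fiftyone.utils.transformers as fout

    bnb_config = BitsAndBytesConfig(load_in_8bit=True)

    # Quantization is applied by transformers at load time
    hf_model = OwlViTForObjectDetection.from_pretrained(
        name_or_path,
        quantization_config=bnb_config,
        device_map="auto",
    )

    # Assumption: the converter detects OWL-ViT as a zero-shot detector
    # and accepts a model that has already been quantized
    return fout.convert_transformers_model(hf_model)


def apply_quantized_owlvit(dataset, classes, label_field="owlvit_quant"):
    """Set the zero-shot classes and run the wrapped model over a dataset."""
    model = load_quantized_owlvit()
    model.classes = list(classes)  # classes can be set at any time
    dataset.apply_model(model, label_field=label_field)
    return model
```

With the dataset and class list from the original post, this would be called as apply_quantized_owlvit(quant_dataset, classes). If the conversion step rejects the quantized model, that would need to be addressed in the core FiftyOne library rather than in this plugin.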

jacobmarks, Mar 15 '24 20:03