FastSAM
Convert FastSAM to ONNX and CoreML format
Hi,
Has someone converted FastSAM to the ONNX or CoreML format? Since FastSAM is based on the YOLOv8 model and takes an image path as input, how do you get an image trace for it and convert that to CoreML format? Also, how can it be converted to CoreML such that the output can be of variable size?
You can write a simple script to export the vision model to CoreML using the Ultralytics export API:
from fastsam import FastSAM
fsam = FastSAM("weights/FastSAM-x.pt")
fsam.export(format="coreml")
The CoreML export above already achieves good performance, but both performance and model size on disk can be improved by using the MLProgram format, which uses FP16 storage and inference by default. Here is an example for the S variant:
from fastsam import FastSAM
import coremltools as ct
import torch
def main():
    fsam = FastSAM("weights/FastSAM-s.pt")
    # Export to TorchScript first, then convert the traced model to CoreML.
    fsam.export(format="torchscript")
    imgsz = fsam.overrides["imgsz"]

    ts = torch.jit.load("weights/FastSAM-s.torchscript")
    model = ct.convert(
        ts,
        inputs=[
            ct.ImageType(
                name="image", scale=1 / 255, bias=(0, 0, 0), shape=(1, 3, imgsz, imgsz)
            )
        ],
        convert_to="mlprogram",
    )
    model.save("weights/FastSAM-s.mlpackage")

if __name__ == "__main__":
    main()
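To check what the converted model exposes before wiring it into an app, the saved package can be inspected with coremltools (a small sketch; the output names are assigned automatically by the converter, so list them rather than assuming them):
import coremltools as ct

model = ct.models.MLModel("weights/FastSAM-s.mlpackage")
# List input and output feature names to see what the converted graph exposes.
spec = model.get_spec()
print([inp.name for inp in spec.description.input])
print([out.name for out in spec.description.output])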
You would still need to export the prompt model or keep it in torch/torchscript.
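For reference, the prompt step stays in Python either way; a short sketch following the FastSAM repository's usage (argument names as I recall them from its README, and the image path is a placeholder, so treat both as assumptions):
from fastsam import FastSAM, FastSAMPrompt

# Run the vision model in torch, then apply prompts in Python.
model = FastSAM("weights/FastSAM-s.pt")
results = model("images/example.jpg", device="cpu", retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)

prompt_process = FastSAMPrompt("images/example.jpg", results, device="cpu")
ann = prompt_process.everything_prompt()  # or box_prompt / point_prompt / text_prompt
prompt_process.plot(annotations=ann, output_path="output/example.jpg")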
@laclouis5 Thanks a lot! Is there a way to export FastSAM to a CoreML model such that it can accept variable-size input instead of the fixed shape (1024, 1024)?
@okdha1234 There is a way, though I don't know if this model will have optimal performance when using a different input shape than the default one. Here is an example of an export which accepts two input shapes (more can be added):
from fastsam import FastSAM
import coremltools as ct
import torch
def main():
    fsam = FastSAM("weights/FastSAM-s.pt")
    # Export to TorchScript first, then convert the traced model to CoreML.
    fsam.export(format="torchscript")
    imgsz = fsam.overrides["imgsz"]

    ts = torch.jit.load("weights/FastSAM-s.torchscript")
    model = ct.convert(
        ts,
        inputs=[
            ct.ImageType(
                name="image",
                scale=1 / 255,
                bias=(0, 0, 0),
                # The converted model accepts any of the enumerated shapes at runtime.
                shape=ct.EnumeratedShapes(
                    shapes=[
                        (1, 3, imgsz // 2, imgsz // 2),
                        (1, 3, imgsz, imgsz),
                    ],
                    default=(1, 3, imgsz, imgsz),
                ),
            )
        ],
        convert_to="mlprogram",
    )
    model.save("weights/FastSAM-s.mlpackage")

if __name__ == "__main__":
    main()
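Once exported this way, both enumerated resolutions can be used at prediction time, for example (a sketch; "image" matches the input name set above, the file name is a placeholder, and running CoreML predictions requires macOS):
import coremltools as ct
from PIL import Image

model = ct.models.MLModel("weights/FastSAM-s.mlpackage")

# CoreML accepts either of the enumerated shapes (assuming the default imgsz of 1024);
# any other size is rejected.
for size in (512, 1024):
    img = Image.open("example.jpg").resize((size, size))
    out = model.predict({"image": img})
    print(size, {name: getattr(value, "shape", None) for name, value in out.items()})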
We will close this issue for now to keep the issue tracker organized. However, if the problem persists or if you have any further questions, please feel free to comment here or open a new issue. We value your input and are happy to assist further.
@okdha1234 I think fsam.export(dynamic=True) can do this. Source: /Ultralytics/yolo/cfg/default.yaml:75
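For completeness, a minimal sketch of that flag (in the Ultralytics config, dynamic axes are documented for the ONNX/TensorRT export paths, so it is not certain that the CoreML exporter honours it):
from fastsam import FastSAM

fsam = FastSAM("weights/FastSAM-s.pt")
# dynamic=True requests dynamic input axes; documented for ONNX/TensorRT exports.
fsam.export(format="onnx", dynamic=True)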
@laclouis5 Thanks a lot for your help. I used your answer to generate an mlpackage and got the VNCoreMLFeatureValueObservation successfully, but when I try to visualize the multiArrayValue I only get nonsense. I'd like to know the exact structure of the CoreML output and how to visualize the segments.
Sorry for the trouble, I figured out the reason: I was simply reusing the solution for DeepLabV3, whose output is just 513×513, not FastSAM's 1×32×256×256.
@foxstudiohua The CoreML model shows there are two outputs, 1×37×8400 and 1×32×160×60. Do you know what the first one is? And how did you get a 1×32×256×256 output?
I got the information from https://github.com/orgs/ultralytics/discussions/3417, and I think my 256×256 output just comes from a different config when converting the model to CoreML.
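For anyone else landing here: per the linked discussion, the 1×37×8400 tensor is the raw detection output (4 box coordinates, 1 object score, and 32 mask coefficients for each of the 8400 candidate boxes), and the second output holds the 32 mask prototypes. A rough numpy sketch of combining the two into binary masks (names are my own; NMS and resizing/cropping of the masks to their boxes are omitted):
import numpy as np

def decode_masks(preds, protos, conf_thres=0.4):
    # preds:  (1, 37, 8400) -> 4 box coords, 1 score, 32 mask coefficients per box
    # protos: (1, 32, H, W) -> mask prototypes (resolution depends on the export)
    preds = preds[0].T                                 # (8400, 37)
    boxes, scores, coeffs = preds[:, :4], preds[:, 4], preds[:, 5:]
    # boxes are (cx, cy, w, h) in input-image pixels

    keep = scores > conf_thres                         # NMS would go here as well
    coeffs = coeffs[keep]                              # (N, 32)

    c, h, w = protos.shape[1:]
    protos = protos[0].reshape(c, -1)                  # (32, H*W)
    masks = 1.0 / (1.0 + np.exp(-(coeffs @ protos)))   # sigmoid of linear combination
    masks = masks.reshape(-1, h, w) > 0.5              # binary masks at prototype resolution
    return boxes[keep], scores[keep], masks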
Thanks :)