run on the drone in real-time
I want to use VGGT to provide pose messages on a drone. I first tested the computation time on my laptop (RTX 2070), and it takes nearly 1 second to process a single image. Is that expected?
Are you sure you're using cuda in eval mode?
```python
import torch
from vggt.models.vggt import VGGT

dtype = torch.bfloat16

# Check if CUDA is available and set the device accordingly
device = "cuda" if torch.cuda.is_available() else "cpu"

# Initialize the model and load the pretrained weights.
# This will automatically download the model weights the first time it's run, which may take a while.
model = VGGT.from_pretrained("facebook/VGGT-1B").to(device)
model.eval()
```
```python
import time

import torch
from vggt.models.vggt import VGGT
from vggt.utils.load_fn import load_and_preprocess_images

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if torch.cuda.get_device_capability()[0] >= 8 else torch.float16

model = VGGT.from_pretrained("facebook/VGGT-1B").to(device)
model.eval()

image_names = ["/home/lll/vggt/1.png"]
images = load_and_preprocess_images(image_names).to(device)

with torch.no_grad():
    with torch.cuda.amp.autocast(dtype=dtype):
        start_time = time.time()
        # Predict attributes including cameras, depth maps, and point maps.
        predictions = model(images)
        agg_time = time.time() - start_time
        print(f"Inference complete, time taken: {agg_time:.4f} s")
```
This is my code now in eval mode. It still takes 0.79 seconds.
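One caveat with this measurement: CUDA kernels launch asynchronously, so timing a forward pass with `time.time()` alone may not capture the full GPU cost. A minimal sketch of a more reliable measurement, assuming `model`, `images`, and `dtype` are set up as in the snippet above (the `synchronize` calls are the only addition):

```python
import time
import torch

with torch.no_grad():
    with torch.cuda.amp.autocast(dtype=dtype):
        torch.cuda.synchronize()   # ensure any pending GPU work is finished
        start_time = time.time()
        predictions = model(images)
        torch.cuda.synchronize()   # wait for the forward pass to actually complete
        agg_time = time.time() - start_time

print(f"Inference complete, time taken: {agg_time:.4f} s")
```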
Yeah, sounds about right to me as is. I would imagine torch.compile would optimize it, but on an RTX 2070 this is not surprising.
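For reference, a minimal sketch of what trying torch.compile could look like (PyTorch 2.x; any speed-up on an RTX 2070 is untested here, and the first call pays a one-time compilation cost):

```python
import torch
from vggt.models.vggt import VGGT
from vggt.utils.load_fn import load_and_preprocess_images

device = "cuda" if torch.cuda.is_available() else "cpu"
model = VGGT.from_pretrained("facebook/VGGT-1B").to(device)
model.eval()

# Compile the model with the default "inductor" backend (requires PyTorch >= 2.0).
compiled_model = torch.compile(model)

images = load_and_preprocess_images(["/home/lll/vggt/1.png"]).to(device)

with torch.no_grad():
    predictions = compiled_model(images)  # first call triggers compilation (slow)
    predictions = compiled_model(images)  # later calls reuse the compiled graph
```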
Thanks for your answer. I now think VGGT is unusable on real-time embedded platforms like drones, because I don't think a Jetson Orin NX will do much better than my RTX 2070.
So I'm working on a way to run it in the cloud in real time, since embedding it with dora-rs in my project is difficult. But I don't know about drones.
Why doesn't my RTX3060 work?
My guess would be memory.
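If memory is the suspect, a minimal sketch for checking it (assumes `model` and `images` are set up as in the snippets above):

```python
import torch

# Report how much VRAM the card has and how much is currently in use.
props = torch.cuda.get_device_properties(0)
print(f"GPU: {props.name}, total VRAM: {props.total_memory / 1024**3:.1f} GiB")
print(f"Allocated: {torch.cuda.memory_allocated() / 1024**3:.2f} GiB, "
      f"reserved: {torch.cuda.memory_reserved() / 1024**3:.2f} GiB")

# Wrapping the forward pass makes an out-of-memory failure explicit.
try:
    with torch.no_grad():
        predictions = model(images)
except torch.cuda.OutOfMemoryError:
    print("Ran out of GPU memory; try fewer/smaller images or a lower-precision dtype.")
```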