run on the drone in real-time

Open llllliu123 opened this issue 6 months ago • 2 comments

I want to use VGGT to provide pose messages on a drone. I first measured the computation time on my laptop (with an RTX 2070), and it takes nearly 1 second to process a single image. Is that expected?

llllliu123 avatar Jun 25 '25 11:06 llllliu123

Image

llllliu123 avatar Jun 25 '25 11:06 llllliu123

Are you sure you're using cuda in eval mode?

dtype = torch.bfloat16

# Check if cuda is available and set the device accordingly
device = "cuda" if torch.cuda.is_available() else "cpu"

# Initialize the model and load the pretrained weights.
# This will automatically download the model weights the first time it's run, which may take a while.
model = VGGT.from_pretrained("facebook/VGGT-1B").to(device)
model.eval()

haixuanTao avatar Jun 25 '25 12:06 haixuanTao

import torch
import time
from vggt.models.vggt import VGGT
from vggt.utils.load_fn import load_and_preprocess_images

device = "cuda" if torch.cuda.is_available() else "cpu"

dtype = torch.bfloat16 if torch.cuda.get_device_capability()[0] >= 8 else torch.float16

model = VGGT.from_pretrained("facebook/VGGT-1B").to(device)
model.eval()

image_names = ["/home/lll/vggt/1.png"]
images = load_and_preprocess_images(image_names).to(device)

with torch.no_grad():
    with torch.cuda.amp.autocast(dtype=dtype):
        start_time = time.time()
        # Predict attributes including cameras, depth maps, and point maps.
        predictions = model(images)
        agg_time = time.time() - start_time
        print(f"Inference finished, elapsed time: {agg_time:.4f} s")

Image

llllliu123 avatar Jun 26 '25 09:06 llllliu123

This is my code, now in eval mode. It still takes 0.79 seconds.

llllliu123 avatar Jun 26 '25 09:06 llllliu123

Yeah, that sounds about right to me. I'd imagine torch.compile would speed it up, but on an RTX 2070 this is not surprising.
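One caveat about the measurement itself: CUDA work is launched asynchronously, and the first forward pass includes one-time initialization, so a single time.time() call around one inference can overstate steady-state latency. Below is a minimal sketch of a timing helper that adds warm-up runs and an explicit synchronization hook. The names `benchmark`, `warmup`, `iters`, and `sync` are hypothetical and not part of VGGT or PyTorch; the demo workload is a stand-in, not the model.

```python
import statistics
import time
from typing import Callable, Optional


def benchmark(fn: Callable[[], object],
              warmup: int = 3,
              iters: int = 10,
              sync: Optional[Callable[[], None]] = None) -> dict:
    """Time fn() over several runs after warm-up; returns stats in seconds.

    `sync` is an optional hook that blocks until pending device work is
    finished (an assumption here: on a CUDA device one would pass
    torch.cuda.synchronize).
    """
    def _sync() -> None:
        if sync is not None:
            sync()

    # Warm-up runs absorb one-time costs (lazy init, kernel selection, caches).
    for _ in range(warmup):
        fn()
    _sync()

    times = []
    for _ in range(iters):
        _sync()                      # make sure nothing is still queued
        start = time.perf_counter()
        fn()
        _sync()                      # wait for the work to actually finish
        times.append(time.perf_counter() - start)

    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "min": min(times),
        "max": max(times),
        "runs": len(times),
    }


if __name__ == "__main__":
    # Stand-in CPU workload; replace with the real forward pass.
    stats = benchmark(lambda: sum(range(100_000)))
    print(f"median latency: {stats['median']:.6f} s over {stats['runs']} runs")
```

With the script posted earlier in this thread, one would presumably pass something like `lambda: model(images)` as `fn` (inside the same no_grad and autocast contexts) and `torch.cuda.synchronize` as `sync`. The steady-state number may still be too slow for a drone, but it separates start-up cost from per-frame cost.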

haixuanTao avatar Jun 26 '25 11:06 haixuanTao

Thanks for your answer. I now think that VGGT is unusable on real-time embedded platforms like drones, because I don't think a Jetson Orin NX will do much better than my RTX 2070.

llllliu123 avatar Jun 30 '25 08:06 llllliu123

I'm working on a way to run it in the cloud in real time with dora-rs, my project, since running it embedded is difficult. But I don't know about drones.

haixuanTao avatar Jun 30 '25 12:06 haixuanTao

This is my code, now in eval mode. It still takes 0.79 seconds.

Why doesn't my RTX3060 work?

TDyyds6 avatar Jul 17 '25 11:07 TDyyds6

I would guess it's memory.

haixuanTao avatar Jul 17 '25 16:07 haixuanTao