Prediction performed on CPU

Open liningpan opened this issue 3 years ago • 0 comments

Hi,

I recently discovered that, at least in some cases, Py-FEAT is not actually using GPUs for inference. I'm not sure whether this is expected or whether it will be addressed by #133.

Here is the script I used for profiling.

from feat import Detector
from torch.profiler import profile, ProfilerActivity

detector = Detector(
    face_model="retinaface",
    landmark_model="mobilefacenet",
    au_model="svm",
    emotion_model="resmasknet",
    facepose_model="img2pose",
)

video_location = "./WolfgangLanger_Pexels.mp4"

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA], record_shapes=True, use_cuda=True) as prof:
    video_prediction = detector.detect_video(video_location, skip_frames=24)

print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=20))
prof.export_chrome_trace("trace.json")

The trace.json file produced by the script can be visualized in Chrome at chrome://tracing.

We can see that very little work is actually done on the GPU. (Screenshot: full profile)
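For anyone who wants a number rather than a visual impression, here is a minimal sketch that sums event durations per category straight from trace.json. It assumes the file follows the Chrome trace event format (complete events have "ph" equal to "X" and a "dur" in microseconds). The helper names are mine, and the category names (the "cat" field) depend on the profiler version, so treat the output as a rough guide. Nested CPU ops overlap in time, so the CPU totals will overcount.

```python
import json
from collections import defaultdict


def load_trace_events(path):
    """Load events from a Chrome trace file (hypothetical helper)."""
    with open(path) as f:
        data = json.load(f)
    # A trace is either a bare list of events or an object with "traceEvents".
    return data["traceEvents"] if isinstance(data, dict) else data


def duration_by_category(events):
    """Sum the duration in microseconds of complete events, per category."""
    totals = defaultdict(float)
    for ev in events:
        # "X" marks a complete event that carries its own duration.
        if ev.get("ph") == "X" and "dur" in ev:
            totals[ev.get("cat", "unknown")] += ev["dur"]
    return dict(totals)


# Usage, assuming trace.json sits in the working directory:
#   totals = duration_by_category(load_trace_events("trace.json"))
#   for cat, us in sorted(totals.items(), key=lambda kv: -kv[1]):
#       print(f"{cat:30s} {us / 1000:12.2f} ms")
```

Comparing the total for the GPU kernel category against the CPU op category should give a rough idea of how lopsided the split is.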

If we zoom in, we can see that conv2d calls mkldnn, whereas I was expecting CUDA to be used. (Screenshot: zoomed in on a CPU section)
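One possible explanation for conv2d going through mkldnn is that the model weights (and inputs) are on the CPU. Below is a minimal sketch for checking where a module's parameters live. The function name is hypothetical, and it works on any object that behaves like a torch.nn.Module. I have not checked which attributes of Detector hold the underlying networks, so that part is left to the reader.

```python
from collections import Counter


def parameter_devices(module):
    """Count parameter elements per device for a torch.nn.Module-like object.

    Relies only on module.parameters(), p.device and p.numel().
    """
    counts = Counter()
    for p in module.parameters():
        counts[str(p.device)] += p.numel()
    return dict(counts)


# Usage with PyTorch installed (the layer here is purely illustrative):
#   import torch
#   print(torch.cuda.is_available())
#   print(parameter_devices(torch.nn.Conv2d(3, 8, kernel_size=3)))
```

If every parameter is reported on "cpu" while torch.cuda.is_available() returns True, the model was probably never moved to the GPU.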

Here is a section that is actually running on the GPU. (Screenshot: zoomed in on a GPU/cuDNN section)

Additional Information:
- AWS EC2 g4dn.xlarge (4 vCPU, 16 GB RAM, 1 x Tesla T4)
- Ubuntu 20.04 LTS
- NVIDIA Driver 515.65.01
- PyTorch 1.12.1 + CUDA 11.3
- py-feat 0.4.0

Sep 23 '22 19:09 liningpan