DPTNet
Can this audio encoder handle 3 to 4 minutes of audio?
Can this encoder directly process a 3 to 4 minute audio tensor and produce global features?