SenseVoice
SER performance on the CASIA dataset cannot be reproduced: the test result is only 34, not the reported 70
Notice: In order to resolve issues more efficiently, please raise issues following the template and include the relevant details.
❓ Questions and Help
Before asking:
- search the issues.
- search the docs.
What is your question?
I cannot reproduce the SER (speech emotion recognition) performance on the CASIA dataset. My test accuracy is only 34%, not the reported 70%.
Code
from funasr import AutoModel
import os
from funasr.utils.postprocess_utils import rich_transcription_postprocess

model_dir = "/local/path/to/SenseVoiceSmall"
model = AutoModel(
    model=model_dir,
    device="cuda:1",
    disable_update=True,
)

test_wav_dir = "/local/path/to/CASIA/6"
# Run inference on every file in every subdirectory of test_wav_dir.
for subdir in os.listdir(test_wav_dir):
    for file in os.listdir(os.path.join(test_wav_dir, subdir)):
        res = model.generate(
            input=os.path.join(test_wav_dir, subdir, file),
            output_dir="./outputs/debug",
            cache={},
            language="auto",  # "zh", "en", "yue", "ja", "ko", "nospeech"
            use_itn=True,
            batch_size_s=60,
            merge_vad=True,
            merge_length_s=15,
            ban_emo_unk=True,
        )


def metric(file):
    acc = 0
    total = 0
    gt_set, pred_set = set(), set()
    with open(file, "r") as flr:
        for line in flr:
            # Split only on the first space so that spaces in the transcript
            # do not break the unpacking.
            gt, pred = line.strip().split(" ", 1)
            # The ground-truth label is the second "-"-separated field of the key.
            gt = gt.split("-")[1]
            # The predicted emotion is the second <|...|> tag of the model output.
            pred = pred.split("|><|")[1].lower().replace("surprised", "surprise")
            gt_set.add(gt)
            pred_set.add(pred)
            if gt == pred:
                acc += 1
            total += 1
    print(gt_set, pred_set)
    print(acc, total, acc / total * 100)
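The positional `split` calls in `metric` depend on the exact shape of each result line. A slightly more defensive parse could look like the sketch below. It assumes each line has the form `<key> <|lang|><|EMOTION|><|event|><|itn|>text` and that the key looks like `<id>-<emotion>-...`; both formats are inferred from the code above, not confirmed. `parse_line` and `LABEL_MAP` are hypothetical names, and the `fearful` entry is an assumption about how the model spells that tag.

```python
import re

# Hypothetical mapping from model tag spellings to ground-truth spellings.
# "surprised" -> "surprise" comes from the original code; "fearful" -> "fear"
# is an assumption about the model's tag name.
LABEL_MAP = {"surprised": "surprise", "fearful": "fear"}

# Matches one <|...|> tag and captures its content.
TAG_RE = re.compile(r"<\|([^|]*)\|>")


def parse_line(line):
    """Return (gt, pred) for one result line, or None if it cannot be parsed."""
    parts = line.strip().split(maxsplit=1)
    if len(parts) != 2:
        return None
    key, text = parts
    key_fields = key.split("-")
    tags = TAG_RE.findall(text)
    if len(key_fields) < 2 or len(tags) < 2:
        return None
    pred = tags[1].lower()  # second tag is assumed to be the emotion
    return key_fields[1], LABEL_MAP.get(pred, pred)
```

Lines that return `None` can be counted separately, which shows whether malformed lines or genuine misclassifications account for the errors.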
The CASIA dataset was downloaded from https://aistudio.baidu.com/datasetdetail/209512 and used as-is after extraction.
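Test results:
- Ground-truth label set: {'surprise', 'angry', 'fear', 'sad', 'neutral', 'happy'}
- Predicted label set: {'surprise', 'angry', 'sad', 'neutral', 'happy'} (surprised has been converted to surprise so it can be compared with the ground truth)
- Correctly predicted samples: 409
- Total samples: 1200
- Accuracy: 34.08%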
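Because `fear` never appears in the predicted label set, every `fear` sample is counted as wrong, so a per-class breakdown may be more informative than the single accuracy figure. The sketch below tallies, for each ground-truth label, how often each prediction occurred. `per_class_report` is a hypothetical helper, and the example pairs are made up for illustration; they are not taken from pred.txt.

```python
from collections import Counter, defaultdict


def per_class_report(pairs):
    """Map each ground-truth label to (correct, total, Counter of predictions)."""
    confusion = defaultdict(Counter)
    for gt, pred in pairs:
        confusion[gt][pred] += 1
    report = {}
    for gt, preds in confusion.items():
        total = sum(preds.values())
        # Counter returns 0 for labels that were never predicted.
        report[gt] = (preds[gt], total, preds)
    return report


# Made-up (gt, pred) pairs for illustration only.
example = [("fear", "sad"), ("fear", "neutral"), ("happy", "happy")]
for gt, (correct, total, preds) in sorted(per_class_report(example).items()):
    print(gt, f"{correct}/{total}", dict(preds))
```

Feeding it the `(gt, pred)` pairs collected inside `metric` would show which labels the `fear` samples are mapped to and whether the other classes are also frequently confused.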
What have you tried?
What's your environment?
- OS (e.g., Linux): ubuntu
- FunASR Version (e.g., 1.0.0): 1.2.6
- PyTorch Version (e.g., 2.0.0): 2.5.1
- How you installed funasr (pip, source): pip
- Python version: 3.10.12
- GPU (e.g., V100M32): A100-40G
- CUDA/cuDNN version (e.g., cuda11.7): 12.2
- Any other relevant information:
Model output file: pred.txt
I ran into the same problem: accuracy on CASIA is 34%, and performance on RAVDESS is also poor.
Roughly what accuracy do you get on RAVDESS?