FunASR
Running emotion recognition continuously runs out of GPU memory
Notice: In order to resolve issues more efficiently, please raise issues following the template and fill in the details.
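❓ I want to batch-recognize the emotion of sentences in videos. Each sentence is short, but GPU memory usage keeps growing when the recognition runs continuously.
GPU memory usage over successive calls: 2.2, 2.2, 2.6, 2.2, 2.3, 2.4, 2.7, 2.4, 2.5, 2.4, 2.6, 7.7, and then it runs out of GPU memory.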
Before asking:
- search the issues.
- search the docs.
I expect GPU memory usage to stay stable.
Code
`
import json
import os
import subprocess
import time
from pathlib import Path

from funasr import AutoModel
from loguru import logger  # assumed: the log format below matches loguru's default


def get_audio_slice(
    video_path: str | Path,
    start_time: float,
    end_time: float,
    sample_rate: int = 44100,  # can be lowered to speed things up
    stereo: bool = True,  # stereo
) -> bytes:
    duration = end_time - start_time
    cmd = [
        "ffmpeg",
        "-ss",
        str(start_time),  # precise seek (key optimization)
        "-i",
        str(video_path),
        "-t",
        str(duration),
        "-acodec",
        "pcm_s16le",  # 16-bit PCM (fastest lossless format)
        "-ar",
        str(sample_rate),  # can be lowered
        "-ac",
        "2" if stereo else "1",  # mono is faster
        "-f",
        "wav",
        "-y",  # overwrite output without prompting
        "pipe:1",  # write to stdout (in memory)
    ]
    # Run ffmpeg and capture its output.
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    return result.stdout


def emotion_video(
    video_dir="/home/c/Videos/video4",
    data_suffix=".data2.json",
):
    video_dir_path = Path(video_dir)
    if not video_dir_path.exists() or not video_dir_path.is_dir():
        logger.error(f"❌ 视频目录不存在或不是目录: {video_dir}")
        return
    data_files = list(video_dir_path.rglob(f"*{data_suffix}"))
    emotion_model = AutoModel(
        model=r"media\model\iic\emotion2vec_plus_large",
        device="cuda",
        disable_update=True,
        disable_pbar=True,
    )
    for data_file in data_files:
        with open(data_file, "r", encoding="utf-8") as f:
            data = json.load(f)
        logger.info(f"🎬 正在处理: {data_file}")
        for i in data["sentence"]:
            # Skip sentences that already have an emotion label.
            if i.get("emotion", None):
                continue
            logger.info(f"🎬 正在处理文本: {i['text']} {i['start']} {i['end']}")
            video_path = str(data_file).replace(data_suffix, ".mp4")
            if not os.path.exists(video_path):
                logger.error(f"❌ 视频文件不存在: {video_path}")
                continue
            audio_slice = get_audio_slice(
                video_path,
                i["start"],
                i["end"],
            )
            res = emotion_model.generate(
                audio_slice,
                granularity="utterance",
                extract_embedding=False,
                device="cuda",
            )
            # Index of the highest score.
            item = res[0]
            max_score_index = item["scores"].index(max(item["scores"]))
            # Label for that index.
            max_score_label = item["labels"][max_score_index]
            i["emotion"] = max_score_label
            time.sleep(2)
        with open(data_file, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
`
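As a rough sanity check on the "each sentence is short" assumption, the sketch below estimates how many PCM samples each call to `get_audio_slice` produces. It uses the `sample_rate=44100` and `stereo=True` defaults from the code above and a few `start`/`end` pairs copied from the log further down. `pcm_samples` is a hypothetical helper written for this illustration, and 16 kHz mono is only a comparison point (an assumption, not something verified against FunASR here).

```python
# Rough estimate of how much PCM audio each ffmpeg call hands to the model.
# The (start, end) pairs are copied from the log in this issue; the helper
# name and the 16 kHz mono comparison are illustrative assumptions.


def pcm_samples(start: float, end: float, sample_rate: int, channels: int) -> int:
    """Number of 16-bit samples (summed over channels) in a slice."""
    return round((end - start) * sample_rate) * channels


segments = [
    (13.04, 15.47),
    (121.26, 121.505),
    (197.38, 224.91),
    (239.64, 265.78),
    (270.87, 379.97),  # last segment logged before the traceback
]

for start, end in segments:
    current = pcm_samples(start, end, sample_rate=44100, channels=2)
    reduced = pcm_samples(start, end, sample_rate=16000, channels=1)
    print(
        f"{end - start:7.2f} s  "
        f"44.1 kHz stereo: {current:>10,}  16 kHz mono: {reduced:>10,}"
    )

# The segment with the longest duration.
longest = max(segments, key=lambda s: s[1] - s[0])
```

The last logged segment is about 109 s long, roughly 45 times the first one, so the input size is far from constant across calls. It may be worth checking whether the memory spike tracks segment length rather than the number of calls.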
What have you tried?
`(vam) PS C:\project\vam\back> uv run init.py
C:\project\vam\back\api\tv.py:445: SyntaxWarning: invalid escape sequence '\{'
"question": [{"name": "问题1", "range": [[1,5],[20, 30]]}],
C:\project\vam\back\api\tv.py:1037: SyntaxWarning: invalid escape sequence '\{'
"question": [{"name": "问题1", "range": [[1,5],[20, 30]]}],
C:\project\vam\back
jionlp - 微信公众号: JioNLP Github: https://github.com/dongrixinyu/JioNLP.
C:\project\vam\.venv\Lib\site-packages\jieba\_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
funasr version: 1.2.7.
WARNING:root:trust_remote_code: False
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.0.0.weight, media\model\iic\emotion2vec_plus_large\model.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.0.0.bias, media\model\iic\emotion2vec_plus_large\model.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.1.0.weight, media\model\iic\emotion2vec_plus_large\model.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.1.0.bias, media\model\iic\emotion2vec_plus_large\model.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.2.0.weight, media\model\iic\emotion2vec_plus_large\model.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.2.0.bias, media\model\iic\emotion2vec_plus_large\model.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.3.0.weight, media\model\iic\emotion2vec_plus_large\model.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.blocks.3.0.bias, media\model\iic\emotion2vec_plus_large\model.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.proj.weight, media\model\iic\emotion2vec_plus_large\model.pt
Warning, miss key in ckpt: modality_encoders.AUDIO.decoder.proj.bias, media\model\iic\emotion2vec_plus_large\model.pt
2025-10-15 22:00:51.270 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video10\尼山滑雪场.data2.json
2025-10-15 22:00:51.272 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video10\雪具大厅.data2.json
2025-10-15 22:00:51.276 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video4\2025雅思备考.data2.json
2025-10-15 22:00:51.289 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video4\《现在就出发》综艺片段.data2.json
2025-10-15 22:00:51.299 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video4\【掐丝珐琅画】百合摆件教程.data2.json
2025-10-15 22:00:51.303 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video4\京剧《武家坡》教学视频.data2.json
2025-10-15 22:00:51.313 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video4\初会_高效备考.data2.json
2025-10-15 22:00:51.321 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video4\南昌&景德镇&上饶旅游vlog.data2.json
2025-10-15 22:00:51.332 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video4\反刍动物饲料原料的选择与使用.data2.json
2025-10-15 22:00:51.338 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video4\可乐鸡翅教程.data2.json
2025-10-15 22:00:51.344 | INFO | api.tv:emotion_video:1303 - 🎬 正在处理: C:\project\video\video4\周城旅游日记.data2.json
2025-10-15 22:00:51.344 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 回头一步, 13.04 15.47
inference-----------------
device cuda:0 cuda
2025-10-15 22:00:55.015 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 清辰一扑, 15.47 18.73
inference-----------------
device cuda:0 cuda
2025-10-15 22:00:57.361 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 彩虹, 18.81 19.615
inference-----------------
device cuda:0 cuda
2025-10-15 22:00:59.683 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 错过相拥时间, 22.73 26.66
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:02.189 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 几处有微风去处, 28.01 31.61
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:04.569 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 藏着心动。 31.65 33.225
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:06.987 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 天水的梦躲着渔火, 36.01 42.14
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:09.614 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 阳光些渐行渐远的时光才高。 42.22 49.53
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:12.131 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 想躺在你 you first ball 路的谁的路上, 53.07 63.545
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:14.747 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 还是热爱伤开缘分还风雨浪相伴, 64.43 74.11
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:17.329 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 或许也能是于花开是丰盛的欢喜人物, 75.44 97.26
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:20.217 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 唤星繁星美好的光阴。 97.42 103.605
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:22.671 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 我着雨晃样方的街渐行渐远, 106.68 112.335
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:25.185 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 时光在 thank you。 113.23 121.08
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:27.785 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: first, 121.26 121.505
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:30.045 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 特别好看, 122.82 123.765
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:32.418 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 我的喜待留吧。 124.6 126.85
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:34.997 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 多少电视如爱们在翻抱紧的 shpray, 128.93 140.055
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:37.623 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 或是跟谁细数美好的一次初见期待着台下段情节。 141.25 154.705
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:40.252 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 想在你好的温习的谁可白路上掩是热爱。 158.23 170.625
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:42.844 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 嗯嗯嗯嗯嗯, 197.38 224.91
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:45.983 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 那可以天嗯嗯你啦, 239.64 265.78
inference-----------------
device cuda:0 cuda
2025-10-15 22:01:48.986 | INFO | api.tv:emotion_video:1307 - 🎬 正在处理文本: 我嗯这个的嗯有嗯不果嗯嗯知的。 270.87 379.97
inference-----------------
Traceback (most recent call last):
File "C:\project\vam\back\init.py", line 263, in
What's your environment?
- OS (e.g., Linux): Windows 11
- FunASR Version (e.g., 1.0.0): 1.2.7
- ModelScope Version (e.g., 1.11.0): 1.30.0
- PyTorch Version (e.g., 2.0.0): 2.8.0
- How you installed funasr (pip, source): uv
- Python version: 3.12.11
- GPU (e.g., V100M32): 4060
- CUDA/cuDNN version (e.g., cuda11.7): none
- Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
- Any other relevant information: