Haoxuan (Horace) Wang
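Also, was this translated with GPT-4/ChatGPT? If so, could you add an instruction to the prompt telling it not to translate code? I also noticed this is roughly half of the original data; will the remaining parts be translated as well?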
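> Hi, thank you very much for translating the dataset. I looked through it and found one problem: the code was "translated" as well. So I searched for the keywords below and spent an afternoon manually reverting the incorrect code translations. There may still be some I missed.
>
> * 代码 (code)
> * 函数 (function)
> * 程序 (program)
> * 脚本 (script)
> * Python
> * 蟒蛇 (python, the snake)
> * Go
> * C++
> *...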
Hi @ddlBoJack, thanks a lot for replying! One thing I do observe is that our test dataset contains many no-speech audio clips and very short audio segments. A...
Hi @PigeonDan1, 1. The prompt plus the MUSAN dataset with such a label seems helpful in controlling the output format for audio without speech, or audio with very little speech. We...
@fclearner Thanks for your question. It seems this can happen: whether multiple outputs are decoded depends on the LLM's instruction-following capability. But the thing is that we will...