Lixue Cheng (Sherry)
> prevent translating citation commands like `\citep[xxx][]{xxx}`

Thank you so much for this PR! We really appreciate your contributions and will merge it later!
Thanks a lot for your PR! @SUSYUSTC we will review these changes and merge your PR soon. Sorry again for following up so late. We will probably also change the...
Thank you for reporting these issues to us. Since this is a general-purpose translation tool rather than one that only works for CS or DL, we think it might be better...
@SUSYUSTC As for the known issues, we should list them under "Known issues" on our main page. After that we might close the series of issues here.
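> For now I don't really understand what nougat can actually do; preserving images probably won't work on our side

I think we should be able to do it! I read their paper, and their code is actually open source (Meta is so much better than my company in this respect!), so we really just need to write an interface. I also think there is a lot of room for ideas on the research side. https://www.arxiv.org/abs/2408.06292 I'm wondering whether we could even hook it up to AI Scientist, but so far I haven't seen their code.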
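> nougat's repetition is pretty severe. Does this project do any post-processing to remove the repeated parts of nougat's output source file and the parts with LaTeX formula errors?

Thanks for the suggestion! Sorry for only seeing this discussion now. I don't work on NLP myself, so I hadn't noticed this work from Meta before. The MT project's code currently has no AI flavor and is not an NLP project: it takes LaTeX and detects commands based on keywords, which is pure engineering. As for the post-processing you mention, a sophisticated version is probably hard to do (something like detecting content repeated 4 or more times in a row and keeping one copy would be feasible). nougat's repetition comes from the transformer itself. If someone wants a cleverer fix for the repetition, I think it would be more valuable to build the solution directly into the transformer block rather than on our side (I see the original paper used augmentation). nougat itself is a pretty good fit for us, and there may be some interesting research angles too, for example training on multilingual text might help reduce how often nougat repeats itself. But the two of us are just too busy with our regular research work :(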
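The simple heuristic mentioned above (keep one copy when the same content repeats 4 or more times in a row) could look roughly like the sketch below. It is not part of this project or of nougat: the function name `collapse_repeats`, the `min_repeats` parameter, and the choice to compare whole lines exactly are all assumptions made for illustration.

```python
from itertools import groupby


def collapse_repeats(lines, min_repeats=4):
    """Keep one copy of any line repeated at least `min_repeats` times in a row.

    Hypothetical helper: compares lines by exact equality, so near-duplicate
    repetitions (differing by a character or by whitespace) are left alone.
    """
    result = []
    # groupby yields one group per run of identical consecutive lines.
    for _, group in groupby(lines):
        run = list(group)
        if len(run) >= min_repeats:
            result.append(run[0])  # long run: keep a single copy
        else:
            result.extend(run)     # short run: keep it unchanged
    return result


sample = ["intro", "loop", "loop", "loop", "loop", "loop", "end", "end"]
cleaned = collapse_repeats(sample)
```

Runs shorter than the threshold are kept as they are, since short repeats can be legitimate text. Exact matching also means this would miss the fuzzier repetitions a model may produce, which is part of why a more careful fix probably belongs upstream.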