Yibo Zhao
Wonderful job! I have a stupid question. When calculating the CLIP score, is it right to calculate the CLIP score of every COCO 2017 val image-text pair and then average them, and...
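To make the averaging step in this question concrete, here is a minimal sketch, assuming the image and text embeddings have already been extracted with a CLIP model. The function names `cosine` and `mean_clip_score` are hypothetical, the toy vectors are not real CLIP outputs, and the scaling factor differs between implementations, so it is left as a parameter. This is not necessarily how the repository computes its reported number.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_clip_score(image_embs, text_embs, weight=1.0):
    """Score each (image, text) pair, then average over all pairs.

    Assumption: image_embs[i] and text_embs[i] belong to the same pair.
    Negative similarities are clamped to zero before scaling.
    `weight` is a placeholder; pick the value your reference implementation uses.
    """
    if not image_embs or len(image_embs) != len(text_embs):
        raise ValueError("need the same non-zero number of image and text embeddings")
    per_pair = [
        weight * max(cosine(img, txt), 0.0)
        for img, txt in zip(image_embs, text_embs)
    ]
    return sum(per_pair) / len(per_pair), per_pair


# Toy 2-D vectors standing in for real embeddings (hypothetical data).
toy_images = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
toy_texts = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
average, scores = mean_clip_score(toy_images, toy_texts)
```

Returning the per-pair scores alongside the mean makes it easy to check whether a few outlier pairs dominate the average.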
Can I refer only to the hat in this picture? In other words, can the reference specify a particular region of the image as the reference content?
### Checklist
- [x] I'm asking a question and **not** reporting a bug or requesting a feature
- [x] I've looked through the [README](https://github.com/yt-dlp/yt-dlp#readme)
- [x] I've verified that I...