HelloGitHub [Open-source self-recommendation] Chinese-CLIP: a Chinese image-text pretrained representation and retrieval model (try it out and star it 🔥🔥)

Open yangapku opened this issue 2 years ago • 0 comments

Recommended project

Project URL: https://github.com/OFA-Sys/Chinese-CLIP

Category: Python

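Project title: A Chinese pretrained version of OpenAI's CLIP model; Chinese image-text features and image-text retrieval in a few lines of code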

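Project description: Hi everyone, we are the OFA-Sys team at DAMO Academy. You are welcome to try our Chinese-CLIP image-text pretrained model on GitHub (https://github.com/OFA-Sys/Chinese-CLIP ), a Chinese version of OpenAI's CLIP model. We pretrained on large-scale web image-text data (about 200 million native Chinese image-text pairs) and provide pretrained models at several scales along with a technical report, so that beginners interested in machine learning can extract Chinese image-text features and run image-text retrieval in a few lines of code. In several recent image-text retrieval competitions (the "Xingzhi Cup" National Artificial Intelligence Innovation Application Competition and the Tianchi E-commerce Multimodal Image-Text Retrieval Challenge), models based on Chinese-CLIP took first place on the leaderboard. We hope you will try it out and star it!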

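Highlights: Our Chinese CLIP implementation achieves strong results on multiple public datasets and outperforms most comparable baseline image-text representation and retrieval models. The barrier to entry is very low: a few lines of code are enough for Chinese image-text feature extraction and image-text retrieval, which makes it handy for competitions and projects. The project is actively updated and maintained.
Example code: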

import torch 
from PIL import Image

import cn_clip.clip as clip
from cn_clip.clip import load_from_name, available_models
print("Available models:", available_models())  
# Available models: ['ViT-B-16', 'ViT-L-14', 'ViT-L-14-336', 'ViT-H-14', 'RN50']

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = load_from_name("ViT-B-16", device=device, download_root='./')
model.eval()
image = preprocess(Image.open("examples/pokemon.jpeg")).unsqueeze(0).to(device)
text = clip.tokenize(["杰尼龟", "妙蛙种子", "小火龙", "皮卡丘"]).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize the features; use the normalized image and text features for downstream tasks
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    logits_per_image, logits_per_text = model.get_similarity(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print("Label probs:", probs)  # image-text matching probabilities: [[1.268734e-03 5.436878e-02 6.795761e-04 9.436829e-01]]
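
To make the normalization and softmax steps in the example more concrete, here is a minimal standard-library sketch of the same idea: unit-normalize the features, take cosine similarities, scale them, and apply a softmax. It is not Chinese-CLIP code. The helper names (`l2_normalize`, `match_probabilities`, `retrieve_top_k`), the toy 3-dimensional vectors, and the `logit_scale=100.0` value are all assumptions for illustration; the real model produces much higher-dimensional features and applies its own scale inside `get_similarity`.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (the role of `features / features.norm()`)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarities(query_vec, candidate_vecs):
    """Cosine similarity between one query vector and each candidate vector."""
    q = l2_normalize(query_vec)
    return [sum(a * b for a, b in zip(q, l2_normalize(c))) for c in candidate_vecs]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def match_probabilities(image_vec, text_vecs, logit_scale=100.0):
    """Turn image-vs-text similarities into probabilities.

    logit_scale is a hypothetical placeholder, not a value read from the model.
    """
    sims = cosine_similarities(image_vec, text_vecs)
    return softmax([logit_scale * s for s in sims])


def retrieve_top_k(query_vec, gallery_vecs, k=2):
    """Indices of the k gallery vectors most similar to the query, best first."""
    sims = cosine_similarities(query_vec, gallery_vecs)
    return sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]


# Toy features, made up for illustration; they are not real model output.
image_feature = [0.9, 0.1, 0.2]
text_features = [
    [0.1, 0.9, 0.0],
    [0.0, 0.2, 0.9],
    [0.8, 0.2, 0.1],
]

probs = match_probabilities(image_feature, text_features)
best = max(range(len(probs)), key=lambda i: probs[i])
top2 = retrieve_top_k(image_feature, text_features, k=2)

print("Label probs:", probs)
print("Best match index:", best)
print("Top-2 indices:", top2)
```

The same ranking function works in either direction: pass a text feature as the query and a list of image features as the gallery for text-to-image retrieval. Because every vector is normalized first, the dot product is a cosine similarity, which is why the example above recommends using normalized features for downstream tasks.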

Screenshot:
Future plans: a text-to-image generation model based on Chinese CLIP and diffusion models is in preparation, as are domain-specific CLIP models.

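If the project you recommend is included in the HelloGitHub monthly, your GitHub account will be shown in the contributor list, and you will be notified in this issue.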

Finally, thank you for supporting the HelloGitHub project!

Nov 28 '22 13:11 yangapku
