OpenCompass

Results: 7 repositories owned by OpenCompass

opencompass

6.3k

Stars

689

Forks

6.3k

Watchers

OpenCompass is an LLM evaluation platform that supports a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) across 100+ datasets.

open-compass

benchmark

chatgpt

evaluation

large-language-model

LawBench

235

Stars

Forks

Watchers

Benchmarking Legal Knowledge of Large Language Models

open-compass

benchmark

chatgpt

law

llm

VLMEvalKit

1.1k

Stars

157

Forks

Watchers

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks

open-compass

benchmark

gemini

gpt-4v

large-language-models

MixtralKit

759

Stars

Forks

Watchers

A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI

open-compass

llm

mistral

moe

Ada-LEval

Stars

Forks

Watchers

The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"

open-compass

gpt4

llm

long-context

ANAH

Stars

Forks

Watchers

[ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO

open-compass

acl

gpt

hallucination-detection

llms

Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including...

open-compass

benchmark-framework

computer-use

gui-agent

vision-language-model