Memtensor Research Group

Results: 11 repositories owned by Memtensor Research Group

UHGEval

179 Stars, 179 Watchers

[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.

IAAR-Shanghai

aquila

baichuan

benchmark

chatglm

Grimoire

116 Stars

Grimoire is All You Need for Enhancing Large Language Models

IAAR-Shanghai

baichuan

chatgpt

datasets

gpt-4

DATG

[ACL 2024] Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs

IAAR-Shanghai

controllable-text-generation

controlled-text-generation

fudge

graph

CRUD_RAG

221 Stars

CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models

IAAR-Shanghai

benchmark

large-language-models

retrieval-augmented-generation

xFinder

181 Stars, 181 Watchers

[ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation

IAAR-Shanghai

evaluation

gpt

llm

xfinder

ICSFSurvey

171 Stars, 171 Watchers

Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.

IAAR-Shanghai

attention-head

chain-of-thought

data-augmentation

decoding

CTGSurvey

100 Stars

Controllable Text Generation for Large Language Models: A Survey

IAAR-Shanghai

controllable-text-generation

controlled-text-generation

ctg

decoding

NewsBench

[ACL 2024 Main] NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism

IAAR-Shanghai

acl2024

aquila2

baichaun2

benchmark

Awesome-Attention-Heads

387 Stars, 387 Watchers

An awesome repository and a comprehensive survey on the interpretability of LLM attention heads.

IAAR-Shanghai

attention-head-mining

attention-mechanism

awesome

chain-of-thought

xVerify

140 Stars, 140 Watchers

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

IAAR-Shanghai

benchmark

cc-by-nc-nd-4

chatgpt

deepseek-math