acl2024 topics

🧙🏻 Code and benchmark for our Findings of ACL 2024 paper - "TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models"

ahnjaewoo

acl2024

benchmark

dataset

dialogue

Cotempqa

Stars: 32, Forks: 1, Watchers: 32

Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)

zhaochen0110

acl2024

benchmark

large-language-models

nature-language-process

NewsBench

Stars: 33, Forks: 1, Watchers: 33

[ACL 2024 Main] NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism

IAAR-Shanghai

acl2024

aquila2

baichaun2

benchmark

KIEval

Stars: 38, Forks: 2, Watchers: 38

[ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

zhuohaoyu

acl2024

explainable-ai

llm

llm-evaluation

camera

Stars: 26, Forks: 2, Watchers: 26

Multimodal dataset for ad text generation in Japanese [Mita+, ACL2024]

CyberAgentAILab

acl2024

advertising

dataset

multimodal