EscapeBench: Towards advancing creative intelligence of language model agents

Cheng Qian

Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning

Abstract

Language model agents excel in long-session planning and reasoning, but existing benchmarks primarily focus on goal-oriented tasks with explicit objectives, neglecting creative adaptation in unfamiliar environments. To address this, we introduce EscapeBench, a benchmark suite of room escape game environments designed to challenge agents with creative reasoning, unconventional tool use, and iterative problem-solving to uncover implicit goals. Our results show that current LM models, despite emplo

Search Queries

("LLMs" OR large language models) AND creative BY TITLE - ("LLMs" OR large language models) AND creative

Paper ID: 9c01f1cc-47e3-4f53-92f9-2f14eb00b58a

Added: 10/26/2025