EscapeBench: Towards advancing creative intelligence of language model agents
Cheng Qian
Computer science - computation and language, computer science - artificial intelligence, computer science - machine learning
Abstract
Language model agents excel in long-session planning and reasoning, but existing benchmarks primarily focus on goal-oriented tasks with explicit objectives, neglecting creative adaptation in unfamiliar environments. To address this, we introduce EscapeBench, a benchmark suite of room escape game environments designed to challenge agents with creative reasoning, unconventional tool use, and iterative problem-solving to uncover implicit goals. Our results show that current LM models, despite emplo
Paper ID: 9c01f1cc-47e3-4f53-92f9-2f14eb00b58a
Added: 10/26/2025