EscapeBench: Towards advancing creative intelligence of language model agents

Cheng Qian

2024

Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning

Abstract

Language model agents excel in long-session planning and reasoning, but existing benchmarks primarily focus on goal-oriented tasks with explicit objectives, neglecting creative adaptation in unfamiliar environments. To address this, we introduce EscapeBench, a benchmark suite of room escape game environments designed to challenge agents with creative reasoning, unconventional tool use, and iterative problem-solving to uncover implicit goals. Our results show that current LM models, despite emplo

Tags

Research Focus › Benchmark
Research Focus › Controllable Generation

Paper ID: 9c01f1cc-47e3-4f53-92f9-2f14eb00b58a
Added: 10/26/2025