LLM Factoscope: Uncovering LLMs’ Factual Discernment through Measuring Inner States

Jinwen He

2024 FINDINGS

Abstract

Large Language Models (LLMs) have revolutionized various domains with extensive knowledge and creative capabilities. However, a critical issue with LLMs is their tendency to produce outputs that diverge from factual reality. This phenomenon is particularly concerning in sensitive applications such as medical consultation and legal advice, where accuracy is paramount. Inspired by lie detectors that read human physiological responses, we introduce the LLM Factoscope, a novel Siamese network-based model that leverages the inner states of LLMs for factual detection. Our investigation reveals distinguishable patterns in LLMs’ inner states when generating factual versus non-factual content. We demonstrate the LLM Factoscope's effectiveness across various LLM architectures, achieving over 96% accuracy on our custom-collected factual detection dataset. Our work opens a new avenue for utilizing LLMs’ inner states for factual detection and encourages further exploration into LLMs’ inner workings for enhanced reliability and transparency.
