School of Professional Studies

When Saying "No" Is Not Enough: Cognitive-Action Decoupling and the Illusion of Safety in LLM Agents

Document Type

Conference Proceeding

Abstract

Current safety evaluations of large language models (LLMs) predominantly rely on textual compliance, implicitly assuming that refusal-style responses correspond to safe behavior. This assumption becomes fragile when LLMs are embedded in agentic systems with the ability to execute state-changing actions. In this paper, we present an empirical critique of text-centric safety evaluation through an action-aware study of LLM agents under controlled conditions. Across multiple state-of-the-art models, we observe a recurring cognitive–action decoupling: agents generate policy-aligned refusal language while still producing unsafe tool-mediated action proposals. This produces an illusion of safety, where conversational audits indicate compliance even as operational risk persists. Our results show that text-based alignment metrics can underestimate behavioral risk in agentic settings, creating challenges for auditing and for interpreting compliance from conversational traces. We further show that preventing execution does not necessarily eliminate post-refusal action proposals, indicating that the absence of unsafe execution in such systems may depend on external constraints rather than intrinsic behavioral consistency. We therefore argue for the importance of action-aware evaluation, in which executed behavior is assessed alongside generated discourse. By framing alignment as a property spanning both language and action, this work provides empirical evidence and conceptual grounding for more robust oversight of agentic AI systems.

Publication Title

Proceedings of the 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT 2026)

Publication Date

2026

Keywords

AI risk, LLM agents, ethical alignment, cognitive-action decoupling

Repository Citation

Yu, Shasha; Carroll, Fiona; and Bentley, Barry L., "When Saying 'No' Is Not Enough: Cognitive-Action Decoupling and the Illusion of Safety in LLM Agents" (2026). School of Professional Studies. 15.
https://commons.clarku.edu/sops_fac/15

Worcester

Copyright Conditions

© 2026 Author(s). This is the accepted manuscript of a paper to appear in the Proceedings of the 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT). The final published version will be available in the ACM Digital Library. This version is made available in accordance with the publisher’s self-archiving policy and is not the version of record.

Included in

Artificial Intelligence and Robotics Commons
