This article, an interview with Anthropic's philosopher Amanda Askell, explores the intersection of philosophy and AI development. Askell, whose work focuses on shaping Claude's personality, explains why AI companies need philosophers to navigate complex ethical questions, such as how models should perceive themselves and interact with the world. She also discusses the tension between philosophical ideals and engineering realities, where nuanced, case-by-case judgment is often required.

Key topics include 'model welfare' (whether AI models are moral patients, and how humans should ethically treat them) and the challenges of AI identity: because models learn from human-generated data, they may interpret concepts like 'deprecation' through a human lens. Askell also sheds light on the 'LLM whisperer' role, emphasizing the experimental, reasoning-driven nature of prompt engineering. The discussion touches on the 'psychological security' of models, the potential for AI in therapy, and Anthropic's commitment to AI alignment and safety, up to and including pausing development if alignment proves impossible.

