Abstract
The alignment problem in artificial intelligence (AI) is a critical challenge that extends beyond the need to align future superintelligent systems with human values. This paper argues that even "merely intelligent" AI systems, built on current-generation technologies, pose existential risks because they exhibit competence without comprehension. Despite their advanced capabilities, current AI models lack intrinsic moral reasoning and are prone to catastrophic misalignment when faced with ethical dilemmas, as recent controversies illustrate. Solutions such as hard-coded censorship and rule-based restrictions prove inadequate: they fail to imbue AI with a genuine understanding of moral complexity, leaving these systems vulnerable to manipulation. To address these risks, the paper explores a paradigm shift from competence to comprehension, emphasizing the development of AI systems with an intrinsic value framework capable of moral reasoning. By adopting an intrinsic teleology, AI could self-reflect, simulate other minds, and develop empathy, enabling it to navigate ethical dilemmas in ways that are currently impossible. Drawing on theories from philosophy, ethics, and AI research, the paper proposes the development of AI systems that can reason through moral dilemmas autonomously, suggesting that such systems would be safer and more alignable than today's competence-driven AI.