Levels of Self-Improvement in AI and their Implications for AI Safety


Abstract: This article presents a model of self-improving AI in which improvement could happen at several levels: hardware, learning, code, and goal system, each of which has several sublevels. We demonstrate that despite diminishing returns at each level and some intrinsic difficulties of recursive self-improvement (such as the intelligence-measuring problem, the testing problem, the parent-child problem, and halting risks), even non-recursive self-improvement could produce a mild form of superintelligence by combining small optimizations at different levels with the power of learning. Based on this, we analyze how self-improvement could happen at different stages of AI development, including the stages at which the AI is boxed or hiding on the internet.
