Standard models of visual perception hold that vision is an inferential or interpretative process. Such models are said to be superior in explanatory power to competing, non-inferential views. In particular, they are said to be capable of explaining a number of otherwise mysterious visual phenomena, such as multi-stable perception. Multi-stable perception paradigmatically occurs in the presence of ambiguous figures: single images that can give rise to two or more distinct percepts. On the inferential view, different interpretations of the same image are said to produce the different percepts. In this paper, I argue that a non-inferential account of visual perception is just as capable of explaining multi-stable perception. I propose an embedded understanding of vision and show how the embedded account can use the explanatory resources of the inferential view, once they are properly qualified, to explain just what such a view explains.