Abstract
Predictive processing models of perception challenge standard models, which treat perception as hierarchical bottom-up processing modulated by memory and attention. The predictive framework posits that the brain generates predictions about stimuli, which are matched to the incoming signal. Mismatches between predictions and the incoming signal – so-called prediction errors – are then used to generate new and better predictions until the prediction errors have been minimized, at which point a percept arises. Predictive models hold that all bottom-up processes are signals conveying prediction errors to higher areas, which respond by updating their predictions. We take issue with this claim and argue that object recognition requires bottom-up processing that cannot be understood in terms of prediction errors. Along the way, we expand on previous work casting doubt on the framework's ability to account for attention. Specifically, we argue that the type of attention that allows us to rapidly extract the gist of an object or scene presents an additional challenge to the predictive approach. We conclude by considering how the framework may be augmented to avoid these problems.