Suppose that you are at a live jazz show. The drummer begins a solo. You see the
cymbal jolt and you hear the clang. But in addition to seeing the cymbal jolt and hearing the clang,
you are also aware that the jolt and the clang are part of the same event. Casey O’Callaghan
(forthcoming) calls this awareness “intermodal feature binding awareness.” Psychologists have
long assumed that multimodal perceptions such as this one are the result of a subpersonal feature
binding mechanism (see Vatakis and Spence, 2007, Kubovy and Schutz, 2010, Pourtois et al.,
2000, and Navarra et al., 2012). I present new evidence against this assumption. I argue that there is no
automatic feature binding mechanism that couples features like the jolt and the clang together.
Instead, when you experience the jolt and the clang as part of the same event, this is the result of
an associative learning process. The cymbal’s jolt and the clang are best understood as a single
learned perceptual unit, rather than as automatically bound. I outline the specific learning process
in perception called “unitization,” whereby we come to “chunk” the world into multimodal units.
Unitization has never before been applied to multimodal cases. Yet I argue that this learning
process can do the same work that intermodal binding would do, and that this issue has important
philosophical implications. Specifically, whether we take multimodal cases to involve a binding
mechanism or an associative process will have an impact on philosophical issues ranging from Molyneux’s
question to the question of how active or passive we consider perception to be.