Visual Perception in the Human Brain: How the Brain Perceives and Understands Real-World Scenes


How humans perceive and understand real-world scenes is a long-standing question in neuroscience, cognitive psychology, and artificial intelligence. Initially, it was thought that scenes are constructed and represented through their component objects. An alternative view proposed that scene perception begins with the extraction of global features (e.g., spatial layout), with individual objects processed only in later stages. A third framework focuses not only on how the brain represents objects and layout but also on how this information is combined to determine the possibilities for (inter)action that the environment offers us. The discovery of scene-selective regions in the human visual system sparked interest in how scenes are represented in the brain. Experiments using functional magnetic resonance imaging show that multiple types of information are encoded in scene-selective regions, while electroencephalography and magnetoencephalography measurements demonstrate links between the rapid extraction of different scene features and scene perception behavior. Computational models such as deep neural networks offer further insight: training networks on different scene-recognition tasks produces diagnostic features that can then be tested for their ability to predict activity in the human brain during scene perception. Collectively, these findings suggest that the brain flexibly and rapidly extracts a variety of information from scenes using a distributed network of brain regions.

Oxford Research Encyclopedia of Neuroscience 2023