Visual short term memory

Steven J. Luck (2007), Scholarpedia, 2(6):3328. doi:10.4249/scholarpedia.3328 revision #47721

Curator: Steven J. Luck

Visual short term memory (VSTM) is a memory system that stores visual information for a few seconds so that it can be used in the service of ongoing cognitive tasks. Compared with iconic memory representations, VSTM representations are longer lasting, more abstract, and more durable. VSTM representations can survive eye movements, eye blinks, and other visual interruptions, and they may play an important role in maintaining continuity across these interruptions. VSTM also differs markedly from long-term memory (LTM). Specifically, whereas LTM has a virtually infinite storage capacity and creates richly detailed representations over a relatively long time period, VSTM has a highly limited storage capacity and creates largely schematic representations very rapidly. VSTM is usually considered to be the visual storage component of the broader working memory system.


Measuring visual short term memory

Figure 1: Example of the one-shot change-detection task.
Figure 2: Animated example of a one-shot color change detection task with varying set sizes.
Figure 3: Typical results from a one-shot change detection task (from Vogel et al., 2001).

Four general classes of tasks have most often been used to study VSTM. In one class of tasks, subjects are asked to create a mental image. In the Brooks Matrix Task (Brooks, 1967), for example, subjects are told a set of numbers and their relative spatial locations within a matrix (e.g., “place a 4 in the upper left corner; then place a 3 below this position”). It is assumed that the mental image is stored in VSTM. These tasks are usually studied in the context of dual-task interference experiments, in which the goal is to determine whether the VSTM task can be performed concurrently with another task.

A second class of VSTM tasks uses a recall procedure. For example, the subject may be presented with a colored square for 500 ms and then, after a 1000-ms delay, be asked to point to the remembered color of this item on a color wheel (see, e.g., Wilken & Ma, 2004).
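As an illustration of how responses in such a recall task might be scored, the following sketch computes the signed angular error between the reported color and the true color. It assumes, hypothetically, that colors are coded as angles in degrees on a 360-degree wheel; the function names are illustrative and are not taken from Wilken and Ma (2004).

```python
def circular_error(reported_deg, target_deg):
    """Signed error in degrees, wrapped into the range [-180, 180)."""
    # Shift by 180, wrap with modulo, then shift back so that errors
    # crossing the 0/360 boundary are measured along the shorter arc.
    return (reported_deg - target_deg + 180.0) % 360.0 - 180.0


def mean_absolute_error(trials):
    """Average unsigned error over (reported_deg, target_deg) pairs."""
    errors = [abs(circular_error(reported, target)) for reported, target in trials]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # A report of 10 degrees for a target at 350 degrees is a small
    # clockwise error, not an error of 340 degrees.
    print(circular_error(10, 350))
    print(mean_absolute_error([(10, 350), (350, 10)]))
```

Wrapping matters because a color wheel has no endpoints: without it, reports near the 0/360 boundary would be scored as very large errors.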

A third class of VSTM tasks uses a sequential comparison procedure. For example, the subject may be presented with a colored square for 500 ms and then, after a 1000-ms delay, be shown another colored square and asked whether it is the same color as the remembered color. This procedure is akin to the partial report technique that typically is used to study iconic memory, but the long delay between the display phase and the recognition phase exceeds the limits of iconic memory, meaning the task depends on longer-lasting VSTM.

A common version of the sequential comparison procedure is the change-detection task. In the one-shot version of the change-detection task (first developed by Phillips, 1974), observers view a brief sample array, which consists of one or more objects that the observers try to remember (see Figure 1). After a brief retention interval, a test array is presented, and the observers compare the test array with the sample array to determine if there are any differences. The number of objects in the array (the set size) is often varied, and detection accuracy typically declines as the number of objects increases. An animated demonstration of this task is shown in Figure 2, and typical results are shown in Figure 3.
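Performance in a change-detection task is often summarized by a hit rate (responding "change" on change trials) and a false-alarm rate (responding "change" on no-change trials). The sketch below shows one way these rates might be tallied; the trial format, a list of (changed, responded_change) pairs, is an assumption made for illustration rather than a standard data format.

```python
def score_change_detection(trials):
    """Tally hit rate, false-alarm rate, and accuracy.

    trials: list of (changed, responded_change) boolean pairs (hypothetical format).
    """
    hits = misses = false_alarms = correct_rejections = 0
    for changed, responded_change in trials:
        if changed and responded_change:
            hits += 1
        elif changed:
            misses += 1
        elif responded_change:
            false_alarms += 1
        else:
            correct_rejections += 1

    n_change = hits + misses
    n_same = false_alarms + correct_rejections
    if n_change == 0 or n_same == 0:
        raise ValueError("need at least one change trial and one no-change trial")

    return {
        "hit_rate": hits / n_change,
        "false_alarm_rate": false_alarms / n_same,
        "accuracy": (hits + correct_rejections) / (n_change + n_same),
    }


if __name__ == "__main__":
    example = [(True, True), (True, True), (True, False), (True, True),
               (False, False), (False, False), (False, True), (False, False)]
    print(score_change_detection(example))
```

Scoring would normally be done separately for each set size, so that the decline in accuracy with increasing set size can be examined.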

A fourth class of VSTM tasks, used most often in monkeys, requires the observer to withhold a response after seeing a target. For example, while the observer is looking at a central fixation point, a small target will flash at some peripheral location; the observer must continue looking at the fixation point until it disappears, at which time the observer fixates the remembered location of the target (see, e.g., Funahashi, Bruce, & Goldman-Rakic, 1993).

The last three classes of VSTM tasks are highly similar insofar as they involve the brief presentation of a set of stimuli followed by a short delay period and then some kind of simple memory test. It is not clear whether the first class of VSTM tasks—which involves mental imagery—taps the same memory system as the last three classes of VSTM tasks.

Neural substrates of visual short term memory

Figure 4: Example of single-unit delay-period activity following two classes of stimuli. Each stimulus is presented for 100 ms, but the subject must remember the stimulus until the end of the trial. In this example, stimulus A elicits a much larger sensory response than stimulus B, and the activity is maintained long after the stimulus disappears.

Whereas long term memory representations are stored by means of long lasting changes in synaptic connections, VSTM representations are stored by means of sustained firing of action potentials. This can be observed directly in monkeys by recording the activity of individual neurons in VSTM tasks. When a monkey has been shown a to-be-remembered stimulus, neurons in specific areas will begin to fire and will continue to fire during the delay interval. In many cases, neurons in high-level areas of visual cortex that produce a large sensory response to the initial presentation of the stimulus are the same neurons that will exhibit sustained activity during the delay period. An example of this is shown in Figure 4. Neural activity during the delay period of a VSTM task can also be observed in neuroimaging studies (Cohen et al., 1997) and event-related potential studies (Vogel & Machizawa, 2004). Activity in the intraparietal sulcus is closely tied with VSTM performance (Todd & Marois, 2004; Xu & Chun, 2006).

It is thought that delay activity involves recurrent neural networks. That is, the neurons that respond to a stimulus are part of a circuit in which the activity in these sensory neurons ultimately flows back to them, allowing them to continue firing when the stimulus has been removed (Raffone & Wolters, 2001).
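The idea that recurrent feedback can sustain firing after the stimulus is removed can be illustrated with a toy simulation. The sketch below is not the model of Raffone and Wolters (2001); it is a single rate unit with a self-connection, and every parameter (the weights, threshold, update rate, and stimulus timing) is an arbitrary assumption chosen only to make the effect visible.

```python
import math


def simulate_unit(w_recurrent, n_steps=300, stim_on=20, stim_off=70,
                  stim_strength=5.0, threshold=5.0, rate=0.1):
    """Return the activity trace of one rate unit with self-excitation.

    All parameter values are illustrative assumptions, not fitted to data.
    """
    r = 0.0
    trace = []
    for t in range(n_steps):
        # External input is present only during the stimulus window.
        stimulus = stim_strength if stim_on <= t < stim_off else 0.0
        # Net drive: recurrent feedback plus stimulus, minus a threshold.
        drive = w_recurrent * r + stimulus - threshold
        target = 1.0 / (1.0 + math.exp(-drive))  # sigmoid activation
        # Leaky update: activity relaxes toward the current target.
        r = (1.0 - rate) * r + rate * target
        trace.append(r)
    return trace


if __name__ == "__main__":
    strong = simulate_unit(w_recurrent=10.0)  # strong feedback
    weak = simulate_unit(w_recurrent=2.0)     # weak feedback
    print("strong feedback, end of delay:", round(strong[-1], 3))
    print("weak feedback, end of delay:  ", round(weak[-1], 3))
```

With strong feedback the unit remains active long after the stimulus window ends, whereas with weak feedback the activity falls back to baseline, which is the qualitative distinction between sustained delay activity and a purely sensory response.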

Subdividing visual short term memory

VSTM can be readily distinguished from verbal short term memory. Brain damage can lead to a disruption of verbal short term memory without a disruption of VSTM and vice versa (De Renzi & Nichelli, 1975). In addition, it is possible to fill up verbal short term memory with one task without impacting VSTM for another task and vice versa (Scarborough, 1972; Vogel, Woodman, & Luck, 2001).

VSTM can also be subdivided into spatial and object subsystems, although there is some controversy about this issue. Support for separate spatial and object subsystems comes from several sources:

  • Dual-task studies have shown that spatial VSTM but not object VSTM is impaired by the performance of certain concurrent tasks, whereas object but not spatial VSTM is impaired by other concurrent tasks (Hyun & Luck, in press; Logie & Marchetti, 1991; Tresch, Sinnamon, & Seamon, 1993; Woodman & Luck, 2004; Woodman, Vogel, & Luck, 2001).
  • Brain damage may disrupt object memory without disrupting spatial memory, or vice versa (De Renzi & Nichelli, 1975; Farah, Hammond, Levine, & Calvanio, 1988; Hanley, Young, & Person, 1991).
  • Sustained delay-period activity is observed in the parietal lobe for spatial VSTM tasks but in the occipital and temporal lobes for object VSTM tasks (Cohen et al., 1997; Courtney, Ungerleider, Keil, & Haxby, 1996; Courtney, Ungerleider, Keil, & Haxby, 1997; Fuster & Jervey, 1981; Gnadt & Andersen, 1988; Miller, Li, & Desimone, 1993; Smith & Jonides, 1997).

However, there is also evidence that spatial and object information is integrated in VSTM:

  • Prefrontal cortex is active during both spatial and object VSTM tasks (Postle & D'Esposito, 1999; Rainer, Asaad, & Miller, 1998).
  • Disruptions in the spatial organization of objects can influence object VSTM (Jiang, Olson, & Chun, 2000).

A possible resolution to these conflicting findings is that spatial and object information are stored in separate posterior brain systems but are functionally linked by their mutual connections with prefrontal control systems.

Capacity limits in visual short term memory

Figure 5: Stimuli and results from the study of Luck and Vogel (1997). In one condition, observers were instructed to remember only the colors of the items because only color could change. In a second condition, observers were instructed to remember only the orientations of the items because only orientation could change. In a third condition, observers were instructed to remember both the colors and the orientations because either could change.

Early studies of VSTM using alphanumeric characters suggested a capacity limit of 4-5 items (e.g., Sperling, 1960), but it was not clear whether the items were being stored visually or verbally. Experiments using the change-detection task have estimated a capacity of 3-4 objects using basic visual features combined with an interference task to limit contributions from verbal short term memory (Luck & Vogel, 1997). As shown in Figure 3, for example, observers are highly accurate for arrays containing 1-3 simple objects, and performance declines systematically as the number of items increases. Quantitative estimates of capacity using the Pashler/Cowan K equation (Cowan et al., 2005; Pashler, 1988) typically lead to estimates of 3-4 items, which might reflect a broad limit on active memory maintenance (Cowan, 2001).
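The K equation itself is not written out above. As a minimal sketch, the forms below are the ones commonly attributed to Cowan (for tests that probe a single item) and to Pashler (for tests that re-present the whole display), where N is the set size, H the hit rate, and FA the false-alarm rate. The numerical values in the example are made up for illustration.

```python
def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Cowan's estimate, K = N * (H - FA), usually applied to single-probe tests."""
    return set_size * (hit_rate - false_alarm_rate)


def pashler_k(set_size, hit_rate, false_alarm_rate):
    """Pashler's estimate, K = N * (H - FA) / (1 - FA), for whole-display tests."""
    if false_alarm_rate >= 1.0:
        raise ValueError("false-alarm rate must be below 1")
    return set_size * (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)


if __name__ == "__main__":
    # Hypothetical data: 8 items, 60% hits, 20% false alarms.
    print(round(cowan_k(8, 0.6, 0.2), 2))
    print(round(pashler_k(8, 0.6, 0.2), 2))
```

Both formulas correct the hit rate for guessing; they differ in how false alarms are assumed to arise, so the choice between them depends on the test procedure.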

However, it is not yet clear whether VSTM is limited to a set of 3-4 high-resolution representations or "slots" or whether the limits are due to the amount of information rather than the number of objects. The first view proposes that VSTM consists of a small number of fixed-resolution slots, and Luck and Vogel (1997) proposed that the capacity of VSTM is limited by the number of objects rather than the number of features that must be remembered. That is, objects are the fundamental storage unit for VSTM. As shown in Figure 5, they demonstrated that observers could remember the colors and orientations of four objects just as well as they could remember only the colors or only the orientations. They further showed that objects defined by four features could be remembered as well as objects defined by a single feature. An alternative possibility is that each feature dimension is represented in a separate memory store (Magnussen, Greenlee, & Thomas, 1996). Subsequent research has shown that features can be stored more efficiently when they form an object than when they do not (Xu, 2002a, 2002b) and that observers can detect differences between arrays that contain the same features but in different combinations (Johnson, Hollingworth, & Luck, in press; Wheeler & Treisman, 2002). Yet, such research also shows that performance is worse when multiple features are drawn from the same feature dimension (e.g., objects are composed of two colors that could change independently rather than one color and one orientation).

The second view proposes that capacity limits on VSTM result not from the number of objects or a fixed number of slots, but from the amount of information in the display. This view, often called the “resource” hypothesis, proposes that a fixed pool of resources is divided among the available items, with the resolution of the representation reduced as the number of items is increased. In this view, all of the objects may be stored, but with decreased resolution as the amount of information increases (Alvarez & Cavanagh, 2004; Vogel et al., 2001; Wilken & Ma, 2004). Alvarez and Cavanagh (2004) found that estimated capacity decreased as the difficulty of discriminating the items increased. By extrapolating to a case in which the items were maximally discriminable with minimal effort, they also estimated a maximum capacity of approximately 4.5 items for the simplest items.

The difference between these views rests on whether the fundamental units of visual memory are discrete objects that are stored in fixed-resolution slots or whether the determining factor in the capacity of VSTM is the amount of information to be stored, independent of the number of objects.
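The contrast between the two views can be made concrete with a deliberately simplified sketch. It assumes, for the slot view, that a probed item is either in one of a fixed number of slots (and is judged correctly) or is absent (and is guessed), and, for the resource view, that memory precision is divided equally among items so that the noise standard deviation grows with the square root of set size. Both assumptions, along with the default parameter values, are illustrative and are not taken from the studies cited above.

```python
import math


def slot_model_accuracy(set_size, capacity=3, guess_rate=0.5):
    """Predicted proportion correct under a simple fixed-slot assumption."""
    # Probability that the probed item occupies one of the slots.
    p_stored = min(1.0, capacity / set_size)
    # Stored items are judged correctly; unstored items are guessed.
    return p_stored + (1.0 - p_stored) * guess_rate


def resource_model_sd(set_size, sd_one_item=10.0):
    """Predicted memory noise (SD) when precision is split equally among items."""
    # Precision (1 / variance) per item is assumed proportional to 1 / set_size.
    return sd_one_item * math.sqrt(set_size)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, round(slot_model_accuracy(n), 3), round(resource_model_sd(n), 2))
```

In this sketch the slot model predicts perfect accuracy up to the capacity limit followed by a decline, whereas the resource model predicts a gradual loss of resolution beginning with the second item.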

Creation, maintenance, and use of visual short-term memory representations

Perceptual representations are fragile and are easily overwritten by new stimuli (the phenomenon of visual masking). VSTM representations, in contrast, must survive incoming stimuli. The process of transforming transient perceptual representations into durable VSTM representations is called consolidation (by analogy to the memory consolidation process used to stabilize long-term memory representations) or vulcanization (by analogy to the vulcanization process used to make rubber durable). Initial research on this process indicates that it involves a limited-capacity central process (Jolicoeur & Dell'Acqua, 1998; Vogel, Woodman, & Luck, 2006) and that it requires 20-50 ms to consolidate each item (Gegenfurtner & Sperling, 1993; Shibuya & Bundesen, 1988; Vogel, Woodman, & Luck, 2006). However, it is not yet known whether the consolidation process occurs simultaneously for all items in memory (i.e., in parallel) or sequentially for each item (i.e., serially).
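To see what the serial and parallel alternatives would imply for timing, the following sketch works through the arithmetic for the 20-50 ms per-item range. It assumes, purely for illustration, that a serial process takes the per-item time multiplied by the number of items and that a parallel process takes the per-item time regardless of the number of items; a capacity-limited parallel process could well be slower than this.

```python
def serial_consolidation_ms(n_items, ms_per_item):
    """Total time if items are consolidated one after another."""
    return n_items * ms_per_item


def parallel_consolidation_ms(n_items, ms_per_item):
    """Total time if all items are consolidated at once with no cost of load.

    The absence of any load cost is an idealizing assumption.
    """
    return ms_per_item if n_items > 0 else 0


if __name__ == "__main__":
    for ms_per_item in (20, 50):
        print(ms_per_item,
              serial_consolidation_ms(4, ms_per_item),
              parallel_consolidation_ms(4, ms_per_item))
```

Under these assumptions, four items would require 80-200 ms if consolidated serially but only 20-50 ms if consolidated in parallel, which is why the time course of consolidation as a function of set size bears on the question.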

The consolidation process appears to play an important role in the attentional blink phenomenon. Specifically, the attentional blink appears to occur when the second of two targets has been perceived but is not consolidated in VSTM.

VSTM representations may decay, terminate, or drift over time. For example, spatial VSTM representations may be attracted toward or repelled away from spatial reference points (Simmering, Spencer, & Schöner, in press; Spencer & Hund, 2002).

The process of maintaining representations in VSTM is not very effortful, and it is possible to perform highly attention-demanding tasks during the delay period of a VSTM task with little or no interference as long as these tasks do not require the use of VSTM. For example, people can perform a difficult visual search while they are concurrently maintaining several colors or shapes in VSTM (Woodman et al., 2001). However, a visual search task interferes with a spatial VSTM task (Woodman & Luck, 2004), presumably because visual search requires spatial memory to avoid revisiting already-searched locations (Peterson, Kramer, Wang, Irwin, & McCarley, 2001).

If a VSTM representation survives the delay period, it can be used in further cognitive processing. This often involves comparing the VSTM representations with new sensory inputs, as in the change-detection paradigm. Although this comparison process has not yet received much study in the context of VSTM, visual comparison processes were extensively studied in the context of visual perception from the 1960s through the early 1980s (Farell, 1985). The comparison of two simultaneous perceptual inputs appears to be largely identical to the comparison of a perceptual input with a VSTM representation (Hyun, 2006; Scott-Brown, Baker, & Orbach, 2000), so the results of this older literature are probably relevant for VSTM. These older studies indicated that comparison involves two parallel processes, one that can rapidly determine that two patterns are the same and one that more slowly finds differences. As a result, responses are typically faster when the patterns being compared are the same than when they are different.

The function of visual short-term memory representations

VSTM is thought to be the visual component of the working memory system, and as such it is used as a buffer for temporary information storage during the process of naturally occurring tasks. But what naturally occurring tasks actually require VSTM? Most work on this issue has focused on the role of VSTM in bridging the sensory gaps caused by saccadic eye movements. These sudden shifts of gaze typically occur 2-4 times per second, and vision is briefly suppressed while the eyes are moving. Thus, the visual input consists of a series of spatially shifted snapshots of the overall scene, separated by brief gaps. Over time, a rich and detailed long-term memory representation is constructed from these brief glimpses of the input (Hollingworth, 2004), and VSTM is thought to bridge the gaps between these glimpses (Irwin, 1991) and to allow the relevant portions of one glimpse to be aligned with the relevant portions of the next glimpse (Currie, McConkie, Carlson-Radvansky, & Irwin, 2000; Henderson & Hollingworth, 1999). Both spatial and object VSTM systems may play important roles in the integration of information across eye movements.

Spatial VSTM might also play an important role in keeping track of locations that have already been attended when subjects search for targets in complex scenes. Inhibition-of-return experiments have shown that after attention has visited a location, it tends not to revisit the same location again immediately afterward (Klein, 2000; Peterson et al., 2001; Posner & Cohen, 1984). It appears that the visual system can exhibit inhibition at several previously attended locations over a period of a few seconds (Snyder & Kingstone, 2001), and the inhibition is reduced when spatial VSTM is occupied by a concurrent task (Castel, Pratt, & Craik, 2003).


References

  • Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual short-term memory is set both by information load and by number of objects. Psychological Science, 15, 106-111.
  • Brooks, L. R. (1967). The suppression of visualization by reading. Quarterly Journal of Experimental Psychology, 19, 289-299.
  • Castel, A. D., Pratt, J., & Craik, F. I. (2003). The role of spatial working memory in inhibition of return: evidence from divided attention tasks. Perception & Psychophysics, 65(6), 970-981.
  • Cohen, J. D., Perlstein, W. M., Braver, T. S., Nystrom, L. E., Noll, D. C., Jonides, J., et al. (1997). Temporal dynamics of brain activation during a working memory task. Nature, 386(6625), 604-608.
  • Courtney, S. M., Ungerleider, L. G., Keil, K., & Haxby, J. V. (1996). Object and spatial visual working memory activate separate neural systems in human cortex. Cerebral Cortex, 6, 39-49.
  • Courtney, S. M., Ungerleider, L. G., Keil, K., & Haxby, J. V. (1997). Transient and sustained activity in a distributed neural system for human working memory. Nature, 386, 608-611.
  • Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87-185.
  • Cowan, N., Elliott, E. M., Saults, J. S., Morey, C. C., Mattox, S., Ismajatulina, A., et al. (2005). On the capacity of attention: Its estimation and its role in working memory and cognitive aptitudes. Cognitive Psychology, 51, 42-100.
  • Currie, C., McConkie, G., Carlson-Radvansky, L. A., & Irwin, D. E. (2000). The role of the saccade target object in the perception of a visually stable world. Perception & Psychophysics, 62, 673-683.
  • De Renzi, E., & Nichelli, P. (1975). Verbal and nonverbal short-term memory impairment following hemispheric damage. Cortex, 11, 341-354.
  • Farah, M. J., Hammond, K. M., Levine, D. N., & Calvanio, R. (1988). Visual and spatial mental imagery: Dissociable systems of representation. Cognitive Psychology, 20(4), 439-462.
  • Farell, B. (1985). "Same"–"different" judgments: A review of current controversies in perceptual comparisons. Psychological Bulletin, 98, 419-456.
  • Funahashi, S., Bruce, C. J., & Goldman-Rakic, P. S. (1993). Dorsolateral prefrontal lesions and oculomotor delayed-response performance: Evidence for mnemonic "scotomas". Journal of Neuroscience, 13, 1479-1497.
  • Fuster, J. M., & Jervey, J. P. (1981). Inferotemporal neurons distinguish and retain behaviorally relevant features of visual stimuli. Science, 212, 952-954.
  • Gegenfurtner, K. R., & Sperling, G. (1993). Information transfer in iconic memory experiments. Journal of Experimental Psychology: Human Perception & Performance, 19(4), 845-866.
  • Gnadt, J. W., & Andersen, R. A. (1988). Memory related motor planning activity in posterior parietal cortex of macaque. Experimental Brain Research, 70, 216-220.
  • Hanley, J. F., Young, A. W., & Person, N. A. (1991). Impairment of the visuo-spatial sketch pad. Quarterly Journal of Experimental Psychology, 43A, 101-125.
  • Henderson, J. M., & Hollingworth, A. (1999). The role of fixation position in detecting scene changes across saccades. Psychological Science, 10, 438-443.
  • Hollingworth, A. (2004). Constructing visual representations of natural scenes: The roles of short- and long-term visual memory. Journal of Experimental Psychology: Human Perception & Performance, 30, 519-537.
  • Hyun, J.-S. (2006). How are visual working memory representations compared with perceptual inputs? University of Iowa, Iowa City, Iowa.
  • Hyun, J.-S., & Luck, S. J. (in press). Visual working memory as the substrate for mental rotation. Psychonomic Bulletin & Review.
  • Irwin, D. E. (1991). Information integration across saccadic eye movements. Cognitive Psychology, 23(3), 420-456.
  • Jiang, Y., Olson, I. R., & Chun, M. M. (2000). Organization of visual short-term memory. Journal of Experimental Psychology: Learning, Memory & Cognition, 26, 683-702.
  • Johnson, J. S., Hollingworth, A., & Luck, S. J. (in press). The role of attention in the maintenance of feature bindings in visual short-term memory. Journal of Experimental Psychology: Human Perception and Performance.
  • Jolicoeur, P., & Dell'Acqua, R. (1998). The demonstration of short-term consolidation. Cognitive Psychology, 36(2), 138-202.
  • Klein, R. (2000). Inhibition of return. Trends in Cognitive Sciences, 4, 138-147.
  • Logie, R. H., & Marchetti, C. (1991). Visuo-spatial working memory: Visual, spatial or central executive? Advances in Psychology, 80, 105-115.
  • Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279-281.
  • Magnussen, S., Greenlee, M. W., & Thomas, J. P. (1996). Parallel processing in visual short-term memory. Journal of Experimental Psychology: Human Perception and Performance, 22, 202-212.
  • Miller, E. K., Li, L., & Desimone, R. (1993). Activity of neurons in anterior inferior temporal cortex during a short-term memory task. Journal of Neuroscience, 13, 1460-1478.
  • Pashler, H. (1988). Familiarity and visual change detection. Perception & Psychophysics, 44, 369-378.
  • Peterson, M. S., Kramer, A. F., Wang, R. F., Irwin, D. E., & McCarley, J. S. (2001). Visual search has memory. Psychological Science, 12, 287-292.
  • Phillips, W. A. (1974). On the distinction between sensory storage and short-term visual memory. Perception & Psychophysics, 16, 283-290.
  • Posner, M. I., & Cohen, Y. (1984). Components of visual orienting. In H. Bouma & D. G. Bouwhuis (Eds.), Attention and Performance X (pp. 531-556). Hillsdale, New Jersey: Erlbaum.
  • Postle, B. R., & D'Esposito, M. (1999). "What" then "where" in visual working memory: An event-related fMRI study. Journal of Cognitive Neuroscience, 11, 585-597.
  • Raffone, A., & Wolters, G. (2001). A cortical mechanism for binding in visual working memory. Journal of Cognitive Neuroscience, 13, 766-785.
  • Rainer, G., Asaad, W. F., & Miller, E. K. (1998). Selective representation of relevant information by neurons in the primate prefrontal cortex. Nature, 393, 577-579.
  • Scarborough, D. L. (1972). Memory for brief visual displays of symbols. Cognitive Psychology, 3, 408-429.
  • Scott-Brown, K. C., Baker, M. R., & Orbach, H. S. (2000). Comparison blindness. Visual Cognition, 7(1-3), 253-267.
  • Shibuya, H., & Bundesen, C. (1988). Visual selection from multielement displays: Measure and modeling effects of exposure duration. Journal of Experimental Psychology: Human Perception and Performance, 14, 591-600.
  • Simmering, V. R., Spencer, J. P., & Schöner, G. (in press). Reference-related inhibition produces enhanced position discrimination and fast repulsion near axes of symmetry. Perception & Psychophysics.
  • Smith, E. E., & Jonides, J. (1997). Working memory: A view from neuroimaging. Cognitive Psychology, 33, 5-42.
  • Snyder, J. J., & Kingstone, A. (2001). Multiple location inhibition of return: When you see it and when you don't. Quarterly Journal of Experimental Psychology, 54A, 1221-1237.
  • Spencer, J. P., & Hund, A. M. (2002). Prototypes and particulars: Spatial categories are formed using geometric and experience-dependent information. Journal of Experimental Psychology: General, 131, 16-37.
  • Sperling, G. (1960). The information available in brief visual presentations. Psychological Monographs, 74, (Whole No. 498).
  • Todd, J. J., & Marois, R. (2004). Capacity limit of visual short-term memory in human posterior parietal cortex. Nature, 428, 751-754.
  • Tresch, M. C., Sinnamon, H. M., & Seamon, J. G. (1993). Double dissociation of spatial and object visual memory: Evidence from selective interference in intact human subjects. Neuropsychologia, 31(3), 211-219.
  • Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748-751.
  • Vogel, E. K., Woodman, G. F., & Luck, S. J. (2001). Storage of features, conjunctions, and objects in visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 27, 92-114.
  • Vogel, E. K., Woodman, G. F., & Luck, S. J. (2006). The time course of consolidation in visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 32, 1436-1451.
  • Vogel, E. K., Woodman, G. F., & Luck, S. J. (in press). The time course of consolidation in visual working memory. Journal of Experimental Psychology: Human Perception and Performance.
  • Wheeler, M., & Treisman, A. M. (2002). Binding in short-term visual memory. Journal of Experimental Psychology: General, 131, 48-64.
  • Wilken, P., & Ma, W. J. (2004). A detection theory account of change detection. Journal of Vision, 4, 1120-1135.
  • Woodman, G. F., & Luck, S. J. (2004). Visual search is slowed when visuospatial working memory is occupied. Psychonomic Bulletin & Review, 11, 269-274.
  • Woodman, G. F., Vogel, E. K., & Luck, S. J. (2001). Visual search remains efficient when visual working memory is full. Psychological Science, 12, 219-224.
  • Xu, Y. (2002a). Encoding color and shape from different parts of an object in visual short-term memory. Perception & Psychophysics, 64, 1260-1280.
  • Xu, Y. (2002b). Limitations of object-based feature encoding in visual short-term memory. Journal of Experimental Psychology: Human Perception & Performance, 28, 458-468.
  • Xu, Y., & Chun, M. M. (2006). Dissociable neural mechanisms supporting visual short-term memory for objects. Nature, 440, 91-95.

Internal references

  • Valentino Braitenberg (2007) Brain. Scholarpedia, 2(11):2918.
  • Keith Rayner and Monica Castelhano (2007) Eye movements. Scholarpedia, 2(10):3649.
  • William D. Penny and Karl J. Friston (2007) Functional imaging. Scholarpedia, 2(5):1478.
  • Peter Jonas and Gyorgy Buzsaki (2007) Neural inhibition. Scholarpedia, 2(9):3286.
  • Bruno G. Breitmeyer and Haluk Ogmen (2007) Visual masking. Scholarpedia, 2(7):3330.

External links - "Visual short term memory"

Author's Web Site

See also

Memory, Vision, Visual search, Working memory
