hebartlab

Filter publications

Use CMD/CTRL+Click to select multiple filters

Author:

Year:

Preprints

Decoding peripheral saccade targets from foveal retinotopic cortex

Kämmer, Luca; Kroell, Lisa M; Knapen, Tomas; Rolfs, Martin; Hebart, Martin N

2025 , bioRxiv

URL PDF ABSTRACT

Human vision is characterized by frequent eye movements and constant shifts in visual input, yet our perception of the world remains remarkably stable. Here we directly demonstrate the existence of an image-specific foveal prediction signal that reaches all the way back to primary visual cortex. To this end, we used a gaze-contingent fMRI paradigm, in which peripheral saccade targets disappeared before they could be fixated. Despite no direct foveal stimulation, we were able to decode peripheral saccade targets from foveal retinotopic areas, demonstrating that image-specific feedback during saccade preparation must underlie this effect. Decoding was sensitive to shape but not semantic category of natural images, indicating feedback of only low-to-mid-level information. Cross-decoding to a control condition with foveal stimulus presentation indicated a shared representational format between foveal feedback and direct stimulation. Moreover, eccentricity-dependent analyses showed a u-shaped decoding curve, confirming that these results are not explained by spillover of peripheral activity or large receptive fields. Finally, fluctuations in foveal decodability covaried with activity in the intraparietal sulcus, thus providing a candidate region for driving foveal feedback. These findings reveal that foveal cortex predicts the features of incoming stimuli through feedback from higher cortical areas, which offers a candidate mechanism underlying stable perception and may facilitate object recognition across saccades.

Dynamic representation of multidimensional object properties in the human brain

Teichmann, Lina; Hebart, Martin N; Baker, Chris I

2025 , bioRxiv , pages: 2023--09

URL PDF PREPRINT ABSTRACT

Our visual world consists of an immense number of unique objects and yet, we are easily able to identify, distinguish, interact, and reason about the things we see within a few hundred milliseconds. This requires that we integrate and focus on a wide array of object properties to support diverse behavioral goals. In the current study, we used a large-scale and comprehensively sampled stimulus set and developed an analysis approach to determine if we could capture how rich, multidimensional object representations unfold over time in the human brain. We modelled time-resolved MEG signals evoked by viewing single presentations of tens of thousands of object images based on millions of behavioral judgments. Extracting behavior-derived object dimensions from similarity judgments, we developed a data-driven approach to guide our understanding of the neural representation of the object space and found that every dimension is reflected in the neural signal. Studying the temporal profiles for different object dimensions we found that the time courses fell into two broad types, with either a distinct and early peak (~125 ms) or a slow rise to a late peak (~300 ms). Further, early effects were stable across participants, in contrast to later effects which showed more variability, suggesting that early peaks may carry stimulus-specific and later peaks more participant-specific information. Dimensions with early peaks appeared to be primarily visual dimensions and those with later peaks more conceptual, suggesting that conceptual representations are more variable across people. Together, these data provide a comprehensive account of how behavior-derived object properties unfold in the human brain and form the basis for the rich nature of object vision.

Dynamic presentation in 3D modulates face similarity judgments--A human-aligned encoding model approach

Hofmann, Simon M; Ciston, Anthony; Koushik, Abhay; Klotzsche, Felix; Hebart, Martin N; Müller, Klaus-Robert; Villringer, Arno; Scherf, Nico; Hilsmann, Anna; Nikulin, Vadim V; others

2025 , bioRxiv , pages: 2023--09

PDF PREPRINT ABSTRACT

Face perception dynamically unfolds in three-dimensional space, yet, experimental paradigms predominantly rely on static 2D images, limiting insights into real-world face processing. We conducted a pre-registered study comparing face similarity judgments in static 2D and dynamic 3D conditions using a triplet odd-one-out task in 2,605 participants (yielding data from 323,400 unique trials). Behavioral similarity matrices revealed a strong cross-condition correlation (R2D~3D = 0.93, p < 0.001), suggesting perceptual invariance, that is, consistency across modalities. However, human-aligned sparse (VICE) and deep (VGG-Face) encoding models trained to map face stimuli to behavioral judgments uncovered condition-specific weighting of facial geometry: while chin-cheek distance, eye size, and nose shape dominated similarity judgements in both conditions, face-width-height ratio and upper face length gained more perceptual relevance in 3D. Importantly, the richer information in 3D stimuli significantly reduced choice variance, indicating lower perceptual demand than in 2D during similarity judgements. Employing a representational alignment framework, our approach reveals both shared cognitive processing and representational differences between static 2D and dynamic 3D faces, motivating more naturalistic experimental paradigms which reflect real-world perception. Our open large-scale dataset and encoding models enable further advances in face perception research across biological and computational systems.

Saccade onset, not fixation onset, best explains early responses across the human visual cortex during naturalistic vision

Amme, Carmen; Sulewski, Philip; Spaak, Eelke; Hebart, Martin N; König, Peter; Kietzmann, Tim C

2024 , bioRxiv , pages: 2024--10

URL PDF ABSTRACT

Visual processing has traditionally been investigated using static viewing paradigms, where participants are presented with streams of randomized stimuli. Observations from such experiments have been generalized to naturalistic vision, which is based on active sampling via eye movements. In studies of naturalistic vision, visual processing stages are thought to be initiated at the onset of fixations, equivalent to a stimulus onset. Here we test whether findings from static visual paradigms translate to active, naturalistic vision. Utilizing head-stabilized magnetoencephalography (MEG) and eye tracking data of 5 participants who freely explored thousands of natural images, we show that saccade onset, not fixation onset, explains most variance in latency and amplitude of the early sensory component M100. Source-projected MEG topographies of image and saccade onset were anticorrelated, demonstrating neural dynamics that share similar topographies but produce oppositely oriented fields. Our findings challenge the prevailing approach for studying natural vision and highlight the role of internally generated signals in the dynamics of sensory processing.

Human-aligned deep and sparse encoding models of dynamic 3D face similarity perception

Simon Hofmann; Anthony Ciston; Abhay Koushik; Felix Klotzsche; Martin N Hebart; Klaus-Robert Müller; Arno Villringer; Nico Scherf; Anna Hilsmann; Vadim V Nikulin; Michael Gaebler

2024 , PsyArXiv

URL PDF PREPRINT ABSTRACT

Face perception happens dynamically over time and primarily in three-dimensional space. Perceived similarity, including identity, should ideally remain invariant to changes along these dimensions. Surprisingly, much of our knowledge about face representations stems from static presentations of 2D images, which might not sufficiently capture real-world dynamic face processing. To test the effect of space and time on face similarity judgements, we conducted a pre-registered (https://osf. io/678uh) experiment using a triplet odd-one-out task in a static 2D and a dynamic 3D condition. We then trained sparse and deep computational encoding models of human face similarity judgements to investigate the latent representations underlying their predictions. Aggregated over all faces, we found a strong correlation between viewing conditions, indicating a consistent processing of face similarity between 2D and 3D. Despite these similarities, our encoding models revealed subtle differences between viewing conditions, where a small set of face features, such as distance between chin and cheeks, eye size, nose shape, and particularly in the 3D condition, face-width-height ratio, explained much of the variance in human judgements. Our openly available data and encoding models lay the groundwork for understanding face similarity judgements, which are crucial for our ability to recognise and identify faces in a dynamically changing environment.

Getting aligned on representational alignment

Sucholutsky, Ilia; Muttenthaler, Lukas; Weller, Adrian; Peng, Andi; Bobu, Andreea; Kim, Been; Love, Bradley C; Grant, Erin; Groen, Iris; Achterberg, Jascha and others

2023 , arXiv

URL PDF ABSTRACT

Biological and artificial information processing systems form representations of the world that they can use to categorize, reason, plan, navigate, and make decisions. How can we measure the similarity between the representations formed by these diverse systems? Do similarities in representations then translate into similar behavior? If so, then how can a system's representations be modified to better match those of another system? These questions pertaining to the study of representational alignment are at the heart of some of the most promising research areas in contemporary cognitive science, neuroscience, and machine learning. In this Perspective, we survey the exciting recent developments in representational alignment research in the fields of cognitive science, neuroscience, and machine learning. Despite their overlapping interests, there is limited knowledge transfer between these fields, so work in one field ends up duplicated in another, and useful innovations are not shared effectively. To improve communication, we propose a unifying framework that can serve as a common language for research on representational alignment, and map several streams of existing work across fields within our framework. We also lay out open problems in representational alignment where progress can benefit all three of these fields. We hope that this paper will catalyze cross-disciplinary collaboration and accelerate progress for all communities studying and developing information processing systems.

Parallel cognitive maps for short-term statistical and long-term semantic relationships in the hippocampal formation

Zheng, X. Y.; Hebart, M. N.; Dolan, R. J.; Doeller, C. F.; Cools, R.; Garvert, M. M.

2022 , bioRxiv

URL ABSTRACT

The hippocampal-entorhinal system uses cognitive maps to represent spatial knowledge and other types of relational information, such as the transition probabilities between objects. However, objects can often be characterized in terms of different types of relations simultaneously, e.g. semantic similarities learned over the course of a lifetime as well as transitions experienced over a brief timeframe in an experimental setting. Here we ask how the hippocampal formation handles the embedding of stimuli in multiple relational structures that differ vastly in terms of their mode and timescale of acquisition: Does it integrate the different stimulus dimensions into one conjunctive map, or is each dimension represented in a parallel map? To this end, we reanalyzed functional magnetic resonance imaging (fMRI) data from Garvert et al. (2017) that had previously revealed an entorhinal map which coded for newly learnt statistical regularities. We used a triplet odd-one-out task to construct a semantic distance matrix for presented items and applied fMRI adaptation analysis to show that the degree of similarity of representations in bilateral hippocampus decreases as a function of semantic distance between presented objects. Importantly, while both maps localize to the hippocampal formation, this semantic map is anatomically distinct from the originally described entorhinal map. This finding supports the idea that the hippocampal-entorhinal system forms parallel cognitive maps reflecting the embedding of objects in diverse relational structures.

A data-driven investigation of human action representations

Dima, D. C.; Hebart, M. N.; Isik, L.

2022 , bioRxiv

URL ABSTRACT

Understanding actions performed by others requires us to integrate different types of information about people, scenes, objects, and their interactions. What organizing dimensions does the mind use to make sense of this complex action space? To address this question, we collected intuitive similarity judgments across two large-scale sets of naturalistic videos depicting everyday actions. We used cross-validated sparse non-negative matrix factorization (NMF) to identify the structure underlying action similarity judgments. A low-dimensional representation, consisting of nine to ten dimensions, was sufficient to accurately reconstruct human similarity judgments. The dimensions were robust to stimulus set perturbations and reproducible in a separate odd-one-out experiment. Human labels mapped these dimensions onto semantic axes relating to food, work, and home life; social axes relating to people and emotions; and one visual axis related to scene setting. While highly interpretable, these dimensions did not share a clear one-to-one correspondence with prior hypotheses of action-relevant dimensions. Together, our results reveal a low-dimensional set of robust and interpretable dimensions that organize intuitive action similarity judgments and highlight the importance of data-driven investigations of behavioral representations.

2025

Core dimensions of human material perception

Schmidt, F.*; Hebart, M.N.*; Schmid, A.; Fleming, R.

2025 , Proceedings of the National Academy of Sciences

URL PDF PREPRINT CODE ABSTRACT

Visually categorizing and comparing materials is crucial for our everyday behaviour. Given the dramatic variability in their visual appearance and functional significance, what organizational principles underly the internal representation of materials? To address this question, here we use a large-scale data-driven approach to uncover the core latent dimensions in our mental representation of materials. In a first step, we assembled a new image dataset (STUFF dataset) consisting of 600 photographs of 200 systematically sampled material classes. Next, we used these images to crowdsource 1.87 million triplet similarity judgments. Based on the responses, we then modelled the assumed cognitive process underlying these choices by quantifying each image as a sparse, non-negative vector in a multidimensional embedding space. The resulting embedding predicted material similarity judgments in an independent test set close to the human noise ceiling and accurately reconstructed the similarity matrix of all 600 images in the STUFF dataset. We found that representations of individual material images were captured by a combination of 36 material dimensions that were highly reproducible and interpretable, comprising perceptual (e.g., “grainy”, “blue”) as well as conceptual (e.g., “mineral”, “viscous”) dimensions. These results have broad implications for understanding material perception, its natural dimensions, and our ability to organize materials into classes.

Ten principles for reliable, efficient, and adaptable coding in psychology and cognitive neuroscience

Roth, Johannes; Duan, Yunyan; Mahner, Florian P; Kaniuth, Philipp; Wallis, Thomas SA; Hebart, Martin N

2025 , Communications Psychology , Volume: 3 , pages: 62

URL PDF ABSTRACT

Writing code is becoming essential for psychology and neuroscience research, supporting increasingly advanced experimental designs, processing of ever-larger datasets and easy reproduction of scientific results. Despite its critical role, coding remains challenging for many researchers, as it is typically not part of formal academic training. We present a range of practices tailored to different levels of programming experience, from beginners to advanced users. Our ten principles help researchers streamline and automate their projects, reduce human error, and improve the quality and reusability of their code. For principal investigators, we highlight the benefits of fostering a collaborative environment that values code sharing. Maintaining basic standards for code quality, reusability, and shareability is critical for increasing the trustworthiness and reliability of research in experimental psychology and cognitive neuroscience.

The scope and limits of fine-grained image and category information in the ventral visual pathway

Badwal, Markus W; Bergmann, Johanna; Roth, Johannes HR; Doeller, Christian F; Hebart, Martin N

2025 , Journal of Neuroscience , Volume: 45

URL PDF ABSTRACT

Humans can easily abstract incoming visual information into discrete semantic categories. Previous research employing functional MRI (fMRI) in humans has identified cortical organizing principles that allow not only for coarse-scale distinctions such as animate versus inanimate objects but also more fine-grained distinctions at the level of individual objects. This suggests that fMRI carries rather fine-grained information about individual objects. However, most previous work investigating fine-grained category representations either additionally included coarse-scale category comparisons of objects, which confounds fine-grained and coarse-scale distinctions, or only used a single exemplar of each object, which confounds visual and semantic information. To address these challenges, here we used multisession human fMRI (female and male) paired with a broad yet homogenous stimulus class of 48 terrestrial mammals, with two exemplars per mammal. Multivariate decoding and representational similarity analysis revealed high image-specific reliability in low- and high-level visual regions, indicating stable representational patterns at the image level. In contrast, analyses across exemplars of the same animal yielded only small effects in the lateral occipital complex (LOC), indicating rather subtle category effects in this region. Variance partitioning with a deep neural network and shape model showed that across-exemplar effects in the early visual cortex were largely explained by low-level visual appearance, while representations in LOC appeared to also contain higher category-specific information. These results suggest that representations typically measured with fMRI are dominated by image-specific visual or coarse-grained category information but indicate that commonly employed fMRI protocols may reveal subtle yet reliable distinctions between individual objects.

How modular are modules in visual cortex?

Hebart, Martin N

2025 , Brain , Volume: 148 , pages: 1050--1051

URL PDF ABSTRACT

This scientific commentary refers to ‘Visual feature processing in a large stroke cohort: evidence against modular organization’ by Lugtmeijer, Sobolewska et al.(https://doi.org/10.1093/brain/awaf009).

Study 3: Identifying and characterizing scene rep-resentations relevant for categorization behavior

Singer, Johannes JD; Karapetian, Agnessa; Hebart, Martin N; Cichy, Radoslaw M; JJDS, AK

2025 , Imaging Neuroscience , Volume: 3 , pages: 71

URL PDF ABSTRACT

Scene recognition is a core sensory capacity that enables humans to adaptively interact with their environment. Despite substantial progress in the understanding of the neural representations underlying scene recognition, the relevance of these representations for behavior given varying task demands remains unknown. To address this, we aimed to identify behaviorally relevant scene representations, to characterize them in terms of their underlying visual features, and to reveal how they vary across different tasks. We recorded fMRI data while human participants viewed scenes and linked brain responses to behavior in three tasks acquired in separate sessions: man-made/natural cate- gorization, basic-level categorization, and fixation color discrimination. We found correlations between categorization response times and scene-specific brain responses, quantified as the distance to a hyperplane derived from a multi- variate classifier. Across tasks, these effects were found in largely distinct parts of the ventral visual stream. This suggests that different scene representations are relevant for behavior depending on the task. Next, using deep neu- ral networks as a proxy for visual feature representations, we found that intermediate layers mediated the relationship between scene representations and behavior for both categorization tasks, indicating a contribution of mid-level visual features to these representations. Finally, we observed opposite patterns of brain-behavior correlations in the man-made/natural and the fixation task, indicating interference of representations with behavior for task demands that do not align with the content of representations. Together, these results reveal the spatial extent, content, and task-dependence of the visual representations that mediate behavior in complex scenes.

Drawings of THINGS: A large-scale drawing dataset of 1,854 object concepts

Mukherjee, Kushin; Huey, Holly; Stoinski, Laura Mai; Hebart, Martin N; Fan, Judith; Bainbridge, Wilma

2025

PDF ABSTRACT

The development of large datasets of natural images has galvanized progress in psychology, neuroscience, and computer science. Notably, the THINGS database constitutes a collective effort towards understanding of human visual knowledge by accumulating rich data on a shared set of visual object concepts across several studies. In this paper, we introduce Drawing of THINGS (DoT), a novel dataset of 28,627 human drawings of 1,854 diverse object concepts, sampled systematically from concrete picturable and nameable nouns in the American English language, mirroring the structure of the THINGS image database. In addition to data on drawings’ stroke history, we further collected fine-grained recognition data for each drawing, along with metadata on participant demographics, drawing ability, and mental imagery. We characterize people’s ability to communicate and recognize semantic information encoded in drawings and compare this ability to their ability to recognize real-world images of the same visual objects. We also explore the relationship between drawing understanding and the memorability and typicality of the objects contained in THINGS. In sum, we envision DoT as a powerful tool that builds on the THINGS database to advance understanding of how humans express knowledge about visual concepts.

Feedback of peripheral saccade targets to early foveal cortex

Kämmer, Luca; Kroell, Lisa M; Knapen, Tomas; Rolfs, Martin; Hebart, Martin N

2025

URL ABSTRACT

Human vision is characterized by frequent eye movements1. This causes continuous shifts in visual input, yet visual perception appears highly stable2–4. A potential mechanism behind this stability is foveal prediction, involving feedback from higher cortical areas during saccade preparation.5,6 However, it remains unknown (1) whether information is fed back to early visual areas, (2) whether feedback is specific to stimulus features, and (3) which brain regions mediate this effect. To dissociate neural processes associated with stimulus presentation from those related to foveal feedback, we designed a gaze-contingent fMRI paradigm, where saccade targets are removed before they can be foveated. To determine the content of the neural representation, we used natural images as saccade targets and independently manipulated object shape and category. Multivariate analyzes revealed reliable decoding of stimuli from foveal retinotopic areas as early as V1, even though the stimulus never appeared in the fovea. Decoding was sensitive to shape but not semantic category, indicating that only low-to-mid-level information is fed back. Cross-decoding to a control condition with foveal stimulus presentation yielded reliable decoding, indicating a similar neural representation between foveal feedback and direct stimulation. Eccentricity-dependent analyzes showed a u-shaped decoding curve, confirming that these results are not explained by spillover of peripheral activity or large receptive fields. Moreover, fluctuations in foveal decodability correlated with activity in the intraparietal sulcus, a candidate region for driving this foveal feedback. These findings go beyond trans-saccadic remapping7,8 by suggesting that peripheral saccade targets are encoded in the foveal cortex in a feature-specific representation5.

Identifying and characterizing scene representations relevant for categorization behavior

Singer, Johannes JD; Karapetian, Agnessa; Hebart, Martin N; Cichy, Radoslaw M

2025 , Imaging Neuroscience , Volume: 3 , pages: imag\_a\_00449

URL PDF ABSTRACT

Scene recognition is a core sensory capacity that enables humans to adaptively interact with their environment. Despite substantial progress in the understanding of the neural representations underlying scene recognition, the relevance of these representations for behavior given varying task demands remains unknown. To address this, we aimed to identify behaviorally relevant scene representations, to characterize them in terms of their underlying visual features, and to reveal how they vary across different tasks. We recorded fMRI data while human participants viewed scenes and linked brain responses to behavior in three tasks acquired in separate sessions: man-made/natural categorization, basic-level categorization, and fixation color discrimination. We found correlations between categorization response times and scene-specific brain responses, quantified as the distance to a hyperplane derived from a multivariate classifier. Across tasks, these effects were found in largely distinct parts of the ventral visual stream. This suggests that different scene representations are relevant for behavior depending on the task. Next, using deep neural networks as a proxy for visual feature representations, we found that intermediate layers mediated the relationship between scene representations and behavior for both categorization tasks, indicating a contribution of mid-level visual features to these representations. Finally, we observed opposite patterns of brain-behavior correlations in the man-made/natural and the fixation task, indicating interference of representations with behavior for task demands that do not align with the content of representations. Together, these results reveal the spatial extent, content, and task-dependence of the visual representations that mediate behavior in complex scenes.

Dimensions underlying the representational alignment of deep neural networks with humans

Mahner, Florian P; Muttenthaler, Lukas; Güçlü, Umut; Hebart, Martin N

2025 , Nature Machine Intelligence , pages: 1--12

URL PDF PREPRINT ABSTRACT

Determining the similarities and differences between humans and artificial intelligence (AI) is an important goal in both computational cognitive neuroscience and machine learning, promising a deeper understanding of human cognition and safer, more reliable AI systems. Much previous work comparing representations in humans and AI has relied on global, scalar measures to quantify their alignment. However, without explicit hypotheses, these measures only inform us about the degree of alignment, not the factors that determine it. To address this challenge, we propose a generic framework to compare human and AI representations, based on identifying latent representational dimensions underlying the same behaviour in both domains. Applying this framework to humans and a deep neural network (DNN) model of natural images revealed a low-dimensional DNN embedding of both visual and semantic dimensions. In contrast to humans, DNNs exhibited a clear dominance of visual over semantic properties, indicating divergent strategies for representing images. Although in silico experiments showed seemingly consistent interpretability of DNN dimensions, a direct comparison between human and DNN representations revealed substantial differences in how they process images. By making representations directly comparable, our results reveal important challenges for representational alignment and offer a means for improving their comparability.

A high-throughput approach for the efficient prediction of perceived similarity of natural objects

Kaniuth, Philipp; Mahner, Florian P; Perkuhn, Jonas; Hebart, Martin N

2025 , eLife Sciences Publications, Ltd

URL PDF ABSTRACT

Perceived similarity offers a window into the mental representations underlying our ability to make sense of our visual world, yet, the collection of similarity judgments quickly becomes infeasible for larger datasets, limiting their generality. To address this challenge, here we introduce a computational approach that predicts perceived similarity from neural network activations through a set of 49 interpretable dimensions learned on 1.46 million triplet odd-one-out judgments. The approach allowed us to predict separate, independently-sampled similarity scores with an accuracy of up to 0.898. Combining this approach with human ratings of the same dimensions led only to small improvements, indicating that the neural network used similar information as humans in this task. Predicting the similarity of highly homogeneous image classes revealed that performance critically depends on the granularity of the training data. Our approach allowed us to improve the brain-behavior correspondence in a large-scale neuroimaging dataset and visualize candidate image features humans use for making similarity judgments, thus highlighting which image parts may carry behaviorally-relevant information. Together, our results demonstrate that current neural networks carry information sufficient for capturing broadly-sampled similarity scores, offering a pathway towards the automated collection of similarity scores for natural images.

2024

THINGSplus: New norms and metadata for the THINGS database of 1,854 object concepts and 26,107 natural object images

Stoinski, L.A.; Perkuhn, J.; Hebart, M.N.

2024 , Behavior Research Methods , Volume: 56 , pages: 1583-1603

URL PDF ABSTRACT

To study visual object processing, the need for well-curated object concepts and images has grown significantly over the past years. To address this we have previously developed THINGS (Hebart et al., 2019), a large-scale database of 1,854 systematically sampled object concepts with 26,107 high-quality naturalistic images of these concepts. With THINGS+ we aim to extend THINGS by adding concept-specific and image-specific norms and metadata. Concept-specific norms were collected for all 1,854 object concepts for the object properties real-world size, manmadeness, preciousness, liveliness, heaviness, naturalness, ability to move, graspability, holdability, ability to be moved, pleasantness, and arousal. Further, we extended high-level categorization to 53 superordinate categories and collected typicality ratings for members of all 53 categories. Image-specific metadata includes measures of nameability and recognizability for objects in all 26,107 images. To this end, we asked participants to provide labels for prominent objects depicted in each of the 26,107 images and measured the alignment with the original object concept. Finally, to present example images in publications without copyright restrictions, we identified one new public domain image per object concept. In this study we demonstrate a high consistency of property (r = 0.92-0.99, M = 0.98, SD = 0.34) and typicality ratings (r = 0.88-0.98; M = 0.96, SD = 0.19), with arousal ratings as the only exception (r = 0.69). Correlations of our data with external norms were moderate to high for object properties (r = 0.44-0.95; M = 0.85, SD = 0.32) and typicality scores (r = 0.72-0.88; M = 0.79, SD = 0.18), again with the lowest validity for arousal (r = 0.30 - 0.52). To summarize, THINGS+ provides a broad, externally-validated extension to existing object norms and an important extension to THINGS as a general resource of object concepts, images, and category memberships. Our norms, metadata, and images provide a detailed selection of stimuli and control variables for a wide range of research interested in object processing and semantic memory.

Distributed representations of behaviour-derived object dimensions in the human visual system

Contier, O.; Baker, C. I.; Hebart, M. N.

2024 , Nature Human Behaviour , Volume: 8(11) , pages: 2179-2193

URL PDF PREPRINT CODE ABSTRACT

Object vision is commonly thought to involve a hierarchy of brain regions processing increasingly complex image features, with high-level visual cortex supporting object recognition and categorization. However, object vision supports diverse behavioural goals, suggesting basic limitations of this category-centric framework. To address these limitations, we mapped a series of dimensions derived from a large-scale analysis of human similarity judgements directly onto the brain. Our results reveal broadly distributed representations of behaviourally relevant information, demonstrating selectivity to a wide variety of novel dimensions while capturing known selectivities for visual features and categories. Behaviour-derived dimensions were superior to categories at predicting brain responses, yielding mixed selectivity in much of visual cortex and sparse selectivity in category-selective clusters. This framework reconciles seemingly disparate findings regarding regional specialization, explaining category selectivity as a special case of sparse response profiles among representational dimensions, suggesting a more expansive view on visual processing in the human brain.

What comparing deep neural networks can teach us about human vision

Seeliger, Katja; Hebart, Martin N

2024 , Nature Machine Intelligence , Volume: 6 , pages: 122--123

URL PDF ABSTRACT

Recent work has demonstrated important parallels between human visual representations and those found in deep neural networks. A new study comparing functional MRI data to deep neural network models highlights factors that may determine this similarity.

Unrealized promise of joint modeling of choice and reaction time in improving representation learning

Richie, Russell; Ajmal, Nehal; Hebart, Martin N

2024 , Proceedings of the Annual Meeting of the Cognitive Science Society

URL PDF ABSTRACT

As mental representations are standardly thought to underlie all cognitive processes, a major goal of cognitive science has been to uncover representations. Methods for representation learning from behavioral data often model choice or reaction time data alone, but not jointly, leaving out potentially useful information. Here we develop two models of choice and RT in the odd-one-out task, including one based on the Linear Ballistic Accumulator. Parameter recovery simulations show joint modeling of choice and RT with LBA recovers representations more accurately than modeling choice alone with softmax. However, on two empirical datasets of images and words, joint models performed no better than choice-only models, despite a significant correlation of reaction time with two measures of similarity and choice difficulty in both datasets. We speculate on reasons for the unrealized promise of joint modeling of RT and choice in representation learning.

2023

The spatiotemporal neural dynamics of object recognition for natural images and line drawings

Singer, J. J.; D.; Cichy, R. M.; Hebart, M. N.

2023 , Journal of Neuroscience , Volume: 43 , pages: 484-500

URL ABSTRACT

Drawings offer a simple and efficient way to communicate meaning. While line drawings capture only coarsely how objects look in reality, we still perceive them as resembling real-world objects. Previous work has shown that this perceived similarity is mirrored by shared neural representations for drawings and natural images, which suggests that similar mechanisms underlie the recognition of both. However, other work has proposed that representations of drawings and natural images become similar only after substantial processing has taken place, suggesting distinct mechanisms. To arbitrate between those alternatives, we measured brain responses resolved in space and time using fMRI and MEG, respectively, while human participants (female and male) viewed images of objects depicted as photographs, line drawings, or sketch-like drawings. Using multivariate decoding, we demonstrate that object category information emerged similarly fast and across overlapping regions in occipital, ventral-temporal, and posterior parietal cortex for all types of depiction, yet with smaller effects at higher levels of visual abstraction. In addition, cross-decoding between depiction types revealed strong generalization of object category information from early processing stages on. Finally, by combining fMRI and MEG data using representational similarity analysis, we found that visual information traversed similar processing stages for all types of depiction, yet with an overall stronger representation for photographs. Together, our results demonstrate broad commonalities in the neural dynamics of object recognition across types of depiction, thus providing clear evidence for shared neural mechanisms underlying recognition of natural object images and abstract drawings.

Emergent dimensions underlying human understanding of the reachable world

Josephs, E.; Hebart, M. N.; Konkle, T.

2023 , Cognition , Volume: 234 , pages: 105368

URL ABSTRACT

Near-scale environments, like work desks, restaurant place settings or lab benches, are the interface of our hand-based interactions with the world. How are our conceptual representations of these environments organized? What properties distinguish among reachspaces, and why? We obtained 1.25 million similarity judgments on 990 reachspace images, and generated a 30-dimensional embedding which accurately predicts these judgments. Examination of the embedding dimensions revealed key properties underlying these judgments, such as reachspace layout, affordance, and visual appearance. Clustering performed over the embedding revealed four distinct interpretable classes of reachspaces, distinguishing among spaces related to food, electronics, analog activities, and storage or display. Finally, we found that reachspace similarity ratings were better predicted by the function of the spaces than their locations, suggesting that reachspaces are largely conceptualized in terms of the actions they support. Altogether, these results reveal the behaviorally-relevant principles that structure our internal representations of reach-relevant environments.

The features underlying the memorability of objects

Kramer, M.A.; Hebart, M.N.; Baker, C.I.; Bainbridge, W.A.

2023 , Science Advances

URL PREPRINT ABSTRACT

What makes certain images more memorable than others? While much of memory research has focused on participant effects, recent studies employing a stimulus-centric perspective have sparked debate on the determinants of memory, including the roles of semantic and visual features and whether the most prototypical or atypical items are best remembered. Prior studies have typically relied on constrained stimulus sets, limiting a generalized view of the features underlying what we remember. Here, we collected 1+ million memory ratings for a naturalistic dataset of 26,107 object images designed to comprehensively sample concrete objects. We establish a model of object features that is predictive of image memorability and examined whether memorability could be accounted for by the typicality of the objects. We find that semantic features exert a stronger influence than perceptual features on what we remember and that the relationship between memorability and typicality is more complex than a simple positive or negative association alone.

THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in brain and behavior

Hebart, M.N.*; Contier, O.*; Teichmann, L.*; Rockter, A.; Zheng, C.Y.; Kidder, A.; Corriveau, A.; Vaziri-Pashkam, M.; Baker, C.I.

2023 , eLife

URL ABSTRACT

Understanding object representations requires a broad, comprehensive sampling of the objects in our visual world with dense measurements of brain activity and behavior. Here, we present THINGS-data, a multimodal collection of large-scale neuroimaging and behavioral datasets in humans, comprising densely sampled functional MRI and magnetoencephalographic recordings, as well as 4.70 million similarity judgments in response to thousands of photographic images for up to 1,854 object concepts. THINGS-data is unique in its breadth of richly annotated objects, allowing for testing countless hypotheses at scale while assessing the reproducibility of previous findings. Beyond the unique insights promised by each individual dataset, the multimodality of THINGS-data allows combining datasets for a much broader view into object processing than previously possible. Our analyses demonstrate the high quality of the datasets and provide five examples of hypothesis-driven and data-driven applications. THINGS-data constitutes the core public release of the THINGS initiative (https://things-initiative.org) for bridging the gap between disciplines and the advancement of cognitive neuroscience.

The Three Terms Task-an open benchmark to compare human and artificial semantic representations

V Borghesani; J Armoza; Martin N Hebart; P Bellec; SM Brambati

2023 , Scientific Data

URL PDF ABSTRACT

Word processing entails retrieval of a unitary yet multidimensional semantic representation (e.g., a lemon’s colour, flavour, possible use) and has been investigated in both cognitive neuroscience and artificial intelligence. To enable the direct comparison of human and artificial semantic representations, and to support the use of natural language processing (NLP) for computational modelling of human understanding, a critical challenge is the development of benchmarks of appropriate size and complexity. Here we present a dataset probing semantic knowledge with a three-terms semantic associative task: which of two target words is more closely associated with a given anchor (e.g., is lemon closer to squeezer or sour?). The dataset includes both abstract and concrete nouns for a total of 10,107 triplets. For the 2,255 triplets with varying levels of agreement among NLP word embeddings, we additionally collected …

2022

THINGS-EEG: Human electroencephalography recordings for 1,854 concepts presented in rapid serial visual presentation streams

Grootswagers, T.; Zhou, I.; Robinson, A. K.; Hebart, M. N.; Carlson, T. A.

2022 , Scientific Data 9 , Volume: 9 , pages: 3

URL ABSTRACT

The neural basis of object recognition and semantic knowledge has been extensively studied but the high dimensionality of object space makes it challenging to develop overarching theories on how the brain organises object knowledge. To help understand how the brain allows us to recognise, categorise, and represent objects and object categories, there is a growing interest in using large-scale image databases for neuroimaging experiments. In the current paper, we present THINGS-EEG, a dataset containing human electroencephalography responses from 50 subjects to 1,854 object concepts and 22,248 images in the THINGS stimulus set, a manually curated and high-quality image database that was specifically designed for studying human vision. The THINGS-EEG dataset provides neuroimaging recordings to a systematic collection of objects and concepts and can therefore support a wide array of research to understand visual object processing in the human brain.

From photos to sketches - how humans and deep neural networks process objects across different levels of visual abstraction

Singer, J.; Seeliger, K.; Kietzmann, T. C.; Hebart, M. N.

2022 , journal of vision , Volume: 22 , pages: 1-19

URL ABSTRACT

Line drawings convey meaning with just a few strokes. Despite strong simplifications, humans can recognize objects depicted in such abstracted images without effort. To what degree do deep convolutional neural networks (CNNs) mirror this human ability to generalize to abstracted object images? While CNNs trained on natural images have been shown to exhibit poor classification performance on drawings, other work has demonstrated highly similar latent representations in the networks for abstracted and natural images. Here, we address these seemingly conflicting findings by analyzing the activation patterns of a CNN trained on natural images across a set of photographs, drawings, and sketches of the same objects and comparing them to human behavior. We find a highly similar representational structure across levels of visual abstraction in early and intermediate layers of the network. This similarity, however, does not translate to later stages in the network, resulting in low classification performance for drawings and sketches. We identified that texture bias in CNNs contributes to the dissimilar representational structure in late layers and the poor performance on drawings. Finally, by fine-tuning late network layers with object drawings, we show that performance can be largely restored, demonstrating the general utility of features learned on natural images in early and intermediate layers for the recognition of drawings. In conclusion, generalization to abstracted images, such as drawings, seems to be an emergent property of CNNs trained on natural images, which is, however, suppressed by domain-related biases that arise during later processing stages in the network.

Feature-reweighted representational similarity analysis: A method for improving the fit between computational models, brains, and behavior

Kaniuth, P.; Hebart, M.N.

2022 , Neuroimage , Volume: 257 , pages: 119294

URL ABSTRACT

Representational Similarity Analysis (RSA) has emerged as a popular method for relating representational spaces from human brain activity, behavioral data, and computational models. RSA is based on the comparison of representational (dis-)similarity matrices (RDMs or RSMs), which characterize the pairwise (dis-)similarities of all conditions across all features (e.g. fMRI voxels or units of a model). However, classical RSA treats each feature as equally important. This ‘equal weights’ assumption contrasts with the flexibility of multivariate decoding, which reweights individual features for predicting a target variable. As a consequence, classical RSA may lead researchers to underestimate the correspondence between a model and a brain region and, in case of model comparison, may lead them to select an inferior model. The aim of this work is twofold: First, we sought to broadly test feature-reweighted RSA (FR-RSA) applied to computational models and reveal the extent to which reweighting model features improves RSM correspondence and affects model selection. Previous work suggested that reweighting can improve model selection in RSA but it has remained unclear to what extent these results generalize across datasets and data modalities. To draw more general conclusions, we utilized a range of publicly available datasets and three popular deep neural networks (DNNs). Second, we propose voxel-reweighted RSA, a novel use case of FR-RSA that reweights fMRI voxels, mirroring the rationale of multivariate decoding of optimally combining voxel activity patterns. We found that reweighting individual model units markedly improved the fit between model RSMs and target RSMs derived from several fMRI and behavioral datasets and affected model selection, highlighting the importance of considering FR-RSA. For voxel-reweighted RSA, improvements in RSM correspondence were even more pronounced, demonstrating the utility of this novel approach. We additionally show that classical noise ceilings can be exceeded when FR-RSA is applied and propose an updated approach for their computation. Taken together, our results broadly validate the use of FR-RSA for improving the fit between computational models, brain, and behavioral data, allowing us to better adjudicate between competing computational models. Further, our results suggest that FR-RSA applied to brain measurement channels could become an important new method to assess the correspondence between representational spaces.

Semantic features of object concepts generated with GPT-3

Hansen, H.; Hebart, M.N.

2022 , Proceedings of the Annual Meeting of the Cognitive Science Society , Volume: 44

URL ABSTRACT

Semantic features have been playing a central role in investigating the nature of our conceptual representations. Yet the time and effort required to sample features from human raters has restricted their use to a limited set of manually curated concepts. Given recent success of transformer-based language models, we asked whether it was possible to use such models to automatically generate meaningful lists of properties for arbitrary object concepts and whether these models would produce features similar to those found in humans. We probed a GPT-3 model to generate semantic features for 1,854 objects and compared them to existing human feature norms. GPT-3 showed a similar distribution in the types of features and similar performance in predicting similarity, relatedness, and category membership. Together these results highlight the potential of large language models to capture important facets of human knowledge and yield a new approach for automatically generating interpretable feature sets.

VICE: Variational Interpretable Concept Embeddings.

Muttenthaler, L.; Zheng, C. Y.; McClure, P.; Vandermeulen, R. A.; Hebart, M. N.; Pereira, F.

2022 , Advances in Neural Information Processing Systems (NeurIPS)

URL ABSTRACT

A central goal in the cognitive sciences is the development of numerical models for mental representations of object concepts. This paper introduces Variational Interpretable Concept Embeddings (VICE), an approximate Bayesian method for embedding object concepts in a vector space using data collected from humans in a triplet odd-one-out task. VICE uses variational inference to obtain sparse, non-negative representations of object concepts with uncertainty estimates for the embedding values. These estimates are used to automatically select the dimensions that best explain the data. We derive a PAC learning bound for VICE that can be used to estimate generalization performance or determine a sufficient sample size for experimental design. VICE rivals or outperforms its predecessor, SPoSE, at predicting human behavior in the triplet odd-one-out task. Furthermore, VICE's object representations are more reproducible and consistent across random initializations, highlighting the unique advantage of using VICE for deriving interpretable embeddings from human behavior.

2021

THINGSvision: A Python toolbox for streamlining the extraction of activations from deep neural networks

Muttenthaler, L.; Hebart, M. N.

2021 , Frontiers in Neuroinformatics , Volume: 15 , pages: 45

URL PDF ABSTRACT

Over the past decade, deep neural network (DNN) models have received a lot of attention due to their near-human object classification performance and their excellent prediction of signals recorded from biological visual systems. To better understand the function of these networks and relate them to hypotheses about brain activity and behavior, researchers need to extract the activations to images across different DNN layers. The abundance of different DNN variants, however, can often be unwieldy, and the task of extracting DNN activations from different layers may be non-trivial and error-prone for someone without a strong computational background. Thus, researchers in the fields of cognitive science and computational neuroscience would benefit from a library or package that supports a user in the extraction task. THINGSvision is a new Python module that aims at closing this gap by providing a simple and unified tool for extracting layer activations for a wide range of pretrained and randomly-initialized neural network architectures, even for users with little to no programming experience. We demonstrate the general utility of THINGsvision by relating extracted DNN activations to a number of functional MRI and behavioral datasets using representational similarity analysis, which can be performed as an integral part of the toolbox. Together, THINGSvision enables researchers across diverse fields to extract features in a streamlined manner for their custom image dataset, thereby improving the ease of relating DNNs, brain activity, and behavior, and improving the reproducibility of findings in these research fields.

The organizational principles of de-differentiated topographic maps in somatosensory cortex

Liu, P.; Chrysidou, A.; Doehler, J.; Hebart, M. N.; Wolbers, T.; Kuehn, E.

2021 , eLife , Volume: 10 , pages: e60090

URL CODE ABSTRACT

Topographic maps are a fundamental feature of cortex architecture in the mammalian brain. One common theory is that the de-differentiation of topographic maps links to impairments in everyday behavior due to less precise functional map readouts. Here, we tested this theory by characterizing de-differentiated topographic maps in primary somatosensory cortex (SI) of younger and older adults by means of ultra-high resolution functional magnetic resonance imaging together with perceptual finger individuation and hand motor performance. Older adults’ SI maps showed similar amplitude and size to younger adults’ maps, but presented with less representational similarity between distant fingers. Larger population receptive field sizes in older adults’ maps did not correlate with behavior, whereas reduced cortical distances between D2 and D3 related to worse finger individuation but better motor performance. Our data uncover the drawbacks of a simple de-differentiation model of topographic map function, and motivate the introduction of feature-based models of cortical reorganization.

2020

The representation of object drawings and sketches in deep convolutional neural networks

Singer, J.; Seeliger, K.; Hebart, M.N.

2020 , NeurIPS Workshop SVRHM

URL PDF ABSTRACT

Drawings are universal in human culture and serve as tools to efficiently convey meaning with little visual information. Humans are adept at recognizing even highly abstracted drawings of objects, and their visual system has been shown to respond similarly to different object depictions. Yet, the processing of object drawings in deep convolutional neural networks (CNNs) has yielded conflicting results. While CNNs have been shown to perform poorly on drawings, there is evidence that representations in CNNs are similar for object photographs and drawings. Here, we resolve these disparate findings by probing the generalization ability of a CNN trained on natural object images for a set of photos, drawings and sketches of the same objects, with each depiction representing a different level of abstraction. We demonstrate that despite poor classification performance on drawings and sketches, the network exhibits a similar representational structure across levels of abstraction in intermediate layers which, however, disappears in later layers. Further, we show that a texture bias found in CNNs contributes both to the poor classification performance for drawings and the dissimilar representational structure, specifically in the later layers of the network. By finetuning only those layers on a database of object drawings, we show that features in early and intermediate layers learned on natural object photographs are indeed sufficient for downstream recognition of drawings. Our findings reconcile previous investigations on the generalization ability of CNNs for drawings and reveal both opportunities and limitations of CNNs as models for the representation and recognition of drawings and sketches.

Revealing the multidimensional mental representations of natural objects underlying human similarity judgements

Hebart, M.N.; Zheng, C.Y.; Pereira, F.; Baker, C.I.

2020 , Nature Human Behaviour , pages: 1173-1185

URL PDF ABSTRACT

Objects can be characterized according to a vast number of possible criteria (such as animacy, shape, colour and function), but some dimensions are more useful than others for making sense of the objects around us. To identify these core dimensions of object representations, we developed a data-driven computational model of similarity judgements for real-world images of 1,854 objects. The model captured most explainable variance in similarity judgements and produced 49 highly reproducible and meaningful object dimensions that reflect various conceptual and perceptual properties of those objects. These dimensions predicted external categorization behaviour and reflected typicality judgements of those categories. Furthermore, humans can accurately rate objects along these dimensions, highlighting their interpretability and opening up a way to generate similarity estimates from object dimensions alone. Collectively, these results demonstrate that human similarity judgements can be captured by a fairly low-dimensional, interpretable embedding that generalizes to external behaviour.

Current topics in Computational Cognitive Neuroscience (Editorial)

Hebart, M.N.; Schuck, N.W.

2020 , Neuropsychologia

URL ABSTRACT

Computational Cognitive Neuroscience is a discipline at the intersection of psychology, neuroscience and artificial intelligence. At its core is the development and comparison of computational models that allow the prediction of behavior, cognition and brain activity, with the long-term goal of providing a neurophysiologically plausible characterization of the underlying brain structure or function (Ashby and Helie, 2011; Kriegeskorte and Douglas, 2018; Love, 2015; O’Reilly and Munakata, 2000). Fueled by recent developments with machine learning techniques that solve cognitive tasks such as object recognition, decision making, or language processing (Krizhevsky et al., 2012; Mikolov et al., 2013; Mnih et al., 2015), computational cognitive neuroscientists have started to link these artificial intelligence approaches to neural processes (Huth et al., 2016; Stachenfeld et al., 2017; Yamins et al., 2014). This, in turn, has led to applications of computational modeling in neuroscience that have become increasingly sophisticated. Today, the field is moving fast, and hardly a year goes by without discoveries that seem like a true expansion of our horizon. These exciting developments motivated us to bring to life this Special Issue on Computational Cognitive Neuroscience.

Mechanisms of offline motor learning at a microscale of seconds in large-scale crowdsourced data

Bönstrup, M.; Iturrate, I.; Hebart, M.N.; Censor, N.; Cohen, L.G.

2020 , Science of Learning , pages: 1--10

URL ABSTRACT

Performance improvements during early human motor skill learning are suggested to be driven by short periods of rest during practice, at the scale of seconds. To reveal the unknown mechanisms behind these “micro-offline” gains, we leveraged the sampling power offered by online crowdsourcing (cumulative N over all experiments = 951). First, we replicated the original in-lab findings, demonstrating generalizability to subjects learning the task in their daily living environment (N = 389). Second, we show that offline improvements during rest are equivalent when significantly shortening practice period duration, thus confirming that they are not a result of recovery from performance fatigue (N = 118). Third, retroactive interference immediately after each practice period reduced the learning rate relative to interference after passage of time (N = 373), indicating stabilization of the motor memory at a microscale of several seconds. Finally, we show that random termination of practice periods did not impact offline gains, ruling out a contribution of predictive motor slowing (N = 71). Altogether, these results demonstrate that micro-offline gains indicate rapid, within-seconds consolidation accounting for early skill learning.

2019

THINGS: A database of 1,854 object concepts and more than 26,000 naturalistic object images

Hebart, M. N.; Dickter, A. H.; Kidder, A.; Kwok, W. Y.; Corriveau, A.; Van Wicklin, C.; Baker, C. I.

2019 , PloS one , pages: e0223792

URL PDF ABSTRACT

In recent years, the use of a large number of object concepts and naturalistic object images has been growing strongly in cognitive neuroscience research. Classical databases of object concepts are based mostly on a manually curated set of concepts. Further, databases of naturalistic object images typically consist of single images of objects cropped from their background, or a large number of naturalistic images of varying quality, requiring elaborate manual image curation. Here we provide a set of 1,854 diverse object concepts sampled systematically from concrete picturable and nameable nouns in the American English language. Using these object concepts, we conducted a large-scale web image search to compile a database of 26,107 high-quality naturalistic images of those objects, with 12 or more object images per concept and all images cropped to square size. Using crowdsourcing, we provide higher-level category membership for the 27 most common categories and validate them by relating them to representations in a semantic embedding derived from large text corpora. Finally, by feeding images through a deep convolutional neural network, we demonstrate that they exhibit high selectivity for different object concepts, while at the same time preserving variability of different object images within each concept. Together, the THINGS database provides a rich resource of object concepts and object images and offers a tool for both systematic and large-scale naturalistic research in the fields of psychology, neuroscience, and computer science.

2018

The same analysis approach: Practical protection against the pitfalls of novel neuroimaging analysis methods

Görgen, K.; Hebart, M. N.; Allefeld, C.; Haynes, J.-D.

2018 , Neuroimage , pages: 19--30

URL PDF ABSTRACT

Standard neuroimaging data analysis based on traditional principles of experimental design, modelling, and statistical inference is increasingly complemented by novel analysis methods, driven e.g. by machine learning methods. While these novel approaches provide new insights into neuroimaging data, they often have unexpected properties, generating a growing literature on possible pitfalls. We propose to meet this challenge by adopting a habit of systematic testing of experimental design, analysis procedures, and statistical inference. Specifically, we suggest to apply the analysis method used for experimental data also to aspects of the experimental design, simulated confounds, simulated null data, and control data. We stress the importance of keeping the analysis method the same in main and test analyses, because only this way possible confounds and unexpected properties can be reliably detected and avoided. We describe and discuss this Same Analysis Approach in detail, and demonstrate it in two worked examples using multivariate decoding. With these examples, we reveal two sources of error: A mismatch between counterbalancing (crossover designs) and cross-validation which leads to systematic below-chance accuracies, and linear decoding of a nonlinear effect, a difference in variance.

Deconstructing multivariate decoding for the study of brain function

Hebart, M.N; Baker, C.I.

2018 , Neuroimage , pages: 4--18

URL PDF ABSTRACT

Multivariate decoding methods were developed originally as tools to enable accurate predictions in real-world applications. The realization that these methods can also be employed to study brain function has led to their widespread adoption in the neurosciences. However, prior to the rise of multivariate decoding, the study of brain function was firmly embedded in a statistical philosophy grounded on univariate methods of data analysis. In this way, multivariate decoding for brain interpretation grew out of two established frameworks: multivariate decoding for predictions in real-world applications, and classical univariate analysis based on the study and interpretation of brain activation. We argue that this led to two confusions, one reflecting a mixture of multivariate decoding for prediction or interpretation, and the other a mixture of the conceptual and statistical philosophies underlying multivariate decoding and classical univariate analysis. Here we attempt to systematically disambiguate multivariate decoding for the study of brain function from the frameworks it grew out of. After elaborating these confusions and their consequences, we describe six, often unappreciated, differences between classical univariate analysis and multivariate decoding. We then focus on how the common interpretation of what is signal and noise changes in multivariate decoding. Finally, we use four examples to illustrate where these confusions may impact the interpretation of neuroimaging data. We conclude with a discussion of potential strategies to help resolve these confusions in interpreting multivariate decoding results, including the potential departure from multivariate decoding methods for the study of brain function.

The representational dynamics of task and object processing in humans

Hebart, M. N.; Bankson, B. B.; Harel, A.; Baker, C. I.; Cichy, R. M.

2018 , Elife , pages: e32816

URL ABSTRACT

Despite the importance of an observer’s goals in determining how a visual object is categorized, surprisingly little is known about how humans process the task context in which objects occur and how it may interact with the processing of objects. Using magnetoencephalography (MEG), functional magnetic resonance imaging (fMRI) and multivariate techniques, we studied the spatial and temporal dynamics of task and object processing. Our results reveal a sequence of separate but overlapping task-related processes spread across frontoparietal and occipitotemporal cortex. Task exhibited late effects on object processing by selectively enhancing task-relevant object features, with limited impact on the overall pattern of object representations. Combining MEG and fMRI data, we reveal a parallel rise in task-related signals throughout the cerebral cortex, with an increasing dominance of task over object representations from early to higher visual areas. Collectively, our results reveal the complex dynamics underlying task and object representations throughout human cortex.

The temporal evolution of conceptual object representations revealed through models of behavior, semantics and deep neural networks

Bankson, B. B.; Hebart, M. N.; Groen, I. I. A.; Baker, C. I.

2018 , NeuroImage , pages: 172--182

URL PDF CODE ABSTRACT

Visual object representations are commonly thought to emerge rapidly, yet it has remained unclear to what extent early brain responses reflect purely low-level visual features of these objects and how strongly those features contribute to later categorical or conceptual representations. Here, we aimed to estimate a lower temporal bound for the emergence of conceptual representations by defining two criteria that characterize such representations: 1) conceptual object representations should generalize across different exemplars of the same object, and 2) these representations should reflect high-level behavioral judgments. To test these criteria, we compared magnetoencephalography (MEG) recordings between two groups of participants (n = 16 per group) exposed to different exemplar images of the same object concepts. Further, we disentangled low-level from high-level MEG responses by estimating the unique and shared contribution of models of behavioral judgments, semantics, and different layers of deep neural networks of visual object processing. We find that 1) both generalization across exemplars as well as generalization of object-related signals across time increase after 150 ms, peaking around 230 ms; 2) representations specific to behavioral judgments emerged rapidly, peaking around 160 ms. Collectively, these results suggest a lower bound for the emergence of conceptual object representations around 150 ms following stimulus onset.

2016

The relationship between perceptual decision variables and confidence in the human brain

Hebart, M. N.; Schriever, Y.; Donner, T. H.; Haynes, J.-D.

2016 , Cerebral Cortex , pages: 118--130

URL PDF ABSTRACT

Perceptual confidence refers to the degree to which we believe in the accuracy of our percepts. Signal detection theory suggests that perceptual confidence is computed from an internal "decision variable," which reflects the amount of available information in favor of one or another perceptual interpretation of the sensory input. The neural processes underlying these computations have, however, remained elusive. Here, we used fMRI and multivariate decoding techniques to identify regions of the human brain that encode this decision variable and confidence during a visual motion discrimination task. We used observers' binary perceptual choices and confidence ratings to reconstruct the internal decision variable that governed the subjects' behavior. A number of areas in prefrontal and posterior parietal association cortex encoded this decision variable, and activity in the ventral striatum reflected the degree of perceptual confidence. Using a multivariate connectivity analysis, we demonstrate that patterns of brain activity in the right ventrolateral prefrontal cortex reflecting the decision variable were linked to brain signals in the ventral striatum reflecting confidence. Our results suggest that the representation of perceptual confidence in the ventral striatum is derived from a transformation of the continuous decision variable encoded in the cerebral cortex.

Analyzing neuroimaging data with subclasses: A shrinkage approach

Höhne, J.; Bartz, D.; Hebart, M. N.; Müller, K.-R.; Blankertz, B.

2016 , NeuroImage , pages: 740--751

URL PDF ABSTRACT

Among the numerous methods used to analyze neuroimaging data, Linear Discriminant Analysis (LDA) is commonly applied for binary classification problems. LDAs popularity derives from its simplicity and its competitive classification performance, which has been reported for various types of neuroimaging data.

Yet the standard LDA approach proves less than optimal for binary classification problems when additional label information (i.e. subclass labels) is present. Subclass labels allow to model structure in the data, which can be used to facilitate the classification task. In this paper, we illustrate how neuroimaging data exhibit subclass labels that may contain valuable information. We also show that the standard LDA classifier is unable to exploit subclass labels.

We introduce a novel method that allows subclass labels to be incorporated efficiently into the classifier. The novel method, which we call Relevance Subclass LDA (RSLDA), computes an individual classification hyperplane for each subclass. It is based on regularized estimators of the subclass mean and uses other subclasses as regularization targets. We demonstrate the applicability and performance of our method on data drawn from two different neuroimaging modalities: (I) EEG data from brain–computer interfacing with event-related potentials, and (II) fMRI data in response to different levels of visual motion. We show that RSLDA outperforms the standard LDA approach for both types of datasets. These findings illustrate the benefits of exploiting subclass structure in neuroimaging data. Finally, we show that our classifier also outputs regularization profiles, enabling researchers to interpret the subclass structure in a meaningful way.

RSLDA therefore yields increased classification accuracy as well as a better interpretation of neuroimaging data. Since both results are highly favorable, we suggest to apply RSLDA for various classification problems within neuroimaging and beyond.

Mesolimbic confidence signals guide perceptual learning in the absence of external feedback

Guggenmos, M.; Wilbertz, G.; Hebart, M. N.; Sterzer, P.

2016 , eLife , pages: e13388

URL ABSTRACT

It is well established that learning can occur without external feedback, yet normative reinforcement learning theories have difficulties explaining such instances of learning. Here, we propose that human observers are capable of generating their own feedback signals by monitoring internal decision variables. We investigated this hypothesis in a visual perceptual learning task using fMRI and confidence reports as a measure for this monitoring process. Employing a novel computational model in which learning is guided by confidence-based reinforcement signals, we found that mesolimbic brain areas encoded both anticipation and prediction error of confidence—in remarkable similarity to previous findings for external reward-based feedback. We demonstrate that the model accounts for choice and confidence reports and show that the mesolimbic confidence prediction error modulation derived through the model predicts individual learning success. These results provide a mechanistic neurobiological explanation for learning without external feedback by augmenting reinforcement models with confidence-based feedback.

An efficient data partitioning to improve classification performance while keeping parameters interpretable

Korjus, K.; Hebart, M.N.; Vicente, R.

2016 , PloS one , pages: e0161788

URL PDF ABSTRACT

Supervised machine learning methods typically require splitting data into multiple chunks for training, validating, and finally testing classifiers. For finding the best parameters of a classifier, training and validation are usually carried out with cross-validation. This is followed by application of the classifier with optimized parameters to a separate test set for estimating the classifier’s generalization performance. With limited data, this separation of test data creates a difficult trade-off between having more statistical power in estimating generalization performance versus choosing better parameters and fitting a better model. We propose a novel approach that we term “Cross-validation and cross-testing” improving this trade-off by re-using test data without biasing classifier performance. The novel approach is validated using simulated data and electrophysiological recordings in humans and rodents. The results demonstrate that the approach has a higher probability of discovering significant results than the standard approach of cross-validation and testing, while maintaining the nominal alpha level. In contrast to nested cross-validation, which is maximally efficient in re-using data, the proposed approach additionally maintains the interpretability of individual parameters. Taken together, we suggest an addition to currently used machine learning approaches which may be particularly useful in cases where model weights do not require interpretation, but parameters do.

Interaction of instrumental and goal-directed learning modulates prediction error representations in the ventral striatum

Guo, R.; Böhmer, W.; Hebart, M.; Chien, S.; Sommer, T.; Obermayer, K.; Gläscher, J.

2016 , Journal of Neuroscience , pages: 12650--12660

URL PDF ABSTRACT

Goal-directed and instrumental learning are both important controllers of human behavior. Learning about which stimulus event occurs in the environment and the reward associated with them allows humans to seek out the most valuable stimulus and move through the environment in a goal-directed manner. Stimulus–response associations are characteristic of instrumental learning, whereas response–outcome associations are the hallmark of goal-directed learning. Here we provide behavioral, computational, and neuroimaging results from a novel task in which stimulus–response and response–outcome associations are learned simultaneously but dominate behavior at different stages of the experiment. We found that prediction error representations in the ventral striatum depend on which type of learning dominates. Furthermore, the amygdala tracks the time-dependent weighting of stimulus–response versus response–outcome learning. Our findings suggest that the goal-directed and instrumental controllers dynamically engage the ventral striatum in representing prediction errors whenever one of them is dominating choice behavior.

2015

Serotonin and dopamine differentially affect appetitive and aversive general Pavlovian-to-instrumental transfer

Hebart, M.N.; Gläscher, J.

2015 , Psychopharmacology , pages: 437--451

URL PDF ABSTRACT

Rationale

Human motivation and decision-making is influenced by the interaction of Pavlovian and instrumental systems. The neurotransmitters dopamine and serotonin have been suggested to play a major role in motivation and decision-making, but how they affect this interaction in humans is largely unknown.

Objective

We investigated the effect of these neurotransmitters in a general Pavlovian-to-instrumental transfer (PIT) task which measured the nonspecific effect of appetitive and aversive Pavlovian cues on instrumental responses.

Methods

For that purpose, we used selective dietary depletion of the amino acid precursors of serotonin and dopamine: tryptophan (n = 34) and tyrosine/phenylalanine (n = 35), respectively, and compared the performance of these groups to a control group (n = 34) receiving a nondepleted (balanced) amino acid drink.

Results

We found that PIT differed between groups: Relative to the control group that exhibited only appetitive PIT, we found reduced appetitive PIT in the tyrosine/phenylalanine-depleted group and enhanced aversive PIT in the tryptophan-depleted group.

Conclusions

These results demonstrate a differential involvement of serotonin and dopamine in motivated behavior. They suggest that reductions in serotonin enhance the motivational influence of aversive stimuli on instrumental behavior and do not affect the influence of appetitive stimuli, while reductions in dopamine diminish the influence of appetitive stimuli. No conclusions could be drawn about how dopamine affects the influence of aversive stimuli. The interplay of both neurotransmitter systems allows for flexible and adaptive responses depending on the behavioral context.

Parietal and early visual cortices encode working memory content across mental transformations

Christophel, T. B.; Cichy, R. M.; Hebart, M. N.; Haynes, J.-D.

2015 , Neuroimage , pages: 198--206

URL PDF ABSTRACT

Active and flexible manipulations of memory contents “in the mind's eye” are believed to occur in a dedicated neural workspace, frequently referred to as visual working memory. Such a neural workspace should have two important properties: The ability to store sensory information across delay periods and the ability to flexibly transform sensory information. Here we used a combination of functional MRI and multivariate decoding to indentify such neural representations. Subjects were required to memorize a complex artificial pattern for an extended delay, then rotate the mental image as instructed by a cue and memorize this transformed pattern. We found that patterns of brain activity already in early visual areas and posterior parietal cortex encode not only the initially remembered image, but also the transformed contents after mental rotation. Our results thus suggest that the flexible and general neural workspace supporting visual working memory can be realized within posterior brain regions.

Memory detection using fMRI—Does the encoding context matter?

Peth, J.; Sommer, T.; Hebart, M. N.; Vossel, G.; Büchel, C.; Gamer, M.

2015 , NeuroImage , pages: 164--174

URL PDF ABSTRACT

Recent research revealed that the presentation of crime related details during the Concealed Information Test (CIT) reliably activates a network of bilateral inferior frontal, right medial frontal and right temporal–parietal brain regions. However, the ecological validity of these findings as well as the influence of the encoding context are still unclear. To tackle these questions, three different groups of subjects participated in the current study. Two groups of guilty subjects encoded critical details either only by planning (guilty intention group) or by really enacting (guilty action group) a complex, realistic mock crime. In addition, a group of informed innocent subjects encoded half of the relevant details in a neutral context. Univariate analyses showed robust activation differences between known relevant compared to neutral details in the previously identified ventral frontal–parietal network with no differences between experimental groups. Moreover, validity estimates for average changes in neural activity were similar between groups when focusing on the known details and did not differ substantially from the validity of electrodermal recordings. Additional multivariate analyses provided evidence for differential patterns of activity in the ventral fronto-parietal network between the guilty action and the informed innocent group and yielded higher validity coefficients for the detection of crime related knowledge when relying on whole brain data. Together, these findings demonstrate that an fMRI-based CIT enables the accurate detection of concealed crime related memories, largely independent of encoding context. On the one hand, this indicates that even persons who planned a (mock) crime could be validly identified as having specific crime related knowledge. On the other hand, innocents with such knowledge have a high risk of failing the test, at least when considering univariate changes of neural activation.

2014

Rapid fear detection relies on high spatial frequencies

Stein, T.; Seymour, K.; Hebart, M.N.; Sterzer, P.

2014 , Psychological Science , pages: 566--574

URL PDF ABSTRACT

Signals of threat—such as fearful faces—are processed with priority and have privileged access to awareness. This fear advantage is commonly believed to engage a specialized subcortical pathway to the amygdala that bypasses visual cortex and processes predominantly low-spatial-frequency information but is largely insensitive to high spatial frequencies. We tested visual detection of low- and high-pass-filtered fearful and neutral faces under continuous flash suppression and sandwich masking, and we found consistently that the fear advantage was specific to high spatial frequencies. This demonstrates that rapid fear detection relies not on low- but on high-spatial-frequency information—indicative of an involvement of cortical visual areas. These findings challenge the traditional notion that a subcortical pathway to the amygdala is essential for the initial processing of fear signals and support the emerging view that the cerebral cortex is crucial for the processing of ecologically relevant signals.

Representation of spatial information in key areas of the descending pain modulatory system

Ritter, C.; Hebart, M.N.; Wolbers, T.; Bingel, U.

2014 , Journal of Neuroscience , pages: 4634--4639

URL PDF ABSTRACT

Behavioral studies have demonstrated that descending pain modulation can be spatially specific, as is evident in placebo analgesia, which can be limited to the location at which pain relief is expected. This suggests that higher-order cortical structures of the descending pain modulatory system carry spatial information about the site of stimulation. Here, we used functional magnetic resonance imaging and multivariate pattern analysis in 15 healthy human volunteers to test whether spatial information of painful stimuli is represented in areas of the descending pain modulatory system. We show that the site of nociceptive stimulation (arm or leg) can be successfully decoded from local patterns of brain activity during the anticipation and receipt of painful stimulation in the rostral anterior cingulate cortex, the dorsolateral prefrontal cortices, and the contralateral parietal operculum. These results demonstrate that information regarding the site of nociceptive stimulation is represented in these brain regions. Attempts to predict arm and leg stimulation from the periaqueductal gray, control regions (e.g., white matter) or the control time interval in the intertrial phase did not allow for classifications above chance level. This finding represents an important conceptual advance in the understanding of endogenous pain control mechanisms by bridging the gap between previous behavioral and neuroimaging studies, suggesting a spatial specificity of endogenous pain control.

The Decoding Toolbox (TDT): a versatile software package for multivariate analyses of functional imaging data

Hebart, M.N.; Görgen, K.; Haynes, J.-D.

2014 , Frontiers in Neuroinformatics

URL ABSTRACT

The multivariate analysis of brain signals has recently sparked a great amount of interest, yet accessible and versatile tools to carry out decoding analyses are scarce. Here we introduce The Decoding Toolbox (TDT) which represents a user-friendly, powerful and flexible package for multivariate analysis of functional brain imaging data. TDT is written in Matlab and equipped with an interface to the widely used brain data analysis package SPM. The toolbox allows running fast whole-brain analyses, region-of-interest analyses and searchlight analyses, using machine learning classifiers, pattern correlation analysis, or representational similarity analysis. It offers automatic creation and visualization of diverse cross-validation schemes, feature scaling, nested parameter selection, a variety of feature selection methods, multiclass capabilities, and pattern reconstruction from classifier weights. While basic users can implement a generic analysis in one line of code, advanced users can extend the toolbox to their needs or exploit the structure to combine it with external high-performance classification toolboxes. The toolbox comes with an example data set which can be used to try out the various analysis methods. Taken together, TDT offers a promising option for researchers who want to employ multivariate analyses of brain activity patterns.

2012

What visual information is processed in the human dorsal stream? (Journal Club)

Hebart, M.N.; Hesselmann, G.

2012 , The Journal of Neuroscience , pages: 8107--8109

URL PDF

Human visual and parietal cortex encode visual choices independent of motor plans

Hebart, M.N.; Donner, T.H.; Haynes, J.-D.

2012 , Neuroimage , pages: 1393--1403

URL PDF ABSTRACT

Perceptual decision-making entails the transformation of graded sensory signals into categorical judgments. Often, there is a direct mapping between these judgments and specific motor responses. However, when stimulus–response mappings are fixed, neural activity underlying decision-making cannot be separated from neural activity reflecting motor planning. Several human neuroimaging studies have reported changes in brain activity associated with perceptual decisions. Nevertheless, to date it has remained unknown where and how specific choices are encoded in the human brain when motor planning is decoupled from the decision process. We addressed this question by having subjects judge the direction of motion of dynamic random dot patterns at various levels of motion strength while measuring their brain activity with fMRI. We used multivariate decoding analyses to search the whole brain for patterns of brain activity encoding subjects' choices. To decouple the decision process from motor planning, subjects were informed about the required motor response only after stimulus presentation. Patterns of fMRI signals in early visual and inferior parietal cortex predicted subjects' perceptual choices irrespective of motor planning. This was true across several levels of motion strength and even in the absence of any coherent stimulus motion. We also found that the cortical distribution of choice-selective brain signals depended on stimulus strength: While visual cortex carried most choice-selective information for strong motion, information in parietal cortex decreased with increasing motion coherence. These results demonstrate that human visual and inferior parietal cortex carry information about the visual decision in a more abstract format than can be explained by simple motor intentions. Both brain regions may be differentially involved in perceptual decision-making in the face of strong and weak sensory evidence.

Decoding the contents of visual short-term memory from human visual and parietal cortex

Christophel, T.B.; Hebart, M.N.; Haynes, J.-D.

2012 , Journal of Neuroscience , pages: 12983--12989

URL PDF ABSTRACT

How content is stored in the human brain during visual short-term memory (VSTM) is still an open question. Different theories postulate storage of remembered stimuli in prefrontal, parietal, or visual areas. Aiming at a distinction between these theories, we investigated the content-specificity of BOLD signals from various brain regions during a VSTM task using multivariate pattern classification. To participate in memory maintenance, candidate regions would need to have information about the different contents held in memory. We identified two brain regions where local patterns of fMRI signals represented the remembered content. Apart from the previously established storage in visual areas, we also discovered an area in the posterior parietal cortex where activity patterns allowed us to decode the specific stimuli held in memory. Our results demonstrate that storage in VSTM extends beyond visual areas, but no frontal regions were found. Thus, while frontal and parietal areas typically coactivate during VSTM, maintenance of content in the frontoparietal network might be limited to parietal cortex.

2011

Differential BOLD activity associated with subjective and objective reports during “blindsight” in normal observers

Hesselmann, G.; Hebart, M.; Malach, R.

2011 , Journal of Neuroscience , pages: 12936--12944

URL PDF ABSTRACT

The study of conscious visual perception invariably necessitates some means of report. Report can be either subjective, i.e., an introspective evaluation of conscious experience, or objective, i.e., a forced-choice discrimination regarding different stimulus states. However, the link between report type and fMRI-BOLD signals has remained unknown. Here we used continuous flash suppression to render target images invisible, and observed a long-lasting dissociation between subjective report of visibility and human subjects' forced-choice localization of targets (“blindsight”). Our results show a robust dissociation between brain regions and type of report. We find subjective visibility effects in high-order visual areas even under equal objective performance. No significant BOLD difference was found between correct and incorrect trials in these areas when subjective report was constant. On the other hand, objective performance was linked to the accuracy of multivariate pattern classification mainly in early visual areas. Together, our data support the notion that subjective and objective reports tap cortical signals of different location and amplitude within the visual cortex.

Breaking continuous flash suppression: A new measure of unconscious processing during interocular suppression?

Stein, T.; Hebart, M.N.; Sterzer, P.

2011 , Frontiers in Human Neuroscience , pages: 167

URL ABSTRACT

Until recently, it has been thought that under interocular suppression high-level visual processing is strongly inhibited if not abolished. With the development of continuous flash suppression (CFS), a variant of binocular rivalry, this notion has now been challenged by a number of reports showing that even high-level aspects of visual stimuli, such as familiarity, affect the time stimuli need to overcome CFS and emerge into awareness. In this "breaking continuous flash suppression" (b-CFS) paradigm, differential unconscious processing during suppression is inferred when (a) speeded detection responses to initially invisible stimuli differ, and (b) no comparable differences are found in non-rivalrous control conditions supposed to measure non-specific threshold differences between stimuli. The aim of the present study was to critically evaluate these assumptions. In six experiments we compared the detection of upright and inverted faces. We found that not only under CFS, but also in control conditions upright faces were detected faster and more accurately than inverted faces, although the effect was larger during CFS. However, reaction time (RT) distributions indicated critical differences between the CFS and the control condition. When RT distributions were matched, similar effect sizes were obtained in both conditions. Moreover, subjective ratings revealed that CFS and control conditions are not perceptually comparable. These findings cast doubt on the usefulness of non-rivalrous control conditions to rule out non-specific threshold differences as a cause of shorter detection latencies during CFS. Thus, at least in its present form, the b-CFS paradigm cannot provide unequivocal evidence for unconscious processing under interocular suppression. Nevertheless, our findings also demonstrate that the b-CFS paradigm can be fruitfully applied as a highly sensitive device to probe differences between stimuli in their potency to gain access to awareness.