Zheng, X. Y., Hebart, M.N., Dolan, R. J., Doeller, C. F., Cools, R., & Garvert, M. M.
The hippocampal-entorhinal system uses cognitive maps to represent spatial knowledge and other types of relational information, such as the transition probabilities between objects. However, objects can often be characterized in terms of different types of relations simultaneously, e.g. semantic similarities learned over the course of a lifetime as well as transitions experienced over a brief timeframe in an experimental setting. Here we ask how the hippocampal formation handles the embedding of stimuli in multiple relational structures that differ vastly in terms of their mode and timescale of acquisition: Does it integrate the different stimulus dimensions into one conjunctive map, or is each dimension represented in a parallel map? To this end, we reanalyzed functional magnetic resonance imaging (fMRI) data from Garvert et al. (2017) that had previously revealed an entorhinal map which coded for newly learnt statistical regularities. We used a triplet odd-one-out task to construct a semantic distance matrix for presented items and applied fMRI adaptation analysis to show that the degree of similarity of representations in bilateral hippocampus decreases as a function of semantic distance between presented objects. Importantly, while both maps localize to the hippocampal formation, this semantic map is anatomically distinct from the originally described entorhinal map. This finding supports the idea that the hippocampal-entorhinal system forms parallel cognitive maps reflecting the embedding of objects in diverse relational structures.
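A minimal sketch of the adaptation logic described above, assuming hypothetical inputs (per-trial ROI responses, the stimulus sequence, and the semantic distance matrix from the triplet task); this illustrates the analysis rationale, not the paper's actual pipeline:

```python
import numpy as np
from scipy import stats

def adaptation_effect(trial_betas, stim_ids, sem_dist):
    """Correlate each trial's ROI response with the semantic distance to the
    immediately preceding object. Under fMRI adaptation, a more similar
    predecessor should yield stronger suppression (a smaller response).

    trial_betas: (n_trials,) mean ROI response per trial
    stim_ids:    (n_trials,) object index shown on each trial
    sem_dist:    (n_objects, n_objects) semantic distance matrix
    """
    prev, curr = stim_ids[:-1], stim_ids[1:]
    dists = sem_dist[prev, curr]              # distance to preceding stimulus
    r, p = stats.pearsonr(dists, trial_betas[1:])
    return r, p
```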
Dima, D. C., Hebart, M.N., & Isik, L.
Understanding actions performed by others requires us to integrate different types of information about people, scenes, objects, and their interactions. What organizing dimensions does the mind use to make sense of this complex action space? To address this question, we collected intuitive similarity judgments across two large-scale sets of naturalistic videos depicting everyday actions. We used cross-validated sparse non-negative matrix factorization (NMF) to identify the structure underlying action similarity judgments. A low-dimensional representation, consisting of nine to ten dimensions, was sufficient to accurately reconstruct human similarity judgments. The dimensions were robust to stimulus set perturbations and reproducible in a separate odd-one-out experiment. Human labels mapped these dimensions onto semantic axes relating to food, work, and home life; social axes relating to people and emotions; and one visual axis related to scene setting. While highly interpretable, these dimensions did not share a clear one-to-one correspondence with prior hypotheses of action-relevant dimensions. Together, our results reveal a low-dimensional set of robust and interpretable dimensions that organize intuitive action similarity judgments and highlight the importance of data-driven investigations of behavioral representations.
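The core modeling step can be illustrated with a short sketch: non-negative matrix factorization applied to a similarity matrix, with reconstruction quality checked on an independent half of the judgments. The names and the split-half validation scheme are illustrative simplifications of the paper's cross-validated procedure:

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical inputs: two video-by-video similarity matrices built from
# disjoint halves of the similarity judgments (all entries non-negative).
def fit_and_validate(rsm_train, rsm_test, n_dims=10):
    model = NMF(n_components=n_dims, init="nndsvd", max_iter=1000)
    W = model.fit_transform(rsm_train)        # video x dimension loadings
    recon = W @ model.components_             # low-rank reconstruction
    iu = np.triu_indices_from(rsm_test, k=1)  # unique pairs only
    return np.corrcoef(recon[iu], rsm_test[iu])[0, 1]
```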
Stoinski, L.A., Perkuhn, J., & Hebart, M.N.
The need for well-curated object concepts and images to study visual object processing has grown significantly in recent years. To address this, we previously developed THINGS (Hebart et al., 2019), a large-scale database of 1,854 systematically sampled object concepts with 26,107 high-quality naturalistic images of these concepts. With THINGS+ we aim to extend THINGS by adding concept-specific and image-specific norms and metadata. Concept-specific norms were collected for all 1,854 object concepts for the object properties real-world size, manmadeness, preciousness, liveliness, heaviness, naturalness, ability to move, graspability, holdability, ability to be moved, pleasantness, and arousal. Further, we extended high-level categorization to 53 superordinate categories and collected typicality ratings for members of all 53 categories. Image-specific metadata includes measures of nameability and recognizability for objects in all 26,107 images. To this end, we asked participants to provide labels for prominent objects depicted in each of the 26,107 images and measured the alignment with the original object concept. Finally, to present example images in publications without copyright restrictions, we identified one new public domain image per object concept. In this study, we demonstrate a high consistency of property ratings (r = 0.92-0.99, M = 0.98, SD = 0.34) and typicality ratings (r = 0.88-0.98; M = 0.96, SD = 0.19), with arousal ratings as the only exception (r = 0.69). Correlations of our data with external norms were moderate to high for object properties (r = 0.44-0.95; M = 0.85, SD = 0.32) and typicality scores (r = 0.72-0.88; M = 0.79, SD = 0.18), again with the lowest validity for arousal (r = 0.30-0.52). To summarize, THINGS+ provides a broad, externally validated extension to existing object norms and an important extension to THINGS as a general resource of object concepts, images, and category memberships. Our norms, metadata, and images provide a detailed selection of stimuli and control variables for a wide range of research on object processing and semantic memory.
Schmidt, F.*, Hebart, M.N.*, Schmid, A., & Fleming, R.
Visually categorizing and comparing materials is crucial for our everyday behaviour. Given the dramatic variability in their visual appearance and functional significance, what organizational principles underlie the internal representation of materials? To address this question, here we use a large-scale data-driven approach to uncover the core latent dimensions in our mental representation of materials. In a first step, we assembled a new image dataset (STUFF dataset) consisting of 600 photographs of 200 systematically sampled material classes. Next, we used these images to crowdsource 1.87 million triplet similarity judgments. Based on the responses, we then modelled the assumed cognitive process underlying these choices by quantifying each image as a sparse, non-negative vector in a multidimensional embedding space. The resulting embedding predicted material similarity judgments in an independent test set close to the human noise ceiling and accurately reconstructed the similarity matrix of all 600 images in the STUFF dataset. We found that representations of individual material images were captured by a combination of 36 material dimensions that were highly reproducible and interpretable, comprising perceptual (e.g., “grainy”, “blue”) as well as conceptual (e.g., “mineral”, “viscous”) dimensions. These results have broad implications for understanding material perception, its natural dimensions, and our ability to organize materials into classes.
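The triplet modeling approach lends itself to a compact sketch: each image is assigned a sparse, non-negative vector, and the odd-one-out choice is modeled as a softmax over the three pairwise dot-product similarities. This is a simplified, hypothetical implementation of the general SPoSE-style method, not the authors' exact code:

```python
import torch

def fit_embedding(triplets, n_items, n_dims=50, l1=0.01, steps=5000):
    """triplets: LongTensor (n_trials, 3); columns hold the two items chosen
    as most similar (i, j) and the odd-one-out k."""
    X = torch.randn(n_items, n_dims).abs().requires_grad_(True)
    opt = torch.optim.Adam([X], lr=0.01)
    i, j, k = triplets.T
    for _ in range(steps):
        sims = torch.stack([(X[i] * X[j]).sum(-1),   # pair similarity
                            (X[i] * X[k]).sum(-1),
                            (X[j] * X[k]).sum(-1)], dim=1)
        # negative log-probability that (i, j) is the most similar pair,
        # plus an L1 penalty encouraging sparse dimensions
        loss = -torch.log_softmax(sims, dim=1)[:, 0].mean() + l1 * X.abs().mean()
        opt.zero_grad(); loss.backward(); opt.step()
        with torch.no_grad():
            X.clamp_(min=0)                          # enforce non-negativity
    return X.detach()
```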
Singer, J.J.D., Cichy, R.M., & Hebart, M.N.
2023, Journal of Neuroscience, Volume: 43, pages: 484-500
Drawings offer a simple and efficient way to communicate meaning. While line drawings capture only coarsely how objects look in reality, we still perceive them as resembling real-world objects. Previous work has shown that this perceived similarity is mirrored by shared neural representations for drawings and natural images, which suggests that similar mechanisms underlie the recognition of both. However, other work has proposed that representations of drawings and natural images become similar only after substantial processing has taken place, suggesting distinct mechanisms. To arbitrate between those alternatives, we measured brain responses resolved in space and time using fMRI and MEG, respectively, while human participants (female and male) viewed images of objects depicted as photographs, line drawings, or sketch-like drawings. Using multivariate decoding, we demonstrate that object category information emerged similarly fast and across overlapping regions in occipital, ventral-temporal, and posterior parietal cortex for all types of depiction, yet with smaller effects at higher levels of visual abstraction. In addition, cross-decoding between depiction types revealed strong generalization of object category information from early processing stages on. Finally, by combining fMRI and MEG data using representational similarity analysis, we found that visual information traversed similar processing stages for all types of depiction, yet with an overall stronger representation for photographs. Together, our results demonstrate broad commonalities in the neural dynamics of object recognition across types of depiction, thus providing clear evidence for shared neural mechanisms underlying recognition of natural object images and abstract drawings.
Josephs, E., Hebart, M.N., & Konkle, T.
Near-scale environments, like work desks, restaurant place settings or lab benches, are the interface of our hand-based interactions with the world. How are our conceptual representations of these environments organized? What properties distinguish among reachspaces, and why? We obtained 1.25 million similarity judgments on 990 reachspace images, and generated a 30-dimensional embedding which accurately predicts these judgments. Examination of the embedding dimensions revealed key properties underlying these judgments, such as reachspace layout, affordance, and visual appearance. Clustering performed over the embedding revealed four distinct interpretable classes of reachspaces, distinguishing among spaces related to food, electronics, analog activities, and storage or display. Finally, we found that reachspace similarity ratings were better predicted by the function of the spaces than their locations, suggesting that reachspaces are largely conceptualized in terms of the actions they support. Altogether, these results reveal the behaviorally-relevant principles that structure our internal representations of reach-relevant environments.
Kramer, M.A., Hebart, M.N., Baker, C.I., & Bainbridge, W.A.
What makes certain images more memorable than others? While much of memory research has focused on participant effects, recent studies employing a stimulus-centric perspective have sparked debate on the determinants of memory, including the roles of semantic and visual features and whether the most prototypical or atypical items are best remembered. Prior studies have typically relied on constrained stimulus sets, limiting a generalized view of the features underlying what we remember. Here, we collected over 1 million memory ratings for a naturalistic dataset of 26,107 object images designed to comprehensively sample concrete objects. We establish a model of object features that is predictive of image memorability and examine whether memorability can be accounted for by the typicality of the objects. We find that semantic features exert a stronger influence than perceptual features on what we remember and that the relationship between memorability and typicality is more complex than a simple positive or negative association alone.
Hebart, M.N.*, Contier, O.*, Teichmann, L.*, Rockter, A., Zheng, C.Y., Kidder, A., Corriveau, A., Vaziri-Pashkam, M., & Baker, C.I.
Understanding object representations requires a broad, comprehensive sampling of the objects in our visual world with dense measurements of brain activity and behavior. Here, we present THINGS-data, a multimodal collection of large-scale neuroimaging and behavioral datasets in humans, comprising densely sampled functional MRI and magnetoencephalographic recordings, as well as 4.70 million similarity judgments in response to thousands of photographic images for up to 1,854 object concepts. THINGS-data is unique in its breadth of richly annotated objects, allowing for testing countless hypotheses at scale while assessing the reproducibility of previous findings. Beyond the unique insights promised by each individual dataset, the multimodality of THINGS-data allows combining datasets for a much broader view into object processing than previously possible. Our analyses demonstrate the high quality of the datasets and provide five examples of hypothesis-driven and data-driven applications. THINGS-data constitutes the core public release of the THINGS initiative (https://things-initiative.org) for bridging the gap between disciplines and advancing cognitive neuroscience.
Grootswagers, T., Zhou, I., Robinson, A.K., Hebart, M.N., Carlson, T.A.
2022, Scientific Data, Volume: 9, pages: 3
The neural basis of object recognition and semantic knowledge has been extensively studied but the high dimensionality of object space makes it challenging to develop overarching theories on how the brain organises object knowledge. To help understand how the brain allows us to recognise, categorise, and represent objects and object categories, there is a growing interest in using large-scale image databases for neuroimaging experiments. In the current paper, we present THINGS-EEG, a dataset containing human electroencephalography responses from 50 subjects to 1,854 object concepts and 22,248 images in the THINGS stimulus set, a manually curated and high-quality image database that was specifically designed for studying human vision. The THINGS-EEG dataset provides neuroimaging recordings to a systematic collection of objects and concepts and can therefore support a wide array of research to understand visual object processing in the human brain.
Singer, J., Seeliger, K., Kietzmann, T.C., Hebart, M.N.
2022, Journal of Vision, Volume: 22, pages: 1-19
Line drawings convey meaning with just a few strokes. Despite strong simplifications, humans can recognize objects depicted in such abstracted images without effort. To what degree do deep convolutional neural networks (CNNs) mirror this human ability to generalize to abstracted object images? While CNNs trained on natural images have been shown to exhibit poor classification performance on drawings, other work has demonstrated highly similar latent representations in the networks for abstracted and natural images. Here, we address these seemingly conflicting findings by analyzing the activation patterns of a CNN trained on natural images across a set of photographs, drawings, and sketches of the same objects and comparing them to human behavior. We find a highly similar representational structure across levels of visual abstraction in early and intermediate layers of the network. This similarity, however, does not translate to later stages in the network, resulting in low classification performance for drawings and sketches. We identified that texture bias in CNNs contributes to the dissimilar representational structure in late layers and the poor performance on drawings. Finally, by fine-tuning late network layers with object drawings, we show that performance can be largely restored, demonstrating the general utility of features learned on natural images in early and intermediate layers for the recognition of drawings. In conclusion, generalization to abstracted images, such as drawings, seems to be an emergent property of CNNs trained on natural images, which is, however, suppressed by domain-related biases that arise during later processing stages in the network.
Kaniuth, P., Hebart, M.N.
2022, NeuroImage, Volume: 257, pages: 119294
Representational Similarity Analysis (RSA) has emerged as a popular method for relating representational spaces from human brain activity, behavioral data, and computational models. RSA is based on the comparison of representational (dis-)similarity matrices (RDMs or RSMs), which characterize the pairwise (dis-)similarities of all conditions across all features (e.g. fMRI voxels or units of a model). However, classical RSA treats each feature as equally important. This ‘equal weights’ assumption contrasts with the flexibility of multivariate decoding, which reweights individual features for predicting a target variable. As a consequence, classical RSA may lead researchers to underestimate the correspondence between a model and a brain region and, in case of model comparison, may lead them to select an inferior model. The aim of this work is twofold: First, we sought to broadly test feature-reweighted RSA (FR-RSA) applied to computational models and reveal the extent to which reweighting model features improves RSM correspondence and affects model selection. Previous work suggested that reweighting can improve model selection in RSA but it has remained unclear to what extent these results generalize across datasets and data modalities. To draw more general conclusions, we utilized a range of publicly available datasets and three popular deep neural networks (DNNs). Second, we propose voxel-reweighted RSA, a novel use case of FR-RSA that reweights fMRI voxels, mirroring the rationale of multivariate decoding of optimally combining voxel activity patterns. We found that reweighting individual model units markedly improved the fit between model RSMs and target RSMs derived from several fMRI and behavioral datasets and affected model selection, highlighting the importance of considering FR-RSA. For voxel-reweighted RSA, improvements in RSM correspondence were even more pronounced, demonstrating the utility of this novel approach. We additionally show that classical noise ceilings can be exceeded when FR-RSA is applied and propose an updated approach for their computation. Taken together, our results broadly validate the use of FR-RSA for improving the fit between computational models, brain, and behavioral data, allowing us to better adjudicate between competing computational models. Further, our results suggest that FR-RSA applied to brain measurement channels could become an important new method to assess the correspondence between representational spaces.
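The central idea of FR-RSA can be sketched in a few lines: every model unit contributes its own pairwise similarity pattern, and non-negative weights over units are fit so that their weighted sum best matches the target RSM. This toy version uses simple activation products as unit-wise similarities and omits the cross-validation over conditions that is essential in practice to avoid overfitting:

```python
import numpy as np
from scipy.optimize import nnls

def fr_rsa_weights(acts, target_rsm):
    """acts: (n_conditions, n_units) model activations;
    target_rsm: (n_conditions, n_conditions) target similarity matrix."""
    iu = np.triu_indices(acts.shape[0], k=1)
    # per-unit similarity "feature": product of the unit's activations per pair
    per_unit = np.stack([np.outer(a, a)[iu] for a in acts.T], axis=1)
    w, _ = nnls(per_unit, target_rsm[iu])   # non-negative reweighting of units
    return w
```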
Hansen, H., Hebart, M.N.
2022, Proceedings of the Annual Meeting of the Cognitive Science Society, Volume: 44
Semantic features have long played a central role in investigating the nature of our conceptual representations. Yet the time and effort required to sample features from human raters has restricted their use to a limited set of manually curated concepts. Given the recent success of transformer-based language models, we asked whether it was possible to use such models to automatically generate meaningful lists of properties for arbitrary object concepts and whether these models would produce features similar to those found in humans. We probed a GPT-3 model to generate semantic features for 1,854 objects and compared them to existing human feature norms. GPT-3 showed a similar distribution in the types of features and similar performance in predicting similarity, relatedness, and category membership. Together, these results highlight the potential of large language models to capture important facets of human knowledge and yield a new approach for automatically generating interpretable feature sets.
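The generation step reduces to prompting a completion model per concept and parsing the returned list. The sketch below keeps the API abstract (the `complete` callable stands in for any text-completion backend), and the prompt wording is illustrative, not the paper's exact prompt:

```python
def generate_features(concept, complete, n_features=10):
    """Ask a language model to list semantic features for a concept.

    complete: a callable mapping a prompt string to completion text
              (e.g. a thin wrapper around an LLM API of your choice).
    """
    prompt = (f"List {n_features} properties that describe a {concept}, "
              f"one per line:\n")
    text = complete(prompt)
    # parse one feature per line, stripping list markers and whitespace
    return [line.strip("- ").strip() for line in text.splitlines() if line.strip()]
```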
Muttenthaler, L., Zheng, C. Y., McClure, P., Vandermeulen, R. A., Hebart, M.N. & Pereira, F.
2022, Advances in Neural Information Processing Systems (NeurIPS)
A central goal in the cognitive sciences is the development of numerical models for mental representations of object concepts. This paper introduces Variational Interpretable Concept Embeddings (VICE), an approximate Bayesian method for embedding object concepts in a vector space using data collected from humans in a triplet odd-one-out task. VICE uses variational inference to obtain sparse, non-negative representations of object concepts with uncertainty estimates for the embedding values. These estimates are used to automatically select the dimensions that best explain the data. We derive a PAC learning bound for VICE that can be used to estimate generalization performance or determine a sufficient sample size for experimental design. VICE rivals or outperforms its predecessor, SPoSE, at predicting human behavior in the triplet odd-one-out task. Furthermore, VICE's object representations are more reproducible and consistent across random initializations, highlighting the unique advantage of using VICE for deriving interpretable embeddings from human behavior.
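A heavily simplified sketch of one VICE-style training step: each embedding entry has a Gaussian variational posterior (mean, log-sigma), samples are drawn via the reparameterization trick, and the triplet likelihood is combined with a complexity penalty. VICE itself uses a spike-and-slab prior; a standard-normal prior is substituted here for brevity:

```python
import torch

def vice_step(mu, log_sigma, triplets, opt, kl_weight=1e-4):
    """mu, log_sigma: (n_items, n_dims) variational parameters (require grad);
    triplets: LongTensor (n_trials, 3) with pair (i, j) and odd-one-out k."""
    eps = torch.randn_like(mu)
    X = torch.relu(mu + eps * log_sigma.exp())    # sampled non-negative embedding
    i, j, k = triplets.T
    sims = torch.stack([(X[i] * X[j]).sum(-1),
                        (X[i] * X[k]).sum(-1),
                        (X[j] * X[k]).sum(-1)], dim=1)
    nll = -torch.log_softmax(sims, dim=1)[:, 0].mean()
    # KL divergence of N(mu, sigma^2) from a standard-normal prior
    kl = (mu.pow(2) + log_sigma.exp().pow(2) - 2 * log_sigma - 1).mean() / 2
    loss = nll + kl_weight * kl
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```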
Muttenthaler, L. & Hebart, M.N.
2021, Frontiers in Neuroinformatics, Volume: 15, pages: 45
Over the past decade, deep neural network (DNN) models have received a lot of attention due to their near-human object classification performance and their excellent prediction of signals recorded from biological visual systems. To better understand the function of these networks and relate them to hypotheses about brain activity and behavior, researchers need to extract the activations to images across different DNN layers. The abundance of different DNN variants, however, can often be unwieldy, and the task of extracting DNN activations from different layers may be non-trivial and error-prone for someone without a strong computational background. Thus, researchers in the fields of cognitive science and computational neuroscience would benefit from a library or package that supports a user in the extraction task. THINGSvision is a new Python module that aims at closing this gap by providing a simple and unified tool for extracting layer activations for a wide range of pretrained and randomly-initialized neural network architectures, even for users with little to no programming experience. We demonstrate the general utility of THINGSvision by relating extracted DNN activations to a number of functional MRI and behavioral datasets using representational similarity analysis, which can be performed as an integral part of the toolbox. Together, THINGSvision enables researchers across diverse fields to extract features in a streamlined manner for their custom image dataset, thereby improving the ease of relating DNNs, brain activity, and behavior, and improving the reproducibility of findings in these research fields.
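The task THINGSvision streamlines can be illustrated in plain PyTorch with a forward hook (this generic sketch is not the toolbox's own API; the model and layer names are examples):

```python
import torch
from torchvision import models

def extract_layer(images, layer_name="features.10"):
    """Extract activations from one layer of a pretrained AlexNet.

    images: (n, 3, 224, 224) tensor, preprocessed/normalized beforehand.
    Returns an (n, n_features) activation matrix.
    """
    model = models.alexnet(weights="DEFAULT").eval()
    store = {}
    layer = dict(model.named_modules())[layer_name]
    hook = layer.register_forward_hook(
        lambda m, inp, out: store.update(act=out.flatten(1).detach()))
    with torch.no_grad():
        model(images)       # hook captures the layer's output on the way through
    hook.remove()
    return store["act"]
```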
Liu, P., Chrysidou, A., Doehler, J., Hebart, M.N., Wolbers, T., & Kuehn, E.
2021, eLife, Volume: 10, pages: e60090
Topographic maps are a fundamental feature of cortex architecture in the mammalian brain. One common theory is that the de-differentiation of topographic maps links to impairments in everyday behavior due to less precise functional map readouts. Here, we tested this theory by characterizing de-differentiated topographic maps in primary somatosensory cortex (SI) of younger and older adults by means of ultra-high resolution functional magnetic resonance imaging together with perceptual finger individuation and hand motor performance. Older adults’ SI maps showed similar amplitude and size to younger adults’ maps, but presented with less representational similarity between distant fingers. Larger population receptive field sizes in older adults’ maps did not correlate with behavior, whereas reduced cortical distances between D2 and D3 related to worse finger individuation but better motor performance. Our data uncover the drawbacks of a simple de-differentiation model of topographic map function, and motivate the introduction of feature-based models of cortical reorganization.
Singer, J., Seeliger, K., & Hebart, M.N.
2020, NeurIPS Workshop SVRHM
Drawings are universal in human culture and serve as tools to efficiently convey meaning with little visual information. Humans are adept at recognizing even highly abstracted drawings of objects, and their visual system has been shown to respond similarly to different object depictions. Yet, the processing of object drawings in deep convolutional neural networks (CNNs) has yielded conflicting results. While CNNs have been shown to perform poorly on drawings, there is evidence that representations in CNNs are similar for object photographs and drawings. Here, we resolve these disparate findings by probing the generalization ability of a CNN trained on natural object images for a set of photos, drawings and sketches of the same objects, with each depiction representing a different level of abstraction. We demonstrate that despite poor classification performance on drawings and sketches, the network exhibits a similar representational structure across levels of abstraction in intermediate layers which, however, disappears in later layers. Further, we show that a texture bias found in CNNs contributes both to the poor classification performance for drawings and the dissimilar representational structure, specifically in the later layers of the network. By finetuning only those layers on a database of object drawings, we show that features in early and intermediate layers learned on natural object photographs are indeed sufficient for downstream recognition of drawings. Our findings reconcile previous investigations on the generalization ability of CNNs for drawings and reveal both opportunities and limitations of CNNs as models for the representation and recognition of drawings and sketches.
Hebart, M.N., Zheng, C.Y., Pereira, F., & Baker, C.I.
2020, Nature Human Behaviour, pages: 1173-1185
Objects can be characterized according to a vast number of possible criteria (such as animacy, shape, colour and function), but some dimensions are more useful than others for making sense of the objects around us. To identify these core dimensions of object representations, we developed a data-driven computational model of similarity judgements for real-world images of 1,854 objects. The model captured most explainable variance in similarity judgements and produced 49 highly reproducible and meaningful object dimensions that reflect various conceptual and perceptual properties of those objects. These dimensions predicted external categorization behaviour and reflected typicality judgements of those categories. Furthermore, humans can accurately rate objects along these dimensions, highlighting their interpretability and opening up a way to generate similarity estimates from object dimensions alone. Collectively, these results demonstrate that human similarity judgements can be captured by a fairly low-dimensional, interpretable embedding that generalizes to external behaviour.
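Predicting behavior from the learned embedding reduces to a simple choice rule: the probability that a pair is judged most similar is a softmax over the three pairwise dot products. A minimal sketch, with `embedding` standing in for a hypothetical object-by-dimension array:

```python
import numpy as np

def predict_triplet(embedding, i, j, k):
    """Probability that (i, j) is chosen as the most similar pair,
    i.e. that k is the odd one out, under a dot-product choice model."""
    s = np.array([embedding[i] @ embedding[j],
                  embedding[i] @ embedding[k],
                  embedding[j] @ embedding[k]])
    p = np.exp(s - s.max())       # numerically stable softmax
    return p[0] / p.sum()
```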
Hebart, M.N. & Schuck, N.W.
2020, Neuropsychologia
Computational Cognitive Neuroscience is a discipline at the intersection of psychology, neuroscience and artificial intelligence. At its core is the development and comparison of computational models that allow the prediction of behavior, cognition and brain activity, with the long-term goal of providing a neurophysiologically plausible characterization of the underlying brain structure or function (Ashby and Helie, 2011; Kriegeskorte and Douglas, 2018; Love, 2015; O’Reilly and Munakata, 2000). Fueled by recent developments in machine learning techniques that solve cognitive tasks such as object recognition, decision making, or language processing (Krizhevsky et al., 2012; Mikolov et al., 2013; Mnih et al., 2015), computational cognitive neuroscientists have started to link these artificial intelligence approaches to neural processes (Huth et al., 2016; Stachenfeld et al., 2017; Yamins et al., 2014). This, in turn, has led to applications of computational modeling in neuroscience that have become increasingly sophisticated. Today, the field is moving fast, and hardly a year goes by without discoveries that seem like a true expansion of our horizon. These exciting developments motivated us to bring to life this Special Issue on Computational Cognitive Neuroscience.
Bönstrup, M., Iturrate, I., Hebart, M.N., Censor, N., & Cohen, L.G.
2020, npj Science of Learning, pages: 1-10
Performance improvements during early human motor skill learning are suggested to be driven by short periods of rest during practice, at the scale of seconds. To reveal the unknown mechanisms behind these “micro-offline” gains, we leveraged the sampling power offered by online crowdsourcing (cumulative N over all experiments = 951). First, we replicated the original in-lab findings, demonstrating generalizability to subjects learning the task in their daily living environment (N = 389). Second, we show that offline improvements during rest are equivalent when significantly shortening practice period duration, thus confirming that they are not a result of recovery from performance fatigue (N = 118). Third, retroactive interference immediately after each practice period reduced the learning rate relative to interference after passage of time (N = 373), indicating stabilization of the motor memory at a microscale of several seconds. Finally, we show that random termination of practice periods did not impact offline gains, ruling out a contribution of predictive motor slowing (N = 71). Altogether, these results demonstrate that micro-offline gains indicate rapid, within-seconds consolidation accounting for early skill learning.
Hebart, M.N., Dickter, A.H., Kidder, A., Kwok, W.Y., Corriveau, A., Van Wicklin, C., & Baker, C.I.
2019, PLoS ONE, pages: e0223792
In recent years, the use of large numbers of object concepts and naturalistic object images in cognitive neuroscience research has grown rapidly. Classical databases of object concepts are based mostly on a manually curated set of concepts. Further, databases of naturalistic object images typically consist of single images of objects cropped from their background, or a large number of naturalistic images of varying quality, requiring elaborate manual image curation. Here we provide a set of 1,854 diverse object concepts sampled systematically from concrete picturable and nameable nouns in the American English language. Using these object concepts, we conducted a large-scale web image search to compile a database of 26,107 high-quality naturalistic images of those objects, with 12 or more object images per concept and all images cropped to square size. Using crowdsourcing, we provide higher-level category membership for the 27 most common categories and validate them by relating them to representations in a semantic embedding derived from large text corpora. Finally, by feeding images through a deep convolutional neural network, we demonstrate that they exhibit high selectivity for different object concepts, while at the same time preserving variability of different object images within each concept. Together, the THINGS database provides a rich resource of object concepts and object images and offers a tool for both systematic and large-scale naturalistic research in the fields of psychology, neuroscience, and computer science.
Görgen, K., Hebart, M.N., Allefeld, C., & Haynes, J.-D.
2018, NeuroImage, pages: 19-30
Standard neuroimaging data analysis based on traditional principles of experimental design, modelling, and statistical inference is increasingly complemented by novel analysis methods, driven e.g. by machine learning. While these novel approaches provide new insights into neuroimaging data, they often have unexpected properties, generating a growing literature on possible pitfalls. We propose to meet this challenge by adopting a habit of systematic testing of experimental design, analysis procedures, and statistical inference. Specifically, we suggest applying the analysis method used for experimental data also to aspects of the experimental design, simulated confounds, simulated null data, and control data. We stress the importance of keeping the analysis method the same in main and test analyses, because only in this way can possible confounds and unexpected properties be reliably detected and avoided. We describe and discuss this Same Analysis Approach in detail and demonstrate it in two worked examples using multivariate decoding. With these examples, we reveal two sources of error: a mismatch between counterbalancing (crossover designs) and cross-validation, which leads to systematic below-chance accuracies, and linear decoding of a nonlinear effect (a difference in variance).
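The logic of the Same Analysis Approach can be sketched as follows: the identical classifier and cross-validation scheme is run on the real data, on simulated null data, and on a simulated confound, so that pitfalls surface in the very pipeline used for inference. Function and variable names are illustrative:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

def same_analysis(X, y, cv, confound):
    """Run the identical decoding pipeline on (a) real data, (b) simulated
    null data with the same labels and design, and (c) a simulated confound.
    Systematic deviations from chance in (b) or (c) expose properties of the
    pipeline itself before the real result is interpreted."""
    clf = LinearSVC()
    acc_real = cross_val_score(clf, X, y, cv=cv).mean()
    acc_null = cross_val_score(clf, np.random.randn(*X.shape), y, cv=cv).mean()
    acc_conf = cross_val_score(clf, confound, y, cv=cv).mean()
    return acc_real, acc_null, acc_conf
```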
Hebart, M.N. & Baker, C.I.
2018, NeuroImage, pages: 4-18
Multivariate decoding methods were developed originally as tools to enable accurate predictions in real-world applications. The realization that these methods can also be employed to study brain function has led to their widespread adoption in the neurosciences. However, prior to the rise of multivariate decoding, the study of brain function was firmly embedded in a statistical philosophy grounded on univariate methods of data analysis. In this way, multivariate decoding for brain interpretation grew out of two established frameworks: multivariate decoding for predictions in real-world applications, and classical univariate analysis based on the study and interpretation of brain activation. We argue that this led to two confusions, one reflecting a mixture of multivariate decoding for prediction or interpretation, and the other a mixture of the conceptual and statistical philosophies underlying multivariate decoding and classical univariate analysis. Here we attempt to systematically disambiguate multivariate decoding for the study of brain function from the frameworks it grew out of. After elaborating these confusions and their consequences, we describe six, often unappreciated, differences between classical univariate analysis and multivariate decoding. We then focus on how the common interpretation of what is signal and noise changes in multivariate decoding. Finally, we use four examples to illustrate where these confusions may impact the interpretation of neuroimaging data. We conclude with a discussion of potential strategies to help resolve these confusions in interpreting multivariate decoding results, including the potential departure from multivariate decoding methods for the study of brain function.
Hebart, M.N., Bankson, B.B., Harel, A., Baker, C.I.*, & Cichy, R.M.*
2018, eLife, pages: e32816
Despite the importance of an observer’s goals in determining how a visual object is categorized, surprisingly little is known about how humans process the task context in which objects occur and how it may interact with the processing of objects. Using magnetoencephalography (MEG), functional magnetic resonance imaging (fMRI) and multivariate techniques, we studied the spatial and temporal dynamics of task and object processing. Our results reveal a sequence of separate but overlapping task-related processes spread across frontoparietal and occipitotemporal cortex. Task exhibited late effects on object processing by selectively enhancing task-relevant object features, with limited impact on the overall pattern of object representations. Combining MEG and fMRI data, we reveal a parallel rise in task-related signals throughout the cerebral cortex, with an increasing dominance of task over object representations from early to higher visual areas. Collectively, our results reveal the complex dynamics underlying task and object representations throughout human cortex.
Bankson, B.B.*, Hebart, M.N.*, Groen, I.I.A., & Baker, C.I.
2018, NeuroImage, pages: 172-182
Visual object representations are commonly thought to emerge rapidly, yet it has remained unclear to what extent early brain responses reflect purely low-level visual features of these objects and how strongly those features contribute to later categorical or conceptual representations. Here, we aimed to estimate a lower temporal bound for the emergence of conceptual representations by defining two criteria that characterize such representations: 1) conceptual object representations should generalize across different exemplars of the same object, and 2) these representations should reflect high-level behavioral judgments. To test these criteria, we compared magnetoencephalography (MEG) recordings between two groups of participants (n = 16 per group) exposed to different exemplar images of the same object concepts. Further, we disentangled low-level from high-level MEG responses by estimating the unique and shared contribution of models of behavioral judgments, semantics, and different layers of deep neural networks of visual object processing. We find that 1) both generalization across exemplars as well as generalization of object-related signals across time increase after 150 ms, peaking around 230 ms; 2) representations specific to behavioral judgments emerged rapidly, peaking around 160 ms. Collectively, these results suggest a lower bound for the emergence of conceptual object representations around 150 ms following stimulus onset.
Hebart, M.N., Schriever, Y., Donner, T.H.*, & Haynes, J.-D.*
2016, Cerebral Cortex, pages: 118-130
Perceptual confidence refers to the degree to which we believe in the accuracy of our percepts. Signal detection theory suggests that perceptual confidence is computed from an internal "decision variable," which reflects the amount of available information in favor of one or another perceptual interpretation of the sensory input. The neural processes underlying these computations have, however, remained elusive. Here, we used fMRI and multivariate decoding techniques to identify regions of the human brain that encode this decision variable and confidence during a visual motion discrimination task. We used observers' binary perceptual choices and confidence ratings to reconstruct the internal decision variable that governed the subjects' behavior. A number of areas in prefrontal and posterior parietal association cortex encoded this decision variable, and activity in the ventral striatum reflected the degree of perceptual confidence. Using a multivariate connectivity analysis, we demonstrate that patterns of brain activity in the right ventrolateral prefrontal cortex reflecting the decision variable were linked to brain signals in the ventral striatum reflecting confidence. Our results suggest that the representation of perceptual confidence in the ventral striatum is derived from a transformation of the continuous decision variable encoded in the cerebral cortex.
Höhne, J., Bartz, D., Hebart, M.N., Müller, K.-R., & Blankertz, B.
2016, NeuroImage, pages: 740-751
Among the numerous methods used to analyze neuroimaging data, Linear Discriminant Analysis (LDA) is commonly applied for binary classification problems. LDA's popularity derives from its simplicity and its competitive classification performance, which has been reported for various types of neuroimaging data.
Yet the standard LDA approach proves less than optimal for binary classification problems when additional label information (i.e. subclass labels) is present. Subclass labels make it possible to model structure in the data, which can be used to facilitate the classification task. In this paper, we illustrate how neuroimaging data exhibit subclass labels that may contain valuable information. We also show that the standard LDA classifier is unable to exploit subclass labels.
We introduce a novel method that allows subclass labels to be incorporated efficiently into the classifier. The novel method, which we call Relevance Subclass LDA (RSLDA), computes an individual classification hyperplane for each subclass. It is based on regularized estimators of the subclass mean and uses other subclasses as regularization targets. We demonstrate the applicability and performance of our method on data drawn from two different neuroimaging modalities: (I) EEG data from brain–computer interfacing with event-related potentials, and (II) fMRI data in response to different levels of visual motion. We show that RSLDA outperforms the standard LDA approach for both types of datasets. These findings illustrate the benefits of exploiting subclass structure in neuroimaging data. Finally, we show that our classifier also outputs regularization profiles, enabling researchers to interpret the subclass structure in a meaningful way.
RSLDA therefore yields increased classification accuracy as well as a better interpretation of neuroimaging data. Since both results are highly favorable, we suggest applying RSLDA to various classification problems within neuroimaging and beyond.
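A heavily simplified sketch of the RSLDA idea: each subclass receives its own LDA-style hyperplane, with the subclass mean shrunk toward its sibling subclasses as the regularization target. The fixed mixing weight used here is illustrative; the paper derives regularized estimators rather than a hand-set parameter:

```python
import numpy as np

def subclass_hyperplane(sub_means_pos, s, mu_neg, cov_pooled, lam=0.5):
    """Build an LDA-style hyperplane for subclass s of the positive class.

    sub_means_pos: list of (n_features,) subclass means of the positive class
    mu_neg:        (n_features,) mean of the negative class
    cov_pooled:    (n_features, n_features) pooled covariance estimate
    lam:           shrinkage toward the sibling subclasses (illustrative)
    """
    siblings = np.mean([m for t, m in enumerate(sub_means_pos) if t != s], axis=0)
    mu_s = (1 - lam) * sub_means_pos[s] + lam * siblings   # regularized mean
    w = np.linalg.solve(cov_pooled, mu_s - mu_neg)          # LDA weight vector
    b = -w @ (mu_s + mu_neg) / 2                            # bias term
    return w, b
```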
Guggenmos, M., Wilbertz, G., Hebart, M.N.*, & Sterzer, P.*
2016, eLife, pages: e13388
It is well established that learning can occur without external feedback, yet normative reinforcement learning theories have difficulties explaining such instances of learning. Here, we propose that human observers are capable of generating their own feedback signals by monitoring internal decision variables. We investigated this hypothesis in a visual perceptual learning task using fMRI and confidence reports as a measure for this monitoring process. Employing a novel computational model in which learning is guided by confidence-based reinforcement signals, we found that mesolimbic brain areas encoded both anticipation and prediction error of confidence—in remarkable similarity to previous findings for external reward-based feedback. We demonstrate that the model accounts for choice and confidence reports and show that the mesolimbic confidence prediction error modulation derived through the model predicts individual learning success. These results provide a mechanistic neurobiological explanation for learning without external feedback by augmenting reinforcement models with confidence-based feedback.
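The model's core can be stated as a delta rule in which confidence plays the role of reward; this one-liner is an illustrative simplification of the paper's computational model, not its full form:

```python
def update_confidence_value(V, confidence, alpha=0.1):
    """Update the expected confidence V with a confidence prediction error,
    mirroring a reward prediction error in standard reinforcement learning."""
    prediction_error = confidence - V     # confidence prediction error
    return V + alpha * prediction_error   # learned confidence expectation
```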
Korjus, K., Hebart, M.N., & Vicente, R.
2016, PLoS ONE, pages: e0161788
Supervised machine learning methods typically require splitting data into multiple chunks for training, validating, and finally testing classifiers. For finding the best parameters of a classifier, training and validation are usually carried out with cross-validation. This is followed by application of the classifier with optimized parameters to a separate test set for estimating the classifier’s generalization performance. With limited data, this separation of test data creates a difficult trade-off between having more statistical power in estimating generalization performance versus choosing better parameters and fitting a better model. We propose a novel approach that we term “cross-validation and cross-testing”, which improves this trade-off by re-using test data without biasing classifier performance. The novel approach is validated using simulated data and electrophysiological recordings in humans and rodents. The results demonstrate that the approach has a higher probability of discovering significant results than the standard approach of cross-validation and testing, while maintaining the nominal alpha level. In contrast to nested cross-validation, which is maximally efficient in re-using data, the proposed approach additionally maintains the interpretability of individual parameters. Taken together, we suggest an addition to currently used machine learning approaches which may be particularly useful in cases where model weights do not require interpretation, but parameters do.
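For contrast, the standard scheme that the paper improves upon looks like this in scikit-learn: cross-validation on a training portion selects parameters, and a single held-out test set estimates generalization, so the held-out samples never inform model fitting. Names and the parameter grid are illustrative:

```python
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

def standard_cv_and_test(X, y):
    """The conventional trade-off: 20% of the data is locked away for testing
    and contributes nothing to parameter selection or model fitting."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y)
    search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5).fit(X_tr, y_tr)
    return search.best_params_, search.score(X_te, y_te)
```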
Guo, R., Böhmer, W., Hebart, M.N., Chien, S., Sommer, T., Obermayer, K., & Gläscher, J.
2016, Journal of Neuroscience, pages: 12650-12660
Goal-directed and instrumental learning are both important controllers of human behavior. Learning which stimulus events occur in the environment and the rewards associated with them allows humans to seek out the most valuable stimulus and move through the environment in a goal-directed manner. Stimulus–response associations are characteristic of instrumental learning, whereas response–outcome associations are the hallmark of goal-directed learning. Here we provide behavioral, computational, and neuroimaging results from a novel task in which stimulus–response and response–outcome associations are learned simultaneously but dominate behavior at different stages of the experiment. We found that prediction error representations in the ventral striatum depend on which type of learning dominates. Furthermore, the amygdala tracks the time-dependent weighting of stimulus–response versus response–outcome learning. Our findings suggest that the goal-directed and instrumental controllers dynamically engage the ventral striatum in representing prediction errors whenever one of them is dominating choice behavior.
Hebart, M.N. & Gläscher, J.
2015, Psychopharmacology, pages: 437-451
Human motivation and decision-making are influenced by the interaction of Pavlovian and instrumental systems. The neurotransmitters dopamine and serotonin have been suggested to play a major role in motivation and decision-making, but how they affect this interaction in humans is largely unknown. We investigated the effect of these neurotransmitters in a general Pavlovian-to-instrumental transfer (PIT) task which measured the nonspecific effect of appetitive and aversive Pavlovian cues on instrumental responses. For that purpose, we used selective dietary depletion of the amino acid precursors of serotonin and dopamine: tryptophan (n = 34) and tyrosine/phenylalanine (n = 35), respectively, and compared the performance of these groups to a control group (n = 34) receiving a nondepleted (balanced) amino acid drink. We found that PIT differed between groups: Relative to the control group that exhibited only appetitive PIT, we found reduced appetitive PIT in the tyrosine/phenylalanine-depleted group and enhanced aversive PIT in the tryptophan-depleted group. These results demonstrate a differential involvement of serotonin and dopamine in motivated behavior. They suggest that reductions in serotonin enhance the motivational influence of aversive stimuli on instrumental behavior and do not affect the influence of appetitive stimuli, while reductions in dopamine diminish the influence of appetitive stimuli. No conclusions could be drawn about how dopamine affects the influence of aversive stimuli. The interplay of both neurotransmitter systems allows for flexible and adaptive responses depending on the behavioral context.
Christophel, T.B., Cichy, R.M., Hebart, M.N., & Haynes, J.-D.
2015, NeuroImage, pages: 198-206
Active and flexible manipulations of memory contents “in the mind's eye” are believed to occur in a dedicated neural workspace, frequently referred to as visual working memory. Such a neural workspace should have two important properties: the ability to store sensory information across delay periods and the ability to flexibly transform sensory information. Here we used a combination of functional MRI and multivariate decoding to identify such neural representations. Subjects were required to memorize a complex artificial pattern for an extended delay, then rotate the mental image as instructed by a cue and memorize this transformed pattern. We found that patterns of brain activity already in early visual areas and posterior parietal cortex encode not only the initially remembered image, but also the transformed contents after mental rotation. Our results thus suggest that the flexible and general neural workspace supporting visual working memory can be realized within posterior brain regions.
Peth, J., Sommer, T., Hebart, M.N., Vossel, G., Büchel, C., & Gamer, M.
2015, NeuroImage, pages: 164-174
Recent research revealed that the presentation of crime related details during the Concealed Information Test (CIT) reliably activates a network of bilateral inferior frontal, right medial frontal and right temporal–parietal brain regions. However, the ecological validity of these findings as well as the influence of the encoding context are still unclear. To tackle these questions, three different groups of subjects participated in the current study. Two groups of guilty subjects encoded critical details either only by planning (guilty intention group) or by really enacting (guilty action group) a complex, realistic mock crime. In addition, a group of informed innocent subjects encoded half of the relevant details in a neutral context. Univariate analyses showed robust activation differences between known relevant compared to neutral details in the previously identified ventral frontal–parietal network with no differences between experimental groups. Moreover, validity estimates for average changes in neural activity were similar between groups when focusing on the known details and did not differ substantially from the validity of electrodermal recordings. Additional multivariate analyses provided evidence for differential patterns of activity in the ventral fronto-parietal network between the guilty action and the informed innocent group and yielded higher validity coefficients for the detection of crime related knowledge when relying on whole brain data. Together, these findings demonstrate that an fMRI-based CIT enables the accurate detection of concealed crime related memories, largely independent of encoding context. On the one hand, this indicates that even persons who planned a (mock) crime could be validly identified as having specific crime related knowledge. On the other hand, innocents with such knowledge have a high risk of failing the test, at least when considering univariate changes of neural activation.
Stein, T., Seymour, K., Hebart, M.N., & Sterzer, P.
2014, Psychological Science, pages: 566-574
Signals of threat—such as fearful faces—are processed with priority and have privileged access to awareness. This fear advantage is commonly believed to engage a specialized subcortical pathway to the amygdala that bypasses visual cortex and processes predominantly low-spatial-frequency information but is largely insensitive to high spatial frequencies. We tested visual detection of low- and high-pass-filtered fearful and neutral faces under continuous flash suppression and sandwich masking, and we found consistently that the fear advantage was specific to high spatial frequencies. This demonstrates that rapid fear detection relies not on low- but on high-spatial-frequency information—indicative of an involvement of cortical visual areas. These findings challenge the traditional notion that a subcortical pathway to the amygdala is essential for the initial processing of fear signals and support the emerging view that the cerebral cortex is crucial for the processing of ecologically relevant signals.
Ritter, C., Hebart, M.N., Wolbers, T., & Bingel, U.
2014, Journal of Neuroscience, pages: 4634-4639
Behavioral studies have demonstrated that descending pain modulation can be spatially specific, as is evident in placebo analgesia, which can be limited to the location at which pain relief is expected. This suggests that higher-order cortical structures of the descending pain modulatory system carry spatial information about the site of stimulation. Here, we used functional magnetic resonance imaging and multivariate pattern analysis in 15 healthy human volunteers to test whether spatial information of painful stimuli is represented in areas of the descending pain modulatory system. We show that the site of nociceptive stimulation (arm or leg) can be successfully decoded from local patterns of brain activity during the anticipation and receipt of painful stimulation in the rostral anterior cingulate cortex, the dorsolateral prefrontal cortices, and the contralateral parietal operculum. These results demonstrate that information regarding the site of nociceptive stimulation is represented in these brain regions. Attempts to predict arm and leg stimulation from the periaqueductal gray, control regions (e.g., white matter) or the control time interval in the intertrial phase did not allow for classifications above chance level. This finding represents an important conceptual advance in the understanding of endogenous pain control mechanisms by bridging the gap between previous behavioral and neuroimaging studies, suggesting a spatial specificity of endogenous pain control.
Hebart, M.N., Görgen, K., & Haynes, J.-D.
2014, Frontiers in Neuroinformatics
The multivariate analysis of brain signals has recently sparked a great amount of interest, yet accessible and versatile tools to carry out decoding analyses are scarce. Here we introduce The Decoding Toolbox (TDT) which represents a user-friendly, powerful and flexible package for multivariate analysis of functional brain imaging data. TDT is written in Matlab and equipped with an interface to the widely used brain data analysis package SPM. The toolbox allows running fast whole-brain analyses, region-of-interest analyses and searchlight analyses, using machine learning classifiers, pattern correlation analysis, or representational similarity analysis. It offers automatic creation and visualization of diverse cross-validation schemes, feature scaling, nested parameter selection, a variety of feature selection methods, multiclass capabilities, and pattern reconstruction from classifier weights. While basic users can implement a generic analysis in one line of code, advanced users can extend the toolbox to their needs or exploit the structure to combine it with external high-performance classification toolboxes. The toolbox comes with an example data set which can be used to try out the various analysis methods. Taken together, TDT offers a promising option for researchers who want to employ multivariate analyses of brain activity patterns.
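A minimal ROI decoding analysis in Python/scikit-learn illustrates the kind of workflow TDT automates within Matlab/SPM (this is not TDT's own API): trial-wise voxel patterns are classified with leave-one-run-out cross-validation:

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import LinearSVC

def roi_decoding(patterns, labels, runs):
    """patterns: (n_trials, n_voxels) ROI activity patterns;
    labels: (n_trials,) condition labels; runs: (n_trials,) run indices.
    Returns mean leave-one-run-out decoding accuracy."""
    accs = cross_val_score(LinearSVC(), patterns, labels,
                           groups=runs, cv=LeaveOneGroupOut())
    return np.mean(accs)
```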
Hebart, M.N., & Hesselmann, G.
2012, Journal of Neuroscience, pages: 8107-8109
Hebart, M.N., Donner, T.H.*, & Haynes, J.-D.*
2012, NeuroImage, pages: 1393-1403
Perceptual decision-making entails the transformation of graded sensory signals into categorical judgments. Often, there is a direct mapping between these judgments and specific motor responses. However, when stimulus–response mappings are fixed, neural activity underlying decision-making cannot be separated from neural activity reflecting motor planning. Several human neuroimaging studies have reported changes in brain activity associated with perceptual decisions. Nevertheless, to date it has remained unknown where and how specific choices are encoded in the human brain when motor planning is decoupled from the decision process. We addressed this question by having subjects judge the direction of motion of dynamic random dot patterns at various levels of motion strength while measuring their brain activity with fMRI. We used multivariate decoding analyses to search the whole brain for patterns of brain activity encoding subjects' choices. To decouple the decision process from motor planning, subjects were informed about the required motor response only after stimulus presentation. Patterns of fMRI signals in early visual and inferior parietal cortex predicted subjects' perceptual choices irrespective of motor planning. This was true across several levels of motion strength and even in the absence of any coherent stimulus motion. We also found that the cortical distribution of choice-selective brain signals depended on stimulus strength: While visual cortex carried most choice-selective information for strong motion, information in parietal cortex decreased with increasing motion coherence. These results demonstrate that human visual and inferior parietal cortex carry information about the visual decision in a more abstract format than can be explained by simple motor intentions. Both brain regions may be differentially involved in perceptual decision-making in the face of strong and weak sensory evidence.
Christophel, T.B., Hebart, M.N., & Haynes, J.-D.
2012, Journal of Neuroscience, pages: 12983-12989
How content is stored in the human brain during visual short-term memory (VSTM) is still an open question. Different theories postulate storage of remembered stimuli in prefrontal, parietal, or visual areas. Aiming at a distinction between these theories, we investigated the content-specificity of BOLD signals from various brain regions during a VSTM task using multivariate pattern classification. To participate in memory maintenance, candidate regions would need to have information about the different contents held in memory. We identified two brain regions where local patterns of fMRI signals represented the remembered content. Apart from the previously established storage in visual areas, we also discovered an area in the posterior parietal cortex where activity patterns allowed us to decode the specific stimuli held in memory. Our results demonstrate that storage in VSTM extends beyond visual areas, but no frontal regions were found. Thus, while frontal and parietal areas typically coactivate during VSTM, maintenance of content in the frontoparietal network might be limited to parietal cortex.
Hesselmann, G., Hebart, M., & Malach, R.
2011, Journal of Neuroscience, pages: 12936-12944
The study of conscious visual perception invariably necessitates some means of report. Report can be either subjective, i.e., an introspective evaluation of conscious experience, or objective, i.e., a forced-choice discrimination regarding different stimulus states. However, the link between report type and fMRI-BOLD signals has remained unknown. Here we used continuous flash suppression to render target images invisible, and observed a long-lasting dissociation between subjective report of visibility and human subjects' forced-choice localization of targets (“blindsight”). Our results show a robust dissociation between brain regions and type of report. We find subjective visibility effects in high-order visual areas even under equal objective performance. No significant BOLD difference was found between correct and incorrect trials in these areas when subjective report was constant. On the other hand, objective performance was linked to the accuracy of multivariate pattern classification mainly in early visual areas. Together, our data support the notion that subjective and objective reports tap cortical signals of different location and amplitude within the visual cortex.
Stein, T., Hebart, M.N., & Sterzer, P.
2011, Frontiers in Human Neuroscience, pages: 167
Until recently, it has been thought that under interocular suppression high-level visual processing is strongly inhibited if not abolished. With the development of continuous flash suppression (CFS), a variant of binocular rivalry, this notion has now been challenged by a number of reports showing that even high-level aspects of visual stimuli, such as familiarity, affect the time stimuli need to overcome CFS and emerge into awareness. In this "breaking continuous flash suppression" (b-CFS) paradigm, differential unconscious processing during suppression is inferred when (a) speeded detection responses to initially invisible stimuli differ, and (b) no comparable differences are found in non-rivalrous control conditions supposed to measure non-specific threshold differences between stimuli. The aim of the present study was to critically evaluate these assumptions. In six experiments we compared the detection of upright and inverted faces. We found that not only under CFS, but also in control conditions upright faces were detected faster and more accurately than inverted faces, although the effect was larger during CFS. However, reaction time (RT) distributions indicated critical differences between the CFS and the control condition. When RT distributions were matched, similar effect sizes were obtained in both conditions. Moreover, subjective ratings revealed that CFS and control conditions are not perceptually comparable. These findings cast doubt on the usefulness of non-rivalrous control conditions to rule out non-specific threshold differences as a cause of shorter detection latencies during CFS. Thus, at least in its present form, the b-CFS paradigm cannot provide unequivocal evidence for unconscious processing under interocular suppression. Nevertheless, our findings also demonstrate that the b-CFS paradigm can be fruitfully applied as a highly sensitive device to probe differences between stimuli in their potency to gain access to awareness.