Comparing cortical feedback (fMRI) to Self-Supervised DNNs 

The promise of artificial intelligence in understanding biological vision rests on comparing computational models with brain data, with the goal of capturing functional principles of visual information processing. Convolutional neural networks (CNNs) have successfully matched the hierarchical transformations occurring along the brain's feedforward visual pathway, which extends into ventral temporal cortex.

However, it remains unclear whether CNNs can also describe feedback processes in early visual cortex. Here, we investigated similarities between human early visual cortex and a CNN with an encoder/decoder architecture, trained with self-supervised learning to fill in occlusions and reconstruct unseen parts of an image. Using Representational Similarity Analysis (RSA), we compared 3T fMRI data from a non-stimulated patch of early visual cortex in human participants viewing partially occluded images with the activations of different CNN layers for the same images. Results show that our self-supervised image-completion network outperforms a classical supervised object-recognition network (VGG16) in terms of similarity to fMRI data. This provides additional evidence that optimal models of the visual system may come from less feedforward architectures trained with less supervision. We also find that CNN decoder-pathway activations are more similar to brain processing than encoder activations, suggesting an integration of mid- and low/middle-level features in early visual cortex. Challenging an AI model to learn natural image representations via self-supervised learning and comparing them with brain data can help constrain our understanding of information processing such as neuronal predictive coding. [bioRxiv | journal paper]
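
Below is a minimal sketch of the kind of self-supervised occlusion-filling objective described above, assuming a toy encoder/decoder and a fixed occluded quadrant. The names (SimpleEncoderDecoder, mask_lower_right) and layer sizes are illustrative assumptions, not the architecture used in the paper:

```python
# Toy self-supervised inpainting setup: mask a quadrant, train the
# encoder/decoder to reconstruct the full image (the target is the
# image itself, so no labels are needed).
import torch
import torch.nn as nn

class SimpleEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def mask_lower_right(images):
    """Occlude the lower-right quadrant (the 'unseen' region to be filled)."""
    occluded = images.clone()
    h, w = images.shape[-2:]
    occluded[..., h // 2:, w // 2:] = 0.0
    return occluded

model = SimpleEncoderDecoder()
images = torch.rand(8, 3, 64, 64)      # stand-in batch of natural images
reconstruction = model(mask_lower_right(images))
loss = nn.functional.mse_loss(reconstruction, images)
loss.backward()                        # self-supervised reconstruction loss
```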


The analysis framework is composed of two parts: the encoder/decoder (artificial) neural network and brain imaging data collection. (A) Each image was passed through the network; we extracted the activations of one layer, selected one quadrant at a time, and applied a PCA transformation to reduce the dimensionality to 1024 components, obtaining one 1024-d vector per layer (15), per quadrant (2), and per image (24). We used these vectors to compute representational dissimilarity matrices (RDMs). (B) fMRI data were collected from participants while they viewed the same images that were fed into the network (at test time), and RDMs were computed. We compared the RDMs of the network and of the brain data (cross-validated; see Walther et al., 2016) for every CNN layer (15 layers analysed), human visual area (V1 and V2), and image-space quadrant (occluded and non-occluded quadrants).

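For concreteness, here is a hedged sketch of the RSA pipeline in the caption, assuming per-layer activations have already been extracted and cropped to a single quadrant. The helper names (layer_rdm, rdm_similarity) are hypothetical, the random arrays stand in for the 24 experimental images, and a plain Spearman comparison replaces the cross-validated procedure of Walther et al. (2016):

```python
# Sketch: PCA-reduce one layer's activations, build an RDM, and compare
# it with a brain RDM via Spearman correlation of the upper triangles.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr
from sklearn.decomposition import PCA

def layer_rdm(activations, n_components=1024):
    """activations: (n_images, n_features) for one layer and one quadrant.
    The paper uses 1024 PCA components; with a toy 24-image set the PCA
    rank limit (min of samples and features) applies instead."""
    k = min(n_components, activations.shape[0], activations.shape[1])
    reduced = PCA(n_components=k).fit_transform(activations)
    return squareform(pdist(reduced, metric='correlation'))  # (24, 24) RDM

def rdm_similarity(model_rdm, brain_rdm):
    """Spearman correlation between the upper triangles of two RDMs."""
    iu = np.triu_indices_from(model_rdm, k=1)
    return spearmanr(model_rdm[iu], brain_rdm[iu]).correlation

acts = np.random.rand(24, 5000)                       # one layer, one quadrant
brain_rdm = squareform(pdist(np.random.rand(24, 300)))  # stand-in V1 RDM
print(rdm_similarity(layer_rdm(acts), brain_rdm))
```

In the actual analysis this comparison would be repeated for all 15 layers, both visual areas (V1, V2), and both quadrants (occluded, non-occluded).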

Visit the project website for more details, or go directly to the paper.

Deep learning methods for MRI data analysis

Understanding the brain mechanisms associated with sensory perception is a long-standing goal of cognitive neuroscience. Using non-invasive techniques such as fMRI, researchers can now study brain activity across different areas. An essential step in many functional and structural neuroimaging studies is segmentation, the operation of partitioning MR images into anatomical structures. Current automatic (multi-)atlas-based segmentation strategies often lack accuracy on difficult-to-segment brain structures and, since these methods rely on atlas-to-scan alignment, they may require long processing times. Alternatively, recent methods deploying solutions based on convolutional neural networks (CNNs) are enabling the direct analysis of out-of-the-scanner data. However, current CNN-based solutions partition the test volume into 2D or 3D patches, which are processed independently. This entails a loss of global contextual information, thereby negatively impacting segmentation accuracy. In these works, we introduce CEREBRUM, an optimised end-to-end CNN that segments a whole T1w MRI brain volume at once, without partitioning it into patches, preprocessing it, or aligning it to an atlas (see the sketch below). Several quantitative measures demonstrate the improved accuracy of this solution compared with state-of-the-art techniques. Moreover, through a randomised survey involving expert neuroscientists, we show that subjective judgements favour our solution over widely adopted atlas-based software. We delivered two tools: CEREBRUM and CEREBRUM-7T.
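As a rough illustration of the whole-volume idea (a minimal sketch under stated assumptions, not the published CEREBRUM architecture), the following PyTorch snippet defines a small 3D encoder/decoder that maps an entire T1w volume to per-voxel class logits in a single forward pass, with no patch partitioning:

```python
# Toy whole-volume 3D segmenter: the full T1w volume goes in, per-voxel
# class logits come out; global context is never discarded by patching.
import torch
import torch.nn as nn

class WholeVolumeSegmenter(nn.Module):
    def __init__(self, n_classes=7):   # e.g. GM, WM, basal ganglia, ...
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.up = nn.Sequential(
            nn.ConvTranspose3d(16, 8, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(8, n_classes, 4, stride=2, padding=1),
        )

    def forward(self, x):                 # x: (batch, 1, D, H, W)
        return self.up(self.down(x))      # logits: (batch, n_classes, D, H, W)

model = WholeVolumeSegmenter()
volume = torch.rand(1, 1, 160, 192, 160)  # toy out-of-the-scanner-sized volume
logits = model(volume)
labels = logits.argmax(dim=1)             # hard segmentation map, one pass
```

The layer sizes here are deliberately tiny; the point is only the design choice of consuming the whole volume at once rather than independent 2D/3D patches.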

Below: reconstructed meshes of GM, WM, basal ganglia, ventricles, brain stem, and cerebellum for a test volume, obtained with CEREBRUM-7T on sub-013_ses-001 (mesh created with BrainVoyager). A light smoothing operation was applied (50 iterations); no manual corrections were made.
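For readers who want to produce a similar mesh outside BrainVoyager, here is a hedged sketch using skimage and trimesh (both are stand-ins; neither was used for the original figure): marching cubes on a binary segmentation mask, followed by light Laplacian smoothing.

```python
# Binary mask -> surface mesh -> light smoothing (50 iterations), roughly
# mirroring the figure's pipeline with open-source tools.
import numpy as np
import trimesh
from skimage import measure

mask = np.zeros((64, 64, 64), dtype=np.uint8)   # toy stand-in for a GM mask
mask[16:48, 16:48, 16:48] = 1

verts, faces, _, _ = measure.marching_cubes(mask.astype(float), level=0.5)
mesh = trimesh.Trimesh(vertices=verts, faces=faces)
trimesh.smoothing.filter_laplacian(mesh, iterations=50)  # light smoothing
mesh.export('gm_mesh.ply')
```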

Work done in collaboration with the Department of Information Engineering, University of Brescia (Italy).