Independent Components Analysis, Part II: Using FSL Example Data

For our next step, we will be working with FSL example data - somewhat artificial data, true, and much better quality than anything you can ever expect from the likes of your data, which will only lead to toil, sweat, and the garret. Sufficient unto the day is the frustration thereof.

However, it is a necessary step for seeing what an ICA analysis ought to look like, and for learning how to examine and identify components related to the tasks and networks that you are interested in. Also - and possibly more important - you will learn to recognize sources of noise, such as components related to head motion and physiological artifacts.

First, go to the link http://fsl.fmrib.ox.ac.uk/fslcourse/ and scroll to the bottom for instructions on how to download the datasets. You can use either wget or curl. For this demonstration, we will be using datasets 2 and 6:

# -# shows a progress bar, -O keeps the remote file name, and -C - resumes an interrupted download
curl -# -O -C - http://fsl.fmrib.ox.ac.uk/fslcourse/fsl_course_data2.tar.gz
curl -# -O -C - http://fsl.fmrib.ox.ac.uk/fslcourse/fsl_course_data6.tar.gz


Once you have downloaded them, unzip them using gunzip, and then run "tar -xf" on the resulting tar files. Each archive unpacks into a folder called fsl_course_data, so rename that folder after each extraction (to fsl_course_data2 and fsl_course_data6, say) so that the two do not conflict with each other. Within fsl_course_data2, navigate to the melodic/av directory, where you will find a small functional dataset that was acquired while the participant was exposed to auditory stimuli and visual stimuli - which sounds much more scientific than saying, "The participant saw stuff and heard stuff."
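If you would rather not do the unpack-and-rename dance by hand, here is a minimal sketch of one way to script it. This is my own shortcut, not part of the FSL course instructions: tar -xzf decompresses and unpacks in one step, and the function name extract_course_data is made up for illustration. It assumes each archive unpacks into a folder called fsl_course_data, as described above.

```shell
# extract_course_data N: unpack fsl_course_dataN.tar.gz and rename the
# resulting fsl_course_data folder to fsl_course_dataN.
# (Hypothetical helper name; not an FSL command.)
extract_course_data() {
    n="$1"
    archive="fsl_course_data${n}.tar.gz"
    target="fsl_course_data${n}"

    # Refuse to run if the archive has not been downloaded yet
    if [ ! -f "$archive" ]; then
        echo "Missing $archive" >&2
        return 1
    fi

    # Refuse to overwrite an earlier extraction
    if [ -e fsl_course_data ] || [ -e "$target" ]; then
        echo "fsl_course_data or $target already exists; move it first" >&2
        return 1
    fi

    # -x extract, -z gunzip on the fly, -f read from the named file
    tar -xzf "$archive" && mv fsl_course_data "$target"
}

# Usage, once both downloads have finished:
#   extract_course_data 2
#   extract_course_data 6
```

Renaming immediately after each extraction is the whole point: the second archive would otherwise unpack on top of the first.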

Open up the MELODIC GUI either through the FSL GUI, or by typing Melodic_gui from the command line (on Linux installations the command is usually just Melodic). Most of the preprocessing steps can be kept as is. However, keep the following points in mind:

1. Double-check the TR. FSL will fill it in automatically from the header of the NIFTI file, but it isn't always reliable.

2. Spatial smoothing isn't required in ICA, but a small amount can help produce better-looking and more identifiable component maps. Somewhere on the order of the size of a voxel or two will usually suffice.


3. By default, MELODIC automatically estimates the number of components for you. However, if you have severe delusions and believe that you know how many components should be generated, you can turn off the "Automatic dimensionality estimation" option in the Stats tab, and enter the number of components you want.


4. The Threshold IC maps option is not the same thing as a p-value correction threshold. I'm not entirely clear on how it relates to the mixture modeling carried out by ICA. My sense from reading the documentation and papers using ICA is that a higher threshold keeps only those voxels that have a higher probability of belonging to the true signal distribution rather than the background noise distribution, so it comes down to a balance between false positives and false negatives. I don't have any clear guidelines about what threshold to use, but I've seen cutoffs within the 0.8-0.9 range used in papers.


5. I don't consider myself a snob, but I was using the bathroom at a friend's house recently, and I realized how uncomfortable that cheap, non-quilted toilet paper can be. It's like performing intimate hygiene with roofing materials.


6. Once you have your components, you can load them into FSLView and scroll through them with the "Volumes" button in the lower left corner. You can also load the atlases from the Tools menu and double-click on one to get a semi-transparent highlight of where different cortical regions are. This can be useful when trying to determine whether certain components fall within the network areas where you would expect them.
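For those who prefer the command line, points 1, 3 and 4 can be roughly sketched as below. Treat this as an assumption-laden sketch rather than a drop-in replacement for the GUI: the GUI also runs preprocessing (motion correction, smoothing and so on), which a bare melodic call does not. The helper names report_tr and run_melodic are made up for illustration, the output folder name is my own choice, and 0.5 is, as far as I know, melodic's default mixture-model threshold rather than a recommendation.

```shell
# report_tr FILE: print the TR stored in the NIfTI header (the pixdim4 field),
# so that you can compare it with what you know the TR to be.
report_tr() {
    fslval "$1" pixdim4 | tr -d ' '
}

# run_melodic FILE TR [NCOMP]: bare command-line MELODIC run.
# Leave NCOMP off to keep automatic dimensionality estimation.
run_melodic() {
    in_file="$1"
    tr_secs="$2"
    ncomp="${3:-}"
    out_dir="${in_file%.nii*}.ica"   # hypothetical output folder name

    # Options used on every run: input, output, TR, IC map threshold, HTML report
    set -- -i "$in_file" -o "$out_dir" --tr="$tr_secs" --mmthresh=0.5 --report

    # Only fix the number of components if one was given
    if [ -n "$ncomp" ]; then
        set -- "$@" -d "$ncomp"
    fi

    melodic "$@"
}

# Example (func.nii.gz, 3 and 20 are placeholders, not values from the course data):
#   report_tr func.nii.gz
#   run_melodic func.nii.gz 3 20
```

Passing the TR explicitly, instead of trusting the header, is the command-line version of point 1.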




More details in the videos below, separately for the visual-auditory and resting-state datasets.



Introduction to Independent Components Analysis

In the course of your FMRI studies, you may have at one point stumbled upon the nightmarish world of Independent Components Analysis (ICA), an intimidating-sounding technique often used by people who know far more than I do. Nevertheless, let us meditate upon it and see what manner of things we have here.

Components are simpler building blocks of a more complex signal. For example, in music, the soundwave profile of a chord can appear very complex; yet a Fourier analysis can decompose it into its constituent parts - simpler sine waves associated with each note that, combined together, create the chord's waveform.



A similar process happens when ICA is applied to neuroimaging data. Data which comes right off the scanner is horribly messy, a festering welter of voxels and timecourses that any right man would run away from as though his very life depended upon it. To make this more comprehensible, however, ICA can decompose it into more meaningful components, each one explaining some amount of the variance of the overall signal. 
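For those who like to see it written down, here is a sketch in my own notation (not taken from the FSL figures). Stack the data into a matrix X with one row per timepoint (T of them) and one column per voxel (V of them). ICA then approximates X as a sum of K components, each one the product of a timecourse a_k and a spatial map s_k:

```latex
\[
  X_{T \times V} \;\approx\; \sum_{k=1}^{K} a_k \, s_k^{\top} \;=\; A \, S ,
  \qquad A \in \mathbb{R}^{T \times K}, \quad S \in \mathbb{R}^{K \times V}
\]
```

The columns of A are the component timecourses and the rows of S are the component spatial maps. In the spatial ICA that MELODIC performs, it is the spatial maps that are chosen to be as statistically independent of each other as possible.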

Note also that we talk of spatial and temporal components, which will be extracted from each other; this may seem somewhat odd, as the two are inseparable in a typical FMRI dataset: each voxel (the spatial component) has a timecourse (the temporal component). However, ICA splits these apart and recombines them into components, each pairing a spatial map with a timecourse, listed in descending order of the amount of variance of the original spatio-temporal data that they explain. This is what is represented in the figures shown on the FSL website, reprinted below:



After these components have been extracted, something happens that some (like me) consider terrifying: You need to identify the components yourself, assigning each one to whatever condition seems most reasonable. Does that one look like a component related to visual processing? Then call it the visual component. Does that one look like a resting-state network? Then call it a resting-state network component. This may all seem rather cavalier, especially considering that you are the one making the judgments. Seriously; just think about that for a moment. Think about what you've done today.

In any case, that is a brief overview of ICA. To be fair, there are more rigorous ways of classifying what components actually represent - such as creating templates for connectivity networks or activation patterns and calculating the amount of fit between those and your components - but to be honest, you probably don't have the motivation to go through all of that. And by you, I mean me.


Next we will work through the example datasets on FSL's website, discussing such problems as over- and under-fitting, the Melodic GUI, and what options need to be changed; followed by using ICA to analyze resting-state data.