Resting State Analysis, Part VII: The End

For the last part of the analysis, we are going to simply take all of the z-maps we have just created, and enter them into a second-level analysis across all subjects. To do this, first I suggest creating separate directories for the Autism and Control subjects, based on the phenotypic data provided by the KKI on the ABIDE website; once you have those, run uber_ttest.py to begin loading Z-maps from each subject into the corresponding category of Autism or Control.
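If you would rather script the sorting than drag files around by hand, something like the following Python sketch will do it. The file names and column labels (e.g., "KKI_phenotypic.csv", "SUB_ID", "DX_GROUP") are assumptions on my part, so check them against the actual phenotypic spreadsheet before trusting it:

# Sketch: sort each subject's z-map into an Autism or Control folder
# using the KKI phenotypic file from ABIDE. Names below are assumptions.
import csv, os, shutil

pheno_file = "KKI_phenotypic.csv"    # assumed name of the phenotypic file
zmap_dir = "zmaps"                   # assumed folder holding per-subject z-maps

os.makedirs("Autism", exist_ok=True)
os.makedirs("Control", exist_ok=True)

with open(pheno_file) as f:
    for row in csv.DictReader(f):
        group = "Autism" if row["DX_GROUP"] == "1" else "Control"        # ABIDE codes 1 = Autism, 2 = Control
        zmap = os.path.join(zmap_dir, "Z_{0}.nii".format(row["SUB_ID"]))  # assumed naming scheme
        if os.path.exists(zmap):
            shutil.copy(zmap, group)

Once the two folders are populated, uber_ttest.py can load the Autism and Control maps as the two groups for the second-level test.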

Most of this is easier to see than to describe, so I've made a final tutorial video to show how this is all done. Hope it helps!




P.S. I've collated all of the videos from the past week and a half into a playlist, which should make sorting through everything slightly easier:

How to Fake Data and Get Tons of Money: Part 1

In what I hope will become a long-running serial, today we will discuss how you can prevaricate, dissemble, equivocate, and in general become as slippery as an eel slathered with axle grease, yet still maintain your mediocre, ill-deserved, but unblemished reputation, without feeling like a repulsive stain on the undergarments of the universe.

I am, of course, talking about making stuff up.

As the human imagination is one of the most wonderful and powerful aspects of our nature, there is no reason you should not exercise it to the best of your ability; and there is no greater opportunity to use this faculty than when the stakes are dire, the potential losses abysmally, wretchedly low, the potential gains dizzyingly, intoxicatingly high. (To get yourself in the right frame of mind, I recommend Dostoyevsky's novella The Gambler.) And I can think of no greater stakes than in reporting scientific data, when entire grants can turn on just one analysis, one result, one number. Every person, at one time or another, has been tempted to cheat and swindle their way to fortune; and as all are equally disposed to sin, all are equally guilty.

In order to earn your fortune, therefore, and to elicit the admiration, envy, and insensate jealousy of your colleagues, I humbly suggest using none other than the lowly correlation. Taught in every introductory statistics class, a correlation is simply a quantitative description of the association between two variables; it can range between -1 and +1; and the farther away from zero, the stronger the correlation, while the closer to zero, the weaker the correlation. However, the beauty of the correlation is that one number - just one! - has the inordinate ability to make the correlation significant or not significant.

Take, for example, the correlation between shoe size and IQ. Most would intuit that there is no relationship between the two, and that having a larger shoe size should be associated with neither a higher IQ nor a lower IQ. However, if Bozo the Clown is included in your sample - a man with a gigantic shoe size who also happens to be a comedic genius - then your correlation could be spuriously driven upward by this one observation.

To illustrate just how easy this is, a recently created web applet provides you with fourteen randomly generated data points, and lets you plot an additional point anywhere on the graph. As you will soon learn, it is simple to place that observation in a reasonable, semi-random-looking location and get the result that you want:

Non-significant correlation, leading to despair, despond, and death.

Significant correlation, leading to elation, ebullience, and aphrodisia.
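If you would rather run this con from the comfort of your own terminal, a few lines of Python do roughly what the applet does. The numbers are randomly generated, so your exact r and p values will differ, but the pattern (one extreme point rescuing an otherwise hopeless correlation) is the point of the exercise:

# Fourteen honest, uncorrelated points, then one well-placed outlier.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
shoe_size = rng.normal(size=14)        # e.g., standardized shoe size
iq = rng.normal(size=14)               # e.g., standardized IQ
print(pearsonr(shoe_size, iq))         # fourteen honest points: almost certainly nothing

# Enter Bozo the Clown: one observation far out along both axes
shoe_size = np.append(shoe_size, 5.0)
iq = np.append(iq, 5.0)
print(pearsonr(shoe_size, iq))         # one point later, the correlation looks "real"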

The beauty of this approach lies in its simplicity: We are only altering one number, after all, and this hardly approaches the enormity of scientific fraud perpetrated on far grander scales. It is easy, efficient, and fiendishly inconspicuous, and should anyone's suspicions be aroused, that one number can simply be dismissed as a clerical error, fat-finger typing, or simply chalked up to plain carelessness. In any case, it requires a minimum of effort, promises a maximum of return, and allows you to cover your tracks like the vulpine, versatile genius that you are.

And should your conscience, in your most private moments, ever raise objection to your spineless behavior, merely repeat this mantra to smother it: Others have done worse.

FSL Tutorial 2: FEAT (Part 2): The Reckoning

(Note: Hit the fullscreen option and play at a higher resolution for better viewing)


Now things get serious. I am talking more serious than projected peak Nutella in 2020, after which our Nutella reserves will slow to a trickle and then simply evaporate. This video goes into detail about the FEAT Stats tab, specifically what you should and shouldn't do (which pretty much means leaving most of the defaults as they are), lest you screw everything up, which, let's face it, you probably will anyway. People will tell you that's okay, but it's not.

I've tried to compress these tutorials into a shorter amount of time, because I usually take one look at the duration of an online video and don't bother with anything longer than three or four minutes (unless it happens to be a Starcraft 2 replay). But there's no getting around the fact that this stuff takes a while to explain, so the walkthroughs will probably remain in the ten-minute range.

To supplement the tutorial, it may help to flesh out a few concepts, particularly if this is your first time doing stuff in FSL. The most important part of an fMRI experiment - besides the fact that it should be well-planned, make sense to the subject, and be designed to compare hypotheses against each other - is the timing. In other words, knowing what happened when. If you don't know that, you're up a creek without Nutella, I mean a paddle. There's no way to salvage your experiment if the timing is off or unreliable.

The documentation on FSL's website isn't very good at demonstrating how to make timing files, and I'm surprised that the default option is a square waveform to be convolved with a canonical Hemodynamic Response Function (HRF). What almost every researcher will want is the Custom 3 Column format, which specifies the onset of each trial of the condition, how long it lasted, and any auxiliary parametric information you have reason to believe may modulate the amplitude of the Blood Oxygenation Level Dependent (BOLD) response. This auxiliary parametric information could be anything about that particular trial of the condition; for example, if you are showing the subject one of those messed-up IAPS photos, and you have a rating of how messed-up it is, this can be entered into the third column of the timing file. If you have no reason to believe that one trial of a condition should be different from any other in that condition, you can set every instance to 1.

Here is a sample timing file to be read into FSL (they really should post an example of this somewhere in their documentation; I haven't been able to find one yet, although they do provide a textual walkthrough of how to do it under the EVs part of the Stats section here):

10  1  1
18  1  1
25  1  1
30  1  1


To translate this text file: the condition occurred at 10, 18, 25, and 30 seconds relative to the start of the run; each trial of the condition lasted one second; and there is no parametric modulation (every weight in the third column is set to 1).
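If you would rather generate these files with a script than type them out by hand, here is a minimal Python sketch; the onsets, durations, and weights are just the made-up values from the example above, and "condition1.txt" is an arbitrary file name:

# Write an FSL-style 3-column timing file: onset (s), duration (s), weight.
onsets = [10, 18, 25, 30]      # seconds from the start of the run
durations = [1, 1, 1, 1]       # each trial lasted one second
weights = [1, 1, 1, 1]         # all 1's = no parametric modulation

with open("condition1.txt", "w") as f:
    for onset, duration, weight in zip(onsets, durations, weights):
        f.write("{0}\t{1}\t{2}\n".format(onset, duration, weight))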

A couple of other things to keep in mind:

1) When setting up contrasts, also make a simple effect (i.e., just estimate a beta weight) for each condition. This is because if everything is set up as a contrast of one beta weight minus another, you can lose valuable information about what is going on in that contrast.

As an example of why this might be important, look at this graph. Just look at it!

Proof that fMRI data is (mostly) crap


These are timecourses extracted from two different conditions, helpfully labeled "pred2corr0" and "pred2corr1". When we took the contrast of pred2corr0 minus pred2corr1, we got a positive value. However, here's what actually happened: the HRF peaks for both conditions were negative. (The peaks are the timepoints under the "3" on the x-axis; at 3 scans of 2 seconds each, that is 6 seconds after stimulus onset, the typical peak of the HRF.) It just happened that the peak for the pred2corr1 condition was more negative than the peak for pred2corr0, hence the positive contrast value.
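To see the arithmetic of why this matters, you don't need anything fancier than subtraction. The beta weights below are made up, but they reproduce the situation in the graph:

beta_pred2corr0 = -0.2   # hypothetical beta weights; both responses are negative
beta_pred2corr1 = -0.5
print(beta_pred2corr0 - beta_pred2corr1)   # 0.3: a positive contrast hiding two negative effects

Without the simple effects estimated alongside the contrast, you would never know that both conditions actually deactivated relative to baseline.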

2) If you have selected "Temporal Derivative" for all of your regressors, then every other column of the design matrix will hold the temporal derivative of the regressor next to it. Adding a temporal derivative has the advantage of accounting for small lags in the onset of the HRF, but it comes at the cost of a degree of freedom, since you have something extra to estimate.
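If you are curious what that extra column actually contains, it is (as far as I understand FEAT's implementation) just the derivative of the convolved regressor; here is a rough sketch with a stand-in waveform rather than a real design:

import numpy as np

t = np.arange(0, 30, 2.0)                        # one timepoint per 2-second TR
regressor = np.exp(-(t - 10)**2 / 20.0)          # stand-in for a convolved regressor
temporal_derivative = np.gradient(regressor, t)  # the paired "every other" column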

3) After the model is set up and you click on the "Efficiency" tab, you will see two sections. The section on the left represents the correlation between regressors, and the section on the right represents the singular value decomposition eigenvalue for each condition.

What is an eigenvalue? Don't ask me; I'm just a mere cognitive neuroscientist.

For the correlations, brighter intensities represent higher correlations. So, it makes sense that the diagonal is all white, since each condition correlates with itself perfectly. However, it is the off-diagonal squares that you need to pay attention to, and if any of them are overly bright, you have a problem. A big one. Bigger than finding someone you loved and trusted just ate your Nu-...but let's not go there.

As for the eigenvalues on the right, I have yet to find out what range of values represents safety and which represents danger. I will keep looking, but for now it is probably a better bet to do design efficiency estimation with a package like AFNI to get a good idea of how your design will hold up under analysis.
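In the meantime, if you want a rough do-it-yourself check on your own design (this is my own sanity check, not FEAT's or AFNI's exact computation), the regressor correlations and singular values are easy to pull out of numpy. The matrix below is toy data, so substitute your own convolved regressors:

import numpy as np

rng = np.random.default_rng(1)
n_vols = 200
reg1 = rng.normal(size=n_vols)
reg2 = rng.normal(size=n_vols)
reg3 = 0.9 * reg1 + 0.1 * rng.normal(size=n_vols)    # deliberately collinear with reg1

X = np.column_stack([reg1, reg2, reg3])              # toy design matrix
print(np.round(np.corrcoef(X, rowvar=False), 2))     # large off-diagonal values = trouble

s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
print("condition number:", s[0] / s[-1])             # a huge ratio means a poorly conditioned design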


That's it. A few more videos will be uploaded, and then the beginning user should have everything he needs to get started.