Spike Sorting


Electrophysiologists always feel the need to reinvent the wheel, so there is an average of one software suite per scientist. Here are some that we use.

  • Caton - backend KlustaKwik, one-shot run from original binary files
  • Mat's fancy model
  • OpenElectrophy - Chris contributed some code to this. The detection algorithm is especially classy :) This software also provides many types of data loaders and an SQL backend
  • KlustaKwik / Klusters - Gold standard of sorting. Good luck getting it installed


Here are the parameters we use for auditory data:

  • 8 features - because it's the default
  • (KlustaKwik) MinClusters, MaxClusters at 3 and 14 - because it's the default in Klusters
  • Common average referencing - This is a great way to do unsupervised artifact rejection. However, certain datasets (those with very few channels, or very large neurons) will not work with this, because the spikes that get subtracted along with the average show up on the other channels. In any case, you should use this only for detection, not for classification.
  • Threshold 4.5 sigma - our noise levels are low so even this many sigma is still a pretty small spike
  • Detection window of 800us - auditory data is especially sensitive to this because the MUA tends to be tightly time-locked in a small onset window. That means that spikes overlap, so you should use as narrow a window as possible.
  • 300Hz+ filtering - I've tried a bunch of filtering algorithms, including wavelets, and it makes almost no difference. The general philosophy is to leave the spikes as raw as possible and let the classification algorithm take care of it.
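
As a rough sketch of how the detection parameters above fit together, here is one way to do it in Python with NumPy and SciPy. This is not the code of any of the packages listed above. The function name detect_spikes, the 30 kHz sampling rate, the 3rd-order Butterworth filter, the median-based noise estimate, and the assumption that spikes are negative-going are all illustrative choices, not part of our actual pipeline.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks


def detect_spikes(raw, fs=30000.0, highpass_hz=300.0, n_sigma=4.5, window_us=800.0):
    """Detect spikes in raw data of shape (n_channels, n_samples).

    Returns (sample_indices, channel_indices, waveforms).
    """
    # 300Hz+ filtering: zero-phase high-pass (filter order is an assumption).
    sos = butter(3, highpass_hz, btype="highpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, raw, axis=1)

    # Common average reference, used for detection only.
    car = filtered - filtered.mean(axis=0, keepdims=True)

    # Per-channel noise sigma from the median absolute value (assumption:
    # the notes do not say how sigma is estimated).
    sigma = np.median(np.abs(car), axis=1) / 0.6745

    # Half of the 800us detection window, in samples.
    half = int(round(window_us * 1e-6 * fs / 2))

    times, chans, waves = [], [], []
    for ch in range(car.shape[0]):
        # Negative-going peaks beyond n_sigma, at least one window apart.
        peaks, _ = find_peaks(-car[ch], height=n_sigma * sigma[ch], distance=2 * half)
        for p in peaks:
            if p - half < 0 or p + half > car.shape[1]:
                continue  # window would run off the edge of the recording
            times.append(p)
            chans.append(ch)
            # Waveforms for classification come from the non-referenced data.
            waves.append(filtered[ch, p - half:p + half])
    return np.array(times), np.array(chans), np.array(waves)
```

Note that the waveforms are cut from the filtered data before referencing, so the common average affects only which events get detected, not what the classifier sees.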

Human Intervention: Merging and identifying good clusters

Sadly this is inevitable. Any time a human is involved your results are not replicable. At least do this blind to the experimental outcomes.

Here's the rough guide to doing this in Klusters.
  • First I like to know whether there's any chance of getting anything good. 
    • The autocorrelation and crosscorrelation displays are your new best friends. For deep philosophical reasons that I don't understand, bad things tend to come in bursts. So any cluster with a peak in the center of its autocorrelation is probably an artefact (scratching, connector noise), or multi-unit (because neurons tend to spike all at once).
    • Take a quick look - do any of the clusters have valleys in the center of their autocorrelation? If so, you might have something. Good neurons tend to have acorr valleys. Even bursty neurons should have some refractory period. Note that the lowest point of the valley is an estimate of the false positive rate -- that is, the fraction of spikes in the cluster that are surely not from the same neuron. So a good valley should reach zero.
    • At this point I also increase the window size to 100ms. Sometimes a valley can last 30ms or so, in which case the close-up view will look flat.
  • Next open a "grouping assistant" display, which auto-calculates suspiciously similar clusters. I like to leave the time window on this display at 30ms (close-up) so that, together with the 100ms autocorrelation display, you can see both zoomed-in and zoomed-out views.
  • Now remove obvious garbage: large diffuse clusters, waveforms that go off the charts, waveforms that are highly irregular, waveforms that are near the threshold for detection (anything this small is certain to have many undetected spikes).
    • Before removing anything, make sure it isn't related to anything potentially useful. Go through its row and column in the grouping display and click on the hot spots. If you notice that it has dependencies on possibly good units (that is, if its xcorr looks similar to a good unit's acorr), leave it for now.
  • Now you should have only possibly good units, and garbage that is related to the good units. It's time to merge clusters.
    • The grouping assistant uses hot spots to mark things that might need to be merged.
    • Merge any clusters that have the same acorr as each other, and an xcorr that matches both acorrs, and similar waveforms and clusters. I tend to err on the side of NOT merging if I'm unsure, because you can't really undo this step.
  • Finally remove anything that doesn't have an acorr valley and doesn't seem to be related to any possible unit.
    • Sometimes people like to see the valley go below the mean firing rate. That's a good heuristic, but note that it gets completely thrown off if the unit's mean firing rate is not stationary over time, in which case the notion of a mean firing rate isn't very useful.
  • Note down how well sorted the units are.
    • Good things:
      • Acorr valley smoothly reaching zero within a time window around zero
      • No other clusters are xcorr'd with it in a way that looks similar to its acorr
      • Large waveforms
      • Consistent waveforms
      • Consistent firing rate over time, or at least not monotonically decreasing, which could indicate electrode drift
    • I rate any cluster with all of those features as a 5, the maximum. If it's missing a few features but seems close, I rate it a 3, which is my minimum for analyzing it. If it has only one or two of the features, I'll rate it a 1, and I'll only analyze it if I'm desperate. If I think it's a weird artefact of possible technical interest (because I want to find out what's causing the artefact, or it's some kind of weird cluster that's associated with one of the good neurons), I'll rate it as a 0 so I know that it's not real.
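
The autocorrelation check in the guide above can also be done outside Klusters. Here is a minimal sketch in Python with NumPy, assuming spike times are given in seconds. The function names autocorrelogram and valley_depth, the 1ms bins, the 2ms center and the 50ms flank cutoff are all made up for illustration, and expressing the valley as a ratio to the flanks is just one possible way of putting a number on it.

```python
import numpy as np


def autocorrelogram(spike_times_s, window_ms=100.0, bin_ms=1.0):
    """Histogram of lags between all spike pairs within +/- window_ms.

    Zero-lag self pairs are excluded. Returns (bin_edges_ms, counts).
    """
    t = np.sort(np.asarray(spike_times_s, dtype=float)) * 1000.0  # to ms
    n_bins = int(round(2 * window_ms / bin_ms))
    edges = np.linspace(-window_ms, window_ms, n_bins + 1)

    # For each spike, index just past the last later spike within the window.
    hi = np.searchsorted(t, t + window_ms, side="right")
    lags = [t[i + 1:hi[i]] - t[i] for i in range(len(t))]
    lags = np.concatenate(lags) if lags else np.array([])

    # Each pair contributes one positive and one negative lag.
    counts, _ = np.histogram(np.concatenate([lags, -lags]), bins=edges)
    return edges, counts


def valley_depth(edges, counts, center_ms=2.0, flank_ms=50.0):
    """Mean count near zero lag divided by mean count in the flanks.

    Close to 0 means a clean valley; close to 1 means no valley at all.
    """
    mids = 0.5 * (edges[:-1] + edges[1:])
    center = counts[np.abs(mids) < center_ms].mean()
    flank = counts[np.abs(mids) >= flank_ms].mean()
    return center / flank if flank > 0 else np.nan
```

A cluster with a real refractory period should give a value near zero, while a cluster with no valley gives a value near one. The flank normalization shares the weakness of the mean firing rate heuristic mentioned above: it is only meaningful if the firing rate is roughly stationary.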