Skip to content

Conversation

@Julie-Fabre
Copy link

@Julie-Fabre Julie-Fabre commented Jan 8, 2026

This PR ports bombcell-style unit classification to SpikeInterface.

Template metrics
  • Rewrote peak/trough detection with a new get_trough_and_peak_idx() function that uses scipy.signal.find_peaks(). Since SpikeInterface stores templates based on raw data rather than the heavily smoothed templates used in template matching, the waveforms can be noisy—so you can optionally apply Savitzky-Golay smoothing before detection. The function returns dicts for troughs, peaks before, and peaks after, each containing indices, values, prominences, and widths.
from spikeinterface.postprocessing import get_trough_and_peak_idx

troughs, peaks_before, peaks_after = get_trough_and_peak_idx(
    templates,
    sampling_frequency,
    smooth=True,
    min_thresh_detect_peaks_troughs=0.4,
)
  • New metrics: peak_before_to_trough_ratio, peak_after_to_trough_ratio, waveform_baseline_flatness, peak_before_width, trough_width, main_peak_to_trough_ratio.

  • Renamed peak_to_valley to peak_to_trough_duration.

analyzer.compute("template_metrics", metric_names=[
    "peak_before_to_trough_ratio",
    "waveform_baseline_flatness",
    "trough_width",
])
Quality metrics
  • Added snr_bombcell—peak amplitude over baseline MAD.
analyzer.compute("quality_metrics", metric_names=["snr_bombcell"])
  • amplitude_cutoff now has parameters for controlling the histogram fitting:
analyzer.compute("quality_metrics", metric_names=["amplitude_cutoff"], qm_params={
    "amplitude_cutoff": {
        "num_histogram_bins": 100,
        "histogram_smoothing_value": 3,
    }
})
Unit classification
  • New in spikeinterface.curation:
import spikeinterface.comparison as sc

thresholds = sc.bombcell_get_default_thresholds()
unit_type, unit_type_string = sc.bombcell_classify_units(
    quality_metrics,
    thresholds=thresholds,
    classify_non_somatic=True,
)
summary = sc.get_classification_summary(unit_type, unit_type_string)

Units get classified as NOISE → MUA → GOOD based on successive threshold checks. Optional NON_SOMA category for non-somatic waveforms.

Plots
  • Added plots for classification summaries, metric histograms with threshold lines, waveform overlays by category, and UpSet plots.
from spikeinterface.widgets import (
    plot_unit_classification,
    plot_classification_histograms,
    plot_waveform_overlay,
    plot_upset,
)

plot_unit_classification(analyzer, unit_type, unit_type_string)
plot_classification_histograms(quality_metrics, thresholds=thresholds)
plot_waveform_overlay(analyzer, unit_type, unit_type_string)
plot_upset(quality_metrics, unit_type, unit_type_string)

or a wrapper for all plots:

plots = plot_unit_classification_all(
    sorting_analyzer,
    unit_type,
    unit_type_string,
    quality_metrics=quality_metrics,  # optional, will try to get from analyzer
    thresholds=thresholds,            # optional, uses defaults
    split_non_somatic=False,
    include_upset=True,
)

TODO:

  • Add documentation tutorial

Julie-Fabre and others added 20 commits January 7, 2026 01:15
…uration, add amplitude_median, bombcell_snr and fix non-somatic classification rules
@alejoe91 alejoe91 added the curation Related to curation module label Jan 8, 2026
@samuelgarcia samuelgarcia added this to the 0.104.0 milestone Feb 3, 2026
Minimum prominence threshold as a fraction of the template's absolute max value
smooth : bool, default: True
Whether to apply smoothing before peak detection
smooth_window_frac : float, default: 0.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I wowuld prefer to have this params in ms like smooth_window_ms
to avoid different smoothing when we change ms_before/ms_after

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

from collections import namedtuple

from spikeinterface.core.analyzer_extension_core import BaseMetric

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a small and consice internal code doc string to explain the main rules for peak and trough detection concept.

Comment on lines +242 to +245
unit_type_string = np.array([labels.get(int(t), "unknown") for t in unit_type], dtype=object)
labels = pd.DataFrame(data={"label": unit_type_string, "label_type": unit_type}, index=combined_metrics.index)

return labels
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return on the unit_type_string and handle this in any downstream function.

return labels


def get_bombcell_labeling_summary(unit_type: np.ndarray, unit_type_string: np.ndarray) -> dict:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should work only on unit_type_string

noise_mask |= values < thresh["min"]
if not is_threshold_disabled(thresh["max"]):
noise_mask |= values > thresh["max"]
unit_type[noise_mask] = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would do directly

Suggested change
unit_type[noise_mask] = 0
unit_type[noise_mask] = "noise"

to avoid remembering internal number.
This also would make the code super easy to read.
No ?

Comment on lines +942 to +952
if metric_names is None:
metric_names = [m.metric_name for m in cls.metric_list]
else:
for metric_name in metric_names:
if metric_name not in [m.metric_name for m in cls.metric_list]:
raise ValueError(
f"Metric {metric_name} not in available metrics {cls.get_available_metric_names()}"
)
for metric_name in metric_names:
metric_class = cls.get_metric_by_name(metric_name)
default_metric_columns.extend(metric_class.metric_columns)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I try a small rewrite of this ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote a bit already, but sure!

[label in np.unique(unit_labels) for label in labels_order]
), "All labels in labels_order must be present in unit_labels"
else:
labels_order = np.unique(unit_labels)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will change the behavior if you have only good units for instance because it will be only on plots.
I would hard code of the possible no ?

@@ -239,6 +294,9 @@ def _prepare_data(self, sorting_analyzer, unit_ids):

tmp_data["troughs"] = troughs
tmp_data["peaks"] = peaks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should rename peak_after no ? to avoid the confusion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done! also added indices

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

curation Related to curation module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants