ConnInfPy

Recommended Workflows

This page describes the recommended end-to-end recipes for the most common connectivity-inference designs. Where the Usage page is a function-by-function reference, this page is task-oriented: pick the row in the decision table that matches your study, then copy the recipe.

Tip

For the conceptual background to these recipes — the per-edge statistic, network enhancement, the permutation null, and family-wise error control — see How Inference Works. Each recipe below is a particular configuration of that common four-step pipeline.

The recommended entry point is conninfpy.analyze(). It wraps the standard pipeline —

Fisher r→z  →  optional ComBat  →  optional site-stratified permutation
           →  network enhancement  →  FWER-corrected p-value maps

— in a single call, and dispatches to the appropriate lower-level pipeline based on the arguments supplied. The lower-level entry points conninfpy.compute_p_val(), conninfpy.compute_p_val_glm(), and conninfpy.compute_p_val_paired_glm() are needed only when full control over the design matrix or contrast is required.

Selecting a workflow

| Study design | analyze() arguments | Section |
|---|---|---|
| Two independent groups (patients vs controls) | group1=, group2= | Two independent groups |
| Continuous predictor + nuisance covariates | Y=, interest=, confounds= | Continuous predictor with confounds |
| Several predictors, tested separately, one pass | Y=, interest={name: vec} | Several predictors in one pass |
| Multi-site continuous / group predictor | ... + sites=, harmonize= | Multi-site GLM with ComBat harmonization |
| Paired / repeated conditions (no condition confounds) | group1=, group2=, test_type='paired' | Paired / repeated conditions |
| Paired / repeated conditions + condition-varying confound | ... + confounds_group1=, confounds_group2= | Repeated-measures GLM (condition-varying confounds) |
| Custom contrast / interaction / omnibus F-test | drop to conninfpy.compute_p_val_glm() | Custom contrasts, interactions, and omnibus F-tests |

All paths share the same output contract: a dict-like InferenceResult with canonical keys 'positive' and 'negative' (or 'omnibus' for F-tests). Positive and negative effects are always tested separately. For directional designs the orientation is fixed:

  • two-sample / paired: positive = group2 > group1;

  • GLM: positive = predictor ↑ → connectivity ↑.

Input conventions

  • Connectivity tensors are shape (n_subjects, N, N), symmetric, zero diagonal.

  • analyze() applies Fisher r→z by default (fisher_z=True), so pass raw correlation matrices. If your matrices are already on the z-scale, pass fisher_z=False. Lower-level functions do not apply the transform automatically — call conninfpy.fisher_r_to_z() first.

  • Predictors, confounds, and sites must be row-aligned to the subject axis of the connectivity tensor.
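The conventions above can be checked before any inference call. The following is a minimal sketch on synthetic data; `check_connectivity` is a hypothetical helper (not part of conninfpy), and `np.arctanh` is the textbook Fisher r→z transform, assumed here to match what conninfpy.fisher_r_to_z() computes.

```python
import numpy as np

def check_connectivity(corr, atol=1e-8):
    """Validate the (n_subjects, N, N), symmetric, zero-diagonal convention."""
    corr = np.asarray(corr, dtype=float)
    if corr.ndim != 3 or corr.shape[1] != corr.shape[2]:
        raise ValueError(f"expected (n_subjects, N, N), got {corr.shape}")
    if not np.allclose(corr, np.swapaxes(corr, 1, 2), atol=atol):
        raise ValueError("matrices must be symmetric")
    diag = np.diagonal(corr, axis1=1, axis2=2)
    if not np.allclose(diag, 0.0, atol=atol):
        raise ValueError("diagonal must be zero")
    return corr

# Synthetic stand-in data: 5 subjects, 4 ROIs, 30 time points each.
rng = np.random.default_rng(0)
ts = rng.standard_normal((5, 30, 4))
corr = np.stack([np.corrcoef(t, rowvar=False) for t in ts])
idx = np.arange(4)
corr[:, idx, idx] = 0.0            # zero the diagonal of every subject

corr = check_connectivity(corr)
z = np.arctanh(corr)               # Fisher r -> z; arctanh(0) = 0 keeps the diagonal at zero
```

With analyze() and the default fisher_z=True, the raw `corr` tensor is what gets passed; the explicit transform is only relevant for the lower-level functions.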

Two independent groups

The simplest design — two groups of subjects, no covariates.

from conninfpy import analyze

out = analyze(
    group1=controls_corr,        # (n1, N, N) raw correlations
    group2=patients_corr,        # (n2, N, N)
    test_type='two-sample',
    method='tfnbs',
    e=0.4, h=3.0, n=10,          # FDR-calibrated regime (Hao 2024)
    n_permutations=1000,
    acceleration=None,           # exact empirical reference
    rng=42,
)

p_patients_higher = out.inference['positive']   # patients > controls
p_controls_higher = out.inference['negative']   # controls > patients
print(out.inference.n_significant(alpha=0.05))

Note

compute_p_val(test_type='two-sample') uses Welch’s (unequal variance) t-statistic unconditionally. Under unequal variances combined with unbalanced group sizes the exchangeability assumption behind the permutation null weakens and Type-I error can inflate (Anderson & Robinson 2001). Treat such results with caution; for publication-grade multi-site work prefer the GLM recipe below with a binary interest indicator.
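For reference, the per-edge Welch statistic mentioned in the note can be written in a few lines of NumPy. This is an illustrative sketch, not the library's implementation; the function name `welch_t` is hypothetical, and the sign convention follows the page (positive = group2 > group1).

```python
import numpy as np

def welch_t(group1, group2):
    """Per-edge Welch (unequal-variance) t; positive means group2 > group1."""
    n1, n2 = group1.shape[0], group2.shape[0]
    diff = group2.mean(axis=0) - group1.mean(axis=0)
    se = np.sqrt(group1.var(axis=0, ddof=1) / n1 + group2.var(axis=0, ddof=1) / n2)
    t = np.zeros_like(se)
    # Edges with zero variance in both groups (e.g. the diagonal) stay at 0.
    np.divide(diff, se, out=t, where=se > 0)
    return t

# Toy single-edge example: three subjects per group.
g1 = np.array([1.0, 2.0, 3.0]).reshape(3, 1, 1)
g2 = np.array([2.0, 4.0, 6.0]).reshape(3, 1, 1)
t_map = welch_t(g1, g2)            # mean difference 2, standard error sqrt(1/3 + 4/3)
```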

Continuous predictor with confounds

Question: does connectivity vary with age after controlling for sex and head motion? This is the GLM path (Freedman–Lane permutation), triggered by passing Y= and interest=.

import numpy as np
from conninfpy import analyze

confounds = np.column_stack([sex, mean_fd])

out = analyze(
    Y,                            # (n, N, N) raw correlations
    interest=age,                 # continuous predictor
    confounds=confounds,          # nuisance regressors
    method='tfnbs',
    e=0.4, h=3.0, n=10,
    rng=42,
)

p_age_pos = out.inference['positive']   # older → higher connectivity
p_age_neg = out.inference['negative']   # older → lower connectivity

A binary 0/1 interest column turns this into a confound-adjusted group comparison — the preferred form of the two-sample test when covariates or multiple sites are involved.

A single array interest tests one effect and returns one result. To test several predictors at once, pass a dict — see Several predictors in one pass.
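The Freedman–Lane scheme named in this section permutes the residuals of the nuisance-only model rather than the raw data. The sketch below shows the generic textbook procedure for a single edge on synthetic data; it is not conninfpy's code, it omits network enhancement and FWER correction, and `freedman_lane_p` is a hypothetical name.

```python
import numpy as np

def freedman_lane_p(y, x, Z, n_perm=999, seed=0):
    """One-sided (positive) Freedman-Lane p-value for one edge.

    y: (n,) response; x: (n,) predictor of interest;
    Z: (n, k) nuisance regressors (an intercept is added here).
    """
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    Z1 = np.column_stack([np.ones(n), Z])      # reduced (nuisance-only) design
    M = np.column_stack([x, Z1])               # full design, interest column first
    M_pinv = np.linalg.pinv(M)
    c00 = np.linalg.inv(M.T @ M)[0, 0]
    dof = n - M.shape[1]

    def t_stat(resp):
        beta = M_pinv @ resp
        resid = resp - M @ beta
        return beta[0] / np.sqrt(resid @ resid / dof * c00)

    fitted = Z1 @ (np.linalg.pinv(Z1) @ y)     # nuisance fit from the reduced model
    resid = y - fitted                         # reduced-model residuals
    t_obs = t_stat(y)
    exceed = 1                                 # the observed labelling counts once
    for _ in range(n_perm):
        y_star = fitted + rng.permutation(resid)   # shuffle residuals, keep nuisance fit
        exceed += int(t_stat(y_star) >= t_obs)
    return t_obs, exceed / (n_perm + 1)

# Synthetic example: connectivity rises with age, adjusting for sex and motion.
rng = np.random.default_rng(42)
n = 60
age = rng.standard_normal(n)
sex = rng.integers(0, 2, n).astype(float)
mean_fd = rng.random(n)
y = 0.8 * age + 0.5 * sex + rng.standard_normal(n)
t_obs, p_pos = freedman_lane_p(y, age, np.column_stack([sex, mean_fd]))
```

Replacing `age` with a 0/1 indicator gives the confound-adjusted group comparison described above without changing the procedure.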

Several predictors in one pass

To test several predictors separately under a shared nuisance model (e.g. age, sex, and mean_fd each as an effect of interest while the others are controlled), pass interest as a dict {name: vector}. analyze() builds one design matrix, shares the Freedman–Lane reduced-model fit across predictors (conninfpy.compute_p_val_glm_multi()), and returns a dict mapping each name to its own result object, whose .inference attribute holds that predictor's InferenceResult. The total cost is approximately that of a single inference call rather than one per predictor.

from conninfpy import analyze

out = analyze(
    Y,
    interest={'age': age, 'sex': sex, 'mean_fd': mean_fd},
    confounds=motion,          # extra nuisance, shared by all predictors
    sites=site,                # ComBat + site-stratified permutation
    harmonize='nuisance_only',
    method='tfnbs', e=0.4, h=3.0, n=10,
    n_permutations=5000, acceleration='gpd', rng=42,
)

out['age'].inference['positive']        # edges where age ↑ → connectivity ↑
out['sex'].inference.n_significant(0.05)
out['mean_fd'].significant_edges(atlas)  # AnalyzeResult per predictor

Notes:

  • Each predictor is tested adjusting for the intercept, every confounds column, and the other interest predictors (they all sit in the shared design). The shared ComBat diagnostics and warning flags are attached to every entry of the returned dict.

  • Each dict value is a single 1-D regressor of shape (n_subjects,); the key names that predictor’s result. An empty dict, or a value that is not 1-D, raises.

  • Under Strategy D (harmonize='nuisance_only'; see Multi-site GLM with ComBat harmonization), ComBat preserves only confounds and excludes all tested predictors, the same label-leak avoidance as in the single-predictor case.

  • The (E, H) grid sweep (Inference stability: sweep (E, H) in one call) composes: pass sequences for e / h and every predictor’s result carries the parameter axis.

  • This path is for separate per-predictor tests. For a joint (omnibus) test of several predictors, use a multi-row F-contrast instead (Custom contrasts, interactions, and omnibus F-tests).
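To make the shared-design idea in the notes above concrete, the sketch below assembles one design matrix from a dict of predictors and derives a one-hot contrast per predictor, so each is tested while the remaining columns act as nuisance. `build_shared_design` and its column order (intercept, predictors, confounds) are assumptions for illustration; the layout conninfpy actually uses is not documented on this page.

```python
import numpy as np

def build_shared_design(interest, confounds):
    """Return (X, contrasts) for a dict of 1-D predictors and shared confounds."""
    names = list(interest)
    if not names:
        raise ValueError("interest dict is empty")
    for name in names:
        if np.ndim(interest[name]) != 1:
            raise ValueError(f"predictor {name!r} must be 1-D")
    n = len(interest[names[0]])
    cols = [np.ones(n)] + [np.asarray(interest[k], dtype=float) for k in names]
    conf = np.asarray(confounds, dtype=float).reshape(n, -1)
    X = np.column_stack(cols + [conf])
    contrasts = {}
    for j, name in enumerate(names, start=1):   # column 0 is the intercept
        c = np.zeros(X.shape[1])
        c[j] = 1.0                              # tests this predictor only
        contrasts[name] = c
    return X, contrasts

# Toy data: 5 subjects, three predictors, two extra nuisance columns.
interest = {
    "age": np.array([23.0, 31.0, 45.0, 52.0, 60.0]),
    "sex": np.array([0.0, 1.0, 0.0, 1.0, 1.0]),
    "mean_fd": np.array([0.10, 0.22, 0.15, 0.30, 0.18]),
}
motion = np.arange(10.0).reshape(5, 2)
X, contrasts = build_shared_design(interest, motion)
```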

Multi-site GLM with ComBat harmonization

The recommended pattern for most real fMRI analyses: a scientific predictor, subject-level confounds, and multi-site data. Adding sites= engages two coupled mechanisms — ComBat batch harmonization, and site-stratified permutation (PALM -eb semantics, auto-set via strata=sites).

import numpy as np
from conninfpy import analyze

confounds = np.column_stack([age, sex, mean_fd])

out_D = analyze(
    Y,
    interest=diagnosis,           # 0 = control, 1 = patient
    confounds=confounds,
    sites=site,                   # per-subject scanner / site label
    harmonize='nuisance_only',    # Strategy D — primary recipe
    method='tfnbs',
    e=0.4, h=3.0, n=10,
    n_permutations=200,
    acceleration='gpd',           # fast exploratory inference
    rng=42,
)

print(out_D.inference)
print(out_D.flags)                # plain-English provenance warnings
print(out_D.combat_diagnostics)   # includes 'strategy': 'D'

What the call does, step by step:

  • ComBat fits with preserve = confounds (age, sex, motion). The tested variable — diagnosis — is deliberately excluded so the harmonization fit does not incorporate the labels the permutation will reshuffle (Nygaard 2016 label-leak avoidance).

  • The downstream GLM tests diagnosis with age + sex + mean_fd + site_dummies as nuisance. Site dummies absorb any additive shifts ComBat did not fully remove.

  • sites=site auto-sets strata=site, so the permutation respects site exchangeability blocks.
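The last two steps can be illustrated with plain NumPy: site dummy columns for the nuisance design, and a permutation that only shuffles subjects within their own site. Both helpers are hypothetical sketches of the general technique (reference-coded dummies, within-block shuffling), not conninfpy internals.

```python
import numpy as np

def site_dummies(sites):
    """0/1 column per site, dropping the first level as the reference."""
    levels, codes = np.unique(sites, return_inverse=True)
    return (codes[:, None] == np.arange(1, len(levels))[None, :]).astype(float)

def stratified_permutation(sites, rng):
    """Index vector that shuffles subjects only within their own site."""
    sites = np.asarray(sites)
    perm = np.arange(sites.shape[0])
    for level in np.unique(sites):
        members = np.flatnonzero(sites == level)
        perm[members] = rng.permutation(members)
    return perm

site = np.array(["A", "A", "B", "B", "B", "C", "C"])
D = site_dummies(site)                               # (7, 2): columns for sites B and C
perm = stratified_permutation(site, np.random.default_rng(0))
# site[perm] equals site: every position still holds a subject from the same site.
```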

Choosing a harmonization strategy

analyze() ships two strategies. Report both for paper-grade work: D as the headline, E as a sensitivity arm showing the harmonization was not load-bearing.

| harmonize= | Strategy | What ComBat does | GLM nuisance design | When to use |
|---|---|---|---|---|
| 'nuisance_only' / 'd' | D (primary) | Fits with preserve = confounds; tested variable excluded | confounds + site dummies | Headline result. Removes the Nygaard 2016 label leak. Requires sites= and confounds=; GLM mode only. |
| None / 'e' | E (sensitivity) | Skipped | confounds + site dummies | Calibrated-by-construction reference; pair with D. |
| 'auto' (default) | dispatcher | D if sites + confounds; E if only sites; none otherwise | (whichever D / E uses) | When the call shape unambiguously implies the recipe. Prefer explicit 'nuisance_only' / None in paper scripts. |

# Primary + sensitivity pair — run both, report both.
common = dict(Y=Y, interest=diagnosis, confounds=confounds, sites=site,
              method='tfnbs', e=0.4, h=3.0, n=10, rng=42)
out_D = analyze(**common, harmonize='nuisance_only')   # Strategy D
out_E = analyze(**common, harmonize=None)              # Strategy E

Note

Two-sample mode + sites= has no defensible ComBat recipe (there is no interest column to preserve). analyze() skips ComBat and emits a flag recommending promotion of the analysis to a GLM with a binary interest indicator, which is the appropriate form for any multi-site group comparison.

Paired / repeated conditions

Question: does connectivity change between condition A and condition B in the same subjects? Pass both conditions as group1 / group2 (row aligned: group1[s] and group2[s] are the same subject) with test_type='paired'. With no condition-varying confounds this uses the sign-flip permutation null — the exact non-asymptotic test.

from conninfpy import analyze

out = analyze(
    group1=rest_corr,             # condition A, (n, N, N)
    group2=task_corr,             # condition B, same subjects
    test_type='paired',
    method='tfnbs',
    e=0.4, h=3.0, n=10,
    rng=42,
)

p_task_higher = out.inference['positive']   # task > rest
p_rest_higher = out.inference['negative']   # rest > task
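The sign-flip null used by this path can be sketched as follows: under the null hypothesis the sign of each subject's difference map is arbitrary, so whole subjects are flipped at random and the maximum statistic across edges is recorded for FWER control. This is a simplified illustration on raw paired t-statistics (no network enhancement), with a hypothetical function name and synthetic data.

```python
import numpy as np

def signflip_maxstat(group1, group2, n_perm=999, seed=0):
    """FWER-corrected p-map for group2 > group1, computed on the upper triangle."""
    rng = np.random.default_rng(seed)
    diff = group2 - group1                      # (n, N, N) within-subject change
    n, N, _ = diff.shape
    iu = np.triu_indices(N, k=1)
    d = diff[:, iu[0], iu[1]]                   # (n, n_edges)

    def t_stat(x):
        return x.mean(axis=0) / (x.std(axis=0, ddof=1) / np.sqrt(n))

    t_obs = t_stat(d)
    null_max = np.empty(n_perm)
    for b in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(n, 1))   # one sign per subject
        null_max[b] = t_stat(signs * d).max()
    # The observed labelling is counted once, so p is never exactly zero.
    p = (1 + (null_max[None, :] >= t_obs[:, None]).sum(axis=1)) / (n_perm + 1)
    p_map = np.ones((N, N))
    p_map[iu] = p
    p_map[iu[1], iu[0]] = p                     # mirror to the lower triangle
    return p_map

# Synthetic example: 20 subjects, 4 ROIs, one edge with a real task effect.
rng = np.random.default_rng(7)
rest = 0.1 * rng.standard_normal((20, 4, 4))
task = rest + 0.1 * rng.standard_normal((20, 4, 4))
task[:, 0, 1] += 0.5
p_map = signflip_maxstat(rest, task)
```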

Repeated-measures GLM (condition-varying confounds)

When a confound differs between the two conditions for the same subject (e.g. condition-level head motion, arousal, or reaction time), pass it through confounds_group1 / confounds_group2. analyze() then routes to the paired-difference GLM: it forms \(\Delta_Y = \mathrm{group2} - \mathrm{group1}\) and the per-subject confound difference, and tests the difference intercept with Freedman–Lane permutation (via conninfpy.compute_p_val_paired_glm()).

from conninfpy import analyze

out = analyze(
    group1=rest_corr,             # condition A, (n, N, N)
    group2=task_corr,             # condition B, same subjects
    test_type='paired',
    confounds_group1=fd_rest,     # motion during condition A
    confounds_group2=fd_task,     # motion during condition B
    method='tfnbs',
    e=0.4, h=3.0, n=10,
    n_permutations=1000,
    acceleration=None,
    rng=42,
)

p_task_higher = out.inference['positive']   # task > rest, motion-adjusted
p_rest_higher = out.inference['negative']

Notes:

  • Orientation is identical to the no-confound paired path: positive = group2 > group1. (Internally the conditions are passed swapped so the tested intercept of Δ = group2 − group1 keeps that sign.)

  • Pass both confounds_group1 and confounds_group2 or neither. They are only valid with test_type='paired'. Use confounds= (no suffix) only for the between-subject GLM path; passing it alongside group1/group2 raises.

  • Subject-constant nuisances cancel in the within-subject difference and do not need to be supplied — including additive site effects. So sites= with a paired design skips ComBat (it is unnecessary) while still stratifying the permutation; analyze() notes this in out.flags.

  • Power caveat. The paired GLM tests the difference intercept. When a single edge carries a very strong effect it dominates the max-statistic FWER null, and the GLM path is then less powerful than the no-confound sign-flip path. This is inherent to the intercept permutation test, not to the analyze() implementation. When no condition-varying confounds are present, the plain paired path (Paired / repeated conditions) is preferable.
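Two of the points above can be verified numerically: a subject-constant additive shift cancels in the within-subject difference, and the paired-difference model is an ordinary regression of the difference maps on an intercept plus the confound difference. The sketch uses synthetic data and generic least squares; whether conninfpy mean-centres the confound difference is not stated on this page, so the intercept here is the expected difference at zero confound difference.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 12, 3
rest = rng.standard_normal((n, N, N))
task = rest + 0.3 + 0.1 * rng.standard_normal((n, N, N))

# A subject-constant additive shift (e.g. a site offset) hits both conditions.
site_offset = rng.standard_normal((n, 1, 1))
delta_clean = task - rest
delta_shifted = (task + site_offset) - (rest + site_offset)   # same as delta_clean

# Paired-difference design: intercept (the tested effect) + confound difference.
fd_rest, fd_task = rng.random(n), rng.random(n)
design = np.column_stack([np.ones(n), fd_task - fd_rest])
beta, *_ = np.linalg.lstsq(design, delta_clean.reshape(n, -1), rcond=None)
intercept_map = beta[0].reshape(N, N)     # motion-adjusted estimate of group2 - group1
```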

Custom contrasts, interactions, and omnibus F-tests

For interactions, custom categorical coding, or joint (omnibus) tests, build the design matrix and contrast explicitly and call conninfpy.compute_p_val_glm() directly — see the Usage page for the advanced API, including stat_type='fstat' for multi-row contrasts (≥3-condition designs or jointly testing several predictors), which returns a single 'omnibus' p-map.

Inference stability: sweep (E, H) in one call

The TFNBS-family enhancement ('tfnbs', 'ni_tfnbs', 'fbc_tfnbs') depends on two exponents — the extent exponent E and the height exponent H. Published defaults disagree (Hao 2024 E=0.4, H=3.0; Smith–Nichols 2009 E=0.5, H=2.0; Baggio 2018 E=0.75, H=3.0), and Vinokur 2023 reports up to 75-fold variation in edge counts across the (E, H) plane. A single (E, H) result is therefore of limited value on its own; a finding should be shown to be stable across the plausible parameter range.

analyze() evaluates the grid at negligible additional cost. Pass equal-length sequences for e and h and the whole grid is computed in one permutation pass: the threshold-integration loop runs once and the per-cell exponentiation is broadcast at the end, so a K-cell grid costs approximately the wall-clock time of a single cell. A point estimate thus becomes a stability assessment.

from conninfpy import analyze

# Three published-default (E, H) cells, zipped pairwise (not a cross
# product): (0.4, 3.0), (0.5, 2.0), (0.75, 3.0).
e_grid = [0.4, 0.5, 0.75]      # Hao, Smith–Nichols, Baggio
h_grid = [3.0, 2.0, 3.0]

out = analyze(
    Y, interest=diagnosis, confounds=confounds, sites=site,
    harmonize='nuisance_only',
    method='tfnbs', e=e_grid, h=h_grid, n=10,
    n_permutations=5000, acceleration=None, rng=42,
)

r = out.inference
r.is_grid                 # True
r.positive.shape          # (N, N, 3) — one p-map per (E, H) cell
r.e_grid                  # array([0.4, 0.5, 0.75])
r.h_grid                  # array([3.0, 2.0, 3.0])
r.n_significant(0.05)     # per-cell counts:
                          # {'positive': [k0, k1, k2], 'negative': [...]}
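One way to summarise stability from a stacked p-map of shape (N, N, K) is to count significant edges per cell and list the edges that survive in every cell. The helper below is a hypothetical post-processing sketch on a toy array; the strict `p < alpha` rule is an assumption and may differ from the comparison n_significant() uses.

```python
import numpy as np

def stable_edges(p_grid, alpha=0.05):
    """Per-cell edge counts and the edges significant in every (E, H) cell."""
    N = p_grid.shape[0]
    iu = np.triu_indices(N, k=1)
    sig = p_grid[iu[0], iu[1], :] < alpha       # (n_edges, K) boolean
    counts = sig.sum(axis=0)                    # significant edges per cell
    stable = sig.all(axis=1)                    # significant in all K cells
    edges = [(int(i), int(j)) for i, j, s in zip(iu[0], iu[1], stable) if s]
    return counts, edges

# Toy stack: 3 ROIs, 3 cells. Edge (0, 1) is significant everywhere,
# edge (1, 2) only in the first cell.
p_grid = np.ones((3, 3, 3))
p_grid[0, 1, :] = p_grid[1, 0, :] = 0.01
p_grid[1, 2, 0] = p_grid[2, 1, 0] = 0.01
counts, edges = stable_edges(p_grid)
```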

When the returned object is a grid (is_grid == True), the (N, N) exporters require a cell to project to. Pass param_idx= (or call .select() first); omitting it on a grid raises an explicit error rather than selecting a cell implicitly:

sub = r.select(0)                                  # fresh 2D result for cell 0
df  = r.significant_edges(atlas, param_idx=2)      # export cell 2 directly
r.to_csv('edges_hao.csv', atlas=atlas, param_idx=0)

Use it to confirm a result is stable across the published-default cells before reporting, or to run a denser sensitivity grid at the cost of a single inference call. Because e and h are zipped pairwise rather than crossed, a 6 × 6 sweep is supplied as 36 (E, H) pairs. The same e / h sequence syntax works on every TFNBS-family path: two-sample, GLM, and the repeated-measures GLM alike.
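Since the sequences are paired element by element, a full cross-product grid has to be flattened into two equal-length lists before the call. A small sketch with the standard library follows; the particular E and H values are only examples.

```python
import itertools

e_values = [0.4, 0.5, 0.75]
h_values = [2.0, 3.0]

# Every (E, H) combination, flattened into two aligned lists.
pairs = list(itertools.product(e_values, h_values))
e_grid = [e for e, _ in pairs]      # [0.4, 0.4, 0.5, 0.5, 0.75, 0.75]
h_grid = [h for _, h in pairs]      # [2.0, 3.0, 2.0, 3.0, 2.0, 3.0]

# These lists are then passed as analyze(..., e=e_grid, h=h_grid).
# Per-cell outputs come back in the same order, so cell k maps to pairs[k];
# reshaping to (len(e_values), len(h_values)) recovers the 2-D layout.
```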

Interpreting and exporting results

Every result is dict-like and carries metadata and exporters:

r = out.inference
r.method, r.n_permutations, r.acceleration, r.harmonized
r.n_significant(alpha=0.05)            # {'positive': k, 'negative': k}
r.stat_signed                          # signed t / β effect map

# ROI-aware edge table (needs an atlas)
from conninfpy import AtlasInfo
atlas = AtlasInfo.schaefer_200_yeo7()
df = out.significant_edges(atlas, sort='network_pair', top_k=50)
out.to_csv('edges.csv', atlas=atlas)

# One-call publication figure
from conninfpy.plot import summary_figure
fig = summary_figure(out.inference, atlas=atlas, alpha=0.05, top_k=10)
fig.savefig('summary.pdf', bbox_inches='tight')

Speed vs. publication runs

analyze() defaults to n_permutations=200 with acceleration='gpd' for fast exploration. For a final result use a larger empirical reference:

# Exploration (≈25× faster, GPD tail approximation with empirical fallback)
analyze(Y, interest=age, confounds=confounds, n_permutations=200,
        acceleration='gpd', rng=42)

# Publication (exact finite-permutation reference)
analyze(Y, interest=age, confounds=confounds, n_permutations=5000,
        acceleration=None, rng=42)

See Usage for the full enhancement-method table and acceleration internals. The (E, H) stability sweep (Inference stability: sweep (E, H) in one call) and the several-predictors workflow (Several predictors in one pass, which wraps conninfpy.compute_p_val_glm_multi()) are covered above.


© Copyright 2026, IHB RAS.
