Usage

ConnInfPy provides several top-level inferential pipelines, all sharing a common contract: edge-wise statistics → topology-aware enhancement → permutation null → FWER- or FDR-corrected p-values, with positive and negative effects always tested separately.

Quick reference:

Pipeline	Use case
`conninfpy.compute_p_val()`	Group / paired / one-sample t-test
`conninfpy.compute_p_val_glm()`	Continuous predictors with confound regression (Freedman–Lane)
`conninfpy.compute_p_val_glm_multi()`	Several contrasts under a shared nuisance model in one permutation pass
`conninfpy.compute_p_val_paired_glm()`	Paired A vs B with Δ-level confounds
`conninfpy.analyze()`	One-shot Fisher-z → ComBat → GLM/t-test → `InferenceResult`

Two helper layers wrap them:

conninfpy.harmonize — multi-site ComBat harmonization + design diagnostics
conninfpy.acceleration — GPD/gamma tail approximation for permutation acceleration

Input conventions

Connectivity tensors: shape (n_subjects, N, N), symmetric, zero diagonal.
Edge weights: Fisher-z transformed correlation coefficients. Use conninfpy.fisher_r_to_z() before any inference call.
The package returns InferenceResult objects, which behave as {'positive', 'negative'} dicts (canonical keys since v2.0) and additionally carry attributes positive / negative / method / n_permutations / acceleration / wall_time_s. The legacy v1.x keys 'g2>g1' / 'g1>g2' (t-test family) remain readable but emit a DeprecationWarning and will be removed in v2.1. F-stat omnibus tests still return {'omnibus': arr}.
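
As a rough illustration of these conventions, the sketch below builds a tensor of the expected shape from synthetic time series. It uses np.arctanh as a stand-in for conninfpy.fisher_r_to_z(); whether the package function zeroes the diagonal itself is not stated here, so the sketch does it explicitly. All variable names and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_nodes, n_timepoints = 4, 5, 200

# Hypothetical stand-in data: one (timepoints, nodes) series per subject.
timeseries = rng.standard_normal((n_subjects, n_timepoints, n_nodes))

# Pearson correlation per subject -> (n_subjects, N, N)
corr = np.stack([np.corrcoef(ts, rowvar=False) for ts in timeseries])

# Zero the diagonal first (arctanh(1) is infinite), then apply Fisher r-to-z.
idx = np.arange(n_nodes)
corr[:, idx, idx] = 0.0
z = np.arctanh(corr)

print(z.shape)  # (n_subjects, N, N)
```

Checking symmetry and the zero diagonal once, before any inference call, is cheaper than debugging a permutation run that was fed raw correlations.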

t-test pipeline — `compute_p_val()`

The simplest entry point. Computes per-edge t-statistics, applies an enhancement operator, builds a permutation null of per-tail max statistics, and returns FWER- or FDR-corrected p-maps with the +1 Phipson–Smyth correction.

from conninfpy import compute_p_val, fisher_r_to_z

group1_z = fisher_r_to_z(group1_corr)   # (n1, N, N)
group2_z = fisher_r_to_z(group2_corr)

p = compute_p_val(
    group1_z, group2_z,
    test_type='two-sample',             # or 'paired', 'one-sample'
    method='tfnbs',
    n_permutations=1000,
    e=0.3, h=3.0, n=10,                 # FDR-calibrated regime
    use_mp=True, rng=42,
)
# p['positive'] : (N, N) p-map for group2 > group1
# p['negative'] : (N, N) p-map for group1 > group2
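
The per-tail max-statistic null and the +1 correction can be sketched in plain NumPy. This is a minimal illustration with method='tstat'-style behaviour (no enhancement), not the package's implementation; the function names pooled_t and max_stat_p_maps are hypothetical.

```python
import numpy as np

def pooled_t(a, b):
    """Per-edge pooled-variance t-statistic for mean(b) - mean(a)."""
    na, nb = a.shape[0], b.shape[0]
    pooled = ((na - 1) * a.var(axis=0, ddof=1)
              + (nb - 1) * b.var(axis=0, ddof=1)) / (na + nb - 2)
    return (b.mean(axis=0) - a.mean(axis=0)) / np.sqrt(pooled * (1.0 / na + 1.0 / nb))

def max_stat_p_maps(group1, group2, n_permutations=200, seed=0):
    """FWER-corrected p-maps from per-tail max-statistic permutation nulls."""
    rng = np.random.default_rng(seed)
    n1, N = group1.shape[0], group1.shape[1]
    rows, cols = np.triu_indices(N, k=1)          # only the upper triangle is read
    edges = np.concatenate([group1, group2])[:, rows, cols]
    t_obs = pooled_t(edges[:n1], edges[n1:])

    null_pos = np.empty(n_permutations)
    null_neg = np.empty(n_permutations)
    for i in range(n_permutations):
        shuffled = edges[rng.permutation(edges.shape[0])]   # relabel subjects
        t_perm = pooled_t(shuffled[:n1], shuffled[n1:])
        null_pos[i] = t_perm.max()                 # one max per tail per permutation
        null_neg[i] = (-t_perm).max()

    def corrected(observed, null):
        exceed = (null[:, None] >= observed[None, :]).sum(axis=0)
        p_map = np.ones((N, N))
        # +1 in numerator and denominator: p can never be exactly zero
        p_map[rows, cols] = p_map[cols, rows] = (1 + exceed) / (1 + n_permutations)
        return p_map

    return {"positive": corrected(t_obs, null_pos),
            "negative": corrected(-t_obs, null_neg)}

rng = np.random.default_rng(1)
g1 = rng.standard_normal((20, 5, 5))
g2 = rng.standard_normal((20, 5, 5))
g2[:, 0, 1] += 3.0                                 # planted group2 > group1 effect
p = max_stat_p_maps(g1, g2)
```

With 200 permutations the smallest attainable p-value is 1/201, which is why the permutation count bounds the resolution of the corrected map.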

GLM pipeline — `compute_p_val_glm()`

Edge-wise GLM with Freedman–Lane permutation. Use this when you have a continuous predictor and want to control for nuisance regressors (motion, age, sex, site).

Convenience API (recommended):

import numpy as np
from conninfpy import compute_p_val_glm, fisher_r_to_z

Y = fisher_r_to_z(connectivity_matrices)              # (n, N, N)
age = np.array([25, 30, 28])                          # (n,)
confounds = np.column_stack([motion, sex])            # (n, p)

p = compute_p_val_glm(
    Y, interest=age, confounds=confounds,
    stat_type='tstat',                                 # or 'beta', 'fstat'
    method='tfnbs',
    n_permutations=1000,
    e=0.3, h=3.0, n=10,
    use_mp=True, rng=42,
)
# p['positive'] : edges where age ↑ → connectivity ↑
# p['negative'] : edges where age ↑ → connectivity ↓
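
For readers unfamiliar with Freedman–Lane, the sketch below shows the core loop on edges flattened to an (n, n_edges) array: fit the nuisance-only model, permute its residuals, add the nuisance fit back, and refit the full model. It covers the positive tail with no enhancement, and the helper names are hypothetical rather than part of ConnInfPy.

```python
import numpy as np

def t_for_column(design, y, col):
    """t-statistic of one design column, for every column (edge) of y."""
    n, p = design.shape
    pinv = np.linalg.pinv(design)
    beta = pinv @ y
    resid = y - design @ beta
    sigma2 = (resid ** 2).sum(axis=0) / (n - p)
    var_factor = (pinv @ pinv.T)[col, col]        # equals [(X'X)^-1]_{col,col}
    return beta[col] / np.sqrt(sigma2 * var_factor)

def freedman_lane_null(y, interest, confounds, n_permutations=200, seed=0):
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    Z = np.column_stack([np.ones(n), confounds])  # reduced (nuisance-only) model
    M = np.column_stack([interest, Z])            # full model, interest in column 0
    fitted_z = Z @ (np.linalg.pinv(Z) @ y)
    resid_z = y - fitted_z
    t_obs = t_for_column(M, y, 0)
    null_max = np.empty(n_permutations)
    for i in range(n_permutations):
        # permute only the nuisance-adjusted residuals, then restore the nuisance fit
        y_perm = resid_z[rng.permutation(n)] + fitted_z
        null_max[i] = t_for_column(M, y_perm, 0).max()
    return t_obs, null_max

rng = np.random.default_rng(0)
n, n_edges = 40, 6
age = rng.standard_normal(n)
motion = rng.standard_normal(n)
y = rng.standard_normal((n, n_edges)) + 0.5 * motion[:, None]
y[:, 0] += 1.0 * age                              # planted age effect on edge 0

t_obs, null_max = freedman_lane_null(y, age, motion)
p_positive = (1 + (null_max[:, None] >= t_obs[None, :]).sum(axis=0)) / (1 + null_max.size)
```

Permuting residuals rather than raw data keeps the confound structure intact under the null, which is the reason to prefer this scheme when nuisance regressors are present.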

Advanced API (full design matrix + contrast):

from conninfpy import build_design_matrix, compute_p_val_glm

X, contrast = build_design_matrix(interest=age, confounds=confounds)
p = compute_p_val_glm(
    Y, design_matrix=X, contrast=contrast,
    stat_type='tstat', method='tfnbs', n_permutations=1000,
)

F-contrast (omnibus) for joint multi-row tests — e.g. ≥3-condition designs or testing several predictors jointly:

n = Y.shape[0]                                         # number of subjects
X = np.column_stack([np.ones(n), age, age**2, sex, motion])
contrast = np.array([
    [0, 1, 0, 0, 0],       # beta_age
    [0, 0, 1, 0, 0],       # beta_age_squared
])
p = compute_p_val_glm(
    Y, design_matrix=X, contrast=contrast,
    stat_type='fstat', method='tfnbs', n_permutations=1000,
)
# F-pipeline returns a single non-negative tail:
# p['omnibus'] : (N, N) FWER-corrected p-map
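
To make the multi-row contrast concrete, here is a sketch of the per-edge F-statistic for a contrast matrix C, written in NumPy on flattened edges. It is the textbook general linear hypothesis formula, offered as an assumption about what 'fstat' computes rather than a copy of the package code; f_stat is a hypothetical name.

```python
import numpy as np

def f_stat(X, y, C):
    """F-statistic for H0: C @ beta = 0, evaluated for every column (edge) of y."""
    n, p = X.shape
    C = np.atleast_2d(C)
    q = np.linalg.matrix_rank(C)                  # numerator degrees of freedom
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y                      # (p, n_edges)
    resid = y - X @ beta
    sigma2 = (resid ** 2).sum(axis=0) / (n - p)
    cb = C @ beta                                 # (q, n_edges)
    middle = np.linalg.inv(C @ xtx_inv @ C.T)     # (q, q)
    numerator = np.einsum("qe,qr,re->e", cb, middle, cb) / q
    return numerator / sigma2

rng = np.random.default_rng(0)
n = 50
age = rng.standard_normal(n)                      # standardised to keep X well conditioned
sex = rng.integers(0, 2, n).astype(float)
motion = rng.random(n)
X = np.column_stack([np.ones(n), age, age ** 2, sex, motion])
C = np.array([[0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0]], dtype=float)
y = rng.standard_normal((n, 3))
y[:, 0] += 0.8 * age ** 2                         # planted quadratic effect on edge 0
F = f_stat(X, y, C)
```

Because F is non-negative, there is no sign to split on, which is why this pipeline returns a single 'omnibus' map instead of positive and negative tails.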

Multi-contrast GLM in one pass — `compute_p_val_glm_multi()`

If you need to test several contrasts of interest under the same nuisance model — e.g. age, sex, and mean_fd separately while treating the others as nuisance — calling compute_p_val_glm() once per contrast wastes work: the reduced-model residual fit and the per-permutation X_pinv @ Y_perm matrix multiplication are identical. The multi-contrast wrapper does it in one pass.

from conninfpy import compute_p_val_glm_multi

n = Y.shape[0]                                         # number of subjects
X = np.column_stack([np.ones(n), age, sex, mean_fd])  # (n, 4)
contrasts = {
    "age":     np.array([0.0, 1.0, 0.0, 0.0]),
    "sex":     np.array([0.0, 0.0, 1.0, 0.0]),
    "motion":  np.array([0.0, 0.0, 0.0, 1.0]),
}

results = compute_p_val_glm_multi(
    Y, design_matrix=X, contrasts=contrasts,
    method="tfnbs", n_permutations=5000,
    acceleration="gpd", rng=42,
)
# results = {'age': InferenceResult, 'sex': ..., 'motion': ...}
results["age"].positive          # (N, N) FWER p-map
results["age"].n_significant(0.05)

For K contrasts the wall time is roughly that of a single compute_p_val_glm() call rather than K calls, i.e. about a 3× speedup for the canonical age + sex + motion design.

By default the reduced model excludes any column touched by any contrast in the dictionary. Pass nuisance_contrast= to override explicitly, e.g. when you want sex and motion treated as nuisance for the age contrast even though you also test them separately. F-stat / multi-row contrasts are unsupported in this wrapper — call compute_p_val_glm() once per omnibus test.
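
The saving comes from sharing one least-squares fit per permutation across all contrasts. The sketch below illustrates that idea on a single stand-in permuted data set; it is a simplified picture under the assumption that every contrast sees the same permuted Y, and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_edges = 30, 8
age = rng.standard_normal(n)
sex = rng.integers(0, 2, n).astype(float)
mean_fd = rng.random(n)
X = np.column_stack([np.ones(n), age, sex, mean_fd])
Y_perm = rng.standard_normal((n, n_edges))        # stand-in for one permuted data set

contrasts = {
    "age":    np.array([0.0, 1.0, 0.0, 0.0]),
    "sex":    np.array([0.0, 0.0, 1.0, 0.0]),
    "motion": np.array([0.0, 0.0, 0.0, 1.0]),
}

X_pinv = np.linalg.pinv(X)                        # computed once, reused by every permutation
beta = X_pinv @ Y_perm                            # the shared, expensive step: (4, n_edges)
resid = Y_perm - X @ beta
sigma2 = (resid ** 2).sum(axis=0) / (n - X.shape[1])
xtx_inv = X_pinv @ X_pinv.T                       # (X'X)^-1 for a full-rank design

# Each contrast adds only a cheap vector product on top of the shared fit.
t_maps = {
    name: (c @ beta) / np.sqrt(sigma2 * (c @ xtx_inv @ c))
    for name, c in contrasts.items()
}
```

The per-contrast cost is a dot product against beta, so adding a fourth or fifth contrast barely changes the time per permutation.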

Paired A vs B with Δ-level confounds — `compute_p_val_paired_glm()`

When you have a within-subject contrast (task A vs B) and confounds that differ between conditions (e.g. condition-level motion):

from conninfpy import compute_p_val_paired_glm

p = compute_p_val_paired_glm(
    Y_task_A, Y_task_B,                            # both (n, N, N)
    confounds_A=np.column_stack([motion_A, drowsiness_A]),
    confounds_B=np.column_stack([motion_B, drowsiness_B]),
    method='tfnbs', n_permutations=1000,
    e=0.3, h=3.0, n=10,
)

With no confounds, this delegates to compute_p_val(test_type='paired') (sign-flip permutation, the exact non-asymptotic null). With Δ-level confounds it constructs Δ_Y = Y^A − Y^B, Δ_C = C^A − C^B and runs a one-sample GLM on the differences.
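
The Δ construction can be written out directly. In this sketch the quantity under test is the intercept of a GLM on the differences; Δ_C is left uncentred, so the intercept is the A − B difference at equal confound levels. Whether the package centres Δ_C is not stated above, so treat that choice as an assumption. Data and names are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_edges = 25, 6
Y_A = rng.standard_normal((n, n_edges))
Y_B = rng.standard_normal((n, n_edges))
Y_A[:, 0] += 1.5                                  # planted A > B effect on edge 0
C_A = rng.random((n, 2))                          # e.g. motion, drowsiness in condition A
C_B = rng.random((n, 2))

delta_Y = Y_A - Y_B                               # within-subject differences
delta_C = C_A - C_B                               # condition-level confound differences

X = np.column_stack([np.ones(n), delta_C])        # one-sample GLM on the differences
pinv = np.linalg.pinv(X)
beta = pinv @ delta_Y
resid = delta_Y - X @ beta
sigma2 = (resid ** 2).sum(axis=0) / (n - X.shape[1])
t_intercept = beta[0] / np.sqrt(sigma2 * (pinv @ pinv.T)[0, 0])
```

A positive t_intercept on an edge indicates A > B after adjusting for the confound differences.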

Methods (enhancement operators)

compute_p_val(..., method=...) and compute_p_val_glm(..., method=...) both accept:

`method`	Operator
`'tstat'`	No enhancement; per-tail max-statistic FWER on raw t / β / F
`'tfnbs'`	Threshold-Free NBS (Baggio 2018)
`'nbs'`	Classical Network-Based Statistic (Zalesky 2010); supply `threshold=2.0` and `nbs_stat='extent'` or `'intensity'`
`'cnbs'`	Constrained NBS (Noble & Scheinost 2020); supply `net_labels=`
`'ni_tfnbs'`	Network-Informed TFNBS (this work); soft block-density prior; supply `net_labels=`
`'fbc_tfnbs'`	Functional-Block-Clustering TFNBS (this work); hard block prior; supply `net_labels=`, `min_cluster_size=3`
`'bh_fdr'`	Parametric Benjamini–Hochberg FDR (no permutation)
`'bh_fdr_perm'`	Empirical edge-wise BH-FDR via permutation null
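
For reference, the Benjamini–Hochberg step-up adjustment behind 'bh_fdr' can be sketched in a few lines. This is the standard procedure applied to a flat vector of p-values (for a connectome, the upper-triangle edges); bh_adjust is a hypothetical helper, not a ConnInfPy function.

```python
import numpy as np

def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values) for a 1-D array."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)   # p_(k) * m / k
    # enforce monotonicity, working down from the largest p-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

q = bh_adjust([0.01, 0.04, 0.03, 0.005])
print(q)  # [0.02 0.04 0.04 0.02]
```

Edges with q below the chosen level (for example 0.05) are declared significant at that false discovery rate.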

Multi-site harmonization — `conninfpy.harmonize`

Native NumPy implementation of parametric empirical-Bayes ComBat. No neuroHarmonize or neurocombat dependency.

End-to-end with confound-aware GLM:

import numpy as np
from conninfpy import (
    fisher_r_to_z,
    combat_harmonize, design_diagnostics,
    compute_p_val_glm,
)

# Y_corr : (n, N, N) per-subject Pearson correlation matrices
# sites  : (n,) acquisition-site labels
# age, sex, mean_fd : per-subject phenotype
Y = fisher_r_to_z(Y_corr)

# 1. Harmonize site effects, preserving age + sex + motion as biology
preserve = np.column_stack([age, sex, mean_fd])
Y = combat_harmonize(Y, sites=sites, preserve=preserve).Y_adjusted

# 2. Sanity-check the design before running anything expensive
X = np.column_stack([np.ones_like(age), age, sex, mean_fd])
diag = design_diagnostics(X, names=["intercept", "age", "sex", "mean_fd"])
# reports condition number + per-column VIF + plain-English flags

# 3. TFNBS on age, partialling out sex + motion (Freedman–Lane)
p = compute_p_val_glm(
    Y, interest=age,
    confounds=np.column_stack([sex, mean_fd]),
    stat_type='tstat', method='tfnbs',
    e=0.3, h=3.0, n=10,
    n_permutations=200, acceleration='gpd',
    use_mp=True, rng=42,
)
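
The condition number and per-column VIF reported in step 2 are standard quantities and can be checked by hand. The sketch below is not the design_diagnostics implementation; it assumes an intercept in column 0 and uses a column-normalised design for the condition number, both of which are choices made here for illustration.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each non-intercept column (intercept in column 0)."""
    out = {}
    for j in range(1, X.shape[1]):
        others = np.delete(X, j, axis=1)
        target = X[:, j]
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        ss_res = ((target - fitted) ** 2).sum()
        ss_tot = ((target - target.mean()) ** 2).sum()
        out[j] = ss_tot / ss_res                  # equals 1 / (1 - R^2)
    return out

rng = np.random.default_rng(0)
n = 200
a = rng.standard_normal(n)
b = a + 0.1 * rng.standard_normal(n)              # nearly collinear with a
c = rng.standard_normal(n)                        # independent regressor
X_demo = np.column_stack([np.ones(n), a, b, c])

vifs = vif(X_demo)
cond = np.linalg.cond(X_demo / np.linalg.norm(X_demo, axis=0))
```

A large VIF on a regressor of interest means its t-statistic will be unstable regardless of how many permutations are run, so it is worth fixing the design first.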

For cross-site machine-learning transfer (fit ComBat on training cohorts, freeze, apply to held-out sites at test time), use combat_fit() / combat_apply() separately.

Permutation acceleration — `conninfpy.acceleration`

Set acceleration='gpd' (or 'gamma') on either pipeline to replace the empirical permutation p-value formula with a fitted parametric tail (Generalized Pareto Distribution, Winkler 2016b). This reduces the permutation budget from ~5000 to ~200, a ~25× wall-clock saving, and reproduces the empirical FWER-corrected p-values to within |Δ(-log10 p)| ≤ 0.001 on >99% of edges in the ConnInfPy ABIDE Age validation. A goodness-of-fit guard (Anderson–Darling on tail exceedances) falls back to empirical p-values when the GPD does not fit cleanly.
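
The idea of a parametric tail can be illustrated with a small NumPy sketch. Note the substitutions: it fits the GPD by the method of moments instead of maximum likelihood, and it omits the Anderson–Darling guard, so it is a simplified stand-in for the behaviour described above. The function name and the tail_fraction parameter are hypothetical.

```python
import numpy as np

def gpd_tail_p(null, observed, tail_fraction=0.05):
    """Approximate P(null >= observed) using a GPD fitted to the upper tail."""
    null = np.sort(np.asarray(null, dtype=float))
    n = null.size
    k = max(int(np.ceil(tail_fraction * n)), 10)  # number of tail exceedances
    threshold = null[n - k - 1]
    if observed <= threshold:
        # bulk of the distribution: the empirical +1 formula is adequate
        return (1 + np.sum(null >= observed)) / (1 + n)
    excess = null[n - k:] - threshold
    m, v = excess.mean(), excess.var(ddof=1)
    xi = 0.5 * (1.0 - m * m / v)                  # moment estimate of the shape
    sigma = 0.5 * m * (m * m / v + 1.0)           # moment estimate of the scale
    z = (observed - threshold) / sigma
    if abs(xi) < 1e-8:
        survival = np.exp(-z)                     # exponential limit of the GPD
    else:
        base = 1.0 + xi * z
        survival = base ** (-1.0 / xi) if base > 0 else 0.0
    return (k / n) * survival

rng = np.random.default_rng(0)
null_demo = rng.exponential(size=2000)            # synthetic stand-in for a max-stat null
p_mid = gpd_tail_p(null_demo, 5.0)
p_far = gpd_tail_p(null_demo, 8.0)
```

Unlike the empirical formula, the fitted tail can return values below 1/(n_permutations + 1), which is what lets a small permutation budget resolve small p-values.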

Topology-aware default integration steps

The TFCE integral is evaluated at n thresholds. Defaults:

n=100 for direct scoring (single-shot, high resolution; via conninfpy.get_tfnbs_score())
n=10 inside the permutation loop (Hao 2024 reports n=10 is sufficient for FDR control on network data)
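
The role of n is easiest to see in a direct implementation of the threshold sweep. The sketch below is a simplified reading of threshold-free NBS: at each of n thresholds it finds connected components of supra-threshold edges and adds extent^e · height^h · dh to every edge in the component. Taking extent as the component's edge count is an assumption made here, and tfnbs_sketch is not conninfpy.get_tfnbs_score().

```python
import numpy as np

def tfnbs_sketch(stat, e=0.3, h=3.0, n=10):
    """Enhance a symmetric (N, N) statistic map; only positive values contribute."""
    stat = np.asarray(stat, dtype=float)
    N = stat.shape[0]
    enhanced = np.zeros((N, N))
    top = stat.max()
    if top <= 0:
        return enhanced
    dh = top / n
    for thr in np.linspace(dh, top, n):           # n integration steps
        supra = stat >= thr
        np.fill_diagonal(supra, False)
        labels = -np.ones(N, dtype=int)
        for seed in range(N):
            if labels[seed] >= 0 or not supra[seed].any():
                continue
            # graph traversal over supra-threshold edges starting from this node
            labels[seed] = seed
            stack = [seed]
            while stack:
                node = stack.pop()
                for nb in np.flatnonzero(supra[node]):
                    if labels[nb] < 0:
                        labels[nb] = seed
                        stack.append(nb)
            members = labels == seed
            block = supra & np.outer(members, members)
            extent = block.sum() / 2              # edges in this component
            enhanced[block] += (extent ** e) * (thr ** h) * dh
    return enhanced

stat_demo = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (3, 4)]:             # a 2-edge component and an isolated edge
    stat_demo[i, j] = stat_demo[j, i] = 3.0
enhanced_demo = tfnbs_sketch(stat_demo)
```

Edges (0, 1) and (1, 2) share a component and therefore score higher than the isolated edge (3, 4) despite identical raw statistics; that boost is the topology-aware part.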

The exponents (e, h) accept either scalars or equal-length lists. Lists are zipped pairwise into a single 3-D batched call, so a 16×11 (E, H) grid, flattened into two length-176 lists, runs all 176 combinations in one compute_p_val invocation without multiplying the runtime.
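
Since the lists are paired element by element, a full grid has to be flattened before it is passed in. A minimal sketch with np.meshgrid follows; the exponent ranges are illustrative values, not recommended settings.

```python
import numpy as np

E_values = np.linspace(0.1, 1.6, 16)              # 16 illustrative e exponents
H_values = np.linspace(1.0, 3.5, 11)              # 11 illustrative h exponents

# Every (e, h) combination, as two (16, 11) arrays flattened to length 176.
E_grid, H_grid = np.meshgrid(E_values, H_values, indexing="ij")
e_list = E_grid.ravel().tolist()
h_list = H_grid.ravel().tolist()

# Pair k is (e_list[k], h_list[k]); grid position (i, j) maps to k = i * 11 + j.
print(len(e_list), len(h_list))  # 176 176
```

The two lists can then be supplied as e=e_list, h=h_list. How the returned p-maps are indexed by pair is not described in this section, so check the result's shape before reshaping back to the grid.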

See conninfpy.defaults for the full list of constants and citations (Smith & Nichols 2009, Baggio 2018, Vinokur 2023, Hao 2024).
