Recommended Workflows
=====================

This page describes the **recommended end-to-end recipes** for the most
common connectivity-inference designs. Where the :doc:`usage` page is a
function-by-function reference, this page is task-oriented: pick the row in
the decision table that matches your study, then copy the recipe.

.. tip::

   For the conceptual background to these recipes (the per-edge statistic,
   network enhancement, the permutation null, and family-wise error control),
   see :doc:`approach`. Each recipe below is a particular configuration of
   that common four-step pipeline.

The recommended entry point is :func:`conninfpy.analyze`. It wraps the
standard pipeline in a single call:

.. code-block:: text

   Fisher r→z → optional ComBat → optional site-stratified permutation
       → network enhancement → FWER-corrected p-value maps

It dispatches to the appropriate lower-level pipeline based on the arguments
supplied. The lower-level entry points :func:`conninfpy.compute_p_val`,
:func:`conninfpy.compute_p_val_glm`, and
:func:`conninfpy.compute_p_val_paired_glm` are needed only when full control
over the design matrix or contrast is required.

Selecting a workflow
--------------------

.. list-table::
   :header-rows: 1
   :widths: 40 35 25

   * - Study design
     - ``analyze()`` arguments
     - Section
   * - Two independent groups (patients vs controls)
     - ``group1=``, ``group2=``
     - :ref:`wf-two-sample`
   * - Continuous predictor + nuisance covariates
     - ``Y=``, ``interest=``, ``confounds=``
     - :ref:`wf-continuous`
   * - Several predictors, tested separately, one pass
     - ``Y=``, ``interest={name: vec}``
     - :ref:`wf-multi-interest`
   * - Multi-site continuous / group predictor
     - ``... + sites=``, ``harmonize=``
     - :ref:`wf-glm-sites`
   * - Paired / repeated conditions (no condition confounds)
     - ``group1=``, ``group2=``, ``test_type='paired'``
     - :ref:`wf-paired`
   * - Paired / repeated conditions + condition-varying confound
     - ``... + confounds_group1=``, ``confounds_group2=``
     - :ref:`wf-repeated-glm`
   * - Custom contrast / interaction / omnibus F-test
     - drop to :func:`conninfpy.compute_p_val_glm`
     - :ref:`wf-custom`

All paths share the same output contract: a dict-like
:class:`~conninfpy.InferenceResult` (exposed as ``out.inference`` in the
recipes below) with canonical keys ``'positive'`` and ``'negative'`` (or
``'omnibus'`` for F-tests). Positive and negative effects are **always tested
separately**. For directional designs the orientation is fixed:

- two-sample / paired: ``positive`` = ``group2 > group1``;
- GLM: ``positive`` = predictor ↑ → connectivity ↑.

Input conventions
-----------------

- Connectivity tensors are shape ``(n_subjects, N, N)``, symmetric, zero
  diagonal.
- :func:`conninfpy.analyze` applies Fisher r→z by default
  (``fisher_z=True``), so pass **raw correlation matrices**. If your matrices
  are already on the z-scale, pass ``fisher_z=False``. Lower-level functions
  do not apply the transform automatically; call
  :func:`conninfpy.fisher_r_to_z` first.
- Predictors, confounds, and ``sites`` must be row-aligned to the subject
  axis of the connectivity tensor.

.. _wf-two-sample:

Two independent groups
----------------------

The simplest design: two groups of subjects, no covariates.

.. code-block:: python

   from conninfpy import analyze

   out = analyze(
       group1=controls_corr,        # (n1, N, N) raw correlations
       group2=patients_corr,        # (n2, N, N)
       test_type='two-sample',
       method='tfnbs',
       e=0.4, h=3.0, n=10,          # FDR-calibrated regime (Hao 2024)
       n_permutations=1000,
       acceleration=None,           # exact empirical reference
       rng=42,
   )

   p_patients_higher = out.inference['positive']   # patients > controls
   p_controls_higher = out.inference['negative']   # controls > patients
   print(out.inference.n_significant(alpha=0.05))

.. note::

   ``compute_p_val(test_type='two-sample')`` uses **Welch's**
   (unequal-variance) t-statistic unconditionally. Under unequal variances
   *combined with* unbalanced group sizes the exchangeability assumption
   behind the permutation null weakens and Type-I error can inflate
   (Anderson & Robinson 2001). Treat such results with caution; for
   publication-grade multi-site work prefer the GLM recipe below with a
   binary ``interest`` indicator.

.. _wf-continuous:

Continuous predictor with confounds
-----------------------------------

Question: *does connectivity vary with age after controlling for sex and head
motion?* This is the GLM path (Freedman–Lane permutation), triggered by
passing ``Y=`` and ``interest=``.

.. code-block:: python

   import numpy as np
   from conninfpy import analyze

   confounds = np.column_stack([sex, mean_fd])

   out = analyze(
       Y,                           # (n, N, N) raw correlations
       interest=age,                # continuous predictor
       confounds=confounds,         # nuisance regressors
       method='tfnbs',
       e=0.4, h=3.0, n=10,
       rng=42,
   )

   p_age_pos = out.inference['positive']   # older → higher connectivity
   p_age_neg = out.inference['negative']   # older → lower connectivity

A binary 0/1 ``interest`` column turns this into a confound-adjusted group
comparison, which is the preferred form of the two-sample test when
covariates or multiple sites are involved.

A single array ``interest`` tests one effect and returns one result. To test
**several predictors at once**, pass a dict (see :ref:`wf-multi-interest`).

.. _wf-multi-interest:

Several predictors in one pass
------------------------------

To test several predictors **separately** under a shared nuisance model (e.g.
``age``, ``sex``, and ``mean_fd`` each as an effect of interest while the
others are controlled), pass ``interest`` as a **dict** ``{name: vector}``.
``analyze()`` builds one design matrix, shares the Freedman–Lane
reduced-model fit across predictors
(:func:`conninfpy.compute_p_val_glm_multi`), and returns a **dict** mapping
each name to its own result (an ``AnalyzeResult`` whose ``.inference`` is the
:class:`~conninfpy.InferenceResult`), at approximately the cost of a *single*
inference call rather than one per predictor.

.. code-block:: python

   from conninfpy import analyze

   out = analyze(
       Y,
       interest={'age': age, 'sex': sex, 'mean_fd': mean_fd},
       confounds=motion,            # extra nuisance, shared by all predictors
       sites=site,                  # ComBat + site-stratified permutation
       harmonize='nuisance_only',
       method='tfnbs',
       e=0.4, h=3.0, n=10,
       n_permutations=5000,
       acceleration='gpd',
       rng=42,
   )

   out['age'].inference['positive']          # edges where age ↑ → connectivity ↑
   out['sex'].inference.n_significant(0.05)
   out['mean_fd'].significant_edges(atlas)   # AnalyzeResult per predictor

Notes:

- Each predictor is tested adjusting for the intercept, every ``confounds``
  column, **and the other interest predictors** (they all sit in the shared
  design). The shared ComBat diagnostics and warning flags are attached to
  every entry of the returned dict.
- Each dict **value** is a single 1-D regressor of shape ``(n_subjects,)``;
  the **key** names that predictor's result. An empty dict, or a value that
  is not 1-D, raises an error.
- Under Strategy D, ComBat preserves only ``confounds`` and excludes **all**
  tested predictors, which is the same label-leak avoidance as in the
  single-predictor case.
- The ``(E, H)`` grid sweep (:ref:`wf-eh-stability`) composes: pass sequences
  for ``e`` / ``h`` and every predictor's result carries the parameter axis.
- This path is for **separate** per-predictor tests. For a **joint**
  (omnibus) test of several predictors, use a multi-row F-contrast instead
  (:ref:`wf-custom`).

.. _wf-glm-sites:

Multi-site GLM with ComBat harmonization
----------------------------------------

The recommended pattern for most real fMRI analyses: a scientific predictor,
subject-level confounds, and multi-site data. Adding ``sites=`` engages two
coupled mechanisms: ComBat batch harmonization, and site-stratified
permutation (PALM ``-eb`` semantics, auto-set via ``strata=sites``).

.. code-block:: python

   import numpy as np
   from conninfpy import analyze

   confounds = np.column_stack([age, sex, mean_fd])

   out_D = analyze(
       Y,
       interest=diagnosis,          # 0 = control, 1 = patient
       confounds=confounds,
       sites=site,                  # per-subject scanner / site label
       harmonize='nuisance_only',   # Strategy D, the primary recipe
       method='tfnbs',
       e=0.4, h=3.0, n=10,
       n_permutations=200,
       acceleration='gpd',          # fast exploratory inference
       rng=42,
   )

   print(out_D.inference)
   print(out_D.flags)               # plain-English provenance warnings
   print(out_D.combat_diagnostics)  # includes 'strategy': 'D'

**What the call does, step by step:**

- ComBat fits with ``preserve = confounds`` (age, sex, motion). The tested
  variable, diagnosis, is *deliberately excluded* so the harmonization fit
  does not incorporate the labels the permutation will reshuffle (Nygaard
  2016 label-leak avoidance).
- The downstream GLM tests diagnosis with
  ``age + sex + mean_fd + site_dummies`` as nuisance. Site dummies absorb any
  additive shifts ComBat did not fully remove.
- ``sites=site`` auto-sets ``strata=site``, so the permutation respects site
  exchangeability blocks.

Choosing a harmonization strategy
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``analyze()`` ships two strategies. **Report both** for paper-grade work: D
as the headline, E as a sensitivity arm showing the harmonization was not
load-bearing.

.. list-table::
   :header-rows: 1
   :widths: 18 14 24 22 22

   * - ``harmonize=``
     - Strategy
     - What ComBat does
     - GLM nuisance design
     - When to use
   * - ``'nuisance_only'`` / ``'d'``
     - **D (primary)**
     - Fits with ``preserve = confounds``; tested variable excluded
     - ``confounds + site dummies``
     - Headline result. Removes the Nygaard 2016 label leak. Requires
       ``sites=`` and ``confounds=``; GLM mode only.
   * - ``None`` / ``'e'``
     - **E (sensitivity)**
     - Skipped
     - ``confounds + site dummies``
     - Calibrated-by-construction reference; pair with D.
   * - ``'auto'`` (default)
     - dispatcher
     - D if ``sites + confounds``; E if only ``sites``; none otherwise
     - (whichever D / E uses)
     - When the call shape unambiguously implies the recipe. Prefer explicit
       ``'nuisance_only'`` / ``None`` in paper scripts.

.. code-block:: python

   # Primary + sensitivity pair: run both, report both.
   common = dict(Y=Y, interest=diagnosis, confounds=confounds, sites=site,
                 method='tfnbs', e=0.4, h=3.0, n=10, rng=42)

   out_D = analyze(**common, harmonize='nuisance_only')   # Strategy D
   out_E = analyze(**common, harmonize=None)              # Strategy E

.. note::

   **Two-sample mode** combined with ``sites=`` has no defensible ComBat
   recipe (there is no interest column to preserve). ``analyze()`` skips
   ComBat and emits a flag recommending promotion of the analysis to a GLM
   with a binary ``interest`` indicator, which is the appropriate form for
   any multi-site group comparison.

.. _wf-paired:

Paired / repeated conditions
----------------------------

Question: *does connectivity change between condition A and condition B in
the same subjects?* Pass both conditions as ``group1`` / ``group2``
(row-aligned: ``group1[s]`` and ``group2[s]`` are the same subject) with
``test_type='paired'``. With no condition-varying confounds this uses the
**sign-flip** permutation null, the exact non-asymptotic test.

.. code-block:: python

   from conninfpy import analyze

   out = analyze(
       group1=rest_corr,            # condition A, (n, N, N)
       group2=task_corr,            # condition B, same subjects
       test_type='paired',
       method='tfnbs',
       e=0.4, h=3.0, n=10,
       rng=42,
   )

   p_task_higher = out.inference['positive']   # task > rest
   p_rest_higher = out.inference['negative']   # rest > task

.. _wf-repeated-glm:

Repeated-measures GLM (condition-varying confounds)
---------------------------------------------------

When a confound *differs between the two conditions for the same subject*
(e.g. condition-level head motion, arousal, or reaction time), pass it
through ``confounds_group1`` / ``confounds_group2``. ``analyze()`` then
routes to the **paired-difference GLM**: it forms
:math:`\Delta_Y = \mathrm{group2} - \mathrm{group1}` and the per-subject
confound difference, and tests the difference intercept with Freedman–Lane
permutation (via :func:`conninfpy.compute_p_val_paired_glm`).

.. code-block:: python

   from conninfpy import analyze

   out = analyze(
       group1=rest_corr,            # condition A, (n, N, N)
       group2=task_corr,            # condition B, same subjects
       test_type='paired',
       confounds_group1=fd_rest,    # motion during condition A
       confounds_group2=fd_task,    # motion during condition B
       method='tfnbs',
       e=0.4, h=3.0, n=10,
       n_permutations=1000,
       acceleration=None,
       rng=42,
   )

   p_task_higher = out.inference['positive']   # task > rest, motion-adjusted
   p_rest_higher = out.inference['negative']

Notes:

- **Orientation is identical** to the no-confound paired path:
  ``positive = group2 > group1``. (Internally the conditions are passed
  swapped so the tested intercept of ``Δ = group2 − group1`` keeps that
  sign.)
- Pass **both** ``confounds_group1`` and ``confounds_group2`` or neither.
  They are only valid with ``test_type='paired'``. Use ``confounds=`` (no
  suffix) only for the between-subject GLM path; passing it alongside
  ``group1``/``group2`` raises an error.
- **Subject-constant nuisances cancel** in the within-subject difference and
  do not need to be supplied, including additive **site** effects. So
  ``sites=`` with a paired design skips ComBat (it is unnecessary) while
  still stratifying the permutation; ``analyze()`` notes this in
  ``out.flags``.
- **Power caveat.** The paired GLM tests the difference *intercept*. When a
  single edge carries a very strong effect it dominates the max-statistic
  FWER null, and the GLM path is then **less powerful** than the no-confound
  sign-flip path. This is inherent to the intercept permutation test, not to
  the ``analyze()`` implementation. When no condition-varying confounds are
  present, the plain paired path (:ref:`wf-paired`) is preferable.

.. _wf-custom:

Custom contrasts, interactions, and omnibus F-tests
---------------------------------------------------

For interactions, custom categorical coding, or joint (omnibus) tests, build
the design matrix and contrast explicitly and call
:func:`conninfpy.compute_p_val_glm` directly. See the :doc:`usage` page for
the advanced API, including ``stat_type='fstat'`` for multi-row contrasts
(≥3-condition designs or jointly testing several predictors), which returns a
single ``'omnibus'`` p-map.

.. _wf-eh-stability:

Inference stability: sweep ``(E, H)`` in one call
-------------------------------------------------

The TFNBS-family enhancement (``'tfnbs'``, ``'ni_tfnbs'``, ``'fbc_tfnbs'``)
depends on two exponents: the extent exponent ``E`` and the height exponent
``H``. Published defaults disagree (Hao 2024 ``E=0.4, H=3.0``; Smith–Nichols
2009 ``E=0.5, H=2.0``; Baggio 2018 ``E=0.75, H=3.0``), and Vinokur 2023
reports up to 75-fold variation in edge counts across the ``(E, H)`` plane. A
single ``(E, H)`` result is therefore of limited value on its own; a finding
should be shown to be **stable across the plausible parameter range**.

Pass **equal-length sequences** for ``e`` and ``h`` and the whole grid is
computed in **one permutation pass**: the threshold-integration loop runs
once and the per-cell exponentiation is broadcast at the end, so a K-cell
grid costs approximately the wall-clock of a single cell. Passing sequences
thus converts a point estimate into a stability assessment.

.. code-block:: python

   from conninfpy import analyze

   # Three published-default (E, H) cells, zipped pairwise (not a cross
   # product): (0.4, 3.0), (0.5, 2.0), (0.75, 3.0).
   e_grid = [0.4, 0.5, 0.75]        # Hao, Smith–Nichols, Baggio
   h_grid = [3.0, 2.0, 3.0]

   out = analyze(
       Y,
       interest=diagnosis,
       confounds=confounds,
       sites=site,
       harmonize='nuisance_only',
       method='tfnbs',
       e=e_grid, h=h_grid, n=10,
       n_permutations=5000,
       acceleration=None,
       rng=42,
   )

   r = out.inference
   r.is_grid                # True
   r.positive.shape         # (N, N, 3): one p-map per (E, H) cell
   r.e_grid                 # array([0.4, 0.5, 0.75])
   r.h_grid                 # array([3.0, 2.0, 3.0])
   r.n_significant(0.05)    # per-cell counts:
                            # {'positive': [k0, k1, k2], 'negative': [...]}

When the returned object is a grid (``is_grid == True``), the ``(N, N)``
exporters require a cell to project to. Pass ``param_idx=`` (or call
``.select()`` first); omitting it on a grid raises an explicit error rather
than selecting a cell implicitly:

.. code-block:: python

   sub = r.select(0)                                # fresh 2D result for cell 0
   df = r.significant_edges(atlas, param_idx=2)     # export cell 2 directly
   r.to_csv('edges_hao.csv', atlas=atlas, param_idx=0)

Use it to confirm a result is stable across the published-default cells
before reporting, or to run a denser sensitivity grid (e.g. a 6 × 6 sweep) at
the cost of a single inference call. The same ``e`` / ``h`` sequence syntax
works on every TFNBS-family path: two-sample, GLM, and the repeated-measures
GLM alike.

Interpreting and exporting results
----------------------------------

Every result is dict-like and carries metadata and exporters:

.. code-block:: python

   r = out.inference
   r.method, r.n_permutations, r.acceleration, r.harmonized
   r.n_significant(alpha=0.05)      # {'positive': k, 'negative': k}
   r.stat_signed                    # signed t / β effect map

   # ROI-aware edge table (needs an atlas)
   from conninfpy import AtlasInfo
   atlas = AtlasInfo.schaefer_200_yeo7()
   df = out.significant_edges(atlas, sort='network_pair', top_k=50)
   out.to_csv('edges.csv', atlas=atlas)

   # One-call publication figure
   from conninfpy.plot import summary_figure
   fig = summary_figure(out.inference, atlas=atlas, alpha=0.05, top_k=10)
   fig.savefig('summary.pdf', bbox_inches='tight')

Speed vs. publication runs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``analyze()`` defaults to ``n_permutations=200`` with
``acceleration='gpd'`` for fast exploration. For a final result use a larger
empirical reference:

.. code-block:: python

   # Exploration (≈25× faster, GPD tail approximation with empirical fallback)
   analyze(Y, interest=age, confounds=confounds,
           n_permutations=200, acceleration='gpd', rng=42)

   # Publication (exact finite-permutation reference)
   analyze(Y, interest=age, confounds=confounds,
           n_permutations=5000, acceleration=None, rng=42)

See :doc:`usage` for the full enhancement-method table and acceleration
internals. The ``(E, H)`` stability sweep (:ref:`wf-eh-stability`) and the
several-predictors workflow (:ref:`wf-multi-interest`, which wraps
:func:`conninfpy.compute_p_val_glm_multi`) are covered above.
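
When moving from an exploratory run to a publication run, it can help to
check how much the set of significant edges changes. The following is a
minimal sketch in plain NumPy and is not part of ``conninfpy``: the helper
names ``significant_mask`` and ``edge_overlap`` are hypothetical, the
synthetic p-maps stand in for two ``out.inference['positive']`` maps, and
thresholding with ``p < alpha`` is an assumption about how significance is
defined.

.. code-block:: python

   import numpy as np


   def significant_mask(p_map, alpha=0.05):
       """Boolean mask of upper-triangle edges with p < alpha (assumed rule)."""
       p = np.asarray(p_map, dtype=float)
       iu = np.triu_indices(p.shape[0], k=1)   # count each undirected edge once
       mask = np.zeros_like(p, dtype=bool)
       mask[iu] = p[iu] < alpha
       return mask


   def edge_overlap(p_a, p_b, alpha=0.05):
       """Summarise agreement between two (N, N) p-value maps."""
       a = significant_mask(p_a, alpha)
       b = significant_mask(p_b, alpha)
       n_both = int(np.sum(a & b))
       n_union = int(np.sum(a | b))
       # Jaccard index; defined as 1.0 when neither map has a significant edge.
       jaccard = n_both / n_union if n_union else 1.0
       return {'n_a': int(a.sum()), 'n_b': int(b.sum()),
               'n_both': n_both, 'jaccard': jaccard}


   # Synthetic stand-ins for an exploratory and a publication p-map.
   N = 6
   p_explore = np.ones((N, N))
   p_final = np.ones((N, N))
   p_explore[0, 1] = p_explore[1, 0] = 0.01
   p_explore[2, 3] = p_explore[3, 2] = 0.04
   p_final[0, 1] = p_final[1, 0] = 0.002
   p_final[2, 3] = p_final[3, 2] = 0.06    # no longer below alpha

   summary = edge_overlap(p_explore, p_final)
   print(summary)

A low overlap may indicate that the exploratory result depended on the small
permutation count or the tail approximation. The same comparison could be
applied between two cells of an ``(E, H)`` grid.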