About the ABCC
Introduction
The ABCD-BIDS Community Collection (ABCC) provides processed, analysis-ready neuroimaging derivatives from the ABCD Study, updated alongside central ABCD data releases. The collection leverages state-of-the-art, community-standard pipelines, including fMRIPrep, XCP-D, and QSIPrep/QSIRecon, to generate harmonized structural, functional, and diffusion MRI derivatives, along with downstream outputs for connectivity and advanced analyses.
ABCC was originally released as an independent resource on the NIMH Data Archive (NDA Collection 3165) to provide curated derivatives generated by community-developed neuroimaging pipelines and analytical tools. Its goals were to improve reproducibility and methodological standardization, and to reduce barriers introduced by the enormous size of the data (e.g., the computing resources required for processing), in support of child and adolescent mental health research (Feczko et al., 2021).
ABCC has been tremendously successful and was originally developed by ABCD investigators and staff, so it has since been incorporated into official ABCD Study data releases. Although it is no longer distributed as a separate “community collection,” the name has been retained to preserve continuity with prior releases, documentation, and published literature. Future community collections will be accepted via the NBDC Data Hub (feature coming soon).
To date, ABCC has supported over 100 publications spanning brain development, cognition, mental health, and methodological innovation. A full list of publications and formatted citation files are available here. Highlights include:
- Meisler et al. (2026): Highly replicable multisite patterns of adolescent white matter maturation
- Marek et al. (2019): Identifying reproducible individual differences in childhood functional brain networks: An ABCD study (Developmental Cognitive Neuroscience)
- Chaarani et al. (2021): Task activation patterns in 9-10 year old youths
- Cieslak et al. (2021): QSIPrep: an integrative platform for preprocessing and reconstructing diffusion MRI data (Nature Methods)
- Bethlehem et al. (2022): Brain Charts for the human lifespan
- Marek et al. (2022): Reproducible brain behavior associations require thousands of samples
- Gordon et al. (2023): Identification of the somato-cognitive action network (SCAN), revolutionizing our understanding of the human motor system
- Keller et al. (2023): Functional topography is associated with youth cognition
- Hermosillo et al. (2024): Individualized functional network mapping in adolescents (Nature Neuroscience)
- Keller et al. (2024): Environmental exposures mediate the association between functional topography and cognition
- Mehta et al. (2024): XCP-D: an extensible pipeline for rs-fMRI connectivity preprocessing
Key Features
- Data are compliant with the Brain Imaging Data Structure (BIDS) standard for reproducible, cross-study analyses.
- Data are processed using state-of-the-art, publicly available pipelines developed within the NMIND framework for reproducible neuroimaging software (see details).
- All ABCC release data have passed DAIRC raw MR data quality control (QC). In addition, QC is performed post-processing via BrainSwipes, a community-driven visual QC platform (to be available in a future release).
- Each release includes detailed version tracking and change logs.
Processing & Data Standards
Processing pipelines used to generate the ABCC derivatives follow community standards for reproducible neuroimaging software laid out by the NMIND consortium (Kiar et al., 2023). Pipelines must be publicly available, containerized, and published with a DOI. Pipelines are peer-reviewed via the NMIND Coding Standards Checklist to ensure they meet community-driven scientific software standards for documentation, infrastructure, and testability. This process assigns badge ratings to reviewed tools, which are published on the NMIND website under Evaluated Tools.
Release Notes (v4.0.0)
Core Data
The ABCC includes data processed through the following pipelines:
- Structural & Functional MRI [NEW]
  - fMRIPrep v25.1.4: Minimal functional pre-processing
  - XCP-D v0.13.0: Functional post-processing, including denoising & connectivity
  - ReproTM: Individualized functional network maps via template matching
- Diffusion MRI
- ModelArrayIO [NEW]
  - ModelArrayIO outputs (HDF/CSV) for efficient voxel- and/or vertex-wise statistical modeling with the ModelArray R package, including:
    - QSIRecon-ModelArray multi-model diffusion metrics
    - XCP-D-ModelArray connectivity, ALFF, ReHo, and morphometrics
ABCC no longer includes ABCD-HCP BIDS derivatives: see release notes for details.
This minimal release excludes preprocessed BOLD timeseries, confounds, and surface-projected functional data. Users who require full outputs can follow the instructions provided under How to Generate Full Derivatives.
Participant Counts
| Year | fMRIPrep | XCP-D | QSIPrep | ReproTM |
|---|---|---|---|---|
| Baseline | 11,685 | 10,396 | 9,163 | 6,035 |
| 2 | 8,043 | 7,654 | 7,276 | 5,767 |
| 4 | 6,303 | 6,147 | 6,063 | 5,380 |
| 6 | 3,797 | 3,756 | 3,675 | 3,492 |
Available raw BIDS counts are shown below for each pipeline by study year. Differences reflect modality requirements (sMRI, sMRI+fMRI, or dMRI). Overall processing success rates were high across pipelines (96.6–99.8%).
| Year | fMRIPrep (sMRI present) | XCP-D (sMRI+fMRI present) | QSIPrep (dMRI present) |
|---|---|---|---|
| Baseline | 11,706 | 10,416 | 9,561 |
| 2 | 8,061 | 7,672 | 7,666 |
| 4 | 6,325 | 6,166 | 6,203 |
| 6 | 3,806 | 3,765 | 3,747 |
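The relationship between the processed counts and the available raw BIDS counts can be checked with simple arithmetic. The sketch below is illustrative only: it divides each processed count by the matching raw count, per study year and pooled across years. The dictionary layout and the `success_rates` helper are assumptions made for this example, and ReproTM is left out because no raw count is listed for it. How the quoted success-rate range was aggregated is not stated here, so figures computed this way may not match it exactly.

```python
# Counts transcribed from the two participant-count tables
# (processed derivatives vs. available raw BIDS data).
YEARS = ["Baseline", "2", "4", "6"]

processed = {
    "fMRIPrep": dict(zip(YEARS, [11685, 8043, 6303, 3797])),
    "XCP-D": dict(zip(YEARS, [10396, 7654, 6147, 3756])),
    "QSIPrep": dict(zip(YEARS, [9163, 7276, 6063, 3675])),
}
available = {
    "fMRIPrep": dict(zip(YEARS, [11706, 8061, 6325, 3806])),
    "XCP-D": dict(zip(YEARS, [10416, 7672, 6166, 3765])),
    "QSIPrep": dict(zip(YEARS, [9561, 7666, 6203, 3747])),
}


def success_rates(processed, available):
    """Return {pipeline: {year: percent}} with an extra pooled 'All years' entry."""
    rates = {}
    for pipeline, by_year in processed.items():
        raw = available[pipeline]
        # Per-year rate: processed sessions as a percentage of raw sessions.
        rates[pipeline] = {year: 100 * n / raw[year] for year, n in by_year.items()}
        # Pooled rate: totals across all study years.
        rates[pipeline]["All years"] = 100 * sum(by_year.values()) / sum(raw.values())
    return rates


rates = success_rates(processed, available)
for pipeline, by_year in rates.items():
    print(pipeline, {year: round(pct, 1) for year, pct in by_year.items()})
```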
Demographically Matched Subsets
The matched_group field of the participants.tsv file signifies each participant’s assignment to one of the ABCD Reproducible Matched Samples (ARMS). In Release 2.0, ABCD data were split into 3 demographically matched groups: ARMS-1 (N=5,786) and ARMS-2 (N=5,786) for use as independent datasets, and ARMS-3 (N=305) for template building and model testing. These group assignments have been carried over into the current release, with the following updated counts: ARMS-1 (N=5,752), ARMS-2 (N=5,745), and ARMS-3 (N=301).
To create the matched ARMS, 9 salient sociodemographic factors were selected based on their relevance to developmental outcomes: site, age, sex, ethnicity, grade, highest level of parental education, handedness, combined family income, and exposure to anesthesia. Family structure was also accounted for (Marek et al., 2019). Anesthesia exposure was included as a matching variable to account for differences in major medical interventions and their possible effects on behavioral and neurodevelopmental outcomes (Schneuer et al., 2018). To maximize the relative independence of the two datasets, family members were kept together in the same ARM, and the groups were matched to have equivalent numbers of sibling and twin pairs, and triplets.
Comparison of the counts and means for each of these factors shows that ARMS-1 and ARMS-2 differ only negligibly, and the differences are not statistically significant. Gender shows the largest absolute difference of 2.5%; no other demographic variables differ by more than 1%. See Feczko et al. (2021) for a full description of how these matched groups were generated. The code used is available at: https://github.com/DCAN-Labs/automated-subset-analysis.
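As a rough illustration of how the `matched_group` field might be used, the sketch below splits participants into groups by that column. It assumes the standard BIDS `participant_id` column is present. The group labels (`1`, `2`, `3`) and the participant IDs are made up for the example, since the actual encoding of `matched_group` is not described here. The `split_by_matched_group` helper is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Tiny stand-in for participants.tsv; IDs and matched_group labels are invented.
EXAMPLE_TSV = (
    "participant_id\tmatched_group\n"
    "sub-0001\t1\n"
    "sub-0002\t2\n"
    "sub-0003\t1\n"
    "sub-0004\t3\n"
)


def split_by_matched_group(tsv_file):
    """Map each matched_group label to the list of participant IDs assigned to it."""
    groups = defaultdict(list)
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        groups[row["matched_group"]].append(row["participant_id"])
    return dict(groups)


groups = split_by_matched_group(io.StringIO(EXAMPLE_TSV))
for label, ids in sorted(groups.items()):
    print(f"matched_group={label}: {len(ids)} participant(s)")
```

With a real dataset, the same function could be called on an open file handle, for example `open("participants.tsv", newline="")`, after checking which labels the column actually contains.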
Key Revisions
Starting with Release 7.0, ABCC distributions include the following updates. Documentation for prior releases remains available via the Version dropdown menu located in the upper right-hand corner of the ABCD Data Documentation site.
Raw BIDS Consolidation
In Releases 6.0–6.1, BIDS raw data were distributed separately under dairc/ and abcc/. As of Release 7.0, all raw BIDS data are consolidated under a single dairc/ collection.
Legacy documentation: Raw BIDS
ABCD-HCP Pipeline Deprecation
Beginning with Release 7.0, ABCD-HCP pipeline derivatives are no longer included in ABCC releases. Moving forward, ABCC will focus on fMRIPrep- and XCP-D–based derivatives for structural and functional MRI to align with current community standards. Legacy ABCD-HCP outputs remain available in Release 6.1 via the NBDC Data Hub, including:
- ABCD-HCP BIDS v0.1.4: HCP-style MRI processing pipeline in volume & surface space
- FreeSurfer 5.3.0-HCP: Segmentation statistics & surface morphometrics
Associated documentation is available in the 6.1.3 legacy documentation: see Processing and Derivatives.
Known Issues
[1] Connectivity matrices from individual runs
Connectivity matrices generated using `--create-matrices 300 600 all` were incorrectly computed separately for each BOLD run rather than from the concatenated low-motion timeseries across runs. As a result:
- For concatenated data, only the matrix generated from the full timeseries (`all`) is available
- Run-specific 300-frame matrices may be present for runs with a sufficient amount of low-motion data, in addition to the matrix generated from `all`
The next release (processed through XCP-D v26.0.3 or later) will include the 300- and 600-frame matrices generated from concatenated low-motion timeseries across runs, and only a single matrix will be generated per run from all available low-motion frames.
[2] Timeseries not z-scored prior to concatenation
Timeseries data were not z-scored before concatenation. This issue will be addressed in a future release using XCP-D v26.0.3.
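For readers who want to apply this normalization themselves, the sketch below shows the general idea of z-scoring each run over time before concatenating runs. It is not XCP-D's implementation. The array layout (frames × regions), the synthetic data, and the helper names `zscore_run` and `concatenate_zscored` are assumptions made for illustration.

```python
import numpy as np


def zscore_run(run):
    """Z-score each column (region or vertex) of a frames x features array over time."""
    mean = run.mean(axis=0, keepdims=True)
    std = run.std(axis=0, keepdims=True)
    std[std == 0] = 1.0  # constant columns become zeros instead of dividing by zero
    return (run - mean) / std


def concatenate_zscored(runs):
    """Z-score every run independently, then stack the runs along the time axis."""
    return np.concatenate([zscore_run(run) for run in runs], axis=0)


# Two synthetic runs with different baselines and scales.
rng = np.random.default_rng(0)
run1 = rng.normal(loc=100.0, scale=5.0, size=(50, 4))
run2 = rng.normal(loc=300.0, scale=20.0, size=(80, 4))

combined = concatenate_zscored([run1, run2])
print(combined.shape)  # (130, 4)
```

Normalizing each run separately keeps differences in baseline or scale between runs from dominating the variance of the concatenated timeseries.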
[3] Missing cases from duplicate run reprocessing
Four cases failed due to out-of-memory errors during reprocessing of duplicate functional runs and are excluded from this release. Derivatives for these cases remain available in prior releases. (View affected cases)
Coming Soon
- Imaging derivatives for remaining subjects: fMRIPrep v25.1.4 minimal preprocessing outputs, along with confound estimates for all processed subjects, and XCP-D v26.0.3 post-processing outputs
- ReproTM individualized functional network maps for all functional tasks (MID, SST, nBack)
- Task fMRI analysis results
- BrainSwipes Quality Control: structural/functional QC of XCP-D outputs generated via BrainSwipes, a gamified crowdsourcing platform for high-volume manual QC
BrainSwipes is a community-driven effort and we encourage all ABCC users to participate! No prior experience with visual QC is required. To get started, create a free account on BrainSwipes. You will then be guided through a simple tutorial that demonstrates how to evaluate derivative images and classify them as pass or fail.
Release History
| Version | Release Date | Release Notes |
|---|---|---|
| 3.1.0 | 2025-12-03 | View Release Notes |
| 3.0.0 | 2025-06-26 | View Release Notes |
ABCC releases 1.0–2.0 were distributed through the NIMH Data Archive (NDA). Starting with Release 3.0.0 (2025), ABCC data transitioned to the NBDC Data Hub and release notes no longer reflect NDA repository revisions. Release notes and documentation for NDA-based releases are available in the ABCC Archival Data Release Documentation.
