Naming convention

Release 6.0 changes

For the 6.0 release, the complete ABCD tabulated data resource has been recurated using a standardized naming convention, which means that all variables in the tabulated dataset have been renamed. We acknowledge that this change will require adjustments to existing analysis pipelines and may introduce some friction. We nevertheless hope that the recuration effort will benefit all users of the ABCD tabulated data resource going forward: the new data curation standard resolves many inconsistencies that existed in previous releases and implements a clear structure that is easier to search and process across the whole dataset.

To ease the transition of existing workflows to the new naming standard, we provide utility functions as part of our NBDCtools R package. These functions allow users to convert column names in a data frame between the new and legacy variable names or replace variable names in text/script files, making it easier to adapt existing scripts and analyses to the new naming convention. If you use these utility functions, please always consider the other changes that were introduced in the 6.0 release, as the values for some variables might not be identical to previous releases.
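The two kinds of conversion described above (renaming columns and replacing names inside script text) can be illustrated with a minimal Python sketch. This is not the NBDCtools implementation or its API; the function names and the legacy names in the mapping (old_name_a, old_name_b) are hypothetical placeholders, and a real mapping would have to come from the official release materials.

```python
import re

# Hypothetical legacy -> new mapping, for illustration only.
LEGACY_TO_NEW = {
    "old_name_a": "dm_s_tab_001",
    "old_name_b": "dm_s_tab_002",
}

def rename_columns(columns, mapping):
    """Return the column names with every mapped name replaced."""
    return [mapping.get(column, column) for column in columns]

def replace_names_in_text(text, mapping):
    """Replace whole-word occurrences of mapped names in script text."""
    if not mapping:
        return text
    # Try longer names first so a shorter name never shadows a longer one.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])("
        + "|".join(re.escape(name) for name in names)
        + r")(?![A-Za-z0-9_])"
    )
    return pattern.sub(lambda match: mapping[match.group(1)], text)

print(rename_columns(["old_name_a", "age"], LEGACY_TO_NEW))
print(replace_names_in_text("mean(df$old_name_a)", LEGACY_TO_NEW))
```

Matching whole names only (rather than substrings) avoids rewriting a longer identifier that merely starts with a legacy name. Renaming alone does not account for value changes between releases, so results should still be checked against the 6.0 documentation.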

General design

dm_s_tab_item

Variable names consist of four main components that are separated by single underscores:

Domain
Source/recipient
Table
Item

The table and item components can have additional subcomponents that are separated using multiple underscores to indicate nesting within the four main components.
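Because the main components are separated by single underscores and nested subcomponents by runs of two or more, a name can be split mechanically. The following Python sketch shows one way to do this; the function name split_variable_name is hypothetical, and the sketch assumes that no main component contains a single underscore of its own.

```python
import re

# Matches an underscore that is not part of a longer run of underscores.
SINGLE_UNDERSCORE = re.compile(r"(?<!_)_(?!_)")

def split_variable_name(name):
    """Split a variable name into its four main components."""
    parts = SINGLE_UNDERSCORE.split(name)
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 main components, got {len(parts)}: {name!r}"
        )
    domain, source, table, item = parts
    return {"domain": domain, "source": source, "table": table, "item": item}

print(split_variable_name("ph_y_meds__otc_001"))
# {'domain': 'ph', 'source': 'y', 'table': 'meds__otc', 'item': '001'}
```

Double and triple underscores are left inside the table and item components, so nested keywords and suffixes stay attached to the component they belong to.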

Components

Domain

dm_s_tab_item

Domain: Keyword for the domain that the given variable belongs to. Domains in the core ABCD study have keywords with two letters while domains within the substudies (where “domain” refers to the substudy name) use more than two letters.

Domain glossary

Source

dm_s_tab_item

Source/recipient: Keyword (one letter) for the source / recipient type that provided the data for the given variable.

Source/recipient glossary

Table

dm_s_tab_item

Table: Name of the table/form the given variable is a part of.

dm_s_tab__kw_item

Keyword: Keyword for a subsection or group of questions within the table the given variable is a part of (e.g., ph_y_meds__otc_001 for questions related to over-the-counter medications, represented by the keyword otc).

dm_s_tab__kw__kw_item

Additional keywords: Whenever a table has more levels of nesting/grouping, one or more additional keywords are added (e.g. ph_y_bp__dia__r01_001 uses a second keyword, r01, to differentiate the first round of diastolic blood pressure readings, represented by the keyword dia, from later rounds of readings).
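As an illustration of the nesting described above, a table component can be separated into the table name and its keywords by splitting on double underscores. This is a minimal Python sketch; the function name split_table_component is hypothetical, and the input is assumed to be the table component only, not the full variable name.

```python
def split_table_component(table):
    """Split a table component into the table name and its nested keywords."""
    name, *keywords = table.split("__")
    return name, keywords

print(split_table_component("meds__otc"))     # ('meds', ['otc'])
print(split_table_component("bp__dia__r01"))  # ('bp', ['dia', 'r01'])
```

The keywords are returned in nesting order, from the outermost group to the innermost.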

Table Keyword Glossary

A filterable keyword glossary for the ‘table’ component of all variables

Item

dm_s_tab_item

Item: A three-digit, zero-padded number, e.g. 001, is used for all variables with the variable type “item”, i.e., typically individual questions in a questionnaire/table distinct from “administrative” variables or “summary scores” (see below).

dm_s_tab_admin

dm_s_tab_score

Administrative variables & summary scores: Administrative variables (e.g., language or date of administration) and summary scores (e.g., sums or means of individual items in a table) are marked by letters (e.g., dtt, lang, mean, pc (principal component), etc.) instead of the three-digit number used for variables of variable type “item” (see above).
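The distinction between numbered items and letter-based administrative variables or summary scores can be checked from the item component alone. The Python sketch below is one possible check; the function name item_kind and the returned labels are hypothetical. Telling administrative variables apart from summary scores is not possible from the name pattern alone and would require the glossary.

```python
import re

def item_kind(item):
    """Classify an item component by whether it starts with a three-digit number."""
    # Exactly three leading digits mark a variable of type "item".
    if re.match(r"\d{3}(?!\d)", item):
        return "item"
    # Letter-based components are administrative variables or summary scores.
    return "admin_or_score"

print(item_kind("001"))   # item
print(item_kind("lang"))  # admin_or_score
```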

dm_s_tab_item__subitem

Subitem: A two-digit, zero-padded number, e.g., 01, is used to indicate a subitem’s relationship to the main item. This is used to indicate items that are dependent on previous questions through branching logic or to indicate another direct relationship between two questions (e.g., ab_p_demo__empl__prtnr_001, “Does your partner work?”, has the follow-up question ab_p_demo__empl__prtnr_001__01, “Full or part-time?”; 001__01 is only presented if 001 is endorsed). Sometimes, variables have more than two levels of dependencies, in which case more than one level of subitems is used, e.g., 001__01__01.
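Since a subitem is written as the parent name plus a double underscore and a two-digit number, the parent of a subitem can be derived by removing that last suffix. The following Python sketch illustrates this; the function name parent_of_subitem is hypothetical, and the sketch assumes the subitem number is the final suffix of the name.

```python
import re

# A trailing two-digit subitem suffix preceded by exactly two underscores.
SUBITEM_SUFFIX = re.compile(r"(?<!_)__\d{2}$")

def parent_of_subitem(name):
    """Return the name of the item a subitem depends on, or None."""
    if SUBITEM_SUFFIX.search(name) is None:
        return None
    return SUBITEM_SUFFIX.sub("", name)

print(parent_of_subitem("ab_p_demo__empl__prtnr_001__01"))
# ab_p_demo__empl__prtnr_001
```

Applying the function repeatedly walks up through deeper levels, so a name ending in 001__01__01 first yields the name ending in 001__01 and then the one ending in 001.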

dm_s_tab_itema

Component: Indicator used to mark questions that have multiple components or to indicate that two questions are the inverse of each other (e.g., “When did the effects begin?”, 001a, and “When did the effects end?”, 001b).

dm_s_tab_item__v01

Version: Indicator used to mark a new version of the same question/variable. Generally, questions with the same label are collapsed under one variable, even if they were collected under different variable names. The version indicator is only used in cases where a question has been replaced with a question that is very similar but differs enough that it needs to be distinguished from the original question (e.g., another version of the education variable was added to include additional response options after baseline).

dm_s_tab_item__subitem__v1

Subitem version: Indicator for a new, substantially different, version of a subitem question/variable.

dm_s_tab_item__l

Longitudinal marker: Indicator used for questions that replicate an earlier question but have been slightly altered to account for the fact that the question is being asked at a follow-up visit. Typically, this indicator is used in cases where the first time a question was asked, it referred to the lifetime up to that point, e.g., “Have you ever done X?”, while the version of the question asked at later visits refers to the time since the last time the question was asked, e.g., “Since we last saw you, have you done X?”.

dm_s_tab_item__tag

Tags: Tags are additional keywords appended to variable names to provide additional context or categorization. Variables may include one or more tags, separated by double underscores (e.g., the tag __dk indicates a “don’t know” response, and __rmt indicates a remote visit question, as in the variable mh_y_cb_dev__rmt).
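The double-underscore suffixes described in this section (subitem numbers, version indicators, the longitudinal marker, and tags) can be told apart by their shape. The Python sketch below labels each suffix of an item component. The function name classify_suffixes and the label strings are hypothetical, and the rules are inferred from the patterns shown here: two digits for a subitem, v followed by digits for a version, l for the longitudinal marker, and anything else treated as a tag. Whether every suffix in the dataset fits these rules is an assumption; the glossary remains the authoritative source.

```python
import re

def classify_suffixes(item):
    """Label each double-underscore suffix of an item component."""
    # Drop a multi-select option suffix (triple underscore) if present.
    item = item.split("___")[0]
    base, *suffixes = item.split("__")
    labels = []
    for suffix in suffixes:
        if re.fullmatch(r"\d{2}", suffix):
            kind = "subitem"
        elif re.fullmatch(r"v\d+", suffix):
            kind = "version"
        elif suffix == "l":
            kind = "longitudinal"
        else:
            kind = "tag"
        labels.append((suffix, kind))
    return base, labels

print(classify_suffixes("001__01__v01"))
# ('001', [('01', 'subitem'), ('v01', 'version')])
print(classify_suffixes("001__dk"))
# ('001', [('dk', 'tag')])
```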

dm_s_tab_item___1

Multi-select response options: Some variable names include triple underscores followed by a number. For example, in a question like “Which animals do you like? (check all that apply)”, variable names might include ___1 for “cats”, ___2 for “dogs”, and ___3 for “fish”. If a participant selects multiple options, each corresponding variable (e.g., dm_s_tab_item___1, dm_s_tab_item___3) will be marked to indicate the selected responses.
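To illustrate how the triple-underscore columns relate to one question, the Python sketch below collects the selected option numbers per question from a single row of data. The function name selected_options is hypothetical, the row is made-up example data using the placeholder pattern from this page, and the sketch assumes that a selected option is stored as the value 1, which is not stated here and should be checked against the data dictionary.

```python
import re
from collections import defaultdict

# Base variable name, triple underscore, then the option number.
MULTI_SELECT = re.compile(r"^(?P<base>.+?)___(?P<option>\d+)$")

def selected_options(row):
    """Map each multi-select base variable to the option numbers selected in row."""
    selected = defaultdict(list)
    for column, value in row.items():
        match = MULTI_SELECT.match(column)
        # Assumption: the value 1 marks a selected option.
        if match and value == 1:
            selected[match["base"]].append(int(match["option"]))
    return {base: sorted(options) for base, options in selected.items()}

row = {
    "dm_s_tab_001___1": 1,
    "dm_s_tab_001___2": 0,
    "dm_s_tab_001___3": 1,
    "dm_s_tab_002": 4,
}
print(selected_options(row))  # {'dm_s_tab_001': [1, 3]}
```

Questions for which no option is selected do not appear in the result, and columns without a triple-underscore suffix are ignored.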

Item keyword glossary

A filterable keyword glossary for the ‘item’ component of all variables

Glossary

Below you can find a searchable/filterable table with the complete glossary containing all keywords used in the ABCD naming convention or download it as a .csv file.

Complete ABCD glossary

A searchable and filterable table of the complete ABCD keyword glossary