Understanding Metadata in Flywheel

Metadata in Flywheel is information about your data. While your imaging files contain the actual scan data, metadata describes those files: who the subject is, when the scan was taken, what type of scan it is, and any custom information you want to track.

Understanding metadata is fundamental to using Flywheel effectively because metadata powers nearly every feature: search, filtering, data views, BIDS curation, automated processing with gear rules, and data export.

What you'll learn:

What metadata is and why it matters
Types of metadata in Flywheel
Where metadata comes from
How metadata is structured
When to use each metadata type

What is Metadata?

Metadata is structured information attached to containers (projects, subjects, sessions, acquisitions) and files in Flywheel. Think of it as labels, properties, and annotations that make your data discoverable, organized, and meaningful.

Example:

For a brain MRI scan file:

File itself: The actual DICOM or NIfTI imaging data (multiple megabytes)
Metadata about the file:
- Subject ID: sub-001
- Session date: 2024-03-15
- Scan type: T1-weighted
- Scanner: Siemens Prisma 3T
- Custom field: scan_quality: excellent

You use the metadata to find, filter, organize, and process the file without ever opening the imaging data itself.

Why Metadata Matters

Metadata enables essential Flywheel workflows:

Search and Discovery

Find all T1 scans from a specific time period
Locate sessions marked with a particular quality rating
Filter subjects by demographic information

Automated Processing

Gear rules trigger based on metadata conditions (e.g., "run fMRIPrep on all sessions with task-fMRI data")
Processing pipelines read metadata to determine appropriate parameters

Data Organization

BIDS curation maps DICOM metadata to standardized BIDS fields
Export tools preserve metadata for reproducibility
Data views create tabular datasets from metadata fields

Quality Control and Collaboration

Notes document issues or observations about data
Tags mark data for specific purposes (cohorts, quality flags, processing stages)
Custom fields track study-specific information

Types of Metadata in Flywheel

Flywheel organizes metadata into several categories, each serving different purposes.

1. Standard Container Fields

Every container (project, subject, session, acquisition) and every file has standard metadata fields built into Flywheel:

All Containers:

id - Unique database identifier
label - Human-readable name
created - Timestamp when container was created
modified - Timestamp of last modification
tags - User-defined labels for filtering and organization
notes - Free-text annotations with timestamps and authors
info - Custom metadata object (key-value pairs)

Subject-Specific:

code - Subject identifier (often same as label)
sex - Biological sex
cohort - Study cohort assignment
species - Subject species (human, mouse, etc.)
strain - Animal strain (for non-human subjects)
race, ethnicity - Demographic information
mlset - Machine learning dataset assignment (Training, Validation, Test)

Session-Specific:

timestamp - Session date/time
age - Subject age at time of session
weight - Subject weight at time of session
operator - Person who performed the session
timezone - Timezone for session timestamp

Acquisition-Specific:

timestamp - Acquisition date/time
timezone - Timezone for acquisition timestamp

File-Specific:

name - Filename
size - File size in bytes
type - File type (dicom, nifti, bids, etc.)
mimetype - MIME type (application/zip, application/json, etc.)
classification - Structured classification (Intent, Measurement, Features, etc.)
modality - Imaging modality extracted from file
hash - File integrity checksum

See complete field reference in Data Views documentation

2. Custom Metadata (`.info` Object)

Custom metadata allows you to add study-specific fields to any container. These fields are stored in the info object as key-value pairs.

Structure:

{
  "info": {
    "scan_quality": "excellent",
    "technician_notes": "Subject reported anxiety",
    "study_phase": "baseline",
    "cohort_number": 2
  }
}

Supported Data Types:

String: Text values ("baseline", "excellent")
Number: Numeric values (2, 3.14, -5)
Boolean: True/false values (true, false)
Array: Ordered lists of values (["baseline", "followup"])
Object: Nested key-value structures ({"visit": 1, "complete": true})

Best Practices:

Use consistent field names across your site (e.g., always scan_quality, not sometimes quality or scanQuality)
Don't mix data types for the same field name
Avoid highly variable values as field names (use timestamps as values, not field names)
Keep within system limits: 15,000 unique fields per site, 8KB per container
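
The "don't mix data types" practice can be checked before metadata is written. The following is a minimal sketch, not a Flywheel API: it assumes you have already collected the info objects as plain Python dictionaries, and the function names find_mixed_type_fields and json_type are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict


def json_type(value):
    """Map a Python value to the JSON type name it serializes to."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, (list, tuple)):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "null" if value is None else type(value).__name__


def find_mixed_type_fields(info_objects):
    """Return {field_name: sorted type names} for every top-level field
    whose values use more than one JSON type across the info objects."""
    seen = defaultdict(set)
    for info in info_objects:
        for name, value in info.items():
            seen[name].add(json_type(value))
    return {name: sorted(types) for name, types in seen.items() if len(types) > 1}
```

For example, passing two info objects where scan_quality is "excellent" in one and 3 in the other returns {"scan_quality": ["number", "string"]}, which flags the field for cleanup.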

Learn how to add custom metadata

3. DICOM Metadata

When DICOM files are uploaded to Flywheel, you can extract metadata from DICOM headers by running the File Metadata Importer gear. This gear reads DICOM headers and stores the metadata in the file's info object under file.info.header.dicom.

Common DICOM Fields:

{
  "info": {
    "header": {
      "dicom": {
        "StudyDescription": "Brain MRI Protocol",
        "SeriesDescription": "T1_MPRAGE_SAG",
        "PatientID": "12345",
        "StudyDate": "20240315",
        "Modality": "MR",
        "Manufacturer": "SIEMENS",
        "MagneticFieldStrength": 3,
        "RepetitionTime": 2300,
        "EchoTime": 2.98
      }
    }
  }
}

How It Works:

1. User uploads DICOM files to Flywheel.
2. Container hierarchy is created based on DICOM metadata during upload:
   - PatientID → Subject label
   - StudyDescription or StudyInstanceUID → Session label
   - SeriesDescription → Acquisition label
3. Run the File Metadata Importer gear (manually or via gear rule) to extract DICOM header metadata.
4. Metadata is stored in the file.info.header.dicom object.
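
The tag-to-label mapping in the steps above can be sketched in a few lines of Python. This is an illustration only, not Flywheel's actual upload logic: the function name derive_labels is hypothetical, the header is assumed to be a plain dictionary of DICOM tag names, and the choice to prefer StudyDescription and fall back to StudyInstanceUID is an assumption, since the text only says one or the other is used.

```python
def derive_labels(header):
    """Derive subject, session, and acquisition labels from basic DICOM tags.

    `header` is a plain dict of DICOM tag names to values.
    """
    return {
        "subject": header.get("PatientID"),
        # Assumption: StudyDescription is preferred; StudyInstanceUID is the
        # fallback when the description is missing or empty.
        "session": header.get("StudyDescription") or header.get("StudyInstanceUID"),
        "acquisition": header.get("SeriesDescription"),
    }
```

A header with PatientID "12345", StudyDescription "Brain MRI Protocol", and SeriesDescription "T1_MPRAGE_SAG" would yield those three values as the subject, session, and acquisition labels.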

Important Notes:

DICOM header extraction requires running the File Metadata Importer gear
Set up gear rules to run the gear automatically on new DICOM uploads
Extracted metadata is searchable just like custom metadata
De-identification can remove or modify DICOM fields before storage
DICOM metadata is preserved when exporting data

Learn more about DICOM import

4. Tags

Tags are simple labels you apply to containers for filtering, organization, and workflow management.

Characteristics:

Must be created at the group level before use
Can be applied to subjects, sessions, acquisitions, and files
Case-insensitive (CONTROL and control are the same tag)
Searchable and usable in filters

Common Use Cases:

Cohort identification: Tag subjects as control, treatment, pilot
Quality control: Tag sessions as good, motion_artifacts, exclude
Processing stages: Tag sessions as raw, preprocessed, analyzed
Study phases: Tag data as baseline, intervention, followup

Example:

{
  "tags": ["cohort-1", "baseline", "qc-pass"]
}
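
Because tags are case-insensitive, any script that checks tags on exported metadata should compare them without regard to case. A minimal sketch, assuming the container is a plain dictionary shaped like the example above; has_tag is a hypothetical helper, not part of any Flywheel tool.

```python
def has_tag(container, tag):
    """Return True if the container carries the tag, ignoring case."""
    wanted = tag.casefold()
    return any(existing.casefold() == wanted for existing in container.get("tags", []))
```

With the example container above, has_tag(container, "QC-PASS") and has_tag(container, "qc-pass") both return True.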

Learn how to create and use tags

5. Notes

Notes are free-text annotations with timestamps and authorship information. They're useful for documenting observations, issues, or decisions.

Structure:

{
  "notes": [
    {
      "text": "Subject reported feeling claustrophobic during scan. Scan completed but may have motion artifacts.",
      "user": "jane.doe@institution.edu",
      "created": "2024-03-15T14:32:00Z",
      "modified": "2024-03-15T14:32:00Z"
    }
  ]
}

Best For:

Quality control observations
Communication between team members
Documenting unusual circumstances
Tracking decisions about data handling

Learn how to add notes

6. File Classification

File classification is structured metadata that categorizes files by their purpose, measurement type, and features.

Classification Dimensions:

Intent: Purpose of the file (Structural, Functional, Diffusion, etc.)
Measurement: What was measured (T1, T2, BOLD, FA, etc.)
Features: Special characteristics (Derived, Motion Corrected, Aligned, etc.)
Custom: Site-specific classifications

Example:

{
  "classification": {
    "Intent": ["Structural"],
    "Measurement": ["T1"],
    "Features": ["Brain Extracted"]
  }
}

How It Works:

Run File Classifier gear to assign classifications
The gear uses DICOM metadata (from File Metadata Importer) to determine appropriate classifications
Classifications can be manually adjusted or corrected in the web UI
Set up gear rules to automatically classify new uploads
Used by gear rules to trigger appropriate processing
Essential for BIDS curation
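
Since each classification dimension holds a list of values, selecting files by classification means testing list membership per dimension. The sketch below assumes the classification is available as a plain dictionary like the example above; classification_matches is a hypothetical helper name.

```python
def classification_matches(classification, **required):
    """Return True if every required dimension contains the required value.

    Example: classification_matches(c, Intent="Structural", Measurement="T1")
    """
    return all(
        value in classification.get(dimension, [])
        for dimension, value in required.items()
    )
```

Dimensions that are not named in the call are ignored, so a file classified as Structural / T1 / Brain Extracted still matches a request for Intent="Structural" alone.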

Learn more about file classification

Where Metadata Comes From

Metadata enters Flywheel through several sources:

Automatic Extraction

During DICOM Upload:

Container hierarchy is created from basic DICOM tags (PatientID, StudyDescription, SeriesDescription)
Full DICOM header extraction requires running the File Metadata Importer gear
File classification requires running the File Classifier gear
Both gears can be automated with gear rules

During BIDS Upload:

BIDS sidecar JSON files are read automatically
Metadata is imported into Flywheel containers
BIDS-specific fields are preserved

User Input

Through Web UI:

Add custom metadata fields
Create notes
Apply tags
Modify file classifications

Through CLI:

Use fw ingest commands with metadata flags
Bulk metadata import with templates

Through SDK:

Programmatically add/modify metadata
Batch operations for large datasets

Gear Processing

Analysis Gears:

Can read input file metadata
Can write output metadata
Can update container metadata

Utility Gears:

BIDS curation gear writes BIDS metadata
De-identification gear modifies metadata
Metadata extraction gears populate custom fields

Metadata Structure and Relationships

Understanding how metadata is organized helps you query and use it effectively.

Container Hierarchy

Metadata exists at each level of the Flywheel hierarchy:

Project
├── project.label
├── project.info
├── project.tags
├── project.notes
└── Subjects
    ├── subject.label
    ├── subject.info (custom fields)
    ├── subject.sex
    ├── subject.cohort
    ├── subject.tags
    └── Sessions
        ├── session.label
        ├── session.info (custom fields)
        ├── session.timestamp
        ├── session.age
        ├── session.tags
        ├── session.notes
        └── Acquisitions
            ├── acquisition.label
            ├── acquisition.info (custom fields)
            ├── acquisition.timestamp
            ├── acquisition.tags
            └── Files
                ├── file.name
                ├── file.type
                ├── file.classification
                ├── file.info (custom fields + DICOM header under header.dicom)
                └── file.tags

Parent Relationships

Every container knows its parents:

{
  "parents": {
    "group": "neuroscience",
    "project": "65b407f03a1b66c54af9c279",
    "subject": "66b245f8986c3db10e893e0e",
    "session": "66b245f8986c3db10e893e0f",
    "acquisition": "66b2460a986c3db10e893e29"
  }
}

This allows queries like "find all T1 scans for subjects in the control cohort" by combining metadata from multiple hierarchy levels.
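
A cross-level query of this kind amounts to following the parent IDs from a file up to its subject. The following sketch works on plain dictionaries rather than a live Flywheel site; files_for_cohort and the subjects_by_id lookup table are hypothetical names introduced for illustration.

```python
def files_for_cohort(files, subjects_by_id, cohort, measurement):
    """Return names of files with the given Measurement classification
    whose parent subject belongs to the given cohort.

    `files` is a list of file records with "name", "parents", and
    "classification" keys; `subjects_by_id` maps subject ID to subject record.
    """
    matches = []
    for file_record in files:
        subject = subjects_by_id.get(file_record["parents"]["subject"], {})
        measured = file_record.get("classification", {}).get("Measurement", [])
        if subject.get("cohort") == cohort and measurement in measured:
            matches.append(file_record["name"])
    return matches
```

Calling files_for_cohort(files, subjects_by_id, "control", "T1") returns only the T1 files whose parents.subject ID points at a control-cohort subject.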

Accessing Metadata

Dot Notation:

Metadata fields are referenced using dot notation: container.field or container.info.customfield

Examples:

subject.sex - Subject's biological sex
session.age - Subject age at session
file.info.header.dicom.SeriesDescription - DICOM series description (available after header extraction)
file.classification.Intent - File intent classification
session.tags - Tags applied to session
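
Dot notation maps directly onto nested key lookups. As a minimal sketch, assuming a record is a nested Python dictionary keyed by container name, the hypothetical helper get_field below resolves a dotted path one segment at a time and returns a default when any segment is missing.

```python
def get_field(record, path, default=None):
    """Resolve a dotted path such as "session.age" against nested dicts."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

Given a record {"session": {"age": 27}}, get_field(record, "session.age") returns 27, while get_field(record, "session.info.scan_quality") returns None because the info object is absent.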

See field reference for complete list

Using Metadata Effectively

Search and Filter

Basic Search:

Search for metadata values directly in the search bar.

Advanced Search:

Build complex queries combining multiple metadata fields:

subject.cohort = "control" AND session.age < 30
file.info.header.dicom.SeriesDescription contains "T1" AND session.tags contains "baseline"

Learn more about searching metadata

Data Views

Create tabular datasets by selecting metadata fields to export:

Choose which containers to include (sessions, acquisitions, files)
Select metadata columns to include
Apply filters based on metadata conditions
Export as CSV for analysis
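
The four steps above (choose rows, select columns, filter, export) can be mimicked on already-downloaded metadata with the standard library. This sketch is not the Data Views feature itself: build_view is a hypothetical function, and each row is assumed to be a flat dictionary keyed by dotted column names.

```python
import csv
import io


def build_view(rows, columns, keep=lambda row: True):
    """Return CSV text containing the selected columns of the rows
    for which `keep(row)` is true. Missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        if keep(row):
            writer.writerow([row.get(column, "") for column in columns])
    return buffer.getvalue()
```

For instance, build_view(rows, ["subject.label", "session.age"], keep=lambda r: r.get("session.age", 0) < 30) produces a two-column CSV limited to sessions where the subject was under 30.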

Learn more about Data Views

Gear Rules

Configure automated processing based on metadata:

Trigger condition: file.type = "dicom" AND file.info.header.dicom.SeriesDescription contains "BOLD"
Action: Run fMRIPrep gear
Result: New functional data is automatically preprocessed
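
Conceptually, a gear rule is a list of metadata conditions that must all hold, paired with a gear to run. The sketch below models that idea on plain dictionaries; it is not how Flywheel stores or evaluates rules. The rule layout, the operator names, and the gear identifier "fmriprep" are assumptions made for illustration, and the series description is read from file.info.header.dicom, where extracted DICOM headers are stored.

```python
def lookup(record, path):
    """Resolve a dotted path against nested dicts; None if any key is missing."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical operators supported by this sketch.
OPERATORS = {
    "equals": lambda actual, expected: actual == expected,
    "contains": lambda actual, expected: isinstance(actual, str) and expected in actual,
}


def rule_matches(rule, record):
    """A rule matches when ALL of its (path, operator, value) conditions hold."""
    return all(
        OPERATORS[operator](lookup(record, path), expected)
        for path, operator, expected in rule["conditions"]
    )


def gears_to_run(rules, record):
    """Return the gear names of every rule that matches the record."""
    return [rule["gear"] for rule in rules if rule_matches(rule, record)]
```

A rule with the conditions ("file.type", "equals", "dicom") and ("file.info.header.dicom.SeriesDescription", "contains", "BOLD") selects the gear for a functional series and skips a T1 series, because the second condition fails.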

Learn more about Gear Rules

BIDS Curation

Map Flywheel metadata to BIDS standard fields:

DICOM SeriesDescription → BIDS task field
Session age → BIDS age in participants.tsv
Custom info.run_number → BIDS run entity
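
To make the mapping concrete, here is a rough sketch of how a task name and run number could end up in a BIDS-style functional filename. It is not the BIDS curation gear: bids_label and bold_filename are hypothetical helpers, the task is assumed to have been extracted from SeriesDescription already, and the two-digit zero padding of the run index is a stylistic choice rather than a requirement.

```python
import re


def bids_label(text):
    """BIDS labels are alphanumeric, so drop every other character."""
    return re.sub(r"[^A-Za-z0-9]", "", text)


def bold_filename(subject_code, task, run_number):
    """Build a BIDS-style BOLD filename from Flywheel-side metadata values."""
    return (
        f"sub-{bids_label(subject_code)}"
        f"_task-{bids_label(task)}"
        f"_run-{int(run_number):02d}"
        "_bold.nii.gz"
    )
```

For example, bold_filename("001", "rest", 1) returns "sub-001_task-rest_run-01_bold.nii.gz".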

Learn more about BIDS

Data Export and Migration

When exporting or copying data:

Metadata Preserved:

Container labels and hierarchy
Custom metadata (.info fields)
Tags
Notes
File classifications
DICOM metadata

Metadata NOT Preserved:

Gear job history
Analyses
Change log history
Project permissions
Gear rules
Session templates
Data views

Learn about data export

Common Metadata Workflows

Study Setup

Define custom metadata schema - Decide which study-specific fields you need
Create tags - Set up tags at group level for cohorts, quality flags, phases
Configure gear rules - Set up automated processing based on metadata
Create session templates - Define expected acquisitions and metadata

Data Import

DICOM import - Container hierarchy created from basic DICOM tags
Run metadata extraction - Use File Metadata Importer and File Classifier gears (or set up gear rules for automation)
Review and correct - Verify subject/session labels, classifications, fix any issues
Add custom fields - Populate study-specific metadata
Apply tags - Mark cohorts, phases, quality status

Quality Control

Review data - Check for issues using viewer
Add notes - Document observations or problems
Apply QC tags - Mark as qc-pass, qc-fail, exclude, etc.
Update custom fields - Record quality metrics

Analysis and Export

Search/filter - Find data meeting analysis criteria using metadata
Create data view - Export metadata table for statistical analysis
Run gears - Process data (gear rules use metadata to trigger)
Export results - Download with metadata preserved

Metadata Best Practices

Consistency

Use standardized field names across your site and studies
Document your metadata schema so all team members use the same conventions
Avoid synonyms - pick one term and stick to it (for example, always subject_id, never a mix of subject_id, subjectID, and sub_id)

Data Types

Choose appropriate types - Use Number for numeric data you want to search by range
Don't mix types - Always use the same data type for a given field name
Use objects over arrays when possible - objects are more searchable

Field Names

Keep names stable - Don't use timestamps or highly variable values as field names
Use descriptive names - scan_quality not sq, motion_rating not mr
Follow conventions - If working with BIDS, use BIDS field names in custom metadata

System Limits

Be aware of Flywheel's metadata limits:

15,000 unique metadata fields per Flywheel site (location + name combinations)
8KB per container for combined structured and custom metadata
Exceeding limits can cause indexing and search issues
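
A periodic audit against these limits can be approximated offline. The sketch below makes several assumptions that may not match how Flywheel counts: it treats "8KB" as 8,192 bytes of JSON-serialized info, it counts only top-level info keys as fields, and it expects each container as a dictionary with hypothetical "label", "location", and "info" keys.

```python
import json

MAX_FIELDS_PER_SITE = 15_000
MAX_BYTES_PER_CONTAINER = 8 * 1024  # assumption: "8KB" means 8,192 bytes


def unique_fields(containers):
    """Collect (location, name) pairs, e.g. ("session.info", "scan_quality")."""
    return {
        (container["location"], name)
        for container in containers
        for name in container["info"]
    }


def oversized(containers):
    """Return labels of containers whose serialized info exceeds the byte limit."""
    return [
        container["label"]
        for container in containers
        if len(json.dumps(container["info"]).encode("utf-8")) > MAX_BYTES_PER_CONTAINER
    ]


def within_field_limit(containers):
    """True while the number of unique fields stays at or below the site limit."""
    return len(unique_fields(containers)) <= MAX_FIELDS_PER_SITE
```

Two sessions that both define scan_quality count as a single field because the location and name are identical, whereas using a timestamp as a field name adds a new unique field for every container.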

Quality and Documentation

Validate metadata - Check for missing or incorrect values regularly
Document meaning - Maintain a data dictionary explaining what each custom field means
Review regularly - Periodically audit metadata quality across your projects
