Metadata Defaults

When source data files lack certain metadata (like nodata values or temporal information), you can specify defaults in metadata.yaml that Portolan will use to fill the gaps.

Use Cases

  • Aerial imagery without nodata: Source COGs exported from Global Mapper or ArcGIS often lack nodata values, even when black (0) pixels represent no data
  • Historical datasets without dates: Legacy data may not have acquisition dates embedded in file metadata
  • Bulk imports: When adding many files from the same source, set collection-level defaults instead of per-file flags

Setting Defaults

Add a defaults section to your .portolan/metadata.yaml:

# .portolan/metadata.yaml

contact:
  name: "Data Team"
  email: "data@example.org"

license: "CC-BY-4.0"

# Data defaults - applied when the source file lacks the value
defaults:
  temporal:
    year: 2025              # All items default to 2025-01-01

  raster:
    nodata: 0               # Black pixels (0) are nodata

Temporal Defaults

For datasets where all items share an acquisition period:

defaults:
  temporal:
    year: 2025    # Produces datetime: 2025-01-01T00:00:00Z
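
The year-to-datetime mapping shown in the comment above can be sketched in Python. This is only an illustration of the documented result (January 1 at midnight UTC); `year_to_datetime` is a hypothetical helper name, not part of Portolan's API.

```python
from datetime import datetime, timezone


def year_to_datetime(year: int) -> str:
    """Return the ISO 8601 UTC timestamp for January 1 of the given year."""
    stamp = datetime(year, 1, 1, tzinfo=timezone.utc)
    # Format with a trailing "Z" to match the form used in this guide.
    return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")


print(year_to_datetime(2025))  # 2025-01-01T00:00:00Z
```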

Explicit Date Range

defaults:
  temporal:
    start: "2025-04-15"    # ISO format: YYYY-MM-DD
    end: "2025-05-30"

Year vs Start/End

Specify either year or a start/end range, not both. Validation rejects a configuration that sets both year and start (see Temporal Validation).

CLI Override

The --datetime flag always overrides metadata.yaml defaults:

# Uses metadata.yaml default
portolan add data/

# Overrides default for this specific add
portolan add data/ --datetime 2024-06-15

Raster Nodata Defaults

For COG files where nodata wasn't set in the source:

Uniform Nodata (All Bands)

defaults:
  raster:
    nodata: 0    # Applied to all bands

Per-Band Nodata

defaults:
  raster:
    nodata: [0, 0, 255]    # R=0, G=0, B=255

When to Use Per-Band

Per-band nodata is useful when different bands use different sentinel values. For RGB imagery, uniform nodata (typically 0) is usually sufficient.

Hierarchy and Inheritance

Defaults follow Portolan's hierarchical config pattern:

catalog/.portolan/metadata.yaml          # Catalog-level defaults
  └── collection/.portolan/metadata.yaml # Collection overrides
        └── subcatalog/.portolan/metadata.yaml  # Most specific wins

Example: Set nodata at catalog level, override temporal at collection level:

# catalog/.portolan/metadata.yaml
defaults:
  raster:
    nodata: 0

# catalog/aerial-2025/.portolan/metadata.yaml
defaults:
  temporal:
    year: 2025
  # Inherits raster.nodata: 0 from parent
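
One way to picture the inheritance in this example is a recursive dictionary merge in which the more specific file wins key by key. The sketch below assumes key-level merging; the source only shows that a child which omits the raster section inherits it, so treat `merge_defaults` as a hypothetical illustration rather than Portolan's implementation.

```python
from typing import Any


def merge_defaults(parent: dict[str, Any], child: dict[str, Any]) -> dict[str, Any]:
    """Overlay child defaults on parent defaults; the child wins on conflicts."""
    merged = dict(parent)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both levels define this section: merge it key by key (assumption).
            merged[key] = merge_defaults(merged[key], value)
        else:
            # Scalars, lists, and new sections come from the child as-is.
            merged[key] = value
    return merged


catalog = {"raster": {"nodata": 0}}          # catalog/.portolan/metadata.yaml
collection = {"temporal": {"year": 2025}}    # catalog/aerial-2025/.portolan/metadata.yaml

print(merge_defaults(catalog, collection))
# {'raster': {'nodata': 0}, 'temporal': {'year': 2025}}
```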

Behavior Rules

| Scenario | Behavior |
|---|---|
| Source file has value | File value used (defaults don't override) |
| Source file lacks value | Default applied |
| CLI flag provided | CLI flag overrides default |
| No default, no source value | Field left null/empty |

Key principle: Defaults fill gaps; they don't override extracted data.
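
The rules above amount to a short precedence chain, sketched below. `resolve_field` is a hypothetical helper. Checking the CLI value before the source value is an assumption: this guide only states that the CLI flag overrides the default, not how it interacts with a value found in the file.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_field(source: Optional[T], default: Optional[T], cli: Optional[T] = None) -> Optional[T]:
    """Pick the value for one metadata field.

    Compare against None rather than truthiness: a nodata of 0 is a real value.
    """
    if cli is not None:       # assumption: an explicit flag is checked first
        return cli
    if source is not None:    # value extracted from the file beats the default
        return source
    return default            # may be None: the field is left empty


print(resolve_field(source=None, default=0))    # 0
print(resolve_field(source=255, default=0))     # 255
print(resolve_field(source=None, default="2025-01-01", cli="2024-06-15"))  # 2024-06-15
```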

Validation

Portolan validates defaults when loading metadata.yaml. Invalid defaults cause portolan add to fail with a clear error message.

Temporal Validation

| Constraint | Requirement |
|---|---|
| year type | Must be an integer (not a string like "2025") |
| year range | Must be between 1800 and 2100 |
| start/end format | ISO date string: YYYY-MM-DD |
| start/end validity | Must be a real date (no 2025-02-30) |
| Mutual exclusion | Cannot specify both year and start |
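
As a rough illustration, the temporal constraints above could be checked as follows. `validate_temporal` is a hypothetical function, the error messages are invented for the example, and the 1800 to 2100 range is assumed to be inclusive.

```python
import re
from datetime import date

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_temporal(temporal: dict) -> None:
    """Raise ValueError if the temporal defaults break a documented constraint."""
    year = temporal.get("year")
    start = temporal.get("start")

    if year is not None and start is not None:
        raise ValueError("specify either 'year' or 'start', not both")

    if year is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(year, bool) or not isinstance(year, int):
            raise ValueError("'year' must be an integer")
        if not 1800 <= year <= 2100:  # inclusive bounds are an assumption
            raise ValueError("'year' must be between 1800 and 2100")

    for name in ("start", "end"):
        value = temporal.get(name)
        if value is None:
            continue
        if not isinstance(value, str) or not ISO_DATE.fullmatch(value):
            raise ValueError(f"'{name}' must be a YYYY-MM-DD string")
        try:
            date.fromisoformat(value)  # rejects impossible dates such as 2025-02-30
        except ValueError:
            raise ValueError(f"'{name}' is not a real date: {value}") from None


validate_temporal({"year": 2025})                                # passes
validate_temporal({"start": "2025-04-15", "end": "2025-05-30"})  # passes
```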

Raster Validation

| Constraint | Requirement |
|---|---|
| nodata type | Must be a number or list of numbers |
| nodata values | Must be finite (no NaN or Infinity) |
| Per-band list | Must match the raster's band count exactly |
| Empty list | Not allowed |
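
The raster constraints can be sketched the same way. `validate_nodata` is a hypothetical helper that also expands a single value to one entry per band. It takes the band count as an argument on the assumption that this check runs once a raster has been opened, since metadata.yaml alone does not carry the band count.

```python
import math
from numbers import Real


def validate_nodata(nodata, band_count: int) -> list:
    """Return one nodata value per band, or raise ValueError."""
    if isinstance(nodata, list):
        if not nodata:
            raise ValueError("nodata list must not be empty")
        if len(nodata) != band_count:
            raise ValueError(
                f"nodata has {len(nodata)} values but the raster has {band_count} bands"
            )
        values = list(nodata)
    else:
        # A single value applies uniformly to every band.
        values = [nodata] * band_count

    for value in values:
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(f"nodata must be numeric, got {value!r}")
        if not math.isfinite(value):
            raise ValueError("nodata must be finite (no NaN or Infinity)")
    return values


print(validate_nodata(0, band_count=3))            # [0, 0, 0]
print(validate_nodata([0, 0, 255], band_count=3))  # [0, 0, 255]
```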

Examples

# ✓ Valid
defaults:
  temporal:
    year: 2025              # Integer in range

# ✗ Invalid - year must be an integer
defaults:
  temporal:
    year: "2025"            # String instead of integer

# ✗ Invalid - wrong date format
defaults:
  temporal:
    start: "04-15-2025"     # Must be YYYY-MM-DD

# ✗ Invalid - mutual exclusion
defaults:
  temporal:
    year: 2025
    start: "2025-04-15"     # Error: can't specify both

# ✗ Invalid - per-band mismatch
defaults:
  raster:
    nodata: [0, 0, 255, 127]  # Error if raster only has 3 bands

Example: Philadelphia Aerial Imagery

Real-world example from a catalog with 947 COG tiles:

# aerial-imagery/2025/.portolan/metadata.yaml

contact:
  name: "Nissim Lebovits"
  email: "nlebovits@pm.me"

license: "LicenseRef-CityOfPhiladelphia"
attribution: "City of Philadelphia / PASDA"

defaults:
  temporal:
    year: 2025    # All 2025 imagery defaults to 2025-01-01

  raster:
    nodata: 0     # Black collar pixels are nodata

This sets consistent metadata across all 947 items without requiring per-file flags.