CLI Reference

This reference is auto-generated from the Portolan CLI source code using mkdocs-click.

Global Options

All commands support the following global options:

  • --version: Show the version and exit
  • --format [json|text]: Output format (json for machine parsing, text for humans)
  • --help: Show help message and exit

Commands

portolan

Portolan - Publish and manage cloud-native geospatial data catalogs.

Usage:

portolan [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--version boolean Show the version and exit. False
--format choice (json | text) Output format (json for machine parsing, text for humans). text
--help boolean Show this message and exit. False

portolan add

Track files in the catalog.

Accepts multiple paths like git add. Each path is processed independently with automatic collection inference based on directory structure.

Works like git: run from anywhere inside a catalog and it auto-detects the catalog root. Use --portolan-dir to override.

Item ID derivation: By default, the item ID is derived from the parent directory name. For example, adding 'census/2020/data.parquet' creates an item named '2020'. Use --item-id to override this automatic derivation. All other files in the item directory are tracked as companion assets (per ADR-0028).
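The derivation rule above can be sketched in a few lines of Python. This is an illustration only, not Portolan's implementation: the function name `derive_item_id` and its validation details are assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Optional


def derive_item_id(path: str, override: Optional[str] = None) -> str:
    """Return the item ID for a file path (hypothetical helper).

    Without an override, the ID is the name of the file's parent directory.
    An override must be a single path segment, mirroring the --item-id rule.
    """
    if override is not None:
        if override in ("", ".", "..") or "/" in override or "\\" in override:
            raise ValueError("item ID must be a single path segment")
        return override
    return PurePosixPath(path).parent.name


print(derive_item_id("census/2020/data.parquet"))
print(derive_item_id("data.geojson", override="my-id"))
```

The first call uses the parent directory name, the second uses the explicit override.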

Datetime handling (per ADR-0035): --datetime applies to ALL items added in this command. For items with different acquisition dates, run separate add commands:

    portolan add census/2020/ --datetime 2020-04-01
    portolan add census/2023/ --datetime 2023-04-01

If --datetime is omitted, items have null temporal extent and are
marked as provisional. Run 'portolan check' to find items needing dates.
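As a rough sketch of the behavior described above, the following Python applies one parsed datetime to every item in a command and marks items as provisional when no datetime is given. The accepted formats are taken from the --datetime option description; the helper names and the `provisional` field are assumptions for illustration, not Portolan's internal data model.

```python
from datetime import datetime
from typing import List, Optional

# Formats named in the --datetime option description (assumed to be exhaustive here).
FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S")


def parse_datetime(value: str) -> datetime:
    """Try each accepted format in turn and return the first successful parse."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized datetime: {value!r}")


def stamp_items(item_ids: List[str], value: Optional[str] = None) -> List[dict]:
    """Apply the same datetime to ALL items; no value means provisional items."""
    parsed = parse_datetime(value) if value is not None else None
    return [
        {"id": item_id, "datetime": parsed, "provisional": parsed is None}
        for item_id in item_ids
    ]


for item in stamp_items(["2020", "2023"], "2020-04-01"):
    print(item["id"], item["datetime"], item["provisional"])
```

Because one value is shared by every item, items with different acquisition dates need separate calls, which is the reason for the separate add commands shown above.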

Examples:

    portolan add demographics/census.parquet
    portolan add file1.geojson file2.geojson      # Add multiple files
    portolan add imagery/                         # Add all files in directory
    portolan add .                                # Add all files in catalog
    portolan add data.geojson --item-id my-id     # Override item ID (single file only)
    portolan add sat.tif --datetime 2024-06-15    # Explicit acquisition date

Smart behavior:

  • Unchanged files are silently skipped (use --verbose to see them)
  • Changed files are re-extracted with new metadata
  • Sidecar files (.dbf, .shx, .prj for shapefiles) are auto-detected
  • All files in the item directory are tracked, not just geo files (ADR-0028)

Usage:

portolan add [OPTIONS] PATHS...

Options:

Name Type Description Default
--verbose, -v boolean Show detailed output including skipped unchanged files. False
--item-id text Override automatic item ID derivation. Must be a single path segment. None
--portolan-dir path Path to Portolan catalog root (default: auto-detect by walking up from cwd). None
--datetime datetime Acquisition/creation datetime (ISO 8601, YYYY-MM-DD, or 'YYYY-MM-DD HH:MM:SS'). Applied to ALL items in this command. For different datetimes per item, run separate add commands. If omitted, items are marked as provisional (portolan check will flag them). None
--workers integer Number of parallel workers for metadata extraction. Default is 1 (sequential). Use higher values for large catalogs. 1
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan check

Validate a Portolan catalog or check files for cloud-native status.

Runs validation rules against the catalog and reports any issues. With --fix, applies fixes based on selected scope.

PATH is the directory to check (default: current directory).

Use --metadata or --geo-assets to limit scope:

  • --metadata: Only check/fix STAC metadata (staleness, missing items)
  • --geo-assets: Only check/fix geospatial assets (cloud-native status)
  • Neither: Check/fix both (default)

Examples:

portolan check                        # Validate all (metadata + geo-assets)

portolan check --metadata             # Validate metadata only

portolan check --geo-assets           # Check geo-assets only

portolan check --fix                  # Fix both metadata and geo-assets

portolan check --metadata --fix       # Fix only metadata (create/update items)

portolan check --geo-assets --fix     # Fix only geo-assets (convert files)

portolan check --fix --dry-run        # Preview all fixes

Usage:

portolan check [OPTIONS] [PATH]

Options:

Name Type Description Default
--json boolean Output results as JSON False
--verbose, -v boolean Show all validation rules, not just failures False
--fix boolean Fix issues: convert geo-assets to cloud-native, update stale metadata False
--dry-run boolean Preview what would be fixed (use with --fix) False
--remove-legacy boolean Remove source files after successful conversion (use with --fix) False
--metadata boolean Only check/fix STAC metadata (links, schema, staleness) False
--geo-assets boolean Only check/fix geospatial assets (cloud-native status, convertibility) False
--help boolean Show this message and exit. False

portolan clean

Remove all Portolan metadata while preserving data files.

Removes catalog.json, collection.json, item.json (STAC metadata), versions.json, and the .portolan/ directory. Preserves all data files (.parquet, .tif, .gpkg, .geojson, etc.).

Use --dry-run to preview what would be removed without deleting anything.

Examples:

    portolan clean              # Remove all metadata
    portolan clean --dry-run    # Preview what would be removed

Usage:

portolan clean [OPTIONS]

Options:

Name Type Description Default
--dry-run boolean Preview what would be removed without actually deleting. False
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan clone

Clone a remote catalog to a local directory.

This is essentially "pull to an empty directory" with guardrails. Creates the target directory and pulls collections from remote storage.

REMOTE_URL is the object store URL (e.g., s3://mybucket/my-catalog).

LOCAL_PATH is optional - if not provided, it will be inferred from the catalog name in the URL (git clone style).
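One plausible way to infer a directory name from the URL, in the spirit of git clone, is to take the last path segment. The sketch below is an assumption about the general approach, not Portolan's actual code; `infer_local_dir` is a hypothetical name, and the fallback to the bucket name is a choice made for this example.

```python
from urllib.parse import urlparse


def infer_local_dir(remote_url: str) -> str:
    """Use the last non-empty path segment of the URL as the directory name."""
    parsed = urlparse(remote_url)
    segments = [segment for segment in parsed.path.split("/") if segment]
    # Assumed fallback: with no path at all, use the bucket (netloc) name.
    return segments[-1] if segments else parsed.netloc


print(infer_local_dir("s3://mybucket/my-catalog"))
```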

--collection is optional - if not provided, all collections in the remote catalog will be cloned.

Examples:

    # Infer directory from URL, clone all collections
    portolan clone s3://mybucket/my-catalog

    # Clone to current directory (must be empty)
    portolan clone s3://mybucket/my-catalog .

    # Clone specific collection
    portolan clone s3://mybucket/catalog -c demographics

    # Clone all collections to specific directory
    portolan clone s3://mybucket/catalog ./local-copy

    # Clone specific collection with profile
    portolan clone s3://mybucket/catalog ./data -c imagery --profile prod

Usage:

portolan clone [OPTIONS] REMOTE_URL [LOCAL_PATH]

Options:

Name Type Description Default
--collection, -c text Collection to clone. If not specified, clones all collections. None
--profile text AWS profile name (for S3 sources). Uses env var or 'default' if not specified. None
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan config

Manage catalog configuration.

Configuration is stored in .portolan/config.yaml and follows this precedence:

  1. CLI argument (highest)
  2. Environment variable (PORTOLAN_)
  3. Collection-level config
  4. Catalog-level config
  5. Built-in default (lowest)
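The precedence order above amounts to a "first source that has a value wins" lookup. The following Python sketch shows that idea under stated assumptions: the function `resolve_setting`, the dictionary-based config layers, and the environment variable naming (`PORTOLAN_` followed by the upper-cased key) are all illustrative and not confirmed by this reference.

```python
import os
from typing import Any, Mapping, Optional, Tuple


def resolve_setting(
    key: str,
    cli_value: Optional[Any] = None,
    env: Optional[Mapping[str, str]] = None,
    collection_cfg: Optional[Mapping[str, Any]] = None,
    catalog_cfg: Optional[Mapping[str, Any]] = None,
    defaults: Optional[Mapping[str, Any]] = None,
) -> Tuple[Optional[Any], str]:
    """Return (value, source) from the highest-precedence layer that sets `key`."""
    env = os.environ if env is None else env
    env_name = "PORTOLAN_" + key.upper()  # assumed naming scheme
    layers = [
        ("cli", cli_value),
        ("env", env.get(env_name)),
        ("collection", (collection_cfg or {}).get(key)),
        ("catalog", (catalog_cfg or {}).get(key)),
        ("default", (defaults or {}).get(key)),
    ]
    for source, value in layers:
        if value is not None:
            return value, source
    return None, "not set"


value, source = resolve_setting(
    "remote",
    env={},
    collection_cfg={"remote": "s3://public/"},
    catalog_cfg={"remote": "s3://my-bucket/catalog/"},
)
print(value, source)
```

Here the collection-level value shadows the catalog-level one because no CLI argument or environment variable is present.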

Examples:

    portolan config set remote s3://my-bucket/catalog/
    portolan config get remote
    portolan config list
    portolan config unset remote

Usage:

portolan config [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

portolan config get

Get a configuration value.

Shows the resolved value and its source (env, catalog, collection, or not set).

KEY is the setting name (e.g., remote, aws_profile).

Examples:

    portolan config get remote
    portolan config get aws_profile --collection restricted

Usage:

portolan config get [OPTIONS] KEY

Options:

Name Type Description Default
--collection, -c text Get config for a specific collection. None
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan config list

List all configuration settings.

Shows all settings with their values and sources.

Examples:

    portolan config list
    portolan config list --collection demographics

Usage:

portolan config list [OPTIONS]

Options:

Name Type Description Default
--collection, -c text Show config for a specific collection. None
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan config set

Set a configuration value.

KEY is the setting name (e.g., remote, aws_profile). VALUE is the value to set.

Examples:

    portolan config set remote s3://my-bucket/
    portolan config set aws_profile production
    portolan config set remote s3://public/ --collection demographics

Usage:

portolan config set [OPTIONS] KEY VALUE

Options:

Name Type Description Default
--collection, -c text Set config for a specific collection instead of catalog-level. None
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan config unset

Remove a configuration value.

Removes the setting from the config file. Does not affect environment variables.

KEY is the setting name to remove.

Examples:

    portolan config unset remote
    portolan config unset aws_profile --collection restricted

Usage:

portolan config unset [OPTIONS] KEY

Options:

Name Type Description Default
--collection, -c text Unset config for a specific collection. None
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan extract

Extract data from external sources into Portolan catalogs.

Convert data from ArcGIS services, APIs, or other sources into well-structured Portolan catalogs with STAC metadata.

Examples:

    portolan extract arcgis https://services.arcgis.com/.../FeatureServer ./output
    portolan extract arcgis URL --layers "Census" --dry-run
    portolan extract arcgis URL --filter "sdn_" --resume

Usage:

portolan extract [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

portolan extract arcgis

Extract data from ArcGIS FeatureServer/MapServer/ImageServer.

Downloads layers from an ArcGIS REST service and creates a Portolan catalog with GeoParquet files (vector) or COG files (raster) and STAC metadata.

URL is the ArcGIS service URL (FeatureServer, MapServer, ImageServer, or services root). OUTPUT_DIR is the directory to write extracted data (default: inferred from service name).

URL Types:

  • FeatureServer/MapServer: Extract vector layers to GeoParquet
  • ImageServer: Extract raster tiles to COG
  • rest/services: Extract from all services (creates nested catalog)

Glob Patterns: Patterns use fnmatch syntax: * matches any sequence of characters, ? matches a single character. Common patterns:

  • Country prefix: 'sdn_', 'ukr_'
  • Year suffix: '_2024', '2025'
  • Folder path: 'Hosted/cod_ab'
  • Dataset family: 'cod_ab_ukr'
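To make the fnmatch semantics concrete, here is a small Python sketch of include/exclude filtering over comma-separated patterns, similar in spirit to --layers and --exclude-layers. The function `select_layers` and the layer names are hypothetical, and case-sensitive matching via `fnmatchcase` is an assumption; this reference does not say how Portolan treats case.

```python
from fnmatch import fnmatchcase
from typing import List, Optional


def select_layers(
    names: List[str],
    include: Optional[str] = None,
    exclude: Optional[str] = None,
) -> List[str]:
    """Keep names matching any include pattern and no exclude pattern."""
    include_patterns = [p.strip() for p in include.split(",")] if include else ["*"]
    exclude_patterns = [p.strip() for p in exclude.split(",")] if exclude else []
    return [
        name
        for name in names
        if any(fnmatchcase(name, p) for p in include_patterns)
        and not any(fnmatchcase(name, p) for p in exclude_patterns)
    ]


# Hypothetical layer names, for illustration only.
layers = ["Census_Tracts", "Transport_Roads", "Legacy_Census", "sdn_admin1"]
print(select_layers(layers, include="Census*,Transport*"))
print(select_layers(layers, exclude="Legacy*"))
```

Note that a pattern without a wildcard matches only an identical name, so a prefix match needs a trailing `*`.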

Examples:

    # Extract all layers from a FeatureServer
    portolan extract arcgis https://services.arcgis.com/.../FeatureServer ./output

    # Extract specific layers by name
    portolan extract arcgis URL --layers "Census*,Transport*"

    # List available services from a services root
    portolan extract arcgis https://services.arcgis.com/.../rest/services --list-services

    # Extract from services root (filter services)
    portolan extract arcgis https://.../rest/services ./output --services "Census*"

    # Dry run to see what would be extracted
    portolan extract arcgis URL --dry-run

    # Extract raw files only (no STAC catalog auto-init)
    portolan extract arcgis URL --raw

    # JSON output for agent consumption
    portolan extract arcgis URL --json

Usage:

portolan extract arcgis [OPTIONS] URL [OUTPUT_DIR]

Options:

Name Type Description Default
--layers text Include layers matching glob patterns (comma-separated). Example: 'Census,Transport' None
--exclude-layers text Exclude layers matching glob patterns (comma-separated). Example: 'Legacy,Test' None
--filter text Apply glob filter to both services and layers. Example: 'sdn_', '_2024*' None
--services text Include services matching glob patterns (comma-separated). For services root URLs only. None
--exclude-services text Exclude services matching glob patterns (comma-separated). For services root URLs only. None
--list-services boolean List available services without extracting (for services root URLs). False
--workers integer range (1 and above) Parallel page requests per layer (default: 3). 3
--retries integer range (1 and above) Retry attempts per failed layer (default: 3). 3
--timeout float range (0.0 and above) Per-request timeout in seconds (default: 60). 60.0
--resume boolean Resume from existing extraction-report.json (skip succeeded layers). False
--dry-run boolean List layers without extracting. False
--json boolean Output extraction report as JSON. False
--auto boolean Skip confirmation prompts. False
--raw boolean Skip auto-init: create only extraction files, no STAC catalog. False
--tile-size integer range (between 256 and 8192) [ImageServer] Tile size in pixels (default: 4096). 4096
--bbox text [ImageServer] Bounding box filter: minx,miny,maxx,maxy (in service CRS). None
--compression choice (DEFLATE | JPEG | LZW | ZSTD) [ImageServer] COG compression (default: from config or DEFLATE). None
--max-concurrent integer range (between 1 and 16) [ImageServer] Maximum concurrent tile downloads (default: 4). 4
--help boolean Show this message and exit. False

portolan info

Show information about a file, collection, or catalog.

TARGET can be:

  • A file path (e.g., demographics/census.parquet): shows file metadata
  • A collection directory (e.g., demographics/): shows collection metadata
  • Omitted: shows catalog-level metadata

Per ADR-0022, the output format for files is:

    Format: GeoParquet
    CRS: EPSG:4326
    Bbox: [-122.5, 37.7, -122.3, 37.9]
    Features: 4,231
    Version: v1.2.0

Examples:

    portolan info demographics/census.parquet           # File info
    portolan info demographics/                         # Collection info
    portolan info                                       # Catalog info
    portolan info demographics/census.parquet --json    # JSON output

Usage:

portolan info [OPTIONS] [TARGET]

Options:

Name Type Description Default
--catalog path Path to catalog root (default: current directory). .
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan init

Initialize a new Portolan catalog.

Creates a catalog.json at the root level and a .portolan directory with management files (config.yaml). Also creates versions.json at the root.

Auto-extracts the catalog ID from the directory name.

PATH is the directory where the catalog should be created (default: current directory).

Use --auto to skip all prompts and use default values. Use --title and --description to set catalog metadata directly.

Examples:

    portolan init                         # Initialize in current directory
    portolan init --auto                  # Skip prompts, use defaults
    portolan init --title "My Catalog"    # Set title
    portolan init /path/to/data --auto    # Initialize in specific directory

Usage:

portolan init [OPTIONS] [PATH]

Options:

Name Type Description Default
--json boolean Output as JSON. False
--auto boolean Skip interactive prompts and use auto-extracted/default values. False
--title, -t text Human-readable title for the catalog. None
--description, -d text Description of the catalog. None
--help boolean Show this message and exit. False

portolan list

List all files in the catalog with tracking status.

Git-style behavior: automatically finds the catalog root by walking up from the current directory. Works from any subdirectory within a catalog. Use --catalog to override and specify an explicit path.

Shows all files organized by collection in a hierarchical tree view. Each file shows its tracking status, format type, and file size.

Status indicators:

    + = tracked (in versions.json, unchanged)
    + = untracked (on disk, not in versions.json)
    ~ = modified (in versions.json, checksum changed)
    ! = deleted (in versions.json, missing from disk)
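The four states above follow from comparing what is on disk with what is recorded in versions.json. A minimal Python sketch of that comparison is shown below. It assumes both sides can be reduced to a mapping from relative path to checksum; the real layout of versions.json is not described in this reference, and `classify_files` is a hypothetical name.

```python
from typing import Dict


def classify_files(on_disk: Dict[str, str], recorded: Dict[str, str]) -> Dict[str, str]:
    """Map each path to 'tracked', 'untracked', 'modified', or 'deleted'.

    on_disk:  path -> checksum computed from the file on disk
    recorded: path -> checksum stored in the version record (assumed shape)
    """
    status = {}
    for path, checksum in on_disk.items():
        if path not in recorded:
            status[path] = "untracked"
        elif recorded[path] != checksum:
            status[path] = "modified"
        else:
            status[path] = "tracked"
    for path in recorded:
        if path not in on_disk:
            status[path] = "deleted"
    return status


disk = {"census.parquet": "aaa", "README.md": "bbb", "roads.parquet": "ccc"}
record = {"census.parquet": "aaa", "roads.parquet": "old", "gone.tif": "ddd"}
for path, state in sorted(classify_files(disk, record).items()):
    print(f"{state:10} {path}")
```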

Example output:

    censo-2010/
      data/ (3 tracked, 2 untracked)
        + census-data.parquet (GeoParquet, 4.5MB)
        + metadata.parquet (GeoParquet, 1.2MB)
        + README.md (2KB)
        + style.json (1KB)

Examples:

    portolan list                              # List all files with status
    portolan list --collection demographics    # Filter by collection
    portolan list --tracked-only               # Show only tracked files
    portolan list --untracked-only             # Show only untracked files
    portolan list --json                       # JSON output

Usage:

portolan list [OPTIONS]

Options:

Name Type Description Default
--collection, -c text Filter by collection ID. Sentinel.UNSET
--catalog path Path to catalog root (default: auto-detect by walking up from cwd). None
--json boolean Output as JSON. False
--tracked-only boolean Show only tracked files (hide untracked). False
--untracked-only boolean Show only untracked files. False
--help boolean Show this message and exit. False

portolan metadata

Manage catalog metadata for README generation.

metadata.yaml files supplement STAC with human-enrichable fields like titles, descriptions, contact info, and citations. These files can exist at any level in the catalog hierarchy (catalog, subcatalog, collection).

Examples:

    portolan metadata init                 # Create template at catalog root
    portolan metadata init demographics    # Create template for collection
    portolan metadata validate             # Validate metadata.yaml

Usage:

portolan metadata [OPTIONS] COMMAND [ARGS]...

Options:

Name Type Description Default
--help boolean Show this message and exit. False

portolan metadata init

Generate a metadata.yaml template.

Creates a .portolan/metadata.yaml file with all required and optional fields, including helpful comments explaining each field.

If PATH is provided, creates the template at that directory. Otherwise, creates it at the catalog root.

Examples:

    portolan metadata init                 # Template at catalog root
    portolan metadata init demographics    # Template for collection
    portolan metadata init --force         # Overwrite existing
    portolan metadata init --recursive     # All levels in catalog
    portolan metadata init climate -r      # All levels under climate/

Usage:

portolan metadata init [OPTIONS] [PATH]

Options:

Name Type Description Default
--force boolean Overwrite existing metadata.yaml file. False
-r, --recursive boolean Create templates at all STAC levels (catalogs, subcatalogs, collections). Skips items (item.json directories) and preserves existing files unless --force is used. False
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan metadata validate

Validate metadata.yaml against schema.

Checks for:

  • Required fields: title, description, contact (name + email), license
  • Format validation: email, SPDX license identifier, DOI

Uses hierarchical resolution: child metadata.yaml files inherit from parent levels and override specific fields.
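Hierarchical resolution can be pictured as merging dictionaries from the catalog root down to the level being validated, then checking the required fields on the merged result. The Python below is a sketch under assumptions: the merge is shallow (a child's `contact` replaces the parent's whole `contact` block), and the function names are hypothetical. Portolan's actual merge rules may differ.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("title", "description", "contact", "license")


def resolve_metadata(*levels: Dict[str, Any]) -> Dict[str, Any]:
    """Merge metadata from parent to child; later (child) levels override."""
    merged: Dict[str, Any] = {}
    for level in levels:
        merged.update({k: v for k, v in level.items() if v is not None})
    return merged


def missing_required(metadata: Dict[str, Any]) -> List[str]:
    """List required fields that are absent, including contact name and email."""
    missing = [f for f in REQUIRED_FIELDS if not metadata.get(f)]
    contact = metadata.get("contact") or {}
    missing += [f"contact.{k}" for k in ("name", "email") if contact and not contact.get(k)]
    return missing


catalog = {
    "title": "My Catalog",
    "description": "All datasets",
    "contact": {"name": "Data Team", "email": "data@example.org"},
    "license": "CC-BY-4.0",
}
collection = {"title": "Demographics"}  # overrides title, inherits the rest

resolved = resolve_metadata(catalog, collection)
print(resolved["title"], resolved["license"], missing_required(resolved))
```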

Examples:

    portolan metadata validate                 # Validate at catalog root
    portolan metadata validate demographics    # Validate for collection

Usage:

portolan metadata validate [OPTIONS] [PATH]

Options:

Name Type Description Default
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan pull

Pull updates from a remote catalog.

Git-style behavior: automatically finds the catalog root by walking up from the current directory. Works from any subdirectory within a catalog. Use --catalog to override and specify an explicit path.
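The walk-up detection described above can be sketched as follows. This is an illustrative guess at the mechanism, assuming the `.portolan` directory created by `portolan init` is the marker that identifies a catalog root; `find_catalog_root` is a hypothetical name.

```python
from pathlib import Path
from typing import Optional


def find_catalog_root(start: Path, marker: str = ".portolan") -> Optional[Path]:
    """Walk up from `start` and return the nearest directory containing `marker`."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None  # reached the filesystem root without finding a catalog


root = find_catalog_root(Path.cwd())
print(root if root is not None else "not inside a catalog")
```

Checking the starting directory first means the nearest enclosing catalog wins when catalogs are nested.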

Fetches changes from a remote catalog and downloads updated files. Similar to git pull, this checks for uncommitted local changes before overwriting.

REMOTE_URL is the remote catalog URL (e.g., s3://bucket/catalog).

If --collection is specified, pulls that collection only. If --collection is omitted, pulls all collections in the catalog.

Examples:

    # Pull a single collection
    portolan pull s3://mybucket/my-catalog --collection demographics
    portolan pull s3://mybucket/catalog -c imagery --dry-run

    # Pull all collections
    portolan pull s3://mybucket/catalog
    portolan pull s3://mybucket/catalog --workers 4

Usage:

portolan pull [OPTIONS] REMOTE_URL

Options:

Name Type Description Default
--collection, -c text Collection to pull. If not specified, pulls all collections. None
--catalog path Path to catalog root (default: auto-detect by walking up from cwd). None
--force boolean Discard uncommitted local changes and overwrite with remote. False
--dry-run boolean Show what would be downloaded without actually downloading. Note: skips remote state check (no network I/O), so remote changes won't be detected. False
--profile text AWS profile name (for S3). Uses config or 'default' if not specified. None
--workers, -w integer range (1 and above) Parallel workers for catalog-wide pull (default: auto-detect based on CPU count; use 1 for sequential). Ignored when --collection is specified. None
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan push

Push local catalog changes to cloud object storage.

Git-style behavior: automatically finds the catalog root by walking up from the current directory. Works from any subdirectory within a catalog. Use --catalog to override and specify an explicit path.

Syncs collection(s) to a remote destination (S3, GCS, Azure). Uses optimistic locking to detect concurrent modifications.
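For readers unfamiliar with optimistic locking, the general pattern is: remember the remote version you last saw, and refuse to write if the remote has moved on since. The Python sketch below demonstrates the pattern with an in-memory stand-in for the remote. It is not Portolan's implementation; the class names, the integer version counter, and the `force` parameter are all assumptions chosen to mirror the --force option.

```python
from typing import Dict, Tuple


class ConflictError(Exception):
    """Raised when the remote changed since it was last read."""


class InMemoryRemote:
    """Toy remote store with a version counter (hypothetical)."""

    def __init__(self) -> None:
        self.version = 0
        self.data: Dict[str, str] = {}

    def read(self) -> Tuple[int, Dict[str, str]]:
        return self.version, dict(self.data)

    def write(self, expected_version: int, data: Dict[str, str], force: bool = False) -> int:
        # Only accept the write if nobody else wrote since our read, unless forced.
        if not force and expected_version != self.version:
            raise ConflictError(
                f"remote is at version {self.version}, expected {expected_version}"
            )
        self.data = dict(data)
        self.version += 1
        return self.version


remote = InMemoryRemote()
seen_by_a, _ = remote.read()
seen_by_b, _ = remote.read()
remote.write(seen_by_a, {"census.parquet": "v1"})  # first writer succeeds
try:
    remote.write(seen_by_b, {"census.parquet": "v2"})  # stale view is rejected
except ConflictError as exc:
    print("conflict:", exc)
```

No lock is held between read and write; conflicts are detected only at write time, which is what makes the scheme optimistic.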

DESTINATION is the object store URL (e.g., s3://mybucket/my-catalog). If not provided, uses the 'remote' configured via portolan config set remote.

If --collection is specified, pushes that collection only. If --collection is omitted, pushes all collections in the catalog.

Examples:

    # Push a single collection
    portolan push s3://mybucket/catalog --collection demographics
    portolan push gs://mybucket/catalog -c imagery --dry-run

    # Push all collections
    portolan push s3://mybucket/catalog
    portolan push --dry-run  # Uses configured remote

Usage:

portolan push [OPTIONS] [DESTINATION]

Options:

Name Type Description Default
--collection, -c text Collection to push. If not specified, pushes all collections. None
--force boolean Overwrite remote even if it has diverged. False
--dry-run boolean Show what would be pushed without uploading. Note: skips remote state check (no network I/O), so conflicts won't be detected. False
--profile text AWS profile name (for S3 destinations). Uses config or 'default' if not specified. None
--catalog path Path to catalog root (default: auto-detect by walking up from cwd). None
--workers, -w integer range (1 and above) Parallel workers for catalog-wide push (default: auto-detect based on CPU count; use 1 for sequential). Ignored when --collection is specified. None
--json boolean Output as JSON. False
--verbose, -v boolean Show per-file upload details with size and speed. False
--help boolean Show this message and exit. False

portolan readme

Generate README.md from STAC metadata and metadata.yaml.

The README is a pure output - always generated from STAC (machine-extracted metadata) plus .portolan/metadata.yaml (human enrichment). Never hand-edit the README; edit metadata.yaml instead and regenerate.

Use --check in CI to verify that the README is up-to-date.

Examples:

    portolan readme                 # Generate at catalog root
    portolan readme demographics    # Generate for collection
    portolan readme --stdout        # Print without writing
    portolan readme --check         # CI mode: exit 1 if stale
    portolan readme --recursive     # Generate for catalog and all collections
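Because the README is a pure output, a staleness check can be as simple as regenerating the text and comparing it with the file on disk. The sketch below shows that idea in Python; it assumes a byte-for-byte text comparison and treats a missing file as stale, neither of which is confirmed by this reference. The function names are hypothetical.

```python
from pathlib import Path


def readme_is_stale(readme_path: Path, generated: str) -> bool:
    """A README is stale if it is missing or differs from the regenerated text."""
    if not readme_path.exists():
        return True
    return readme_path.read_text(encoding="utf-8") != generated


def check_exit_code(readme_path: Path, generated: str) -> int:
    """Mirror the documented CI contract: exit 1 if stale, otherwise 0."""
    return 1 if readme_is_stale(readme_path, generated) else 0


print(check_exit_code(Path("README.md"), "# My Catalog\n"))
```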

Usage:

portolan readme [OPTIONS] [PATH]

Options:

Name Type Description Default
--stdout boolean Print README to stdout instead of writing file. False
--check boolean Check if README is up-to-date (for CI). Exits 1 if stale. False
--recursive, -r boolean Generate READMEs for catalog and all collections. False
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan rm

Remove files from tracking.

By default, removes the file from disk AND untracks it from the catalog. Requires --force for destructive operations (deleting files).

Works like git: run from anywhere inside a catalog and it auto-detects the catalog root. Use --portolan-dir to override.

Safety flags:

  • --keep: Untrack file but preserve it on disk (safe, no --force needed)
  • --force: Required for destructive rm (when not using --keep)
  • --dry-run: Preview what would be removed without actually removing

Examples:

    portolan rm --keep imagery/old_data.tif       # Safe: untrack only
    portolan rm --dry-run vectors/                # Preview what would be removed
    portolan rm -f demographics/census.parquet    # Force delete and untrack
    portolan rm -f vectors/                       # Force remove entire directory

Usage:

portolan rm [OPTIONS] PATH

Options:

Name Type Description Default
--keep boolean Untrack file but preserve it on disk. False
--force, -f boolean Force deletion without safety check. Required for destructive rm. False
--dry-run, -n boolean Show what would be removed without actually removing. False
--verbose, -v boolean Show detailed output including skipped files. False
--portolan-dir path Path to Portolan catalog root (default: auto-detect by walking up from cwd). None
--json boolean Output as JSON. False
--help boolean Show this message and exit. False

portolan scan

Scan a directory for geospatial files and potential issues.

Discovers files by extension, validates shapefile completeness, and reports issues that may cause problems during import.

PATH is the directory to scan (default: current directory).

Fix Mode: Use --fix to auto-rename files with:

  • Invalid characters (spaces, parentheses, non-ASCII)
  • Windows reserved names (CON, PRN, AUX, etc.)
  • Long paths (> 200 characters)

Use --dry-run to preview changes without applying.
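The three fixable conditions above are easy to detect with a few string checks. The following Python sketch shows one way to flag them. It is illustrative only: the issue labels, the function name, and the exact rules (for example, checking only the file name for invalid characters) are assumptions rather than Portolan's scan logic. The reserved-name set is the standard Windows device-name list.

```python
import re
from pathlib import PurePosixPath
from typing import List

# Standard Windows reserved device names.
WINDOWS_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}
MAX_PATH_LENGTH = 200  # threshold taken from the fix-mode description


def find_name_issues(path: str) -> List[str]:
    """Return labels for the rename-worthy problems found in `path`."""
    issues = []
    name = PurePosixPath(path).name
    stem = name.split(".", 1)[0]
    if re.search(r"[ ()]", name) or not name.isascii():
        issues.append("invalid_characters")
    if stem.upper() in WINDOWS_RESERVED:
        issues.append("windows_reserved_name")
    if len(path) > MAX_PATH_LENGTH:
        issues.append("long_path")
    return issues


for candidate in ("data/my file (1).shp", "data/CON.geojson", "data/roads.parquet"):
    print(candidate, find_name_issues(candidate))
```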

Examples:

portolan scan                         # Scan current directory

portolan scan --json                  # JSON output in current directory

portolan scan /data/geospatial

portolan scan /large/tree --max-depth=2

portolan scan /data --no-recursive

portolan scan /data --fix --dry-run

portolan scan /data --fix

Usage:

portolan scan [OPTIONS] [PATH]

Options:

Name Type Description Default
--json boolean Output results as JSON False
--no-recursive boolean Scan only the target directory (no subdirectories) False
--max-depth integer Maximum recursion depth (0 = target directory only) None
--include-hidden boolean Include hidden files (starting with .) False
--follow-symlinks boolean Follow symbolic links (may cause loops) False
--all boolean Show all issues without truncation (default: show first 10 per severity) False
--tree boolean Show directory tree view with file status markers False
--suggest-collections boolean Suggest collection groupings based on filename patterns False
--manual boolean Show only issues requiring manual resolution False
--fix boolean Apply safe fixes (rename files with invalid characters, Windows reserved names, or long paths) False
--dry-run boolean Preview fixes without applying them (use with --fix) False
--strict boolean Treat warnings as errors (exit 1 on any warning or error) False
--help boolean Show this message and exit. False

portolan sync

Sync local catalog with remote storage (pull + push).

Orchestrates a full sync workflow: Pull -> Init -> Scan -> Check -> Push. This is the recommended way to keep a local catalog in sync with remote.

DESTINATION is the object store URL (e.g., s3://mybucket/my-catalog).

Examples:

    portolan sync s3://mybucket/catalog --collection demographics
    portolan sync s3://mybucket/catalog -c imagery --dry-run
    portolan sync s3://mybucket/catalog -c data --fix --force
    portolan sync s3://mybucket/catalog -c data --profile prod
    portolan sync --collection demographics    # Uses configured remote

Usage:

portolan sync [OPTIONS] [DESTINATION]

Options:

Name Type Description Default
--collection, -c text Collection to sync (required). Sentinel.UNSET
--force boolean Overwrite conflicts on both pull and push. False
--dry-run boolean Show what would happen without making changes. False
--fix boolean Convert non-cloud-native formats during check. False
--profile text AWS profile name (for S3 destinations). Uses config or 'default' if not specified. None
--catalog path Path to catalog root (default: auto-detect by walking up from cwd). None
--json boolean Output as JSON. False
--help boolean Show this message and exit. False