Real-World Datasets

Learn jpx by working with real data from public APIs. Each example includes:

  • How to fetch the data
  • Data structure overview
  • Progressive examples from basic to advanced
  • Practical use cases

Available Guides

Guide                    Description                                 Key Features
Standard JMESPath Only   Portable queries using only spec functions  26 built-in functions, no extensions
NLP Text Processing      Text analysis pipelines                     Tokenization, stemming, stopwords, normalization
Hacker News              Tech discussions via Algolia API            NLP on real content, topic detection, vocabulary analysis
USGS Earthquakes         Real-time seismic data                      Geo functions, statistics, filtering
Nobel Prize API          Laureates and prizes                        Multilingual data, text processing, dates
NASA Near Earth Objects  Asteroids and comets                        Nested data, unit conversions, risk analysis
Project Management       Synthetic project data                      Comprehensive function coverage, all categories

Quick Start

# Fetch earthquake data
curl -s "https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&limit=20&minmagnitude=5" > quakes.json

# Try a query
jpx 'features[*].{mag: properties.mag, place: properties.place}' quakes.json

What You’ll Learn

Filtering & Selection

  • Complex filter expressions with multiple conditions
  • Nested field access patterns
  • Text-based filtering with contains, starts_with
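
To make the idea of a multi-condition filter concrete, here is a minimal Python sketch of the logic such a filter expresses: keep a record only when a numeric condition and a text condition both hold. It is not how jpx evaluates filters. The sample records and the helper name strong_quakes_near are hypothetical, shaped like the USGS features fetched in Quick Start.

```python
# Hypothetical sample shaped like USGS GeoJSON features (values are made up).
features = [
    {"properties": {"mag": 6.4, "place": "Near the coast of Japan"}},
    {"properties": {"mag": 5.1, "place": "Japan region"}},
    {"properties": {"mag": 6.8, "place": "South of Fiji"}},
    {"properties": {"mag": None, "place": "Japan Trench"}},
]


def strong_quakes_near(features, keyword, min_mag):
    """Return places with magnitude >= min_mag AND keyword in the place text."""
    result = []
    for feature in features:
        props = feature.get("properties") or {}
        mag = props.get("mag")      # nested field access: properties.mag
        place = props.get("place") or ""
        # Both conditions must hold; a missing magnitude never matches.
        if mag is not None and mag >= min_mag and keyword in place:
            result.append(place)
    return result


print(strong_quakes_near(features, "Japan", 6))
```

In JMESPath terms this corresponds roughly to a filter of the form [?cond_a && cond_b] followed by a projection of the place field.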

Statistics & Aggregation

  • avg, median, stddev for numeric analysis
  • min, max, min_by, max_by for extremes
  • length and counting patterns
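
As a reference point for checking query results, the same summary statistics can be computed with Python's standard library. This is a sketch on made-up magnitudes, not jpx's implementation. This page does not say whether stddev is the population or the sample variant, so both are shown.

```python
import statistics

# Hypothetical records; real ones would come from features[*].properties.
quakes = [
    {"place": "A", "mag": 5.0},
    {"place": "B", "mag": 5.5},
    {"place": "C", "mag": 6.0},
    {"place": "D", "mag": 6.5},
    {"place": "E", "mag": 7.0},
]
mags = [q["mag"] for q in quakes]

summary = {
    "count": len(mags),                             # like length
    "avg": statistics.mean(mags),
    "median": statistics.median(mags),
    "stddev_population": statistics.pstdev(mags),   # divides by n
    "stddev_sample": statistics.stdev(mags),        # divides by n - 1
    "min": min(mags),
    "max": max(mags),
}

# Analogue of max_by / min_by: pick the whole record, not just the value.
strongest = max(quakes, key=lambda q: q["mag"])
weakest = min(quakes, key=lambda q: q["mag"])

print(summary)
print(strongest["place"], weakest["place"])
```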

Geographic Calculations

  • geo_distance_km for distance calculations
  • Coordinate extraction and formatting
  • Distance-based sorting
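
For intuition about what a kilometre distance between two coordinates involves, here is a minimal haversine sketch in Python. It assumes a spherical Earth with a mean radius of 6371 km. Whether geo_distance_km uses this exact formula or radius is not stated here, so treat small differences as expected. The names haversine_km and sort_by_distance are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def sort_by_distance(features, lat, lon):
    """Sort GeoJSON features by distance from (lat, lon), nearest first."""
    def distance(feature):
        # GeoJSON stores coordinates as [longitude, latitude, ...].
        f_lon, f_lat = feature["geometry"]["coordinates"][:2]
        return haversine_km(lat, lon, f_lat, f_lon)
    return sorted(features, key=distance)


# One degree of longitude along the equator.
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 2))
```

Note the coordinate order: GeoJSON puts longitude first, which is easy to mix up when extracting coordinates.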

Date/Time Operations

  • Unix timestamp conversion with from_unixtime
  • Date formatting with format_datetime
  • Date range filtering
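
One detail worth knowing before converting timestamps: the USGS feed stores properties.time as milliseconds since the Unix epoch, not seconds. The Python sketch below shows the conversion, formatting, and a range check under that assumption. The helper names are hypothetical, and whether from_unixtime in jpx expects seconds or milliseconds is not stated on this page.

```python
from datetime import datetime, timezone


def to_utc(epoch_ms):
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)


def format_epoch_ms(epoch_ms, fmt="%Y-%m-%d %H:%M:%S"):
    """Format an epoch-milliseconds value as a UTC string."""
    return to_utc(epoch_ms).strftime(fmt)


def in_range(epoch_ms, start, end):
    """True when the timestamp falls in the half-open interval [start, end)."""
    return start <= to_utc(epoch_ms) < end


november = (
    datetime(2023, 11, 1, tzinfo=timezone.utc),
    datetime(2023, 12, 1, tzinfo=timezone.utc),
)
print(format_epoch_ms(1700000000000))
print(in_range(1700000000000, *november))
```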

Data Transformation

  • Reshaping nested structures
  • Flattening for export
  • CSV/TSV output for spreadsheets
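
The reshaping and flattening steps can be sketched in plain Python: pull the fields of interest out of each nested feature into one flat row, then write the rows as CSV. This only illustrates the shape of the transformation, assuming USGS-style features whose coordinates are [longitude, latitude, depth in km]. The function names and the sample record are hypothetical.

```python
import csv
import io

FIELDS = ["place", "mag", "lon", "lat", "depth_km"]


def flatten_feature(feature):
    """Turn one nested GeoJSON feature into a flat dict of scalar columns."""
    props = feature.get("properties") or {}
    coords = (feature.get("geometry") or {}).get("coordinates") or []
    # Pad so features with missing coordinates still yield a full row.
    lon, lat, depth = (list(coords) + [None, None, None])[:3]
    return {
        "place": props.get("place"),
        "mag": props.get("mag"),
        "lon": lon,
        "lat": lat,
        "depth_km": depth,
    }


def to_csv(features):
    """Render features as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for feature in features:
        writer.writerow(flatten_feature(feature))
    return buffer.getvalue()


sample = [{
    "properties": {"mag": 5.2, "place": "10 km N of Town, Region"},
    "geometry": {"coordinates": [142.5, 38.1, 35.0]},
}]
print(to_csv(sample))
```

The place value contains a comma, so the CSV writer quotes it. This is the main reason to use a real CSV writer rather than joining fields by hand.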

Pipeline Patterns

  • Multi-step transformations
  • Sorting and limiting results
  • Building summary reports

Tips for Working with APIs

  1. Save data locally for faster iteration:

    curl -s "API_URL" > data.json
    jpx 'expression' data.json
    
  2. Explore structure first:

    jpx 'keys(@)' data.json          # Top-level keys
    jpx '@[0]' data.json             # First element (arrays)
    jpx 'type(@)' data.json          # Data type
    
  3. Use --compact (-c in the example below) for pipelines:

    jpx -c 'expression' data.json | jpx 'next_expression'
    
  4. Export for analysis:

    jpx --csv 'transform' data.json > output.csv
    

More Data Sources

Looking for more datasets to practice with? Check out: