Real-World Datasets
Learn jpx by working with real data from public APIs. Each example includes:
- How to fetch the data
- Data structure overview
- Progressive examples from basic to advanced
- Practical use cases
Available Guides
| Guide | Description | Key Features |
|---|---|---|
| Standard JMESPath Only | Portable queries using only spec functions | 26 built-in functions, no extensions |
| NLP Text Processing | Text analysis pipelines | Tokenization, stemming, stopwords, normalization |
| Hacker News | Tech discussions via Algolia API | NLP on real content, topic detection, vocabulary analysis |
| USGS Earthquakes | Real-time seismic data | Geo functions, statistics, filtering |
| Nobel Prize API | Laureates and prizes | Multilingual data, text processing, dates |
| NASA Near Earth Objects | Asteroids and comets | Nested data, unit conversions, risk analysis |
| Project Management | Synthetic project data | Comprehensive function coverage, all categories |
Quick Start
# Fetch earthquake data
curl -s "https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&limit=20&minmagnitude=5" > quakes.json
# Try a query
jpx 'features[*].{mag: properties.mag, place: properties.place}' quakes.json
What You’ll Learn
Filtering & Selection
- Complex filter expressions with multiple conditions
- Nested field access patterns
- Text-based filtering with `contains` and `starts_with`
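To make the multi-condition filter concrete, here is a minimal Python sketch of the same selection logic, useful as a cross-check against jpx output. The sample records, the `select_events` helper, and the place names are hypothetical; only the `features[*].properties` layout follows the Quick Start query above.

```python
# Hypothetical sample shaped like USGS GeoJSON features (values are made up).
features = [
    {"properties": {"mag": 6.1, "place": "10 km N of Exampleville"}},
    {"properties": {"mag": 5.2, "place": "Offshore Sampleland"}},
    {"properties": {"mag": 7.0, "place": "25 km S of Exampleville"}},
]


def select_events(features, min_mag, place_substring):
    """Keep events at or above min_mag whose place contains the substring.

    Roughly what this standard JMESPath filter expresses:
    features[?properties.mag >= `6` && contains(properties.place, 'Exampleville')]
        .{mag: properties.mag, place: properties.place}
    """
    return [
        {"mag": f["properties"]["mag"], "place": f["properties"]["place"]}
        for f in features
        if f["properties"]["mag"] >= min_mag
        and place_substring in f["properties"]["place"]
    ]


strong_nearby = select_events(features, 6.0, "Exampleville")
print(strong_nearby)
```

Both conditions must hold for a record to pass, which mirrors `&&` inside a JMESPath filter expression.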
Statistics & Aggregation
- `avg`, `median`, `stddev` for numeric analysis
- `min`, `max`, `min_by`, `max_by` for extremes
- `length` and counting patterns
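When checking aggregate results, it can help to recompute them independently. The sketch below uses Python's `statistics` module on hypothetical magnitudes. Whether jpx's `stddev` is the population or the sample standard deviation is not stated here, so both are shown; compare against whichever one the jpx function reference specifies.

```python
import statistics

# Hypothetical events; not real USGS records.
events = [
    {"mag": 5.0, "place": "Region A"},
    {"mag": 5.5, "place": "Region B"},
    {"mag": 6.0, "place": "Region C"},
    {"mag": 6.5, "place": "Region D"},
    {"mag": 7.0, "place": "Region E"},
]

magnitudes = [e["mag"] for e in events]

mean_mag = statistics.mean(magnitudes)      # counterpart of avg
median_mag = statistics.median(magnitudes)  # counterpart of median
pop_std = statistics.pstdev(magnitudes)     # population standard deviation
sample_std = statistics.stdev(magnitudes)   # sample standard deviation

# Counterpart of max_by / min_by: return the whole record, not just the number.
strongest = max(events, key=lambda e: e["mag"])
weakest = min(events, key=lambda e: e["mag"])

print(mean_mag, median_mag, len(magnitudes), strongest["place"], weakest["place"])
```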
Geographic Calculations
- `geo_distance_km` for distance calculations
- Coordinate extraction and formatting
- Distance-based sorting
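For readers who want to sanity-check distance results, the following is a minimal haversine sketch in Python. It is not jpx's implementation: the Earth radius constant, the `haversine_km` helper, and the sample points are assumptions, and `geo_distance_km` may use a different radius or formula. Note that GeoJSON stores coordinates as longitude first, then latitude (USGS adds depth as a third element).

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumption; tools may differ)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical features; coordinates are [longitude, latitude, depth_km].
features = [
    {"properties": {"place": "A"}, "geometry": {"coordinates": [0.0, 2.0, 10.0]}},
    {"properties": {"place": "B"}, "geometry": {"coordinates": [1.0, 0.0, 5.0]}},
]

ORIGIN_LAT, ORIGIN_LON = 0.0, 0.0


def distance_from_origin(feature):
    lon, lat = feature["geometry"]["coordinates"][:2]
    return haversine_km(ORIGIN_LAT, ORIGIN_LON, lat, lon)


# Distance-based sorting: nearest event first.
nearest_first = sorted(features, key=distance_from_origin)
print([f["properties"]["place"] for f in nearest_first])
```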
Date/Time Operations
- Unix timestamp conversion with `from_unixtime`
- Date formatting with `format_datetime`
- Date range filtering
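A common pitfall with timestamps is the unit: the USGS feed reports `properties.time` in milliseconds since the Unix epoch, not seconds. The Python sketch below shows the conversion, formatting, and range check under that assumption. The helper names are hypothetical, and whether `from_unixtime` expects seconds or milliseconds should be confirmed in the jpx function reference.

```python
from datetime import datetime, timezone


def ms_to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def format_utc(ms, pattern="%Y-%m-%dT%H:%M:%SZ"):
    """Format epoch milliseconds with a strftime pattern."""
    return ms_to_utc(ms).strftime(pattern)


def in_range(ms, start, end):
    """True when the timestamp falls in the half-open interval [start, end)."""
    return start <= ms_to_utc(ms) < end


# Hypothetical timestamps in milliseconds.
timestamps_ms = [1_700_000_000_000, 1_600_000_000_000]

start = datetime(2023, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, tzinfo=timezone.utc)
in_2023 = [t for t in timestamps_ms if in_range(t, start, end)]

print([format_utc(t) for t in in_2023])
```

Using a half-open interval avoids counting an event twice when consecutive ranges share a boundary.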
Data Transformation
- Reshaping nested structures
- Flattening for export
- CSV/TSV output for spreadsheets
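As an illustration of reshaping and flattening for export, here is a small Python sketch that turns nested GeoJSON-style features into flat rows and writes CSV. The column names, the `flatten` and `to_csv` helpers, and the sample record are hypothetical; jpx's own `--csv` output may order or quote fields differently.

```python
import csv
import io

FIELDS = ["mag", "place", "lon", "lat", "depth_km"]


def flatten(feature):
    """Pull nested properties and coordinates up into one flat row."""
    lon, lat, depth = feature["geometry"]["coordinates"]
    props = feature["properties"]
    return {
        "mag": props["mag"],
        "place": props["place"],
        "lon": lon,
        "lat": lat,
        "depth_km": depth,
    }


def to_csv(rows):
    """Serialize flat rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical feature; the comma in "place" forces CSV quoting.
features = [
    {
        "properties": {"mag": 6.1, "place": "10 km N of Exampleville, Testland"},
        "geometry": {"coordinates": [-120.5, 35.2, 8.0]},
    }
]

csv_text = to_csv([flatten(f) for f in features])
print(csv_text)
```

Place names often contain commas, so a real CSV writer (rather than a plain string join) is what keeps the columns aligned in a spreadsheet.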
Pipeline Patterns
- Multi-step transformations
- Sorting and limiting results
- Building summary reports
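The pipeline idea (filter, then sort, then limit, then summarize) can be sketched in Python as follows. The event records and the `top_events` and `summarize` helpers are hypothetical and only illustrate how the steps compose.

```python
# Hypothetical magnitudes and places; not real USGS records.
events = [
    {"mag": 5.1, "place": "Region A"},
    {"mag": 6.4, "place": "Region B"},
    {"mag": 5.8, "place": "Region A"},
    {"mag": 7.2, "place": "Region C"},
]


def top_events(events, min_mag, limit):
    """Filter, sort descending by magnitude, then keep the first `limit`.

    Roughly what this standard JMESPath expression does:
    reverse(sort_by(@[?mag >= `5.5`], &mag))[:2]
    """
    strong = [e for e in events if e["mag"] >= min_mag]
    ranked = sorted(strong, key=lambda e: e["mag"], reverse=True)
    return ranked[:limit]


def summarize(events):
    """Build a small summary report from an already-filtered list."""
    mags = [e["mag"] for e in events]
    return {
        "count": len(mags),
        "max_mag": max(mags),
        "places": sorted({e["place"] for e in events}),
    }


top = top_events(events, 5.5, 2)
report = summarize(top)
print(report)
```

Each step takes the previous step's output as its only input, which is the same shape as chaining expressions with `|` in a query.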
Tips for Working with APIs
- Save data locally for faster iteration:
      curl -s "API_URL" > data.json
      jpx 'expression' data.json
- Explore structure first:
      jpx 'keys(@)' data.json   # Top-level keys
      jpx '@[0]' data.json      # First element (arrays)
      jpx 'type(@)' data.json   # Data type
- Use `--compact` for pipelines:
      jpx -c 'expression' data.json | jpx 'next_expression'
- Export for analysis:
      jpx --csv 'transform' data.json > output.csv
More Data Sources
Looking for more datasets to practice with? Check out:
- Awesome JSON Datasets - Curated list of public JSON APIs
- Public APIs - Collective list of free APIs
- NASA Open APIs - Space and Earth science data
- OpenWeatherMap - Weather data
- GitHub API - Repository and user data