Benchmarks, competitive analysis, and query optimization patterns for production hydrologic workflows.
All production figures below were measured against the deployed API at flowfabric-api.lynker-spatial.com from within AWS us-east-1 (≤10 ms RTT), 5 iterations each (3 for the long reanalysis query), warm cache.
| Use case | Measured time | Rows | Response size |
|---|---|---|---|
| Rating curves — 2 reaches | 0.78 s | 672 | 12 KB |
| NWM forecast — 100 reaches, latest run | 0.42 s | 1,800 | 45 KB |
| NWM reanalysis — 10 reaches, 10 years | 8.3 s | 876,000 | 45 MB |
Local development: Querying from a developer laptop adds S3 egress latency — typically 3–10× slower than the deployed path. The numbers above reflect what end users see in production.
```python
import httpx, pyarrow as pa, io

response = httpx.post(
    "https://flowfabric-api.lynker-spatial.com/v1/ratings",
    headers={"Authorization": f"Bearer {token}"},
    json={"feature_ids": ["8318793", "8318787"], "type": "rem"}
)
table = pa.ipc.open_stream(io.BytesIO(response.content)).read_all()
```
| Metric | Value |
|---|---|
| Average | 0.78 s |
| Std dev | ±0.04 s |
| Min / Max | 0.73 s / 0.82 s |
| Rows returned | 672 |
| Response size | 12 KB (Arrow IPC) vs ~28 KB JSON |
Why it's fast: Partition-level filtering prunes to the relevant VPU shards before any row is read, skipping 99%+ of stored data.
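The pruning step can be sketched in plain Python. This is a toy model, not FlowFabric's implementation: the shard names and the `VPU_OF_FEATURE` mapping are invented for illustration.

```python
# Toy model of partition-level filtering: map each requested feature ID
# to its VPU shard, then read only those shards. Names are hypothetical.
VPU_OF_FEATURE = {"8318793": "vpu_13", "8318787": "vpu_13", "1234567": "vpu_02"}
ALL_SHARDS = [f"vpu_{i:02d}" for i in range(1, 22)]  # hypothetical shard list

def shards_to_read(feature_ids):
    """Return only the shards that can contain the requested features."""
    return sorted({VPU_OF_FEATURE[fid] for fid in feature_ids})

needed = shards_to_read(["8318793", "8318787"])
# Both features live in the same VPU, so every other shard is never touched.
skipped = 1 - len(needed) / len(ALL_SHARDS)
```

Because the shard decision uses only the request body, it happens before any row is scanned, which is why latency stays flat as the archive grows.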
```python
response = httpx.post(
    "https://flowfabric-api.lynker-spatial.com/v1/datasets/nws_owp_nwm_short_range/streamflow",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "query_mode": "run",
        "issue_time": "latest",
        "scope": "features",
        "feature_ids": ["8318793", "8318787", ...]  # 100 IDs
    }
)
```
| Metric | Value |
|---|---|
| Average | 0.42 s |
| Std dev | ±0.04 s |
| Min / Max | 0.39 s / 0.47 s |
| Rows returned | 1,800 (18 time steps × 100 reaches) |
| Response size | 45 KB (Arrow IPC) vs ~120 KB JSON |
```python
response = httpx.post(
    "https://flowfabric-api.lynker-spatial.com/v1/datasets/nws_owp_nwm_reanalysis_3_0/streamflow",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "query_mode": "absolute",
        "start_time": "2013-01-01T00:00:00Z",
        "end_time": "2022-12-31T23:00:00Z",
        "scope": "features",
        "feature_ids": ["8318793", "8318787", ...]  # 10 IDs
    }
)
```
| Metric | Value |
|---|---|
| Average | 8.3 s |
| Std dev | ±0.4 s |
| Min / Max | 7.9 s / 8.7 s |
| Rows returned | 876,000 (10 reaches × 10 yr × 8,760 hr/yr) |
| Response size | 45 MB (Arrow IPC) |
Why it's fast given the scale: The entire reanalysis corpus is petabyte-scale. The API reads only the Zarr chunks that intersect the requested feature IDs and time window — roughly 0.00001% of total stored data.
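The chunk-selection arithmetic behind that claim can be sketched with a toy model. The array shape and chunk sizes below are invented for illustration; a real Zarr store has its own chunking.

```python
# Toy model of Zarr chunk selection: streamflow as a 2-D array of shape
# (n_hours, n_features), stored in rectangular chunks. Only chunks that
# intersect the requested time window and feature rows are fetched.
N_HOURS, N_FEATURES = 394_464, 2_700_000   # ~45 yr hourly, ~2.7M reaches
CHUNK_HOURS, CHUNK_FEATURES = 672, 30_000  # hypothetical chunk shape

def chunks_needed(hour_range, feature_rows):
    """(time_chunk, feature_chunk) index pairs intersecting the request."""
    t0, t1 = hour_range
    t_chunks = range(t0 // CHUNK_HOURS, t1 // CHUNK_HOURS + 1)
    f_chunks = sorted({row // CHUNK_FEATURES for row in feature_rows})
    return [(t, f) for t in t_chunks for f in f_chunks]

total_chunks = -(-N_HOURS // CHUNK_HOURS) * -(-N_FEATURES // CHUNK_FEATURES)
# 10 years of hours for reaches that happen to share one feature chunk:
needed = chunks_needed((0, 87_599), [5, 12, 900])
fraction = len(needed) / total_chunks  # a small fraction of one percent
```

The fraction shrinks further when requested reaches cluster in few chunks, which is why a decade-long pull over 10 reaches stays in single-digit seconds.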
All endpoints default to Apache Arrow IPC — a binary columnar wire format.
| Format | Relative size | Python parse time |
|---|---|---|
| Arrow IPC | 1× (smallest) | ~1 ms |
| Parquet | ~1.2× | ~5 ms |
| JSON | ~2.5–3× | ~50 ms |
Arrow is natively supported in Python (pyarrow), R (arrow), JavaScript (apache-arrow), and most BI tools.
**Python**

```python
import pyarrow as pa, io

table = pa.ipc.open_stream(io.BytesIO(response.content)).read_all()
df = table.to_pandas()
```
**R**

```r
library(arrow)
df <- as.data.frame(arrow::read_ipc_stream(httr::content(resp, "raw")))
```
**JavaScript**

```javascript
import * as arrow from "apache-arrow";
const table = arrow.tableFromIPC(await response.arrayBuffer());
```
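The size gap in the table above comes from fixed-width binary encoding versus decimal text. A stdlib-only stand-in (not Arrow itself, just the same encoding principle) makes the ratio concrete:

```python
import json
import random
import struct

random.seed(42)
flows = [random.uniform(0.0, 500.0) for _ in range(10_000)]  # fake streamflow values

binary = struct.pack(f"<{len(flows)}d", *flows)  # 8 bytes per float64, no delimiters
text = json.dumps(flows).encode()                # ~18 characters per value

ratio = len(text) / len(binary)  # full-precision floats: roughly 2x or more
```

Arrow adds schema metadata on top of this, but for result sets of any real size the per-value encoding dominates, matching the ~2.5-3x JSON penalty in the table.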
Four direct access paths exist for NWM data alongside FlowFabric. Each fills a different niche; none covers the full scope of FlowFabric.
CIROH Hub is a resource aggregator and documentation portal — it catalogs NWM data and access options across AWS, GCP, and Azure but is not itself a query service. The tools below are what CIROH Hub points to.
| Capability | FlowFabric | CIROH GCP API | NOAA NOMADS | NOAA NWPS | AWS Archive |
|---|---|---|---|---|---|
| Server-side reach filter | ✅ Yes | ✅ Yes | ❌ No | ✅ Yes | ⚠ Zarr only |
| Batch (many reaches, one call) | ✅ Yes | ✅ Yes | ❌ No | ❌ One per call | ❌ No |
| Forecast data | ✅ Yes | ✅ Yes | ✅ Rolling 2–4 days | ✅ Operational | ❌ No |
| Reanalysis (1979–2023) | ✅ Yes | ⚠ 2018–present only | ❌ No | ❌ No | ✅ Static archive |
| Wire format | Arrow IPC | JSON / CSV | NetCDF (full file) | JSON | NetCDF / Zarr |
| Authentication | Bearer / API key | API key (CIROH members) | None | None | None (AWS SDK) |
| Access model | REST API | REST API | File download | REST API | S3 CLI/SDK |
| Public / no membership | ✅ Yes | ❌ CIROH members only | ✅ Yes | ✅ Yes | ✅ Yes |
hub.ciroh.org/docs/products/data-management/bigquery-api/
The CIROH GCP API (nwm-api.ciroh.org) is a FastAPI service backed by NWM data on Google Cloud. It exposes four endpoints: /forecast, /analysis-assim, /geometry, /return-period. The comids parameter provides true server-side reach filtering — the closest structural analog to FlowFabric's architecture.
Key gaps: access requires CIROH membership and an approved project; responses are JSON or CSV only; reanalysis coverage starts at September 2018 (operational GCP archive start, not 1979); and BigQuery scan-based billing can surface unexpected cost on large queries.
nomads.ncep.noaa.gov/pub/data/nccf/com/nwm/prod/
NOMADS is NOAA's raw operational file server. Files appear as model cycles complete — roughly 30–60 minutes post-run — and are retained for approximately 2–4 days. Each file covers all CONUS reaches for one time step (~13 MB per channel_rt file). There is no per-reach filter; every query is a full file download followed by local parsing.
api.water.noaa.gov/nwps/v1/docs/
The NOAA NWPS API is a public REST service with native reach-ID routing (GET /reaches/{reachId}/streamflow). It returns analysis assimilation (~3-day past window) and short/medium-range forecasts in a single JSON response. No authentication or API key required.
The constraints are structural: one reach per request (no batch endpoint), operational data only (no reanalysis), JSON responses with units of ft³/s (no binary format), and best-effort availability; 503 responses have been observed. There is no stated SLA for third-party use.
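If you mix NWPS values (ft³/s) with data from sources that report m³/s, convert explicitly. The factor is exact by the definition of the international foot (1 ft = 0.3048 m):

```python
CUBIC_M_PER_CUBIC_FT = 0.3048 ** 3  # exact: 0.028316846592

def cfs_to_cms(q_cfs):
    """Convert streamflow from ft³/s (NWPS) to m³/s."""
    return q_cfs * CUBIC_M_PER_CUBIC_FT

# 1,000 ft³/s ≈ 28.32 m³/s
```

Doing the conversion at the ingestion boundary, rather than deep in analysis code, keeps unit mismatches from silently skewing comparisons.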
registry.opendata.aws/nwm-archive/
The AWS Open Data registry hosts all NWM retrospective runs: v1.2 (1993–2017), v2.0 (1993–2018), v2.1 (1979–2020), and v3.0 (1979–2023, ~250 TB). Data is free with no authentication; standard AWS egress charges apply outside us-east-1.
This is the authoritative source for long-record reanalysis — FlowFabric's reanalysis backend reads from this same archive. Accessing it directly requires xarray + zarr expertise: the Zarr format enables per-reach partial reads, but naive NetCDF access downloads the full file. There is no API layer, no operational data, and no updates after January 2023.
FlowFabric is the only service that combines:
- server-side reach filtering with true batch queries (many reaches per call),
- both operational forecasts and the full 1979–2023 reanalysis record,
- a binary columnar wire format (Arrow IPC), and
- public access with no membership requirement.
The closest competitor for operational use is the CIROH GCP API (members only); the closest for reanalysis bulk work is direct S3 access to the AWS archive. FlowFabric's design goal is to make the reach-level query case fast and simple enough that building a direct S3/Zarr pipeline is rarely worth it.
Appending `?estimate=true` returns row and byte counts instantly and does not count against your quota:
```python
estimate = httpx.post(
    "https://flowfabric-api.lynker-spatial.com/v1/datasets/nws_owp_nwm_reanalysis_3_0/streamflow?estimate=true",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "query_mode": "absolute",
        "start_time": "2013-01-01T00:00:00Z",
        "end_time": "2022-12-31T23:00:00Z",
        "scope": "features",
        "feature_ids": ["8318793"]
    }
).json()
print(estimate["estimated_rows"], estimate["estimated_bytes"])
if estimate["would_exceed_sync_limits"]:
    print("Switch to mode='export'")
```
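A small guard can make the sync-versus-export decision automatic. This assumes the response fields shown above (`estimated_bytes`, `would_exceed_sync_limits`); the 100 MB threshold mirrors the sync ceiling described in this section and is a client-side choice, not a documented constant.

```python
SYNC_LIMIT_BYTES = 100 * 1024 * 1024  # assumed client-side sync ceiling

def pick_mode(estimate):
    """Choose 'sync' or 'export' from an ?estimate=true response body."""
    if estimate.get("would_exceed_sync_limits"):
        return "export"
    if estimate.get("estimated_bytes", 0) >= SYNC_LIMIT_BYTES:
        return "export"
    return "sync"
```

Calling `pick_mode(estimate)` before the real query means large pulls never hit the sync path only to fail partway through.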
One call with 100 feature IDs is 50–100× faster than 100 separate calls:
```python
# ❌ Slow — 100 separate requests
for fid in feature_ids:
    df = query(feature_ids=[fid])

# ✅ Fast — one request
df = query(feature_ids=feature_ids)
```
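When you have more IDs than fit comfortably in one request body, batch them rather than falling back to per-ID calls. The 1,000-ID batch size below is an arbitrary illustration, not a documented server limit:

```python
def batched(ids, size=1000):
    """Split a feature-ID list into request-sized batches."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]

# 2,500 IDs -> 3 requests instead of 2,500
batches = batched([str(n) for n in range(2500)])
```

Each batch still amortizes connection setup, auth, and partition pruning over many reaches, preserving most of the 50-100x win.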
`mode="sync"` streams immediately and is suitable for responses under 100 MB. For larger payloads, `mode="export"` writes a Parquet file to S3 and returns a pre-signed download link:
```python
result = httpx.post(..., json={..., "mode": "export"}).json()
# result["download_url"] ready in 30–60 seconds
```
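The pre-signed link may not be ready on the first check, so poll with backoff rather than in a tight loop. This is a generic helper: `fetch` is any callable that returns the payload or raises while the export is still pending; nothing FlowFabric-specific is assumed.

```python
import time

def poll(fetch, attempts=10, base_delay=1.0):
    """Call fetch() with exponential backoff until it succeeds."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

With `base_delay=1.0` the first few retries land inside the 30-60 second window quoted above without hammering the endpoint.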
Use `/v1/stage` instead of chaining two calls. The stage endpooint combines streamflow lookup and rating curve translation in one round trip:
```python
# ❌ Two round trips
sf = query_streamflow(dataset_id=..., feature_ids=...)
rc = query_ratings(feature_ids=...)
stage = translate(sf, rc)

# ✅ One round trip
stage = httpx.post("/v1/stage", json={
    "dataset_id": "nws_owp_nwm_analysis",
    "issue_time": "latest",
    "feature_ids": [...],
    "ratings_type": "rem"
}).json()
```
When using `POST /v1/features/bbox` over a large area, `stream_order_min` reduces the feature count before any data query:
```python
features = httpx.post("/v1/features/bbox", json={
    "bbox": [-110, 35, -100, 45],
    "stream_order_min": 4,  # main stems only
    "max_features": 1000
}).json()["feature_ids"]
```
Rate limits reset every 60 seconds. Current window status is included in every response:
```
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 117
X-RateLimit-Reset: 1745836800
```
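A client can read these headers and pause until the window resets instead of retrying blindly. A minimal sketch, using the header names shown above:

```python
import time

def wait_if_exhausted(headers, now=None):
    """Return seconds to sleep before the next request (0 if quota remains)."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    reset_at = int(headers.get("X-RateLimit-Reset", 0))  # Unix epoch seconds
    now = time.time() if now is None else now
    return max(0.0, reset_at - now)
```

Call it after each response and `time.sleep()` on any nonzero return; since windows reset every 60 seconds, the wait is bounded by about a minute.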
| Tier | Requests / min | Monthly data |
|---|---|---|
| free | 20 | 3 GB |
| standard | 120 | Unlimited |
| pro | 600 | Unlimited |
| enterprise | 3,000 | Unlimited |
Check remaining quota at `GET /v1/me/usage`.
| Detail | Value |
|---|---|
| Environment | AWS us-east-1, deployed API |
| Round-trip latency | ~10 ms |
| Iterations | 5 (3 for long reanalysis queries) |
| Cache state | Warm (vpuid index pre-loaded) |
| Date measured | January 9, 2026 |
Results represent steady-state performance after the first request warms the in-process partition index. Cold-start (first request after a fresh deployment) adds a one-time index initialization cost.