
FlowFabric API — Performance Guide

Benchmarks, competitive analysis, and query optimization patterns for production hydrologic workflows.

All production figures below were measured against the deployed API at flowfabric-api.lynker-spatial.com from within AWS us-east-1 (≤10 ms RTT), 5 iterations each, warm cache.

Use case                               | Measured time | Rows    | Response size
Rating curves — 2 reaches              | 0.78 s        | 672     | 12 KB
NWM forecast — 100 reaches, latest run | 0.42 s        | 1,800   | 45 KB
NWM reanalysis — 10 reaches, 10 years  | 8.3 s         | 876,000 | 45 KB

Local development: Querying from a developer laptop adds S3 egress latency — typically 3–10× slower than the deployed path. The numbers above reflect what end users see in production.


Benchmark Details

Rating Curves — 2 reaches

import httpx, pyarrow as pa, io

response = httpx.post(
    "https://flowfabric-api.lynker-spatial.com/v1/ratings",
    headers={"Authorization": f"Bearer {token}"},
    json={"feature_ids": ["8318793", "8318787"], "type": "rem"}
)
table = pa.ipc.open_stream(io.BytesIO(response.content)).read_all()

Metric        | Value
Average       | 0.78 s
Std dev       | ±0.04 s
Min / Max     | 0.73 s / 0.82 s
Rows returned | 672
Response size | 12 KB (Arrow IPC) vs ~28 KB JSON

Why it's fast: Partition-level filtering prunes to the relevant VPU shards before any row is read, skipping 99%+ of stored data.
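The pruning step can be sketched in a few lines. Everything below is illustrative — the real index structure, shard names, and VPU assignments are internal to the API:

```python
# Hypothetical sketch of partition-level filtering: an in-memory index
# maps each feature ID to the VPU shard that stores it, so only the
# shards that matter are ever opened. All names here are invented.
VPU_INDEX = {
    "8318793": "vpu_13",
    "8318787": "vpu_13",
    "1074884": "vpu_02",
}

ALL_SHARDS = ["vpu_01", "vpu_02", "vpu_13", "vpu_14"]  # stand-in for all stored shards

def shards_to_read(feature_ids):
    """Return only the shards containing the requested features."""
    return sorted({VPU_INDEX[fid] for fid in feature_ids})

# A 2-reach ratings query touches a single shard; every other shard is pruned
# before any row is read.
print(shards_to_read(["8318793", "8318787"]))  # ['vpu_13']
```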


NWM Forecast — 100 reaches, latest run

response = httpx.post(
    "https://flowfabric-api.lynker-spatial.com/v1/datasets/nws_owp_nwm_short_range/streamflow",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "query_mode": "run",
        "issue_time": "latest",
        "scope": "features",
        "feature_ids": ["8318793", "8318787", ...]   # 100 IDs
    }
)

Metric        | Value
Average       | 0.42 s
Std dev       | ±0.04 s
Min / Max     | 0.39 s / 0.47 s
Rows returned | 1,800 (18 time steps × 100 reaches)
Response size | 45 KB (Arrow IPC) vs ~120 KB JSON

NWM Reanalysis — 10 reaches, 10 years

response = httpx.post(
    "https://flowfabric-api.lynker-spatial.com/v1/datasets/nws_owp_nwm_reanalysis_3_0/streamflow",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "query_mode": "absolute",
        "start_time": "2013-01-01T00:00:00Z",
        "end_time":   "2022-12-31T23:00:00Z",
        "scope": "features",
        "feature_ids": ["8318793", "8318787", ...]   # 10 IDs
    }
)

Metric        | Value
Average       | 8.3 s
Std dev       | ±0.4 s
Min / Max     | 7.9 s / 8.7 s
Rows returned | 876,000 (10 reaches × 10 yr × 8,760 hr/yr)
Response size | 45 KB (Arrow IPC)

Why it's fast given the scale: The entire reanalysis corpus is petabyte-scale. The API reads only the Zarr chunks that intersect the requested feature IDs and time window — roughly 0.00001% of total stored data.
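The chunk arithmetic can be sketched as below, assuming a (time, feature)-chunked array; the chunk shapes are invented for illustration and are not the service's actual layout:

```python
# Hedged sketch of Zarr chunk selection: only chunks whose (time, feature)
# block intersects the query need to be fetched. Chunk shapes are
# hypothetical, not the real reanalysis store's.
TIME_CHUNK = 672       # hours per chunk (assumption)
FEATURE_CHUNK = 30000  # reaches per chunk (assumption)

def chunks_needed(time_indices, feature_indices):
    """Map requested (time, feature) indices to the set of chunk coordinates."""
    return {(t // TIME_CHUNK, f // FEATURE_CHUNK)
            for t in time_indices for f in feature_indices}

# 10 adjacent reaches over one chunk-width of hours hit exactly one chunk;
# everything else in the corpus is never touched.
print(len(chunks_needed(range(0, 672), range(100, 110))))  # 1
```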


Why Arrow IPC?

All endpoints default to Apache Arrow IPC — a binary columnar wire format.

Format    | Relative size | Python parse time
Arrow IPC | 1× (smallest) | ~1 ms
Parquet   | ~1.2×         | ~5 ms
JSON      | ~2.5–3×       | ~50 ms
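The size gap can be reproduced with the standard library alone — no Arrow required. This toy comparison encodes one forecast-sized column of values as JSON text versus fixed-width 32-bit floats (an assumption made for illustration; the API's actual element type is not documented here):

```python
# Rough stdlib-only illustration of why a binary columnar encoding beats
# JSON text: fixed-width values vs. decimal strings with separators.
import array
import json
import random

random.seed(0)
# ~one forecast response worth of streamflow-like values
values = [round(random.uniform(0.0, 500.0), 3) for _ in range(1800)]

json_bytes = len(json.dumps(values).encode())           # decimal text + commas
binary_bytes = len(array.array("f", values).tobytes())  # 4 bytes per float32

print(round(json_bytes / binary_bytes, 1))  # roughly 2-3x, matching the table
```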

Arrow is natively supported in Python (pyarrow), R (arrow), JavaScript (apache-arrow), and most BI tools.

Python

import pyarrow as pa, io
table = pa.ipc.open_stream(io.BytesIO(response.content)).read_all()
df = table.to_pandas()

R

library(arrow)
df <- as.data.frame(arrow::read_ipc_stream(httr::content(resp, "raw")))

JavaScript

import * as arrow from "apache-arrow";
const table = arrow.tableFromIPC(await response.arrayBuffer());

Competitive Analysis

Four direct access paths to NWM data exist alongside FlowFabric. Each fills a different niche; none covers FlowFabric's full scope.

CIROH Hub is a resource aggregator and documentation portal — it catalogs NWM data and access options across AWS, GCP, and Azure but is not itself a query service. The tools below are what CIROH Hub points to.

Capability                     | FlowFabric       | CIROH GCP API           | NOAA NOMADS         | NOAA NWPS       | AWS Archive
Server-side reach filter       | ✅ Yes           | ✅ Yes                  | ❌ No               | ✅ Yes          | Zarr only
Batch (many reaches, one call) | ✅ Yes           | ✅ Yes                  | ❌ No               | ❌ One per call | ❌ No
Forecast data                  | ✅ Yes           | ✅ Yes                  | ✅ Rolling 2–4 days | ✅ Operational  | ❌ No
Reanalysis (1979–2023)         | ✅ Yes           | ⚠ 2018–present only     | ❌ No               | ❌ No           | ✅ Static archive
Wire format                    | Arrow IPC        | JSON / CSV              | NetCDF (full file)  | JSON            | NetCDF / Zarr
Authentication                 | Bearer / API key | API key (CIROH members) | None                | None            | None (AWS SDK)
Access model                   | REST API         | REST API                | File download       | REST API        | S3 CLI/SDK
Public / no membership         | ✅ Yes           | ❌ CIROH members only   | ✅ Yes              | ✅ Yes          | ✅ Yes

CIROH / NOAA GCP API

hub.ciroh.org/docs/products/data-management/bigquery-api/

The CIROH GCP API (nwm-api.ciroh.org) is a FastAPI service backed by NWM data on Google Cloud. It exposes four endpoints: /forecast, /analysis-assim, /geometry, /return-period. The comids parameter provides true server-side reach filtering — the closest structural analog to FlowFabric's architecture.

Key gaps: access requires CIROH membership and an approved project; responses are JSON or CSV only; reanalysis coverage starts at September 2018 (operational GCP archive start, not 1979); and BigQuery scan-based billing can surface unexpected cost on large queries.

NOAA NOMADS File Server

nomads.ncep.noaa.gov/pub/data/nccf/com/nwm/prod/

NOMADS is NOAA's raw operational file server. Files appear as model cycles complete — roughly 30–60 minutes post-run — and are retained for approximately 2–4 days. Each file covers all CONUS reaches for one time step (~13 MB per channel_rt file). There is no per-reach filter; every query is a full file download followed by local parsing.
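Using the figures above, the minimum download for one short-range forecast (18 hourly steps, per the forecast benchmark earlier) works out as:

```python
# Back-of-envelope cost of the NOMADS path for one short-range forecast:
# one ~13 MB channel_rt file per time step, regardless of how many reaches
# you actually need.
FILE_MB = 13     # approximate size of one channel_rt file
TIME_STEPS = 18  # hourly steps in a short-range forecast

download_mb = FILE_MB * TIME_STEPS
print(download_mb)  # 234 MB downloaded to extract even a single reach
```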

NOAA Water Prediction Services (NWPS) API

api.water.noaa.gov/nwps/v1/docs/

The NOAA NWPS API is a public REST service with native reach-ID routing (GET /reaches/{reachId}/streamflow). It returns analysis assimilation (~3-day past window) and short/medium-range forecasts in a single JSON response. No authentication or API key required.

The constraints are structural: one reach per request (no batch endpoint), operational data only (no reanalysis), JSON-only responses with streamflow in ft³/s (no binary format), and best-effort availability — 503 responses have been observed. There is no stated SLA for third-party use.

AWS NWM Retrospective Archive

registry.opendata.aws/nwm-archive/

The AWS Open Data registry hosts all NWM retrospective runs: v1.2 (1993–2017), v2.0 (1993–2018), v2.1 (1979–2020), and v3.0 (1979–2023, ~250 TB). Data is free with no authentication; standard AWS egress charges apply outside us-east-1.

This is the authoritative source for long-record reanalysis — FlowFabric's reanalysis backend reads from this same archive. Accessing it directly requires xarray + zarr expertise: the Zarr format enables per-reach partial reads, but naive NetCDF access downloads the full file. There is no API layer, no operational data, and no updates after January 2023.


Overall Assessment

FlowFabric is the only service that combines:

  1. Both forecast and reanalysis — all alternatives cover one or the other, not both.
  2. True batch reach filtering in a public, no-membership REST API — the CIROH GCP API does this too, but requires CIROH affiliation.
  3. Binary columnar output (Arrow IPC) — every alternative returns JSON, CSV, or requires client-side NetCDF parsing.
  4. Measured production latency — 0.42 s for 100-reach forecasts, 8.3 s for 10-reach 10-year reanalysis.

The closest competitor for operational use is the CIROH GCP API (members only); the closest for reanalysis bulk work is direct S3 access to the AWS archive. FlowFabric's design goal is to make the reach-level query case fast and simple enough that building a direct S3/Zarr pipeline is rarely worth it.


Optimization Tips

1. Preview before you query

?estimate=true returns row and byte counts instantly and does not count against your quota:

estimate = httpx.post(
    "https://flowfabric-api.lynker-spatial.com/v1/datasets/nws_owp_nwm_reanalysis_3_0/streamflow?estimate=true",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "query_mode": "absolute",
        "start_time": "2013-01-01T00:00:00Z",
        "end_time":   "2022-12-31T23:00:00Z",
        "scope": "features",
        "feature_ids": ["8318793"]
    }
).json()

print(estimate["estimated_rows"], estimate["estimated_bytes"])
if estimate["would_exceed_sync_limits"]:
    print("Switch to mode='export'")

2. Batch your feature IDs

One call with 100 feature IDs is 50–100× faster than 100 separate calls:

# ❌ Slow — 100 separate requests
for fid in feature_ids:
    df = query(feature_ids=[fid])

# ✅ Fast — one request
df = query(feature_ids=feature_ids)

3. Use export mode for large queries

mode="sync" streams immediately — suitable for < 100 MB. For larger payloads, mode="export" writes a Parquet file to S3 and returns a pre-signed download link:

result = httpx.post(..., json={..., "mode": "export"}).json()
# result["download_url"] ready in 30–60 seconds
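A polling loop for the pre-signed link might look like the sketch below; the `check` callable and its return convention are assumptions for illustration, not part of the documented API:

```python
# Hedged sketch of waiting for an export to become ready: poll a caller-
# supplied check() (returns the download URL once available, else None)
# with a capped retry loop instead of a single fixed sleep.
import time

def wait_for_download(check, attempts=12, delay=5):
    """Poll check() until it returns a URL, or raise after the retry budget."""
    for _ in range(attempts):
        url = check()
        if url is not None:
            return url
        time.sleep(delay)
    raise TimeoutError("export not ready after polling window")
```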

4. Use /v1/stage instead of chaining two calls

The stage endpoint chains streamflow lookup and rating curve translation in one round trip:

# ❌ Two round trips
sf    = query_streamflow(dataset_id=..., feature_ids=...)
rc    = query_ratings(feature_ids=...)
stage = translate(sf, rc)

# ✅ One round trip
stage = httpx.post("/v1/stage", json={
    "dataset_id":   "nws_owp_nwm_analysis",
    "issue_time":   "latest",
    "feature_ids":  [...],
    "ratings_type": "rem"
}).json()

5. Filter by Strahler order for regional queries

When using POST /v1/features/bbox over a large area, stream_order_min reduces the feature count before any data query:

features = httpx.post("/v1/features/bbox", json={
    "bbox": [-110, 35, -100, 45],
    "stream_order_min": 4,    # main stems only
    "max_features": 1000
}).json()["feature_ids"]

Rate Limits & Quotas

Rate limits reset every 60 seconds. Current window status is included in every response:

X-RateLimit-Limit: 120
X-RateLimit-Remaining: 117
X-RateLimit-Reset: 1745836800

Tier       | Requests / min | Monthly data
free       | 20             | 3 GB
standard   | 120            | Unlimited
pro        | 600            | Unlimited
enterprise | 3,000          | Unlimited

Check remaining quota at GET /v1/me/usage.
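A client-side throttle built on these headers could look like the following sketch (the `floor` threshold is an arbitrary choice for illustration, not an API requirement):

```python
# Hedged client-side throttling sketch using the rate-limit headers shown
# above: when the window is nearly exhausted, wait until X-RateLimit-Reset.
import time

def seconds_until_safe(headers, now=None, floor=3):
    """Return how long to wait before the next request (0 if plenty left)."""
    remaining = int(headers["X-RateLimit-Remaining"])
    if remaining > floor:
        return 0.0
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix epoch seconds
    now = time.time() if now is None else now
    return max(0.0, float(reset_at - now))

headers = {"X-RateLimit-Remaining": "2", "X-RateLimit-Reset": "1745836800"}
# seconds_until_safe(headers, now=1745836795) -> 5.0
```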


Benchmark Methodology

Detail             | Value
Environment        | AWS us-east-1, deployed API
Round-trip latency | ~10 ms
Iterations         | 5 (3 for long reanalysis queries)
Cache state        | Warm (vpuid index pre-loaded)
Date measured      | January 9, 2026

Results represent steady-state performance after the first request warms the in-process partition index. Cold-start (first request after a fresh deployment) adds a one-time index initialization cost.
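The reported statistics can be reproduced with a small harness of this shape — illustrative only, with `run_query` standing in for any zero-argument callable that performs the request:

```python
# Illustrative benchmark harness matching the methodology: one warm-up
# call, then N timed iterations reporting avg / std dev / min / max.
import statistics
import time

def benchmark(run_query, iterations=5):
    run_query()  # warm-up request primes the in-process partition index
    times = []
    for _ in range(iterations):
        t0 = time.perf_counter()
        run_query()
        times.append(time.perf_counter() - t0)
    return {
        "avg": statistics.mean(times),
        "std": statistics.stdev(times),
        "min": min(times),
        "max": max(times),
    }
```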


Support