Documentation

How to test API latency with Delayt

Delayt measures real HTTP response times and reports percentile latency: the numbers that reflect what users actually experience, rather than averages that can hide slow requests.

What Delayt is for

Delayt helps you answer one question: how fast is my API for real users? It sends sequential HTTP requests to your endpoints, records every response time, and computes p50, p95, and p99 percentiles.

Use it when you need to:

  • Validate staging before a deploy
  • Compare two endpoints or API versions side by side
  • Share latency evidence with your team via a link
  • Gate CI/CD pipelines on p95 thresholds with the CLI

Delayt is not a load generator. Requests run sequentially so you get clean latency measurements without concurrency noise.
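To make "sequential" concrete, here is a minimal Python sketch of a one-request-at-a-time timing loop. It is not Delayt's implementation: the names `fetch_status` and `measure_sequential` are hypothetical, it only issues GET requests, and using status 0 for network failures is an assumption borrowed from how this page describes success rate.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List, Tuple


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send one GET request and return the HTTP status (0 on network failure)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # a 4xx/5xx answer is still a measured response
    except (urllib.error.URLError, TimeoutError, OSError):
        return 0  # assumption: 0 marks a timeout or network error


def measure_sequential(
    url: str, count: int, fetch: Callable[[str], int] = fetch_status
) -> List[Tuple[int, float]]:
    """Send `count` requests one after another; return (status, latency_ms) pairs."""
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        status = fetch(url)  # the next request starts only after this one returns
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        samples.append((status, elapsed_ms))
    return samples
```

Because each request waits for the previous one, the timings never include queueing caused by the tester itself, which is the "no concurrency noise" property described above.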

How Delayt differs from JMeter, Gatling, Locust, and k6

Tools like JMeter, Gatling, Locust, k6, and Artillery are built for load and stress testing. They simulate many concurrent users to find throughput limits, queue buildup, and failure points under pressure.

Delayt is built for something else: a fast percentile smoke check before you ship. One endpoint, sequential requests, p50/p95/p99 out of the box, and a one-liner in CI. No test plans, no virtual users, no cluster setup.

| | Load tools (JMeter, Gatling, Locust, k6…) | Delayt |
|---|---|---|
| Primary goal | Stress the system: concurrency, saturation, breaking points | Measure real latency distribution on a single path |
| Request pattern | Many parallel virtual users, ramps, scenarios | Sequential requests (clean timing, no concurrency noise) |
| Key metrics | Throughput, error rate under load, RPS, sometimes percentiles | p50, p95, p99 first; success rate and histograms in the UI |
| Setup time | Scripts, plugins, agents, or distributed workers | npx @delayt/cli run -u … -n 50 or paste a URL in the app |
| Best moment to run | Capacity planning, pre-scale drills, finding max RPS | Pre-deploy staging check, regression gate, quick team share |
| Collaboration | Reports exported from your load test run | Shareable /r/:slug links from the web app |

When to use which

  • Use Delayt when you want to know "is this endpoint still fast for typical users?" before a merge or deploy, or to compare two URLs side by side with percentile numbers your PM can read.
  • Use JMeter / Gatling / Locust / k6 when you need to answer "how many concurrent users can we handle?" or "where does the system break under load?"
  • Use both in a healthy pipeline: Delayt as a lightweight percentile gate on every PR; load tools on a schedule or before major releases.

Delayt will not replace a load test. Sequential requests cannot expose contention, connection pool exhaustion, or queueing that only appears under concurrency. That is by design: you get a clear latency baseline without mixing in load-generator artifacts.

CI example (Delayt's sweet spot)

# Fail the build if p95 on staging health exceeds 500ms
npx @delayt/cli run \
  -u https://staging.api.example.com/health \
  -n 50 \
  --assert-p95=500 \
  --output json -q

Load tools can do performance gates too, but they need more wiring. Delayt fits a quick staging smoke check you can keep in the pipeline.

Why percentiles, not averages

An average can look great while a fraction of requests are slow. Percentiles describe the median, the tail, and the worst outliers in your sample.

| Metric | Meaning |
|---|---|
| p50 | Half of requests finished at or below this time. Typical experience. |
| p95 | 95% of requests finished at or below this time. Start here when optimizing. |
| p99 | 99% of requests finished at or below this time; only the slowest 1% took longer. Catches spikes and cold starts. |
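A small worked example shows how an average can hide a slow tail. The sketch below uses the nearest-rank percentile definition, which is an assumption: this page does not say which method Delayt uses, and an interpolating method would give slightly different numbers. The sample data is made up.

```python
from typing import Sequence


def percentile(latencies_ms: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(latencies_ms)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]


# Hypothetical sample: 90 fast responses (100 ms) and 10 slow ones (2000 ms).
sample = [100.0] * 90 + [2000.0] * 10

mean = sum(sample) / len(sample)  # 290.0 ms, which matches no real request
p50 = percentile(sample, 50)      # 100.0 ms, the typical experience
p95 = percentile(sample, 95)      # 2000.0 ms, the tail the mean hides
```

Here the mean suggests a 290 ms API, while one request in ten actually takes two seconds; p95 makes that visible.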

Local setup

Delayt needs PostgreSQL for run history and shareable links. The web UI and API run locally.

  1. Start the database:
docker compose up -d
  2. Install dependencies and run:
npm install
npm run dev

Open http://localhost:3000 for the app.

Optional: copy app/.env.example to app/.env if you need custom database settings.

How to test APIs in the app

  1. Open the dashboard

    Go to /app. You will see the endpoint composer and run history in the sidebar.

  2. Add one or more endpoints

    Paste a URL, choose the HTTP method (GET, POST, PUT, PATCH, DELETE), and set the request count (1–20 per endpoint on the web app). You can add up to 10 endpoints in a single run. For 50+ requests, use the CLI. See the hint under the stepper or the CLI panel after a run.

  3. Configure the request (tabs)

    Use the composer tabs: Parameters for query strings, Body for JSON on POST/PUT/PATCH, Headers for custom headers (API keys, User-Agent, etc.), and Authorization for Bearer tokens. See Auth & headers for how missing credentials behave.

  4. Run the test

    Click Run test. Delayt sends requests sequentially, shows live progress, and computes results when the run completes.

  5. Review and compare

    Use the Results, Histogram, Scatter, and Compare tabs. Re-open past runs from the sidebar history.

Auth, headers, and private APIs

Delayt is a latency tester: it always sends the number of requests you configured and records whatever your API returns. It does not block the run when auth or headers are missing.

Where to put each value

| Tab | Use for |
|---|---|
| Parameters | Query string key/value pairs (merged into the URL). Empty rows are ignored. |
| Body | JSON payload for POST, PUT, or PATCH. |
| Headers | Any custom header, e.g. X-API-Key, X-App-Source, User-Agent. Only filled rows are sent. |
| Authorization | Bearer token only. Delayt adds the Bearer prefix if you paste the raw token and sends Authorization: Bearer …. |
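The two behaviours in the table that are easiest to get wrong are the parameter merge and the Bearer prefix. The following Python sketch illustrates both under stated assumptions: the helper names `merge_params` and `bearer_header` are hypothetical, a row counts as "empty" when its key is blank, and the prefix check is case-insensitive. Delayt's actual rules may differ in these details.

```python
from typing import Dict, List, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def merge_params(url: str, rows: List[Tuple[str, str]]) -> str:
    """Append filled key/value rows to whatever query string the URL already has."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs += [(key, value) for key, value in rows if key.strip()]  # skip empty rows
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def bearer_header(token: str) -> Dict[str, str]:
    """Build the Authorization header, adding the Bearer prefix to a raw token."""
    token = token.strip()
    if not token:
        return {}  # nothing pasted: send no Authorization header
    if not token.lower().startswith("bearer "):
        token = f"Bearer {token}"
    return {"Authorization": token}
```

For example, `merge_params("https://api.example.com/search?q=a", [("page", "2"), ("", "")])` yields `https://api.example.com/search?q=a&page=2`, and both `bearer_header("abc")` and `bearer_header("Bearer abc")` produce the same header.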

Runs without auth still execute

If you leave Authorization and Headers empty, Delayt still runs every request. Your API may respond with 401 or 403; those responses are measured like any other HTTP response. The run completes; the API call itself failed from your API's perspective.

Delayt does not validate credentials before starting and does not send any preflight request. Check the Success % column after a run: if auth was missing, you will usually see 0% when every response is 4xx/5xx, or a low rate when only some are.

How success rate is calculated

Percentiles (p50, p95, p99) are computed from all response times in the run, including responses that failed auth. Success rate counts a response as an error when its status is 0 (timeout or network failure) or 400 and above. A run full of 403 responses still shows latency numbers, but its success rate is 0%.
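The success-rate rule above can be sketched in a few lines of Python. The function names are hypothetical, and returning 0.0 for an empty run is an assumption; the error condition itself (status 0 or 400 and above) comes from this page.

```python
from typing import Sequence


def is_error(status: int) -> bool:
    """Status 0 (timeout/network) and 400+ count as errors."""
    return status == 0 or status >= 400


def success_rate(statuses: Sequence[int]) -> float:
    """Percentage of responses that were not errors."""
    if not statuses:
        return 0.0  # assumption: an empty run reports 0%
    succeeded = sum(1 for status in statuses if not is_error(status))
    return 100.0 * succeeded / len(statuses)
```

With this rule, `success_rate([403] * 20)` is 0.0 even though all 20 latencies still feed the percentiles, and `success_rate([200, 200, 0, 500])` is 50.0.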

Who sends the request? The Delayt backend, not your browser, makes the HTTP calls to your URL. In local dev the backend runs on your machine; on a hosted deploy, targets must be reachable from that server (public URLs or VPN). A service on your laptop's localhost is not reachable from a cloud-hosted Delayt instance.

Share links and secrets

Headers and Bearer tokens are stored with the run so shared links reproduce the same request. Treat share URLs like any secret-bearing config: only send them to people who should see those credentials.

CLI equivalent

Pass headers with repeated -H flags. After a web run, use the CLI export panel on the dashboard to copy commands with or without auth headers.

delayt run -u https://api.example.com/v1/resource \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "X-App-Source: my-app/1.0" \
  -n 50

Reading your results

Metric cards

p50, p95, and p99 appear at the top for each endpoint. Focus on p95 first. If it is high, a meaningful slice of users is waiting too long.

Results tab

Per-endpoint summary with min, max, mean, percentiles, and Success %. If you tested without required auth or headers, expect a low success rate even though the run finished. See Auth & headers.

Histogram

Distribution of response times. Useful for spotting multi-modal latency (e.g. cache hits vs misses).
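As a rough illustration of what the histogram shows, the Python sketch below groups latencies into fixed-width buckets. The 50 ms bucket width, the function name, and the sample values are all hypothetical; the web app chooses its own bucketing.

```python
from collections import Counter
from typing import Dict, Sequence


def histogram(latencies_ms: Sequence[float], bucket_ms: int = 50) -> Dict[int, int]:
    """Count samples per bucket; keys are the lower bound of each bucket in ms."""
    counts = Counter(int(ms // bucket_ms) * bucket_ms for ms in latencies_ms)
    return dict(sorted(counts.items()))


# Hypothetical run: five cache hits near 15 ms, two cache misses near 215 ms.
buckets = histogram([12, 18, 15, 210, 14, 220, 16])
```

Two separate clusters with nothing in between (here the 0 ms and 200 ms buckets) are the multi-modal pattern to look for.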

Scatter

Request index vs latency over time. Helps spot drift or warming effects during a run.

Compare

Side-by-side percentile comparison when you tested multiple endpoints in one run.

Export

From the Results tab, download summary data as JSON or CSV, or export per-request raw rows (latency, status code, payload sizes) as Raw JSON / Raw CSV. Use Copy as Markdown for GitHub issues and PRs.

Stop a run

While a test is running, click Stop on the progress bar to cancel remaining requests. The run stops between requests; the in-flight request still completes.
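The "stops between requests" behaviour can be pictured as a flag that is checked before each request is sent. This is a sketch of the general pattern, not Delayt's code; `run_until_stopped` and `send_one` are hypothetical names.

```python
import threading
from typing import Callable, List


def run_until_stopped(
    count: int, send_one: Callable[[], float], stop: threading.Event
) -> List[float]:
    """Send up to `count` requests, checking the stop flag before each one."""
    latencies = []
    for _ in range(count):
        if stop.is_set():
            break  # cancel the remaining requests
        latencies.append(send_one())  # a request already started always finishes
    return latencies
```

Because the flag is only read between iterations, a request that is already in flight when Stop is clicked still completes and is recorded.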

Sharing results

Every completed run gets a short slug. Copy the share link (e.g. /r/abc123) and send it to teammates. They can view results without re-running the test.

Your sidebar shows runs saved in a browser cookie on this device. No account required. Opening a shared link adds that run to your history here. Use Clear in the sidebar to remove local history (share links still work; server data is unchanged).

CLI for CI/CD

The CLI runs without a database. Use it for 50–200 requests, auth headers, and CI gates. The web app caps at 20 requests per endpoint.

Request count

The default is 50 requests. Web runs max out at 20. Use the CLI for fuller samples and more stable p95/p99.

# Quick smoke (15 requests)
npx @delayt/cli run -u https://api.example.com/health -n 15

# Recommended for percentiles (50 requests)
npx @delayt/cli run -u https://api.example.com/health -n 50

# Maximum per run
npx @delayt/cli run -u https://api.example.com/health -n 200

With fewer than ~30 requests, p95/p99 swing easily from one slow response. If asserts fail on a small -n, retry with -n 50 before changing thresholds.
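To see why small samples swing, count how many slow responses it takes to move a percentile. The sketch below assumes the nearest-rank definition (an assumption; Delayt's exact method is not stated here), under which the percentile is simply the value at a fixed position in the sorted sample.

```python
def slow_responses_to_move(n: int, p: int) -> int:
    """How many of the slowest samples it takes to reach the p-th percentile slot."""
    rank = max(1, (p * n + 99) // 100)  # integer ceil(p * n / 100)
    return n - rank + 1


for n in (15, 20, 50, 200):
    print(n, slow_responses_to_move(n, 95), slow_responses_to_move(n, 99))
```

Under this assumption, with -n 15 a single slow response sets both p95 and p99. With -n 50 it takes three slow responses to move p95, though p99 is still the single slowest request. With -n 200 it takes eleven to move p95 and three to move p99.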

Auth & headers

Pass headers with repeated -H. Same flags work for private APIs and CI secrets.

npx @delayt/cli run \
  -u https://api.example.com/data \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "X-API-Key: YOUR_KEY" \
  -n 50

If Success % is low, the run still finished. You are probably missing auth. See the earlier section on auth, headers, and private APIs.

Query params & POST body

Put query strings in the URL. There is no separate params flag yet. Mirror what you use in the web composer Parameters tab.

# GET with query params
npx @delayt/cli run -u "https://api.example.com/search?q=HDFC" -n 50

# POST with JSON body
npx @delayt/cli run \
  -u https://api.example.com/items \
  -m POST \
  -d '{"name":"test"}' \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -n 50

Assertions & thresholds

The CLI exits with code 1 when a percentile exceeds your budget. Pick thresholds for your API. Do not copy 500ms from the web export if the target is slower or noisy.

npx @delayt/cli run \
  -u https://api.example.com/health \
  -n 50 \
  --assert-p95=500 \
  --assert-p99=1000

--assert-p50, --assert-p95, --assert-p99 are in milliseconds. Omit them to only measure without failing.
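The gate logic amounts to comparing each measured percentile with its budget. The Python sketch below is illustrative only: the function names are hypothetical, and treating "exceeds" as strictly greater than the budget is an assumption about the CLI's comparison.

```python
from typing import Dict, List, Optional


def check_budgets(
    measured_ms: Dict[str, float], budgets_ms: Dict[str, Optional[float]]
) -> List[str]:
    """Return one message per percentile that is over its budget."""
    failures = []
    for name, limit in budgets_ms.items():
        if limit is None:
            continue  # no assert flag given: measure only
        if measured_ms[name] > limit:  # assumption: equal to the budget still passes
            failures.append(f"{name} {measured_ms[name]:.0f}ms > {limit:.0f}ms")
    return failures


def exit_code(failures: List[str]) -> int:
    """0 when every budget holds, 1 when any assertion failed."""
    return 1 if failures else 0


# Hypothetical measurements checked against --assert-p95=500 --assert-p99=1000.
failures = check_budgets(
    {"p50": 120.0, "p95": 480.0, "p99": 1200.0},
    {"p50": None, "p95": 500.0, "p99": 1000.0},
)
```

In this example p95 is within budget but p99 is not, so the gate would fail with exit code 1.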

CI / JSON output & share

# Upload CLI results to the dashboard
DELAYT_SHARE_URL=https://www.delayt.foo \
  npx @delayt/cli run -u https://api.example.com/health -n 50 --share

# GitHub Actions (a workflow step, YAML)
- run: npx @delayt/cli@latest run -u ${{ secrets.API_URL }} -n 50 --assert-p95=500 -q -o json

# Local install
npm install -g @delayt/cli
delayt run -u https://api.example.com/health -n 50

Exit codes: 0 pass · 1 assertion failed · 2 error. Use -q to hide progress; -o json for machines.
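If you call the CLI from a script instead of directly from a pipeline step, the exit code is the part to act on. The Python sketch below wraps any command and maps the three documented exit codes to a label. `run_gate` is a hypothetical helper, the demo runs a stand-in Python process so it works without network access, and the JSON output is passed through unparsed because its schema is not described here.

```python
import subprocess
import sys
from typing import List, Tuple

# Exit codes as documented above.
OUTCOMES = {0: "pass", 1: "assertion failed", 2: "error"}


def run_gate(command: List[str]) -> Tuple[int, str, str]:
    """Run a command and return (exit_code, outcome_label, captured_stdout)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    outcome = OUTCOMES.get(completed.returncode, "unknown")
    return completed.returncode, outcome, completed.stdout


# Stand-in process that exits with code 2; in CI the command would be the
# CLI call instead, e.g.
# ["npx", "@delayt/cli", "run", "-u", url, "-n", "50", "--assert-p95=500", "-q", "-o", "json"]
demo = run_gate([sys.executable, "-c", "raise SystemExit(2)"])
```

Distinguishing 1 from 2 matters in practice: a failed assertion points at the API, while an error points at the test setup (bad URL, network, or arguments).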

Set DELAYT_DOCS_URL=https://www.delayt.foo so CLI footers link to this page. After a web run, use the CLI export panel to copy a starter command.

Tips for accurate tests

  • Web runs use up to 20 requests for a quick smoke test. Use the CLI with 30–50 requests per endpoint for stable percentile estimates.
  • Test against staging that mirrors production. Cold starts and auth matter.
  • Run the same test before and after a change to see real delta, not noise.
  • Watch p99 alongside p95. A high p99 often means timeouts or occasional backend contention.
  • For private APIs, fill Authorization and Headers before Run. A run without them still executes but usually returns 401/403; check Success %, not just percentiles.
  • Put query params in the Parameters tab (or directly in the URL). Send API keys and custom headers through the Headers tab; do not rely on the URL alone to carry them.