Documentation
How to test API latency with Delayt
Delayt measures real HTTP response times and reports percentile latency: the numbers that reflect what users actually experience, rather than averages that can mislead.
What Delayt is for
Delayt helps you answer one question: how fast is my API for real users? It sends sequential HTTP requests to your endpoints, records every response time, and computes p50, p95, and p99 percentiles.
Use it when you need to:
- Validate staging before a deploy
- Compare two endpoints or API versions side by side
- Share latency evidence with your team via a link
- Gate CI/CD pipelines on p95 thresholds with the CLI
Delayt is not a load generator. Requests run sequentially so you get clean latency measurements without concurrency noise.
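To make "sequential" concrete, here is a minimal sketch of a sequential timing loop built on curl. It is not Delayt's implementation, only an illustration of the request pattern. The function name `measure` and the `CURL` override variable are hypothetical. `time_total` and `http_code` are standard curl `--write-out` variables, and curl reports `000` when no HTTP response arrives.

```shell
# measure URL N: send N requests one after another and print
# "<latency_ms> <status>" per request. Illustrative only.
measure() (
  url=$1
  n=$2
  i=0
  while [ "$i" -lt "$n" ]; do
    # The next request starts only after this one has returned,
    # so no two requests are ever in flight at the same time.
    ${CURL:-curl} -s -o /dev/null \
      -w '%{time_total} %{http_code}\n' "$url" || true
    i=$((i + 1))
  done |
    # curl reports seconds; convert to whole milliseconds.
    awk '{ printf "%d %d\n", $1 * 1000 + 0.5, $2 }'
)
```

For example, `measure https://staging.api.example.com/health 10` would print ten lines, one per request, in the order the requests were sent.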
How Delayt differs from JMeter, Gatling, Locust, and k6
Tools like JMeter, Gatling, Locust, k6, and Artillery are built for load and stress testing. They simulate many concurrent users to find throughput limits, queue buildup, and failure points under pressure.
Delayt is built for something else: a fast percentile smoke check before you ship. One endpoint, sequential requests, p50/p95/p99 out of the box, and a one-liner in CI. No test plans, no virtual users, no cluster setup.
| | Load tools (JMeter, Gatling, Locust, k6…) | Delayt |
|---|---|---|
| Primary goal | Stress the system: concurrency, saturation, breaking points | Measure real latency distribution on a single path |
| Request pattern | Many parallel virtual users, ramps, scenarios | Sequential requests (clean timing, no concurrency noise) |
| Key metrics | Throughput, error rate under load, RPS, sometimes percentiles | p50, p95, p99 first; success rate and histograms in the UI |
| Setup time | Scripts, plugins, agents, or distributed workers | npx @delayt/cli run -u … -n 50 or paste a URL in the app |
| Best moment to run | Capacity planning, pre-scale drills, finding max RPS | Pre-deploy staging check, regression gate, quick team share |
| Collaboration | Reports exported from your load test run | Shareable /r/:slug links from the web app |
When to use which
- Use Delayt when you want to know "is this endpoint still fast for typical users?" before a merge or deploy, or to compare two URLs side by side with percentile numbers your PM can read.
- Use JMeter / Gatling / Locust / k6 when you need to answer "how many concurrent users can we handle?" or "where does the system break under load?"
- Use both in a healthy pipeline: Delayt as a lightweight percentile gate on every PR; load tools on a schedule or before major releases.
Delayt will not replace a load test. Sequential requests cannot expose contention, connection pool exhaustion, or queueing that only appears under concurrency. That is by design: you get a clear latency baseline without mixing in load-generator artifacts.
CI example (Delayt's sweet spot)
# Fail the build if p95 on staging health exceeds 500ms
npx @delayt/cli run \
-u https://staging.api.example.com/health \
-n 50 \
--assert-p95=500 \
--output json -q
Load tools can do performance gates too, but they need more wiring. Delayt fits a quick staging smoke check you can keep in the pipeline.
Why percentiles, not averages
An average can look great while a fraction of requests are slow. Percentiles describe the median, the tail, and the worst outliers in your sample.
| Metric | Meaning |
|---|---|
| p50 | Half of requests finished at or below this time. Typical experience. |
| p95 | 95% of requests finished at or below this time. Start here when optimizing. |
| p99 | 99% of requests finished at or below this time; only the slowest 1% took longer. Catches spikes and cold starts. |
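The gap between an average and a percentile is easy to reproduce by hand. The sketch below assumes the nearest-rank percentile method; how Delayt computes its percentiles is not specified here, so treat the method as an assumption. The function names `percentile`, `mean`, and `sample` are hypothetical, and the sample data is made up for illustration.

```shell
# percentile P: read one latency (ms) per line on stdin and print the
# nearest-rank P-th percentile (assumed method, see the note above).
percentile() (
  p=$1
  sort -n | awk -v p="$p" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = p * NR / 100
      idx = int(rank)
      if (idx < rank) idx++   # round up to the next whole rank
      if (idx < 1) idx = 1
      print v[idx]
    }'
)

# mean: arithmetic mean of the same input, truncated to whole ms.
mean() (
  awk '{ s += $1 } END { if (NR == 0) exit 1; printf "%d\n", s / NR }'
)

# sample: 18 fast responses (100 ms) and 2 slow ones (2000 ms).
sample() {
  i=0
  while [ "$i" -lt 18 ]; do echo 100; i=$((i + 1)); done
  echo 2000
  echo 2000
}
```

With this sample, `sample | mean` prints 290 and `sample | percentile 50` prints 100, while `sample | percentile 95` prints 2000. The average hides the fact that one request in ten took two seconds.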
Local setup
Delayt needs PostgreSQL for run history and shareable links. The web UI and API run locally.
- Start the database:
docker compose up -d
- Install dependencies and run:
npm install
npm run dev
Open http://localhost:3000 for the app.
Optional: copy app/.env.example to app/.env if you need custom database settings.
How to test APIs in the app
1. Open the dashboard
   Go to /app. You will see the endpoint composer and run history in the sidebar.
2. Add one or more endpoints
   Paste a URL, choose the HTTP method (GET, POST, PUT, PATCH, DELETE), and set the request count (1–20 per endpoint on the web app). You can add up to 10 endpoints in a single run. For 50+ requests, use the CLI. See the hint under the stepper or the CLI panel after a run.
3. Configure the request (tabs)
   Use the composer tabs: Parameters for query strings, Body for JSON on POST/PUT/PATCH, Headers for custom headers (API keys, User-Agent, etc.), and Authorization for Bearer tokens. See Auth & headers for how missing credentials behave.
4. Run the test
   Click Run test. Delayt sends requests sequentially, shows live progress, and computes results when the run completes.
5. Review and compare
   Use the Results, Histogram, Scatter, and Compare tabs. Re-open past runs from the sidebar history.
Auth, headers, and private APIs
Delayt is a latency tester: it always sends the number of requests you configured and records whatever your API returns. It does not block the run when auth or headers are missing.
Where to put each value
| Tab | Use for |
|---|---|
| Parameters | Query string key/value pairs (merged into the URL). Empty rows are ignored. |
| Body | JSON payload for POST, PUT, or PATCH. |
| Headers | Any custom header, e.g. X-API-Key, X-App-Source, User-Agent. Only filled rows are sent. |
| Authorization | Bearer token only. Delayt adds the Bearer prefix if you paste the raw token and sends Authorization: Bearer …. |
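When you move from the Authorization tab to a script, the same prefix handling can be sketched in a few lines. This mirrors the behavior described in the table and is not taken from Delayt's code. The function name `bearer_header` is hypothetical, and the exact match on a leading `Bearer ` (case-sensitive) is an assumption.

```shell
# bearer_header TOKEN: print an Authorization header line, adding the
# "Bearer " prefix only when the pasted value does not already have it.
bearer_header() (
  token=$1
  case $token in
    "Bearer "*) printf 'Authorization: %s\n' "$token" ;;
    *)          printf 'Authorization: Bearer %s\n' "$token" ;;
  esac
)
```

The output can be passed straight to the CLI's `-H` flag, for example `-H "$(bearer_header "$API_TOKEN")"`, where `API_TOKEN` is a placeholder for your own secret.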
Runs without auth still execute
If you leave Authorization and Headers empty, Delayt still runs every request. Your API may respond with 401 or 403; those responses are measured like any other HTTP response. The run completes; the API call itself failed from your API's perspective.
Delayt does not validate credentials before starting and does not send a preflight request. Check the Success % column after a run: if auth was missing, you will usually see 0%, or a low rate if only some responses are 4xx/5xx.
How success rate is calculated
Percentiles (p50, p95, p99) are computed from all response times in the run, including failed auth. Success rate counts responses with status 0 (timeout/network) or 400+ as errors. A run full of 403 responses still shows latency numbers, but success rate should be 0%.
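The rule above can be written out as a short sketch. It follows the stated definition (status 0 or 400+ counts as an error, everything else as a success) and is not taken from Delayt's code. The function name `success_rate` is hypothetical, and truncating the percentage to a whole number is an assumption.

```shell
# success_rate: read one HTTP status code per line on stdin and print
# the percentage of successful responses, truncated to a whole number.
# Status 0 (timeout/network) and 400+ count as errors.
success_rate() (
  awk '
    { total++; if ($1 != 0 && $1 < 400) ok++ }
    END {
      if (total == 0) exit 1
      printf "%d\n", ok * 100 / total
    }'
)
```

A run that returns only 403 responses yields 0 here, even though every one of those responses still has a latency that feeds the percentiles.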
Who sends the request? The Delayt backend makes HTTP calls to your URL, not your browser. In local dev that is your machine; on a hosted deploy, targets must be reachable from that server (public URLs or VPN). localhost on your laptop is not reachable from a cloud-hosted Delayt instance.
Share links and secrets
Headers and Bearer tokens are stored with the run so shared links reproduce the same request. Treat share URLs like any secret-bearing config: only send them to people who should see those credentials.
CLI equivalent
Pass headers with repeated -H flags. After a web run, use the CLI export panel on the dashboard to copy commands with or without auth headers.
delayt run -u https://api.example.com/v1/resource \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "X-App-Source: my-app/1.0" \
-n 50
Reading your results
Metric cards
p50, p95, and p99 appear at the top for each endpoint. Focus on p95 first. If it is high, a meaningful slice of users is waiting too long.
Results tab
Per-endpoint summary with min, max, mean, percentiles, and Success %. If you tested without required auth or headers, expect a low success rate even though the run finished. See Auth & headers.
Histogram
Distribution of response times. Useful for spotting multi-modal latency (e.g. cache hits vs misses).
Scatter
Request index vs latency over time. Helps spot drift or warming effects during a run.
Compare
Side-by-side percentile comparison when you tested multiple endpoints in one run.
Export
From the Results tab, download summary data as JSON or CSV, or export per-request raw rows (latency, status code, payload sizes) as Raw JSON / Raw CSV. Use Copy as Markdown for GitHub issues and PRs.
Stop a run
While a test is running, click Stop on the progress bar to cancel remaining requests. The run stops between requests; the in-flight request still completes.
Sharing results
Every completed run gets a short slug. Copy the share link (e.g. /r/abc123) and send it to teammates. They can view results without re-running the test.
Your sidebar shows runs saved in a browser cookie on this device. No account required. Opening a shared link adds that run to your history here. Use Clear in the sidebar to remove local history (share links still work; server data is unchanged).
CLI for CI/CD
The CLI runs without a database. Use it for 50–200 requests, auth headers, and CI gates. The web app caps at 20 requests per endpoint.
Request count
Default is 50 requests. Web runs max at 20. Use the CLI for fuller samples and stabler p95/p99.
# Quick smoke (15 requests)
npx @delayt/cli run -u https://api.example.com/health -n 15
# Recommended for percentiles (50 requests)
npx @delayt/cli run -u https://api.example.com/health -n 50
# Maximum per run
npx @delayt/cli run -u https://api.example.com/health -n 200
With fewer than ~30 requests, p95/p99 swing easily from one slow response. If asserts fail on a small -n, retry with -n 50 before changing thresholds.
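A little arithmetic shows why small samples are fragile. The sketch below assumes nearest-rank percentiles, where the P-th percentile of N sorted values sits at position ceil(P × N / 100); Delayt may compute percentiles differently (for example with interpolation), so read the numbers as an illustration. The function names `rank_index` and `slow_needed` are hypothetical.

```shell
# rank_index P N: 1-based position of the P-th percentile among N
# sorted samples under nearest-rank, i.e. ceil(P * N / 100).
rank_index() ( echo $(( ($1 * $2 + 99) / 100 )) )

# slow_needed P N: how many slow responses it takes before the P-th
# percentile itself lands on a slow value.
slow_needed() ( echo $(( $2 - ($1 * $2 + 99) / 100 + 1 )) )
```

Under this assumption, `slow_needed 95 15` prints 1: with 15 requests, p95 is simply the slowest response. `slow_needed 95 50` prints 3, so a single outlier no longer moves p95, although `slow_needed 99 50` still prints 1.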
Auth & headers
Pass headers with repeated -H. Same flags work for private APIs and CI secrets.
npx @delayt/cli run \
-u https://api.example.com/data \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "X-API-Key: YOUR_KEY" \
-n 50
If Success % is low, the run still finished. You are probably missing auth. See Auth & headers.
Query params & POST body
Put query strings in the URL. There is no separate params flag yet. Mirror what you use in the web composer Parameters tab.
# GET with query params
npx @delayt/cli run -u "https://api.example.com/search?q=HDFC" -n 50
# POST with JSON body
npx @delayt/cli run \
-u https://api.example.com/items \
-m POST \
-d '{"name":"test"}' \
-H "Authorization: Bearer YOUR_TOKEN" \
-n 50
Assertions & thresholds
The CLI exits with code 1 when a percentile exceeds your budget. Pick thresholds for your API. Do not copy 500ms from the web export if the target is slower or noisy.
npx @delayt/cli run \
-u https://api.example.com/health \
-n 50 \
--assert-p95=500 \
--assert-p99=1000
--assert-p50, --assert-p95, --assert-p99 are in milliseconds. Omit them to only measure without failing.
CI / JSON output & share
# Upload CLI results to the dashboard
DELAYT_SHARE_URL=https://www.delayt.foo \
npx @delayt/cli run -u https://api.example.com/health -n 50 --share
# GitHub Actions
- run: npx @delayt/cli@latest run -u ${{ secrets.API_URL }} -n 50 --assert-p95=500 -q -o json
# Local install
npm install -g @delayt/cli
delayt run -u https://api.example.com/health -n 50
Exit codes: 0 pass · 1 assertion failed · 2 error. Use -q to hide progress; -o json for machines.
Set DELAYT_DOCS_URL=https://www.delayt.foo so CLI footers link to this page. After a web run, use the CLI export panel to copy a starter command.
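In a CI script it can help to turn the exit codes listed above into readable log lines while still failing the job. The wrapper below is a minimal sketch: the function name `run_gate` and the message wording are hypothetical, and only the 0/1/2 meanings come from this page. Messages go to stderr so they do not mix with JSON on stdout.

```shell
# run_gate CMD...: run a latency gate command, log what its exit code
# means, and return the same code so the CI step passes or fails.
run_gate() (
  code=0
  "$@" || code=$?
  case $code in
    0) echo "latency gate passed" >&2 ;;
    1) echo "latency budget exceeded (assertion failed)" >&2 ;;
    2) echo "latency run errored" >&2 ;;
    *) echo "unexpected exit code $code" >&2 ;;
  esac
  exit "$code"
)
```

A typical call might be `run_gate npx @delayt/cli run -u "$API_URL" -n 50 --assert-p95=500 -q -o json > result.json`, where `API_URL` and `result.json` are placeholders.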
Tips for accurate tests
- Web runs use up to 20 requests for a quick smoke test. Use the CLI with 30–50 requests per endpoint for stable percentile estimates.
- Test against staging that mirrors production. Cold starts and auth matter.
- Run the same test before and after a change to see real delta, not noise.
- Watch p99 alongside p95. A high p99 often means timeouts or occasional backend contention.
- For private APIs, fill Authorization and Headers before Run. A run without them still executes but usually returns 401/403; check Success %, not just percentiles.
- Put query params in the Parameters tab (or URL). API keys and custom headers belong on the Headers tab, not only in the URL bar.