Guide · 1 March 2026 · 10 min read

API Monitoring: A Developer's Complete Guide

Why API Monitoring Differs from Website Monitoring

If you already have website monitoring in place, you might wonder why APIs need separate attention. After all, an HTTP check is an HTTP check, isn't it? In practice, the differences are significant enough that monitoring APIs and monitoring web pages require fundamentally different approaches.

Website monitoring typically checks that a URL returns a 200 status code and that the response contains expected content. That works well for pages served to browsers. APIs, however, have a much richer contract. An API endpoint that returns a 200 status code might still be broken if the response body contains an error message, if the data is stale, if pagination is malfunctioning, or if the response time has degraded to the point where dependent services are timing out.

APIs also serve a different audience. When a website is slow, a human user might wait a few seconds and try again. When an API is slow, an automated client hits a timeout, retries, potentially cascades failures to its own consumers, and may generate thousands of errors per minute. The blast radius of an API issue is often far larger than an equivalent website problem because APIs sit at the intersection of multiple systems.

Authentication adds another layer of complexity. Most websites are publicly accessible for monitoring purposes, but APIs frequently require authentication tokens, API keys, OAuth flows, or mutual TLS. Your monitoring must handle these authentication mechanisms correctly, which means managing credentials, refreshing tokens, and alerting when authentication itself fails as distinct from the endpoint being down.

Finally, APIs expose structured data that can and should be validated programmatically. A website monitor might check for the presence of a string on the page. An API monitor should parse the JSON response, validate the schema, check specific field values, verify data freshness, and assert business logic conditions. This response body validation is where API monitoring provides its greatest value -- catching subtle data quality issues that a simple status code check would miss entirely.

What to Monitor: The Five Pillars of API Health

Comprehensive API monitoring covers five distinct dimensions. Monitoring only one or two of these leaves dangerous blind spots.

1. Availability (uptime)

The most basic check: does the endpoint respond at all? This means verifying that a request to your API returns a valid HTTP response within a reasonable timeout window. A complete connection failure, a TLS handshake error, or a timeout all indicate an availability problem. As with all uptime monitoring, checking from multiple geographic locations prevents false positives caused by network issues between the monitor and your servers.

2. Correctness (status codes and response validation)

An API can be "up" and still be broken. Monitor the HTTP status code -- a 200 response is expected, but a 500, 502, or 503 indicates a server-side problem. Beyond status codes, validate the response body. For a REST API returning JSON, this might include checking that the response is valid JSON, that required fields are present, that field values fall within expected ranges, and that the response conforms to your API schema. Understanding HTTP status codes and their implications is essential for configuring meaningful checks.
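A correctness check of this kind can be sketched in a few lines of Python. The field names (`id`, `status`, `total`) and the allowed status values below are hypothetical placeholders -- substitute your own schema:

```python
import json

def validate_order_response(status_code, body_text):
    """Validate an API response beyond the status code.

    Returns a list of problems; an empty list means the check passed.
    Field names and allowed values are illustrative, not a real schema.
    """
    problems = []
    if status_code != 200:
        problems.append(f"unexpected status code {status_code}")
    try:
        payload = json.loads(body_text)
    except ValueError:
        # A 200 with an unparseable body is still a failed check.
        return problems + ["response body is not valid JSON"]
    # Required fields must be present...
    for field in ("id", "status", "total"):
        if field not in payload:
            problems.append(f"missing required field '{field}'")
    # ...and values must fall within expected ranges.
    if payload.get("status") not in {"pending", "paid", "shipped"}:
        problems.append(f"unexpected status value {payload.get('status')!r}")
    if not isinstance(payload.get("total"), (int, float)) or payload.get("total", 0) < 0:
        problems.append("'total' is missing, non-numeric, or negative")
    return problems
```

A monitor built this way reports *why* a response failed, not just that it did, which shortens diagnosis considerably.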

3. Performance (latency)

Response time monitoring for APIs is critical because API consumers typically have strict timeout budgets. If your API consistently responds in 200ms but suddenly starts responding in 2 seconds, every downstream service that calls it is affected. Monitor percentile latencies (p50, p95, p99) rather than just averages, since averages hide tail latency problems that affect a significant portion of your users.

4. Data freshness

For APIs that serve data from a database or cache, monitor that the data is current. A common failure mode is an API that responds quickly with a 200 status code but serves stale data because a background sync process has failed. Include checks that verify timestamps in the response, compare data against known recent changes, or validate that counters and metrics reflect recent activity.
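A freshness assertion is short. This sketch assumes an ISO-8601 `updated_at` field (a hypothetical name) carrying an explicit UTC offset; adapt it to whatever timestamp your API exposes:

```python
from datetime import datetime, timedelta, timezone

def is_fresh(payload, max_age=timedelta(minutes=15)):
    """Return True when the payload's timestamp is recent enough.

    A stale value here usually means a background sync has silently
    failed, even though the endpoint itself still returns 200.
    """
    updated = datetime.fromisoformat(payload["updated_at"])
    return datetime.now(timezone.utc) - updated <= max_age
```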

5. Rate limiting and quota compliance

If your API enforces rate limits, monitor that they are working correctly. This is important from both sides: your monitoring should verify that legitimate requests are not being incorrectly rate-limited, and that rate limiting is actually enforced when it should be. Check the rate limit headers (X-RateLimit-Remaining, X-RateLimit-Reset) in responses to track consumption trends before they become problems.
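Tracking quota from those headers might look like the sketch below. The `X-RateLimit-*` names are the common convention, but providers vary (some use `RateLimit-Remaining` or similar), so treat the header names as an assumption:

```python
def rate_limit_status(headers, warn_fraction=0.2):
    """Inspect rate-limit headers and flag low remaining quota.

    warn_fraction: alert once less than this share of the quota remains.
    """
    limit = int(headers.get("X-RateLimit-Limit", 0))
    remaining = int(headers.get("X-RateLimit-Remaining", 0))
    if limit == 0:
        return "unknown"   # API does not expose rate-limit headers
    if remaining <= limit * warn_fraction:
        return "warning"   # quota being consumed faster than expected
    return "ok"
```

Logging this status on every check turns quota exhaustion from a surprise outage into a visible trend.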

Handling Authentication in API Monitors

Authentication is the aspect of API monitoring that causes the most operational friction. Unlike a public website check that simply fetches a URL, an authenticated API monitor must manage credentials securely and handle the authentication lifecycle correctly.

API key authentication

The simplest case. Your monitor includes an API key in a header (commonly Authorization: Bearer <key> or a custom header like X-API-Key). The key is static and doesn't expire. The main risk is key rotation -- when you rotate API keys, you must update your monitoring configuration simultaneously, or your monitors will start reporting false authentication failures.

Best practice: create a dedicated API key specifically for monitoring. Give it read-only permissions to the minimum set of endpoints you need to check. This limits the blast radius if the key is compromised and makes it easy to identify monitoring traffic in your logs.
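Building the authenticated check request is straightforward. In this sketch the key is a hardcoded placeholder for readability; in practice it would come from your secrets store, and the header scheme (`Bearer` vs. a custom `X-API-Key`) depends on the API:

```python
import urllib.request

API_KEY = "monitor-key-from-secrets-store"  # placeholder; load from env/secrets in practice

def build_monitor_request(url):
    """Construct an authenticated request for a monitoring check."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            # A distinctive User-Agent makes monitoring traffic easy to
            # identify (and exclude) in server logs and analytics.
            "User-Agent": "uptime-monitor/1.0",
        },
    )
```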

OAuth 2.0 and token-based authentication

Many APIs use OAuth 2.0, where a client exchanges credentials for a time-limited access token. Your monitor must implement the token refresh flow: request a token, use it for checks, detect when it expires (either by tracking the expiry time or by handling 401 responses), and request a new one. A common pitfall is monitoring that detects a "failure" when the real problem is that the monitor's own token has expired and it failed to refresh.

For OAuth monitors, alert separately on authentication failures versus endpoint failures. "The API returned a 401" and "the API returned a 500" are different problems requiring different responses.
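The refresh logic reduces to "fetch when missing or near expiry". In this sketch the token fetcher is injected so the flow stays testable; in a real monitor it would POST a client-credentials grant to the provider's token endpoint:

```python
import time

class TokenManager:
    """Cache an OAuth access token and refresh it shortly before expiry."""

    def __init__(self, fetch_token, skew=30):
        self._fetch = fetch_token   # callable returning a token response dict
        self._skew = skew           # refresh this many seconds early
        self._token = None
        self._expires_at = 0.0

    def token(self):
        # Refreshing slightly early avoids racing the expiry and
        # misreporting an endpoint failure that is really a stale token.
        if self._token is None or time.time() >= self._expires_at - self._skew:
            response = self._fetch()
            self._token = response["access_token"]
            self._expires_at = time.time() + response["expires_in"]
        return self._token
```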

Mutual TLS (mTLS)

Some APIs require the client to present its own TLS certificate. Your monitoring infrastructure must support configuring client certificates per monitor. This is less common for public APIs but frequent for internal service-to-service communication and financial services APIs. Monitor the expiry of your client certificates with the same diligence you apply to server certificates.

Session-based authentication

Some APIs (particularly legacy systems) use session-based authentication where the monitor must first call a login endpoint, capture a session cookie, and include it in subsequent requests. This is the most fragile authentication pattern for monitoring because it introduces a multi-step dependency. If the login endpoint changes, breaks, or rate-limits your monitor, all downstream checks fail. Wherever possible, advocate for token-based authentication for your monitoring connections.

Credential management

Never hardcode API credentials in your monitoring configuration if it's stored in version control. Use environment variables, secrets managers, or your monitoring platform's built-in secrets storage. Rotate monitoring credentials on a regular schedule and audit who has access to them.
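The minimum viable version of this is an environment-variable lookup that fails loudly rather than running checks with a missing credential. The variable name here is illustrative:

```python
import os

def monitor_credentials():
    """Load monitoring credentials from the environment, never from
    config files committed to version control."""
    key = os.environ.get("MONITOR_API_KEY")
    if not key:
        raise RuntimeError("MONITOR_API_KEY is not set; refusing to start")
    return key
```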

Multi-Step API Transaction Monitoring

Many real-world API interactions are not single requests. They are sequences of dependent calls where each step uses data from the previous response. Monitoring individual endpoints in isolation misses failures that only manifest when the endpoints are used together in a workflow.

What are multi-step monitors?

A multi-step API monitor executes a sequence of HTTP requests where each request can use data extracted from previous responses. This mirrors how your actual consumers use your API: authenticate, fetch a list of resources, retrieve a specific resource, update it, verify the update was applied.

Common multi-step patterns

Here are the transaction patterns most worth monitoring:

  • CRUD lifecycle: Create a resource via POST, read it back via GET, update it via PUT/PATCH, delete it via DELETE, verify deletion. This catches issues where writes succeed but reads serve stale data, or where deletes silently fail.
  • Search and retrieve: Query a list endpoint with specific filters, extract an ID from the results, fetch the individual resource, verify the data matches. This catches indexing delays and inconsistencies between list and detail endpoints.
  • Authentication flow: Obtain an access token, make an authenticated request, verify the response, refresh the token, verify the refreshed token works. This catches subtle OAuth implementation bugs.
  • Pagination: Request the first page of results, follow the pagination cursor or link to the second page, verify continuity (no gaps, no duplicates). Pagination bugs are surprisingly common and rarely caught by single-endpoint monitors.
  • Webhook delivery: Trigger an action via the API, then check that the corresponding webhook was delivered to a test endpoint within an acceptable time window. This validates the asynchronous parts of your system.

Extracting and chaining data

The technical key to multi-step monitoring is response extraction. After each step, you extract specific values from the JSON response -- IDs, tokens, URLs, timestamps -- and inject them into subsequent requests. This is typically done using JSONPath expressions or simple key-path notation. For example, you might extract $.data.items[0].id from a list response and use it as a path parameter in the next request.
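A minimal extractor covering the dotted-path subset of that idea can be written in pure Python (real monitoring platforms typically support full JSONPath):

```python
def extract(payload, path):
    """Pull a value out of a decoded JSON response for use in the next
    step, e.g. extract(body, "data.items.0.id"). Numeric path segments
    index into lists; everything else is a dict key."""
    value = payload
    for part in path.split("."):
        if isinstance(value, list):
            value = value[int(part)]
        else:
            value = value[part]
    return value
```

The extracted value then becomes a path parameter, header, or body field in the following request of the sequence.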

Cleanup and idempotency

Multi-step monitors that create resources need a cleanup strategy. If the monitor creates a test record in your production database, it should delete it at the end of the sequence. If any step fails and cleanup doesn't run, you'll accumulate test data. Design your monitors with a finally/cleanup step that executes regardless of whether earlier steps passed or failed. Alternatively, use a dedicated test environment or tag test resources for periodic batch cleanup.
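The finally/cleanup pattern looks like this. `client` here is any object exposing create/read/delete, a hypothetical stand-in for your API SDK or a thin HTTP wrapper:

```python
def run_crud_monitor(client):
    """Create-read-delete probe with guaranteed cleanup."""
    resource_id = None
    try:
        resource_id = client.create({"name": "monitor-probe"})
        fetched = client.read(resource_id)
        assert fetched["name"] == "monitor-probe", "read-after-write mismatch"
    finally:
        # Cleanup runs whether or not the assertions above passed, so
        # probe records never accumulate in the production database.
        if resource_id is not None:
            client.delete(resource_id)
```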

Timing and dependencies

Monitor the end-to-end timing of multi-step transactions, not just individual step latencies. A sequence of five API calls that each take 400ms results in a 2-second user-perceived operation. If your users report slow workflows, your per-endpoint monitors (each showing normal latency) won't surface the problem. The combined response time across the full sequence is what matters.

Setting Response Time Thresholds

Response time thresholds determine when a "slow" API response triggers an alert. Setting them too tight generates constant noise; setting them too loose means you only find out about performance degradation when users complain. Getting the balance right requires understanding your baseline and your consumers' requirements.

Establish your baseline

Before setting thresholds, collect at least two weeks of performance data for each endpoint. Calculate the median (p50), 95th percentile (p95), and 99th percentile (p99) response times. These give you a realistic picture of normal performance. An endpoint with a p50 of 120ms and a p99 of 800ms behaves very differently from one with a p50 of 500ms and a p99 of 520ms, even though their averages might be similar.
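Computing those percentiles needs no dependencies. This sketch uses the nearest-rank method, which is adequate for baselining (`statistics.quantiles` or numpy offer interpolated variants); the ten sample latencies are invented to show how a long tail separates p50 from p99:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Illustrative samples; in practice feed in two weeks of measurements.
latencies_ms = [110, 115, 117, 118, 119, 120, 122, 125, 130, 640]
baseline = {p: percentile(latencies_ms, p) for p in (50, 95, 99)}
```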

Threshold recommendations

A practical starting point for REST API endpoints:

Endpoint type                      | Warning threshold | Critical threshold
-----------------------------------|-------------------|-------------------
Health check / ping                | 200ms             | 500ms
Simple read (GET single resource)  | 500ms             | 1,500ms
List / search (GET collection)     | 1,000ms           | 3,000ms
Write (POST / PUT / PATCH)         | 800ms             | 2,000ms
Complex query / report             | 2,000ms           | 5,000ms
File upload / processing           | 5,000ms           | 15,000ms

These are starting points. Adjust based on your baseline data, your SLAs, and the expectations of your API consumers. An internal microservice API might have tighter thresholds (everything under 100ms) while a public API consumed over the internet might tolerate more latency.

SLA-driven thresholds

If you have published SLAs or SLOs for your API, your monitoring thresholds should be set tighter than your SLA commitments. If your SLA promises 99.9% of requests will complete within 2 seconds, set your warning threshold at 1.5 seconds so you have time to investigate and remediate before you breach the SLA.

Percentile-based alerting

Rather than alerting on individual slow responses, consider alerting on percentile degradation. "The p95 response time for /api/orders has exceeded 1,200ms for the past 5 minutes" is a more actionable alert than "one request to /api/orders took 1,500ms." Individual slow responses can be caused by garbage collection pauses, cold caches, or temporary network blips. Sustained percentile degradation indicates a systemic problem.
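The "sustained for 5 minutes" condition can be expressed as a check over the most recent per-minute p95 values; one slow minute alone never fires:

```python
def sustained_breach(p95_by_minute, threshold_ms, window=5):
    """True only when p95 exceeded the threshold for `window`
    consecutive recent evaluations."""
    recent = p95_by_minute[-window:]
    return len(recent) == window and all(v > threshold_ms for v in recent)
```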

Endpoint-specific tuning

Avoid the temptation to apply a single threshold across all endpoints. A complex search endpoint that queries multiple databases and aggregates results will naturally be slower than a simple lookup by ID. Each endpoint should have thresholds that reflect its specific performance characteristics and the expectations of its consumers.

Integrating API Monitoring with CI/CD

API monitoring should not exist in isolation from your development workflow. By integrating monitoring checks into your CI/CD pipeline, you catch regressions before they reach production and verify deployments automatically after they go live.

Pre-deployment smoke tests

After deploying to a staging environment, run your API monitoring checks against the staging endpoints before promoting to production. This catches breaking changes to response schemas, new authentication requirements, and performance regressions introduced in the release. If your staging environment is representative of production (same data volume, same infrastructure), these pre-deployment checks provide high confidence.

Post-deployment verification

After a production deployment, automatically trigger a full run of your API monitors. Don't wait for the next scheduled check cycle -- run them immediately. If any checks fail, your deployment pipeline can automatically roll back or halt the release before the issue affects a significant number of users. This is the fastest possible feedback loop for deployment-related API regressions.
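A post-deployment verification step reduces to a runner that executes every check and reports failures for the pipeline to act on. Check names and the exception-based check interface are illustrative:

```python
def post_deploy_verify(checks):
    """Run all monitor checks immediately after a deploy.

    checks: mapping of check name -> zero-argument callable that raises
    on failure. Returns the list of failures; an empty list means the
    release can proceed, anything else should halt or roll back.
    """
    failures = []
    for name, check in checks.items():
        try:
            check()
        except Exception as exc:
            failures.append(f"{name}: {exc}")
    return failures
```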

Contract testing integration

If you use consumer-driven contract testing (tools like Pact), your API monitors serve as a production complement to your contract tests. Contract tests verify that the API conforms to its specification in a test environment. API monitors verify that it continues to conform in production, where real data, real traffic, and real infrastructure introduce variables that test environments cannot replicate.

Monitor-as-code

Define your API monitors in code alongside your API source code. When a developer adds a new endpoint, they should also add the corresponding monitor definition. When they change an endpoint's response schema, they update the monitor's validation rules in the same pull request. This keeps monitoring in sync with the API itself and makes monitoring configuration subject to the same code review process as the API code.

Alerting during deployments

Suppress or annotate alerts during planned deployments to avoid alerting on expected brief disruptions. Most monitoring platforms support maintenance windows that pause alerting for a defined period. Integrate these maintenance windows with your deployment pipeline so they activate automatically when a deployment starts and deactivate when it completes. If a deployment fails and maintenance mode was enabled, ensure it gets properly cleared so monitoring resumes.

Monitoring deployment frequency correlation

Over time, correlate your API reliability metrics with your deployment frequency. If reliability drops after deploying twice daily but is stable with weekly deployments, that's a signal that your testing pipeline needs strengthening, not that you should deploy less often. Use the data from your monitoring platform to make informed decisions about your release process.

Common API Monitoring Patterns and Anti-Patterns

After helping teams set up API monitoring across hundreds of services, we've seen the same patterns emerge repeatedly. Here are the approaches that work well and the mistakes that waste effort.

Patterns that work

The canary endpoint. Create a dedicated health check endpoint (typically /health or /api/health) that exercises your critical dependencies: database connectivity, cache availability, external service reachability. This single endpoint serves as a fast, cheap indicator of overall system health and can be checked at high frequency (every 30 seconds) without placing meaningful load on your system.
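The aggregation logic behind such an endpoint is simple: run one probe per dependency and degrade the status if any fails. The probe names below are illustrative:

```python
def health(probes):
    """Aggregate dependency probes into a single canary result.

    probes: mapping of dependency name -> zero-argument callable
    returning True on success. Returns (http_status, per-probe results)
    so the monitor can see *which* dependency failed, not just that one did.
    """
    results = {name: probe() for name, probe in probes.items()}
    status = 200 if all(results.values()) else 503
    return status, results
```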

The golden path monitor. Identify the most common user workflow through your API and monitor it as a multi-step transaction. For an e-commerce API, this might be: search for products, view product details, add to cart, begin checkout. If the golden path works, most of your API surface area is likely healthy.

Layered check frequency. Run your health check endpoint every 30 seconds. Run full endpoint checks every 5 minutes. Run multi-step transaction monitors every 15 minutes. Run comprehensive API surface coverage checks hourly. This layered approach provides fast detection of catastrophic failures while thoroughly validating API behaviour at a manageable check volume.

Response body assertions. Always validate something in the response body beyond the status code. Even a simple assertion -- "the response contains a 'data' key" or "the array has at least one element" -- catches entire categories of bugs that status code monitoring alone would miss.

Anti-patterns to avoid

Monitoring only the health check. A health check endpoint that returns 200 while your actual business endpoints are failing is worse than no health check at all. It provides false confidence. Always monitor real endpoints alongside synthetic health checks.

Identical thresholds everywhere. Applying the same 1-second timeout and identical alerting rules to every endpoint regardless of its nature and importance leads to either constant noise from slow-but-normal endpoints or missed alerts on fast-but-critical ones.

Ignoring error responses. Monitoring that only checks for 200 responses misses valuable information in error responses. A well-designed monitor should capture and log the response body for non-200 responses, as this information is essential for diagnosing the failure.

Testing from a single location. If your monitoring runs from the same datacentre as your API servers, you'll never detect DNS issues, CDN misconfigurations, or network routing problems. Always monitor from at least two geographically distinct locations, ideally from regions where your actual users are based.

No baseline, no context. Alerting on absolute thresholds without understanding your baseline means you're guessing. Spend time establishing what "normal" looks like for each endpoint before setting alert thresholds. A scheduled monitoring approach ensures your checks run consistently and build meaningful historical baselines over time.

API monitoring done well is one of the highest-value investments in reliability engineering. It sits at the boundary between your system and its consumers, catching problems at the exact point where they become user-facing. Start with the basics -- availability and status codes -- and incrementally add response validation, multi-step transactions, and CI/CD integration as your monitoring maturity grows.

Start monitoring your infrastructure today

50 free monitors, no credit card needed. Set up in under 30 seconds.

Get started free