Designing Robust Error Response Contracts

Most API contracts specify success bodies in painstaking detail and leave failures to chance, so every service invents its own error shape and clients accrete brittle parsing for each one. A robust error contract treats failures as a first-class, versioned schema: one envelope, machine-readable codes, field-level detail, and an explicit HTTP status taxonomy. This guide extends the broader Schema Design & Validation Patterns and shows how to design that envelope around RFC 7807 problem+json, document it in OpenAPI, and gate it in CI so no endpoint ships an ad-hoc error shape.

The payoff is concrete: a single client error handler that works against every endpoint, support tooling that can correlate a traceId to logs, and SDK generators that emit typed error models instead of any. We standardize on RFC 7807 problem details as the wire format and layer in extension members for codes and validation errors.

When to Use This Approach

Adopt a formal error response contract when any of the following hold:

  • You operate more than one service and clients (web, mobile, partner) must parse errors uniformly across all of them.
  • Frontend teams are writing per-endpoint error handling because no two endpoints fail the same way.
  • You expose a public or partner API where error stability is part of your backward-compatibility promise.
  • You generate client SDKs and want typed error models instead of untyped catch blocks.
  • Support and on-call need to correlate a user-visible error to server logs via a stable identifier.
  • You are introducing field-level validation and need a predictable place to surface per-field messages.

If you have a single internal service with one consumer you control, a lightweight { "code", "message" } shape may be enough — but standardizing early costs little and removes a class of future migrations.

Prerequisites

This guide uses the following tool versions. Pin them in CI to keep examples reproducible.

# Schema validation (JSON Schema draft 2020-12)
npm install -D ajv@8.17 ajv-cli@5.0 ajv-formats@3.0

# OpenAPI linting and validation
npm install -D @stoplight/spectral-cli@6.11 @apidevtools/swagger-cli@4.0

# Runtime validation (server-side error generation)
npm install zod@3.23

# Mock server for contract verification
docker pull stoplight/prism:5

You should already have an OpenAPI 3.1 document (or be ready to start one) and a CI runner such as GitHub Actions. Familiarity with runtime validation using Zod helps, since the server generates errors from validation failures.

The Error Envelope at a Glance

Before the steps, here is the structure we are building. The base RFC 7807 members carry the human- and machine-oriented summary; extension members carry the stable code and the field-level breakdown.

[Figure: RFC 7807 problem+json error envelope structure. The outer envelope (application/problem+json, HTTP 422) holds the RFC 7807 base members type (URI to docs), title (short summary), status (HTTP code), detail (human message), and instance (this occurrence), plus the extension members code (e.g. VALIDATION_FAILED), traceId (log correlation), and errors[], a list of per-field objects with field, code, and message. Clients ignore unknown members.]

Step 1: Define the Canonical problem+json Schema

Start with a strict, versioned JSON Schema that every failure payload must satisfy. Base it on RFC 7807 (Problem Details for HTTP APIs), whose successor RFC 9457 keeps the identical wire format. The five base members — type, title, status, detail, instance — give you a self-describing error without inventing structure. Treat the schema as a real artifact: version it (error-response-v1.schema.json), check it into the repo, and reference it everywhere.

The decision that matters most here is what is required. Make type, title, and status mandatory; keep detail and instance optional because not every error has a meaningful per-occurrence message. Use additionalProperties: false on the field objects but allow extension members at the envelope top level (RFC 7807 requires that extensions be permitted). We model that by listing the extensions explicitly rather than slamming the whole document shut.

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/schemas/error-response-v1.schema.json",
  "title": "Problem Details Error Response",
  "type": "object",
  "required": ["type", "title", "status"],
  "properties": {
    "type":     { "type": "string", "format": "uri", "description": "Doc URI for this problem class" },
    "title":    { "type": "string", "description": "Stable, human-readable summary" },
    "status":   { "type": "integer", "minimum": 400, "maximum": 599 },
    "detail":   { "type": "string", "description": "Explanation of this occurrence" },
    "instance": { "type": "string", "format": "uri-reference", "description": "Which request or resource failed" },
    "code":     { "type": "string", "pattern": "^[A-Z][A-Z0-9_]+$", "description": "Machine-readable, stable" },
    "traceId":  { "type": "string", "description": "Correlates to server logs" },
    "errors": {
      "description": "Field-level breakdown (Step 2)",
      "type": "array",
      "items": { "$ref": "#/$defs/fieldError" }
    }
  },
  "$defs": {
    "fieldError": {
      "type": "object",
      "required": ["field", "code"],
      "additionalProperties": false,
      "properties": {
        "field":   { "type": "string", "description": "Dot/bracket path: address.zip, items[0].sku" },
        "code":    { "type": "string", "pattern": "^[A-Z][A-Z0-9_]+$" },
        "message": { "type": "string", "description": "Optional human hint; clients prefer code" }
      }
    }
  }
}

The annotations live in description keywords rather than // comments because JSON has no comment syntax, and a strict JSON parser would reject the file otherwise.

Two rules to enforce from day one. First, never auto-coerce status: the string "404" and the integer 404 are different contracts, and loose parsers hide drift. Second, keep title stable per problem class — it is effectively documentation, while code is the identifier clients branch on.
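The no-coercion rule can also be checked on the consuming side. The following is a minimal sketch, assuming a hand-written type guard rather than a schema library; the names ProblemBase and isProblem are illustrative and not part of any package.

```typescript
// Hypothetical client-side guard; names are illustrative, not a library API.
interface ProblemBase {
  type: string;
  title: string;
  status: number;
  [extension: string]: unknown; // extension members are tolerated, not validated
}

function isProblem(value: unknown): value is ProblemBase {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const v = value as Record<string, unknown>;
  const status = v["status"];
  return (
    typeof v["type"] === "string" &&
    typeof v["title"] === "string" &&
    typeof status === "number" && // no coercion: the string "404" is rejected
    Number.isInteger(status) &&
    status >= 400 &&
    status <= 599
  );
}
```

A guard like this accepts unknown extension members but refuses a string status, so drift in a producer surfaces as a parse failure instead of a silent mismatch.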

Step 2: Add Machine-Readable Codes and Field-Level Errors

A human-readable detail is for logs and toasts; clients need a value they can switch on without string matching. That is the code extension member: a stable, uppercase identifier like USER_EMAIL_TAKEN or VALIDATION_FAILED. The cardinal rule is that a code’s meaning never changes once shipped — renaming RATE_LIMITED to TOO_MANY_REQUESTS is a breaking change even though the status code is unchanged.
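One way to keep codes stable is to hold them in a single registry that the compiler can check. This is a sketch under assumptions: the three codes are the ones named above, while ERROR_CODES, userMessageFor, and isKnownCode are hypothetical names, and the user-facing strings are placeholders.

```typescript
// Hypothetical registry of shipped codes; add to it, never rename entries.
const ERROR_CODES = ["VALIDATION_FAILED", "USER_EMAIL_TAKEN", "RATE_LIMITED"] as const;
type ErrorCode = (typeof ERROR_CODES)[number];

// Same pattern the schema in Step 1 applies to the code member.
const CODE_PATTERN = /^[A-Z][A-Z0-9_]+$/;

function userMessageFor(code: ErrorCode): string {
  switch (code) {
    case "VALIDATION_FAILED":
      return "Please fix the highlighted fields.";
    case "USER_EMAIL_TAKEN":
      return "That email is already registered.";
    case "RATE_LIMITED":
      return "Too many requests. Try again shortly.";
    default: {
      // Compile-time exhaustiveness check: adding a code without a case fails here.
      const unreachable: never = code;
      return unreachable;
    }
  }
}

// Narrow an arbitrary string from the wire to a known code.
function isKnownCode(value: string): value is ErrorCode {
  return (ERROR_CODES as readonly string[]).includes(value);
}
```

Because clients switch on the literal value, a renamed code simply stops matching: isKnownCode("TOO_MANY_REQUESTS") is false here, which is the breaking change the rule warns about.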

For validation failures, a single top-level message is not enough; the client needs to know which field failed and why so it can highlight the right input. Carry that in the errors array. Generating this array from a validator keeps it honest — here we map a Zod failure into the envelope. This mirrors the patterns in runtime validation with Zod, reusing the validation you already run.

// errorEnvelope.ts — build a problem+json body from a ZodError (Zod 3.23)
import { ZodError } from "zod";

interface FieldError { field: string; code: string; message?: string; }
interface ProblemDetails {
  type: string; title: string; status: number;
  detail?: string; instance?: string; code?: string;
  traceId?: string; errors?: FieldError[];
}

const ZOD_TO_CODE: Record<string, string> = {
  invalid_type: "TYPE",        // wrong JSON type
  too_small:    "MIN",         // below min length/value
  too_big:      "MAX",         // above max length/value
  invalid_string: "FORMAT",    // email/uuid/regex mismatch
};

// Render a Zod issue path in the contract's dot/bracket form: "address.zip", "items[0].sku".
function toFieldPath(path: (string | number)[]): string {
  return path.reduce<string>(
    (acc, seg) => (typeof seg === "number" ? `${acc}[${seg}]` : acc ? `${acc}.${seg}` : seg),
    "",
  );
}

export function problemFromZod(err: ZodError, instance: string, traceId: string): ProblemDetails {
  return {
    type: "https://example.com/problems/validation-failed",
    title: "Request validation failed",
    status: 422,                                  // see Step 3 for 400 vs 422
    detail: "One or more fields are invalid.",
    instance,                                     // e.g. "/users" or the request id URI
    code: "VALIDATION_FAILED",                    // stable, what clients branch on
    traceId,                                      // map to logs, do NOT leak internals
    errors: err.issues.map((i) => ({
      field: toFieldPath(i.path),                 // matches the schema's documented path format
      code: ZOD_TO_CODE[i.code] ?? "INVALID",     // per-field machine code
      message: i.message,                         // optional human hint
    })),
  };
}

The resulting payload is self-explanatory and stable:

{
  "type": "https://example.com/problems/validation-failed",
  "title": "Request validation failed",
  "status": 422,
  "detail": "One or more fields are invalid.",
  "instance": "/users",
  "code": "VALIDATION_FAILED",
  "traceId": "01H2XK9P3Q",
  "errors": [
    { "field": "email", "code": "FORMAT", "message": "Must be a valid email" },
    { "field": "age",   "code": "MIN",    "message": "Must be at least 18" }
  ]
}
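On the consuming side, a single handler can turn any such payload into a UI decision. The sketch below is illustrative only: UiAction and toUiAction are hypothetical names, and the choice to show message with code as a fallback is an assumption about the client, not something the contract mandates.

```typescript
interface FieldError { field: string; code: string; message?: string }
interface Problem {
  type: string; title: string; status: number;
  detail?: string; code?: string; errors?: FieldError[];
}

// Hypothetical UI decision type for this sketch.
type UiAction =
  | { kind: "field-errors"; byField: Record<string, string> }
  | { kind: "banner"; text: string };

function toUiAction(problem: Problem): UiAction {
  // Branch on the stable code, never on title or detail text.
  if (problem.code === "VALIDATION_FAILED" && problem.errors && problem.errors.length > 0) {
    const byField: Record<string, string> = {};
    for (const e of problem.errors) {
      // Show the optional human hint when present, otherwise the per-field code.
      byField[e.field] = e.message ?? e.code;
    }
    return { kind: "field-errors", byField };
  }
  // Anything else becomes a generic banner built from the base members.
  return { kind: "banner", text: problem.detail ?? problem.title };
}
```

Fed the payload above, this returns a field-errors action keyed by email and age; a problem without errors falls through to the banner.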

Step 3: Define the HTTP Status Taxonomy

Codes describe what failed; the HTTP status describes how the client should react. Picking statuses ad hoc is the most common source of inconsistent error contracts, so write the taxonomy down and apply it everywhere. The boundary questions that recur are 400 vs 422, 401 vs 403, and 404 vs 409. The deeper rationale for each status belongs with standardizing HTTP error codes in OpenAPI definitions; the working rules:

  • 400 Bad Request — the body is malformed or unparseable (broken JSON, wrong content type). The request cannot even be understood.
  • 422 Unprocessable Content — the body parses fine but fails schema or business validation. This is where errors[] belongs.
  • 401 Unauthorized — no valid credentials. The client should authenticate.
  • 403 Forbidden — valid credentials, but not allowed. Authenticating again will not help.
  • 404 Not Found — the target resource does not exist (and, often, you do not want to reveal that it does).
  • 409 Conflict — the request conflicts with current state (duplicate key, version mismatch, already-processed).
  • 429 Too Many Requests — rate limited; pair with a Retry-After header.
  • 500 / 503 — server fault or temporary unavailability. Never put validation detail here.

Pick one convention for validation (we use 422) and never mix it with 400 across services. The decision flow:

[Figure: HTTP error status decision flow. Checks run in order: body parses? (no → 400), authenticated? (no → 401), authorized? (no → 403), resource exists? (no → 404), validation passes? (no → 422); all yes → 2xx success. 409 Conflict applies when state collides on write.]
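The same flow can be written as one function so that every service applies the checks in the same order. This is a sketch, not a prescribed implementation: RequestFacts and pickStatus are hypothetical names, the conflict check is placed last on the assumption that state collisions are detected at write time, and 200 stands in for whichever 2xx the operation returns.

```typescript
// Hypothetical summary of what the server has learned about a request.
interface RequestFacts {
  bodyParses: boolean;
  authenticated: boolean;
  authorized: boolean;
  resourceExists: boolean;
  validationPasses: boolean;
  conflictsWithState: boolean;
}

function pickStatus(f: RequestFacts): number {
  if (!f.bodyParses) return 400;        // malformed or unparseable body
  if (!f.authenticated) return 401;     // no valid credentials
  if (!f.authorized) return 403;        // valid credentials, not allowed
  if (!f.resourceExists) return 404;    // target does not exist
  if (!f.validationPasses) return 422;  // parses, but fails validation
  if (f.conflictsWithState) return 409; // assumption: detected last, on write
  return 200;                           // placeholder for the operation's 2xx
}
```

Earlier checks win: an unauthenticated request with an invalid body yields 401, not 422, so validation detail is never shown to a caller who has not authenticated.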

Step 4: Document Errors in OpenAPI

With the envelope and taxonomy fixed, encode them once in your OpenAPI document and reference them everywhere. Define the Problem schema and a set of reusable responses under components, then attach those responses to operations by $ref. This eliminates copy-paste, makes generated SDKs emit typed error models, and lets a linter enforce coverage. This is the foundation that standardizing HTTP error codes in OpenAPI definitions builds on.

# openapi.yaml (OpenAPI 3.1)
paths:
  /users:
    post:
      operationId: createUser
      responses:
        '201': { description: Created }
        '422': { $ref: '#/components/responses/ValidationFailed' }
        '409': { $ref: '#/components/responses/Conflict' }
        '500': { $ref: '#/components/responses/ServerError' }
components:
  responses:
    ValidationFailed:
      description: Request body failed validation
      content:
        application/problem+json:          # signals a problem document, not a success body
          schema: { $ref: '#/components/schemas/Problem' }
          example:
            type: https://example.com/problems/validation-failed
            title: Request validation failed
            status: 422
            code: VALIDATION_FAILED
            errors:
              - { field: email, code: FORMAT }
    Conflict:
      description: Resource conflicts with current state
      content:
        application/problem+json:
          schema: { $ref: '#/components/schemas/Problem' }
    ServerError:
      description: Unexpected server error
      content:
        application/problem+json:
          schema: { $ref: '#/components/schemas/Problem' }
  schemas:
    Problem:
      type: object
      required: [type, title, status]      # mirrors the JSON Schema in Step 1
      properties:
        type:    { type: string, format: uri }
        title:   { type: string }
        status:  { type: integer, minimum: 400, maximum: 599 }
        detail:  { type: string }
        instance: { type: string, format: uri-reference }
        code:    { type: string, pattern: '^[A-Z][A-Z0-9_]+$' }
        traceId: { type: string }
        errors:
          type: array
          items: { $ref: '#/components/schemas/FieldError' }
    FieldError:
      type: object
      required: [field, code]
      additionalProperties: false
      properties:
        field:   { type: string }
        code:    { type: string, pattern: '^[A-Z][A-Z0-9_]+$' }
        message: { type: string }

A custom Spectral rule turns “every operation documents its errors” into an enforceable invariant rather than a code-review hope:

# .spectral.yaml: require at least one documented 4xx response per operation
rules:
  operation-has-4xx-response:
    description: Every operation must document at least one 4xx error.
    given: $.paths[*][get,post,put,patch,delete].responses
    then:
      function: schema
      functionOptions:
        schema:
          type: object
          # patternProperties plus minProperties would accept an operation that only
          # documents 200. "Not every key is a non-4xx" forces at least one 4xx key.
          not:
            propertyNames:
              not:
                pattern: "^4[0-9]{2}$"
    severity: error

Step 5: Gate the Contract in CI

A contract that is not enforced decays. Validate two things in CI: that your example/fixture payloads satisfy the JSON Schema, and that the OpenAPI document lints clean against the error rules. Fail the build on any violation so non-compliant error shapes never merge.

# .github/workflows/error-contract-gate.yml
name: Error Contract Gate
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - name: Validate error fixtures against schema
        run: |
          npx ajv validate \
            -s schemas/error-response-v1.schema.json \
            -d 'tests/fixtures/errors/*.json' \
            -c ajv-formats --spec=draft2020 --strict=true
      - name: Lint OpenAPI error coverage
        run: npx spectral lint openapi.yaml --fail-severity=error
      - name: Validate spec structure
        run: npx swagger-cli validate openapi.yaml

Spec/Schema Reference

The error envelope fields, their types, and the contract each one carries:

| Field | Type | Required | Default | Effect |
| --- | --- | --- | --- | --- |
| type | string (URI) | yes | none (RFC 7807 treats an absent type as about:blank) | Identifies the problem class; dereferences to human docs. Stable but may change as docs move. |
| title | string | yes | none | Short, human-readable summary of the problem class. Should not vary per occurrence. |
| status | integer (400–599) | yes | none | HTTP status code, duplicated in the body for clients that lose the response status line. |
| detail | string | no | omitted | Human-readable explanation specific to this occurrence. Safe for display; no internals. |
| instance | string (URI reference) | no | omitted | Identifies the specific request or resource that failed. |
| code | string, ^[A-Z][A-Z0-9_]+$ | no | omitted | Stable machine-readable identifier clients branch on. Meaning must never change. |
| traceId | string | no | omitted | Correlation id mapping the error to server logs for support and on-call. |
| errors | array of FieldError | no | omitted | Per-field validation failures; present on 422 validation responses. |
| errors[].field | string | yes (in item) | none | Dot/bracket path to the offending input (address.zip, items[0].sku). |
| errors[].code | string, ^[A-Z][A-Z0-9_]+$ | yes (in item) | none | Per-field machine code (FORMAT, MIN, REQUIRED). |
| errors[].message | string | no | omitted | Optional human hint; clients should prefer the code. |

Verification

Confirm the contract end-to-end. First, the schema gate over fixtures should report clean:

$ npx ajv validate -s schemas/error-response-v1.schema.json \
    -d 'tests/fixtures/errors/*.json' -c ajv-formats --spec=draft2020
tests/fixtures/errors/validation-422.json valid
tests/fixtures/errors/conflict-409.json valid
tests/fixtures/errors/server-500.json valid

Second, run a contract-aware mock from the spec and verify a forced error matches the envelope. Prism serves schema-compliant problem+json:

$ docker run --rm -p 4010:4010 -v "$PWD/openapi.yaml:/api.yaml" \
    stoplight/prism:5 mock -h 0.0.0.0 /api.yaml
$ curl -s -H 'Prefer: code=422' http://localhost:4010/users -d '{}'
{ "type": "...", "title": "Request validation failed", "status": 422,
  "code": "VALIDATION_FAILED", "errors": [ { "field": "email", "code": "FORMAT" } ] }

A green CI run plus a mock that returns the exact envelope you documented means the contract is real, not aspirational.

Troubleshooting

additional properties not allowed on fixtures. AJV reports an extra member wherever the schema is closed. With the Step 1 schema that is an entry in errors[] (fieldError sets additionalProperties: false), or the envelope itself if you have chosen to close it; the usual culprit is a legacy name such as error_msg or trace_id. Root cause: the schema and the actual payload have drifted. Fix: add the field to the relevant properties block (envelope extensions are allowed) or migrate the payload to detail/traceId. Do not blanket-add additionalProperties: true, which defeats the gate.
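Migrating the payload can be a small, mechanical mapping at the producer boundary. The sketch below assumes the legacy shape uses exactly the error_msg and trace_id names mentioned above; LegacyError and migrateLegacy are hypothetical names, and existing canonical members are deliberately left untouched.

```typescript
// Hypothetical legacy shape: only the two names mentioned in the text are handled.
interface LegacyError {
  error_msg?: string;
  trace_id?: string;
  [member: string]: unknown;
}

function migrateLegacy(body: LegacyError): Record<string, unknown> {
  // Drop the legacy names and carry everything else over unchanged.
  const { error_msg, trace_id, ...rest } = body;
  const out: Record<string, unknown> = { ...rest };
  // Fill the canonical member only when the payload does not already set it.
  if (error_msg !== undefined && out["detail"] === undefined) out["detail"] = error_msg;
  if (trace_id !== undefined && out["traceId"] === undefined) out["traceId"] = trace_id;
  return out;
}
```

Running fixtures through a mapping like this before validation lets the schema stay strict while older producers are updated.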

status validates as a string. A fixture has "status": "422" and a permissive parser let it through, but the canonical schema requires an integer. Root cause: type coercion somewhere in the producer or in the validator. Fix: emit status as a number at the source, and make sure the gate does not enable Ajv's coerceTypes option (it is off by default; --strict=true governs schema strictness, not data coercion). String statuses break clients that compare numerically.

Generated SDK types the error as any or object. The generator could not resolve the error schema. Root cause: the response uses an inline schema or application/json instead of $ref to #/components/schemas/Problem with application/problem+json. Fix: route every error through the reusable components/responses entries shown in Step 4 and re-generate.

Mock server returns 200 instead of the error. Prism prefers success examples unless steered. Root cause: no error scenario was selected. Fix: send Prefer: code=422 (Prism 5) or post an invalid body so validation triggers the documented error path.

Spectral passes but an endpoint still ships no 4xx. The operation uses an HTTP method not covered by the rule’s given JSONPath. Root cause: the path expression omits a verb (e.g. head, options). Fix: broaden the given to include every method you expose, then re-lint.

Frequently Asked Questions

What is the difference between RFC 7807 and RFC 9457?

RFC 9457 (2023) obsoletes RFC 7807 but keeps the same wire format: type, title, status, detail, instance, plus extension members. Existing problem+json payloads remain valid; 9457 mainly adds a registry of common problem types and guidance on representing multiple problems and on type URIs that cannot be dereferenced. Citing either RFC is fine, but new specs should cite 9457.

Should the machine-readable error code live in type or in a separate field?

Use a dedicated extension member such as code or error_code for the stable identifier clients branch on, and keep type as a documentation URI. type values can change as docs move; a code like USER_EMAIL_TAKEN is a contract clients depend on and should never change meaning.

Which HTTP status code should I use for validation failures?

Use 422 Unprocessable Content when the request is syntactically valid JSON but fails business or schema validation, and 400 Bad Request when the body is malformed or unparseable. Some APIs return 400 for both cases; either convention works, but pick one and apply it consistently across every endpoint.

Can I add custom fields to a problem+json response?

Yes. RFC 7807/9457 explicitly allow extension members at the top level, such as errors, code, traceId, or balance. Clients must ignore members they do not recognize, so additive changes are non-breaking as long as you never repurpose an existing field name.

Should error responses use application/json or application/problem+json?

Use application/problem+json for the Content-Type so clients and gateways can distinguish errors from success bodies and apply problem-specific parsing. The body shape is identical; only the media type signals it is a problem document.
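A client can use the media type as the switch between success parsing and problem parsing. This is a minimal sketch with a hypothetical helper name, isProblemJson; it assumes the header value may carry parameters such as charset and may be missing entirely.

```typescript
// Hypothetical helper: decide whether a response body is a problem document.
function isProblemJson(contentType: string | null | undefined): boolean {
  if (!contentType) return false;
  // Ignore parameters (e.g. "; charset=utf-8") and compare case-insensitively.
  const mediaType = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  return mediaType === "application/problem+json";
}
```

For example, "application/problem+json; charset=utf-8" is treated as a problem document, while plain "application/json" is not.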

How do I avoid leaking internal details in error messages?

Keep detail human-readable but free of stack traces, SQL, and internal hostnames. Put diagnostic identifiers in a traceId or instance field that maps to server logs, so support can correlate without exposing internals to the caller.
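In practice this usually means one boundary function that logs the real error and returns a fixed public body. The following is a sketch under assumptions: toPublicProblem, the INTERNAL_ERROR code, the problem type URI, and the injected log callback are all hypothetical and stand in for whatever your service uses.

```typescript
interface PublicProblem {
  type: string; title: string; status: number;
  detail: string; code: string; traceId: string;
}

// Hypothetical boundary function: the raw error goes to the log, never to the caller.
function toPublicProblem(
  err: unknown,
  traceId: string,
  log: (entry: { traceId: string; error: unknown }) => void,
): PublicProblem {
  log({ traceId, error: err }); // full diagnostics stay server-side
  return {
    type: "https://example.com/problems/internal-error", // illustrative URI
    title: "Internal server error",
    status: 500,
    detail: "An unexpected error occurred. Quote the traceId when contacting support.",
    code: "INTERNAL_ERROR", // illustrative code
    traceId,
  };
}
```

Nothing from err is copied into the response, so a message containing SQL or a hostname cannot reach the client; the traceId is the only link between the two.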