Cursor vs Offset Pagination Schema Design

Q: Why does offset pagination skip or duplicate rows?

LIMIT/OFFSET resolves against the live table at query time. A row inserted before the second page shifts every subsequent row down by one position, so the page boundary moves and a row is either repeated or skipped. Cursor pagination anchors to a row value instead of a position, so concurrent inserts do not disturb page boundaries.

Q: How do I avoid deep-scan performance cliffs with offset?

OFFSET n forces the database to scan and discard n rows before returning results. Set a hard maximum on the offset value in the OpenAPI schema and Zod guard (for example, reject offset > 10000 with a 400), and migrate any endpoint with deep scrolling to cursor pagination.

Q: Do I need a total count in the pagination envelope?

No. total requires a full COUNT(*) query and is meaningless for cursor-paginated mutable sets. Return has_more as the authoritative signal of more data and expose total only as an optional, opt-in field that callers must explicitly request.

Symptom: CI pipelines fail with ValidationError: strict mode: additional properties not allowed ("next_cursor") or Missing required property: "offset" during OpenAPI or Pact contract execution. Frontend builds break with TS2339: Property 'offset' does not exist on type 'CursorPaginationMeta'. Automated snapshot diffs in staging reveal inconsistent meta.pagination payloads across service boundaries, triggering false-positive regression alerts. This guide is part of the Pagination and Filtering Schema Patterns cluster, which covers the full contract from query parameters to response envelopes and runtime guards.

Root Cause: Offset Instability Under Writes

The underlying problem is almost always one of two things: a monolithic pagination envelope that was migrated from offset to cursor without splitting the schema, or an offset strategy applied to a mutable dataset.

Monolithic envelope. A single PaginationMeta object with all fields optional — next_cursor, prev_cursor, offset, limit, total — compiles cleanly but fails strict-mode validators and breaks generated TypeScript clients. Validators flag cursor payloads because offset and limit are missing; they flag offset payloads because next_cursor is an unexpected additional property. The fix is a oneOf discriminated union that makes the two contracts structurally separate.

Offset instability. LIMIT/OFFSET resolves against the live table at the moment the query runs. When a row is inserted before the second page is fetched, every row below the insertion point shifts down by one position. The page boundary moves with it: the row at the old boundary is served twice (once on page 1, once at the top of page 2) and the row that was at the start of the original page 2 is skipped entirely. This is not a bug in any library — it is a structural property of addressing rows by position rather than by value.

Choosing the Right Strategy

Dimension	Offset (`LIMIT`/`OFFSET`)	Cursor (keyset)
Addressing model	row position	row value
Stability under writes	skips or duplicates rows	stable page boundaries
Performance at depth	degrades linearly (O(n) scan)	constant-time per page
Arbitrary page jumps	yes (`offset=200&limit=20`)	no — forward/back only
Total page count	natural to provide	expensive / usually omitted
Random access	yes	no
Recommended for	small, stable, admin datasets	large, mutable feeds, sync loops

Decision rule. If the collection can grow without bound or rows are inserted and deleted while users browse it, use cursor pagination. The correctness argument is architectural, not a preference: offset over a mutable dataset is definitionally unsafe. If users need “jump to page 7” semantics and the dataset is small and rarely changes, offset is acceptable — but cap offset in the schema so deep scans are rejected before they hit the database.

Step 1 — Define a Discriminated OpenAPI Schema

Replace any flat PaginationMeta object with a oneOf union that gives each strategy its own fully constrained schema. The type string field acts as the discriminator so validators, code generators, and runtime parsers know which branch applies without guessing.

# openapi.yaml — OpenAPI 3.1
components:
  schemas:
    PaginationMeta:
      description: >
        Pagination metadata. The 'type' field selects the branch.
        Cursor and offset fields are mutually exclusive.
      discriminator:
        propertyName: type
        mapping:
          cursor: '#/components/schemas/CursorPaginationMeta'
          offset: '#/components/schemas/OffsetPaginationMeta'
      oneOf:
        - $ref: '#/components/schemas/CursorPaginationMeta'
        - $ref: '#/components/schemas/OffsetPaginationMeta'

    CursorPaginationMeta:
      type: object
      required: [type, has_more]
      additionalProperties: false          # strict — no unknown fields
      properties:
        type:
          type: string
          enum: [cursor]
        has_more:
          type: boolean
          description: Authoritative signal — always present, never absent.
        next_cursor:
          type: string
          nullable: true
          pattern: '^[A-Za-z0-9_-]+$'      # base64url, no padding
          description: >
            Opaque token encoding the last row's ordering key.
            null when has_more is false. Clients must treat this as opaque.
        prev_cursor:
          type: string
          nullable: true
          pattern: '^[A-Za-z0-9_-]+$'

    OffsetPaginationMeta:
      type: object
      required: [type, offset, limit, total]
      additionalProperties: false
      properties:
        type:
          type: string
          enum: [offset]
        offset:
          type: integer
          minimum: 0
          maximum: 10000                   # cap: reject deep scans at the schema level
          description: Row skip count. Reject values above 10000 with 400.
        limit:
          type: integer
          minimum: 1
          maximum: 100
        total:
          type: integer
          minimum: 0
          description: >
            Total row count. Requires a full COUNT(*). Only appropriate for
            small, stable datasets. Omit for large or mutable collections.

additionalProperties: false on each branch is doing the real work. A validator seeing a cursor payload with has_more and next_cursor will pass CursorPaginationMeta cleanly and reject OffsetPaginationMeta (missing offset, limit, total). The discriminator makes that resolution unambiguous, which is why generated clients no longer produce TS2339.

Step 2 — Design the Response Envelope

Never return a bare top-level array. Wrap results in a typed envelope so pagination metadata has a home that does not require a breaking change when you add fields later.

# openapi.yaml — response schema
components:
  schemas:
    ArticleListResponse:
      type: object
      required: [data, meta]
      additionalProperties: false
      properties:
        data:
          type: array
          items:
            $ref: '#/components/schemas/Article'
        meta:
          $ref: '#/components/schemas/PaginationMeta'
        links:
          $ref: '#/components/schemas/PageLinks'

    PageLinks:
      type: object
      additionalProperties: false
      properties:
        self:  { type: string, format: uri-reference }
        next:  { type: string, format: uri-reference, nullable: true }
        prev:  { type: string, format: uri-reference, nullable: true }

A concrete valid cursor response:

{
  "data": [
    { "id": "art_104", "title": "Keyset pagination in practice" },
    { "id": "art_103", "title": "Designing sort grammars" }
  ],
  "meta": {
    "type": "cursor",
    "has_more": true,
    "next_cursor": "eyJpZCI6ImFydF8xMDMiLCJjIjoiMjAyNi0wNi0yMCJ9",
    "prev_cursor": null
  },
  "links": {
    "self": "/articles?limit=2",
    "next": "/articles?limit=2&cursor=eyJpZCI6ImFydF8xMDMiLCJjIjoiMjAyNi0wNi0yMCJ9",
    "prev": null
  }
}

A valid offset response for comparison:

{
  "data": [
    { "id": "art_087", "title": "Schema-first workflows" },
    { "id": "art_086", "title": "Validation at the gateway" }
  ],
  "meta": {
    "type": "offset",
    "offset": 20,
    "limit": 20,
    "total": 104
  },
  "links": {
    "self": "/admin/articles?offset=20&limit=20",
    "next": "/admin/articles?offset=40&limit=20",
    "prev": "/admin/articles?offset=0&limit=20"
  }
}

The type discriminator in meta makes the branch unambiguous for both validators and generated clients. The query parameter contract for these endpoints — limit, cursor, offset, sort grammar — is specified in detail in Designing Filter and Sort Query Parameter Schemas.

Step 3 — Mirror the Discriminator in Zod

The OpenAPI schema governs the spec; a Zod guard at the Express/Fastify handler layer enforces the same contract at runtime for every inbound request. Using Zod’s discriminatedUnion for query param and response validation keeps TypeScript narrowing tight and eliminates coercion guesswork.

// pagination.ts — Zod 3.23
import { z } from 'zod';

const base64url = z.string().regex(/^[A-Za-z0-9_-]+$/);

export const cursorMeta = z.object({
  type: z.literal('cursor'),
  has_more: z.boolean(),
  next_cursor: base64url.nullable(),
  prev_cursor: base64url.nullable().optional(),
});

export const offsetMeta = z.object({
  type: z.literal('offset'),
  offset: z.number().int().min(0).max(10000),  // mirrors OpenAPI maximum
  limit: z.number().int().min(1).max(100),
  total: z.number().int().min(0),
});

export const PaginationMetaSchema = z.discriminatedUnion('type', [
  cursorMeta,
  offsetMeta,
]);

export type PaginationMeta = z.infer<typeof PaginationMetaSchema>;
// TypeScript now narrows: if meta.type === 'cursor' then meta.next_cursor exists.
// if meta.type === 'offset' then meta.offset, limit, total exist.

The inbound query param guard for a cursor endpoint:

// articles.route.ts — Zod 3.23 request guard
import { z } from 'zod';
import { Router } from 'express';

const cursorQuery = z.object({
  limit: z.coerce.number().int().min(1).max(100).default(20),
  cursor: z.string().regex(/^[A-Za-z0-9_-]+$/).optional(),
  sort: z.string()
    .regex(/^-?(createdAt|updatedAt|title)(,-?(createdAt|updatedAt|title))*$/)
    .default('-createdAt'),
}).strict(); // reject unknown params — mirrors additionalProperties: false

const router = Router();

router.get('/articles', (req, res) => {
  const result = cursorQuery.safeParse(req.query);
  if (!result.success) {
    return res.status(400).json({ type: 'about:blank', status: 400,
      title: 'Invalid query parameters',
      detail: result.error.issues.map(i => i.message).join('; ') });
  }
  // result.data is fully typed and bounded
});

Step 4 — Add a Stable SQL Ordering Tiebreaker

Pagination correctness at the SQL layer rests on a total ordering. If your sort column contains duplicate values — multiple rows sharing the same created_at timestamp is common — the database may return them in different relative orders across requests. Page boundaries drift and rows are skipped or repeated even with cursor pagination unless a unique tiebreaker (the primary key) is appended to every ORDER BY and included in the cursor payload.

-- Cursor page (descending createdAt, forward direction)
-- The cursor decodes to (last_created_at, last_id) from the previous page.
SELECT id, title, status, created_at
FROM articles
WHERE status = $1                                      -- bounded filter
  AND (created_at, id) < ($2, $3)                     -- keyset predicate from cursor
ORDER BY created_at DESC, id DESC                      -- total ordering, no ties
LIMIT $4 + 1;                                         -- fetch one extra to compute has_more

-- Offset page (same tiebreaker rule applies)
SELECT id, title, status, created_at
FROM articles
WHERE status = $1
ORDER BY created_at DESC, id DESC                     -- tiebreaker still required
LIMIT $2 OFFSET $3;

Fetch limit + 1 rows and trim the last one. If you got limit + 1 rows back, has_more is true and next_cursor encodes the (created_at, id) of the limit-th row. This avoids a separate COUNT(*) query. Ensure a composite index on (status, created_at DESC, id DESC) exists so the keyset predicate is satisfied by an index seek, not a sequential scan.

Before / After

Before — single flat envelope, fails strict validators for either strategy:

# BEFORE — monolithic, ambiguous
PaginationMeta:
  type: object
  properties:
    next_cursor: { type: string }
    prev_cursor: { type: string }
    offset:      { type: integer }
    limit:       { type: integer }
    total:       { type: integer }
    has_more:    { type: boolean }
  # No required list. No additionalProperties: false. No discriminator.
  # Strict validators always flag this as incorrect for either strategy.

After — discriminated union, each branch fully constrained:

# AFTER — discriminated union
PaginationMeta:
  discriminator:
    propertyName: type
    mapping:
      cursor: '#/components/schemas/CursorPaginationMeta'
      offset: '#/components/schemas/OffsetPaginationMeta'
  oneOf:
    - $ref: '#/components/schemas/CursorPaginationMeta'
    - $ref: '#/components/schemas/OffsetPaginationMeta'
# Each branch has required, additionalProperties: false, and its own property set.
# Validators, openapi-typescript 7, and Pact all resolve cleanly.

The TypeScript TS2339 errors disappear because openapi-typescript 7 now generates a proper discriminated union type, and the TypeScript narrowing on meta.type === 'cursor' gives you meta.next_cursor without a cast.

Verification

Regenerate types and confirm no TypeScript errors:

npx openapi-typescript openapi.yaml -o ./src/generated/api-types.ts
npx tsc --noEmit
# Expected: no output — zero type errors

Lint the schema with Spectral 6.11:

npx spectral lint openapi.yaml --ruleset .spectral.yaml --fail-severity warn
# Expected:
# OpenAPI 3.x detected
# No results with a severity of 'error' or above found!

Start a Prism 5 mock and verify both strategies are accepted and malformed requests are rejected:

npx prism mock openapi.yaml --port 4010 &

# Cursor request — valid
curl -s "http://localhost:4010/articles?limit=2" | jq '.meta.type, .meta.has_more'
# "cursor"
# true

# Offset request — valid
curl -s "http://localhost:4010/admin/articles?offset=20&limit=20" | jq '.meta.type, .meta.total'
# "offset"
# 104

# Reject limit over maximum
curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:4010/articles?limit=500"
# 400

# Reject deep offset
curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:4010/admin/articles?offset=50000&limit=20"
# 400

A passing CI run shows Spectral green, tsc --noEmit clean, and both mock assertions returning the expected status codes.

Edge Cases and Caveats

Bi-directional cursor pagination. Including prev_cursor requires encoding both the first and last row’s ordering key per page. This doubles cursor payload size and complicates the SQL predicate. Only implement backward traversal if the UI genuinely requires it — most feed and sync use cases need next_cursor only.
Cursor token invalidation. If you change the sort column or its direction (a common schema evolution), existing cursor tokens decode to a key that no longer maps cleanly to the new ORDER BY clause. Include a version byte inside the opaque token and return 400 Invalid cursor for stale versions rather than silently returning wrong data.
Mixing strategies across endpoints. Having both cursor and offset endpoints in one API is fine, but ensure the response envelope always carries meta.type so clients and validators do not have to infer the strategy from which fields happen to be present. The discriminator makes the contract explicit rather than implicit.

Frequently Asked Questions

Why does offset pagination skip or duplicate rows?

LIMIT/OFFSET resolves against the live table at query time. A row inserted before the second page shifts every subsequent row down by one position, so the page boundary moves and a row is either repeated or skipped. Cursor pagination anchors to a row value instead of a position, so concurrent inserts do not disturb page boundaries.

When should I choose cursor over offset pagination?

Use cursor pagination for any collection that mutates while users browse it — feeds, sync endpoints, infinite-scroll UIs, or tables that receive frequent inserts or deletes. Use offset only for small, stable datasets where users need to jump directly to an arbitrary page number.

What should a cursor token contain?

Encode the last returned row’s ordering key as an opaque base64url string — for example the (created_at, id) pair. Keep the format internal: never document the inner shape to clients, and include a version byte so you can change the encoding later without invalidating live tokens.

How do I avoid deep-scan performance cliffs with offset?

OFFSET n forces the database to scan and discard n rows before returning results. Set a hard maximum on the offset value in the OpenAPI schema and Zod guard (for example, reject offset > 10000 with a 400), and migrate any endpoint with deep scrolling to cursor pagination.

Do I need a total count in the pagination envelope?

No. total requires a full COUNT(*) query and is meaningless for cursor-paginated mutable sets. Return has_more as the authoritative signal of more data and expose total only as an optional, opt-in field that callers must explicitly request.