Cursor vs Offset Pagination Schema Design
Symptom: CI pipelines fail with ValidationError: strict mode: additional properties not allowed ("next_cursor") or Missing required property: "offset" during OpenAPI or Pact contract execution. Frontend builds break with TS2339: Property 'offset' does not exist on type 'CursorPaginationMeta'. Automated snapshot diffs in staging reveal inconsistent meta.pagination payloads across service boundaries, triggering false-positive regression alerts. This guide is part of the Pagination and Filtering Schema Patterns cluster, which covers the full contract from query parameters to response envelopes and runtime guards.
Root Cause: Offset Instability Under Writes
The underlying problem is almost always one of two things: a monolithic pagination envelope that was migrated from offset to cursor without splitting the schema, or an offset strategy applied to a mutable dataset.
Monolithic envelope. A single PaginationMeta object with all fields optional — next_cursor, prev_cursor, offset, limit, total — compiles cleanly but fails strict-mode validators and breaks generated TypeScript clients. Validators flag cursor payloads because offset and limit are missing; they flag offset payloads because next_cursor is an unexpected additional property. The fix is a oneOf discriminated union that makes the two contracts structurally separate.
Offset instability. LIMIT/OFFSET resolves against the live table at the moment the query runs. When a row is inserted before the second page is fetched, every row below the insertion point shifts down by one position. The page boundary moves with it: the row at the old boundary is served twice (once on page 1, once at the top of page 2) and the row that was at the start of the original page 2 is skipped entirely. This is not a bug in any library — it is a structural property of addressing rows by position rather than by value.
Choosing the Right Strategy
| Dimension | Offset (LIMIT/OFFSET) |
Cursor (keyset) |
|---|---|---|
| Addressing model | row position | row value |
| Stability under writes | skips or duplicates rows | stable page boundaries |
| Performance at depth | degrades linearly (O(n) scan) | constant-time per page |
| Arbitrary page jumps | yes (offset=200&limit=20) |
no — forward/back only |
| Total page count | natural to provide | expensive / usually omitted |
| Random access | yes | no |
| Recommended for | small, stable, admin datasets | large, mutable feeds, sync loops |
Decision rule. If the collection can grow without bound or rows are inserted and deleted while users browse it, use cursor pagination. The correctness argument is architectural, not a preference: offset over a mutable dataset is definitionally unsafe. If users need “jump to page 7” semantics and the dataset is small and rarely changes, offset is acceptable — but cap offset in the schema so deep scans are rejected before they hit the database.
Step 1 — Define a Discriminated OpenAPI Schema
Replace any flat PaginationMeta object with a oneOf union that gives each strategy its own fully constrained schema. The type string field acts as the discriminator so validators, code generators, and runtime parsers know which branch applies without guessing.
# openapi.yaml — OpenAPI 3.1
components:
schemas:
PaginationMeta:
description: >
Pagination metadata. The 'type' field selects the branch.
Cursor and offset fields are mutually exclusive.
discriminator:
propertyName: type
mapping:
cursor: '#/components/schemas/CursorPaginationMeta'
offset: '#/components/schemas/OffsetPaginationMeta'
oneOf:
- $ref: '#/components/schemas/CursorPaginationMeta'
- $ref: '#/components/schemas/OffsetPaginationMeta'
CursorPaginationMeta:
type: object
required: [type, has_more]
additionalProperties: false # strict — no unknown fields
properties:
type:
type: string
enum: [cursor]
has_more:
type: boolean
description: Authoritative signal — always present, never absent.
next_cursor:
type: string
nullable: true
pattern: '^[A-Za-z0-9_-]+$' # base64url, no padding
description: >
Opaque token encoding the last row's ordering key.
null when has_more is false. Clients must treat this as opaque.
prev_cursor:
type: string
nullable: true
pattern: '^[A-Za-z0-9_-]+$'
OffsetPaginationMeta:
type: object
required: [type, offset, limit, total]
additionalProperties: false
properties:
type:
type: string
enum: [offset]
offset:
type: integer
minimum: 0
maximum: 10000 # cap: reject deep scans at the schema level
description: Row skip count. Reject values above 10000 with 400.
limit:
type: integer
minimum: 1
maximum: 100
total:
type: integer
minimum: 0
description: >
Total row count. Requires a full COUNT(*). Only appropriate for
small, stable datasets. Omit for large or mutable collections.
additionalProperties: false on each branch is doing the real work. A validator seeing a cursor payload with has_more and next_cursor will pass CursorPaginationMeta cleanly and reject OffsetPaginationMeta (missing offset, limit, total). The discriminator makes that resolution unambiguous, which is why generated clients no longer produce TS2339.
Step 2 — Design the Response Envelope
Never return a bare top-level array. Wrap results in a typed envelope so pagination metadata has a home that does not require a breaking change when you add fields later.
# openapi.yaml — response schema
components:
schemas:
ArticleListResponse:
type: object
required: [data, meta]
additionalProperties: false
properties:
data:
type: array
items:
$ref: '#/components/schemas/Article'
meta:
$ref: '#/components/schemas/PaginationMeta'
links:
$ref: '#/components/schemas/PageLinks'
PageLinks:
type: object
additionalProperties: false
properties:
self: { type: string, format: uri-reference }
next: { type: string, format: uri-reference, nullable: true }
prev: { type: string, format: uri-reference, nullable: true }
A concrete valid cursor response:
{
"data": [
{ "id": "art_104", "title": "Keyset pagination in practice" },
{ "id": "art_103", "title": "Designing sort grammars" }
],
"meta": {
"type": "cursor",
"has_more": true,
"next_cursor": "eyJpZCI6ImFydF8xMDMiLCJjIjoiMjAyNi0wNi0yMCJ9",
"prev_cursor": null
},
"links": {
"self": "/articles?limit=2",
"next": "/articles?limit=2&cursor=eyJpZCI6ImFydF8xMDMiLCJjIjoiMjAyNi0wNi0yMCJ9",
"prev": null
}
}
A valid offset response for comparison:
{
"data": [
{ "id": "art_087", "title": "Schema-first workflows" },
{ "id": "art_086", "title": "Validation at the gateway" }
],
"meta": {
"type": "offset",
"offset": 20,
"limit": 20,
"total": 104
},
"links": {
"self": "/admin/articles?offset=20&limit=20",
"next": "/admin/articles?offset=40&limit=20",
"prev": "/admin/articles?offset=0&limit=20"
}
}
The type discriminator in meta makes the branch unambiguous for both validators and generated clients. The query parameter contract for these endpoints — limit, cursor, offset, sort grammar — is specified in detail in Designing Filter and Sort Query Parameter Schemas.
Step 3 — Mirror the Discriminator in Zod
The OpenAPI schema governs the spec; a Zod guard at the Express/Fastify handler layer enforces the same contract at runtime for every inbound request. Using Zod’s discriminatedUnion for query param and response validation keeps TypeScript narrowing tight and eliminates coercion guesswork.
// pagination.ts — Zod 3.23
import { z } from 'zod';
const base64url = z.string().regex(/^[A-Za-z0-9_-]+$/);
export const cursorMeta = z.object({
type: z.literal('cursor'),
has_more: z.boolean(),
next_cursor: base64url.nullable(),
prev_cursor: base64url.nullable().optional(),
});
export const offsetMeta = z.object({
type: z.literal('offset'),
offset: z.number().int().min(0).max(10000), // mirrors OpenAPI maximum
limit: z.number().int().min(1).max(100),
total: z.number().int().min(0),
});
export const PaginationMetaSchema = z.discriminatedUnion('type', [
cursorMeta,
offsetMeta,
]);
export type PaginationMeta = z.infer<typeof PaginationMetaSchema>;
// TypeScript now narrows: if meta.type === 'cursor' then meta.next_cursor exists.
// if meta.type === 'offset' then meta.offset, limit, total exist.
The inbound query param guard for a cursor endpoint:
// articles.route.ts — Zod 3.23 request guard
import { z } from 'zod';
import { Router } from 'express';
const cursorQuery = z.object({
limit: z.coerce.number().int().min(1).max(100).default(20),
cursor: z.string().regex(/^[A-Za-z0-9_-]+$/).optional(),
sort: z.string()
.regex(/^-?(createdAt|updatedAt|title)(,-?(createdAt|updatedAt|title))*$/)
.default('-createdAt'),
}).strict(); // reject unknown params — mirrors additionalProperties: false
const router = Router();
router.get('/articles', (req, res) => {
const result = cursorQuery.safeParse(req.query);
if (!result.success) {
return res.status(400).json({ type: 'about:blank', status: 400,
title: 'Invalid query parameters',
detail: result.error.issues.map(i => i.message).join('; ') });
}
// result.data is fully typed and bounded
});
Step 4 — Add a Stable SQL Ordering Tiebreaker
Pagination correctness at the SQL layer rests on a total ordering. If your sort column contains duplicate values — multiple rows sharing the same created_at timestamp is common — the database may return them in different relative orders across requests. Page boundaries drift and rows are skipped or repeated even with cursor pagination unless a unique tiebreaker (the primary key) is appended to every ORDER BY and included in the cursor payload.
-- Cursor page (descending createdAt, forward direction)
-- The cursor decodes to (last_created_at, last_id) from the previous page.
SELECT id, title, status, created_at
FROM articles
WHERE status = $1 -- bounded filter
AND (created_at, id) < ($2, $3) -- keyset predicate from cursor
ORDER BY created_at DESC, id DESC -- total ordering, no ties
LIMIT $4 + 1; -- fetch one extra to compute has_more
-- Offset page (same tiebreaker rule applies)
SELECT id, title, status, created_at
FROM articles
WHERE status = $1
ORDER BY created_at DESC, id DESC -- tiebreaker still required
LIMIT $2 OFFSET $3;
Fetch limit + 1 rows and trim the last one. If you got limit + 1 rows back, has_more is true and next_cursor encodes the (created_at, id) of the limit-th row. This avoids a separate COUNT(*) query. Ensure a composite index on (status, created_at DESC, id DESC) exists so the keyset predicate is satisfied by an index seek, not a sequential scan.
Before / After
Before — single flat envelope, fails strict validators for either strategy:
# BEFORE — monolithic, ambiguous
PaginationMeta:
type: object
properties:
next_cursor: { type: string }
prev_cursor: { type: string }
offset: { type: integer }
limit: { type: integer }
total: { type: integer }
has_more: { type: boolean }
# No required list. No additionalProperties: false. No discriminator.
# Strict validators always flag this as incorrect for either strategy.
After — discriminated union, each branch fully constrained:
# AFTER — discriminated union
PaginationMeta:
discriminator:
propertyName: type
mapping:
cursor: '#/components/schemas/CursorPaginationMeta'
offset: '#/components/schemas/OffsetPaginationMeta'
oneOf:
- $ref: '#/components/schemas/CursorPaginationMeta'
- $ref: '#/components/schemas/OffsetPaginationMeta'
# Each branch has required, additionalProperties: false, and its own property set.
# Validators, openapi-typescript 7, and Pact all resolve cleanly.
The TypeScript TS2339 errors disappear because openapi-typescript 7 now generates a proper discriminated union type, and the TypeScript narrowing on meta.type === 'cursor' gives you meta.next_cursor without a cast.
Verification
Regenerate types and confirm no TypeScript errors:
npx openapi-typescript openapi.yaml -o ./src/generated/api-types.ts
npx tsc --noEmit
# Expected: no output — zero type errors
Lint the schema with Spectral 6.11:
npx spectral lint openapi.yaml --ruleset .spectral.yaml --fail-severity warn
# Expected:
# OpenAPI 3.x detected
# No results with a severity of 'error' or above found!
Start a Prism 5 mock and verify both strategies are accepted and malformed requests are rejected:
npx prism mock openapi.yaml --port 4010 &
# Cursor request — valid
curl -s "http://localhost:4010/articles?limit=2" | jq '.meta.type, .meta.has_more'
# "cursor"
# true
# Offset request — valid
curl -s "http://localhost:4010/admin/articles?offset=20&limit=20" | jq '.meta.type, .meta.total'
# "offset"
# 104
# Reject limit over maximum
curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:4010/articles?limit=500"
# 400
# Reject deep offset
curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:4010/admin/articles?offset=50000&limit=20"
# 400
A passing CI run shows Spectral green, tsc --noEmit clean, and both mock assertions returning the expected status codes.
Edge Cases and Caveats
- Bi-directional cursor pagination. Including
prev_cursorrequires encoding both the first and last row’s ordering key per page. This doubles cursor payload size and complicates the SQL predicate. Only implement backward traversal if the UI genuinely requires it — most feed and sync use cases neednext_cursoronly. - Cursor token invalidation. If you change the sort column or its direction (a common schema evolution), existing cursor tokens decode to a key that no longer maps cleanly to the new
ORDER BYclause. Include a version byte inside the opaque token and return400 Invalid cursorfor stale versions rather than silently returning wrong data. - Mixing strategies across endpoints. Having both cursor and offset endpoints in one API is fine, but ensure the response envelope always carries
meta.typeso clients and validators do not have to infer the strategy from which fields happen to be present. The discriminator makes the contract explicit rather than implicit.
Frequently Asked Questions
Why does offset pagination skip or duplicate rows?
LIMIT/OFFSET resolves against the live table at query time. A row inserted before the second page shifts every subsequent row down by one position, so the page boundary moves and a row is either repeated or skipped. Cursor pagination anchors to a row value instead of a position, so concurrent inserts do not disturb page boundaries.
When should I choose cursor over offset pagination?
Use cursor pagination for any collection that mutates while users browse it — feeds, sync endpoints, infinite-scroll UIs, or tables that receive frequent inserts or deletes. Use offset only for small, stable datasets where users need to jump directly to an arbitrary page number.
What should a cursor token contain?
Encode the last returned row’s ordering key as an opaque base64url string — for example the (created_at, id) pair. Keep the format internal: never document the inner shape to clients, and include a version byte so you can change the encoding later without invalidating live tokens.
How do I avoid deep-scan performance cliffs with offset?
OFFSET n forces the database to scan and discard n rows before returning results. Set a hard maximum on the offset value in the OpenAPI schema and Zod guard (for example, reject offset > 10000 with a 400), and migrate any endpoint with deep scrolling to cursor pagination.
Do I need a total count in the pagination envelope?
No. total requires a full COUNT(*) query and is meaningless for cursor-paginated mutable sets. Return has_more as the authoritative signal of more data and expose total only as an optional, opt-in field that callers must explicitly request.