Defining gRPC Service Contracts with Protocol Buffers

A gRPC service becomes wire-incompatible the moment a developer renumbers a field, reuses a retired field number, or deletes a field without reserving its tag. The result is not a 400 response or a clear error log. It is a silent mis-decode: a value lands in the wrong field, a field quietly falls back to its zero default, or a numeric field is read from bytes that were written for another type. These failures are intermittent, environment-specific, and extremely hard to trace after the fact. The root cause is the same each time: the .proto file was treated as mutable source code rather than as an immutable wire contract.

This guide extends REST vs GraphQL vs gRPC Contract Strategies and shows how to author a well-governed proto3 service definition, configure buf (version 1.39) for lint and breaking-change detection, and wire both checks into GitHub Actions so the gate is enforced on every pull request.

Symptom

A backend team ships a “refactoring” commit that renames and reorganizes fields in orders.proto. The CI pipeline goes green — no compilation errors, tests pass. Within hours, the mobile clients and the Python billing service begin emitting sporadic decode errors:

proto: (line 1:8): unknown field "4" in orders.v1.Order
google.protobuf.message.DecodeError: Error parsing message

Some responses decode partially: status reads as 0 (the proto3 zero default) even though the server set it to 2. An integer field reads as a large garbage value. The commit that caused this looks innocuous in the diff: a field called amount_cents was removed, its number 4 was reassigned to a new discount_pct field of type float, and the existing status field was moved from number 2 to number 3 to “keep things tidy.”

Root Cause

Protocol Buffers encodes each field as a tag–value pair on the wire. The tag is derived from the field number, not the field name. When a decoder receives a message, it looks up the field number in its local .proto and interprets the following bytes according to that definition’s wire type.
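The tag arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration of the published wire format, not code from any protobuf runtime; the helper names encode_varint and encode_tag are made up for this example.

```python
# Sketch of protobuf tag and varint encoding using only the standard library.
WIRE_VARINT = 0  # int32, int64, bool, enum
WIRE_I64 = 1     # fixed64, double
WIRE_LEN = 2     # string, bytes, embedded messages
WIRE_I32 = 5     # fixed32, float


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A tag is (field_number << 3) | wire_type, itself varint-encoded."""
    return encode_varint((field_number << 3) | wire_type)


# The field name never appears on the wire. Field 4 as int64 (varint) and
# field 4 as float (fixed32) differ only in the low three bits of the tag.
tag_as_int64 = encode_tag(4, WIRE_VARINT)
tag_as_float = encode_tag(4, WIRE_I32)
print(tag_as_int64.hex(), tag_as_float.hex())  # prints: 20 25
```

Because the name is absent from the encoding, a decoder has only the field number and the three wire-type bits to go on, which is why renumbering is invisible to the compiler but visible to every deployed client.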

The three changes above each caused a distinct class of corruption:

| Change | Old decoder’s interpretation |
| --- | --- |
| Removed amount_cents (field 4), reused 4 for discount_pct (float) | The float arrives as wire type 5 (fixed32), but the old decoder expects int64 as wire type 0 (varint). Depending on the runtime, the value is dropped as an unknown field, the parse fails, or a garbage number surfaces |
| Moved status from field 2 to field 3 | Old decoder finds no usable status at field 2 and sets status to the default 0; the bytes at field 3 are silently ignored |
| Renamed amount_cents to discount_pct without reserving | Future developers can reuse both the number and the name with a new type: a silent time bomb |

None of these produce a compilation error. Proto3 wire format is deliberately lenient: unknown fields are skipped, missing fields default to zero values. That leniency is what makes field number governance critical — there is no runtime firewall.
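To make the leniency concrete, here is a rough simulation of an old client decoding a payload from the renumbered server. It is a sketch under stated assumptions: the decoder is hand-written for this example, it models runtimes that treat a wire-type mismatch as an unknown field (others may raise a parse error instead), and the payload values "A1" and 12.5 are invented.

```python
import struct


def read_varint(buf: bytes, pos: int):
    """Read one base-128 varint starting at pos; return (value, new_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_fields(buf: bytes):
    """Yield (field_number, wire_type, raw_value) for each field on the wire."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield number, wire_type, value


# Old schema: field number -> (name, expected wire type).
OLD_ORDER = {1: ("id", 2), 2: ("status", 0), 3: ("total", 1), 4: ("amount_cents", 0)}


def decode_with_old_schema(buf: bytes):
    """Decode known fields; collect numbers of skipped (unknown or mismatched) fields."""
    msg = {"id": "", "status": 0, "total": 0.0, "amount_cents": 0}
    skipped = []
    for number, wire_type, value in read_fields(buf):
        name, expected = OLD_ORDER.get(number, (None, None))
        if name is None or wire_type != expected:
            skipped.append(number)
            continue
        if name == "id":
            value = value.decode("utf-8")
        elif name == "total":
            value = struct.unpack("<d", value)[0]
        msg[name] = value
    return msg, skipped


# New server: id at tag 1, status = 2 moved to tag 3, a float written at tag 4.
payload = (
    b"\x0a\x02A1"                          # field 1, length-delimited, "A1"
    + b"\x18\x02"                          # field 3, varint, status = 2
    + b"\x25" + struct.pack("<f", 12.5)    # field 4, 32-bit, discount
)
msg, skipped = decode_with_old_schema(payload)
print(msg, skipped)
```

In this model the old client reports status 0 and amount_cents 0 without raising anything, and the only trace of the mismatch is the list of skipped field numbers.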

The fix has two parts: repair the .proto with correct reserved declarations and add buf lint + buf breaking to CI so this class of mistake is caught before it merges.

Figure: Field number reuse, safe vs unsafe change. A sequence diagram between a server (new .proto) and a client (old .proto), contrasting a safe additive change with an unsafe reuse.

Safe: a new field is added with a new field number (5). The server encodes tag 1 (id), tag 2 (status), tag 3 (total), and tag 5 (the new discount field); the client ignores the unknown tag 5 and decodes every known field correctly.

Unsafe: field 4 is reused with an incompatible wire type. The server encodes tag 4 as a float (wire type 5, fixed32); the client reads tag 4 as int64 (wire type 0, varint) and gets a garbage value or an error.

buf breaking catches the reuse before it merges (exit code 100) with a FIELD_SAME_TYPE finding in orders/v1/orders.proto: field 4 on message "Order" changed type from "int64" to "float".

Step-by-Step Fix

Step 1: Audit and Repair the .proto File

Start with the current broken state. The file below shows the damage: field 4 has been silently reused and field 2 was renumbered.

Before (broken):

// orders/v1/orders.proto — BROKEN: field numbers renumbered, number 4 reused
syntax = "proto3";
package orders.v1;

message Order {
  string  id             = 1;
  string  description    = 2;  // took over the original status tag: BREAKS status
  Status  status         = 3;  // moved from 2 in the "tidy" renumber; tag 3 was total
  double  total          = 5;
  float   discount_pct   = 4;  // reuses former amount_cents tag — BREAKS wire
}

enum Status {
  STATUS_UNSPECIFIED = 0;
  STATUS_PENDING     = 1;
  STATUS_SHIPPED     = 2;
}

service OrderService {
  rpc GetOrder (GetOrderRequest) returns (Order);
}

message GetOrderRequest {
  string id = 1;
}

After (repaired):

// orders/v1/orders.proto — REPAIRED: original numbers restored, deletions reserved
syntax = "proto3";
package orders.v1;

import "google/protobuf/timestamp.proto";  // well-known type: stable JSON mapping

// Order represents a placed customer order.
// Field numbers are permanent. Never renumber, never reuse a reserved number.
message Order {
  string  id             = 1;   // immutable: string id, tag 1 forever
  Status  status         = 2;   // immutable: tag 2 forever
  double  total          = 3;   // immutable: tag 3 forever

  // Tag 4 held amount_cents (int64). Field deleted 2026-06-20.
  // Reserved so this number and name can never be reused with a different type.
  reserved 4;
  reserved "amount_cents";

  float   discount_pct   = 5;   // NEW field — assigned the next available number
  string  description    = 6;   // NEW field — never reassign tags from above

  // created_at uses google.protobuf.Timestamp for unambiguous UTC semantics
  // and canonical JSON mapping (RFC 3339 string) via grpc-gateway / transcoding.
  google.protobuf.Timestamp created_at = 7;
}

enum Status {
  STATUS_UNSPECIFIED = 0;   // proto3 requires a zero enum value — always name it _UNSPECIFIED
  STATUS_PENDING     = 1;
  STATUS_SHIPPED     = 2;
  STATUS_CANCELLED   = 3;
}

// OrderService exposes order lifecycle operations.
service OrderService {
  rpc GetOrder    (GetOrderRequest)    returns (Order);
  rpc ListOrders  (ListOrdersRequest)  returns (ListOrdersResponse);
}

message GetOrderRequest {
  string id = 1;
}

message ListOrdersRequest {
  // page_token is an opaque cursor for pagination — string keeps options open.
  string page_token  = 1;
  int32  page_size   = 2;
}

message ListOrdersResponse {
  repeated Order orders          = 1;
  string         next_page_token = 2;
}

Why this works: restoring the original field numbers means old clients decoding new responses, and new clients decoding responses from a briefly deployed old server, read the correct wire type for each tag. The reserved block is the non-negotiable safety rail: the protobuf compiler, which buf runs before any lint or breaking check, rejects any future field definition that reuses number 4 or the name amount_cents, eliminating that class of accidental wire break.

Step 2: Configure buf.yaml

Place buf.yaml at the root of your proto directory (e.g., proto/buf.yaml).

# proto/buf.yaml — buf 1.39
version: v2

# Declare the module. The name is optional for private repos; required for BSR publication.
name: buf.build/acme/orders

# Lint: enforce style and structure rules such as enum value prefixes and field naming.
lint:
  use:
    - STANDARD         # named DEFAULT in v1 configs; includes FIELD_LOWER_SNAKE_CASE, ENUM_ZERO_VALUE_SUFFIX, etc.
  except:
    - PACKAGE_VERSION_SUFFIX       # remove this line if you enforce v1/v2 package paths
    - RPC_RESPONSE_STANDARD_NAME   # GetOrder returns Order directly rather than a GetOrderResponse wrapper

# Breaking: check against the FILE category (superset of WIRE + WIRE_JSON).
# FILE includes changes to json_name, which matters for grpc-gateway / gRPC-JSON transcoding.
breaking:
  use:
    - FILE

The FILE rule category in buf 1.39 covers everything in WIRE (pure binary compatibility) and additionally catches json_name changes and service/method renames — all of which break consumers even if the binary wire format is technically unchanged.

Step 3: Run buf lint Locally

# From the proto/ directory
buf lint

# Clean output (no violations):
# (no output, exit code 0)

# Example violation if the zero value were STATUS_PENDING (rule ENUM_ZERO_VALUE_SUFFIX);
# exact wording can vary by buf version:
# orders/v1/orders.proto:14:3:Enum zero value name "STATUS_PENDING" should be suffixed with "_UNSPECIFIED".

Fix every lint violation before committing. buf lint and buf breaking are independent checks, but a clean lint baseline keeps CI output readable, so a genuine breaking-change finding is not buried among style warnings.

Step 4: Run buf breaking Against the Baseline

# From the proto/ directory: compare the working tree against the main branch baseline.
# The repository's .git directory is one level up; subdir points at the module inside it.
buf breaking --against '../.git#branch=main,subdir=proto'

# When the working tree is compatible with the baseline, output is empty and the exit code is 0.

# If you had shipped the broken version, the output would include a finding like this one,
# plus similar findings for the other renumbered fields (exact wording varies by buf version):
# orders/v1/orders.proto:10:3:Field "4" with name "discount_pct" on message "Order" changed type
#   from "int64" to "float".
#   (buf breaking exit code: 100)

The --against flag accepts any buf input: a local Git repository such as .git#branch=main or .git#tag=v1.2.0, a BSR reference such as buf.build/acme/orders:main, or a local directory. Options after the # are separated by commas. In a monorepo, scope the check to one package with --path:

buf breaking --against '../.git#branch=main,subdir=proto' --path orders/v1

Step 5: Enforce in GitHub Actions

# .github/workflows/proto-contracts.yml
name: Proto Contract Gate
on:
  pull_request:
    branches: [main]
    paths:
      - 'proto/**'

jobs:
  buf-checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0          # REQUIRED: buf breaking needs the full git history to resolve origin/main

      - name: Install buf 1.39
        run: |
          curl -sSL \
            "https://github.com/bufbuild/buf/releases/download/v1.39.0/buf-$(uname -s)-$(uname -m)" \
            -o /usr/local/bin/buf
          chmod +x /usr/local/bin/buf
          buf --version             # prints: 1.39.0

      - name: buf lint
        working-directory: proto
        run: buf lint

      - name: buf breaking
        working-directory: proto
        run: buf breaking --against '../.git#branch=origin/main,subdir=proto'
        # Exit code 100 = breaking change found → build fails
        # Exit code 0   = no breaking changes → build passes

The fetch-depth: 0 on the checkout step is load-bearing. The default shallow checkout fetches only the commit that triggered the workflow, so origin/main is not available as a baseline and the result of the comparison cannot be trusted. Do not assume a green check means the comparison ran: confirm the gate with the deliberate-break test described under Verification.

Before / After Comparison

The table below summarises what changed between the broken and repaired .proto:

| Aspect | Before (broken) | After (repaired) |
| --- | --- | --- |
| Field 2 | description (string) | status (Status enum), original assignment |
| Field 3 | status (Status enum) | total (double), original assignment |
| Field 4 | discount_pct (float), reused | reserved 4; reserved "amount_cents"; |
| Field 5 | total (double) | discount_pct (float), new number |
| Field 6 | absent | description (string), new number |
| Timestamp | absent | created_at (google.protobuf.Timestamp) at field 7 |
| buf breaking result | exit 100, multiple field type/number changes | exit 0, no breaking changes |
| buf lint result | violations (enum zero value, field naming) | exit 0, clean |

Verification

After merging the repaired schema, verify the gate is functional by introducing a deliberate breaking change on a test branch:

# On a scratch branch, from the repository root: simulate a breaking change.
# Change the status field number from 2 to 99 in proto/orders/v1/orders.proto, then:
buf breaking proto --against '.git#branch=main,subdir=proto'

# Expected output resembles the following (wording and positions vary by buf version).
# buf tracks fields by number, so the move is reported as a deletion of field 2:
# proto/orders/v1/orders.proto:9:1:Previously present field "2" with name "status"
#   on message "Order" was deleted.
# (exit code 100)

# Revert the change and confirm clean:
git checkout -- proto/orders/v1/orders.proto
buf breaking proto --against '.git#branch=main,subdir=proto'
# (no output, exit code 0)

Also verify that the CI workflow itself runs by checking the Actions tab after pushing a proto-only commit to a pull request. The buf-checks job must appear in the required status checks. Because buf breaking prints nothing when it passes, a green step alone does not prove that the baseline was resolved; open a pull request from the scratch branch with the deliberate breaking change and confirm that the step fails with exit code 100.

Edge Cases and Caveats

Field number reuse after a long gap. A team removes a field in Q1 and forgets to add reserved. In Q3 another developer assigns the same number to a new field with a different type. buf breaking will not catch this if the reservation was never added, because it compares against the current baseline — and the baseline already had the number gone. This is why reserved must be added at deletion time, not retroactively. Governance rule: a PR that removes a field without a corresponding reserved block must be rejected in review.
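The review rule above can be backed by a small script that compares the old and new body of a message and reports deletions that lack a reservation. This is a rough sketch, not a protobuf parser: the regular expressions and the function names parse_message and unreserved_deletions are illustrative assumptions, and the sketch ignores map fields, oneof blocks, nested messages, and "to max" ranges.

```python
import re

# Matches simple field declarations such as "repeated Order orders = 1;".
FIELD_RE = re.compile(
    r"^\s*(?:repeated\s+|optional\s+)?[\w.]+\s+(\w+)\s*=\s*(\d+)\s*[;\[]", re.M
)
RESERVED_RE = re.compile(r"^\s*reserved\s+([^;]+);", re.M)
COMMENT_RE = re.compile(r"//[^\n]*")


def parse_message(body: str):
    """Return ({number: name}, reserved_numbers, reserved_names) for a message body."""
    body = COMMENT_RE.sub("", body)
    fields = {int(number): name for name, number in FIELD_RE.findall(body)}
    numbers, names = set(), set()
    for clause in RESERVED_RE.findall(body):
        for item in (part.strip() for part in clause.split(",")):
            if item.startswith('"'):
                names.add(item.strip('"'))
            elif " to " in item:
                start, end = item.split(" to ")
                numbers.update(range(int(start), int(end) + 1))
            else:
                numbers.add(int(item))
    return fields, numbers, names


def unreserved_deletions(old_body: str, new_body: str) -> list[str]:
    """List fields removed from old_body that new_body does not reserve."""
    old_fields, _, _ = parse_message(old_body)
    new_fields, reserved_numbers, reserved_names = parse_message(new_body)
    problems = []
    for number, name in sorted(old_fields.items()):
        if number in new_fields:
            continue
        if number not in reserved_numbers:
            problems.append(f"field {number} ({name}) deleted without 'reserved {number};'")
        if name not in reserved_names and name not in new_fields.values():
            problems.append(f"name {name} deleted without 'reserved \"{name}\";'")
    return problems


OLD_BODY = """
  string id = 1;
  Status status = 2;
  double total = 3;
  int64 amount_cents = 4;
"""
NEW_BODY_BAD = """
  string id = 1;
  Status status = 2;
  double total = 3;
"""
NEW_BODY_GOOD = NEW_BODY_BAD + '  reserved 4;\n  reserved "amount_cents";\n'

print(unreserved_deletions(OLD_BODY, NEW_BODY_BAD))
print(unreserved_deletions(OLD_BODY, NEW_BODY_GOOD))
```

A check like this complements human review; it does not replace the compiler, which remains the component that actually rejects reuse of a reserved number or name.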

Enum default and zero value semantics. Proto3 treats the zero value of any enum as its default, so a missing field on the wire is indistinguishable from the zero enum value. Naming the zero value FOO_UNSPECIFIED = 0 (enforced by buf lint rule ENUM_ZERO_VALUE_SUFFIX) makes this explicit. Never use the zero slot for a semantically meaningful value like STATUS_PENDING = 0, because any message in which status was never set, or was dropped by a decoder on a mismatched schema, silently reads as PENDING rather than surfacing as missing.
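The difference between the two naming choices shows up as soon as a field is absent. The following sketch imitates proto3 default handling with plain Python enums; it does not use a protobuf runtime, and the names GoodStatus, RiskyStatus, and decode_status are invented for the illustration.

```python
from enum import IntEnum


class GoodStatus(IntEnum):
    STATUS_UNSPECIFIED = 0  # zero slot carries no business meaning
    STATUS_PENDING = 1
    STATUS_SHIPPED = 2


class RiskyStatus(IntEnum):
    STATUS_PENDING = 0      # zero slot doubles as a real state
    STATUS_SHIPPED = 1


def decode_status(wire_fields: dict, enum_cls):
    """Imitate proto3: an enum field absent from the wire reads as value 0."""
    return enum_cls(wire_fields.get("status", 0))


empty_message = {}  # status was never set, or was dropped during decoding
print(decode_status(empty_message, GoodStatus).name)   # prints: STATUS_UNSPECIFIED
print(decode_status(empty_message, RiskyStatus).name)  # prints: STATUS_PENDING
```

With the first enum a consumer can treat STATUS_UNSPECIFIED as a signal to investigate; with the second, the same empty message looks like a valid pending order.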

JSON mapping (json_name) and gRPC-JSON transcoding. By default, protobuf derives the JSON key from the field name using lower-camel conversion: discount_pct becomes discountPct. Overriding with json_name = "discount_percentage" is a breaking change for all JSON consumers even though the binary wire format is unchanged. buf breaking flags this under the FILE category used in this guide; the same rule also belongs to the PACKAGE and WIRE_JSON categories. If you use Breaking Change Detection tooling across paradigms, be aware that the gRPC gate must be stricter than WIRE to protect grpc-gateway and Connect-protocol consumers.
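The default conversion is simple enough to reproduce when auditing which JSON keys a schema exposes. The function below is a sketch of the rule as commonly described (drop each underscore and upper-case the character that follows); default_json_name is a hypothetical helper, not a protobuf API.

```python
def default_json_name(field_name: str) -> str:
    """Derive the default JSON key: drop '_' and upper-case the next character."""
    out = []
    upper_next = False
    for ch in field_name:
        if ch == "_":
            upper_next = True
        elif upper_next:
            out.append(ch.upper())
            upper_next = False
        else:
            out.append(ch)
    return "".join(out)


for name in ("id", "discount_pct", "next_page_token", "created_at"):
    print(name, "->", default_json_name(name))
```

Listing the derived keys for every field before and after a change is a quick way to see whether a rename or a json_name override alters the JSON surface.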

Frequently Asked Questions

Why does renaming a protobuf field break binary compatibility?

It does not. Protobuf encodes by numeric field tag, not by name, so a rename is safe on the binary wire. The risk is reusing the old name for a new field with a different type, which is why reserved also covers names: the protobuf compiler that buf runs rejects that reuse before it ships.

Can I change a field’s type from string to int32 safely?

No. Protobuf wire types differ: string uses length-delimited encoding (wire type 2) and int32 uses varint (wire type 0). A decoder using the old .proto cannot read the value as intended: depending on the runtime it drops the field as unknown, fails the parse, or returns garbage. Reserve the old field number and add a new one with the new type.

What does buf breaking actually check?

buf breaking compares the current .proto against a baseline (a git ref, a BSR tag, or a local directory) and flags field deletions, field number changes, field type changes, service method removals, and message renames within a package — all changes that alter the binary wire format or stub API.

How do I handle a field that must be removed?

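Add reserved <number>; and reserved "<name>"; to the message in the same commit that deletes the field definition, so the baseline is never in an unprotected intermediate state. The protobuf compiler then rejects any future field that reuses the reserved number or name. Note that under the FILE category buf breaking still reports the deletion itself (rule FIELD_NO_DELETE), because generated code loses the field; the reservation-aware rules FIELD_NO_DELETE_UNLESS_NUMBER_RESERVED and FIELD_NO_DELETE_UNLESS_NAME_RESERVED belong to the WIRE and WIRE_JSON categories.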

Is proto3 json_name a breaking change if altered?

Yes for JSON consumers. buf breaking flags json_name modifications under the FILE rule category because gRPC-JSON transcoding and grpc-gateway both use the json_name annotation to map protobuf fields to JSON keys. Treat json_name as immutable unless you version the package.

Do well-known types (Timestamp, Duration) help with compatibility?

Yes. google.protobuf.Timestamp and google.protobuf.Duration have stable, published definitions and canonical JSON mappings. Using them instead of raw int64 seconds communicates intent, works across generated SDKs in every language, and avoids the ambiguity that leads to unit-mismatch bugs at the paradigm level described in REST vs GraphQL vs gRPC Contract Strategies.
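For reference, the canonical JSON form of a Timestamp is an RFC 3339 string in UTC. The sketch below renders a seconds/nanos pair the way that mapping is usually described, with 0, 3, 6, or 9 fractional digits; timestamp_to_json is an illustrative helper written for this example, not part of a protobuf library.

```python
from datetime import datetime, timezone


def timestamp_to_json(seconds: int, nanos: int = 0) -> str:
    """Render Timestamp seconds/nanos as an RFC 3339 UTC string."""
    base = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    if nanos == 0:
        return base + "Z"
    fraction = f"{nanos:09d}"
    # Use the shortest of 3, 6, or 9 digits that loses no precision.
    for width in (3, 6, 9):
        if fraction[width:].strip("0") == "":
            return f"{base}.{fraction[:width]}Z"
    return f"{base}.{fraction}Z"  # not reached: width 9 always matches


print(timestamp_to_json(0))                # prints: 1970-01-01T00:00:00Z
print(timestamp_to_json(1, 500_000_000))   # prints: 1970-01-01T00:00:01.500Z
```

A raw int64 field gives consumers no such guarantee: the same number could be seconds, milliseconds, or microseconds, which is the ambiguity the well-known type removes.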