The unified diff format is the lingua franca of code review, version
control, and patch distribution. Every git diff, every GitHub pull request,
every diff -u on the command line, every email-delivered patch on the Linux
kernel mailing list speaks the same compact syntax: a header that names two files, a
hunk header wrapped in @@ markers, and a block of lines prefixed by
+, -, or a space. Learn to read it fluently and you can review
changes without ever opening a graphical diff viewer.
The format was introduced by Wayne Davison in 1990 as a more compact alternative to the
older context diff and was quickly absorbed into GNU diffutils, Larry Wall's
patch utility, and every version-control system that followed. The POSIX
standard codified it in
IEEE Std 1003.1,
and three decades later it remains the default output of git diff,
svn diff, hg diff, and effectively every tool that emits a
textual diff. This guide breaks down the syntax line by line, shows how to generate and
apply patches, and unpacks the edge cases that bite even experienced engineers.
What Is a Unified Diff?
A unified diff is a plain-text representation of the differences between two files. Unlike the side-by-side view that graphical tools render, a unified diff interleaves both versions of the file into a single linear stream. Each line in the output carries a one-character prefix that classifies it:
- (minus): line present in the original file, removed in the new version
+ (plus): line present in the new file, added relative to the original
(space): unchanged context line, present in both versions
The format is line-oriented, which makes it ideal for source code, configuration files, and structured text. It is not designed for binary content — for that, see the dedicated guide to binary comparison, which covers byte-level diffing and hex viewers. For broader context on what a diff is and where the term comes from, the diff definition article covers the history from Hunt-McIlroy in 1976 onward.
A complete unified diff has three structural elements: an optional file header, one or more hunks, and within each hunk a hunk header followed by the actual changed and context lines. A minimal example for changing one line in a config file looks like this:
--- config.yaml 2026-05-30 14:22:11
+++ config.yaml 2026-06-02 09:15:42
@@ -3,5 +3,5 @@
 database:
   host: localhost
   port: 5432
-  pool_size: 10
+  pool_size: 25
   timeout: 30s
Five elements are doing work here. The --- line names the original file and
its modification timestamp. The +++ line names the new file. The
@@ line is the hunk header that tells patch exactly where to
apply the change. The lines starting with a space are unchanged context, included so the
patch tool can locate the right position even if surrounding lines have shifted. The
- and + lines are the actual change.
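The prefix rules above can be sketched as a small classifier. This is an illustrative sketch rather than how any particular tool parses diffs; the function name classify_line and the category labels are hypothetical.

```python
def classify_line(line: str) -> str:
    """Classify one line of a unified diff by its leading characters."""
    # File header lines are checked before the one-character prefixes,
    # because "---" and "+++" also start with "-" and "+".
    if line.startswith("--- "):
        return "old-file header"
    if line.startswith("+++ "):
        return "new-file header"
    if line.startswith("@@"):
        return "hunk header"
    if line.startswith("-"):
        return "removed"
    if line.startswith("+"):
        return "added"
    if line.startswith(" "):
        return "context"
    return "other"


sample = [
    "--- config.yaml",
    "+++ config.yaml",
    "@@ -3,5 +3,5 @@",
    " database:",
    "   host: localhost",
    "   port: 5432",
    "-  pool_size: 10",
    "+  pool_size: 25",
    "   timeout: 30s",
]
for line in sample:
    print(f"{classify_line(line):16} | {line}")
```

One caveat: inside a hunk body, a removed line whose content itself begins with "-- " would look like a file header to this sketch. A real parser avoids the ambiguity by counting lines against the hunk header.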
Anatomy of the Unified Diff Format
Every unified diff begins with a file header made of two lines: the
--- line for the source file and the +++ line for the target.
The convention is that --- represents "old" or "before" and +++
represents "new" or "after". The filename is followed by whitespace and, optionally, a
modification timestamp in the format
YYYY-MM-DD HH:MM:SS.nanoseconds ±tz. When Git generates the diff, the
timestamp is omitted and the filenames are prefixed with a/ and b/
to signal "old tree" and "new tree" — which is why patch -p1 is the correct
invocation for Git-style patches and patch -p0 works for plain
diff -u output.
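To make the -p levels concrete, here is a minimal sketch of the path stripping. The helper strip_components is hypothetical and only approximates what patch does; it ignores details such as repeated slashes.

```python
def strip_components(path: str, count: int) -> str:
    """Drop `count` leading path components, similar in spirit to patch -pN."""
    parts = path.split("/")
    if count >= len(parts):
        raise ValueError(f"cannot strip {count} components from {path!r}")
    return "/".join(parts[count:])


# Git-style header path: -p1 removes the synthetic "a/" or "b/" prefix.
print(strip_components("a/src/app.py", 1))  # src/app.py
# Plain diff -u path: -p0 leaves the name untouched.
print(strip_components("src/app.py", 0))    # src/app.py
```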
After the header come one or more hunks. A hunk is a contiguous block
of the file where at least one line has changed, plus surrounding context. Hunks are the
unit at which patch applies changes; each hunk can succeed, fail, or
fuzz-match independently. The number of context lines is configurable
(-U3 is the default — three lines above and three below each change). Two
nearby changes that fall within the same context window collapse into a single hunk;
further apart, they become separate hunks each with their own header.
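The way context controls hunk boundaries can be observed with Python's difflib, whose unified_diff() takes the context size as the n parameter. This is a sketch using difflib's grouping, which may differ in small details from GNU diff or Git; the helper count_hunks and the sample lines are made up for illustration.

```python
import difflib


def count_hunks(old, new, context=3):
    """Count hunk headers in the unified diff of two lists of lines."""
    diff = difflib.unified_diff(old, new, "old.txt", "new.txt",
                                n=context, lineterm="")
    return sum(1 for line in diff if line.startswith("@@"))


original = [f"line {i}" for i in range(1, 31)]

# Two edits separated by only two unchanged lines.
near = list(original)
near[4] = "line 5 (edited)"
near[7] = "line 8 (edited)"

# Two edits separated by nineteen unchanged lines.
far = list(original)
far[4] = "line 5 (edited)"
far[24] = "line 25 (edited)"

print(count_hunks(original, near))  # 1: the context windows overlap
print(count_hunks(original, far))   # 2: too far apart to share context
```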
Inside each hunk the format alternates between context, removed, and added lines in any
order. There is no requirement that removals come before additions; the lines appear in
the order they exist in the source. A common pattern for replacing a single line is
-old immediately followed by +new, but a longer rewrite can
intermix both prefixes freely.
Decoding the @@ Hunk Header
The hunk header is the densest piece of syntax in the format and the source of most confusion. Its grammar is:
@@ -oldStart,oldCount +newStart,newCount @@ optional-section-heading
Reading left to right: the doubled @@ opens the header, -oldStart
is the 1-based line number where this hunk begins in the original file,
oldCount is how many lines from the original appear in the hunk (counting
both context lines and removed lines), +newStart is the line number in the
new file, newCount is how many lines from the new file appear in the hunk
(counting context lines and added lines), and a closing @@ ends the header.
Anything after the closing @@ is an optional section heading: Git
fills it in with the enclosing function or section name to help reviewers (driven by
per-language xfuncname regexes).
Three numeric rules trip up newcomers. First, when a count is exactly 1 it
may be omitted, so @@ -42 +42 @@ means the same as @@ -42,1 +42,1 @@.
Second, when a count is 0 (the hunk only adds new lines or only deletes
lines), the start number points to the line before the insertion or deletion,
not at it — so @@ -0,0 +1,5 @@ is the standard header for a brand-new file
whose first five lines are all additions. Third, the counts include context lines, not
just changed lines, so oldCount and newCount always
differ from each other by exactly the net number of + minus -
lines in the hunk.
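The grammar and the omitted-count rule can be expressed as a short parser. This is a sketch: the regular expression and the names HUNK_RE, HunkHeader, and parse_hunk_header are illustrative assumptions, not taken from any tool's source.

```python
import re
from typing import NamedTuple

# Counts are optional; the section heading after the closing @@ is free text.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")


class HunkHeader(NamedTuple):
    old_start: int
    old_count: int
    new_start: int
    new_count: int
    section: str


def parse_hunk_header(line: str) -> HunkHeader:
    match = HUNK_RE.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_count, new_start, new_count, section = match.groups()
    return HunkHeader(
        int(old_start),
        int(old_count) if old_count is not None else 1,  # omitted count means 1
        int(new_start),
        int(new_count) if new_count is not None else 1,
        section,
    )


print(parse_hunk_header("@@ -42 +42 @@"))
print(parse_hunk_header("@@ -0,0 +1,5 @@"))
print(parse_hunk_header("@@ -10,7 +10,7 @@ function calculate(items) {"))
```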
A worked example. Given the hunk header @@ -42,7 +42,8 @@ def calculate_total
followed by a body containing five context lines, two minus lines, and three plus lines,
the math checks out: oldCount = 5 + 2 = 7 and
newCount = 5 + 3 = 8, and the difference of +1 matches the net
addition of one line. If your hunk header disagrees with the body, the patch is corrupt
and patch will refuse it.
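The same check can be automated. In this sketch, a header of -42,7 +42,8 requires a body with seven old-side and eight new-side lines, which here means five context lines, two removals, and three additions. The helper body_counts and the sample body are hypothetical.

```python
def body_counts(body_lines):
    """Return (old_count, new_count) implied by the lines of a hunk body."""
    old = new = 0
    for line in body_lines:
        if line.startswith("\\"):
            # "\ No newline at end of file" belongs to neither count.
            continue
        if line.startswith("-"):
            old += 1
        elif line.startswith("+"):
            new += 1
        else:
            # Context line: present on both sides.
            old += 1
            new += 1
    return old, new


body = [
    " context 1",
    " context 2",
    "-old a",
    "-old b",
    "+new a",
    "+new b",
    "+new c",
    " context 3",
    " context 4",
    " context 5",
]
print(body_counts(body))  # (7, 8), matching @@ -42,7 +42,8 @@
```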
How to Generate a Unified Diff
Three commands cover almost every real-world need: diff -u for files, git diff for repositories, and diff -ruN for directories. For a one-shot comparison of two local files, run:
diff -u old.txt new.txt > change.patch
The -u flag selects unified format with the default three lines of context.
Use -U5 (or any other line count) in place of -u to widen the context; wider context gives
patch more surrounding lines to locate the right position when the file drifts. Add --label "before"
and --label "after" to override the ---/+++
filenames, which is handy when piping through process substitution.
For a directory comparison, use diff -ruN old/ new/ > tree.patch: the
-r walks recursively, -N treats absent files as empty so the
patch can recreate added files or remove deleted ones, and -u still controls
the format. To exclude files matching a pattern, pass -x '*.log' -x 'node_modules'.
The companion guide to the diff command in Unix
covers the full flag matrix including ignore-whitespace, ignore-case, and brief modes.
Inside a Git repository, git diff emits unified format with Git-specific
extensions (covered in a later section). The common variants are:
git diff # working tree vs index (unstaged changes)
git diff --cached # index vs HEAD (staged changes)
git diff HEAD # working tree vs HEAD (all changes)
git diff main feature # branch tip vs branch tip
git diff abc123 def456 -- src/ # two commits, limited to src/
git diff --no-color > change.patch # save to a file for review or sharing
git format-patch takes this one step further: it emits one mail-formatted
patch file per commit, each ready to be applied with git am. Emailed patches are how
the Linux kernel community has exchanged code for decades, and the workflow remains useful
for forks that lack pull-request infrastructure. For comparing source trees
across machines, the Linux diff tool ranking
covers GUI options that operate on the same underlying format.
In Python, the standard library exposes the format directly through
difflib.unified_diff(), which is documented at
docs.python.org/library/difflib.
It takes two lists of strings (one per file) and yields lines of a unified diff. The
Python file comparison guide shows
runnable patterns. On Windows without WSL, git diff works inside any Git
installation, or PowerShell scripts can use Compare-Object — see
PowerShell diff techniques for that route.
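As a minimal sketch of the Python route, the following feeds two versions of a small config file to difflib.unified_diff(). The file contents and the config.yaml label are sample data chosen for illustration.

```python
import difflib

old = [
    "database:\n",
    "  host: localhost\n",
    "  port: 5432\n",
    "  pool_size: 10\n",
    "  timeout: 30s\n",
]
new = [
    "database:\n",
    "  host: localhost\n",
    "  port: 5432\n",
    "  pool_size: 25\n",
    "  timeout: 30s\n",
]

# unified_diff() is a generator; the input lines keep their trailing
# newlines, so the output can be joined and written out directly.
diff_lines = list(
    difflib.unified_diff(old, new, fromfile="config.yaml", tofile="config.yaml")
)
print("".join(diff_lines), end="")
```

The optional n argument sets the number of context lines, mirroring -U on the command line.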
How to Read a Unified Diff
Reading a diff fluently means parsing it in three passes: file header for context, hunk
header for location, and the line prefixes for the actual change. Train your eye to
skip context lines (leading space) and zero in on -/+ pairs.
For a single-line edit, the pattern is unmistakable: one minus line directly followed by
one plus line means "replace this line with that line". For larger rewrites, scan all
the - lines first to understand what is being removed, then all the
+ lines to understand what replaces them, then re-read in order to confirm
the line-by-line correspondence.
Watch for three subtle cues. A hunk header where oldCount is much smaller
than newCount (or vice versa) signals a large insertion or deletion, not a
rewrite. A trailing \ No newline at end of file marker indicates that one
of the two versions lacks a final newline — a real difference that source-control tools
track but text editors often hide. An empty hunk header
(@@ -0,0 +0,0 @@) is invalid; if you see it, the diff was corrupted by an
editor or by a tool that does not understand the format.
Most teams pair unified diffs with a graphical viewer for large changes and read the raw format for smaller ones during code review. The side-by-side diff view simply re-renders the same underlying data with the two files in adjacent columns, which is easier to scan for rewrites but takes twice the screen width.
Applying Patches with patch and git apply
Applying a unified diff is the inverse operation. There are three primary tools and choosing the right one matters more than most tutorials admit.
The classical tool is patch, written by Larry Wall in 1985 and packaged on
every Unix system. The crucial flag is -p, which strips leading path
components from the filenames in the patch. patch -p1 < change.patch is
right for any patch generated by Git (because Git prepends a/ and
b/), and patch -p0 < change.patch is right for a plain
diff -u old new run inside the project directory. Pass --dry-run
first to preview the operation without modifying files. When a hunk cannot find its
exact context, patch tries to fuzz-match and writes any rejects to
filename.rej for manual reconciliation.
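To show what a hunk application involves, here is a deliberately strict sketch: it checks the expected lines at the stated position and fails otherwise, with no fuzz matching or offset search. The function apply_hunk and the sample file contents are hypothetical and far simpler than the real patch utility.

```python
def apply_hunk(lines, old_start, body):
    """Apply one hunk body to a list of lines (no trailing newlines)."""
    # Old side: context plus removed lines. New side: context plus added lines.
    expected = [l[1:] for l in body if l[:1] in (" ", "-")]
    replacement = [l[1:] for l in body if l[:1] in (" ", "+")]
    # Headers are 1-based; with an old count of 0 the start number
    # names the line before the insertion point.
    index = old_start - 1 if expected else old_start
    if lines[index:index + len(expected)] != expected:
        raise ValueError(f"hunk FAILED at line {old_start}")
    return lines[:index] + replacement + lines[index + len(expected):]


config = [
    "version: 1",  # made-up leading lines so that "database:" is line 3
    "",
    "database:",
    "  host: localhost",
    "  port: 5432",
    "  pool_size: 10",
    "  timeout: 30s",
]
body = [
    " database:",
    "   host: localhost",
    "   port: 5432",
    "-  pool_size: 10",
    "+  pool_size: 25",
    "   timeout: 30s",
]
patched = apply_hunk(config, 3, body)
print(patched[5])  # "  pool_size: 25"
```

Because the comparison is exact, any drift in the target file raises the error; that strictness is the gap patch's fuzz matching tries to bridge.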
The Git-native tool is git apply, which understands Git's extensions to the
format (file mode changes, similarity indices, binary patches) but is stricter than
patch: by default it does not fuzz-match, so a hunk whose context has drifted is rejected. Passing
--3way lets it attempt a real three-way merge using the blob hashes
recorded in the patch's index line, provided those blobs are available in your repository. Use git apply --check change.patch
first to verify the patch will apply cleanly; use git apply --reject to mimic
patch's behavior of writing rejects.
For patches produced by git format-patch — the kind that include author,
date, and commit message — the right command is git am. It applies the
patch and creates a new commit in one step, preserving authorship. When the kernel
community accepts your contribution, this is the command the maintainer runs to land
your work.
Unified vs Context vs Normal Diff
The diff utility has emitted three formats over its history. The default
normal format predates both context and unified; it uses commands like
5,7c5,7 followed by lines prefixed with < for original and
> for new. It is compact, but it carries no context lines, so patch cannot
relocate a change once the target file has shifted, and it is rarely used for distributing patches today.
Context diff (diff -c) was the first patch-friendly format,
introduced in 1981. It prints two separate blocks per hunk — an *** block
for the original and a --- block for the new — repeating unchanged context
in each block. It is verbose: a single-line change inside a six-line context window
produces roughly twice as many output lines as the equivalent unified diff.
Unified diff (diff -u) was Wayne Davison's 1990 improvement,
merging the two context blocks into a single interleaved stream. It cut patch sizes
roughly in half and quickly displaced context format. Today every modern tool defaults
to unified output and the format is the de facto standard.
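The size difference is easy to observe with Python's difflib, which can emit both formats. This sketch compares the line counts for a single-line edit in a ten-line sample file; the exact ratio depends on the change and the amount of context.

```python
import difflib

old = [f"line {i}\n" for i in range(1, 11)]
new = list(old)
new[4] = "line 5 (edited)\n"

unified = list(difflib.unified_diff(old, new, "old.txt", "new.txt"))
context = list(difflib.context_diff(old, new, "old.txt", "new.txt"))

# Context format repeats the unchanged lines in both of its blocks,
# so it comes out longer than the interleaved unified form.
print(f"unified: {len(unified)} lines, context: {len(context)} lines")
```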
The GNU diffutils manual documents all three formats authoritatively. For day-to-day work the only one you need to read fluently is unified; the others are useful background when you are reviewing historical patches or studying the format's evolution.
Git Diff Extensions to the Format
Git extends standard unified diff with extra header lines that encode information the original format cannot represent. A typical Git header for a renamed file, up to its first hunk header, looks like this:
diff --git a/src/old-name.js b/src/new-name.js
similarity index 92%
rename from src/old-name.js
rename to src/new-name.js
index 1a2b3c4..5d6e7f8 100644
--- a/src/old-name.js
+++ b/src/new-name.js
@@ -10,7 +10,7 @@ function calculate(items) {
The diff --git opening line declares this is Git's extended format. The
similarity index percentage tells review tools how much of the file is
shared between the two paths; Git reports a rename only when this value meets its rename detection threshold. The
index 1a2b3c4..5d6e7f8 100644 line records the abbreviated blob hashes (and the Unix file
mode) on both sides, which is what makes git apply --3way able to perform a
real merge even when surrounding context has drifted. The
Git diff-format documentation
catalogs every extension including binary patches, file mode changes, and copy detection.
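The extended header lines are regular enough to pick apart with simple string handling. This is a sketch that covers only the lines in the rename example above; the function parse_git_header and its dictionary keys are hypothetical, and other extension lines are simply ignored.

```python
import re


def parse_git_header(lines):
    """Extract rename and index details from Git extended header lines."""
    info = {}
    for line in lines:
        if line.startswith("similarity index "):
            info["similarity"] = int(line.split()[-1].rstrip("%"))
        elif line.startswith("rename from "):
            info["rename_from"] = line[len("rename from "):]
        elif line.startswith("rename to "):
            info["rename_to"] = line[len("rename to "):]
        elif line.startswith("index "):
            m = re.match(r"index ([0-9a-f]+)\.\.([0-9a-f]+)(?: (\d+))?$", line)
            if m:
                info["old_blob"], info["new_blob"], info["mode"] = m.groups()
        elif line.startswith("--- "):
            break  # the standard file header starts here
    return info


header = [
    "diff --git a/src/old-name.js b/src/new-name.js",
    "similarity index 92%",
    "rename from src/old-name.js",
    "rename to src/new-name.js",
    "index 1a2b3c4..5d6e7f8 100644",
    "--- a/src/old-name.js",
    "+++ b/src/new-name.js",
]
print(parse_git_header(header))
```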
For pure renames (similarity index 100%), Git emits a header with no hunks at all
because the file content is identical, a feature that older or non-GNU patch implementations may not
understand. If you need to feed such a patch to a non-Git tool, regenerate it with
git diff --no-renames to force per-file delete-plus-add output.
Common Pitfalls and How to Avoid Them
Five recurring problems eat days of engineer time. The first is the -p0 vs
-p1 confusion already covered: when in doubt, run grep -m1 '^+++ ' change.patch
and check whether the filename starts with b/ (use -p1) or not
(use -p0).
The second is line endings. A patch generated on Linux with LF line endings will not
apply cleanly to a Windows working copy with CRLF. Configure Git's
core.autocrlf consistently across the team, or pass
--ignore-whitespace to git apply for one-off rescues. The
Windows file comparison guide
digs into the CRLF/LF landscape further.
The third is tab-vs-space drift. patch matches context lines byte-for-byte;
if your editor "helpfully" reformatted indentation on save, every hunk in the patch
rejects. Either disable format-on-save for the affected files (see
VS Code format on save for the right
settings) or pass git apply --ignore-whitespace.
The fourth is the trailing-newline trap. The \ No newline at end of file
marker is real, semantically meaningful, and almost invisible in editors. A patch
generated against a file that ends with a newline will not apply to a file missing one,
and vice versa. printf without \n, or shell here-strings
constructed wrong, often cause this. Inspect with xxd file | tail -1 to see
the final byte.
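Both the line-ending and trailing-newline pitfalls come down to bytes that editors hide, so a quick byte-level check can save a debugging session. The helper describe_line_endings below is a hypothetical sketch; it works on raw bytes so that no newline translation gets in the way.

```python
def describe_line_endings(data: bytes) -> dict:
    """Count CRLF and bare LF line endings and report the final newline."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # LF bytes that are not part of a CRLF pair
    return {"crlf": crlf, "lf": lf, "final_newline": data.endswith(b"\n")}


print(describe_line_endings(b"a\r\nb\r\n"))
# {'crlf': 2, 'lf': 0, 'final_newline': True}
print(describe_line_endings(b"a\nb"))
# {'crlf': 0, 'lf': 1, 'final_newline': False}
```

For a file on disk, pass the result of reading it in binary mode, for example open(path, "rb").read().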
The fifth is stale context. When a patch sat in a mailing list or pull-request queue
for weeks, the surrounding code drifted. Hunks fail with FAILED at line X
and reject into .rej files. The robust workflow is to regenerate the patch
against the current HEAD, or to apply with git apply --3way so Git uses the
recorded blob hashes for a true merge instead of textual matching.
Tools for Viewing and Editing Unified Diffs
Reading a small diff in a terminal is fine; reading a 4,000-line refactor patch in a terminal is masochism. The ecosystem of viewers maps to four use cases.
For terminal use, delta and diff-so-fancy are pagers that
re-render the unified diff for readability: delta adds syntax highlighting and an optional side-by-side mode, while
diff-so-fancy cleans up headers and highlights changed words. Both are configured via
~/.gitconfig and require no workflow changes.
For desktop GUI, Beyond Compare, Meld, Kaleidoscope, and Araxis Merge all read and write standard unified diffs while presenting a graphical view. The Beyond Compare alternatives roundup ranks the practical options by platform and price.
For browser-based comparison, Diffchecker.pro renders unified diffs without any installation: paste two versions of a file, get an annotated side-by-side view with the hunk math computed for you. It is also the fastest way to verify a hand-edited patch before applying it. The companion Linux diff tools roundup reviews CLI plus GUI options on a single page.
For IDE integration, VS Code's built-in source-control view, JetBrains IDEs, and Vim's
:diffsplit all consume the same unified diff data Git emits. The benefit
is staying inside the editor; the trade-off is less customization than a dedicated diff
viewer. See the VS Code file comparison
guide for keyboard shortcuts and configuration.
Frequently Asked Questions
What does @@ mean in a unified diff?
The doubled @@ wraps the hunk header that tells patch tools where a change
begins. The full syntax is @@ -oldStart,oldCount +newStart,newCount @@,
where the numbers are 1-based line positions and counts. A doubled at-sign works well
as a delimiter because real source lines almost never start with it.
What is the difference between unified diff and context diff?
Context diff (diff -c) emits two separate blocks per hunk and repeats the
unchanged context lines in each. Unified diff (diff -u) merges the two
blocks into a single interleaved stream marked with +, -, and
space prefixes. Unified is roughly half the size and is the format every modern tool
emits by default.
How do I generate a unified diff between two files?
Run diff -u old.txt new.txt > change.patch on the command line. Inside a
Git repository, plain git diff emits unified format against the index, and
git diff commitA commitB > change.patch produces a portable patch file.
How do I apply a unified diff patch?
Use patch -p1 < change.patch for patches produced by Git, or
patch -p0 < change.patch for plain diff -u output. Within
Git, prefer git apply change.patch or, for mail-formatted patches from
git format-patch, git am 0001-feature.patch.
Why does my patch fail with "hunk FAILED at line X"?
The source file has drifted from the version the patch was generated against. Three
remedies: regenerate the patch against current source, widen the context with
diff -U10, or apply with git apply --3way which uses recorded
blob hashes to perform a real three-way merge instead of textual matching.
Is unified diff a binary format?
No. Unified diff is plain text in whatever encoding the compared files use, line-oriented, and human-readable by design. It is also not suitable for binary files; for byte-level comparison see the binary compare guide. Git does support binary diffs via its extended format, but those embed a base85-encoded delta and are not meaningful to read by eye.