HyperLogLog: Counting Billions of Distinct Items in Kilobytes

HyperLogLog (HLL) is a probabilistic cardinality estimation algorithm: using a tiny, fixed amount of memory (typically ~12 KB), it estimates the number of distinct elements in a huge set, with a standard error of about 0.81%.

What Problem It Solves Link to heading

Exact distinct counting (COUNT DISTINCT) has to remember every element it has seen, so memory grows linearly with cardinality — counting 1 billion unique visitors with a HashSet could take tens of gigabytes. HLL trades a little accuracy for constant space:

Method	Memory	Accuracy
HashSet exact dedup	O(n), up to tens of GB	Exact
Sort then dedup	O(n)	Exact
HyperLogLog	O(1), ~12 KB	≈99.2% (0.81% error)

Typical use cases: unique visitor / DAU counts, distinct IP counts, distinct search-term counts, database APPROX_COUNT_DISTINCT, and Redis PFCOUNT.

Core Intuition Link to heading

The key observation: once you hash elements into uniformly random bit strings, the longest run of leading zeros you have ever seen tells you roughly how many distinct elements you have processed.

A random bit string starts with k zeros with probability 1 / 2^(k+1).
So if you have observed “at most k leading zeros” across your hashes, you have probably seen on the order of 2^k distinct values.
A single observation has enormous variance, so HLL splits the data into many buckets (registers), observes each independently, and combines them with a harmonic mean to cancel out the outliers.

This evolved from Flajolet-Martin → LogLog → HyperLogLog (2007, Flajolet et al.). HLL’s key improvement is the harmonic mean plus bias correction.

The fastest way to feel this is to try it. Type a few keys below: each one is hashed, the first p bits pick a register, and the position of the first 1 in the rest of the hash updates that register with a running maximum. Watch the estimate converge toward the true number of distinct keys you’ve entered.

Interactive HyperLogLog

m = 16 registers · p = 4 index bits

hash— add a key to see its hash bits —

The low p = 4 bits choose the register; the position of the first 1 in the high bits is ρ; the register keeps max(old, ρ).

Distinct keys added (truth)0

HyperLogLog estimate0

Error—

The Algorithm Link to heading

Hash: compute a 64-bit hash h(x) of element x, giving an approximately uniform bit string.
Bucket: take the first p bits of the hash as a register index j (there are m = 2^p registers).
Count leading zeros: in the remaining bits, count the position of the first 1 — call it ρ (= number of leading zeros + 1).
Update register: M[j] = max(M[j], ρ) — each register only keeps the largest value it has seen.
Estimate: take the harmonic mean across registers and multiply by a correction constant:
E = α_m · m² / Σ_j 2^(−M[j])
where α_m is a correction factor (≈ 0.7213 / (1 + 1.079/m) for large m).

Each register needs only 6 bits (enough to record leading-zero counts of a 64-bit hash), so m registers take 6m/8 bytes total.

Accuracy and Parameters Link to heading

Standard error ≈ 1.04 / √m.
A common choice is p = 14 → m = 16384 registers → error 1.04/128 ≈ 0.81%, memory 6 × 16384 / 8 ≈ 12 KB.
Each +1 to p doubles the register count, doubles the memory, and shrinks the error by √2.

Bias Correction (Small / Large Cardinalities) Link to heading

Small cardinalities: the raw estimate is biased high when E ≤ 2.5m and empty registers exist; switch to linear counting: E = m · ln(m / V), where V is the number of empty registers.
Large cardinalities: near the upper limit of a 32-bit hash space, apply an overflow correction (using a 64-bit hash largely avoids this).
HLL++ (Google, 2013) further improves mid/small-cardinality accuracy with an empirical bias table and a sparse representation that also saves memory at low cardinalities.

Key Properties Link to heading

Mergeable: merging two HLLs = taking the per-register max. This makes distributed / sharded counting naturally parallel — each node computes its own sketch, and the final answer takes the per-register max.
Idempotent: re-adding the same element does not change the result (a side effect of taking the max).
No deletion: a standard HLL only grows; it cannot precisely support element removal.
Supports estimating unions (merge, then PFCOUNT). Intersections require inclusion-exclusion, |A∩B| = |A| + |B| − |A∪B|, which amplifies error — use with care.

In Practice (Redis) Link to heading

PFADD key element adds an element, PFCOUNT key reads the estimated cardinality, PFMERGE dest src1 src2 merges sketches.
Redis fixes p = 14, so a single key is at most ~12 KB; at small cardinalities it uses a sparse encoding to save memory.
PFCOUNT over multiple keys merges them temporarily before counting — mind the cost.

When Not to Use It Link to heading

You need exact values (billing, reconciliation) — HLL is an estimate.
You need to delete elements or compute exact intersections.
The cardinality is small to begin with (hundreds or thousands) — a plain Set is more accurate and simpler.