Methodology: how every breach cost is calculated

No black boxes. Every number on this site comes from one of the formulas below, and every coefficient those formulas use is mapped to a primary source in the reference table. Each formula is plain, deterministic arithmetic — verifiable once and stable thereafter — and the same math runs server-side and in your browser. Benchmarks are convenience defaults (verified Jun 25, 2026); the key figures are inputs you can override.

Read this first. Some coefficients are measured benchmarks drawn from published research or statute (e.g. cost per record, statutory penalty thresholds). Others are structural modeling choices — the component shares, the size/data-type factors, the range band. They are documented, reasonable, and overridable, but they are not claimed as measured. The reference table at the end labels every coefficient as one or the other.

1. The SMB breach-cost estimator (central model)

The estimator answers "what would a breach cost this business?" It splits cost into a fixed component that any breach incurs (investigation, baseline legal, crisis management) and a variable component that scales with the number of exposed records. A security-posture factor then scales the whole thing down.

C = ( F_base(size) + R × v_eff ) × f_security
v_eff = base_v(industry) × f_data(data_type) × f_size(size)
  • R — number of exposed records (your input).
  • F_base(size) — fixed cost floor by company-size bracket: micro $90,000, small $180,000, mid-market $320,000.
  • v_eff — effective variable cost per record, built from the industry baseline base_v times a data-type multiplier f_data times a size multiplier f_size.
  • f_security — security-posture factor (see §1.2), which only reduces cost.

Because cost is "fixed + variable", the cost per record falls as record count rises (the fixed floor is spread over more records). That is deliberate and SMB-realistic: a 2,000-record breach does not cost one-thousandth of what a 2,000,000-record breach costs.
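As a rough illustration of the fixed-plus-variable shape, the sketch below implements the formula in Python. The F_base floors are the ones listed above; the function name breach_cost and the $100 per-record v_eff are hypothetical placeholders, not the site's coefficients.

```python
# Fixed floors are the F_base values stated in the text.
F_BASE = {"micro": 90_000, "small": 180_000, "mid-market": 320_000}


def breach_cost(records, size, v_eff, f_security=1.0):
    """C = (F_base(size) + R * v_eff) * f_security."""
    return (F_BASE[size] + records * v_eff) * f_security


# Hypothetical v_eff of $100 per record, chosen only to keep the arithmetic readable.
for records in (2_000, 2_000_000):
    cost = breach_cost(records, "small", v_eff=100.0)
    print(f"{records:>9,} records: C = ${cost:,.0f}, per record = ${cost / records:,.2f}")
```

With these placeholder inputs the small-bracket cost per record falls from $190.00 at 2,000 records to about $100.09 at 2,000,000, because the fixed floor is spread over more records.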

1.1 The five-component split

The estimated operational cost C is broken into five shares that sum to 1.00, so you can see where the money goes:

Component | Share
Detection & response | 30%
Notification | 6%
Lost business | 34%
Fines & legal | 12%
Post-breach response | 18%

These shares are a modeling choice consistent with IBM's published cost-component breakdown (with fines/legal carved out so statutory exposure can be shown separately). They are not a fitted distribution.
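The split is a straight multiplication of C by each share. A minimal Python sketch follows; the function name split_cost and the sample value of C are hypothetical, and the shares are stored as integer percentages only so the arithmetic stays exact.

```python
# Shares from the table above, as whole percentages (they sum to 100).
SHARES_PCT = {
    "Detection & response": 30,
    "Notification": 6,
    "Lost business": 34,
    "Fines & legal": 12,
    "Post-breach response": 18,
}


def split_cost(c):
    """Break an operational cost C into the five components."""
    return {name: c * pct / 100 for name, pct in SHARES_PCT.items()}


# Hypothetical C of $380,000.
for name, amount in split_cost(380_000).items():
    print(f"{name:<22} ${amount:>10,.0f}")
```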

1.2 The security-posture factor

Each selected control multiplies expected cost by (1 − its reduction); selecting several compounds them. A floor stops the model implying that controls eliminate cost.

f_security = Π_i (1 − reduction_i) , bounded below by 45% (BCL_SECURITY_FLOOR = 0.45)

The per-control reduction fractions are modeling choices informed by IBM's reported cost-savings figures for each control; the dollar savings IBM reports are shown alongside them on the cost-mitigation factors table for context.
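The compounding and the floor can be sketched in a few lines of Python. BCL_SECURITY_FLOOR is the constant named above; the function name security_factor and the sample reduction fractions are hypothetical and are not the site's per-control values.

```python
import math

BCL_SECURITY_FLOOR = 0.45  # floor stated in the text


def security_factor(reductions):
    """Multiply (1 - reduction) over the selected controls, then apply the floor."""
    product = math.prod(1.0 - r for r in reductions)
    return max(product, BCL_SECURITY_FLOOR)


# Hypothetical reductions, for illustration only.
print(security_factor([]))               # no controls selected: factor stays at 1
print(security_factor([0.10, 0.15]))     # 0.90 * 0.85, roughly 0.765
print(security_factor([0.3, 0.3, 0.3]))  # 0.7 ** 3 is about 0.343, so the floor applies
```

Taking the maximum against the floor, rather than capping each control separately, keeps the per-control fractions independent of how many controls are selected.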

1.3 The optimistic / expected / pessimistic range

A single point estimate would be falsely precise, so every estimate is shown as a band: the expected value C, scaled down for an optimistic case and up for a pessimistic one.

optimistic = C × 0.6  |  expected = C × 1  |  pessimistic = C × 1.7

The 0.6 / 1.0 / 1.7 multipliers are an illustrative spread and are labeled as one: a stated range, not a measured confidence interval.
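In code, the band is just three multiples of the same expected value. The sketch below assumes a hypothetical function name cost_range and a hypothetical C; only the three multipliers come from the text.

```python
RANGE_MULTIPLIERS = {"optimistic": 0.6, "expected": 1.0, "pessimistic": 1.7}


def cost_range(c):
    """Scale the expected cost C into the three-point planning band."""
    return {label: c * m for label, m in RANGE_MULTIPLIERS.items()}


# Hypothetical expected cost of $380,000.
for label, value in cost_range(380_000).items():
    print(f"{label:<12} ${value:,.0f}")
```

Note that the band is asymmetric: the pessimistic case sits further above C than the optimistic case sits below it.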

2. Cost per record (linear)

The textbook quick estimate: exposed records times an industry average cost per record. Unlike the estimator above, this is strictly linear — useful for a back-of-envelope figure and for comparison.

total = R × cost_per_record(industry)

The per-record benchmarks (healthcare $408, retail $200, and so on; global average $164) are measured benchmarks from IBM/Ponemon analysis. See the cost-per-record by industry table.
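A minimal Python sketch of the linear estimate, using only the three per-record figures quoted above. The function name linear_cost, the dictionary keys, and the record count are hypothetical.

```python
# Per-record benchmarks quoted in the text (USD per record).
COST_PER_RECORD = {"healthcare": 408, "retail": 200, "global_average": 164}


def linear_cost(records, industry):
    """total = R * cost_per_record(industry)."""
    return records * COST_PER_RECORD[industry]


# Hypothetical breach of 5,000 records.
for industry in COST_PER_RECORD:
    print(f"{industry:<15} ${linear_cost(5_000, industry):,}")
```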

3. Annual Loss Expectancy (ALE)

The standard quantitative-risk formula. Single Loss Expectancy is what one incident costs; multiply by how often it is expected per year.

SLE = asset_value × exposure_factor
ALE = ARO × SLE

ARO (annualized rate of occurrence) can be taken from the breach-frequency by sector table, which is derived from Verizon DBIR incidence. ALE itself is a standard industry framework (NIST SP 800-30); the inputs are yours.
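The two formulas chain together as shown in this short Python sketch. All three inputs (asset value, exposure factor, ARO) are hypothetical example values, and the function names are illustrative.

```python
def single_loss_expectancy(asset_value, exposure_factor):
    """SLE = asset_value * exposure_factor (exposure_factor is a fraction, 0 to 1)."""
    return asset_value * exposure_factor


def annual_loss_expectancy(aro, sle):
    """ALE = ARO * SLE."""
    return aro * sle


# Hypothetical inputs: $500,000 asset, 40% exposure, one incident expected every four years.
sle = single_loss_expectancy(500_000, 0.4)
ale = annual_loss_expectancy(0.25, sle)
print(f"SLE = ${sle:,.0f}, ALE = ${ale:,.0f}")
```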

4. GDPR maximum fine (Article 83)

GDPR caps administrative fines at the higher of a fixed cap or a percentage of worldwide annual turnover, by infringement tier.

tier 1: max( 2% × turnover , €10,000,000 )
tier 2: max( 4% × turnover , €20,000,000 )

The caps and percentages are statutory figures straight from GDPR Article 83. This is a maximum-exposure ceiling, not a prediction of an actual fine.
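The ceiling can be computed as below. This is a sketch with a hypothetical function name gdpr_max_fine and hypothetical turnover figures; the percentages and fixed amounts are the ones in the formulas above.

```python
# (percentage of worldwide annual turnover, fixed amount in EUR) per tier.
GDPR_TIERS = {1: (0.02, 10_000_000), 2: (0.04, 20_000_000)}


def gdpr_max_fine(turnover_eur, tier):
    """Return the higher of the turnover percentage and the fixed amount."""
    pct, fixed = GDPR_TIERS[tier]
    return max(pct * turnover_eur, fixed)


# Hypothetical turnovers: EUR 50M and EUR 1B, both tier 2.
print(f"EUR {gdpr_max_fine(50_000_000, 2):,.0f}")
print(f"EUR {gdpr_max_fine(1_000_000_000, 2):,.0f}")
```

For both tiers the percentage only overtakes the fixed amount above EUR 500M in turnover, so for a typical SMB the fixed amount sets the ceiling.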

5. HIPAA civil money penalties

HIPAA penalties are tiered by culpability, with a per-violation minimum and maximum and an annual cap per identical provision.

low = min( violations × tier_min , annual_cap )
high = min( violations × tier_max , annual_cap )

Tier | Culpability | Per-violation min | Per-violation max | Annual cap
1 | No knowledge | $141 | $71,162 | $2,134,831
2 | Reasonable cause | $1,424 | $71,162 | $2,134,831
3 | Willful neglect (cured) | $14,232 | $71,162 | $2,134,831
4 | Willful neglect (uncured) | $71,162 | $2,134,831 | $2,134,831

These are published, inflation-adjusted statutory amounts (HHS OCR / 45 CFR §160.404).
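A small Python sketch of the low/high calculation, using the amounts from the table above. The function name hipaa_range and the violation count are hypothetical.

```python
# (per-violation min, per-violation max) in USD, from the table above.
HIPAA_TIERS = {
    1: (141, 71_162),
    2: (1_424, 71_162),
    3: (14_232, 71_162),
    4: (71_162, 2_134_831),
}
ANNUAL_CAP = 2_134_831


def hipaa_range(violations, tier):
    """Return (low, high), each capped at the annual cap."""
    tier_min, tier_max = HIPAA_TIERS[tier]
    low = min(violations * tier_min, ANNUAL_CAP)
    high = min(violations * tier_max, ANNUAL_CAP)
    return low, high


# Hypothetical case: 50 violations at tier 2.
print(hipaa_range(50, 2))  # (71200, 2134831): the high end hits the annual cap
```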

6. CCPA/CPRA statutory damages

Under the California private right of action, certain breaches expose a business to statutory damages per consumer per incident, between a floor and a ceiling.

low = consumers × $100  |  high = consumers × $750

The $100–$750 band is a statutory figure from Cal. Civ. Code §1798.150 (or actual damages, if greater).
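The band is a direct multiplication, sketched here in Python. The function name ccpa_range and the consumer count are hypothetical; the actual-damages alternative is not modeled.

```python
CCPA_FLOOR = 100    # USD per consumer per incident
CCPA_CEILING = 750  # USD per consumer per incident


def ccpa_range(consumers):
    """Return (low, high) statutory damages for one incident."""
    return consumers * CCPA_FLOOR, consumers * CCPA_CEILING


# Hypothetical incident affecting 2,000 California consumers.
low, high = ccpa_range(2_000)
print(f"${low:,} to ${high:,}")
```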

7. Security control ROI

Whether a control pays for itself: the breach loss it avoids (a share of your ALE) minus its annual cost, expressed as a return.

avoided = ALE_before × reduction
net = avoided − annual_cost
ROI = net ÷ annual_cost

This is a standard cost-benefit framework; the reduction can be drawn from IBM's cost-mitigation factors, and the costs are yours.
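The three steps can be sketched as one Python function. The name control_roi and every input value are hypothetical; the guard against a non-positive annual cost is an added assumption, since the ratio is undefined there.

```python
def control_roi(ale_before, reduction, annual_cost):
    """Return (avoided, net, roi) for a single control."""
    if annual_cost <= 0:
        raise ValueError("annual_cost must be positive to compute a return")
    avoided = ale_before * reduction
    net = avoided - annual_cost
    return avoided, net, net / annual_cost


# Hypothetical: $50,000 ALE, control cuts it by 30%, costs $6,000 per year.
avoided, net, roi = control_roi(50_000, 0.30, 6_000)
print(f"avoided ${avoided:,.0f}, net ${net:,.0f}, ROI {roi:.0%}")
```

A negative ROI means the control costs more per year than the loss it is expected to avoid under these inputs.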

8. Cost of detection delay

IBM consistently finds that breaches taking longer to identify and contain cost more. We anchor this to the reported delta between breaches contained in under vs. over 200 days, ramped linearly around the 200-day line.

extra = Δ × clamp( (days − 200) ÷ 100 , −0.5 , 1.0 ) , where Δ = $1,880,000

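The $1.88M anchor is a measured benchmark from IBM; the linear ramp around it is a modeling choice. The clamp limits the adjustment to between −0.5 × Δ (at 150 days or fewer) and +Δ (at 300 days or more).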
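A Python sketch of the clamped ramp, assuming a hypothetical function name delay_cost; Δ and the clamp bounds are taken from the formula above.

```python
DELTA = 1_880_000  # USD, the anchor from the text


def delay_cost(days):
    """extra = DELTA * clamp((days - 200) / 100, -0.5, 1.0)."""
    ramp = (days - 200) / 100
    clamped = max(-0.5, min(1.0, ramp))
    return DELTA * clamped


for days in (100, 150, 200, 250, 300, 400):
    print(f"{days:>3} days: {delay_cost(days):>+12,.0f}")
```

Negative values are savings relative to the 200-day line, not a negative breach cost.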

Coefficient → source reference

Every coefficient group used anywhere on the site, its primary source, and whether it is a measured benchmark or a documented modeling choice:

Coefficient group | Primary source | Type
Industry cost per record (cpr) | IBM/Ponemon Cost of a Data Breach 2025, industry analysis | Measured benchmark
Industry variable baseline (base_v) | Derived from IBM industry figures, SMB-scaled | Modeling choice (benchmark-derived)
Company-size fixed floor (F_base) & size factor (f_size) | SMB diseconomy-of-scale assumption | Modeling choice
Data-type multipliers (f_data) | Sensitivity weighting (PII → PHI/financial) | Modeling choice
Security control reductions (reduction) | Informed by IBM cost-mitigation savings figures | Modeling choice (benchmark-informed)
Five-component shares | Consistent with IBM cost-component split | Modeling choice
Range band (0.6 / 1.0 / 1.7) & security floor (0.45) | Illustrative dispersion / bound | Modeling choice
Breach frequency by sector (ARO) | Verizon DBIR incidence | Measured benchmark (derived)
Detection-delay anchor ($1.88M) | IBM <200 vs >200-day delta | Measured benchmark
GDPR thresholds (2%/€10M, 4%/€20M) | GDPR Article 83 | Statute
HIPAA penalty tiers | HHS OCR / 45 CFR §160.404 | Statute
CCPA statutory damages ($100–$750) | Cal. Civ. Code §1798.150 | Statute
PCI non-compliance fines / per-card | PCI SSC & acquirer agreements | Contractual (widely reported)

Full bibliography with links and verification dates: see Sources.

Assumptions & limits

  • Benchmarks are representative, SMB-appropriate defaults, not enterprise headline averages, and are convenience values — every key figure is an editable input, so the calculators stay correct even when a default goes stale.
  • The estimator models operational cost. Statutory exposure (GDPR/HIPAA/CCPA/PCI) is computed by separate tools and shown separately, not folded into C.
  • Structural coefficients (component shares, size/data/range factors, the security floor) are modeling choices, not measured quantities. Treat the output as a planning range, not a prediction.
  • Maximum-exposure figures are legal ceilings; actual fines and damages are usually far lower and depend on facts and regulators.
  • Calculator outputs are in USD; statutory caps defined in another currency (e.g. EUR for GDPR) are shown as published.

Found an error in a formula or a stale coefficient? Tell me — corrections are welcome and credited.