Methodology

Every statistic on this site is derived from public data published by national statistical agencies. This page lists every source and explains every estimation step.

Primary sources

Office for National Statistics - Baby Names
License: Open Government Licence v3.0 - Coverage: 1996 to present - Region: England and Wales
U.S. Social Security Administration - National Names Data
License: U.S. Public Domain - Coverage: 1880 to present - Region: United States
U.S. Social Security Administration - State Names Data
License: U.S. Public Domain - Coverage: 1910 to present - Region: United States (50 states + DC)
U.S. Census Bureau - 2010 Frequently Occurring Surnames
License: U.S. Public Domain - Coverage: 2010 - Region: United States
Wikidata - Famous Bearers
License: CC0 1.0 - Coverage: ongoing - Region: Worldwide

Living-bearer estimate

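SSA data gives the number of births per year for each name, not the number of people alive today. To estimate the living population, we multiply each birth-year count by the probability, taken from the SSA Period Life Table, that someone born in that year is still alive at their current age, then sum the results across all birth years. The implementation is in lib/utils/survival.ts.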
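The calculation can be sketched as follows. This is a minimal illustration, not the contents of lib/utils/survival.ts: the names BirthCount and estimateLiving are hypothetical, the survival values are made-up placeholders rather than SSA figures, and a single survival table is assumed for simplicity.

```typescript
// Hypothetical sketch; the real code in lib/utils/survival.ts may differ.

/** Births recorded for one name in one calendar year. */
interface BirthCount {
  year: number;
  births: number;
}

/**
 * Sum births weighted by the probability of surviving to the current age.
 * survivalToAge[a] is the probability that a newborn survives to age a.
 */
function estimateLiving(
  counts: readonly BirthCount[],
  survivalToAge: readonly number[],
  currentYear: number,
): number {
  let living = 0;
  for (const { year, births } of counts) {
    const age = currentYear - year;
    if (age < 0) continue; // skip birth years in the future
    // Assumption: ages past the end of the table have no survivors.
    const p = survivalToAge[age] ?? 0;
    living += births * p;
  }
  return Math.round(living);
}

// Toy survival table (placeholder values, NOT from the SSA life table).
const toySurvival = [1.0, 0.99, 0.98];
const living = estimateLiving(
  [
    { year: 2022, births: 1000 },
    { year: 2023, births: 2000 },
    { year: 2024, births: 3000 },
  ],
  toySurvival,
  2024,
);
console.log(living); // 1000*0.98 + 2000*0.99 + 3000*1.0 = 5960
```

Rounding once at the end, rather than per year, avoids accumulating rounding error across many birth years.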

Worldwide estimate

Where we have a primary source for a country, we sum its counts directly. Where we do not, we take the name's prevalence (bearers as a share of population) in a reference country and multiply it by the destination country's population. Estimated values are clearly flagged with "est." on each name page.
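A minimal sketch of that fallback is shown below. The function name estimateForCountry, the CountryEstimate shape, and all figures are hypothetical, and how a reference country is chosen is not covered here.

```typescript
// Hypothetical sketch of the per-country fallback; not the site's actual code.

interface CountryEstimate {
  count: number;
  estimated: boolean; // true when the value would be shown with "est."
}

function estimateForCountry(
  knownCount: number | undefined, // count from a primary source, if any
  referenceCount: number, // bearers in the reference country
  referencePopulation: number,
  destinationPopulation: number,
): CountryEstimate {
  if (knownCount !== undefined) {
    // A primary source exists: use it as-is.
    return { count: knownCount, estimated: false };
  }
  // No primary source: scale the reference prevalence to the destination.
  const prevalence = referenceCount / referencePopulation;
  return {
    count: Math.round(prevalence * destinationPopulation),
    estimated: true,
  };
}

// Made-up figures: 50,000 bearers among 10,000,000 people is a prevalence
// of 0.005, which gives 10,000 bearers in a population of 2,000,000.
const fallback = estimateForCountry(undefined, 50_000, 10_000_000, 2_000_000);
console.log(fallback); // { count: 10000, estimated: true }
```

Carrying the estimated flag alongside the count keeps the "est." label tied to the number it describes.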

Combined-name estimate

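For full-name combinations (first + last) we assume the first name and the surname are statistically independent: prevalence(first) x prevalence(last) x population. Real names are not independent (first-name choices correlate with surname origin, region and generation), so this produces an order-of-magnitude estimate only.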
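As a worked illustration of the formula, with a hypothetical function name and made-up prevalence figures:

```typescript
// Hypothetical sketch of the independence-based estimate.
// Prevalences are fractions of the population (0 to 1).
function estimateFullName(
  firstPrevalence: number,
  lastPrevalence: number,
  population: number,
): number {
  // Independence assumption: P(first and last) = P(first) * P(last).
  return Math.round(firstPrevalence * lastPrevalence * population);
}

// Made-up inputs: a first name held by 1 in 100 people and a surname held
// by 1 in 1,000, in a population of 300,000,000.
const combined = estimateFullName(0.01, 0.001, 300_000_000);
console.log(combined); // 3000
```

A result like this is best read as "a few thousand" rather than as an exact count.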

Refresh cadence

The SSA publishes annually around May, and we refresh within 30 days of release. The ONS publishes annually in autumn. Other agencies vary - see the dates next to each dataset above.