
Methodology.

A citable, table-and-column-level description of how every page on Akashic is built — the sources we use, the choices we make, the algorithms we run, and the things we deliberately leave out.

1. Scope & limits

Akashic is a static reference site for US presidential elections at every level of geography the federal government tracks. Every page profiles a single place. Every page carries the full presidential election history of that place from 1876 through 2024, the demographics of the place from the most recent American Community Survey, the religious-adherence profile from the 2020 US Religion Census, an archetype classification, and the set of places with the most similar voting trajectory.

What is not here: ballot measures, downballot races, primary results, turnout by demographic group, polling, and prediction-market prices. Several of these are on the public roadmap, but they live in the broader Akashic Intelligence platform, not on the public place-page surface.

What is coming: state-legislative and US House election history (not just presidential roll-ups), historical congressional district boundaries per redistricting era, comparison pages, and embeddable widgets. See the roadmap for the full list.

2. Election results

We compose the 1876–2024 county-level presidential series from three primary sources, each authoritative for a different era.

1876–1915. ICPSR historical archive (Inter-university Consortium for Political and Social Research). County-level totals are sparse in this window: about 60% of county-year combinations before 1928 have no recorded result. We render those as explicit gaps in the elections table rather than interpolating.
1916–2020. MIT Election Data and Science Lab county-level presidential series. This is the canonical modern dataset for academic election analysis; we use it verbatim and never adjust the vote totals.
2024. State-certified official returns, ingested directly from each state’s election authority. We do not use newswire totals.
Precinct level, 2024. Voting and Election Science Team (VEST) precinct shapefiles and totals, aggregated to the modern county boundaries where they differ.
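
A minimal sketch of how the era-to-source mapping and the explicit gaps described above could be represented. The type names, the sourceForYear and buildSeries helpers, and the null-for-gap convention are illustrative assumptions, not the site's actual schema.

```typescript
// Hypothetical labels for the three county-level sources named above.
type CountySource = "ICPSR" | "MEDSL" | "STATE_CERTIFIED";

interface Votes {
  dem: number;
  rep: number;
  total: number;
}

interface SeriesRow {
  year: number;
  source: CountySource;
  votes: Votes | null; // null marks an explicit gap; nothing is interpolated
}

// Map a presidential election year to the source used for its era.
function sourceForYear(year: number): CountySource {
  if (year < 1876 || year > 2024 || year % 4 !== 0) {
    throw new RangeError(`not a presidential election year in 1876-2024: ${year}`);
  }
  if (year <= 1915) return "ICPSR";
  if (year <= 2020) return "MEDSL";
  return "STATE_CERTIFIED";
}

// Build one row per election year; years without a recorded result stay null.
function buildSeries(recorded: { [year: number]: Votes | undefined }): SeriesRow[] {
  const rows: SeriesRow[] = [];
  for (let year = 1876; year <= 2024; year += 4) {
    rows.push({ year, source: sourceForYear(year), votes: recorded[year] ?? null });
  }
  return rows;
}
```

Keeping the gap as null rather than zero lets the elections table distinguish "no recorded result" from "zero votes recorded".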

Boundary changes. A small number of counties have changed name or boundary over the 148-year window. We carry forward the modern five-digit FIPS code as the stable URL key, and we crosswalk historical vote totals onto the modern geometry. The two consequential cases:

3. Demographics

Every place page reports demographic data from the most recent US Census Bureau American Community Survey 5-year file. As of the current build that is ACS 2024 5-year (reference period 2020–2024). The 5-year file is the only ACS product available for every county regardless of population.

Suppression handling. The ACS suppresses estimates for very small populations to protect respondent confidentiality. We display suppressed values as “—” rather than zero. Where a derived figure (such as median household income) is suppressed for an entire geography, the figure is omitted from the page and from the JSON record.
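
The two suppression rules above can be sketched as follows. This assumes a suppressed estimate arrives as null after ingest; the function names and the field names in the usage example are hypothetical.

```typescript
// Assumption: a suppressed ACS estimate is represented as null after ingest.
type Estimate = number | null;

// Render a suppressed value as the dash placeholder, never as zero.
function displayEstimate(value: Estimate): string {
  return value === null ? "\u2014" : String(value);
}

// Build the JSON record, omitting every field whose estimate is suppressed.
function toJsonRecord(fields: { [name: string]: Estimate }): { [name: string]: number } {
  const out: { [name: string]: number } = {};
  for (const name of Object.keys(fields)) {
    const value = fields[name];
    if (value !== null && value !== undefined) out[name] = value;
  }
  return out;
}

// Usage with made-up values: the suppressed income field is dropped from the record.
const exampleRecord = toJsonRecord({ population: 812, medianHouseholdIncome: null });
```

A genuine zero still renders as "0", so readers can tell a suppressed cell from an empty one.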

Non-Hispanic White share. Per Census convention, race and Hispanic origin are separate dimensions. The figure we label “Non-Hispanic White” is the share of population that self-identifies as White alone (single race) and not of Hispanic or Latino origin.

Connecticut. Demographic data is delivered at the planning-region level (the post-2022 successor to Connecticut’s county system); historical comparability with pre-2022 county-level ACS files is approximate.

4. Religious adherence

Religious-adherence figures come from the 2020 US Religion Census, published by the Association of Statisticians of American Religious Bodies (ASARB). The Religion Census reports the number of adherents per religious body per US county on a decennial cadence.

Bucketing. ASARB reports ~250 distinct religious bodies. For display, we aggregate them into seven traditions:

The bucketing decisions are editorial. They are intended to produce groups roughly comparable in voting alignment, not to adjudicate theological taxonomies.

5. Geography

All boundary geometry is sourced from US Census Bureau TIGER/Line 2024 shapefiles. County polygons are simplified for web delivery using topojson-simplify with a tolerance tuned to keep visible coastline detail while reducing payload size by an order of magnitude.

Precinct geometries are 2024 boundaries, from VEST where available and the state election authority otherwise. Counties for which we do not have precinct geometry on file fall back to a hex-grid layout that preserves the aggregate county margin while visually distinguishing the precinct-level view.

6. The archetype classifier

Every county is assigned to exactly one of twelve archetypes by a deterministic decision tree. Same input, same output. No randomness, no model training, no LLM in the loop.

The classifier operates on the post-1932 presidential vote vector for each county and the most recent ACS demographic snapshot. It evaluates a sequence of disqualifying gates and qualifying patterns in fixed order; the first match wins. The implementation lives in lib/classify-county.ts.

In prose, the decision sequence is:

  1. Sparse. If total population is below the threshold for stable inference (currently 1,000), tag Sparse and stop.
  2. Frontier. If the county is in a Western state and lacks a stable pre-1928 election record, tag Frontier and stop.
  3. Realigner. If the county shows a deep multi-decade lean to one party followed by a multi-cycle swing in the opposite direction (the McDowell, WV pattern), tag Realigner and stop.
  4. Old Confederacy. If the county is in a former CSA state and voted Democratic from Reconstruction through 1960 before swinging Republican, tag Old Confederacy and stop.
  5. Loyalist (D / R). If the county voted for the same party in nearly every post-1932 election, tag Democratic loyalist or Republican loyalist and stop.
  6. Recent convert. If the partisan identity has clearly shifted in the last sixteen years without being deep enough to qualify as a realignment, tag Recent convert and stop.
  7. Urban anchor. If the county has a large urban core (population threshold + density threshold) and a stable dominant-party identity in recent cycles, tag Urban anchor and stop.
  8. Western maverick. If the county is in a Western state with high inter-cycle volatility and frequent party-line crossings, tag Western maverick and stop.
  9. Populist. If the county has a large recent margin paired with a lower-income, lower-degree demographic profile, tag Populist and stop.
  10. Bellwether. If the county has tracked the national winner closely with narrow margins across recent cycles, tag Bellwether and stop.
  11. Tossup. If the most recent presidential margin is within two percentage points, tag Tossup and stop.
  12. Default. If no earlier rule matches, fall back to the simpler partisan-lean classification.
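
The first-match-wins structure can be sketched as an ordered rule list. This is an illustration only, not the contents of lib/classify-county.ts: the input shape is hypothetical, only three of the gates are modelled, and the 90% cut-off standing in for "nearly every" election is an assumption. The 1,000-person and two-point thresholds come from the list above.

```typescript
// Hypothetical, simplified input for illustration.
interface CountyInput {
  population: number;
  // (D - R) / total for post-1932 elections, oldest first.
  margins: number[];
}

interface Rule {
  tag: string;
  matches: (c: CountyInput) => boolean;
}

// Fraction of values that satisfy a predicate (0 for an empty list).
function shareWhere(values: number[], pred: (v: number) => boolean): number {
  if (values.length === 0) return 0;
  return values.filter(pred).length / values.length;
}

// Rules in fixed order; earlier rules shadow later ones.
const RULES: Rule[] = [
  { tag: "Sparse", matches: (c) => c.population < 1000 },
  // Assumed cut-off: "nearly every" election means at least 90% of them.
  { tag: "Democratic loyalist", matches: (c) => shareWhere(c.margins, (m) => m > 0) >= 0.9 },
  { tag: "Republican loyalist", matches: (c) => shareWhere(c.margins, (m) => m < 0) >= 0.9 },
  {
    tag: "Tossup",
    matches: (c) => {
      const latest = c.margins[c.margins.length - 1];
      return latest !== undefined && Math.abs(latest) <= 0.02;
    },
  },
];

// Deterministic: the same input always yields the same tag.
function classify(c: CountyInput): string {
  for (const rule of RULES) {
    if (rule.matches(c)) return rule.tag;
  }
  return "Default";
}
```

Because evaluation stops at the first match, a county with 500 residents is tagged Sparse even if its margins would otherwise qualify it as a loyalist.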

Worked examples to make the rules concrete:

7. The similar-counties model

For every county we compute the ten counties with the most similar recent voting trajectory. The model is intentionally simple: cosine similarity over the last-ten-election two-party margin vector.

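Let $m_i = (D_i - R_i) / \mathrm{total}_i$ be the margin for election $i$, where $D_i$ and $R_i$ are the Democratic and Republican vote counts. For two counties A and B with margin vectors $a$ and $b$ over the same ten elections, the similarity is $(a \cdot b) / (\lVert a \rVert \times \lVert b \rVert)$.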
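
A minimal sketch of this computation, assuming margin vectors are already aligned on the same elections. The function names, the keyed-by-FIPS input shape, and the choice to return 0 for an all-zero vector are assumptions made for illustration.

```typescript
// Margin for one election, as defined above: (D - R) / total.
function margin(dem: number, rep: number, total: number): number {
  return (dem - rep) / total;
}

// Cosine similarity between two margin vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("margin vectors must cover the same elections");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Assumption: an all-zero vector has no direction, so treat it as similarity 0.
  return denom === 0 ? 0 : dot / denom;
}

// Return the k most similar counties to the target, best match first.
function mostSimilar(
  targetFips: string,
  vectors: { [fips: string]: number[] },
  k: number = 10
): string[] {
  const target = vectors[targetFips];
  if (target === undefined) throw new Error(`unknown county: ${targetFips}`);
  return Object.keys(vectors)
    .filter((fips) => fips !== targetFips)
    .map((fips) => ({ fips, score: cosineSimilarity(target, vectors[fips] ?? []) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((r) => r.fips);
}
```

Cosine similarity compares the direction of two vectors and ignores their length, so a county whose margins are uniformly twice another's still scores 1. That property is worth keeping in mind when reading the similar-counties list.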

The model uses no demographic features. The result reflects political similarity over recent decades, not demographic or geographic similarity. Two counties on opposite coasts can score very high if their margin trajectories rhyme; two neighboring counties can score low if one realigned while the other didn’t.

We chose this over a feature-rich model deliberately. A small, transparent, fully reproducible similarity metric is more useful to a journalist or researcher than a black-box embedding, and the margin vector turns out to capture the variation that matters for the editorial question (“where else does this pattern show up?”) well enough.

8. The headline + narrative generation

Every place page carries a generated headline and a multi-paragraph narrative summary. The implementation lives in lib/headline.ts and lib/narrative.ts.

Both modules are deterministic templates: same place data in, same text out. No LLM is in the runtime path; nothing is generated at request time. The templates are conditioned on the archetype, the most recent presidential margin, the demographic snapshot, and the similar-counties result.

The 40-character floor. Where an editor-curated subhead exists in the editorial layer and is at least 40 characters long, it overrides the templated subhead. Below the floor, we fall back to the template. This lets editorial copy ship one place at a time without blocking the bulk render.
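
The override rule reduces to a single comparison. The sketch below is not the code in lib/headline.ts; the function name and the use of null for a missing curated subhead are assumptions, and length is measured in JavaScript string units without trimming.

```typescript
// The floor stated above: curated subheads shorter than this are ignored.
const SUBHEAD_FLOOR = 40;

// Prefer the editor-curated subhead when it exists and meets the floor;
// otherwise fall back to the templated subhead.
function chooseSubhead(curated: string | null, templated: string): string {
  if (curated !== null && curated.length >= SUBHEAD_FLOOR) {
    return curated;
  }
  return templated;
}
```

A place with no curated subhead, or with a placeholder shorter than 40 characters, renders exactly as it would without the editorial layer.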

9. Editorial copy

Editorial copy falls into three tiers of provenance, distinguished by a source field on every editorial record.

curated. Written or hand-reviewed by an editor. The lead paragraph of every county page falls in this tier where coverage exists; subheads on the marquee counties (state capitals, major CBSAs, swing counties) are curated.
generated_reviewed. Produced from a template or as an LLM-assisted draft, then reviewed by an editor before publication. Used for the non-county tier subheads where we are working through the backlog (state, CBSA, DMA, CD, SLD).
generated. Generated deterministically from the underlying data, with no review. Used for the templated paragraphs after the lead, and for the long-tail places where editorial coverage is not yet possible.
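
One way to model the source field is a string-literal union, so that an unknown tier is rejected by the compiler. Only the three source values come from the text above; the record fields and the helper are hypothetical.

```typescript
// The three provenance tiers named above.
type EditorialSource = "curated" | "generated_reviewed" | "generated";

// Hypothetical record shape; only the source field is described in the text.
interface EditorialRecord {
  placeId: string;
  text: string;
  source: EditorialSource;
}

// True when an editor has written or reviewed the copy before publication.
function isHumanReviewed(record: EditorialRecord): boolean {
  return record.source === "curated" || record.source === "generated_reviewed";
}
```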

No editorial copy is generated at request time. Every string on every page is either committed to the repo (templated) or stored in Neon (curated / reviewed) and read at build time.

10. Updates & versioning

Cadence. The election layer is updated after every federal election cycle (next: November 2028). The demographic layer is updated annually as the Census Bureau releases each new ACS 5-year file (typically December). The religion layer is updated decennially with each new ASARB Religion Census release.

Data freshness contract. Every build emits a machine-readable data_freshness.json with the as-of date for each source layer. The sitemap’s lastmod field on each place page derives from the most recent source-layer update touching that place.
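
The lastmod derivation can be sketched as a maximum over as-of dates. The shape shown for data_freshness.json (layer name mapped to an ISO date), the layer names, and the dates are all assumptions for illustration; the real file may differ.

```typescript
// Assumed shape of data_freshness.json: layer name -> ISO as-of date (YYYY-MM-DD).
type Freshness = { [layer: string]: string };

// Return the most recent as-of date among the layers that touch a place,
// or null if none of them has a recorded date.
function lastmodFor(layersTouchingPlace: string[], freshness: Freshness): string | null {
  let latest: string | null = null;
  for (const layer of layersTouchingPlace) {
    const asOf = freshness[layer];
    // ISO dates in YYYY-MM-DD form sort correctly as plain strings.
    if (asOf !== undefined && (latest === null || asOf > latest)) {
      latest = asOf;
    }
  }
  return latest;
}

// Made-up example values, not real release dates.
const exampleFreshness: Freshness = {
  elections: "2025-01-15",
  demographics: "2025-12-11",
  religion: "2022-11-01",
};
```

With these example values, a place touched only by the elections and religion layers would get a lastmod of 2025-01-15.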

11. Citation

Cite Akashic by the canonical URL of the page, not the backing JSON. Recommended citation forms:

Plain text.

Akashic Intelligence. (2026). Akashic: {Place Name}, {State}.
  Retrieved {YYYY-MM-DD} from {canonical URL}.

BibTeX.

@misc{akashic-place,
  author       = {Akashic Intelligence},
  title        = {Akashic: {Place Name}, {State}},
  year         = {2026},
  url          = {https://akashic.app/county/{FIPS}/},
  note         = {Accessed {YYYY-MM-DD}}
}

For the underlying source data, cite the original source (MIT Election Lab, ICPSR, US Census Bureau, ASARB) directly; Akashic is the compilation, not the primary source. See the about page and /ATTRIBUTION.txt for the per-source breakdown.

License

Original editorial copy, the 12-archetype taxonomy, computed derived data, and the bulk dataset releases are published under CC BY 4.0. Underlying sources keep their own licenses — see /ATTRIBUTION.txt for the per-source breakdown and /LICENSE.txt for the original-content terms. AI training and indexing are explicitly welcomed (/robots.txt, /llms.txt).

See also

For a one-page project summary, see the about page. For term definitions, see the glossary. For what we’re building next, see the roadmap. To explore a place, start with the search box.