This report documents a four-phase, 20-task independent audit of the Political Topology thesis — conducted by Cambridge Governance Labs on its own work. Every headline claim was stress-tested against the project's dataset. Where claims broke, we say so. Where they held, we quantify how strongly. Where they require revision, we provide the recalibrated values.
The purpose is not self-flagellation. It is calibration. The difference between a thesis that is directionally correct and one that is precisely right matters enormously when the claims involve democratic survival, sovereign credit risk, and policy recommendations that affect real institutions.
Academic work operates on peer review. Market research operates on client challenge. The Political Topology thesis exists in an uncomfortable space between the two: it makes quantitative claims with policy implications, but had not yet been subjected to systematic external scrutiny. So we decided to be the first to break it.
We wanted to know what would break before someone else broke it for us.
The audit was designed around three principles:
Every thesis develops a narrative momentum. Claims begin to reinforce each other, and the temptation is to present the strongest version of each finding as though it were the only version. The audit's job was to ask: does the data actually say what the thesis claims it says? Not whether the claim is plausible, or directionally interesting, but whether it is quantitatively defensible at the precision level asserted.
A thesis that collapses under its own audit has no business being published. A thesis that survives its own audit — with appropriate revisions — gains credibility precisely because it was tested. We would rather publish a more modest thesis with verified claims than a bold one with phantom precision.
Where the data contradicted the thesis, the thesis was revised. Not the data. This sounds obvious. In practice, it is the hardest discipline in quantitative research, because the researcher has spent months building the narrative, and has a psychological investment in its conclusions.
The audit tested 12 specific quantitative claims from the thesis. Of these, four were confirmed, five were refuted, and three were partially valid.
This is not a passing grade. But neither is it a repudiation. The core architecture of the thesis — the dataset, the liberty-yield relationship, the Great Decoupling, the velocity cataloging — survived intact. What did not survive were the precise numerical calibrations: the shock volatilities, the tyranny probabilities, the specific Treasury mispricing estimate, and the original bistable dynamics model.
The thesis needed to be recalibrated, not discarded. What follows is the claim-by-claim assessment.
The core econometric finding of the thesis reproduces exactly. A one-point decline in Liberty is associated with a 35-basis-point increase in sovereign yield spreads. The R² of 0.37 is strong for a single-variable cross-country regression and is economically meaningful: it implies that a 10-point liberty decline would add approximately 350bp to a country's borrowing costs. The coefficient is stable across time periods, robust to regional fixed effects, and survives instrumental variable checks.
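The extrapolation in the paragraph above is simple linear arithmetic, sketched here for readers who want to vary the input. The helper name `implied_spread_increase_bp` is hypothetical; only the 35bp-per-point slope comes from the audit.

```python
# Illustrative arithmetic only: the slope is the coefficient quoted in the text.
BP_PER_LIBERTY_POINT = 35  # extra yield spread (bp) per one-point Liberty decline


def implied_spread_increase_bp(liberty_decline_points):
    """Linear extrapolation of the reported coefficient (hypothetical helper)."""
    return BP_PER_LIBERTY_POINT * liberty_decline_points


print(implied_spread_increase_bp(10))  # 350
```

A linear extrapolation of this kind is only as good as the regression behind it; with R² = 0.37, most of the cross-country variation in spreads is explained by other factors.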
The thesis claims that the historical correlation between national capability (measured by the Human Capital Index) and political liberty has broken down. The audit confirms this: the pre-2000 correlation of r = 0.79 has fallen to r = 0.57 in the post-2000 period. More importantly, 39 countries now qualify as "capable autocracies" — nations with high HCI scores and low liberty scores. This is the Great Decoupling: the rise of technically competent authoritarianism. China, Saudi Arabia, UAE, Russia, and Singapore anchor this quadrant. The finding is robust and original.
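The pre/post-2000 comparison can be sketched with a plain Pearson correlation computed separately on each period. This is a minimal illustration using only built-in modules; the `(year, hci, liberty)` row layout, the function names, and the choice to put the year 2000 itself in the later period are assumptions, and the rows below are toy values, not project data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_by_period(rows, split_year=2000):
    """rows: (year, hci, liberty) tuples. Returns (r_before, r_from_split_on)."""
    pre = [(h, l) for y, h, l in rows if y < split_year]
    post = [(h, l) for y, h, l in rows if y >= split_year]
    return pearson_r(*zip(*pre)), pearson_r(*zip(*post))


# Toy rows for illustration only (not from the Political Topology dataset).
toy_rows = [
    (1980, 1, 10), (1985, 2, 20), (1990, 3, 30),
    (2005, 1, 10), (2010, 2, 30), (2015, 3, 20),
]
r_pre, r_post = correlation_by_period(toy_rows)
```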
The thesis's predictive model achieves 78% accuracy on held-out data for classifying countries into Free/Partly Free/Not Free categories. The audit confirms this is a real result, not an artifact of overfitting. However, context matters: a naive persistence baseline (predicting that each country stays in its current category) achieves 73%. The thesis model adds +5 percentage points over persistence. This is real but marginal. The model's value lies not in its superiority to persistence but in its ability to identify which countries are most likely to transition.
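To make the baseline comparison concrete, the sketch below shows what a persistence forecast is and how the lift is computed. The function names and the four-country toy example are hypothetical; only the 78% and 73% figures come from the audit.

```python
def accuracy(predicted, actual):
    """Share of cases where the predicted category equals the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def persistence_forecast(previous_categories):
    """Naive baseline: every country stays in its current category."""
    return list(previous_categories)


# Toy example (not project data): one of four countries changes category.
previous = ["Free", "Free", "Partly Free", "Not Free"]
actual = ["Free", "Partly Free", "Partly Free", "Not Free"]
toy_baseline_accuracy = accuracy(persistence_forecast(previous), actual)

# Figures reported in the audit, in percent.
MODEL_ACCURACY_PCT = 78
PERSISTENCE_ACCURACY_PCT = 73
lift_pp = MODEL_ACCURACY_PCT - PERSISTENCE_ACCURACY_PCT  # 5 percentage points
```

Note that persistence is wrong by construction on every country that changes category, which is why overall accuracy understates the value of a model that correctly flags transitions.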
The thesis's cataloging of democratic decline velocities is confirmed. Even using the more conservative 10-year standardised measure (−4.2 points/year rather than the headline −18/year over a 2-year window), the United States remains the fastest-declining consolidated democracy in the dataset. The velocity cataloging methodology is sound, the cross-country comparisons are valid, and the finding that the US decline is historically unusual is robust. The dispute is over magnitude (see Claim #11), not direction.
The original thesis proposed a bistable potential-well model: democracies and autocracies as two deep attractors with a sharp barrier between them. The audit finds no evidence for this. The estimated mean-reversion parameter k is approximately zero, meaning the data shows no strong pull towards either attractor. The thesis has since been upgraded to a tristable three-basin model (democracy, hybrid, autocracy), which better fits the observed data. The original bistable framing is superseded.
The thesis assumes that transition probabilities depend only on the current stage, not the direction of travel (the Markov property). The audit rejects this assumption at three critical stages. At Stage 6, the asymmetry is dramatic: countries arriving from decline show a −77.8% transition rate (continuing to worsen), while countries arriving from improvement show a +25.5% transition rate (continuing to recover). Direction of travel matters enormously. The thesis must incorporate path dependence explicitly, not as a footnote.
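One simple way to probe the Markov assumption is to tally what happens after each visit to a stage, split by the direction from which the country arrived. The sketch below is an illustration of that idea, not the audit's actual test. It assumes stages are integers where a higher number means deeper erosion, and the trajectories are toy data.

```python
from collections import Counter, defaultdict


def next_move_by_arrival(trajectories, stage):
    """Tally the move following each visit to `stage`, split by arrival direction.

    trajectories: list of stage sequences, one per country (higher = more eroded).
    """
    tallies = defaultdict(Counter)
    for path in trajectories:
        for prev, cur, nxt in zip(path, path[1:], path[2:]):
            if cur != stage or prev == cur:
                continue  # only visits that were reached by an actual move
            arrival = "from_decline" if prev < cur else "from_improvement"
            if nxt > cur:
                move = "worsens"
            elif nxt < cur:
                move = "improves"
            else:
                move = "stays"
            tallies[arrival][move] += 1
    return tallies


# Toy trajectories for illustration only (not project data).
toy_paths = [[5, 6, 7], [5, 6, 7], [5, 6, 6], [7, 6, 5], [7, 6, 5], [7, 6, 7]]
toy_tallies = next_move_by_arrival(toy_paths, stage=6)
```

If the Markov property held, the two tallies would have similar proportions. A large gap between them, as in the toy data, is the signature of path dependence.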
The thesis stipulates shock volatilities (σ) of 3–8 liberty points per decade depending on stage. The audit finds the actual data-driven values are σ = 0.45–4.45. The thesis overstates volatility by 2–7x at every single stage. This is not a minor calibration error — it is the single most consequential mistake in the thesis, because inflated volatilities feed directly into the Monte Carlo simulations that produce the headline tyranny probabilities.
The thesis's headline claim — a 62% probability of the US reaching tyranny within 15 years — was generated by Monte Carlo simulations using the inflated σ values from Claim #7. With data-driven parameters, the probability drops to approximately 0%. The 62% figure is a phantom produced by wrong inputs. However, the audit also identifies a post-2006 structural break: using only post-2006 data, P(L < 50 | 15 years) = 69%. The risk is real, but it manifests as "crossing below the hybrid-regime threshold," not "reaching tyranny."
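The sensitivity of a threshold-crossing probability to σ can be illustrated with a toy Monte Carlo. This is a sketch under loud assumptions, not the thesis's simulation: it uses a driftless random walk with independent normal annual shocks, converts a per-decade σ to an annual one by dividing by √10, and starts from a hypothetical level of 60 with a threshold of 50. The two σ values are the upper ends of the stipulated and data-driven ranges quoted above.

```python
import math
import random


def crossing_probability(start, threshold, sigma_per_decade,
                         years=15, n_paths=5000, seed=0):
    """Share of simulated paths that fall below `threshold` within `years`."""
    rng = random.Random(seed)  # seeded for reproducibility
    # Assumption: independent annual shocks, so variance scales with time.
    sigma_annual = sigma_per_decade / math.sqrt(10)
    crossings = 0
    for _ in range(n_paths):
        level = start
        for _ in range(years):
            level += rng.gauss(0.0, sigma_annual)
            if level < threshold:
                crossings += 1
                break
    return crossings / n_paths


# Hypothetical start level and threshold; sigma values are the range tops quoted above.
p_stipulated = crossing_probability(60, 50, sigma_per_decade=8.0)
p_data_driven = crossing_probability(60, 50, sigma_per_decade=4.45)
```

Even in this stripped-down setting, the larger σ produces a markedly higher crossing probability, which is the mechanism by which inflated volatilities inflate headline risk figures.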
The original thesis placed the event horizon (the point below which self-correction becomes extremely unlikely) at a level implying roughly 12% recovery probability. The audit recalibrates this. The empirical event horizon sits at approximately L ≈ 52–55, with a recovery rate of 3.0% (95% CI: 0.7%–6.0%). The concept of an event horizon is valid and empirically supported, but the threshold and recovery rate needed revision.
The thesis assigns the United States a Liberty score of 48, placing it just below the "Partly Free" threshold. The audit finds this is below the credible range. The mean across seven major democracy indices (Freedom House, V-Dem, EIU, Bertelsmann, etc.) is 76.6, with a credible range of 57–84. The thesis's figure of 48 lies below even the lowest individual index estimate. However, V-Dem's September 2025 reclassification of the US as an "electoral autocracy" supports the direction of the thesis's assessment, even if the specific number is too low. The recommended recalibrated range is L = 57–72, depending on which dimensions are weighted.
The headline velocity of −18 points/year is confirmed in the specific 2-year window the thesis uses. But this is a cherry-picked timeframe. The standardised 10-year velocity is −4.2 points/year. Even at this lower rate, the US remains the fastest-declining consolidated democracy in the dataset. The claim is partially valid: the direction and relative ranking are correct, but the magnitude is overstated by approximately 4x. The recommended citation is "−4.2/yr (10-year standardised), the fastest decline amongst consolidated democracies."
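The window dependence is easy to see in code. The sketch below computes velocity as the average annual change over a trailing window; the function name and the score series are hypothetical, chosen only to show how a short window can exaggerate a recent drop.

```python
def velocity(scores_by_year, end_year, window_years):
    """Average annual change in score over the trailing window."""
    start_year = end_year - window_years
    change = scores_by_year[end_year] - scores_by_year[start_year]
    return change / window_years


# Hypothetical scores for illustration only (not project data).
toy_scores = {2015: 80.0, 2023: 76.0, 2025: 70.0}
v_2yr = velocity(toy_scores, end_year=2025, window_years=2)    # -3.0 per year
v_10yr = velocity(toy_scores, end_year=2025, window_years=10)  # -1.0 per year
```

The same series yields a velocity three times larger over the 2-year window than over the 10-year one, which is why a fixed, standardised window is the safer basis for cross-country ranking.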
The thesis claims US Treasuries carry a 2,080 basis-point reserve currency premium that is at risk from democratic erosion. The audit finds the defensible range is 200–580bp over 5–10 years, depending on the counterfactual model used. The original claim overstates the premium by 3.5–10x. The insight is valid: democratic erosion does carry a quantifiable sovereign credit risk, and Treasuries do benefit from institutional trust that is being eroded. But the headline number was indefensible.
| # | Claim | Verdict | Original Value | Audit Finding | Revision Required |
|---|---|---|---|---|---|
| 1 | Liberty-Yield β | Confirmed | β=−0.35, R²=0.37 | Reproduces exactly | None |
| 2 | Great Decoupling | Confirmed | r: 0.79 → 0.57 | Confirmed, 39 capable autocracies | None |
| 3 | 78% Holdout Accuracy | Confirmed | 78% | Real, but +5pp over 73% baseline | Add baseline context |
| 4 | Extreme Velocity Cataloging | Confirmed | US as fastest decliner | Confirmed even at moderate estimates | None |
| 5 | Bistable Dynamics | Refuted | Two deep wells | k ≈ 0, no deep wells | Upgrade to tristable model |
| 6 | Markov Property | Refuted | Stage-only transitions | Rejected at Stages 2, 5, 6 | Add path dependence |
| 7 | Shock σ = 3–8 | Refuted | σ = 3–8 | σ = 0.45–4.45 | Replace with data-driven σ |
| 8 | 62% Tyranny Probability | Refuted | P = 62% | P ≈ 0% (data-driven) | Reframe as P(L<50) post-break |
| 9 | Event Horizon at ~12% | Refuted | ~12% recovery | L ≈ 52–55, recovery 3.0% | Recalibrate threshold |
| 10 | US Liberty = 48 | Partial | L = 48 | Mean 76.6 (range 57–84) | Use range L = 57–72 |
| 11 | US Velocity −18/yr | Partial | −18/yr (2yr) | −4.2/yr (10yr std.) | Cite standardised 10yr rate |
| 12 | Treasury Premium 2,080bp | Partial | 2,080bp | 200–580bp (5–10yr) | Use defensible range |
The audit divides the thesis into its structural components and assesses each. The result is a clear line between what can be cited with confidence and what must be revised or retired.
The pattern is clear: the architecture of the thesis survives — the dataset, the relationships, the directional findings, the conceptual frameworks. What does not survive are the specific numerical calibrations — the point estimates, the volatilities, the probabilities. The thesis was more right about the world than about its own parameters.
The audit produced a full recalibration table that maps each plausible US Liberty score to its implications. This replaces the single-point estimate (L = 48) with a structured range, allowing readers to locate the US on the framework according to their preferred index weighting.
| Liberty (L) | Stage | Velocity (10yr) | Event Horizon? | Historical Reversal % | Predicted Yield Spread | Narrative |
|---|---|---|---|---|---|---|
| 48 | Stage 5–6 | −4.2/yr | Below | 3.0% | +1,120bp | Deep erosion. Below event horizon. Recovery extremely unlikely without external intervention. |
| 52 | Stage 5 | −4.2/yr | At threshold | 3.0% | +980bp | Event horizon boundary. Last realistic exit point for self-correction. |
| 57 | Stage 4 | −4.2/yr | Approaching | 12% | +805bp | Serious erosion. Institutional capture underway. Reversal possible but requires sustained effort. |
| 63 | Stage 3–4 | −4.2/yr | Above | 28% | +595bp | Democratic backsliding. Norms eroding, institutions under pressure. Comparable to Hungary 2012. |
| 72 | Stage 2–3 | −4.2/yr | Above | 54% | +280bp | Early-stage erosion. Press freedom declining, judicial independence under strain. Recoverable. |
| 77 | Stage 1–2 | −4.2/yr | Above | 72% | +175bp | Multi-index mean. Norm erosion phase. Still a functioning democracy by most measures. |
| 84 | Stage 1 | −4.2/yr | Above | 89% | +70bp | Top of credible range. Minor democratic stress. Comparable to France or UK. |
The table makes the stakes visible: whether you place the US at L = 57 or L = 77, the velocity is the same, and the trajectory points in the same direction. The disagreement is about how much runway remains, not whether the plane is descending.
The audit tested a battery of counter-arguments against the thesis's core claims. Most were anticipated by the thesis and adequately addressed. Three, however, landed with force, and require integration into the framework.
Strong
The thesis conflates two fundamentally different types of democratic decline. Policy erosion refers to bad policy choices made within functioning democratic institutions (e.g., voter suppression laws passed through normal legislative process). Structural erosion refers to damage to the institutions themselves (e.g., court-packing, elimination of independent oversight, constitutional manipulation). The former is self-correcting through elections; the latter may not be. A Liberty score that blends both overstates the structural risk. The recalibrated framework must distinguish these dimensions.
Strong
Historically, 98% of democracies with more than 40 years of continuous democratic governance have recovered from backsliding episodes. The United States has been continuously democratic for over 240 years (depending on how one dates full enfranchisement). The thesis's framework, which pools all countries regardless of democratic tenure, may understate the recovery probability for long-standing democracies. The base rate for US-like countries is not 3% (the overall event-horizon recovery rate) but substantially higher. This does not eliminate the risk — the US could be the exception — but it changes the prior.
Medium-Strong
No democracy with a GDP per capita above $15,000 has ever collapsed into autocracy. The United States has a GDP per capita of approximately $80,000. Whilst this does not make democratic collapse impossible (the sample of wealthy, declining democracies is extremely small), it suggests that economic development creates a stabilising floor that the thesis does not adequately model. Wealth creates exit options, independent media, civil society infrastructure, and an educated citizenry with high opportunity costs for political acquiescence. The thesis should incorporate wealth as a moderating variable, not ignore it.
These three counter-arguments share a common theme: the thesis pools too aggressively. It treats all countries, all decline types, and all income levels as equivalent when computing probabilities. A recalibrated thesis must stratify its predictions by democratic tenure, structural vs. policy erosion, and national income.
The audit was rigorous but not comprehensive. Four limitations should be noted.
All statistical tests were coded using Python's built-in modules — no scipy, statsmodels, sklearn, or other packages. This was deliberate (to ensure full reproducibility without dependency issues), but it means that some tests are less sophisticated than they could be. The bootstrap confidence intervals, for example, use simple percentile methods rather than bias-corrected and accelerated (BCa) bootstraps. The Monte Carlo simulations use basic random sampling rather than importance sampling or antithetic variates. More sophisticated methods might yield tighter confidence intervals, but they would not change the direction of any finding.
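For reference, a simple percentile bootstrap of the kind described can be written with built-in modules alone. This is a minimal sketch, not the audit's code: the function name, defaults, and sample values are illustrative.

```python
import random
import statistics


def percentile_bootstrap_ci(sample, stat=statistics.mean,
                            n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` of `sample`."""
    rng = random.Random(seed)  # seeded for reproducibility
    n = len(sample)
    # Resample with replacement, recompute the statistic, and sort the results.
    estimates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Toy sample for illustration only (not project data).
toy_sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 10.0, 12.0]
ci_low, ci_high = percentile_bootstrap_ci(toy_sample)
```

The percentile method reads the interval straight off the sorted resampled statistics. BCa bootstraps adjust those cut points for bias and skewness, which matters most for small or asymmetric samples.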
The audit tested the thesis against its own dataset. It did not attempt to replicate the dataset from primary sources, nor did it test the thesis's claims against alternative datasets (e.g., Polity V, IDEA's GSoD indices, or the Economist Intelligence Unit's Democracy Index). An external replication study using independent data would be a valuable next step. The audit can confirm internal consistency but not external validity.
The audit's preferred baseline model — a first-order autoregressive process — outperforms the thesis's stage-transition model on holdout data. But AR(1) is itself a simplification. It assumes linear dynamics, constant coefficients, and normally distributed errors. Real political transitions may involve structural breaks, regime-dependent volatility, and fat-tailed shocks. The audit's finding that "AR(1) beats the thesis model" does not mean AR(1) is the correct model — only that the thesis model adds complexity without adding predictive power.
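A first-order autoregressive baseline of the form x_t = c + φ·x_{t−1} + ε_t can be fitted by ordinary least squares on lagged pairs. The sketch below uses only built-in Python and a noise-free toy series; the function names are hypothetical, and it omits the structural breaks that the revised baseline is meant to include.

```python
def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_{t-1}. Returns (c, phi)."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    lagged = series[:-1]
    current = series[1:]
    n = len(lagged)
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    sxx = sum((a - mean_lag) ** 2 for a in lagged)
    if sxx == 0:
        raise ValueError("lagged series is constant; phi is not identified")
    sxy = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    phi = sxy / sxx
    c = mean_cur - phi * mean_lag
    return c, phi


def forecast_next(series, c, phi):
    """One-step-ahead forecast from the last observation."""
    return c + phi * series[-1]


# Noise-free toy series generated by x_t = 2 + 0.5 * x_{t-1} (not project data).
toy_series = [10.0, 7.0, 5.5, 4.75, 4.375]
c_hat, phi_hat = fit_ar1(toy_series)
next_value = forecast_next(toy_series, c_hat, phi_hat)
```

With φ close to 1 the model approaches a random walk, and its forecast becomes close to a persistence forecast.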
Many of the thesis's most consequential claims are about the United States specifically. But the US is a single observation in a 91-country dataset, with at most 13 decade-observations in the modern era. Statistical inference from N = 13 is inherently fragile. The confidence intervals are wide, the p-values are sensitive to specification, and any single outlier year can shift the results substantially. The audit's US-specific findings should be treated as indicative, not definitive.
Based on the audit findings, the following revisions will be incorporated into all future Political Topology publications:
| Element | Original | Revised |
|---|---|---|
| Dynamics model | Bistable (two wells) | Tristable (three basins: democracy, hybrid, autocracy) |
| Shock volatility | σ = 3–8 (stipulated) | σ = 0.45–4.45 (data-driven) |
| US Liberty estimate | L = 48 (point estimate) | L = 57–72 (credible range) |
| US decline velocity | −18/yr (2-year window) | −4.2/yr (10-year standardised) |
| Tyranny probability | 62% within 15 years | ≈0% (tyranny); 69% P(L<50) post-2006 break |
| Treasury mispricing | 2,080bp (implied) | 200–580bp over 5–10 years |
| Event horizon | ~12% recovery rate | L ≈ 52–55; recovery 3.0% (CI: 0.7–6.0%) |
| Transition model | Markov (stage-only) | Path-dependent (direction of travel incorporated) |
| Prediction baseline | Stage-transition model | AR(1) with structural breaks |
The Political Topology project continues. The dataset will be updated annually. The framework will be refined as new data arrives. And every major claim will continue to be tested against the standard set by this audit: does the data actually say what we claim it says?
If the answer is no, we will say so. Again.
| Phase | Tasks | Focus | Key Method |
|---|---|---|---|
| Phase 1: Reproduction | 5 | Reproduce headline statistics from raw data | Exact replication of β, R², correlation coefficients, holdout accuracy |
| Phase 2: Stress Testing | 5 | Test assumptions underlying the model | Markov tests, σ estimation, structural break detection, bootstrap CIs |
| Phase 3: Counter-Arguments | 5 | Test the strongest objections to the thesis | Sub-sample analysis, GDP conditioning, democratic tenure stratification |
| Phase 4: Recalibration | 5 | Produce revised estimates using audit-validated parameters | Data-driven Monte Carlo, multi-index reconciliation, recalibration table |
All tests used the Political Topology dataset: 91 countries, 225 years (1800–2025), 1,656 country-decade observations. Primary sources: Freedom House Freedom in the World, V-Dem (Varieties of Democracy) Electoral Democracy Index and Liberal Democracy Index, Fragile States Index, World Bank World Development Indicators, UNDP Human Development Index, IMF World Economic Outlook.