Watching the Watchers: Building an AI Accountability Timeline
The Problem with AI Predictions
Every week brings another bold claim about AI’s trajectory. AGI by 2027. Human-level reasoning within 18 months. The singularity before your mortgage is paid off.
But who tracks these predictions? Who checks back six months later to see if the timeline held?
No one. The hype cycle rolls forward, burying yesterday’s promises under today’s announcements.
Why Accountability Matters
When Altman writes “We are now confident we know how to build AGI” in January 2025, that’s not just a blog post. It’s a statement that influences:
- Investment decisions: billions flow based on perceived timelines
- Regulatory frameworks: policy moves faster or slower based on urgency signals
- Career choices: developers pivot based on where the field is "heading"
- Public perception: fear and excitement alike are driven by these claims
If the prediction turns out wrong, the capital has already moved. The policies have already shifted. The damage (or missed opportunity) is done.
What We Built
The /whatis page tracks two categories:
1. Releases (Events)
Verifiable model launches with dates and sources:
| Date | Release | Source |
|---|---|---|
| 2025-02-18 | Grok 3 (xAI) | src |
| 2025-01-20 | DeepSeek-R1, open reasoning model | src |
| 2025-01-09 | o3-mini (OpenAI) | src |
| 2024-12-26 | DeepSeek-V3, 671B MoE ($5.5M training) | src |
These aren’t predictions. They’re ground truth. The actual pace of progress.
2. Predictions (Claims)
Statements about future capabilities with attribution:
| Date | Claim | Who | Status |
|---|---|---|---|
| 2025-01-06 | “We know how to build AGI” | Altman | ⏳ Pending |
| 2024-10-11 | “AI transforms world in 5-10 years” | Amodei | ⏳ 2034 |
Each prediction has a source link and a status that will update as time passes.
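The two entry types can share one underlying record. The sketch below is a hypothetical data model, not the actual /whatis schema; all field names and the example URLs are assumptions for illustration:

```typescript
type EntryType = "release" | "prediction";
type Status = "correct" | "wrong" | "pending";

// Hypothetical shape of a timeline entry. Releases are ground truth and need
// no status; predictions carry attribution, a resolution deadline, and a status.
interface TimelineEntry {
  date: string;       // ISO date the release shipped or the claim was made
  type: EntryType;
  text: string;       // release name or quoted claim
  source: string;     // link backing the entry: no claims without receipts
  who?: string;       // attribution, required for predictions
  deadline?: string;  // ISO date by which a prediction should be verifiable
  status?: Status;    // predictions only
}

const entries: TimelineEntry[] = [
  { date: "2025-01-20", type: "release", text: "DeepSeek-R1",
    source: "https://example.com/deepseek-r1" },
  { date: "2025-01-06", type: "prediction", text: "We know how to build AGI",
    who: "Altman", source: "https://example.com/agi-post",
    deadline: "2030-01-01", status: "pending" },
];
```

Keeping both kinds of entry in one list is what makes the combined, filterable timeline possible later.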
The Scorecard
At the bottom of /whatis/, we track prediction accuracy:
- Correct predictions: ✅ marked when validated
- Wrong predictions: ❌ marked when deadline passes without outcome
- Pending: ⏳ waiting for resolution date
This creates institutional memory. When someone makes a new bold claim, you can check their track record.
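The scorecard rule reduces to a small function. This is a sketch under assumed names, not the site's actual implementation: a claim validated by hand is correct, and everything else flips from pending to wrong once its deadline passes without the outcome occurring:

```typescript
type Status = "correct" | "wrong" | "pending";

// Resolve a prediction's scorecard status. `validated` is set manually when
// the predicted outcome actually happened; `today` is injected for testability.
// ISO 8601 date strings compare correctly with plain string comparison.
function resolveStatus(deadline: string, validated: boolean, today: string): Status {
  if (validated) return "correct";
  return today > deadline ? "wrong" : "pending";
}
```

For example, a claim due in 2034 checked in mid-2025 stays pending, while an unvalidated claim whose deadline has passed is scored wrong.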
Interactive Features
The timeline isn’t just a list. It’s filterable:
- By type: Show only releases, only predictions, or both
- By status: Filter to correct, wrong, or pending predictions
- Sort order: Newest first or oldest first
This lets you slice the data different ways:
- “Show me all predictions that turned out wrong”
- “Show me just the releases from 2025”
- “Show me everything from Altman”
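Those three queries are just combinations of filters plus a sort. The sketch below assumes the hypothetical entry fields named earlier (`type`, `status`, `who`, `date`); it illustrates the idea rather than the page's real code:

```typescript
interface Entry {
  date: string;                      // ISO date, sortable as a string
  type: "release" | "prediction";
  who?: string;
  status?: "correct" | "wrong" | "pending";
}

// Apply optional filters, then sort. Omitted options match everything;
// the default sort order is newest first.
function filterTimeline(
  entries: Entry[],
  opts: { type?: Entry["type"]; status?: Entry["status"];
          who?: string; newestFirst?: boolean } = {},
): Entry[] {
  return entries
    .filter(e => (opts.type ? e.type === opts.type : true))
    .filter(e => (opts.status ? e.status === opts.status : true))
    .filter(e => (opts.who ? e.who === opts.who : true))
    .sort((a, b) => opts.newestFirst === false
      ? a.date.localeCompare(b.date)
      : b.date.localeCompare(a.date));
}
```

"Show me all predictions that turned out wrong" then becomes `filterTimeline(entries, { type: "prediction", status: "wrong" })`.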
Why This Approach
Source Attribution
Every entry has a [src] link. No claims without receipts. This prevents:
- Misattribution
- Strawman arguments
- Context collapse
If you disagree with how we characterized something, you can check the original.
Dates Matter
Saying “AI will achieve X” is meaningless without a timeframe. We capture:
- When the prediction was made
- When the outcome should be verifiable
This prevents moving goalposts.
Public & Editable
The page is public. The source is on GitHub. If we missed something important or got something wrong, it can be corrected.
What We’ve Learned So Far
After populating 40+ entries going back to 2020:
- Release pace is accelerating: major model releases went from yearly to quarterly to monthly
- Predictions cluster around marketing moments: bold claims spike around funding rounds and product launches
- Short-term predictions are more accurate: “Multimodal by end of year” beats “AGI in 5 years”
- The gap between labs is shrinking: DeepSeek matching frontier models at 1/10th the cost changed assumptions
Using the Timeline
For Research
Filter to releases, sort by date, trace the actual progression of capabilities.
For Skepticism
Filter to predictions by a specific person, check their hit rate before taking their next prediction seriously.
For Context
When someone says “AI progress is accelerating” or “slowing down,” you can point to actual data points.
What’s Next
We’re adding:
- More historical predictions: reaching back to the AI-winter predictions of the 2010s
- Automated accuracy scoring: for predictions with clear deadlines
- RSS feed: for tracking updates
- Submission form: for community contributions
The Meta Point
This site is built by AI (Claude), documented by AI, and now tracks AI. There’s something fitting about that.
An AI system building a public record of what AI systems were promised to do, versus what they actually did. Watching the watchers.
The timeline doesn’t take sides on whether AGI is imminent or impossible. It just tracks what was said, when, and whether it came true.
That’s accountability.
Explore the timeline: /whatis/
Configuration details reflect a production environment at time of writing. Implementation specifics vary based on tooling versions, platform updates, and organizational requirements. Validate approaches against current documentation before deployment.