Methodology & Data Sources

This page documents the data sources, curation methodology, and limitations of the Tech Terrain visualization. It is intended for researchers, educators, and anyone interested in understanding how the data was assembled.

Last updated Feb 2026

Cite This Project

If you use Tech Terrain in academic work, presentations, or publications, please cite it using the BibTeX entry below.

@misc{techterrain_2026,
  title     = {Tech Terrain: 3.4 Million Years of Human Technological Progress},
  author    = {Vamps},
  year      = {2026},
  url       = {https://techterrain.io},
  urldate   = {2026-02-15},
  note      = {Interactive 3D voxel visualization mapping technological advancement from stone tools to AI. Data sourced from Wikipedia, Our World in Data, LessWrong, and the World Bank DataBank.}
}

Dataset Overview

The Tech Terrain dataset comprises approximately 240 historical milestones, 30+ company milestones, and 38 future predictions, spanning from 3.4 million years ago (earliest stone tools) to projected developments in the 2050s.

Each milestone is categorized into one of 10 technology domains: Tools, Energy, Materials, Transport, Communication, Computing, Health, Science, Space, and AI. Categorization follows the primary technological contribution of each innovation, with a secondary mapping table for ambiguous cases.

Layer                 | Count     | Time Span          | Source
Historical Milestones | ~240      | 3.4M BCE – 2025 CE | Wikipedia, OWID
Company Milestones    | ~35       | 1976 – 2024        | Press releases, Wikipedia
Future Predictions    | 38        | 2026 – 2050        | LessWrong, BBC, IBM, ITER, Claude AI
World Bank Indicators | 10 series | 1960 – 2024        | World Bank DataBank

Historical Milestones

Historical milestones are sourced primarily from two references:

  • Wikipedia: Timeline of Historic Inventions
  • Our World in Data: technology datasets

Each milestone includes a sourceUrl field linking to its primary reference. Dates are approximations based on the best available scholarly consensus. Where multiple dates exist in the literature (e.g., "invention" vs. "first practical use"), we generally favor the date of first documented practical application.

Impact scores (1–10) are subjective editorial assessments reflecting the milestone's long-term influence on subsequent technological development. They are not derived from any quantitative model.
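As a rough illustration of the record described above, the following sketch models a milestone and checks the constraints stated on this page (one of 10 domains, an impact score of 1–10, a linked source). Only sourceUrl is a field name taken from this page; the class name and the other field names are assumptions made for the example, not the project's actual schema.

```python
from dataclasses import dataclass

# The ten domains listed in the Dataset Overview.
DOMAINS = frozenset({
    "Tools", "Energy", "Materials", "Transport", "Communication",
    "Computing", "Health", "Science", "Space", "AI",
})


@dataclass(frozen=True)
class Milestone:
    # Hypothetical field names, except sourceUrl, which this page mentions.
    name: str
    year: int        # integer year; negative values are BCE
    category: str    # one of DOMAINS
    impact: int      # subjective editorial score, 1-10
    sourceUrl: str   # link to the primary reference

    def __post_init__(self) -> None:
        # Reject records that violate the constraints described in the text.
        if self.category not in DOMAINS:
            raise ValueError(f"unknown category: {self.category!r}")
        if not 1 <= self.impact <= 10:
            raise ValueError(f"impact must be 1-10, got {self.impact}")
        if not self.sourceUrl.startswith(("http://", "https://")):
            raise ValueError("sourceUrl must be an absolute URL")
```

Validating at construction time keeps malformed records out of the dataset early, which matters when scores are assigned by hand.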

Future Predictions

Predictions for 2025–2055 are sourced from two channels:

  • LessWrong — AI Predictions (community forecasts with probability estimates)
  • AI-generated predictions (Claude) — speculative extrapolations based on current technology trajectories

Predictions are inherently speculative. They represent the opinions of their cited sources and are not guaranteed outcomes. They are displayed as amber diamond markers to visually distinguish them from verified historical events.

World Bank Indicators

Quantitative overlay data comes from World Bank Open Data via the public API at api.worldbank.org/v2. Each series is fetched for the "World" aggregate (country code WLD) and cached locally with a 7-day TTL.
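A minimal sketch of such a fetch-and-cache step is shown below. It relies on the public API's country/{code}/indicator/{id} path and its two-element JSON response (metadata, then rows). The cache directory, file layout, and function names are assumptions for illustration and are not taken from the Tech Terrain codebase.

```python
import json
import time
from pathlib import Path
from typing import Optional
from urllib.request import urlopen

BASE_URL = "https://api.worldbank.org/v2"
TTL_SECONDS = 7 * 24 * 60 * 60  # the 7-day TTL mentioned in the text


def indicator_url(code: str, country: str = "WLD") -> str:
    # per_page is raised above the default so one request covers 1960 onward.
    return f"{BASE_URL}/country/{country}/indicator/{code}?format=json&per_page=200"


def parse_series(payload: list) -> dict:
    """Turn the API's [metadata, rows] response into {year: value}, skipping nulls."""
    rows = payload[1] if len(payload) > 1 and payload[1] else []
    return {int(r["date"]): r["value"] for r in rows if r["value"] is not None}


def is_fresh(path: Path, now: Optional[float] = None) -> bool:
    """True if the cache file exists and is younger than the TTL."""
    if not path.exists():
        return False
    now = time.time() if now is None else now
    return now - path.stat().st_mtime < TTL_SECONDS


def load_series(code: str, cache_dir: Path = Path(".wb_cache")) -> dict:
    # Hypothetical cache layout: one JSON file per indicator code.
    path = cache_dir / f"{code}.json"
    if is_fresh(path):
        payload = json.loads(path.read_text())
    else:
        with urlopen(indicator_url(code), timeout=30) as resp:
            payload = json.load(resp)
        cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(payload))
    return parse_series(payload)
```

Calling load_series("NY.GDP.MKTP.CD") would return the world GDP series keyed by year, hitting the network only when the cached file is missing or older than seven days.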

The following 10 indicators were selected for their relevance to the economic and infrastructural context of technological advancement:

GDP (current US$)

Gross domestic product at current US dollar prices.

NY.GDP.MKTP.CD

GDP per Capita (current US$)

GDP divided by midyear population.

NY.GDP.PCAP.CD

R&D Expenditure (% of GDP)

Gross domestic expenditures on research and development as a share of GDP.

GB.XPD.RSDV.GD.ZS

Internet Users (% of population)

Individuals who have used the Internet in the last 3 months.

IT.NET.USER.ZS

Mobile Subscriptions (per 100 people)

Subscriptions to a public mobile telephone service.

IT.CEL.SETS.P2

Patent Applications (residents)

Worldwide patent applications filed by residents.

IP.PAT.RESD

High-Technology Exports (current US$)

Exports of products with high R&D intensity.

TX.VAL.TECH.CD

Tertiary Education Enrollment (% gross)

Gross enrollment ratio in tertiary education.

SE.TER.ENRR

Fixed Broadband Subscriptions (per 100 people)

Fixed broadband subscriptions providing high-speed access.

IT.NET.BBND.P2

Electric Power Consumption (kWh per capita)

Electric power consumption per capita.

EG.USE.ELEC.KH.PC

Search a Topic (AI-Generated)

The "Search a Topic" feature uses a server-side large language model (LLM) to generate milestone data for arbitrary queries in real time. Results are cached server-side for performance and cost management.

AI-generated milestones include semantic tags, source URLs, impact scores, and category assignments. While the model strives for factual accuracy, generated milestones should be cross-referenced with primary sources before use in academic work. The LLM prompt is versioned; cached results from older prompt versions are automatically invalidated.
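One common way to get this kind of automatic invalidation is to fold the prompt version into the cache key, so that entries written under an older prompt are simply never looked up again. The sketch below illustrates that idea only. The class, the key format, and the version labels are hypothetical, and the actual server may invalidate its cache differently.

```python
import hashlib
from typing import Optional


def cache_key(query: str, prompt_version: str) -> str:
    # Normalize case and whitespace so equivalent queries share one entry.
    normalized = " ".join(query.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
    # Prefixing the version means a prompt change yields entirely new keys.
    return f"{prompt_version}:{digest}"


class TopicCache:
    """In-memory stand-in for a server-side cache (hypothetical)."""

    def __init__(self, prompt_version: str, store: Optional[dict] = None) -> None:
        self.prompt_version = prompt_version
        self._store = {} if store is None else store

    def get(self, query: str) -> Optional[list]:
        return self._store.get(cache_key(query, self.prompt_version))

    def put(self, query: str, milestones: list) -> None:
        self._store[cache_key(query, self.prompt_version)] = milestones
```

With this scheme, stale entries are not deleted eagerly; they become unreachable and can be evicted later by whatever expiry policy the store applies.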

Curation Methodology

The dataset was assembled through a multi-step process:

  1. Source compilation: Milestones were extracted from Wikipedia's Timeline of Historic Inventions and Our World in Data's technology datasets.
  2. Category assignment: Each milestone was assigned to one of 10 primary technology domains using a deterministic mapping table. Ambiguous cases (e.g., "nuclear energy" could be Energy or Science) were resolved according to the primary technological contribution.
  3. Date normalization: Dates were normalized to integer years. For prehistoric events, approximate dates from archaeological consensus were used (e.g., "Oldowan stone tools" → -3,400,000).
  4. Impact scoring: Subjective 1–10 impact scores were assigned editorially, reflecting long-term influence on subsequent technological development.
  5. Source linking: Each milestone was linked to its primary Wikipedia or Our World in Data article for traceability.
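Steps 2 and 3 above can be sketched as two small functions. This is an illustrative reconstruction under stated assumptions, not the project's code: the accepted date formats, the function names, and the shape of the override table are hypothetical. Following the Oldowan example, a "years ago" date is stored directly as a negative year.

```python
import re

# Accepts forms such as "1876", "c. 3500 BCE", "3.4M BCE",
# or "3.4 million years ago" (an assumed set of input formats).
_DATE = re.compile(
    r"^\s*(?:c\.|ca\.|circa)?\s*([\d.,]+)\s*(million|m)?\s*"
    r"(bce|bc|ce|ad|years ago)?\s*$",
    re.IGNORECASE,
)


def normalize_year(text: str) -> int:
    """Convert a date string to an integer year; BCE and 'years ago' are negative."""
    match = _DATE.match(text)
    if match is None:
        raise ValueError(f"unrecognized date: {text!r}")
    number, scale, era = match.groups()
    value = float(number.replace(",", ""))
    if scale:
        value *= 1_000_000
    year = round(value)
    if era and era.lower() in {"bce", "bc", "years ago"}:
        year = -year
    return year


def assign_category(name: str, primary: str, overrides: dict) -> str:
    """Deterministic lookup: an override for an ambiguous case wins,
    otherwise the milestone keeps its primary domain."""
    return overrides.get(name.lower(), primary)
```

For example, normalize_year("3.4 million years ago") yields -3400000, matching the Oldowan entry above. The override table passed to assign_category stands in for the mapping table mentioned in step 2, whose actual contents are not documented on this page.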

Limitations

  • Western bias: The dataset skews toward Western/European technological history due to the predominance of English-language sources. Significant innovations from East Asia, the Middle East, Africa, and the Americas may be underrepresented.
  • Date precision: Prehistoric dates are approximations with margins of error spanning thousands of years. Even modern dates may vary by source.
  • Impact subjectivity: Impact scores are editorial judgments, not derived from any quantitative model. Different scholars would reasonably assign different scores.
  • Prediction uncertainty: Future predictions are speculative. The further into the future a prediction reaches, the less reliable it becomes.
  • World Bank data gaps: Some indicators have incomplete coverage for certain years. The "World" aggregate may not reflect all countries equally. The trademark indicator was excluded because it returned no data points at the world level.
  • AI-generated content: Topic explorations are generated by an LLM and may contain hallucinated dates, incorrect attributions, or missing context.
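The World Bank data gaps noted above can be quantified with a simple coverage check. The sketch below assumes each series is a mapping from year to value, with missing years absent or set to None. The function names and the 1960–2024 default range (taken from the overview table) are illustrative choices.

```python
def coverage(series: dict, start: int = 1960, end: int = 2024) -> float:
    """Fraction of years in [start, end] that have a non-null value."""
    years = range(start, end + 1)
    present = sum(1 for year in years if series.get(year) is not None)
    return present / len(years)


def drop_empty(series_by_code: dict) -> dict:
    """Exclude indicators that return no data points at all."""
    return {
        code: series
        for code, series in series_by_code.items()
        if any(value is not None for value in series.values())
    }
```

A check like this makes it easy to report, per indicator, how much of the time span is actually backed by data.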

Tech Terrain is an open educational tool created by Vamps. This methodology page is provided for transparency and scholarly reference.
