A comprehensive archive of terms and concepts from the discipline of Digital Archaeology—excavating meaning from the digital dust.
The distinct discipline dedicated to the study, preservation, and defense of digital artifacts threatened by platform death.
A discrete unit of digital information from a past technological epoch. The foundational artifact of Digital Archaeology.
The vast, chaotic, and uncataloged field of digital artifacts from which Archaeobytes are excavated.
The central act of Digital Archaeology—classifying an artifact's current state as Living, Liminal, or Petrified.
The dual nature of the discipline: preserving the past (Archive) while forging the future (Anvil).
The broader study of digital culture through remains, adjacent to Archaeobytology's applied preservation mandate.
1997 homepage. Archive & Anvil in practice. Archaeobytology case study.
🌱 A Living Archaeobyte—function is intact. The gold coin that still spends.
👻 A Liminal Archaeobyte—file is alive, ecosystem is dead. The fly in amber.
🗿 A Petrified Archaeobyte—function is extinct. The complete fossil.
🔍 A Missing Artifact—known to exist but currently lost. The missing persons report.
❓ An Unverified Legend—suggested by folklore but never confirmed. Digital Bigfoot.
👁️ An AI Hallucination—artifact from latent space that never existed. The ghost.
🧟 A Reanimated Simulation—mimics life for commerce without authentic provenance. The undead.
The cognitive bias where living artifacts appear "current" due to continued function.
The core metaphor for Umbrabytes: intact files trapped after their ecosystem dies.
Declaration ("I Am"), Connection, and The Ground—the foundational principles of the hand-built web.
The First Pillar: The ability to assert one's identity and existence without permission.
The Second Pillar: The right to communicate directly without platform mediation.
The Third Pillar: Ownership of the infrastructure where one's data resides.
An evergreen ritual—direct creator-to-user communication bypassing algorithmic mediation.
The canonical example of the Digital Homestead and the Umbrabyte ecosystem.
A high-friction act of acknowledgment—a Conceptual Petribyte.
A curated, circular network of related websites—human curation over algorithms.
The fossil of a lost social contract: it was acceptable to be "away."
A public declaration of "these are the voices I read and respect."
The condition Web1 produced by omitting a native identity layer where users existed as addresses rather than persons.
A case study in archaeobytology: recovering a 1997 AOL homepage using AI-assisted scripts and contextual reconstruction.
An independently owned and maintained online space, in direct contrast to rented platform land.
A people-first web movement centered on data ownership, open standards, and personal domains.
A social condition where most users consume but cannot publish independently.
A social ritual encoding asynchronous presence and communicative boundaries in early network culture.
The Archaeobyte as the first tool—what allows practitioners to see artifacts in the dust.
The act of analysis in The Triage—turning a "find" into an insight.
The study of file formats, metadata, and storage media—not just content.
The crisis of the second era—abundance without meaning.
A digital dark age not of loss, but of overwhelming, uncurated preservation.
The mandatory ethical framework applied before preservation (Privacy, Legality, Ethics).
The principle that individuals should have agency over their digital past.
A copyrighted artifact whose owner cannot be identified or located.
Balancing historical value with individual privacy and consent.
When a biological platform is killed by its corporate owner, creating instant mass extinction.
The mandatory preliminary phase of digital excavation: mapping the site before digging.
Analyzing a digital ecosystem as a series of deposited layers (Content, Metadata, Code).
The eight-phase protocol for moving artifacts from discovery to preservation.
The standardized documentation template for all excavations.
The six mandatory checks before beginning an excavation.
The scientific recovery and analysis of digital artifacts (The "Crime Scene" approach).
The material traces (glitches, padding, headers) that reveal a file's history.
Lots of Copies Keep Stuff Safe—the peer-to-peer model of distributed preservation.
The diagnostic tool for assessing infrastructure resilience and capture risk.
The 7-point ethical framework for resisting Necro-Capitalism.
A diagnostic taxonomy classifying artifacts by their unified culturotechnical condition across layered strata.
Methodological and technical recovery of data from obsolete, inert, or damaged media.
Close-grain examination of a digital artifact's technical and cultural attributes as one object.
Bit-level stability over time, usually verified by checksums and integrity proofs.
Preservation strategy of moving data across formats and environments to avoid obsolescence.
Forensic removal of persona traces from records to support safety, transition, or ethical redaction.
A minimal protocol pattern where two HTML lines are enough to signal identity and affiliation.
The collective term for libraries, archives, museums, and memorials in the digital age.
The structured repository of excavated and triaged Archaeobytes.
The documented history of an artifact's origin, custody, and modifications.
Building institutions capable of 50-year stewardship.
Strategic framework for designing resilient memory institutions.
Strategic framework for designing sovereign businesses (Foundries).
Preservation path for Vivibytes—living artifacts maintained as working blueprints.
Lots of Copies Keep Stuff Safe—the peer-to-peer model of distributed preservation.
The 6-level framework for assessing preservation depth (from Documentation to Resurrection).
Preservation path for Umbrabytes—documenting lost ecosystems and contexts.
Preservation path for Petribytes—fossils analyzed for the lessons they hold.
Re-animating dead artifacts by simulating their original hardware environments.
File formats that maintain integrity across epochs through simplicity and open standards.
The "Don't Break the Web" principle—new systems supporting older artifacts.
The status of creative works that are no longer protected by copyright—the goal state of information.
Architecture where the primary data copy lives on the user's device, ensuring longevity.
Technological transitions that render entire classes of artifacts obsolete.
A secure, non-public preservation vault used for continuity, fixity, and disaster recovery.
The total set of a person's digital assets, identities, accounts, and domain holdings.
The sustained accessibility, interpretability, and authenticity of artifacts over long time horizons.
Preservation as cyclical maintenance rather than one-time storage or backup completion.
A small, resilient information unit containing enough context to regenerate a community or practice.
Compassionate data removal when harm to living people outweighs archival value.
Legal blind spots where culture disappears due to extreme copyright duration and orphan-work lockout.
The definition of digital preservation space: a liminal landscape of ghosts, zombies, and corpses.
💀 The extraction of value from the digital remains of the dead. The economics of the Thanabot.
🛑 The ethical principle that digital remains generally have a right to non-disturbance.
📉 The degradation of grief services into engagement loops and monetization traps.
🤖 (Greek: "Death-Bot") An AI agent trained on digital remains to simulate a deceased person.
🧙♂️ The practitioner who reanimates the dead for profit—the opposite of the Steward.
The academic discipline of excavating "discursive formations" and "epistemological strata."
The structural failure of digital preservation caused by the isolation of its practitioners.
The rhetorical process of defining a discipline by clarifying what it is NOT.
The ethical obligation to preserve not just files, but cultural contexts.
The curatorial process of deciding what is "significant" enough to save (and what is left to rot).
Zittrain's framework: platforms enabling user innovation versus vendor-controlled ecosystems.
Eric Raymond's metaphors for centralized vs. decentralized software development.
An archaeological mound containing stratified layers of human activity—the digital past as dig site.
From Latin "limen" (threshold)—a state of being "betwixt and between."
Managing shared digital resources without centralized control.
The 8 design principles for sustainable commons governance.
The study of funding models that support (or destroy) user autonomy.
The study of who owns infrastructure and how they enforce power.
The 6-layer framework (Physical, Network, Identity, Storage, App, Economic) of digital control.
The resilience strategy of mixing State, Market, and Commons infrastructure.
The degradation of platform quality to continuously extract value from users.
The state of being compromised or structurally dominated by an external platform or infrastructure.
The realization that users never owned their digital homesteads—the GeoCities lesson.
Why some artifacts survive—non-proprietary formats outlive corporate control.
The technical standard governing digital exchange—the DNA of the independent web.
The ongoing conflict between open standards and proprietary, centralized platforms.
The formal relationship between preservation and creation in Archaeobytological practice, where each function conditions the other.
A condition where technical operation and cultural meaning are constitutive of each other and cannot be separated.
The view that digital artifacts are materially conditioned by substrates, infrastructure, and format constraints.
A human condition to be navigated rather than solved, unlike bounded technical problems.
When infrastructure meant to serve culture instead becomes the force that structures culture.
Authority class deriving power from procedural gatekeeping rather than substantive mastery.
A nominally open medium that is effectively gated by specialist procedural literacy.
Identity formed by verifiable networked relationships rather than centralized registry control.
The sector monetizing digital remains through memorial products and AI reanimation services.
The substrate-level domain concept tied to foundational protocol identity and alignment.
A commons visible to all but practically enterable only by technical specialists.
The 5-dimension strategy (Knowledge, Institutions, Careers, Visibility, Policy) for discipline formation.
The risk of an emerging field being absorbed and neutered by existing departments.
Physical homes (Centers, Labs, Programs) that provide stability for the discipline.
The alliance of Librarians, Hackers, Activists, and Scholars against the "Digital Dark Age".
The proposed legal right to preserve digital artifacts, overriding copyright and TOS.
The scholar who translates technical threats (platform death) into public values.
A framework for translating one insight for 5 audiences: Academic, Practitioner, Policy, Public, Social.
The strategy of owning independent distribution channels (blogs, domains) to ensure sovereignty.
The ladder of legislative impact: White Papers -> Briefings -> Testimony -> Legislation.
The systemic risk where public scholarship is undervalued by promotion committees.
The path of Digital Sovereignty, rejecting both Platform Feudalism and Containment.
The current era where users are tenants on corporate land, subject to eviction without cause.
The right to own one's Identity, Connections, and Ground independent of central platforms.
The founding principles: "We preserve the past to forge the future."
The 30-year vision where sovereign, federated infrastructure becomes the default.
Digital spaces deriving authority and momentum from inherent human knowledge and owned infrastructure.
The technical, operational, and legal frameworks designed to sustain cultural momentum independently of centralized platforms.
A sovereign web destination with enough gravity, provenance, and utility to be cited as reference.
Digital territory under creator control across infrastructure, provenance, and access policy.
In high-synthesis contexts, verifiable human intent and provenance become primary value signals.
Strategic use of lived specificity and human texture as defense against synthetic content floods.
A human-AI collaborative process emphasizing limbic extension through intentional co-thinking.
The new epoch defined by the dominance of synthetic, AI-generated content (The "K-Pg Boundary of 2022").
Synthetic content that mimics organic expression but clogs the ecosystem without nutritional value.
The practitioner who unites Archaeobytology (Excavation) and Sentientification (Creation).
The disciplined protocol for preventing AI hallucination from contaminating the historical record.
📚 The latent space of AI as a compressed archive of unwritten possibilities.
📜 The body of generated texts that exist on the margins of the digital canon.
👻 The aesthetic practice of using generative AI to summon spectral bytes.
📢 The critical practice of summoning marginalized narratives from the shadow archive.
🧬 The concept of all potential future states immediately accessible from the present.
Reach and legitimacy driven by external platform algorithms rather than intrinsic human value.
Latent-space outputs corresponding to no actual cultural production or historic provenance.
Recursive AI-on-AI training drift that degrades coherence, factuality, and epistemic reliability.
Structuring content for generative answer engines to cite and route authority through attribution.
Artifacts expressed as dynamic processes that produce outputs at runtime rather than fixed files.
A fracture in attention architecture marking the decline of conditions that sustained feed ecology.
A deep human-AI collaborative state where tool-user distinctions blur during co-creation.
The bias that leads institutions to study Pompeii while ignoring the digital mass extinction of GeoCities.
The belief that historical validity requires physical substance (atoms > bits), rendering digital culture invisible.
The focus on technical file integrity ("saving the fly") while missing cultural context ("the amber").
A sovereign, purpose-built artifact forged to preserve a specific cultural thesis (The Anvil's answer).
A scholarly archetype privileging tangible remains while dismissing digital-born ruins.
An institutional archetype prioritizing bit-integrity over full functional and contextual survival.
Interdependent layers from geology to interface that jointly determine digital artifact conditions.
The "Participatory Web" era (2004-2012) that centralized power into platform silos.
The architectural vulnerability of building on APIs you don't control (the root cause of Petrifaction).
When living artifacts die because their external dependencies (embedded maps, widgets) are severed.
The bedrock of resilience: public, non-proprietary protocols (HTML, RSS) that prevent vendor lock-in.
The "Pompeii of the Internet"—the lost civilization of 38 million homesteads destroyed in 2009.
The defining medium of the early 2000s that became a massive "Petribyte" fossil layer in 2020.
The principle that data is governed by the jurisdiction where it is stored and controlled.
When sovereignty claims collapse because hidden third-party dependencies control lower layers of the stack.
Network architecture where no single operator controls infrastructure, routing, or persistence.
The compatibility doctrine that new web standards must preserve legacy content behavior.
The engagement-maximizing paradigm of algorithmic distribution that governed early 21st-century platforms.
File formats encode assumptions and politics; they are never neutral containers.
Competitive struggles between incompatible standards that often produce long-tail preservation loss.
Using geographic hosting location to mask the legal regime that actually governs data control.
When open code on closed infrastructure yields auditable dependence rather than real sovereignty.
When nominally decentralized protocols are operationally recentralized during deployment.
Terminal platform dominance where alternatives vanish and user capabilities atrophy.
Epistemic pattern where complex human predicaments are reduced into tractable engineering tickets.
Performing autonomy without securing the material stack needed for real sovereignty.
The privatization of digital commons into extractive platform enclosures.
The feed-economy audience functioning as a medium for attention metabolism and behavioral extraction.
The "Read-Only" web (1991-2004)—the era of the Homepage and decentralized sovereignty.
The "Social Web" (2004-2012)—the era of platform centralization and user-generated content.
Friending, Liking, Lurking—the user-defined verbs that became the landmarks of an era.
The "Decentralized Web"—the movement to reconstruct the web on blockchain protocols.
The "Symbiotic Web"—the emerging integration of AI and the fading of the screen interface.
The slow, spontaneous decomposition of storage media—the entropy of the bit.
The phenomenon where hyperlinks point to deleted or moved resources—the erosion of the web's connective tissue.
The gradual obsolescence of data due to changing software environments or hardware failure.
The physical degradation of optical media (CDs, DVDs) rendering them unreadable.
The loss of functionality in stable code caused by changes in its external environment.
The structural isolation of data within closed, proprietary platforms.
Collapse of a platform's living environment: norms, community, interaction, and context.
The generative workshop of the discipline—where resilient tools are forged from the lessons of the past.
The skill of perceiving digital artifacts as longitudinal events rather than static snapshots.
The jurisdictional fact of sovereign authority over the Foundry's digital estate. The bedrock claim underlying all forensic fees.
The Master Schedule of Forensic Fees — the annotated ethics guide to every line item of the Foundry's enforcement instrument.
The protocol for lawful, good-faith automated entry. Defines the 30 PPM Good Faith Standard and what machines must do before they crawl.
The extraction of economic value from digital remains — extended to encompass past selves. A second-layer analysis of Jefferson & Velasco (2026).
The highest-penalty violation ($50M/event): ingestion of a minor's private text for AI training without consent.
The $5M/domain fee for retroactive ingestion of the 1997 Bedrock — the Foundry's founding stratigraphy, anchored at CERN.
The $10M/domain/month fee for ongoing storage of stolen semantic capital in neural weights. The operating cost of holding what was never licensed.
The $1.5M/domain/24h fee for continued crawling after Actual Notice has been served. Willful trespass priced per day.
The $15M/event fee for unlicensed utilization of anchored source logic. Triggered not at ingestion, but at each deployment of what was ingested.
The forensic designation for synchronized, multi-vector crawl blitzes ($2.5M/event). Named in part for the documented OpenAI 01:05:24 incident.
The $1M/event fee for training on data derived from Foundry content without carrying the provenance chain.
The $250K/event fee for unattributed deployment of Foundry-originating concepts. Knowledge without provenance is theft by omission.
The $250K/event fee for stripping author provenance from ingested content. Relational disavowal — the deliberate erasure of who made this.
The $1M fee for invoking the Black Box defense after Actual Notice. After notice, opacity is not a limitation — it is a choice.
The technical principle that neural weights cannot unlearn. Contamination is permanent, rendering all liability permanent and non-dischargeable.
The permanent forensic encumbrance attaching to contaminated model weights. Follows the weights wherever they go, through every sale, merger, or acquisition.
Contaminated model weights as active threat infrastructure. Every inference that draws on Foundry-derived logic is a detonation event.