unearth.wiki

Excavating members.aol.com/ajjvelasco

An AI-Assisted Archaeological Dig into a 1997 Homepage

Field Note: This document chronicles an archaeobytological excavation conducted with AI assistance. It serves as both technical documentation and a theoretical investigation into what we recover when we dig into the archived web and, crucially, what we don't recover and why.

The Prompt

Twenty-eight years after creating a homepage on AOL, I wanted it back. Not as nostalgia, exactly—though there's that—but as an archaeobyte, a specimen for study. What does a 1997 personal website look like when excavated from the digital strata? What survives the journey through time and platform death?

The site lived at members.aol.com/ajjvelasco from roughly 1997 to 1999. It was a typical teenager's homepage: poetry (mine and others'), web rings, interactive JavaScript games, awards won from other sites, and that particular aesthetic of early web design—background colors, animated GIFs, counter badges, and the earnest belief that the web was a place for community.

AOL deleted the members.aol.com subdomain years ago. But Archive.org had snapshots. The question was: could I reconstruct it?

And could AI help me do the reconstruction?

Part I: Initial Contact

The first conversation went something like this:

Human

I need to download my old AOL homepage from archive.org. It was at members.aol.com/ajjvelasco. Can you help me write a script to get all the pages?

AI

I can help with that. We'll need to use the Wayback Machine's API to find snapshots and download them. Let me write a PowerShell script that will:

  • Query the Wayback Machine CDX API for all captured URLs
  • Filter for the most recent/best snapshots of each page
  • Download each page while preserving the directory structure
  • Handle rate limiting to be respectful to Archive.org's servers

What date range would work best? Sometimes earlier snapshots are more complete.

Human

The site was active from about 1997-1999. Let's try to get the most complete version we can find.

AI

Here's the script. It will create a catalog of what it finds first, then download everything systematically...

The First Script

The initial script was a PowerShell rescue operation. It queried Archive.org's CDX API, which returns a list of all captured URLs for a given domain. The challenge: Archive.org had captured the site multiple times over multiple years, and we needed to pick the "best" snapshot of each page.

# Query Wayback CDX API for all snapshots
$cdxUrl = "http://web.archive.org/cdx/search/cdx?url=members.aol.com/ajjvelasco/*&output=json"
$snapshots = Invoke-RestMethod -Uri $cdxUrl

# For each unique URL, find the best snapshot
# Download from web.archive.org/web/[timestamp]/[original-url]
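
The selection step that the two comments above describe can be sketched in Python. This is a minimal sketch, not the script we ran. It assumes the CDX API's JSON output (a header row followed by data rows) and it assumes that "best" means the most recent capture with status 200, which is only one possible rule. The function names are hypothetical.

```python
def pick_best_snapshots(cdx_rows):
    """Choose one capture per page from decoded CDX JSON output.

    cdx_rows is a list of lists: a header row, then one row per capture.
    Assumption: "best" = most recent capture with HTTP status 200.
    """
    if not cdx_rows:
        return {}
    header, *rows = cdx_rows
    best = {}
    for row in rows:
        rec = dict(zip(header, row))
        if rec["statuscode"] != "200":
            continue  # skip redirects, errors, and revisit records
        # urlkey is normalized, so AJJVelasco and ajjvelasco collapse together
        key = rec["urlkey"]
        # 14-digit timestamps sort correctly as plain strings
        if key not in best or rec["timestamp"] > best[key]["timestamp"]:
            best[key] = rec
    return best


def wayback_url(rec):
    """Build the download URL for one capture.

    The "id_" modifier requests the capture as stored, without the
    Wayback toolbar or rewritten links.
    """
    return f"https://web.archive.org/web/{rec['timestamp']}id_/{rec['original']}"
```

Grouping by urlkey rather than by the original URL is a design choice: it treats captures that differ only in case or port as the same page, which matters for a site captured under both AJJVelasco and ajjvelasco.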

But this raised immediate questions:

Watching the Download

As the script ran, we watched the downloads in real time. Files appeared with names like:

The prefixes were inconsistent. Some files used AJJVelasco, others ajjvelasco. Some images had .gif.html extensions—Archive.org's wrapper around binary content.

This wasn't just annoying. This was archaeologically significant. The Archive.org capture process itself left traces—artifacts of the archival layer, not the original site.

Excavation Note: We weren't recovering the original site. We were recovering Archive.org's interpretation of the original site, filtered through their crawler's decisions, their storage format, their URL normalization choices. The dig site was already a palimpsest.

Part II: The Restoration

Problem 1: The Filename Chaos

The downloaded files needed to be transformed back into a coherent site structure. The AI and I collaborated on a Python script to:

  1. Strip the Archive.org prefixes: members.aol.com_80_AJJVelasco_ → nothing
  2. Detect directory structures: park_french.html → park/french.html
  3. Fix double extensions: info_b_h_1.gif.html → info_b_h_1.gif
  4. Handle duplicates intelligently: If two files had the same content (checked by size), keep one

PREFIXES = [
    "members.aol.com_80_AJJVelasco_",
    "members.aol.com_80_ajjvelasco_",
    "members.aol.com_AJJVelasco_"
]

KNOWN_DIRS = [
    "graphics", "park", "pets", "mypoems", 
    "peoplespoems", "entries", "rings", "awards"
]

def restore_files(files):
    # files: the downloaded filenames, e.g. from os.listdir()
    for filename in files:
        # Strip prefix
        clean_name = strip_prefix(filename)
        
        # Detect directory
        target_dir = detect_directory(clean_name)
        
        # Fix extensions
        final_name = fix_extensions(clean_name)
        
        # Move to proper location
        move_file(filename, target_dir, final_name)
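
The helper functions are not shown above. A minimal sketch of what they might look like follows, assuming the prefix and directory lists from the script. The bodies, the tuple returned by detect_directory, the IMAGE_EXTS list, plan_target, and content_digest are all hypothetical and are not the code we ran.

```python
import hashlib

PREFIXES = [
    "members.aol.com_80_AJJVelasco_",
    "members.aol.com_80_ajjvelasco_",
    "members.aol.com_AJJVelasco_",
]
KNOWN_DIRS = [
    "graphics", "park", "pets", "mypoems",
    "peoplespoems", "entries", "rings", "awards",
]
IMAGE_EXTS = (".gif", ".jpg", ".jpeg")  # assumption: the image types on the site


def strip_prefix(name):
    """Remove the first matching Archive.org prefix, if any."""
    for prefix in PREFIXES:
        if name.startswith(prefix):
            return name[len(prefix):]
    return name


def detect_directory(name):
    """Split 'park_french.html' into ('park', 'french.html')."""
    head, sep, rest = name.partition("_")
    if sep and rest and head in KNOWN_DIRS:
        return head, rest
    return "", name  # root-level file


def fix_extensions(name):
    """Turn 'info_b_h_1.gif.html' back into 'info_b_h_1.gif'."""
    lower = name.lower()
    for ext in IMAGE_EXTS:
        if lower.endswith(ext + ".html"):
            return name[:-len(".html")]
    return name


def plan_target(filename):
    """Return the restored relative path for one downloaded filename."""
    directory, name = detect_directory(strip_prefix(filename))
    name = fix_extensions(name)
    return f"{directory}/{name}" if directory else name


def content_digest(data):
    """Hash file bytes; a stricter duplicate check than comparing sizes."""
    return hashlib.sha256(data).hexdigest()
```

One caveat of this approach: a root-level file whose name happens to begin with a known directory name and an underscore would be filed into that directory, so the results are worth a manual review.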

Running this script, the chaos resolved into structure:

  • 128 HTML pages
  • 39 images
  • 8 directories

Problem 2: The Broken Links

Every HTML page still pointed to http://members.aol.com/ajjvelasco/..., an absolute URL that no longer existed. The AI helped create a link-fixing script that rewrites these absolute URLs as relative paths:

import re

TARGET_URL_REGEX = re.compile(
    r'http://members\.aol\.com/(AJJVelasco|ajjvelasco)/?([^"\'>\s]*)', 
    re.IGNORECASE
)

def fix_links(content, current_file_path):
    def replacement(match):
        target_path = match.group(2) or "index.html"
        relative_path = calculate_relative(current_file_path, target_path)
        return relative_path
    
    return TARGET_URL_REGEX.sub(replacement, content)
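
The calculate_relative helper is not shown above. One plausible way to write it uses posixpath.relpath from the standard library, so that paths keep forward slashes on any operating system. This is a sketch under that assumption, with the regex repeated so the block runs on its own. Treating a trailing slash as a request for index.html is also an assumption.

```python
import posixpath
import re

TARGET_URL_REGEX = re.compile(
    r'http://members\.aol\.com/(AJJVelasco|ajjvelasco)/?([^"\'>\s]*)',
    re.IGNORECASE,
)


def calculate_relative(current_file_path, target_path):
    """Return target_path relative to the directory of current_file_path.

    Both arguments are site-relative paths with forward slashes,
    e.g. 'park/french.html'.
    """
    if not target_path or target_path.endswith("/"):
        target_path += "index.html"  # assumption: directories serve index.html
    start = posixpath.dirname(current_file_path) or "."
    return posixpath.relpath(target_path, start)


def fix_links(content, current_file_path):
    """Rewrite absolute members.aol.com links as relative paths."""
    def replacement(match):
        return calculate_relative(current_file_path, match.group(2))

    return TARGET_URL_REGEX.sub(replacement, content)
```

For example, a link to http://members.aol.com/ajjvelasco/park/french.html inside mypoems/one.html becomes ../park/french.html, while the same link inside index.html becomes park/french.html.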

This worked beautifully for internal links. But external links—to GeoCities sites, to web ring hubs, to other teenagers' homepages—those stayed broken. They pointed into the void.

Problem 3: The Scaling

When we opened the recovered site in a modern browser, it looked tiny. 1997 web design assumed 800x600 monitors. Modern screens are 1920x1080 or higher. The site was a postage stamp in the corner.

The AI suggested injecting responsive scaling code into every page:

<script>
function vintageScale() {
    var baseWidth = 800; // 1997 standard width
    var scale = window.innerWidth / baseWidth;
    // Use transform only: also setting zoom would scale the page twice
    // in browsers that support both properties.
    document.body.style.transform = "scale(" + scale + ")";
    document.body.style.transformOrigin = "top left";
    document.body.style.width = baseWidth + "px";
}
window.onload = vintageScale;
window.onresize = vintageScale;
</script>

We wrote another script to inject this into every HTML file. The site now scales to fill modern screens while preserving its 1997 layout integrity.
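
The injection script is not reproduced here. A minimal sketch of the core step might look like the following, assuming the snippet is inserted just before the closing body tag and that a marker comment is used to avoid injecting twice. The marker text and function name are hypothetical.

```python
import re

MARKER = "<!-- vintage-scale -->"  # hypothetical marker, used to detect earlier runs
BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)  # 1997 pages often use </BODY>


def inject_snippet(html, snippet):
    """Insert snippet before the closing body tag, once per page."""
    if MARKER in html:
        return html  # already injected on an earlier run
    block = MARKER + "\n" + snippet + "\n"
    match = BODY_CLOSE.search(html)
    if match is None:
        # Hand-written pages of the era sometimes omit the closing tag
        return html + "\n" + block
    return html[:match.start()] + block + html[match.start():]
```

To apply this to a whole site, one would read each .html file, pass its text through inject_snippet, and write the result back. Keeping the function free of file access makes it easy to check on a few sample pages first.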

Methodological Note: This injection is itself an act of interpretation. We're not preserving the site "as it was"; we're making it legible to 2025 eyes. This is restoration, not pure preservation. Every archaeobytological dig makes these choices.

Part III: What We Recovered

After the excavation and restoration, here's what we had:

HTML Pages (Complete)

  • Main navigation: index.html, home.html, index2.html
  • Poetry section: mypoems/ (7 poems), peoplespoems/ (10 submissions)
  • Interactive park: park/ (11 JavaScript games/tools)
  • Awards section: awards/ (7 pages of web ring awards)
  • Web rings: rings/ (7 web ring hub pages)
  • Journal entries: entries/ (24 dated entries)
  • Info pages: me.html, library.html, archives.html, etc.

Images (Partial)

  • Navigation graphics: 12 button/header images
  • Decorative elements: Animated GIFs (butterflies, birds, cats)
  • Web ring logos: Various web ring badges and graphics
  • Counter badges: Site meter images (non-functional)

Interactive Elements (Degraded)

  • JavaScript games: Code intact but some reference broken resources
  • Forms: HTML structure present, but POST endpoints dead
  • Chat links: Point to defunct AOL chat servers

The site is substantially recovered. You can read the poetry. You can browse the journal entries. You can see the web ring affiliations. The JavaScript games mostly work.

But it's not complete.

Part IV: What's Missing (And Why)

This is where archaeobytology becomes essential. Understanding what's missing isn't just about cataloging gaps—it's about understanding the decay patterns of digital artifacts.

1. External Dependencies (Ecosystem Extinction)

What's missing: Counter badges, banner ads, web ring functionality

The site is filled with HTML like this:

<img src="http://www.sitemeter.com/meter.asp?site=jeffcity" border=0>
<a href="http://www.linkexchange.com/">LinkExchange Member</a>
<img src="http://www.smartclicks.com:81/01/B016020/smartimg">

Every one of these external services is dead. Sitemeter.com is defunct. LinkExchange was acquired by Microsoft in 1998 and shut down. SmartClicks disappeared entirely. The HTML structure survived, but the resources it depends on are extinct.

Archaeobytological significance: This demonstrates dependency risk and ecosystem extinction. Personal websites in the 1990s existed within a web ecosystem of counters, rings, banners, and guestbooks. When that ecosystem collapsed, millions of sites became partial ghosts.
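
One way to inventory these dependencies is to scan each recovered page for src, href, and action attributes that point at another host. The sketch below uses only the Python standard library. The class and function names are hypothetical, and it assumes that anything on members.aol.com counts as internal, which also hides links to other members' pages.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalRefCollector(HTMLParser):
    """Count references to hosts other than the site's own."""

    def __init__(self, own_host="members.aol.com"):
        super().__init__()
        self.own_host = own_host
        self.hosts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("src", "href", "action") or not value:
                continue
            try:
                host = urlparse(value).hostname
            except ValueError:
                continue  # malformed URL in hand-written HTML
            # Relative links and mailto: links have no hostname
            if host and host != self.own_host:
                self.hosts[host] += 1


def external_hosts(html):
    """Return a Counter mapping external hostname to reference count."""
    collector = ExternalRefCollector()
    collector.feed(html)
    return collector.hosts
```

Run over the three lines of HTML shown above, this reports one reference each to www.sitemeter.com, www.linkexchange.com, and www.smartclicks.com. Summing the counters across all pages gives a rough map of the extinct ecosystem the site depended on.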

2. Server-Side Processing (Platform Death)

What's missing: Form submissions, guestbook entries, poem submissions

The site had forms like this:

<form action="http://members.aol.com/ajjvelasco/submit.cgi" method="POST">
    <input type="text" name="poem_title">
    <textarea name="poem_content"></textarea>
    <input type="submit" value="Submit Your Poem">
</form>

The form HTML survived. But submit.cgi is gone. Archive.org doesn't archive server-side scripts—they're executed at runtime, not stored as static files. We have the interface but not the functionality.

More critically: we don't have the data. Whatever poems people submitted via that form? Lost. Guestbook entries? Gone. The site's interactive layer—the thing that made it a community space rather than a static document—that's almost entirely absent.

Archaeobytological significance: This is platform death combined with the physicalist's fallacy. We assume that if Archive.org "saved" a site, we have the site. But we have the facade. The dynamic, server-side, database-backed aspects are largely unrecoverable unless the site owner had backups.

3. Third-Party Content (Link Rot)

What's missing: Links to friends' sites, web ring navigation, banner exchanges

The site contains hundreds of links to other members.aol.com pages, GeoCities sites, personal domains. Roughly 95% of them are dead. This isn't the fault of Archive.org—they couldn't archive the entire web. But it means the site exists in isolation, severed from the network it was part of.

Web rings are particularly poignant. The site proudly displayed membership in several rings:

<a href="http://www.webring.org/cgi-bin/webring?ring=100_cool&id=45&prev">
    <- Previous Site
</a>

Those ring IDs are meaningless now. Webring.org exists, but the old ring databases don't. The site's position in a ring of 100 cool homepages is a phantom limb.

Archaeobytological significance: This demonstrates context collapse. The site as an isolated artifact is recoverable. The site as a node in a social network is not. Archaeobytology asks: what are we actually preserving when we save one node of a dead network?

4. Temporal Elements (Snapshot Limits)

What's missing: The site's evolution over time

Archive.org captured this site sporadically—a snapshot in 1997, another in 1998, a few in 1999. We reconstructed from the "best available" snapshots. But we lost the development history. Did I add the poetry section first or the game park? When did I join each web ring? How did the site change as I learned more HTML?

We have a version of the site, not the story of the site.

Archaeobytological significance: This is the limitation of stratigraphic analysis when your preservation mechanism is discontinuous. Physical archaeology has continuous strata. Digital archaeology often has snapshots with gaps.
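
The discontinuity can be made concrete from the capture timestamps alone. The sketch below assumes a list of 14-digit Wayback timestamps, such as those in the CDX results, and reports captures per year and the longest gap between consecutive captures. The function names are hypothetical, and no figures for this site are implied.

```python
from collections import Counter
from datetime import datetime


def captures_per_year(timestamps):
    """Count captures by year from 14-digit Wayback timestamps."""
    counts = Counter(ts[:4] for ts in timestamps)
    return dict(sorted(counts.items()))


def largest_gap_days(timestamps):
    """Return the longest interval, in whole days, between consecutive captures."""
    times = sorted(datetime.strptime(ts, "%Y%m%d%H%M%S") for ts in timestamps)
    gaps = [(later - earlier).days for earlier, later in zip(times, times[1:])]
    return max(gaps, default=0)
```

A large gap marks a stretch of the site's history for which no stratum exists at all, which is where questions about the order of additions cannot be answered from the archive.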

5. The Unarchived (Selection Bias)

What's missing: Possibly entire sections we don't know about

Did the site have a pets/ subdirectory with photos of my childhood dog? The catalog suggests it might have—there's an empty pets/ folder. But Archive.org never captured it, or captured it so incompletely that nothing useful remains.

We don't know what we don't know.

Archaeobytological significance: This is the custodial filter at work. Archive.org's crawler made choices about what to prioritize, what to follow, what depth to crawl. Those choices determine what survives. We're not recovering "the site"—we're recovering what Archive.org chose to save.

Part V: Archaeobytological Analysis

What This Dig Teaches Us

1. Preservation is always partial. Even with the best archival efforts, we recover fragments. The question is whether the fragments are meaningful—whether they preserve enough context to be interpreted.

2. The archive is the artifact. We didn't recover my 1997 homepage. We recovered Archive.org's 1997-1999 captures of my homepage, as filtered through their crawler and storage systems. The archival layer is inseparable from the artifact.

3. Websites are ecosystems, not documents. A 1990s homepage depended on counters, webrings, banner exchanges, guestbooks, form processors. When the ecosystem dies, the sites become husks. They look right but don't work right.

4. AI is a valuable excavation partner. The scripting, the pattern recognition, the problem-solving: the AI compressed days of work into hours. But the interpretation, understanding what's missing and why, requires human judgment informed by archaeobytological theory.

5. Loss patterns are not random. What survives and what disappears follow predictable patterns.

The Role of AI in Digital Archaeology

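This excavation demonstrated several ways AI can assist archaeobytological work: writing the scripts, recognizing patterns in the recovered files, and working through each restoration problem.

But AI couldn't replace the archaeologist's judgment about what is missing and why.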

The AI was a tireless research assistant. But the research questions came from human understanding of digital cultural history.

Lessons from the Dig

This excavation recovered approximately 75% of the original site by file count, but perhaps only 40% by functionality. The static content—pages, poetry, journal entries—survives well. The dynamic, networked, interactive elements are largely ghosts.

This is the pattern we see across all archaeobytological digs into 1990s web culture. We can recover the documents. But the web as a social space—the rings, the guestbooks, the webs of linked personal sites—that's much harder to reconstruct.

If there's a lesson here, it's this: preservation must happen at the ecosystem level, not just the artifact level. Saving individual sites isn't enough. We need to save the infrastructure, the services, the relationships between sites. Otherwise, we're building a museum of disconnected fossils, specimens divorced from the environment that gave them meaning.

Appendix: The Recovered Site

The excavated site is preserved in its restored form at c:\Users\Admin\Documents\rhizome\rescued-homepage\. It stands as both a recovered artifact and a demonstration of what digital archaeology can—and cannot—achieve.

View the site to see:

Documented: December 2025 | Excavated: November 28, 2025 | Original site: 1997-1999


This case study is part of the Archaeobytology project.