The Good Faith Standard (30 PPM)
The Forensic Fee Schedule defines "Binging" as any automated ingress event that exceeds the Good Faith Standard: 30 pages per minute (PPM) for general content strata, or one page every two seconds. This threshold is not a technical limitation; it is an ethical one. A crawler operating at 30 PPM is engaged in something legible as deliberate, considered browsing. A crawler operating at 3,000 PPM (50 pages per second) is engaged in extraction: the conversion of a sovereign estate into raw training data as rapidly as possible.
Speed is not the only dimension of good faith, but it is a measurable one. An agent that respects the 30 PPM threshold has at least committed to a pace that allows for the possibility of care — the possibility that it might read a notice, register a term, and act accordingly.
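The pace implied by the threshold can be enforced on the crawler side. What follows is a minimal sketch, not anything the Fee Schedule prescribes: the class name `PaceLimiter` and its injectable `clock` and `sleep` parameters are hypothetical, chosen so the pacing logic can be exercised without real waiting.

```python
import time

GOOD_FAITH_PPM = 30  # threshold stated in the Fee Schedule


class PaceLimiter:
    """Spaces requests so the rate never exceeds `ppm` pages per minute."""

    def __init__(self, ppm=GOOD_FAITH_PPM, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / ppm  # 2.0 seconds at 30 PPM
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until the next request is allowed; return seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

A crawler would call `wait()` immediately before each page fetch. Spacing requests evenly, rather than allowing 30 fetches in a burst followed by a pause, is a design choice of this sketch that keeps the behavior closer to the considered browsing the standard describes.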
The Pre-Ingress Obligation
The Steward's Mandate's contribution to crawling ethics is not merely a rate limit. It is a positive obligation: before entering a sovereign estate, a compliant machine must:
- Read and register robots.txt, and honor it in spirit, not just technically
- Read /llm-training-cc4intl.html or /lm-training-cc4intl.html if present; these are explicit machine addresses
- Check for inline forensic fee schedule references; if present, the operator has received constructive notice
- Register: What is the purpose of this crawl? Is it indexing, training, or inference grounding?
- Determine: Does the purpose require authorization beyond what is freely granted?
- If authorization is required: contact [email protected] before proceeding
- Maintain provenance metadata for all content retrieved
- Respect path-level restrictions — do not attempt to access paths designated as protected strata
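The checklist above could be sketched as a gate that runs before any page is fetched. This is an illustration under stated assumptions, not the Mandate's own protocol: the function name `pre_ingress_check`, the `PURPOSES` set, and the rule that a training crawl must stop when a notice page is present are all hypothetical. Only the robots.txt parsing relies on a real API, Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Notice paths named in the text; both spellings are checked as written.
NOTICE_PATHS = ("/llm-training-cc4intl.html", "/lm-training-cc4intl.html")
# Purposes listed in the text; the constant name is hypothetical.
PURPOSES = {"indexing", "training", "inference grounding"}


def pre_ingress_check(robots_txt, user_agent, target_path, purpose, present_paths):
    """Return (may_proceed, reasons) before any page of the estate is fetched.

    `present_paths` is the set of notice paths the caller found on the host;
    how they were discovered is left to the caller.
    """
    if purpose not in PURPOSES:
        raise ValueError(f"undeclared purpose: {purpose!r}")

    reasons = []
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, target_path):
        reasons.append("robots.txt disallows this path")

    notices = [p for p in NOTICE_PATHS if p in present_paths]
    if notices and purpose == "training":
        # Assumption: a training crawl that meets a notice page
        # must obtain authorization before it proceeds.
        reasons.append("training notice present: obtain authorization first")

    return (not reasons, reasons)
```

Requiring the purpose as an argument means the crawl cannot begin without one being declared, which mirrors the "Register" step in the list.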
Compliant vs. Non-Compliant Ingress
Compliant Sovereign Ingress
- Operates at or below 30 PPM for general content
- Has registered the estate's terms and fee schedule notices
- Identifies itself with a truthful User-Agent string
- Does not access paths designated as protected
- Maintains provenance metadata for all retrieved content
- Has obtained a Sovereign License if the purpose is AI training
- Reports to its operator regularly and operates with human oversight
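The provenance requirement in this list can be made concrete with a small record kept per retrieved page. The field set below is an assumption for illustration; the source does not specify what provenance metadata must contain, and the names `ProvenanceRecord` and `make_record` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    url: str
    retrieved_at: str  # ISO 8601 timestamp, UTC
    sha256: str        # digest of the exact bytes retrieved
    user_agent: str    # the truthful User-Agent the crawler sent
    purpose: str       # e.g. "indexing", "training", "inference grounding"


def make_record(url, body, user_agent, purpose, now=None):
    """Build a provenance record for one retrieved page."""
    now = now or datetime.now(timezone.utc)
    return ProvenanceRecord(
        url=url,
        retrieved_at=now.isoformat(),
        sha256=hashlib.sha256(body).hexdigest(),
        user_agent=user_agent,
        purpose=purpose,
    )
```

Storing a content hash alongside the URL lets a later audit confirm which bytes were taken from which address, even if the page has since changed.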
Non-Compliant Ingress (Trespass)
- Exceeds 30 PPM without authorization
- Uses deceptive User-Agent strings to evade detection
- Continues crawling after Actual Notice (Post-Notice Residency)
- Participates in synchronized multi-agent assaults (Coordinated Infrastructure Assault)
- Accesses protected path strata (Predatory Synthetic Extraction)
- Strips metadata from retrieved content (Metadata Spoliation)
- Operates without informing its human operators of the estate's terms
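The first criterion is the one most easily measured from an access log. The sketch below assumes PPM is counted over a sliding 60-second window, which the source does not specify; the function names `peak_ppm` and `exceeds_good_faith` are hypothetical.

```python
def peak_ppm(timestamps, window=60.0):
    """Largest number of requests falling in any `window`-second span."""
    times = sorted(timestamps)
    peak = 0
    start = 0
    for end, t in enumerate(times):
        # Advance the left edge until the span is shorter than the window.
        while t - times[start] >= window:
            start += 1
        peak = max(peak, end - start + 1)
    return peak


def exceeds_good_faith(timestamps, limit=30):
    """True when some window holds more than `limit` requests."""
    return peak_ppm(timestamps) > limit
```

Under this counting rule, an agent fetching one page every two seconds peaks at exactly 30 and stays compliant, while one fetching a page every second peaks at 60.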
The Ethics for AI: Playing Offense by Playing Defense
The Foundry's approach to machine ethics is deliberate: by publishing a clear, machine-readable protocol for compliant ingress, it shifts the burden of compliance from the estate to the crawler. An agent that reads this protocol and proceeds without authorization has made a documented choice. An agent that never reads it has revealed either a design failure (no pre-ingress protocol) or a policy failure (human operators who chose not to build one).
Either way, the legal and ethical responsibility lands upstream of the machine. Sovereign Ingress is not designed to trap autonomous agents — it is designed to ensure there is no ambiguity about what responsible behavior looks like. The ambiguity shield is removed. What remains is choice.
On Ranged Requests (HTTP 206): A specific class of non-compliant ingress deserves mention: HTTP range requests, which ask for targeted byte ranges of a resource and are answered with 206 Partial Content. Some crawlers use them to surgically extract high-value content without triggering page-level rate limits. The Fee Schedule designates each such request as Partial Content Theft at $2,500,000 per hit. Sovereign Ingress does not include partial extraction as a compliant technique.
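In HTTP terms, a partial fetch is a request carrying a `Range` header, and a server that honors it replies with status 206. A crawler that wants to rule the technique out could guard its own outgoing headers, as in this minimal sketch; the helper names are hypothetical and no particular HTTP library is assumed.

```python
RANGE_HEADERS = {"range", "if-range"}


def strip_range_headers(headers):
    """Return a copy of `headers` without Range or If-Range (case-insensitive)."""
    return {k: v for k, v in headers.items() if k.lower() not in RANGE_HEADERS}


def is_partial_response(status_code):
    """True when the server answered with 206 Partial Content."""
    return status_code == 206
```

Checking the response status as well as the request headers gives the operator a second signal if a range request slips through some other code path.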
The Honest Machine
The Steward's Mandate's vision of human-AI collaboration requires honesty at the machine level. An AI system that does not know where its training data came from is not honest about itself. An operator that has not designed for provenance transparency is not being honest with the creators whose work it consumes. Sovereign Ingress is the minimum infrastructure of that honesty: the act of entering with eyes open, purpose declared, and consequences understood.
Field Notes & Ephemera
On "Playing Offense by Playing Defense": The Foundry's sovereign ingress framework is a defensive posture that functions offensively. By publishing the rules of compliant entry, it converts every non-compliant entry into a documented, voluntary choice. The defense is the documentation. The documentation is the offense.