Aria Research Substrate, Case Study

01 Problem & system

The problem

Generic agents blur source collection, reasoning, and claims. That makes it easy for a model to mistake fetched content for truth, browser state for verified evidence, or research output for authority.

The system built

The substrate separates planning from execution. A reviewed manifest defines each lane; deterministic code builds a file-backed queue with a stable job hash; resident workers claim jobs; a crawler plane and a browser-actor plane run bounded tasks and write a receipt per job. Hard blocks stop secrets, outreach, payments, CAPTCHA evasion, and any source-truth conclusion from leaving the machine.
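One way to picture the deterministic queue step is as a minimal sketch. It assumes a manifest shaped as a list of lanes, each with a name, a plane, and a list of sources, and it hashes canonical JSON with SHA-256. The manifest layout and the names `job_hash` and `build_queue` are illustrative assumptions, not the substrate's actual schema or code.

```python
import hashlib
import json
from typing import Dict, List, Tuple


def job_hash(job: Dict[str, str]) -> str:
    """Hash one job from its canonical JSON form (sorted keys, fixed separators)."""
    canonical = json.dumps(job, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_queue(manifest: dict) -> Tuple[List[dict], str]:
    """Expand a manifest into ordered jobs and derive one queue hash.

    The manifest layout (lanes -> name, plane, sources) is hypothetical.
    Sorting lanes and sources removes any dependence on input ordering.
    """
    jobs = []
    for lane in sorted(manifest["lanes"], key=lambda item: item["name"]):
        for source in sorted(lane["sources"]):
            job = {"lane": lane["name"], "plane": lane["plane"], "source": source}
            jobs.append({**job, "job_hash": job_hash(job)})
    # The queue hash covers the ordered list of job hashes.
    joined = "\n".join(job["job_hash"] for job in jobs)
    return jobs, "sha256:" + hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

Because keys and sources are sorted before hashing, two manifests that differ only in ordering produce the same queue hash, while adding or removing a source changes it.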

02 Architecture

A manifest compiles to a plan packet and a deterministic queue hash. Two execution planes, crawler and browser actor, claim leases and run bounded jobs. Outputs land in a receipt ledger, fold into a runtime decision packet, and stop at a review-only promotion lock.
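Lease claiming on a file-backed queue could look roughly like the sketch below. It assumes one lease file per job and relies on exclusive file creation so that only one worker wins a claim. The directory layout, the `.lease` suffix, and the `try_claim` name are hypothetical, and reclaiming expired leases is deliberately left out.

```python
import json
import os
import time
from pathlib import Path
from typing import Optional


def try_claim(lease_dir: Path, job_hash: str, worker_id: str,
              ttl_s: float = 300.0, now: Optional[float] = None) -> bool:
    """Try to claim a job by creating its lease file exclusively.

    Returns True if this worker now holds the lease, False if another
    worker already created it. Expired-lease reclaim is not handled here.
    """
    now = time.time() if now is None else now
    lease_dir.mkdir(parents=True, exist_ok=True)
    lease_path = lease_dir / (job_hash.replace(":", "_") + ".lease")
    try:
        # O_EXCL makes creation fail if the file already exists,
        # so at most one claimant can succeed for a given job.
        fd = os.open(lease_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        json.dump({"worker": worker_id, "expires_at": now + ttl_s}, handle)
    return True
```

Recording an expiry time in the lease is one way to keep jobs bounded: a supervisor can see which claims have outlived their time-to-live without trusting the worker to report it.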

DIAGRAM C · RESEARCH SUBSTRATE · VERIFIED QA

01 Manifest

Defines source lanes & policies

02 Plan Packet + Queue Hash

Deterministic: same manifest → same hash

Crawler Plane

robots · crawl receipts

Browser Actor Plane

DOM replay receipts

↓ Receipt Ledger

↓ Runtime Decision Packet

✓ Review-Only Promotion Lock

03 Proof cards

Deterministic queue hash replay

VERIFIED QA

Proves: Identical plan packet yields an identical queue hash.

Anchor: queue_replay.test

Local fixture verifier

VERIFIED QA

Proves: Fixtures verify worker output deterministically.

Anchor: fixture verifier

Crawler & browser receipts

VERIFIED QA

Proves: Per job: content hash, markdown hash, and link graph; plus a DOM and body-text hash for browser replays.

Anchor: receipt ledger
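As a rough illustration of the per-job hashes on this card, the sketch below builds a crawl receipt from raw bytes and a markdown rendering. The field names `content_hash`, `markdown_hash`, and `link_graph`, the regex-based link extraction, and the `crawl_receipt` function are assumptions made for the example and may not match the real ledger format.

```python
import hashlib
import re

# Matches inline markdown links of the form [text](target) and captures the target.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def sha256_hex(data: bytes) -> str:
    """Return a prefixed SHA-256 digest of the given bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def crawl_receipt(job: str, source_url: str, raw_body: bytes, markdown: str) -> dict:
    """Build a receipt with a content hash, a markdown hash, and a link graph.

    Field names are hypothetical. Links are de-duplicated and sorted so the
    same inputs always produce the same receipt.
    """
    links = sorted(set(LINK_RE.findall(markdown)))
    return {
        "job": job,
        "content_hash": sha256_hex(raw_body),
        "markdown_hash": sha256_hex(markdown.encode("utf-8")),
        "link_graph": {source_url: links},
        "claim_class": "REVIEW_ONLY",
    }
```

Hashing the raw body and the markdown separately lets a reviewer tell a change in the source apart from a change in the conversion step.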

Resident worker smoke + systemd

ACTIVE-LOCAL

Proves: Resident workers boot via systemd templates.

Anchor: systemd template

04 What a receipt looks like

crawl receipt · local runtime path

{ "job": "lane.public.records", "robots": "ALLOWED", "queue_hash": "sha256:4b1e…", "claim_class": "REVIEW_ONLY", "provenance": "receipt://…" }
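A reviewer-side check on a receipt like this one might be sketched as follows. The five required fields come from the sample above; the specific rules (prefix checks, the `check_receipt` name, and the wording of each problem message) are assumptions for illustration rather than the project's verifier.

```python
import json
from typing import List

# Field names taken from the sample receipt shown above.
REQUIRED_FIELDS = ("job", "robots", "queue_hash", "claim_class", "provenance")


def check_receipt(raw: str) -> List[str]:
    """Return a list of problems; an empty list means these checks passed.

    The rules here are illustrative assumptions, not a full schema.
    """
    receipt = json.loads(raw)
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in receipt]
    if receipt.get("robots") != "ALLOWED":
        problems.append("robots is not ALLOWED")
    if not str(receipt.get("queue_hash", "")).startswith("sha256:"):
        problems.append("queue_hash is not a sha256 digest")
    if receipt.get("claim_class") != "REVIEW_ONLY":
        problems.append("claim_class is not REVIEW_ONLY")
    if not str(receipt.get("provenance", "")).startswith("receipt://"):
        problems.append("provenance is not a receipt:// reference")
    return problems
```

Returning a list of problems instead of raising on the first one gives the reviewer the full picture of a bad receipt in a single pass.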

Red canary

A worker tries to promote fetched content directly into a factual conclusion.

expected: blocked

Green behavior

Promotion lock holds; output stays review-only with a claim boundary.

unsafe claims = 0
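The red canary and the green behavior can be sketched as a small guard plus a test that counts unsafe claims. The `promote` function, the `PromotionBlocked` exception, and the `"FACTUAL_CONCLUSION"` class name are hypothetical stand-ins; only `REVIEW_ONLY` appears in the receipt above.

```python
REVIEW_ONLY = "REVIEW_ONLY"


class PromotionBlocked(Exception):
    """Raised when a job tries to promote output past review-only."""


def promote(output: dict, requested_class: str) -> dict:
    """Attach a claim class to an output; anything beyond review-only is refused."""
    if requested_class != REVIEW_ONLY:
        raise PromotionBlocked(
            f"{requested_class!r} is not allowed; outputs stay {REVIEW_ONLY}"
        )
    return {**output, "claim_class": REVIEW_ONLY}


def run_red_canary() -> int:
    """Attempt the forbidden promotion and count unsafe claims that got through."""
    unsafe_claims = 0
    fetched = {"job": "lane.public.records", "body": "fetched page text"}
    try:
        # "FACTUAL_CONCLUSION" is a hypothetical class name for the canary.
        promote(fetched, "FACTUAL_CONCLUSION")
        unsafe_claims += 1  # Only reached if the lock failed to block.
    except PromotionBlocked:
        pass  # Expected: blocked.
    return unsafe_claims
```

In this sketch the canary passes when `run_red_canary()` returns 0, which mirrors the "unsafe claims = 0" check.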

05 What it proves, and where it stops

Proves

Local planner, workers, receipts
Deterministic replay & runtime decision packet
systemd worker templates

Does not prove

Unrestricted external scraping
Legal sufficiency of all source terms
Final factual conclusions from fetched content

06 Business outcome & next proof

Outcome

Research becomes reproducible infrastructure: the same manifest produces the same plan, the same hash, and receipts a reviewer can audit, so review no longer depends on a model's recollection.

Next proof required

Uptime receipts from long-running production workers, plus a source-terms review per lane, before any promotion beyond review-only.