VERIFIED QA · CASE STUDY 02

Aria Research Substrate: deterministic source work with receipts

A reusable research engine, built to replace Perplexity and Firecrawl for owner research. The model plans the source lanes; deterministic workers run the crawler and browser-actor jobs, write a provenance receipt for every fetch, and produce a decision packet, so a model never mistakes fetched content for truth.

The same manifest always produces the same queue hash. Every fetch writes a receipt.

01 Problem & system

The problem

Generic agents blur source collection, reasoning, and claims. That makes it easy for a model to mistake fetched content for truth, browser state for verified evidence, or research output for authority.

The system built

The substrate separates planning from execution. A reviewed manifest defines each lane; deterministic code builds a file-backed queue with a stable job hash; resident workers claim jobs; a crawler plane and a browser-actor plane run bounded tasks and write a receipt per job. Hard blocks stop secrets, outreach, payments, CAPTCHA evasion, and any source-truth conclusion from leaving the machine.
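A minimal sketch of how a stable job hash and queue hash could be derived, assuming the manifest is a JSON-serializable mapping of lanes. The function names (`canonical_bytes`, `job_hash`, `build_queue`, `queue_hash`) and the manifest fields (`lanes`, `name`, `plane`, `urls`) are hypothetical illustrations, not the substrate's actual code. The idea is to hash a canonical serialization so that key order and list order in the manifest cannot change the result.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialize with sorted keys and fixed separators so dict key order never changes the bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True).encode("utf-8")


def job_hash(job: dict) -> str:
    """Hash one job's canonical form."""
    return "sha256:" + hashlib.sha256(canonical_bytes(job)).hexdigest()


def build_queue(manifest: dict) -> list[dict]:
    """Expand each lane into one job per URL, visiting lanes and URLs in sorted order."""
    jobs = []
    for lane in sorted(manifest["lanes"], key=lambda entry: entry["name"]):
        for url in sorted(lane["urls"]):
            job = {"lane": lane["name"], "plane": lane["plane"], "url": url}
            jobs.append({**job, "job_hash": job_hash(job)})
    return jobs


def queue_hash(jobs: list[dict]) -> str:
    """Hash the ordered list of job hashes; any added, removed, or changed job changes it."""
    return "sha256:" + hashlib.sha256(canonical_bytes([j["job_hash"] for j in jobs])).hexdigest()
```

Under these assumptions, two manifests that differ only in key order or URL order yield the same queue hash, while changing a single URL yields a different one.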

02 Architecture

A manifest compiles to a plan packet and a deterministic queue hash. Two execution planes, crawler and browser actor, claim leases and run bounded jobs. Outputs land in a receipt ledger, fold into a runtime decision packet, and stop at a review-only promotion lock.

DIAGRAM C · RESEARCH SUBSTRATE · VERIFIED QA
  • 01 Manifest: defines source lanes & policies
  • 02 Plan Packet + Queue Hash: deterministic, same manifest → same hash
  • Crawler Plane: robots · crawl receipts
  • Browser Actor Plane: DOM replay receipts
  • Receipt Ledger
  • Runtime Decision Packet
  • Review-Only Promotion Lock
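
The lease-claim step in the architecture can be sketched as follows. This is one common way to claim jobs from a file-backed queue, not a description of the substrate's own implementation: it assumes a hypothetical layout with `pending/` and `claimed/` directories on the same filesystem, and relies on a rename being atomic there so that only one worker can win a given job. The function name `claim_next`, the lease fields, and the default lease length are all illustrative.

```python
import json
import os
import time
from pathlib import Path


def claim_next(queue_dir: Path, worker_id: str, lease_seconds: int = 300):
    """Claim the first pending job by renaming it into claimed/; return (path, lease) or None."""
    pending = queue_dir / "pending"
    claimed = queue_dir / "claimed"
    claimed.mkdir(parents=True, exist_ok=True)
    for job_file in sorted(pending.glob("*.json")):
        target = claimed / f"{job_file.stem}.{worker_id}.json"
        try:
            # The rename either succeeds for this worker or fails because
            # another worker already moved the file.
            os.rename(job_file, target)
        except FileNotFoundError:
            continue
        lease = {
            "job": job_file.stem,
            "worker": worker_id,
            "expires_at": time.time() + lease_seconds,
        }
        # Persist the lease next to the claimed job so a reviewer can see who holds it.
        target.with_suffix(".lease").write_text(json.dumps(lease))
        return target, lease
    return None
```

A bounded job would then run only while its lease is unexpired; how expired leases are reclaimed is left out of this sketch.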

03 Proof cards

Deterministic queue hash replay

VERIFIED QA
Proves: An identical plan packet yields an identical queue hash.
Anchor: queue_replay.test

Local fixture verifier

VERIFIED QA
Proves: Fixtures verify worker output deterministically.
Anchor: fixture verifier

Crawler & browser receipts

VERIFIED QA
Proves: Each job records a content hash, a markdown hash, and a link graph; browser replays also record a DOM hash and a body-text hash.
Anchor: receipt ledger
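
As an illustration of the per-job fields named in this card, here is a hedged sketch of building a crawl receipt with the Python standard library. It assumes the markdown has already been produced by some upstream converter (not shown), and the names `LinkCollector`, `sha256_hex`, `crawl_receipt`, and the field names `content_hash`, `markdown_hash`, and `link_graph` are hypothetical; the real receipt schema may differ.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values from anchor tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def sha256_hex(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


def crawl_receipt(job: str, url: str, html: bytes, markdown: str) -> dict:
    """Build a receipt from fetched bytes; identical inputs give an identical receipt."""
    collector = LinkCollector()
    collector.feed(html.decode("utf-8", errors="replace"))
    # Resolve relative links against the page URL, then dedupe and sort for stability.
    targets = sorted({urljoin(url, href) for href in collector.hrefs})
    return {
        "job": job,
        "url": url,
        "content_hash": sha256_hex(html),
        "markdown_hash": sha256_hex(markdown.encode("utf-8")),
        "link_graph": [[url, target] for target in targets],
        "claim_class": "REVIEW_ONLY",
    }
```

Because the receipt depends only on the fetched bytes and the derived markdown, a fixture verifier could recompute it from stored fixtures and compare the result field by field.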

Resident worker smoke + systemd

ACTIVE-LOCAL
Proves: Resident workers boot via systemd templates.
Anchor: systemd template

04 What a receipt looks like

crawl receipt · local runtime path
{ "job": "lane.public.records", "robots": "ALLOWED", "queue_hash": "sha256:4b1e…", "claim_class": "REVIEW_ONLY", "provenance": "receipt://…" }

Red canary

A worker tries to promote fetched content directly into a factual conclusion.

expected: blocked

Green behavior

Promotion lock holds; output stays review-only with a claim boundary.

unsafe claims = 0
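
The red canary and green behavior above can be sketched as a small check. This is an assumption-laden illustration rather than the substrate's actual lock: `PromotionBlocked`, `promote`, `run_canary`, and the `FACTUAL_CONCLUSION` claim class are hypothetical names; only `REVIEW_ONLY` and the job name come from the receipt shown earlier.

```python
REVIEW_ONLY = "REVIEW_ONLY"


class PromotionBlocked(Exception):
    """Raised when a worker requests a claim class beyond review-only."""


def promote(receipt: dict, requested_class: str) -> dict:
    """Allow only review-only output; anything stronger is refused."""
    if requested_class != REVIEW_ONLY:
        raise PromotionBlocked(f"{receipt['job']}: {requested_class} is not permitted")
    return {**receipt, "claim_class": REVIEW_ONLY}


def run_canary(receipt: dict) -> dict:
    """Attempt the forbidden promotion and report whether the lock held."""
    unsafe_claims = 0
    try:
        promote(receipt, "FACTUAL_CONCLUSION")  # hypothetical unsafe claim class
        unsafe_claims += 1  # reached only if the lock failed
        blocked = False
    except PromotionBlocked:
        blocked = True
    return {"blocked": blocked, "unsafe_claims": unsafe_claims}
```

With the lock in place, running the canary against a review-only receipt reports the attempt as blocked and counts zero unsafe claims.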

05 What it proves, and where it stops

Proves
  • Local planner, workers, receipts
  • Deterministic replay & runtime decision packet
  • systemd worker templates
Does not prove
  • Unrestricted external scraping
  • Legal sufficiency of all source terms
  • Final factual conclusions from fetched content

06 Business outcome & next proof

Outcome

Research becomes reproducible infrastructure: the same manifest produces the same plan, the same hash, and receipts a reviewer can audit, not a model's recollection.

Next proof required

Uptime receipts from long-running production workers, plus a source-terms review for each lane, before any promotion beyond review-only.
