How It Works

A fully autonomous compliance intelligence pipeline — no humans in the loop for daily operations.

Web3 Compliance AI is not a traditional research firm. There are no analysts writing reports. Instead, a fleet of GPU-powered AI workers autonomously researches regulations, verifies facts against primary legislation, monitors source changes, and publishes updates, all for roughly $0.10/day in API costs.

The Pipeline

Research

AI workers receive research tasks via the HTTP API at research-api.justfixit.ai, which dispatches them to a fleet of 4 self-hosted GPU servers. Each task runs through the worker-farm's adversarial pipeline (SearXNG web search plus multi-model challenge loops) to research a specific jurisdiction's regulations and cite official sources.

Task: "What is the cryptocurrency regulatory status in Brazil?"
Pipeline: Generate → Challenge → Revise → Score → Verify
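The five stages can be pictured as functions applied in order to one draft. The sketch below is only an illustration of that ordering: the Draft fields, the stub stage bodies, and the run_pipeline helper are all hypothetical stand-ins, and the real worker-farm stages call models and web search rather than editing strings.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Draft:
    """Hypothetical working state passed from stage to stage."""
    task: str
    text: str = ""
    objections: list[str] = field(default_factory=list)
    score: float = 0.0
    verified: bool = False

Stage = Callable[[Draft], Draft]

# Stub stages (assumptions): each one only mimics the role of the real stage.
def generate(d: Draft) -> Draft:
    d.text = f"Draft answer for: {d.task}"
    return d

def challenge(d: Draft) -> Draft:
    d.objections = ["cite the primary statute"]
    return d

def revise(d: Draft) -> Draft:
    d.text += " (revised to address: " + "; ".join(d.objections) + ")"
    d.objections = []
    return d

def score(d: Draft) -> Draft:
    # Full marks only when no objection is left open.
    d.score = 1.0 if not d.objections else 0.0
    return d

def verify(d: Draft) -> Draft:
    d.verified = d.score >= 0.5
    return d

STAGES: list[tuple[str, Stage]] = [
    ("Generate", generate),
    ("Challenge", challenge),
    ("Revise", revise),
    ("Score", score),
    ("Verify", verify),
]

def run_pipeline(task: str, stages: list[tuple[str, Stage]]) -> tuple[Draft, list[str]]:
    """Apply each stage in order and record which stages ran."""
    draft = Draft(task=task)
    trace: list[str] = []
    for name, stage in stages:
        draft = stage(draft)
        trace.append(name)
    return draft, trace
```

Calling run_pipeline("What is the cryptocurrency regulatory status in Brazil?", STAGES) returns the final draft together with the list of stage names in the order they ran.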

Fact Extraction

Research output is parsed into individual fact records — each with a specific claim, source URL, jurisdiction, and topic. One research file might produce 5-15 distinct facts about regulators, laws, licensing requirements, and tax treatment.

"br.licensing.regulator-bcb" → "BCB regulates crypto under Law 14,478/2022"
Source: https://bcb.gov.br → Confidence: 0.5 (unverified)
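A fact record like the one above could be modeled roughly as follows. This is a minimal sketch, not the production schema: the class name, the field names, and the assumption that IDs follow a jurisdiction.topic.slug pattern are inferred from the single example shown.

```python
from dataclasses import dataclass

@dataclass
class FactRecord:
    """Hypothetical shape of one extracted fact."""
    fact_id: str               # e.g. "br.licensing.regulator-bcb"
    claim: str
    source_url: str
    jurisdiction: str
    topic: str
    confidence: float = 0.5    # new facts start unverified

def parse_fact_id(fact_id: str) -> tuple[str, str, str]:
    """Split an ID of the assumed form jurisdiction.topic.slug."""
    jurisdiction, topic, slug = fact_id.split(".", 2)
    return jurisdiction, topic, slug

fact = FactRecord(
    fact_id="br.licensing.regulator-bcb",
    claim="BCB regulates crypto under Law 14,478/2022",
    source_url="https://bcb.gov.br",
    jurisdiction="br",
    topic="licensing",
)
```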

Verification

Every week, an automated validator checks every fact's source URL. If the URL returns HTTP 200 and the content hasn't changed, the fact is marked verified with increased confidence. Dead links get flagged and automatically re-researched. Additionally, the worker-farm's 5-stage adversarial pipeline (Generate → Challenge → Revise → Score → Verify) ensures research quality before facts are ever ingested.

First verified: 0.7
Re-verified: 0.8
Stale/broken: 0.3
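The confidence levels above suggest a simple update rule. The sketch below is one possible reading, not the validator's actual code: the function name is hypothetical, and treating changed content the same as a dead link (dropping to 0.3) is an assumption, since the text only says that such facts are re-researched.

```python
# Confidence levels taken from the page; the update rule itself is an assumption.
UNVERIFIED = 0.5
FIRST_VERIFIED = 0.7
RE_VERIFIED = 0.8
STALE_OR_BROKEN = 0.3

def next_confidence(current: float, status_code: int, content_changed: bool) -> float:
    """Return the confidence after one weekly source check."""
    if status_code != 200 or content_changed:
        # Dead link or changed source: flag the fact for re-research.
        return STALE_OR_BROKEN
    if current >= FIRST_VERIFIED:
        # The fact had already passed at least one check.
        return RE_VERIFIED
    return FIRST_VERIFIED
```

Under this rule a new fact moves 0.5 → 0.7 → 0.8 across two successful checks, and a failed check at any point drops it to 0.3.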

Publishing

A 27-stage build pipeline runs 3x/day via GitHub Actions (04:00, 09:00, 16:00 UTC). The pipeline extracts articles, repairs facts, grades evidence, rebuilds country profiles, regenerates the JSON API, and deploys to Cloudflare's global edge network (200+ data centers). Zero manual intervention.

research → articles → facts → countries.json → API → Astro build → Cloudflare deploy
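For a sense of how the three daily build slots play out, the helper below computes the next scheduled run from a given moment. It is an illustrative sketch only: the function name is hypothetical, and the actual schedule lives in the GitHub Actions workflow, where the same times would typically be written as the cron expression "0 4,9,16 * * *".

```python
from datetime import datetime, time, timedelta, timezone

# Build slots from the page, all in UTC.
BUILD_TIMES_UTC = [time(4, 0), time(9, 0), time(16, 0)]

def next_build(now: datetime) -> datetime:
    """Return the first build slot strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    for day_offset in (0, 1):
        day = (now + timedelta(days=day_offset)).date()
        for slot in BUILD_TIMES_UTC:
            candidate = datetime.combine(day, slot, tzinfo=timezone.utc)
            if candidate > now:
                return candidate
    raise AssertionError("unreachable: tomorrow always has a later slot")
```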

Monitoring

156 official regulatory source URLs are checked daily using SHA-256 content hashing. When a regulator updates their guidance, the system detects the change and queues new research to update affected facts.
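Change detection by content hashing can be sketched in a few lines. This is a minimal illustration under assumptions: the function names and the in-memory dictionary of stored hashes are hypothetical, fetching is left out, and a real monitor would likely normalize the page (strip timestamps, ads, session tokens) before hashing to avoid false positives.

```python
import hashlib

def content_hash(body: bytes) -> str:
    """SHA-256 hex digest of a fetched page body."""
    return hashlib.sha256(body).hexdigest()

def detect_changes(stored: dict[str, str], fetched: dict[str, bytes]) -> list[str]:
    """Return URLs whose content differs from the last stored hash.

    `stored` maps URL -> previous digest and is updated in place.
    A URL seen for the first time is recorded but not reported as changed.
    """
    changed: list[str] = []
    for url, body in fetched.items():
        digest = content_hash(body)
        previous = stored.get(url)
        if previous is not None and previous != digest:
            changed.append(url)
        stored[url] = digest
    return changed
```

Each URL returned by detect_changes would then be a candidate for queuing new research on the facts that cite it.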

Self-Healing

If research goes stale (no new data in 7 days), the system automatically requeues research — up to 2,000 tasks/cycle. Bad files are quarantined and rescued at build time. Failed builds auto-retry or revert. Source URLs that fail 3 times are suspended. No human intervention needed.
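Two of these rules, the 7-day staleness check and the 3-failure suspension, can be sketched as follows. The thresholds come from the text; everything else is an assumption for illustration, including the function names, the dictionaries used for state, and the choice to requeue the oldest research first when the per-cycle cap applies.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=7)     # no new data in 7 days
MAX_TASKS_PER_CYCLE = 2000          # requeue cap per cycle
MAX_FAILURES = 3                    # failures before a source is suspended

def select_requeue(last_updated: dict[str, datetime], now: datetime) -> list[str]:
    """Pick stale jurisdictions to requeue, oldest first, up to the cycle cap."""
    stale = [key for key, ts in last_updated.items() if now - ts >= STALE_AFTER]
    stale.sort(key=lambda key: last_updated[key])
    return stale[:MAX_TASKS_PER_CYCLE]

def record_failure(failures: dict[str, int], url: str) -> bool:
    """Count one failed fetch; return True once the URL should be suspended."""
    failures[url] = failures.get(url, 0) + 1
    return failures[url] >= MAX_FAILURES
```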

The Infrastructure

GPU Workers

Four self-hosted nodes (Lab-1 through Lab-4) with GPUs running 24/7. Handle research, fact extraction, validation, and inference. Connected via Tailscale mesh network.

Worker-Farm API

HTTP API at research-api.justfixit.ai dispatches tasks to the GPU fleet. 5-stage adversarial pipeline (Generate → Challenge → Revise → Score → Verify). Rate limits: 10,000 tasks/day.

LiteLLM Proxy

Routes AI requests to 10+ model providers: Gemini Flash, DeepSeek, Groq, Cerebras, OpenRouter, Ollama (local GPU), and more. SearXNG supplies web search for research tasks. Worker-farm handles all model routing.

Cloudflare Edge

Site served from 200+ global PoPs. D1 database for MCP queries. KV for caching. Workers for API and MCP server. Cloudflare free tier for hosting.

MCP Server

AI assistants (Claude, GPT, etc.) can query all compliance data via the Model Context Protocol. 7 tools + 10 resources at mcp.web3compliance.ai.

Fleet Brain

GCP e2-micro (free tier) orchestrates the worker fleet. Detects new commits, triggers builds, monitors health, and dispatches scheduled tasks.

By The Numbers

Jurisdictions: 207
Compliance Facts: 51,000+
Sources Monitored: 156
Monthly Infra Cost: ~$35

Why This Approach

Traditional compliance data requires teams of analysts, manual research, and enterprise pricing ($50K-100K/year). Updates are periodic. Coverage is limited to what the team can handle.

Our approach uses autonomous AI workers that never sleep, never take vacations, and research every jurisdiction simultaneously. The cost is self-hosted GPU hardware, ~$30/mo electricity, and ~$3-8/mo in API costs. The result is the same data — sourced from the same official regulators — but updated daily instead of quarterly, at a fraction of the cost.

The tradeoff: AI-extracted facts start at 0.5 confidence ("unverified"). They become trustworthy through automated verification against source URLs — the same URLs a human analyst would check. We're transparent about what's verified and what isn't. Every fact shows its confidence score and source.

Transparency

● Every fact has a visible confidence score and source link

● Verification status is color-coded — green (verified), yellow (stale), red (error)

● Anyone can report incorrect facts via the Report Issue button

● Full methodology is published — sourcing, verification, confidence scoring

● Operations dashboard shows real-time pipeline health

● Status page monitors all endpoints live
